Job Robots Troubleshooting
Description
All JobRobot jobs are being aborted on our farm. The page
http://jobrobot.web.cern.ch/JobRobot/aborted_081019.html#T2_BR_SPRACE
reports the following errors:
BrokerHelper: no compatible resources
request expired
First we checked for corruption in our CMSSW installation by running a CRAB job with the same CMSSW version listed at
http://jobrobot.web.cern.ch/JobRobot/summary_081019.html
following the instructions at
http://www.sprace.org.br/Twiki/bin/view/Main/EntryDescriptionNo53
This error may be related to an ambiguous BDII publication: the job requirements contain an expression that makes the matchmaking fail, namely
Member("osg-se.sprace.org.br",other.GlueCESEBindGroupSEUniqueID)
Our BDII was publishing:
objectClass: GlueSchemaVersion
GlueCESEBindGroupCEUniqueID: osg-ce.sprace.org.br:2119/jobmanager-condor-cms
GlueCESEBindGroupSEUniqueID: osg-se.sprace.org.br
GlueCESEBindGroupSEUniqueID: osg-se.sprace.org.br
GlueSchemaVersionMajor: 1
Note that GlueCESEBindGroupSEUniqueID: osg-se.sprace.org.br appears twice. To remove the duplicate, we need to fix GIP, which collects the information for CEMon.
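A duplicate like this is easy to miss by eye. The following is only a sketch, not part of the original procedure: it scans an LDIF dump and reports any line that repeats within the same entry. The function name find_duplicate_attrs is hypothetical, and it assumes entries are separated by blank lines and that long lines have not been folded.

```shell
#!/bin/sh
# Hypothetical helper: report attribute lines repeated inside one LDIF entry.
# Assumptions: entries are separated by blank lines; folded (continuation)
# lines are not handled and could cause false positives.
find_duplicate_attrs() {
    awk '
        /^$/   { split("", seen); dn = ""; next }   # blank line: new entry, reset state
        /^dn:/ { dn = $0 }                          # remember which entry we are in
        seen[$0]++ == 1 {                           # second occurrence of the same line
            print (dn != "" ? dn : "(no dn)") " | " $0
        }
    '
}

# Usage (reads LDIF on standard input; the file name is only an example):
#   find_duplicate_attrs < dump.ldif
```

Each reported line shows the entry's dn followed by the repeated attribute, so an empty output means no duplicates were found within any entry.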
Editing the file
/OSG/gip/var/ldif/osg-info-static-cesebind.ldif
directly does not seem to work.
So we changed gip-attributes.conf, which configure_gip reads to generate this file, setting OSG_GIP_DISK to "0" and then rerunning configure_gip:
vim /OSG/monitoring/gip-attributes.conf
OSG_GIP_DISK="0"
/OSG/vdt/setup/configure_gip
The published information can then be checked with
ldapsearch -x -LLL -p 2170 -h lcg-bdii.cern.ch -b mds-vo-name=SPRACE,mds-vo-name=local,o=grid > jobrobot.txt
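To read the saved dump more quickly, the CE-to-SE bindings can be summarized with a count per pair. This is a sketch under assumptions: list_ce_se_bindings is a hypothetical helper, and it relies on each binding entry carrying one GlueCESEBindGroupCEUniqueID line plus its GlueCESEBindGroupSEUniqueID lines, with entries separated by blank lines.

```shell
#!/bin/sh
# Hypothetical helper: print "count CE SE" for every CE/SE binding in an LDIF dump.
# Assumption: entries are separated by blank lines and lines are not folded.
list_ce_se_bindings() {
    awk '
        # Print one "CE SE" pair per SE line collected for the current entry.
        function emit(   i) {
            if (ce != "") for (i = 1; i <= n; i++) print ce, se[i]
            ce = ""; n = 0
        }
        /^$/                            { emit(); next }
        /^GlueCESEBindGroupCEUniqueID:/ { ce = $2 }
        /^GlueCESEBindGroupSEUniqueID:/ { se[++n] = $2 }
        END                             { emit() }
    ' | sort | uniq -c
}

# Usage with the file written by the ldapsearch command above:
#   list_ce_se_bindings < jobrobot.txt
```

A count greater than 1 for a pair means the SE is still published more than once for that CE; after the fix each pair should be counted once.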
Updates
So-and-so on dd/mm/yyyy
Describe what was done.
Someone else on dd/mm/yyyy
Further comments
--
MarcoAndreFerreiraDias - 19 Oct 2008