---+ Upgrade to OSG 1.0

We started our upgrade to 1.0:

   * Stopping our gatekeeper on the Compute Element:
<pre>
/etc/init.d/xinetd stop
</pre>

   * On our Compute Element, backing up the old installation and getting pacman:
<pre>
cp -a /opt/osg-0.8.0 /home/mdias/osg_backup0.8
cd /opt
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.26.tar.gz
tar --no-same-owner -xzvf pacman-3.26.tar.gz
cd pacman-3.26
source setup.sh
</pre>

Now, in the new installation directory, turn off the old services and install the CE:
<pre>
cd /opt
mkdir osg-1.0.0
cd /opt/osg-1.0.0
vdt-control --off
disabling init service tomcat-55... ok
disabling cron service gratia-condor... ok
disabling init service condor... ok
disabling init service syslog-ng... ok
disabling init service tomcat-5... ok
disabling init service osg-rsv... ok
disabling init service apache... ok
disabling init service condor-devel... ok
disabling cron service vdt-update-certs... ok
disabling init service MLD... ok
disabling cron service gums-host-cron... ok
disabling cron service edg-mkgridmap... ok
disabling init service globus-ws... ok
disabling init service mysql... ok
disabling inetd service gsiftp... ok
disabling inetd service globus-gatekeeper... ok
disabling init service gris... ok
disabling cron service vdt-rotate-logs... ok
disabling cron service fetch-crl... ok
export OLD_VDT_LOCATION=/opt/osg-0.8.0
pacman -get OSG:ce
source setup.sh
pacman -get OSG:Globus-Condor-Setup
</pre>

The last step installs the Condor batch system. We have to inspect:
<pre>
vim /opt/osg-1.0.0/condor/etc/condor_config
</pre>

Managed fork: installing and configuring it:
<pre>
pacman -get OSG:ManagedFork
$VDT_LOCATION/vdt/setup/configure_globus_gatekeeper --managed-fork y --server y
</pre>

GUMS server:
<pre>
pacman -get OSG:gums
</pre>

MonALISA:
<pre>
vim /opt/osg-1.0.0/MonaLisa/Service/VDTFarm/ml.properties
$VDT_LOCATION/vdt/setup/configure_monalisa --prompt
vim /opt/osg-1.0.0/MonaLisa/Service/CMD/ml_env
JAVA_HOME=/opt/osg-1.0.0/jdk1.5    <------- was jdk1.4
FARM_NAME=SPRACE
</pre>
(Check this again after the services are on; it had been changed.)

GUMS host cron:
<pre>
vdt-register-service -name gums-host-cron --enable
vdt-register-service: updated cron service 'gums-host-cron'
vdt-register-service: desired state = enable
vdt-register-service: cron time = '9 8,14,20,2 * * *'
vdt-register-service: cron command = '/opt/osg-1.0.0/gums/scripts/gums-host-cron'
vdt-control --enable gums-host-cron
$VDT_LOCATION/gums/scripts/gums-host-cron
</pre>

Some GUMS configuration:
<pre>
cp /opt/osg-1.0.0/post-install/prima-authz.conf /etc/grid-security/.
cp /opt/osg-1.0.0/post-install/gsi-authz.conf /etc/grid-security/.
/opt/osg-1.0.0/tomcat/v55/webapps/gums/WEB-INF/scripts/gums-add-mysql-admin "/DC=org/DC=doegrids/OU=People/CN=Marco Dias 280904"
</pre>

We have to inspect these files:
<pre>
$VDT_LOCATION/vdt-app-data/gums/config/gums.config
$VDT_LOCATION/gums/config/gums-client.properties
</pre>

Reusing the old configuration:
<pre>
cd /opt/osg-1.0.0/monitoring
source ../setup.sh
export OLD_VDT_LOCATION=/opt/osg-0.8.0/
./configure-osg.py -e
vim extracted-config.ini
</pre>

We can test it and repair:
<pre>
./configure-osg.py -v -f ./extracted-config.ini
cp /opt/osg-0.8.0/monitoring/grid3-user-vo-map.txt /opt/osg-1.0.0/monitoring/.
vim /opt/osg-1.0.0/monitoring/extracted-config.ini
</pre>
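Before committing the configuration it can also help to eyeball what the extraction actually produced. A minimal pass of our own (the grep is ours, not part of the OSG tools); it just hides comments and blank lines so stray values stand out:
<pre>
# Our own quick review of the extracted settings, not from the original log:
cd /opt/osg-1.0.0/monitoring
grep -vE '^[[:space:]]*([#;]|$)' extracted-config.ini
</pre>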
Applying the configuration:
<pre>
./configure-osg.py -c -f ./extracted-config.ini
</pre>

Removing and creating symbolic links:
<pre>
clcmd umount /OSG/
unlink /OSG
ln -s /opt/osg-1.0.0 /OSG
vim /etc/exports
/etc/init.d/nfs restart
</pre>

Setting up Globus-Base-WSGRAM-Server:
<pre>
visudo
/opt/osg-1.0.0/vdt/setup/configure_prima_gt4 --enable --gums-server osg-ce.sprace.org.br
/etc/init.d/globus-ws stop
/etc/init.d/globus-ws start
</pre>

Condor-cron:
<pre>
vim /opt/osg-1.0.0/condor-cron/local.osg-ce/condor_config.local
chown condor: /var/lock/subsys/condor-cron/
</pre>
We also add this to our condor-cron configuration file:
<pre>
vim $VDT_LOCATION/condor-cron/etc/condor_config
NETWORK_INTERFACE = 192.168.1.150
</pre>

Turning on services:
<pre>
vdt-control --on
enabling cron service fetch-crl... ok
enabling cron service vdt-rotate-logs... ok
enabling cron service vdt-update-certs... ok
skipping init service 'gris' -- marked as disabled
enabling inetd service globus-gatekeeper... ok
enabling inetd service gsiftp... ok
enabling init service mysql... ok
enabling init service globus-ws... ok
skipping cron service 'edg-mkgridmap' -- marked as disabled
enabling cron service gums-host-cron... ok
enabling init service MLD... ok
enabling init service condor-cron... ok
enabling init service apache... ok
skipping init service 'osg-rsv' -- marked as disabled
enabling init service tomcat-55... ok
enabling init service syslog-ng-sender... ok
enabling init service condor... ok
enabling cron service gratia-condor... ok
</pre>

Configuring CEMon and checking that GIP is working:
<pre>
$VDT_LOCATION/vdt/setup/configure_cemon --consumer https://osg-ress-1.fnal.gov:8443/ig/services/CEInfoCollector --topic OSG_CE
$VDT_LOCATION/verify-gip-for-cemon/verify-gip-for-cemon-wrapper
</pre>

One note: GUMS reused the old configuration, but it also brought along the old password in gums.config:
<pre>
hibernate.connection.password=
</pre>
so we had to search vdt-install.log for where the installer set the new password and insert it into that field.

An issue with Condor: condor_status worked, but condor_q froze. We suspected a port conflict with condor-cron, so we changed the condor-cron configuration:
<pre>
COLLECTOR_HOST = $(CONDOR_HOST):9619   # was 9618
HIGHPORT = 65100                       # old value 9700 commented out
LOWPORT = 65001                        # old value 9600 commented out
CREDD_PORT = 9622                      # was 9620
STORK_PORT = 9623                      # was 9621
</pre>
and, in our Condor configuration:
<pre>
HIGHPORT = 65000
LOWPORT = 63001
</pre>
but the real problem was a bug in Condor version 7.0.2. We installed again, this time 7.0.3:
<pre>
cd /OSG/
cp -a condor condor.old
cd condor
rm -rf *
cd ..
wget http://parrot.cs.wisc.edu//symlink/20080702041503/7/7.0/7.0.3/19811b56762d4e4ed3ea991b9a341232/condor-7.0.3-linux-x86-rhel3.tar.gz
tar -xvzf condor-7.0.3-linux-x86-rhel3.tar.gz
cd condor-7.0.3
./condor_configure --install --maybe-daemon-owner --make-personal-condor --install-log ../post-install/README --install-dir /OSG/condor
cd /OSG/condor/etc/
mv condor_config condor_config.bck
cp /OSG/condor.old/etc/condor_config .
</pre>
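After the 7.0.3 reinstall it is worth confirming that the new binaries are in use and that the original symptom is gone; a quick check with standard Condor commands (our addition, not part of the original log):
<pre>
# Our own sanity check after the reinstall:
condor_version    # should now report 7.0.3
condor_status     # the collector should answer as before
condor_q          # should return promptly instead of freezing
</pre>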
   * Installing RSV. Edit the configuration:
<pre>
cd /opt/osg-1.0.0/monitoring/
vim extracted-config.ini
</pre>
with this [RSV] section:
<pre>
[RSV]
enabled = True
rsv_user = mdias
enable_ce_probes = True
ce_hosts = osg-ce.sprace.org.br
enable_gridftp_probes = True
gridftp_dir = /tmp
enable_gums_probes = False
gums_hosts = osg-ce.sprace.org.br
enable_srm_probes = True
srm_hosts = osg-se.sprace.org.br
srm_dir = /pnfs/sprace.org.br/data/mdias
use_service_cert = False
proxy_file = /tmp/x509up_u537
enable_gratia = True
print_local_time = True
setup_rsv_nagios = False
setup_for_apache = True
</pre>
Now restart the services:
<pre>
vdt-control --off
./configure-osg.py -c -f ./extracted-config.ini
vdt-control --on
</pre>
and check it:
<pre>
condor_cron_q
tail -f $VDT_LOCATION/osg-rsv/logs/consumers/gratia-script-consumer.out
</pre>
also looking at https://osg-ce.sprace.org.br:8443/rsv

   * On our nodes, stop Condor and mount /OSG:
<pre>
clcmd /etc/init.d/condor stop
clcmd mount /OSG
</pre>
In an NFS-shared directory:
<pre>
cd /home/mdias
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.26.tar.gz
tar --no-same-owner -xzvf pacman-3.26.tar.gz
</pre>
Then we created a script (worknodeinstall.sh) like this:
<pre>
source /opt/OSG-wn-client/setup.sh
vdt-control --off
cd /home/mdias/pacman-3.26
source setup.sh
mv /opt/OSG-wn-client /opt/OSG-wn-client.old
mkdir /opt/OSG-wn-client
cd /opt/OSG-wn-client
VDTSETUP_AGREE_TO_LICENSES=y
export VDTSETUP_AGREE_TO_LICENSES
VDTSETUP_INSTALL_CERTS=l
export VDTSETUP_INSTALL_CERTS
VDTSETUP_EDG_CRL_UPDATE=n
export VDTSETUP_EDG_CRL_UPDATE
VDTSETUP_ENABLE_ROTATE=y
export VDTSETUP_ENABLE_ROTATE
VDTSETUP_CA_CERT_UPDATER=n
export VDTSETUP_CA_CERT_UPDATER
pacman -trust-all-caches -get OSG:wn-client
mv /var/log/glexec /var/log/glexec.old
mv /etc/glexec /etc/glexec.old
mkdir /opt/glexec
cd /opt/glexec
pacman -trust-all-caches -get http://vdt.cs.wisc.edu/vdt_181_cache:Glexec
sed -i 's/yourmachine.yourdomain/osg-ce.sprace.org.br/g' /etc/glexec/contrib/gums_interface/getmapping.cfg
source setup.sh
vdt-control --on
mkdir /opt/glexec/glite/etc
cp /OSG/glite/etc/vomses /opt/glexec/glite/etc/.
</pre>
and ran it on all our nodes:
<pre>
/root/bin/clcmd /home/mdias/worknodeinstall.sh
/root/bin/clcmd rm -rf /opt/OSG-wn-client.old
</pre>
We had to edit /etc/glexec/glexec.conf to add a "linger = no" line in the [glexec] section and copy it to the other nodes:
<pre>
clcmd cp -f /home/mdias/glexec.conf /etc/glexec/.
</pre>
We also had to edit this on our Compute Element, *before* reusing the old configuration:
<pre>
vim /opt/osg-1.0.0/monitoring/extracted-config.ini
glexec_location = /opt/glexec/glexec-osg
</pre>
Testing glexec:
<pre>
cd /etc/glexec
source setup.sh
voms-proxy-init -voms cms:/cms
export GLEXEC_CLIENT_CERT=/tmp/x509up_u537
cd /opt/glexec/
glexec-osg/sbin/glexec /usr/bin/id
</pre>
An SRM error forced us to upgrade our dCache for SRM compatibility. So on our Storage Element (osg-se):
<pre>
/opt/init.d/dcache-pool stop
/opt/init.d/dcache-core stop
wget http://www.dcache.org/downloads/1.8.0/dcache-server-1.8.0-15p7.noarch.rpm
wget http://www.dcache.org/downloads/1.8.0/dcache-srmclient-1.8.0-15p7.noarch.rpm
rpm -Uvh dcache-server-1.8.0-15p7.noarch.rpm dcache-srmclient-1.8.0-15p7.noarch.rpm
</pre>
Restarting pools and dcache-core:
<pre>
/opt/d-cache/install/install.sh
/opt/d-cache/bin/dcache-core start
/opt/d-cache/bin/dcache-pool start
</pre>
The restart looks like this:
<pre>
[root@spraid02 ~]# /opt/d-cache/bin/dcache-core start
This script is deprecated and will be removed in a future release.
Please use /opt/d-cache/bin/dcache start instead.
Starting dcache services:
Starting gridftp-spraid02Domain Done (pid=1147)
[root@spraid02 ~]# /opt/d-cache/bin/dcache-pool start
This script is deprecated and will be removed in a future release.
Please use /opt/d-cache/bin/dcache start pool instead.
WARNING: the variable DCACHE_HOME is not set.
WARNING: Using deprecated value of DCACHE_BASE_DIR as DCACHE_HOME
start dcache pool:
Starting spraid02Domain Done (pid=1271)
</pre>
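Before moving on it is worth confirming the domains really came back; a small check of our own (not in the original log, and 8443 is only the usual SRM door port: check your dCacheSetup if the site uses another):
<pre>
# Our own post-restart check, not part of the original procedure:
ps -ef | grep '[j]ava'          # one JVM per running dCache domain
netstat -tln | grep ':8443'     # assumed default SRM door port
</pre>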
   * On our Storage Elements, recreate the link:
<pre>
unlink /opt/osg-0.8.0
ln -s /OSG /opt/osg-1.0.0
</pre>

Configuring the dCache information provider:
<pre>
yum install postgresql.i386
$GIP_LOCATION/conf/configure_gip_dcache
What is the hostname of your dCache admin interface? osg-se.sprace.org.br
Configuration saved. If you would like to alter any choices without re-running
this configuration script, you may find these answers in:
$VDT_LOCATION/gip/etc/dcache_storage.conf
chown daemon:root $VDT_LOCATION/gip/etc/dcache_storage.conf
</pre>

Interesting links:
   * http://twiki.mwt2.org/bin/view/Main/DeployingOSG1d0
   * https://twiki.grid.iu.edu/twiki/bin/view/ReleaseDocumentation/ComputeElementInstall

---++ Updates

---+++ Marco at 07/10/2008

Apply a GIP patch:
<pre>
--- gip/libexec/services_info_provider.py      (revision 2030)
+++ gip/libexec/services_info_provider.py      (working copy)
@@ -102,6 +102,16 @@
 def print_srm(cp, admin):
     sename = cp.get("se", "unique_name")
     sitename = cp.get("site", "unique_name")
+    # BUGFIX: Resolve the IP address of srm host that the admin specifies.
+    # If this IP address matches the IP address given by dCache, then we will
+    # print out the admin-specified hostname instead of looking it up. This
+    # is for sites where the SRM host is a CNAME instead of the A name.
+    srm_host = cp_get(cp, "se", "srm_host", None)
+    if srm_host:
+        try:
+            srm_ip = socket.gethostbyname(srm_host)
+        except:
+            srm_ip = None
     #vos = [i.strip() for i in cp.get("vo", "vos").split(',')]
     vos = voListStorage(cp)
     ServiceTemplate = getTemplate("GlueService", "GlueServiceUniqueID")
@@ -121,8 +131,11 @@
         hostname = hostname.split(',')[0]
         try:
             hostname = socket.getfqdn(hostname)
+            hostname_ip = socket.gethostbyname(hostname)
         except:
-            pass
+            hostname_ip = None
+        if hostname_ip != None and hostname_ip == srm_ip and srm_host != None:
+            hostname = srm_host
         info = { "serviceType" : "SRM",
                  "acbr" : acbr[1:],
</pre>
One more patch:
<pre>
--- vdt/setup/configure_gip    (revision 2047)
+++ vdt/setup/configure_gip    (working copy)
@@ -1337,6 +1337,19 @@
     }
 
+    # BUGFIX: se_access is not filled in if you are using dynamic_dcache, but
+    # this global variable must be populated for the CESE bind to work right
+    # later on
+    if ($srm && $dynamic_dcache==1) {
+
+        foreach $vo (@vo_list){
+            my $se_access_root = $vo_access_roots{$vo};
+            if ($se_access_root !~ /^$/) {
+                $se_access{$vo} = $se_access_root;
+            }
+        }
+    }
+
     my $service_config_file = "$gip_location/etc/osg-info-gip-config/osg-info-static-service.conf";
     safe_write($service_config_file, $service_contents);
     vdt_install_log("===== BEGIN osg-info-static-service.conf =====\n");
@@ -1536,8 +1549,9 @@
     }
     if ($srm && exists $se_access{$vo} && defined $sa_path) {
+        # BUGFIX: use $se_access{$vo} instead of $sa_path/$se_access{$vo}.
         $contents = $contents."\ndn: GlueCESEBindSEUniqueID=$se_host, GlueCESEBindGroupCEUniqueID=$fullhost:2119/jobmanager-$batch-$vo\n"
-                             ."GlueCESEBindCEAccesspoint: $sa_path/$se_access{$vo}\n"
+                             ."GlueCESEBindCEAccesspoint: $se_access{$vo}\n"
                              ."GlueCESEBindCEUniqueID: $fullhost:2119/jobmanager-$batch-$vo\n";
     }
 }
</pre>
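The hunks above can be applied with the standard patch(1) tool from $VDT_LOCATION, since the paths in the diff headers are relative to it. A sketch of our own (the patch file name is hypothetical; save the diffs into it first):
<pre>
# Our own sketch of applying the GIP patches, not from the original log:
cd $VDT_LOCATION
patch -p0 --dry-run < /tmp/gip-fixes.patch   # check the hunks apply cleanly
patch -p0 < /tmp/gip-fixes.patch
</pre>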
Now run configure_gip again.

Another patch solves an error when running $GIP_LOCATION/libexec/token_info_provider.py:
<pre>
--- gip/lib/python/gip_cese_bind.py    (revision 2047)
+++ gip/lib/python/gip_cese_bind.py    (working copy)
@@ -55,6 +55,8 @@
     ce_list = getCEList(cp)
     se_list = getSEList(cp)
     access_point = cp_get(cp, "vo", "default", "/")
+    if not access_point:
+        access_point = "/UNAVAILABLE"
     for ce in ce_list:
         for se in se_list:
             info = {'ceUniqueID' : ce,
</pre>
It is not necessary to run configure_gip again after this one.

---+++ Marco at 07/16/2008

CEMon was not reporting to the BDII database. Parag Mhashilkar kindly helped us with it:
<pre>
Coincidentally, I have seen similar error in catalina.out. The admin there
claimed that they managed to fix the problem by putting xercesImpl.jar in
$VDT_LOCATION/tomcat/v55/common/endorsed. The claim is that this missing jar
file is resulting in some strange interference between the gums and cemon
installation. You can find this jar at a couple of places within vdt itself.
Once you have this file in the above dir, stop tomcat, make sure that
everything in $GLITE_LOCATION/var/cemonitor is deleted (manually) and start
tomcat.
</pre>

---+++ Marco at 07/18/2008

For our dCache GIP to work, we have to set dynamic_dcache and re-apply the configuration:
<pre>
more /OSG/monitoring/extracted-config.ini
dynamic_dcache = 1
cd /OSG/monitoring
./configure-osg.py -c -f ./extracted-config.ini
</pre>

-- Main.MarcoAndreFerreiraDias - 23 Jun 2008