Condor on the Alice Server

Description

Starting the installation on the server:
[root@sprace-ws0 ~]# cd /opt/
[root@sprace-ws0 opt]# mkdir condor
Since the installed operating system was SL 4.7, we chose a statically linked Condor build for RedHat 3, as suggested by the site, for the appropriate architecture. The package is obtained from the page: http://www.cs.wisc.edu/condor/downloads-v2/download.pl

I suggest using the -r parameter with adduser and groupadd to create system accounts. (winckler)

[root@sprace-ws0 tmp]# wget http://teal.cs.wisc.edu//symlink/20090119101502/7/7.2/7.2.0/ad43271277869306f4631e5a45a09907/condor-7.2.0-linux-x86_64-rhel3.tar.gz
[root@sprace-ws0 tmp]# tar -xvzf condor-7.2.0-linux-x86_64-rhel3.tar.gz
[root@sprace-ws0 tmp]# cd condor-7.2.0
[root@sprace-ws0 condor-7.2.0]# groupadd condor; adduser condor -g condor -d /home/condor
[root@sprace-ws0 condor-7.2.0]# HOSTNAME=sprace-ws0.sprace.org.br
[root@sprace-ws0 condor-7.2.0]# ./condor_configure --install --maybe-daemon-owner --make-personal-condor --install-log /opt/condor/post_install  --install-dir /opt/condor/
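Following the -r suggestion above, account creation could be wrapped in a small helper that only acts when the group or user is missing. This is a minimal sketch, not part of the original procedure: the function name `ensure_system_account` and the `DRY_RUN` switch are hypothetical, and it assumes `groupadd` and `useradd` (of which `adduser` is an alias on RedHat-like systems) accept `-r`.

```shell
# Sketch: create a system group and user only if they are missing.
# DRY_RUN=1 prints the commands instead of running them (hypothetical switch).
ensure_system_account() {
    acct=$1
    home=$2
    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"

    # Create the group unless /etc/group already lists it.
    if ! grep -q "^${acct}:" /etc/group 2>/dev/null; then
        $run groupadd -r "$acct"
    fi

    # Create the user unless the name already resolves.
    if ! id -u "$acct" >/dev/null 2>&1; then
        $run useradd -r -g "$acct" -d "$home" "$acct"
    fi
}

# Example (as root): ensure_system_account condor /home/condor
```

Running it first with `DRY_RUN=1` shows which commands would be executed before anything is changed.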

Now we begin configuring Condor.

[root@sprace-ws0 condor-7.2.0]# cd /opt/condor/
[root@sprace-ws0 condor]# vi /opt/condor/etc/condor_config
The changed parameters are:
CONDOR_HOST             = 192.168.1.1
RELEASE_DIR             = /opt/condor
LOCAL_DIR               = $(RELEASE_DIR)/hosts/$(HOSTNAME)
LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local
CONDOR_ADMIN            = mafd@mail.cern.ch
UID_DOMAIN              = local
FILESYSTEM_DOMAIN       = local
COLLECTOR_NAME          = ALICE
HOSTALLOW_READ = *.sprace.org.br, *.local
HOSTALLOW_WRITE = *.local, *.sprace.org.br
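To double-check the edited values without starting any daemon, the file can be read back with a small parser. The sketch below is an assumption-laden illustration: `config_get` is a hypothetical helper, it returns the literal text of the last assignment, and it does not expand `$(MACRO)` references the way Condor does.

```shell
# Sketch: print the value of the last "KEY = value" assignment in a config file.
# It does not expand $(MACRO) references; it only reads the literal text.
config_get() {
    file=$1
    key=$2
    awk -v key="$key" '
        /^[ \t]*#/ { next }                     # skip comment lines
        {
            eq = index($0, "=")                 # position of the first "="
            if (eq == 0) next
            name = substr($0, 1, eq - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim the key
            if (name == key) {
                value = substr($0, eq + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = 1
            }
        }
        END { if (found) print value; else exit 1 }
    ' "$file"
}

# Example: config_get /opt/condor/etc/condor_config CONDOR_HOST
```

Once the environment is set up, Condor's own `condor_config_val CONDOR_HOST` is the more reliable check, since it applies the full configuration logic.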

Create the necessary directories, where the configuration will live:

[root@sprace-ws0 condor]# mkdir hosts
[root@sprace-ws0 condor]# mkdir hosts/`hostname -s`
[root@sprace-ws0 condor]# mkdir hosts/sprace-ws0/{log,execute,spool}
[root@sprace-ws0 condor]# chown condor: hosts/sprace-ws0/*
[root@sprace-ws0 condor]# vi hosts/sprace-ws0/condor_config.local
For the server only, this file must contain:
NETWORK_INTERFACE=192.168.1.1
DAEMON_LIST                     = MASTER, STARTD, SCHEDD, COLLECTOR, NEGOTIATOR
Now we prepare the Condor init script:
[root@sprace-ws0 condor]# vi /etc/init.d/condor

%CODE{"sh"}%
# chkconfig: 345 99 99
# description: Condor batch system
### BEGIN INIT INFO
# Provides: condor
# Required-Start: $network
# Required-Stop:
# Default-Start: 3 4 5
# Default-Stop: 1 2 6
# Description: Condor batch system
### END INIT INFO

# Determine if we're superuser
case `id` in
    "uid=0("* ) vdt_is_superuser=y ;;
    * )         vdt_is_superuser=n ;;
esac

source /opt/condor/condor.sh

CONDOR_SBIN=/opt/condor/sbin
MASTER=$CONDOR_SBIN/condor_master
CONDOR_OFF=$CONDOR_SBIN/condor_off
PS="/bin/ps auwx"

case $1 in
'start')
    if [ -x $MASTER ]; then
        echo "Starting up Condor"
        $MASTER
    else
        echo "$MASTER is not executable. Skipping Condor startup."
        exit 1
    fi
    ;;

'stop')
    pid=`$PS | grep $MASTER | grep -v grep | awk '{print $2}'`
    if [ -n "$pid" ]; then
        echo "Shutting down Condor"
        $CONDOR_OFF -master
    else
        echo "Condor not running"
    fi
    ;;

*)
    echo "Usage: condor {start|stop}"
    ;;
esac
%ENDCODE%

Then:

[root@sprace-ws0 condor]# chmod +x /etc/init.d/condor
[root@sprace-ws0 condor]# chkconfig --add condor
Always remember to set the environment variables first:
[root@sprace-ws0 condor]# . /opt/condor/condor.sh
so that the subsequent administration commands work. Next, prepare the server to export the required directories over NFS:
[root@sprace-ws0 ~]# vi /etc/exports
/opt/condor             192.168.1.0/24(rw,async,no_root_squash)
[root@sprace-ws0 ~]# exportfs -a
[root@sprace-ws0 ~]# exportfs
/opt/condor     192.168.1.0/24
/home           192.168.1.0/24
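When this step is repeated or scripted, it helps to avoid adding the same export twice. The following is a sketch under assumptions: `ensure_export` is a hypothetical helper, and it treats the directory as a plain string, so paths containing regular-expression metacharacters would need escaping.

```shell
# Sketch: append an export line only if the directory is not already listed.
ensure_export() {
    exports_file=$1
    dir=$2
    spec=$3
    # An entry counts as present when a line starts with the directory
    # followed by whitespace.
    if grep -q "^${dir}[[:space:]]" "$exports_file" 2>/dev/null; then
        return 0
    fi
    printf '%s\t%s\n' "$dir" "$spec" >> "$exports_file"
}

# Example (as root), followed by re-exporting:
# ensure_export /etc/exports /opt/condor '192.168.1.0/24(rw,async,no_root_squash)'
# exportfs -a
```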
On the node, create the "condor" group and user with the same gid/uid they have on your server:
[root@sprace-ws1 ~]# groupadd condor -g 501
[root@sprace-ws1 ~]# adduser condor -g condor -d /home/condor -u 501
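Because the files are shared over NFS, a uid/gid mismatch between node and server would show up as wrong ownership. A quick comparison could look like the sketch below; `passwd_ids` and `same_ids` are hypothetical helper names, and the commented example assumes ssh access to the server and that `getent` is available on both machines.

```shell
# Sketch: extract "uid:gid" from a passwd-format line (name:x:uid:gid:...).
passwd_ids() {
    printf '%s\n' "$1" | awk -F: '{ print $3 ":" $4 }'
}

# Succeeds only when both lines are non-empty and carry the same uid and gid.
same_ids() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$(passwd_ids "$1")" = "$(passwd_ids "$2")" ]
}

# Example, run on the node:
# same_ids "$(getent passwd condor)" "$(ssh spracews0 getent passwd condor)" \
#     || echo "condor uid/gid differ between node and server"
```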
On your node, first configure the mount points for the configuration files and binaries (as well as the users' home directories):
[root@sprace-ws1 ~]# vi /etc/fstab
spracews0:/opt/condor    /opt/condor           nfs      rw,hard,bg,rsize=32768,wsize=32768,udp,nfsvers=3
spracews0:/home    /home           nfs  rw,hard,bg,rsize=32768,wsize=32768,udp,nfsvers=3
[root@sprace-ws1 ~]# mkdir /opt/condor
[root@sprace-ws1 ~]# mount /opt/condor/
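Before starting Condor on the node it may be worth confirming that the directory really is an NFS mount rather than an empty local directory. This is a sketch only: `is_nfs_mounted` is a hypothetical helper, and it assumes a Linux-style mount table such as /proc/mounts.

```shell
# Sketch: succeed if DIR appears as a mount point of an nfs-type filesystem
# in a /proc/mounts-style table (fields: device mountpoint fstype options ...).
is_nfs_mounted() {
    dir=${1%/}                       # drop a trailing slash, if any
    table=${2:-/proc/mounts}
    awk -v dir="$dir" '
        $2 == dir && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "$table"
}

# Example: is_nfs_mounted /opt/condor/ || echo "/opt/condor is not mounted over NFS"
```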

The nodes use the same init script, so it is enough to copy it from the server:

[root@sprace-ws1 ~]# scp spracews0:/etc/init.d/condor /etc/init.d/condor
[root@sprace-ws1 ~]# chkconfig --add condor
Back on the server, prepare the location that will hold the node's local configuration and its execution logs:
[root@sprace-ws0 ~]# mkdir /opt/condor/hosts/sprace-ws1
[root@sprace-ws0 ~]# mkdir /opt/condor/hosts/sprace-ws1/{execute,log,spool}
[root@sprace-ws0 ~]# vi /opt/condor/hosts/sprace-ws1/condor_config.local
NETWORK_INTERFACE=192.168.1.2
[root@sprace-ws0 ~]# chown condor:  /opt/condor/hosts/sprace-ws1/*
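The same directory layout is created for every host, so when more nodes are added the steps could be collected in one function. The sketch below is illustrative: `make_host_dir` is a hypothetical helper, and the host name and address in the example are placeholders, not machines from this setup.

```shell
# Sketch: create the per-host tree under BASE/hosts/HOST and write a
# minimal condor_config.local that pins NETWORK_INTERFACE.
make_host_dir() {
    base=$1
    host=$2
    ip=$3
    hostdir="$base/hosts/$host"

    mkdir -p "$hostdir/log" "$hostdir/execute" "$hostdir/spool" || return 1

    # Keep an existing local config (the server's also sets DAEMON_LIST).
    if [ ! -e "$hostdir/condor_config.local" ]; then
        printf 'NETWORK_INTERFACE=%s\n' "$ip" > "$hostdir/condor_config.local"
    fi

    # Hand the tree to the condor user when running as root and the user exists.
    if [ "$(id -u)" = "0" ] && id -u condor >/dev/null 2>&1; then
        chown condor: "$hostdir"/*
    fi
}

# Example (placeholder host and address): make_host_dir /opt/condor some-node 192.168.1.3
```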

Now start Condor on your node:

[root@sprace-ws1 ~]# /etc/init.d/condor start
Starting up Condor

From your server, you should see something like:

[root@sprace-ws0 ~]# source /opt/condor/condor.sh
[root@sprace-ws0 ~]# condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
slot1@sprace-ws0.s LINUX      X86_64 Owner     Idle     0.000  8018  0+00:40:04
slot2@sprace-ws0.s LINUX      X86_64 Unclaimed Idle     0.000  8018  1+09:50:47
slot1@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.070   493  0+00:00:10
slot2@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:11
slot3@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:12
slot4@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:13
slot5@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:14
slot6@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:15
slot7@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:16
slot8@sprace-ws1.s LINUX      X86_64 Owner     Idle     0.000   493  0+00:00:09

                     Total Owner Claimed Unclaimed Matched Preempting Backfill

      X86_64/LINUX    10     9       0         1     0          0        0
             Total    10     9       0         1     0          0        0
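With more nodes the listing gets long, and a per-state tally is easier to scan. As a rough sketch (the helper name `slots_by_state` is hypothetical, and it assumes the State is the fourth column of each slot line, as in the listing above):

```shell
# Sketch: count slots per State from condor_status-style text on stdin.
# Only lines whose first field looks like "slotN@host" are counted.
slots_by_state() {
    awk '$1 ~ /^slot[0-9]+@/ { count[$4]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Example: condor_status | slots_by_state
```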

To test whether Condor is actually running, create a simple job and follow its execution:

[root@sprace-ws0 ~]# su - mdias
[mdias@sprace-ws0 ~]$ vi submit
Universe   = vanilla
Executable =  /bin/sleep
Arguments  = 30
Log        = simple.log
Output     = simple.$(Process).out
Error      = simple.$(Process).error
Queue

Arguments = 30
Queue

Arguments =  30
Queue

Arguments =  30
Queue
[mdias@sprace-ws0 ~]$ condor_submit submit
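The submit file above repeats the same Arguments/Queue pair four times. Since `Queue N` queues N copies of a job, an equivalent file could be generated as in this sketch; `write_submit` is a hypothetical helper, and the file name and counts in the example are only illustrative.

```shell
# Sketch: write a vanilla-universe submit file that queues COUNT sleep jobs.
write_submit() {
    out=$1
    count=$2
    seconds=$3
    # \$(Process) is escaped so that Condor, not the shell, expands it.
    cat > "$out" <<EOF
Universe   = vanilla
Executable = /bin/sleep
Arguments  = $seconds
Log        = simple.log
Output     = simple.\$(Process).out
Error      = simple.\$(Process).error
Queue $count
EOF
}

# Example: write_submit submit 4 30 && condor_submit submit
```

Each job still gets its own output and error file through `$(Process)`, which runs from 0 to N-1.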
The result can be seen like this:
[mdias@sprace-ws0 ~]$ condor_q
-- Submitter: sprace-ws0.sprace.org.br : <192.168.1.1:32847> : sprace-ws0.sprace.org.br
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
 7.0   mdias           1/21 11:03   0+00:00:04 R  0   0.0  sleep 30
 7.1   mdias           1/21 11:03   0+00:00:04 R  0   0.0  sleep 30
 7.2   mdias           1/21 11:03   0+00:00:04 R  0   0.0  sleep 30
 7.3   mdias           1/21 11:03   0+00:00:04 R  0   0.0  sleep 30
4 jobs; 0 idle, 4 running, 0 held
[mdias@sprace-ws0 ~]$ more simple.1.error

-- MarcoAndreFerreiraDias - 20 Jan 2009

Topic revision: r5 - 2009-02-21 - GabrielWinckler
 
