Compressing the Logs on SPDC00


We are going to free up more space in /var on SPDC00.
Stopping PhEDEx
[mdias@spdc00 mdias]$ su -
[root@spdc00 root]# su - phedex
[phedex@spdc00 phedex]$ grid-proxy-info
subject  : /DC=org/DC=doegrids/OU=People/CN=Eduardo Gregores 407221/CN=proxy/CN=proxy/CN=proxy
issuer   : /DC=org/DC=doegrids/OU=People/CN=Eduardo Gregores 407221/CN=proxy/CN=proxy
identity : /DC=org/DC=doegrids/OU=People/CN=Eduardo Gregores 407221
type     : full legacy globus proxy
strength : 1024 bits
path     : /home/phedex/gridcert/proxy.cert
timeleft : 0:00:00
[phedex@spdc00 phedex]$  Master -config ~/SITECONF/local/PhEDEx/Config.Prod stop
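
The grid-proxy-info output above reports timeleft : 0:00:00, meaning the proxy has expired. As a minimal sketch, the remaining lifetime could be extracted from that output before touching the agents. The function name proxy_seconds_left is hypothetical (it is not part of the Globus tools), and it assumes the "timeleft : H:MM:SS" line format shown in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not a Globus command): reads grid-proxy-info style
# output on stdin and prints the remaining proxy lifetime in seconds.
# Assumes a line of the form "timeleft : H:MM:SS", as in the listing above.
proxy_seconds_left() {
    awk -F' : ' '/^timeleft/ {
        n = split($2, t, ":")
        if (n == 3) print t[1] * 3600 + t[2] * 60 + t[3]
    }'
}
```

It would be used as `grid-proxy-info | proxy_seconds_left`; a result of 0 indicates that the proxy needs to be renewed.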
Stopping dCache
[root@spdc00 root]#  /opt/pnfs/bin/pnfs stop
Shutting down pnfs services (PostgreSQL version):
 Stopping Heartbeat ....  Ready
 Killing pnfsd  Done
 Killing pmountd  Done
 Killing dbserver . Done
 Removing 8 Clients  0+ 1+ 2+ 3+ 4+ 5+ 6+ 7+
 Removing 8 Servers  0+ 1+ 2+ 3+ 4+ 5+ 6+ 7+
 Removing main switchboard ... O.K.

[root@spdc00 root]#  /opt/d-cache/bin/dcache-core stop
Shutting down dcache services:
Pid File (/opt/d-cache/config/lastPid.gridftp-spdc00) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.gsidcap-spdc00) doesn't contain valid PID
Stopping srm-spdc00Domain (pid=22268) 0 1 2 3 4 5 6 7 Done
Pid File (/opt/d-cache/config/lastPid.replica) doesn't contain valid PID
Stopping utilityDomain (pid=21995) 0 1 2 3 4 5 6 7 8 Done
Stopping httpdDomain (pid=21909) 0 1 2 3 4 5 6 7 Done
Stopping infoProviderDomain (pid=22171) 0 1 2 3 4 5 6 7 8 Done
Stopping pnfsDomain (pid=22083) 0 1 2 3 4 5 6 7 Done
Stopping adminDoorDomain (pid=21830) 0 1 2 3 4 5 6 7 8 Done
Stopping doorDomain (pid=21755) 0 1 2 3 4 5 6 7 Done
Stopping dirDomain (pid=21682) 0 1 2 3 4 5 6 7 8 Done
Stopping dCacheDomain (pid=21605) 0 1 2 3 4 5 6 7 Done
Stopping lmDomain (pid=21542) 0 1 2 3 4 5 6 7 8 Done
Now on SPRaid
[root@spraid root]#  /opt/d-cache/bin/dcache-core stop
Shutting down dcache services:
Stopping gridftp-spraidDomain (pid=26016) 0 1 2 3 4 5 6 7 Done
Pid File (/opt/d-cache/config/lastPid.gsidcap-spraid) doesn't contain valid PID
Stopping srm-spraidDomain (pid=26106) 0 1 2 3 4 5 6 7 Done
Pid File (/opt/d-cache/config/lastPid.replica) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.utility) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.httpd) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.infoProvider) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.pnfs) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.adminDoor) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.door) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.dir) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.dCache) doesn't contain valid PID
Pid File (/opt/d-cache/config/lastPid.lm) doesn't contain valid PID

[root@spraid root]#  /opt/d-cache/bin/dcache-pool stop

Shutting down dcache pool: Stopping spraidDomain (pid=26283) 0 1 2 3 4 5 6 7 Done
Compressing the logs
[root@spdc00 log]# cd /var/log
[root@spdc00 log]# gzip srm-spdc00Domain.log
[root@spdc00 log]# mv srm-spdc00Domain.log.gz srm-spdc00Domain.log.gz.3
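
An alternative to the gzip and mv steps above is to write the compressed copy first and then truncate the log in place, so the original file is never unlinked. This is only a sketch: compress_and_truncate is a hypothetical helper, and it assumes nothing is writing to the log while it runs (as here, with the services stopped).

```shell
#!/bin/sh
# Hypothetical helper: compress LOG into LOG.gz.SUFFIX, then empty LOG in
# place. Because the file is truncated rather than deleted, a process that
# still has it open does not keep the old contents pinned on disk.
compress_and_truncate() {
    log=$1
    suffix=$2
    # Write the compressed copy; on failure remove the partial archive
    # and leave the original log untouched.
    gzip -c "$log" > "$log.gz.$suffix" || { rm -f "$log.gz.$suffix"; return 1; }
    # Truncate the original to zero bytes, keeping its name and inode.
    : > "$log"
}
```

For example, `compress_and_truncate /var/log/srm-spdc00Domain.log 3` would produce the same srm-spdc00Domain.log.gz.3 name used above.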
Now let's start the services again. First PhEDEx
[root@spdc00 root]# su - phedex
[phedex@spdc00 phedex]$  Master -config ~/SITECONF/local/PhEDEx/Config.Prod start
FileDownload: pid 4069 started in /home/phedex/state/download-master-prod
FileDiskExport: pid 4075 started in /home/phedex/state/exp-disk-prod
InfoDropStatus: pid 4081 started in /home/phedex/state/info-ds-prod
FilePFNExport: pid 4087 started in /home/phedex/state/exp-pfn-prod
As root
[root@spdc00 root]#  /opt/pnfs/bin/pnfs start
Starting pnfs services (PostgreSQL version):
 Shmcom : Installed 8 Clients and 8 Servers
 Starting database server for admin (/opt/pnfsdb/pnfs/databases/admin) ... O.K.
 Starting database server for data1 (/opt/pnfsdb/pnfs/databases/data1) ... O.K.
 Waiting for dbservers to register ... Ready
 Starting Mountd : pmountd
 Starting nfsd : pnfsd

[root@spdc00 root]#  /opt/d-cache/bin/dcache-core start
Starting dcache services:
Starting lmDomain  6 5 4 3 2 1 0 Done (pid=4302)
Starting dCacheDomain  6 5 4 3 2 1 0 Done (pid=4365)
Starting dirDomain  6 5 4 3 2 1 0 Done (pid=4442)
Starting doorDomain  6 5 4 3 2 1 0 Done (pid=4515)
Starting adminDoorDomain  6 5 4 3 2 1 0 Done (pid=4591)
Starting httpdDomain  6 5 4 3 2 1 0 Done (pid=4677)
Starting utilityDomain  6 5 4 3 2 1 0 Done (pid=4768)
Starting pnfsDomain  6 5 4 3 2 1 0 Done (pid=4862)
Starting infoProviderDomain  6 5 4 3 2 1 0 Done (pid=4952)
Starting srm-spdc00Domain  6 5 4 3 2 1 0 Done (pid=5049)
Back on SPRaid
[root@spraid root]#  /opt/d-cache/bin/dcache-core start
Starting dcache services:
Starting gridftp-spraidDomain  6 5 4 3 2 1 0 Done (pid=12023)
Starting srm-spraidDomain  6 5 4 3 2 1 0 Done (pid=12113)

[root@spraid root]#  /opt/d-cache/bin/dcache-pool start

Starting dcache pool: Starting spraidDomain  6 5 4 3 2 1 0 Done (pid=12290)
Let's check the space:
[root@spdc00 log]# df -h
/dev/sda5             2.0G  1.8G   72M  97% /var
It got worse! Better to move the compressed logs somewhere else:
[root@spdc00 log]# mv /var/log/srm-spdc00Domain.log.gz.* /home/mdias/.
[root@spdc00 log]# df -h
/dev/sda5             2.0G  1.6G  276M  86% /var
That is still too much. However, a tail process is still holding the deleted log open:
[root@spdc00 log]# lsof /dev/sda5
tail     1635  mdias    3r   REG    8,5 1278886709 131165 /var/log/srm-spdc00Domain.log (deleted)
[root@spdc00 log]# kill -9 1635
[root@spdc00 log]# df -h
/dev/sda5             2.0G  417M  1.5G  22% /var
[root@spdc00 log]# mv /home/mdias/srm-spdc00Domain.log.gz.* /var/log/.
[root@spdc00 log]# df -h
Sorry! I should have closed the monitoring tail beforehand. Silly mistake!
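
The lsof line above shows why df did not improve at first: the tail process (pid 1635) still held the deleted log open, about 1.2 GiB going by the size column. A small filter along these lines could flag such files before compressing. This is a sketch only: deleted_open_files is a hypothetical helper, and it assumes the lsof column layout shown above (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE, NODE, NAME).

```shell
#!/bin/sh
# Hypothetical helper: reads lsof output on stdin and prints
# "PID COMMAND SIZE" for every open file that lsof marks as "(deleted)".
# Assumes the column order seen in the listing above.
deleted_open_files() {
    awk '$NF == "(deleted)" { print $2, $1, $7 }'
}
```

It would be used as `lsof /dev/sda5 | deleted_open_files`; any process listed is keeping disk space allocated until it closes the file or exits.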

In the PhEDEx monitoring, everything is UP in the Production instance. In dCache, both Cell Services and Pool Usage are OK.


Topic revision: r1 - 2006-10-11 - MarcoAndreFerreiraDias

