Jobsub ID 263020.0@justin-prod-sched01.dune.hep.ac.uk
Jobsub ID | 263020.0@justin-prod-sched01.dune.hep.ac.uk |
Workflow Testing | Yes |
Workflow ID | 1 |
Stage ID | 1 |
User name | amcnab@fnal.gov |
HTCondor Group | group_dune.prod_mcsim |
Requested | Processors | 1 |
Requested | RSS bytes | 1073741824 (1024 MiB) |
Requested | Wall seconds limit | 3600 (1 hour) |
Submitted time | 2024-09-26 07:17:54 |
Site | UK_Brunel |
Entry | CMSHTPC_T2_UK_London_Brunel_dc2_22 |
Last heartbeat | 2024-09-26 08:32:14 |
From worker node | Hostname | wn-b8-29-00.brunel.ac.uk |
From worker node | cpuinfo | Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz |
From worker node | OS release | Scientific Linux release 7.9 (Nitrogen) |
From worker node | Processors | 1 |
From worker node | RSS bytes | 1073741824 (1024 MiB) |
From worker node | Wall seconds limit | 171000 (47 hours) |
From worker node | Inner Apptainer? | True |
Job state | finished |
Allocator name | justin-allocator-pro.dune.hep.ac.uk |
Started | 2024-09-26 07:19:02 |
Input files | |
Jobscript | Exit code | 0 |
Jobscript | Real time | 1h (3740s) |
Jobscript | CPU time | 0m (10s = 0%) |
Outputting started | 2024-09-26 08:32:03 |
Output files | |
Finished | 2024-09-26 08:32:14 |
Saved logs | justin-logs:263020.0-justin-prod-sched01.dune.hep.ac.uk.logs.tgz |
List job events Wrapper job log |
Jobscript log (last 10,000 characters)
ped connection: dune-rucio.fnal.gov
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /dids/testpro/awt-uploads-202439/dids HTTP/1.1" 201 7
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dune-rucio.fnal.gov:443
2024-09-26 09:17:01,795 ERROR ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
ERROR:baseclient:ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /replicas/list HTTP/1.1" 503 299
2024-09-26 09:17:17,301 WARNING Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (3): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /replicas/list HTTP/1.1" 503 299
2024-09-26 09:17:33,439 WARNING Waiting 1.0s due to reason: server returned 503
WARNING:baseclient:Waiting 1.0s due to reason: server returned 503
--- Upload try 1/1
--- Rucio upload 1/1 returns 0
--- Replica check try 1/1
--- Rucio list_replicas call fails: An unknown exception occurred.
Details: no error information passed (http status code: 503)
--- No replica in Rucio, exit 98
'justin-rucio-upload --rse RAL-PP --protocol davs --scope testpro --dataset awt-uploads-202439 awt-1727335145-to60HFkFJ3 --timeout 1200' returns 98
---------------------------------------------------------------------
UK_Brunel RAL_ECHO davs root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt
'xrdcp --force --nopbar --verbose root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt downloaded.txt' returns 0
GFAL_CONFIG_DIR: GFAL_PLUGIN_DIR:
justin-rucio-upload attempt 1
DEBUG:root:Num. of files that upload client is processing: 1
DEBUG:dogpile.cache.region:No value present for key: "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:NeedRegenerationException
DEBUG:dogpile.lock:no value, waiting for create lock
DEBUG:dogpile.lock:value creation lock <dogpile.cache.region.CacheRegion._LockWrapper object at 0x1480cd8478e0> acquired
DEBUG:dogpile.cache.region:No value present for key: "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Calling creation function for not-yet-present value
DEBUG:dogpile.cache.region:Cache value generated in 0.000 seconds for key(s): "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Released creation lock
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/?expression=RAL_ECHO HTTP/1.1" 503 299
2024-09-26 09:17:51,498 WARNING Waiting 0.25s due to reason: server returned 503
WARNING:baseclient:Waiting 0.25s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/?expression=RAL_ECHO HTTP/1.1" 503 299
2024-09-26 09:18:07,294 WARNING Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (3): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/?expression=RAL_ECHO HTTP/1.1" 503 299
2024-09-26 09:18:23,756 WARNING Waiting 1.0s due to reason: server returned 503
WARNING:baseclient:Waiting 1.0s due to reason: server returned 503
--- Upload try 1/1
--- Rucio upload 1/1 fails: An unknown exception occurred.
Details: no error information passed (http status code: 503)
--- Exit with 99
'justin-rucio-upload --rse RAL_ECHO --protocol davs --scope testpro --dataset awt-uploads-202439 awt-1727335145-3t86zLbVqe --timeout 1200' returns 99
---------------------------------------------------------------------
UK_Brunel SURFSARA davs root://penguin12.grid.surfsara.nl:21094/pnfs/grid.sara.nl/data/dune/disk/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt
'xrdcp --force --nopbar --verbose root://penguin12.grid.surfsara.nl:21094/pnfs/grid.sara.nl/data/dune/disk/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt downloaded.txt' returns 0
GFAL_CONFIG_DIR: GFAL_PLUGIN_DIR:
justin-rucio-upload attempt 1
DEBUG:root:Num. of files that upload client is processing: 1
DEBUG:dogpile.cache.region:No value present for key: "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:NeedRegenerationException
DEBUG:dogpile.lock:no value, waiting for create lock
DEBUG:dogpile.lock:value creation lock <dogpile.cache.region.CacheRegion._LockWrapper object at 0x14ecbbbc78e0> acquired
DEBUG:dogpile.cache.region:No value present for key: "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Calling creation function for not-yet-present value
DEBUG:dogpile.cache.region:Cache value generated in 0.000 seconds for key(s): "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Released creation lock
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/?expression=SURFSARA HTTP/1.1" 503 299
2024-09-26 09:20:56,111 WARNING Waiting 0.25s due to reason: server returned 503
WARNING:baseclient:Waiting 0.25s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/?expression=SURFSARA HTTP/1.1" 503 299
2024-09-26 09:21:12,044 WARNING Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (3): dune-rucio.fnal.gov:443
2024-09-26 09:21:23,479 ERROR ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
ERROR:baseclient:ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
--- Upload try 1/1
--- Rucio upload 1/1 fails: An unknown exception occurred.
Details: no error information passed (http status code: 503)
--- Exit with 99
'justin-rucio-upload --rse SURFSARA --protocol davs --scope testpro --dataset awt-uploads-202439 awt-1727335145-JG7QqNQqEs --timeout 1200' returns 99
subject : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=219829630/CN=172733514290
issuer : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=219829630
identity : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=219829630
type : RFC compliant proxy
strength : 2048 bits
path : /home/awt-proxy.pem
timeleft : 166:57:39
key usage : Digital Signature, Key Encipherment, Key Agreement
=== VO dune extension information ===
VO : dune
subject : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk
issuer : /DC=org/DC=incommon/C=US/ST=Illinois/O=Fermi Research Alliance/CN=voms1.fnal.gov
attribute : /dune/Role=Production/Capability=NULL
attribute : /dune/Role=NULL/Capability=NULL
timeleft : 162:57:40
uri : voms1.fnal.gov:15042
===== Results =====
Download/upload commands:
xrdcp --force --nopbar --verbose $read_pfn downloaded.txt
justin-rucio-upload --rse $rse_name --protocol $write_protocol --scope testpro --dataset --timeout 1200 FILENAME
Use the wrapper job link on the page for the job on the justIN Dashboard to find the full log file, with errors from these commands
Each line: $JUSTIN_SITE_NAME $rse_name $download_retval $upload_retval $read_pfn $write_protocol
==awt== UK_Brunel DUNE_CERN_EOS 0 99 root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel DUNE_ES_PIC 51 99 root://xrootd.pic.es:1094/pnfs/pic.es/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel DUNE_FR_CCIN2P3_DISK 0 99 root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel DUNE_UK_LANCASTER_CEPH 0 99 root://xgate.hec.lancs.ac.uk:1094//cephfs/grid/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel DUNE_US_BNL_SDCC 0 99 root://dcdndoor.sdcc.bnl.gov:1094//pnfs/sdcc.bnl.gov/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel DUNE_US_FNAL_DISK_STAGE 0 99 root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel MANCHESTER 54 99 root://bohr3226.tier2.hep.manchester.ac.uk:1094//dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt root
==awt== UK_Brunel NIKHEF 0 99 root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel PRAGUE 0 99 root://golias100.farm.particle.cz:1094/dpm/farm.particle.cz/home/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel QMUL 51 99 root://xrootd01.esc.qmul.ac.uk:1094//dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel RAL-PP 0 98 root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel RAL_ECHO 0 99 root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel SURFSARA 0 99 root://penguin12.grid.surfsara.nl:21094/pnfs/grid.sara.nl/data/dune/disk/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
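
The ==awt== result lines follow the field order given in the "Each line:" description above, so they can be split on whitespace and summarised. The following is a minimal sketch rather than part of justIN: the names AwtResult and parse_awt_line are hypothetical, and treating any non-zero return value as a failure is an assumption, since the log does not define the individual codes.

```python
from typing import NamedTuple, Optional


class AwtResult(NamedTuple):
    """One ==awt== line; field order follows the 'Each line:' description."""
    site: str
    rse: str
    download_retval: int
    upload_retval: int
    read_pfn: str
    write_protocol: str


def parse_awt_line(line: str) -> Optional[AwtResult]:
    """Return an AwtResult, or None if the line is not a well-formed ==awt== line."""
    parts = line.split()
    if len(parts) != 7 or parts[0] != "==awt==":
        return None
    _, site, rse, down, up, pfn, proto = parts
    try:
        return AwtResult(site, rse, int(down), int(up), pfn, proto)
    except ValueError:
        # Return values were not integers, so the line is not usable.
        return None


# Two lines copied from the results above.
sample = """\
==awt== UK_Brunel DUNE_ES_PIC 51 99 root://xrootd.pic.es:1094/pnfs/pic.es/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Brunel RAL-PP 0 98 root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
"""

results = [r for r in map(parse_awt_line, sample.splitlines()) if r is not None]

# Assumption: a non-zero return value means the step failed.
failed_downloads = [r.rse for r in results if r.download_retval != 0]
failed_uploads = [r.rse for r in results if r.upload_retval != 0]

print("download failures:", failed_downloads)
print("upload failures:", failed_uploads)
```

Feeding the full jobscript log through parse_awt_line line by line skips everything that is not a result line, because non-matching lines return None.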