21 July 2025: This instance at RAL is read-only. Please do not try submitting new workflows for now.
Jobsub ID 211618.0@justin-prod-sched02.dune.hep.ac.uk
Jobsub ID | 211618.0@justin-prod-sched02.dune.hep.ac.uk | |
Workflow Testing | Yes | |
Workflow ID | 1 | |
Stage ID | 1 | |
User name | amcnab@fnal.gov | |
HTCondor Group | group_dune.prod_mcsim | |
Requested | Processors | 1 |
Requested | GPU | No |
Requested | RSS bytes | 1073741824 (1024 MiB) |
Requested | Wall seconds limit | 3600 (1 hours) |
Submitted time | 2025-05-29 22:25:15 | |
Site | UK_Imperial | |
Entry | DUNE_T2_UK_London_IC_ceprod01 | |
Last heartbeat | 2025-05-29 23:44:22 | |
From worker node | Hostname | wj58.grid.hep.ph.ic.ac.uk |
From worker node | cpuinfo | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz |
From worker node | OS release | Scientific Linux release 7.9 (Nitrogen) |
From worker node | Processors | 1 |
From worker node | RSS bytes | 1073741824 (1024 MiB) |
From worker node | Wall seconds limit | 171000 (47 hours) |
From worker node | GPU | |
From worker node | Inner Apptainer? | True |
Job state | finished | |
Allocator name | justin-allocator-pro.dune.hep.ac.uk | |
Started | 2025-05-29 22:39:28 | |
Input files | ||
Jobscript | Exit code | 0 |
Jobscript | Real time | 1h (3708s) |
Jobscript | CPU time | 0m (55s = 1%) |
Jobscript | Max RSS bytes | 64204800 (61 MiB) |
Outputting started | 2025-05-29 23:41:17 | |
Output files | ||
Finished | 2025-05-29 23:44:22 | |
Saved logs | justin-logs:211618.0-justin-prod-sched02.dune.hep.ac.uk.logs.tgz | |
Jobscript log (last 10,000 characters)
G Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (3): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /dids/testpro/awt-uploads-202521/files HTTP/1.1" 200 None
--- Upload try 1/1
--- Rucio upload 1/1 returns 0
--- Replica check try 1/1
--- Dataset awt-uploads-202521 check try 1/1
--- Upload, replicas, and datasets checks passed
'justin-rucio-upload --rse SURFSARA --protocol davs --scope testpro --dataset awt-uploads-202521 awt-1748558384-GpA68lTJ4X --timeout 1200' returns 0
---------------------------------------------------------------------
UK_Imperial T3_US_NERSC davs root://dtn14.nersc.gov:1094//global/cfs/cdirs/m3249/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt
'xrdcp --force --nopbar --verbose root://dtn14.nersc.gov:1094//global/cfs/cdirs/m3249/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt downloaded.txt' returns 0
{
    "created_timestamp": null,
    "creator": "dunepro",
    "fid": "DbCGZ26DSsWbXHGs",
    "metadata": {},
    "name": "awt-1748558384-HqQ3FooANG",
    "namespace": "testpro",
    "retired": false,
    "retired_by": null,
    "retired_timestamp": null,
    "size": 0,
    "updated_by": null,
    "updated_timestamp": null
}
metacat file declare returns 0
GFAL_CONFIG_DIR:
GFAL_PLUGIN_DIR:
justin-rucio-upload attempt 1
DEBUG:root:Num. of files that upload client is processing: 1
DEBUG:dogpile.cache.region:No value present for key: "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:NeedRegenerationException
DEBUG:dogpile.lock:no value, waiting for create lock
DEBUG:dogpile.lock:value creation lock <dogpile.cache.region.CacheRegion._LockWrapper object at 0x14c819bc02b0> acquired
DEBUG:dogpile.cache.region:No value present for key: "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Calling creation function for not-yet-present value
DEBUG:dogpile.cache.region:Cache value generated in 0.000 seconds for key(s): "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Released creation lock
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dune-rucio.fnal.gov:443
2025-05-30 00:38:46,914 ERROR ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
ERROR:baseclient:ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/?expression=T3_US_NERSC HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/T3_US_NERSC HTTP/1.1" 200 1240
DEBUG:root:Input validation done.
INFO:root:Preparing upload for file awt-1748558384-HqQ3FooANG
DEBUG:urllib3.connectionpool:Resetting dropped connection: dune-rucio.fnal.gov
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/T3_US_NERSC/attr/ HTTP/1.1" 503 299
2025-05-30 00:39:54,779 WARNING Waiting 0.25s due to reason: server returned 503
WARNING:baseclient:Waiting 0.25s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (3): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/T3_US_NERSC/attr/ HTTP/1.1" 503 299
2025-05-30 00:40:10,667 WARNING Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (4): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /rses/T3_US_NERSC/attr/ HTTP/1.1" 503 299
2025-05-30 00:40:26,934 WARNING Waiting 1.0s due to reason: server returned 503
WARNING:baseclient:Waiting 1.0s due to reason: server returned 503
WARNING:root:Attributes of the RSE: T3_US_NERSC not available.
DEBUG:root:wan domain is used for the upload
DEBUG:root:Registering file
DEBUG:dogpile.lock:value creation lock <dogpile.cache.region.CacheRegion._LockWrapper object at 0x14c819bf7520> acquired
DEBUG:dogpile.lock:Calling creation function for previously expired value
DEBUG:dogpile.cache.region:Cache value generated in 0.000 seconds for key(s): "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Released creation lock
DEBUG:urllib3.connectionpool:Resetting dropped connection: dune-rucio.fnal.gov
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /accounts/dunepro/scopes/ HTTP/1.1" 503 299
2025-05-30 00:40:43,531 WARNING Waiting 0.25s due to reason: server returned 503
WARNING:baseclient:Waiting 0.25s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (5): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /accounts/dunepro/scopes/ HTTP/1.1" 503 299
2025-05-30 00:40:59,507 WARNING Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (6): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "GET /accounts/dunepro/scopes/ HTTP/1.1" 503 299
2025-05-30 00:41:15,797 WARNING Waiting 1.0s due to reason: server returned 503
WARNING:baseclient:Waiting 1.0s due to reason: server returned 503
--- Upload try 1/1
--- Rucio upload 1/1 fails: An unknown exception occurred. Details: no error information passed (http status code: 503)
--- Exit with 99
'justin-rucio-upload --rse T3_US_NERSC --protocol davs --scope testpro --dataset awt-uploads-202521 awt-1748558384-HqQ3FooANG --timeout 1200' returns 99
subject : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=2563265061/CN=174855836856
issuer : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=2563265061
identity : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=2563265061
type : RFC compliant proxy
strength : 2048 bits
path : /home/awt-proxy.pem
timeleft : 166:58:12
key usage : Digital Signature, Key Encipherment, Key Agreement
=== VO dune extension information ===
VO : dune
subject : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk
issuer : /DC=org/DC=incommon/C=US/ST=Illinois/O=Fermi Research Alliance/CN=voms2.fnal.gov
attribute : /dune/Role=Production/Capability=NULL
attribute : /dune/Role=NULL/Capability=NULL
timeleft : 148:05:46
uri : voms2.fnal.gov:15042
===== Results =====
Download/upload commands:
xrdcp --force --nopbar --verbose $read_pfn downloaded.txt
echo '{"namespace":"testpro","name":"FILENAME","size":0}' >tmp.json
metacat file declare --json -f tmp.json "dune:all"
justin-rucio-upload --rse $rse_name --protocol $write_protocol --scope testpro --dataset awt-uploads-202521 --timeout 1200 FILENAME
Use the wrapper job link on the page for the job on the justIN Dashboard to find the full log file, with errors from these commands
Each line: $JUSTIN_SITE_NAME $rse_name $download_retval $upload_retval $read_pfn $write_protocol
==awt== UK_Imperial DUNE_CA_SFU 0 0 root://lcg-dunese1.sfu.computecanada.ca:1094//dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_CERN_EOS 0 0 root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_ES_PIC 0 0 root://xrootd.pic.es:1094/pnfs/pic.es/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_FR_CCIN2P3_DISK 0 0 root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_IT_INFN_CNAF 51 99 root://xrootd-archive.cr.cnaf.infn.it:1096//dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_UK_GLASGOW 52 99 root://cephc02.gla.scotgrid.ac.uk:1094//cephfs/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_UK_LANCASTER_CEPH 0 0 root://xgate.hec.lancs.ac.uk:1094//cephfs/grid/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_UK_MANCHESTER_CEPH 0 99 root://meitner.tier2.hep.manchester.ac.uk:1094//cephfs/experiments/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_US_BNL_SDCC 0 99 root://dcdndoor.sdcc.bnl.gov:1094//pnfs/sdcc.bnl.gov/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial DUNE_US_FNAL_DISK_STAGE 0 0 root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial NIKHEF 0 0 root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial PRAGUE 0 0 root://golias100.farm.particle.cz:1094/dpm/farm.particle.cz/home/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial QMUL 52 99 root://xrootd1.esc.qmul.ac.uk:1094//dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial RAL-PP 0 0 root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial RAL_ECHO 0 99 root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial SURFSARA 0 0 root://otter12.grid.surfsara.nl:21094/pnfs/grid.sara.nl/data/dune/disk/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== UK_Imperial T3_US_NERSC 0 99 root://dtn14.nersc.gov:1094//global/cfs/cdirs/m3249/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
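The `==awt==` result records in the log follow the field order given by its "Each line:" note ($JUSTIN_SITE_NAME, $rse_name, $download_retval, $upload_retval, $read_pfn, $write_protocol). As a minimal sketch of how they could be summarised, the Python below splits the log text on whitespace and reads the six fields after each `==awt==` marker, so it works whether or not the records sit on separate lines. The names `AwtResult`, `parse_awt_results` and `SAMPLE` are hypothetical and not part of justIN, and treating any non-zero return value as a failure is an assumption.

```python
from typing import List, NamedTuple


class AwtResult(NamedTuple):
    """One ==awt== record, in the field order stated by the log itself."""
    site: str
    rse: str
    download_retval: int
    upload_retval: int
    read_pfn: str
    write_protocol: str


def parse_awt_results(log_text: str) -> List[AwtResult]:
    """Collect every ==awt== record found in the log text.

    Tokenising on whitespace means the parser does not depend on
    where (or whether) line breaks survive in the captured log.
    """
    tokens = log_text.split()
    results = []
    for i, token in enumerate(tokens):
        # A complete record needs six fields after the marker.
        if token != "==awt==" or i + 6 >= len(tokens):
            continue
        site, rse, down, up, pfn, protocol = tokens[i + 1:i + 7]
        # Skip records whose return values are not plain integers.
        if not (down.isdigit() and up.isdigit()):
            continue
        results.append(AwtResult(site, rse, int(down), int(up), pfn, protocol))
    return results


# Three records copied from the log above, joined on one line on purpose.
SAMPLE = (
    "==awt== UK_Imperial DUNE_CA_SFU 0 0 "
    "root://lcg-dunese1.sfu.computecanada.ca:1094//dune/RSE/testpro/bb/7f/"
    "awt-download-2023-03-07-01.txt davs "
    "==awt== UK_Imperial QMUL 52 99 "
    "root://xrootd1.esc.qmul.ac.uk:1094//dune/RSE/testpro/bb/7f/"
    "awt-download-2023-03-07-01.txt davs "
    "==awt== UK_Imperial T3_US_NERSC 0 99 "
    "root://dtn14.nersc.gov:1094//global/cfs/cdirs/m3249/dune/RSE/testpro/bb/7f/"
    "awt-download-2023-03-07-01.txt davs"
)

results = parse_awt_results(SAMPLE)

# Assumption: a non-zero return value marks a failed download or upload.
failed_downloads = [r.rse for r in results if r.download_retval != 0]
failed_uploads = [r.rse for r in results if r.upload_retval != 0]

print("failed downloads:", failed_downloads)
print("failed uploads:", failed_uploads)
```

Run against the full log, the same function would separate storage elements where only the upload failed (for example T3_US_NERSC, where the Rucio server returned 503) from those where the download failed as well (for example QMUL).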