21 July 2025: This instance at RAL is read-only. Please do not try submitting new workflows for now.
Jobsub ID 418908.0@justin-prod-sched01.dune.hep.ac.uk
Jobsub ID | 418908.0@justin-prod-sched01.dune.hep.ac.uk |
Workflow Testing | Yes |
Workflow ID | 1 |
Stage ID | 1 |
User name | amcnab@fnal.gov |
HTCondor Group | group_dune.prod_mcsim |
Requested | Processors | 1 |
Requested | GPU | No |
Requested | RSS bytes | 1073741824 (1024 MiB) |
Requested | Wall seconds limit | 3600 (1 hours) |
Submitted time | 2025-06-20 16:36:21 |
Site | CH_UNIBE-LHEP |
Entry | OSG_CH_UNIBE_LHEP_ce02 |
Last heartbeat | 2025-06-20 17:52:50 |
From worker node | Hostname | wn-1-36.local |
From worker node | cpuinfo | Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz |
From worker node | OS release | Scientific Linux release 7.9 (Nitrogen) |
From worker node | Processors | 1 |
From worker node | RSS bytes | 1073741824 (1024 MiB) |
From worker node | Wall seconds limit | 86400 (24 hours) |
From worker node | GPU | |
From worker node | Inner Apptainer? | True |
Job state | stalled |
Allocator name | justin-allocator-pro.dune.hep.ac.uk |
Started | 2025-06-20 16:46:32 |
Input files | |
Outputting started | 2025-06-20 17:52:50 |
Output files | |
Finished | 2025-06-20 18:54:50 |
List job events | Wrapper job log
Jobscript log (last 10,000 characters)
th/x509_proxy HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): dune-rucio.fnal.gov:443
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /replicas HTTP/1.1" 201 7
INFO:root:Successfully added replica in Rucio catalogue at T3_US_NERSC
DEBUG:root:gfal.NoRename: connecting to storage
DEBUG:root:Checking if davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u exists
DEBUG:root:gfal.NoRename: checking if file exists davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:gfal.NoRename: closing protocol connection
DEBUG:root:[{'hostname': 'dtn14.nersc.gov', 'scheme': 'root', 'port': 1094, 'prefix': '//global/cfs/cdirs/m3249/dune/RSE', 'impl': 'rucio.rse.protocols.gfal.NoRename', 'domains': {'lan': {'read': 10, 'write': 10, 'delete': 10}, 'wan': {'read': 10, 'write': 10, 'delete': 10, 'third_party_copy_read': 0, 'third_party_copy_write': 0}}, 'extended_attributes': None}, {'hostname': 'dtn14.nersc.gov', 'scheme': 'davs', 'port': 1094, 'prefix': '/global/cfs/cdirs/m3249/dune/RSE', 'impl': 'rucio.rse.protocols.gfal.NoRename', 'domains': {'lan': {'read': 1, 'write': 1, 'delete': 1}, 'wan': {'read': 1, 'write': 1, 'delete': 1, 'third_party_copy_read': 1, 'third_party_copy_write': 1}}, 'extended_attributes': None}]
INFO:root:Trying upload with davs to T3_US_NERSC
DEBUG:root:Processing upload with the domain: wan
DEBUG:root:gfal.NoRename: connecting to storage
DEBUG:root:The PFN created from the LFN: davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:gfal.NoRename: checking if file exists davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:gfal.NoRename: checking if file exists davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:put: Attempt 1
DEBUG:root:gfal.NoRename: uploading file from awt-1750437998-XrcuPs6s1u to davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
INFO:root:Successful upload of temporary file. davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:skip_upload_stat=False
DEBUG:root:stat: pfn=davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:gfal.NoRename: getting stats of file davs://dtn14.nersc.gov:1094/global/cfs/cdirs/m3249/dune/RSE/testpro/83/4f/awt-1750437998-XrcuPs6s1u
DEBUG:root:Filesize: Expected=26 Found=26
DEBUG:root:Checksum: Expected=6371077a Found=6371077a
DEBUG:root:gfal.NoRename: closing protocol connection
DEBUG:root:Upload done.
INFO:root:Successfully uploaded file awt-1750437998-XrcuPs6s1u
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): dune-rucio.fnal.gov:443
/cvmfs/dune.opensciencegrid.org/products/dune/rucio/v37_1_0_post1/NULL/lib/python3.9/site-packages/urllib3/connectionpool.py:1061: InsecureRequestWarning: Unverified HTTPS request is being made to host 'dune-rucio.fnal.gov'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /traces/ HTTP/1.1" 404 207
DEBUG:dogpile.lock:value creation lock <dogpile.cache.region.CacheRegion._LockWrapper object at 0x2b1067b4d220> acquired
DEBUG:dogpile.lock:Calling creation function for previously expired value
DEBUG:dogpile.cache.region:Cache value generated in 0.000 seconds for key(s): "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Released creation lock
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "PUT /replicas HTTP/1.1" 504 247
2025-06-20 19:50:11,247 WARNING Waiting 0.25s due to reason: server returned 504
WARNING:baseclient:Waiting 0.25s due to reason: server returned 504
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "PUT /replicas HTTP/1.1" 200 0
DEBUG:dogpile.lock:value creation lock <dogpile.cache.region.CacheRegion._LockWrapper object at 0x2b106447f640> acquired
DEBUG:dogpile.lock:Calling creation function for previously expired value
DEBUG:dogpile.cache.region:Cache value generated in 0.000 seconds for key(s): "host_to_choose_choice['https://dune-rucio.fnal.gov']"
DEBUG:dogpile.lock:Released creation lock
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /dids/testpro/awt-uploads-202524/dids HTTP/1.1" 503 299
2025-06-20 19:51:16,498 WARNING Waiting 0.25s due to reason: server returned 503
WARNING:baseclient:Waiting 0.25s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Resetting dropped connection: dune-rucio.fnal.gov
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /dids/testpro/awt-uploads-202524/dids HTTP/1.1" 503 299
2025-06-20 19:51:32,608 WARNING Waiting 0.5s due to reason: server returned 503
WARNING:baseclient:Waiting 0.5s due to reason: server returned 503
DEBUG:urllib3.connectionpool:Resetting dropped connection: dune-rucio.fnal.gov
DEBUG:urllib3.connectionpool:https://dune-rucio.fnal.gov:443 "POST /dids/testpro/awt-uploads-202524/dids HTTP/1.1" 504 247
2025-06-20 19:52:46,951 WARNING Waiting 1.0s due to reason: server returned 504
WARNING:baseclient:Waiting 1.0s due to reason: server returned 504
ERROR:root:Failed to attach file to the dataset
DEBUG:root:Attaching to dataset
An unknown exception occurred. Details: no error information passed (http status code: 504)
--- Upload try 1/1 ---
Rucio upload 1/1 fails: None of the given files have been uploaded.
--- Exit with 99
'justin-rucio-upload --rse T3_US_NERSC --protocol davs --scope testpro --dataset awt-uploads-202524 awt-1750437998-XrcuPs6s1u --timeout 1200' returns 99
subject : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=3207056340/CN=175043799217
issuer : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=3207056340
identity : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk/CN=3207056340
type : RFC compliant proxy
strength : 2048 bits
path : /home/awt-proxy.pem
timeleft : 166:53:44
key usage : Digital Signature, Key Encipherment, Key Agreement
=== VO dune extension information ===
VO : dune
subject : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=justin-jobs-production.dune.hep.ac.uk
issuer : /DC=org/DC=incommon/C=US/ST=Illinois/O=Fermi Research Alliance/CN=voms2.fnal.gov
attribute : /dune/Role=Production/Capability=NULL
attribute : /dune/Role=NULL/Capability=NULL
timeleft : 153:53:14
uri : voms2.fnal.gov:15042

===== Results =====

Download/upload commands:
xrdcp --force --nopbar --verbose $read_pfn downloaded.txt
echo '{"namespace":"testpro","name":"FILENAME","size":0}' >tmp.json
metacat file declare --json -f tmp.json "dune:all"
justin-rucio-upload --rse $rse_name --protocol $write_protocol --scope testpro --dataset awt-uploads-202524 --timeout 1200 FILENAME

Use the wrapper job link on the page for the job on the justIN Dashboard to find the full log file, with errors from these commands

Each line: $JUSTIN_SITE_NAME $rse_name $download_retval $upload_retval $read_pfn $write_protocol
==awt== CH_UNIBE-LHEP DUNE_CA_SFU 0 0 root://lcg-dunese1.sfu.computecanada.ca:1094//dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_CERN_EOS 0 98 root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_ES_PIC 51 99 root://xrootd.pic.es:1094/pnfs/pic.es/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_FR_CCIN2P3_DISK 0 1 root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_IT_INFN_CNAF 51 99 root://xrootd-archive.cr.cnaf.infn.it:1096//dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_UK_GLASGOW 0 0 root://cephc02.gla.scotgrid.ac.uk:1094//cephfs/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_UK_LANCASTER_CEPH 0 0 root://xgate.hec.lancs.ac.uk:1094//cephfs/grid/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_UK_MANCHESTER_CEPH 0 99 root://meitner.tier2.hep.manchester.ac.uk:1094//cephfs/experiments/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_US_BNL_SDCC 0 0 root://dcdndoor.sdcc.bnl.gov:1094//pnfs/sdcc.bnl.gov/data/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP DUNE_US_FNAL_DISK_STAGE 0 0 root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP NIKHEF 0 0 root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP PRAGUE 0 0 root://golias100.farm.particle.cz:1094/dpm/farm.particle.cz/home/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP QMUL 0 0 root://xrootd1.esc.qmul.ac.uk:1094//dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP RAL-PP 0 0 root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP RAL_ECHO 0 0 root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP SURFSARA 0 0 root://otter12.grid.surfsara.nl:21094/pnfs/grid.sara.nl/data/dune/disk/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
==awt== CH_UNIBE-LHEP T3_US_NERSC 0 99 root://dtn14.nersc.gov:1094//global/cfs/cdirs/m3249/dune/RSE/testpro/bb/7f/awt-download-2023-03-07-01.txt davs
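
The results records above follow the field order given in the log (`$JUSTIN_SITE_NAME $rse_name $download_retval $upload_retval $read_pfn $write_protocol`), each introduced by the `==awt==` marker. As a rough sketch of how they could be tabulated, the Python below splits on that marker, so it should work whether the records sit one per line or run together. The names `AwtResult`, `parse_awt_records`, and `nonzero_results` are hypothetical and not part of justIN; the only assumption about the return values is that `0` means success, since the log does not document the other codes.

```python
from typing import List, NamedTuple


class AwtResult(NamedTuple):
    """One ==awt== record (hypothetical container, not a justIN type)."""
    site: str
    rse: str
    download_retval: int
    upload_retval: int
    read_pfn: str
    write_protocol: str


def parse_awt_records(text: str) -> List[AwtResult]:
    """Extract every ==awt== record from a log excerpt.

    Splitting on the marker, not on newlines, tolerates logs in which
    the records have been joined onto a single line.
    """
    results = []
    # Everything before the first marker is not a record, so skip it.
    for chunk in text.split("==awt==")[1:]:
        fields = chunk.split()
        if len(fields) < 6:
            continue  # truncated record
        site, rse, download, upload, pfn, protocol = fields[:6]
        try:
            results.append(
                AwtResult(site, rse, int(download), int(upload), pfn, protocol)
            )
        except ValueError:
            continue  # return values were not integers
    return results


def nonzero_results(results: List[AwtResult]) -> List[AwtResult]:
    """Records whose download or upload return value is not 0
    (assumed here to mean the step did not succeed)."""
    return [r for r in results if r.download_retval != 0 or r.upload_retval != 0]


# Three records copied from the log above; the first two are joined on one line.
SAMPLE = (
    "==awt== CH_UNIBE-LHEP DUNE_CA_SFU 0 0 "
    "root://lcg-dunese1.sfu.computecanada.ca:1094//dune/RSE/testpro/bb/7f/"
    "awt-download-2023-03-07-01.txt davs "
    "==awt== CH_UNIBE-LHEP DUNE_ES_PIC 51 99 "
    "root://xrootd.pic.es:1094/pnfs/pic.es/data/dune/RSE/testpro/bb/7f/"
    "awt-download-2023-03-07-01.txt davs\n"
    "==awt== CH_UNIBE-LHEP T3_US_NERSC 0 99 "
    "root://dtn14.nersc.gov:1094//global/cfs/cdirs/m3249/dune/RSE/testpro/bb/7f/"
    "awt-download-2023-03-07-01.txt davs"
)

if __name__ == "__main__":
    for record in nonzero_results(parse_awt_records(SAMPLE)):
        print(record.rse, record.download_retval, record.upload_retval)
```

Run against the full log, a filter like this would list only the storage elements with a nonzero download or upload return value, which is a quicker starting point than scanning the records by eye.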