Jobsub ID 201113.35@justin-prod-sched02.dune.hep.ac.uk
Jobsub ID | 201113.35@justin-prod-sched02.dune.hep.ac.uk |
Workflow ID | 6731 |
Stage ID | 1 |
User name | calcuttj@fnal.gov |
HTCondor Group | group_dune |
Requested | Processors | 1 |
GPU | No |
RSS bytes | 4193255424 (3999 MiB) |
Wall seconds limit | 80000 (22 hours) |
Submitted time | 2025-05-08 18:54:57 |
Site | UK_Edinburgh |
Entry | DUNE_UK_SGridECDF_ce1 |
Last heartbeat | 2025-05-08 18:58:28 |
From worker node | Hostname | node2b23.ecdf.ed.ac.uk |
cpuinfo | Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz |
OS release | Scientific Linux release 7.9 (Nitrogen) |
Processors | 1 |
RSS bytes | 4193255424 (3999 MiB) |
Wall seconds limit | 171000 (47 hours) |
GPU | |
Inner Apptainer? | True |
Job state | jobscript_error |
Allocator name | justin-allocator-pro.dune.hep.ac.uk |
Started | 2025-05-08 18:56:37 |
Input files | monte-carlo-006731-000392 |
Jobscript | Exit code | 1 |
Real time | 0m (0s) |
CPU time | 0m (0s = 0%) |
Max RSS bytes | 0 (0 MiB) |
Outputting started | |
Output files | |
Finished | 2025-05-08 18:58:28 |
Saved logs | justin-logs:201113.35-justin-prod-sched02.dune.hep.ac.uk.logs.tgz |
Jobscript log (last 10,000 characters)
2314Z_000453.root to: root://ceph-svc21.gridpp.rl.ac.uk:1094/
[2025-05-08 19:57:57.452067 +0100][Debug ][XRootD ] 3. Retrying: root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/7c/6e/H4_v34b_1GeV_-27.7_1_20250430T222314Z_000453.root
[2025-05-08 19:57:57.452091 +0100][Debug ][ExDbgMsg ] [xrootd.echo.stfc.ac.uk:1094] Destroying MsgHandler: 0x327e1b0.
Error in <TNetXNGFile::Open>: [ERROR] Server responded with an error: [3011] File not found - too many attempts to gain dfs read access to the file
Error in <TFileMerger::AddFile>: cannot open file root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/7c/6e/H4_v34b_1GeV_-27.7_1_20250430T222314Z_000453.root
hadd exiting due to error in root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/7c/6e/H4_v34b_1GeV_-27.7_1_20250430T222314Z_000453.root
[2025-05-08 19:57:57.474766 +0100][Debug ][File ] [0x2a1dc60@root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/24/c0/H4_v34b_1GeV_-27.7_1_20250430T191733Z_000241.root?xrdcl.requuid=319f00a4-2fa7-4eec-ae67-58ef2adb8b8f] Sending a close command for handle 0x0 to pubstor2202.fnal.gov:20201
[2025-05-08 19:57:57.474832 +0100][Debug ][ExDbgMsg ] [pubstor2202.fnal.gov:20201] MsgHandler created: 0x32ac950 (message: kXR_close (handle: 0x00000000) ).
[2025-05-08 19:57:57.474906 +0100][Debug ][ExDbgMsg ] [pubstor2202.fnal.gov:20201] Moving MsgHandler: 0x32ac950 (message: kXR_close (handle: 0x00000000) ) from out-queu to in-queue.
[2025-05-08 19:57:57.581414 +0100][Debug ][ExDbgMsg ] [msg: 0x28c2900] Assigned MsgHandler: 0x32ac950.
[2025-05-08 19:57:57.581465 +0100][Debug ][ExDbgMsg ] [handler: 0x32ac950] Removed MsgHandler: 0x32ac950 from the in-queue.
[2025-05-08 19:57:57.581504 +0100][Debug ][ExDbgMsg ] [pubstor2202.fnal.gov:20201] Calling MsgHandler: 0x32ac950 (message: kXR_close (handle: 0x00000000) ) with status: [SUCCESS] .
[2025-05-08 19:57:57.581526 +0100][Debug ][File ] [0x2a1dc60@root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/24/c0/H4_v34b_1GeV_-27.7_1_20250430T191733Z_000241.root?xrdcl.requuid=319f00a4-2fa7-4eec-ae67-58ef2adb8b8f] Close returned from pubstor2202.fnal.gov:20201 with: [SUCCESS]
[2025-05-08 19:57:57.581543 +0100][Debug ][ExDbgMsg ] [pubstor2202.fnal.gov:20201] Destroying MsgHandler: 0x32ac950.
[2025-05-08 19:57:57.581660 +0100][Debug ][File ] [0x3241640@root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/47/52/H4_v34b_1GeV_-27.7_1_20250430T202155Z_004026.root?xrdcl.requuid=353c2e74-c040-4630-bdb5-ab4e8927430e] Sending a close command for handle 0x0 to pubstor2302.fnal.gov:22471
[2025-05-08 19:57:57.581681 +0100][Debug ][ExDbgMsg ] [pubstor2302.fnal.gov:22471] MsgHandler created: 0x32ac950 (message: kXR_close (handle: 0x00000000) ).
[2025-05-08 19:57:57.581731 +0100][Debug ][ExDbgMsg ] [pubstor2302.fnal.gov:22471] Moving MsgHandler: 0x32ac950 (message: kXR_close (handle: 0x00000000) ) from out-queu to in-queue.
[2025-05-08 19:58:14.168842 +0100][Debug ][ExDbgMsg ] [msg: 0x2ceb520] Assigned MsgHandler: 0x32ac950.
[2025-05-08 19:58:14.168887 +0100][Debug ][ExDbgMsg ] [handler: 0x32ac950] Removed MsgHandler: 0x32ac950 from the in-queue.
[2025-05-08 19:58:14.168925 +0100][Debug ][ExDbgMsg ] [pubstor2302.fnal.gov:22471] Calling MsgHandler: 0x32ac950 (message: kXR_close (handle: 0x00000000) ) with status: [SUCCESS] .
[2025-05-08 19:58:14.168936 +0100][Debug ][File ] [0x3241640@root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/47/52/H4_v34b_1GeV_-27.7_1_20250430T202155Z_004026.root?xrdcl.requuid=353c2e74-c040-4630-bdb5-ab4e8927430e] Close returned from pubstor2302.fnal.gov:22471 with: [SUCCESS]
[2025-05-08 19:58:14.168958 +0100][Debug ][ExDbgMsg ] [pubstor2302.fnal.gov:22471] Destroying MsgHandler: 0x32ac950.
[2025-05-08 19:58:14.169368 +0100][Debug ][JobMgr ] Stopping the job manager...
[2025-05-08 19:58:14.169579 +0100][Debug ][JobMgr ] Job manager stopped
[2025-05-08 19:58:14.169591 +0100][Debug ][TaskMgr ] Stopping the task manager...
[2025-05-08 19:58:14.169718 +0100][Debug ][TaskMgr ] Task manager stopped
[2025-05-08 19:58:14.169725 +0100][Debug ][Poller ] Stopping the poller...
[2025-05-08 19:58:14.169808 +0100][Debug ][AsyncSock ] [ceph-svc11.gridpp.rl.ac.uk:1094.0] Closing the socket
[2025-05-08 19:58:14.169821 +0100][Debug ][Poller ] <[::ffff:192.41.105.56]:33424><--><[::ffff:130.246.178.211]:1094> Removing socket from the poller
[2025-05-08 19:58:14.169932 +0100][Debug ][PostMaster ] [ceph-svc11.gridpp.rl.ac.uk:1094] Destroying stream
[2025-05-08 19:58:14.169944 +0100][Debug ][AsyncSock ] [ceph-svc11.gridpp.rl.ac.uk:1094.0] Closing the socket
[2025-05-08 19:58:14.169960 +0100][Debug ][AsyncSock ] [ceph-svc21.gridpp.rl.ac.uk:1094.0] Closing the socket
[2025-05-08 19:58:14.169964 +0100][Debug ][Poller ] <[::ffff:192.41.105.56]:34874><--><[::ffff:130.246.179.110]:1094> Removing socket from the poller
[2025-05-08 19:58:14.169994 +0100][Debug ][PostMaster ] [ceph-svc21.gridpp.rl.ac.uk:1094] Destroying stream
[2025-05-08 19:58:14.169999 +0100][Debug ][AsyncSock ] [ceph-svc21.gridpp.rl.ac.uk:1094.0] Closing the socket
[2025-05-08 19:58:14.170006 +0100][Debug ][AsyncSock ] [fndca1.fnal.gov:1094.0] Closing the socket
[2025-05-08 19:58:14.170011 +0100][Debug ][Poller ] <[::ffff:192.41.105.56]:42916><--><[::ffff:131.225.69.121]:1094> Removing socket from the poller
[2025-05-08 19:58:14.170037 +0100][Debug ][PostMaster ] [fndca1.fnal.gov:1094] Destroying stream
[2025-05-08 19:58:14.170042 +0100][Debug ][AsyncSock ] [fndca1.fnal.gov:1094.0] Closing the socket
[2025-05-08 19:58:14.170050 +0100][Debug ][AsyncSock ] [pubstor2202.fnal.gov:20201.0] Closing the socket
[2025-05-08 19:58:14.170054 +0100][Debug ][Poller ] <[::ffff:192.41.105.56]:36090><--><[::ffff:131.225.69.219]:20201> Removing socket from the poller
[2025-05-08 19:58:14.170068 +0100][Debug ][PostMaster ] [pubstor2202.fnal.gov:20201] Destroying stream
[2025-05-08 19:58:14.170072 +0100][Debug ][AsyncSock ] [pubstor2202.fnal.gov:20201.0] Closing the socket
[2025-05-08 19:58:14.170078 +0100][Debug ][AsyncSock ] [pubstor2302.fnal.gov:22471.0] Closing the socket
[2025-05-08 19:58:14.170083 +0100][Debug ][Poller ] <[::ffff:192.41.105.56]:49966><--><[::ffff:131.225.69.89]:22471> Removing socket from the poller
[2025-05-08 19:58:14.170090 +0100][Debug ][PostMaster ] [pubstor2302.fnal.gov:22471] Destroying stream
[2025-05-08 19:58:14.170094 +0100][Debug ][AsyncSock ] [pubstor2302.fnal.gov:22471.0] Closing the socket
[2025-05-08 19:58:14.170100 +0100][Debug ][AsyncSock ] [xrootd.echo.stfc.ac.uk:1094.0] Closing the socket
[2025-05-08 19:58:14.170104 +0100][Debug ][Poller ] <[::ffff:192.41.105.56]:40210><--><[::ffff:130.246.217.9]:1094> Removing socket from the poller
[2025-05-08 19:58:14.170174 +0100][Debug ][PostMaster ] [xrootd.echo.stfc.ac.uk:1094] Destroying stream
[2025-05-08 19:58:14.170180 +0100][Debug ][AsyncSock ] [xrootd.echo.stfc.ac.uk:1094.0] Closing the socket
Querying usertests:calcuttj_g4bl_prod_full_1_042825-w6502s1p1 for 10 files
Query: files from usertests:calcuttj_g4bl_prod_full_1_042825-w6502s1p1 where dune.output_status=confirmed ordered skip 3910 limit 10
Getting names and metadata
done
{'beam.momentum': 1.0, 'beam.polarity': 1, 'core.data_stream': 'g4beamline', 'core.data_tier': 'root-tuple', 'core.file_format': 'root', 'core.file_type': 'mc', 'core.group': 'dune', 'core.run_type': 'ehn1-beam-np04', 'dune.output_status': 'confirmed', 'retention.class': 'physics', 'retention.status': 'active', 'core.runs': [201113], 'core.runs_subruns': [20111300035]}
Getting paths from rucio
Got 10 paths from 10 files
['hadd', 'H4_v34b_1GeV_-27.7_1_201113_35_20250508T185640.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/24/c0/H4_v34b_1GeV_-27.7_1_20250430T191733Z_000241.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/47/52/H4_v34b_1GeV_-27.7_1_20250430T202155Z_004026.root', 'root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/7c/6e/H4_v34b_1GeV_-27.7_1_20250430T222314Z_000453.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/2e/8f/H4_v34b_1GeV_-27.7_1_20250430T232435Z_000652.root', 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/usertests/06/b4/H4_v34b_1GeV_-27.7_1_20250430T235819Z_004822.root', 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/usertests/07/6e/H4_v34b_1GeV_-27.7_1_20250501T001322Z_005592.root', 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/usertests/42/c3/H4_v34b_1GeV_-27.7_1_20250501T002124Z_003814.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/73/33/H4_v34b_1GeV_-27.7_1_20250501T002501Z_004489.root', 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/usertests/ee/04/H4_v34b_1GeV_-27.7_1_20250501T005943Z_005344.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/fd/62/H4_v34b_1GeV_-27.7_1_20250501T013216Z_004015.root']
Traceback (most recent call last):
  File "/cvmfs/fifeuser3.opensciencegrid.org/sw/dune/4e9b42dda8c1cbee7b07e2de7059f47384a3867b/merge_g4bl.py", line 259, in <module>
    do_merge(args)
  File "/cvmfs/fifeuser3.opensciencegrid.org/sw/dune/4e9b42dda8c1cbee7b07e2de7059f47384a3867b/merge_g4bl.py", line 111, in do_merge
    raise Exception('Error in hadd')
Exception: Error in hadd
Exiting with error