
21 July 2025: This instance at RAL is read-only. Please do not try submitting new workflows for now.

Jobsub ID 185083.127@justin-prod-sched02.dune.hep.ac.uk

Jobsub ID: 185083.127@justin-prod-sched02.dune.hep.ac.uk
Workflow ID: 6465
Stage ID: 1
User name: calcuttj@fnal.gov
HTCondor Group: group_dune
Requested:
    Processors: 1
    GPU: No
    RSS bytes: 4193255424 (3999 MiB)
    Wall seconds limit: 80000 (22 hours)
Submitted time: 2025-04-29 00:36:32
Site: UK_RAL-Tier1
Entry: LIGO_UK_RAL_arc_ce04
Last heartbeat: 2025-04-29 12:46:45
From worker node:
    Hostname: dune001-5446814.0-lcg2710.gridpp.rl.ac.uk
    cpuinfo: AMD EPYC 9654 96-Core Processor
    OS release: Scientific Linux release 7.9 (Nitrogen)
    Processors: 1
    RSS bytes: 4193255424 (3999 MiB)
    Wall seconds limit: 216000 (60 hours)
    GPU:
    Inner Apptainer?: True
Job state: stalled
Allocator name: justin-allocator-pro.dune.hep.ac.uk
Started: 2025-04-29 09:06:43
Input files: monte-carlo-006465-000993
Outputting started: 2025-04-29 12:33:48
Output files:
Finished: 2025-04-29 13:42:11

Jobscript log (last 10,000 characters)

F7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/./XrdCl/XrdClSyncQueue.hh:66
#5  XrdCl::JobManager::RunJobs (this=0x2e2a7af0) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClJobManager.cc:151
#6  0x00007fc679f84619 in RunRunnerThread (arg=<optimized out>) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClJobManager.cc:34
#7  0x00007fc69c007ea5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fc69aefeb0d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7fc6755fc700 (LWP 973) "hadd"):
#0  0x00007fc69c00db3b in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x00007fc69c00dbcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00007fc69c00dc6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#3  0x00007fc679f84566 in XrdSysSemaphore::Wait (this=0x2e30c170) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/./XrdSys/XrdSysPthread.hh:509
#4  XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x2e2a7b08) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/./XrdCl/XrdClSyncQueue.hh:66
#5  XrdCl::JobManager::RunJobs (this=0x2e2a7af0) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClJobManager.cc:151
#6  0x00007fc679f84619 in RunRunnerThread (arg=<optimized out>) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClJobManager.cc:34
#7  0x00007fc69c007ea5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fc69aefeb0d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7fc674dfb700 (LWP 972) "hadd"):
#0  0x00007fc69c00db3b in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x00007fc69c00dbcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00007fc69c00dc6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#3  0x00007fc679f84566 in XrdSysSemaphore::Wait (this=0x2e30c170) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/./XrdSys/XrdSysPthread.hh:509
#4  XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x2e2a7b08) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/./XrdCl/XrdClSyncQueue.hh:66
#5  XrdCl::JobManager::RunJobs (this=0x2e2a7af0) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClJobManager.cc:151
#6  0x00007fc679f84619 in RunRunnerThread (arg=<optimized out>) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClJobManager.cc:34
#7  0x00007fc69c007ea5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fc69aefeb0d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7fc6765fe700 (LWP 971) "hadd"):
#0  0x00007fc69c00ee9d in nanosleep () from /lib64/libpthread.so.0
#1  0x00007fc67a45646d in XrdSysTimer::Wait (mills=<optimized out>) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdSys/XrdSysTimer.cc:239
#2  0x00007fc679f0497d in XrdCl::TaskManager::RunTasks (this=0x2e1989f0) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClTaskManager.cc:246
#3  0x00007fc679f04a89 in RunRunnerThread (arg=<optimized out>) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdCl/XrdClTaskManager.cc:38
#4  0x00007fc69c007ea5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fc69aefeb0d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fc676dff700 (LWP 970) "hadd"):
#0  0x00007fc69aeff0e3 in epoll_wait () from /lib64/libc.so.6
#1  0x00007fc67a4507d7 in XrdSys::IOEvents::PollE::Begin (this=0x2e2ecc40, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/./XrdSys/XrdSysIOEventsPollE.icc:212
#2  0x00007fc67a44c905 in XrdSys::IOEvents::BootStrap::Start (parg=0x7fff4fc62e60) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdSys/XrdSysIOEvents.cc:149
#3  0x00007fc67a455b28 in XrdSysThread_Xeq (myargs=0x2e2a4a80) at /scratch/workspace/canvas-products/label1/swarm/label2/SLF7/build/xrootd/v5_5_5a/source/xrootd-5.5.5/src/XrdSys/XrdSysPthread.cc:86
#4  0x00007fc69c007ea5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fc69aefeb0d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fc6a0b63cc0 (LWP 948) "hadd"):
#0  0x00007fc69aec5659 in waitpid () from /lib64/libc.so.6
#1  0x00007fc69ae42f62 in do_system () from /lib64/libc.so.6
#2  0x00007fc69ae43311 in system () from /lib64/libc.so.6
#3  0x00007fc69c6e5ecc in TUnixSystem::Exec (shellcmd=<optimized out>, this=0x2ba67620) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/unix/src/TUnixSystem.cxx:2104
#4  TUnixSystem::StackTrace (this=0x2ba67620) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/unix/src/TUnixSystem.cxx:2395
#5  0x00007fc69c6e5894 in TUnixSystem::DispatchSignals (this=0x2ba67620, sig=kSigSegmentationViolation) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/unix/src/TUnixSystem.cxx:3615
#6  <signal handler called>
#7  0x00007fc69c597237 in (anonymous namespace)::R__ListSlowClose (files=0x2ba6d020) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/base/src/TROOT.cxx:1095
#8  0x00007fc69c597f84 in TROOT::CloseFiles (this=0x7fc69ca4b240 <ROOT::Internal::GetROOT1()::alloc>) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/base/src/TROOT.cxx:1145
#9  0x00007fc69ae39ce9 in __run_exit_handlers () from /lib64/libc.so.6
#10 0x00007fc69ae39d37 in exit () from /lib64/libc.so.6
#11 0x00007fc69ae2255c in __libc_start_main () from /lib64/libc.so.6
#12 0x0000000000406be9 in _start ()
===========================================================


The lines below might hint at the cause of the crash. If you see question
marks as part of the stack trace, try to recompile with debugging information
enabled and export CLING_DEBUG=1 environment variable before running.
You may get help by asking at the ROOT forum https://root.cern/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at https://root.cern/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#7  0x00007fc69c597237 in (anonymous namespace)::R__ListSlowClose (files=0x2ba6d020) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/base/src/TROOT.cxx:1095
#8  0x00007fc69c597f84 in TROOT::CloseFiles (this=0x7fc69ca4b240 <ROOT::Internal::GetROOT1()::alloc>) at /scratch/workspace/critic-slf/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/root/v6_28_12/source/root-6.28.12/core/base/src/TROOT.cxx:1145
#9  0x00007fc69ae39ce9 in __run_exit_handlers () from /lib64/libc.so.6
#10 0x00007fc69ae39d37 in exit () from /lib64/libc.so.6
#11 0x00007fc69ae2255c in __libc_start_main () from /lib64/libc.so.6
#12 0x0000000000406be9 in _start ()
===========================================================


Querying usertests:calcuttj_g4bl_prod_full_1_042125-w6391s1p1 for 10 files
Query: files from usertests:calcuttj_g4bl_prod_full_1_042125-w6391s1p1 where dune.output_status=confirmed ordered skip 9920 limit 10
Getting names and metadata
done
{'beam.momentum': 1.0, 'beam.polarity': 1, 'core.data_stream': 'g4beamline', 'core.data_tier': 'root-tuple', 'core.file_format': 'root', 'core.file_type': 'mc', 'core.group': 'dune', 'core.run_type': 'ehn1-beam-np04', 'dune.output_status': 'confirmed', 'retention.class': 'physics', 'retention.status': 'active', 'core.runs': [185083], 'core.runs_subruns': [18508300127]}
Getting paths from rucio
Got 10 paths from 10 files
['hadd', 'H4_v34b_1GeV_-27.7_1_185083_127_20250429T090649.root', 'root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/usertests/24/29/H4_v34b_1GeV_-27.7_1_20250424T033430Z_000784.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/f7/ee/H4_v34b_1GeV_-27.7_1_20250424T045318Z_000180.root', 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/usertests/a7/d0/H4_v34b_1GeV_-27.7_1_20250424T053955Z_002329.root', 'root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/1e/6e/H4_v34b_1GeV_-27.7_1_20250424T075310Z_002810.root', 'root://xrootd-archive.cr.cnaf.infn.it:1096//dune/usertests/21/0b/H4_v34b_1GeV_-27.7_1_20250424T092848Z_002562.root', 'root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/usertests/1c/bd/H4_v34b_1GeV_-27.7_1_20250424T103705Z_006464.root', 'root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/usertests/e9/d7/H4_v34b_1GeV_-27.7_1_20250424T104013Z_005791.root', 'root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/2b/c1/H4_v34b_1GeV_-27.7_1_20250424T134004Z_007629.root', 'root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/usertests/be/0f/H4_v34b_1GeV_-27.7_1_20250424T135156Z_006564.root', 'root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/9d/29/H4_v34b_1GeV_-27.7_1_20250425T041651Z_007323.root']
Finishing metadata
justIN time: 2025-08-24 12:53:42 UTC       justIN version: 01.03.02