Jobsub ID 359485.0@justin-prod-sched01.dune.hep.ac.uk
Jobsub ID | 359485.0@justin-prod-sched01.dune.hep.ac.uk |
Workflow ID | 5815 |
Stage ID | 1 |
User name | mhandley@fnal.gov |
HTCondor Group | group_dune |
Requested | Processors | 1 |
Requested | GPU | No |
Requested | RSS bytes | 4194304000 (4000 MiB) |
Requested | Wall seconds limit | 80000 (22 hours) |
Submitted time | 2025-04-01 00:10:50 |
Site | UK_QMUL |
Entry | DUNE_UK_London_QMUL_arcce02 |
Last heartbeat | 2025-04-01 00:50:51 |
From worker node | Hostname | cn537.htc.esc.qmul |
From worker node | cpuinfo | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz |
From worker node | OS release | Scientific Linux release 7.9 (Nitrogen) |
From worker node | Processors | 1 |
From worker node | RSS bytes | 4194304000 (4000 MiB) |
From worker node | Wall seconds limit | 171000 (47 hours) |
From worker node | GPU | |
From worker node | Inner Apptainer? | True |
Job state | jobscript_error |
Allocator name | justin-allocator-pro.dune.hep.ac.uk |
Started | 2025-04-01 00:14:35 |
Input files | justin-tutorial:tut_np02bde_307160012_np02_bde_coldbox_run012352_0053_20211216T000148.hdf5 |
Jobscript | Exit code | 1 |
Jobscript | Real time | 0m (0s) |
Jobscript | CPU time | 0m (0s = 0%) |
Jobscript | Max RSS bytes | 0 (0 MiB) |
Outputting started | |
Output files | |
Finished | 2025-04-01 00:50:51 |
Saved logs | justin-logs:359485.0-justin-prod-sched01.dune.hep.ac.uk.logs.tgz |
Jobscript log (last 10,000 characters)
e 1196 in H5B_iterate(): B-tree iteration failed
major: B-Tree node
minor: Iteration failed
#007: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5B.c line 1155 in H5B__iterate_helper(): B-tree iteration failed
major: B-Tree node
minor: Iteration failed
#008: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5B.c line 1155 in H5B__iterate_helper(): B-tree iteration failed
major: B-Tree node
minor: Iteration failed
#009: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Gnode.c line 1018 in H5G__node_sumup(): unable to load symbol table node
major: Symbol table
minor: Unable to load metadata into cache
#010: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5AC.c line 1426 in H5AC_protect(): H5C_protect() failed
major: Object cache
minor: Unable to protect metadata
#011: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5C.c line 2370 in H5C_protect(): can't load entry
major: Object cache
minor: Unable to load metadata into cache
#012: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5C.c line 7209 in H5C__load_entry(): Can't read image*
major: Object cache
minor: Read failed
#013: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fio.c line 148 in H5F_block_read(): read through page buffer failed
major: Low-level I/O
minor: Read failed
#014: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5PB.c line 721 in H5PB_read(): read through metadata accumulator failed
major: Page Buffering
minor: Read failed
#015: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Faccum.c line 202 in H5F__accum_read(): driver read request failed
major: Low-level I/O
minor: Read failed
#016: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDint.c line 189 in H5FD_read(): driver read request failed
major: Virtual File Layer
minor: Read failed
#017: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDsec2.c line 755 in H5FD__sec2_read(): file read failed: time = Tue Apr 1 01:49:19 2025
, filename = 'root://meitner.tier2.hep.manchester.ac.uk:1094//cephfs/experiments/dune/RSE/justin-tutorial/53/85/tut_np02bde_307160012_np02_bde_coldbox_run012352_0053_20211216T000148.hdf5', file descriptor = 26, errno = 116, error message = 'Stale file handle', buf = 0xdf17790, total read size = 328, bytes this sub-read = 328, bytes actually read = 18446744073709551615, offset = 0
major: Low-level I/O
minor: Read failed
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 0:
#000: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5F.c line 711 in H5Fclose(): decrementing file ID failed
major: File accessibility
minor: Unable to close file
#001: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Iint.c line 1018 in H5I_dec_app_ref(): can't decrement ID ref count
major: Object atom
minor: Unable to decrement reference count
#002: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fint.c line 251 in H5F__close_cb(): unable to close file
major: File accessibility
minor: Unable to close file
#003: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 3983 in H5VL_file_close(): file close failed
major: Virtual Object Layer
minor: Unable to close file
#004: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 3952 in H5VL__file_close(): file close failed
major: Virtual Object Layer
minor: Unable to close file
#005: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLnative_file.c line 838 in H5VL__native_file_close(): can't close file
major: File accessibility
minor: Unable to decrement reference count
#006: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fint.c line 2349 in H5F__close(): can't close file
major: File accessibility
minor: Unable to close file
#007: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fint.c line 2522 in H5F_try_close(): problems closing file
major: File accessibility
minor: Unable to close file
#008: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fint.c line 1605 in H5F__dest(): unable to close file
major: File accessibility
minor: Unable to close file
#009: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FD.c line 830 in H5FD_close(): close failed
major: Virtual File Layer
minor: Unable to close file
#010: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e20/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDsec2.c line 456 in H5FD__sec2_close(): unable to close file, errno = 110, error message = 'Connection timed out'
major: Low-level I/O
minor: Unable to close file
DataPrepByApaModule::endJob: # events processed: 0
DataPrepByApaModule::endJob: # events skipped: 0
====================================================================================================================
TimeTracker printout (sec) Min Avg Max Median RMS nEvts
====================================================================================================================
[ No processed events ]
====================================================================================================================
====================================================================================================
MemoryTracker summary (base-10 MB units used)
Peak virtual memory usage (VmPeak) : 2728.75 MB
Peak resident set size usage (VmHWM): 1761.43 MB
Details saved in: 'mem.db'
====================================================================================================
PandoraMonitoring, only able to use default TApplication (limited functionality).
PandoraMonitoring::SaveTree, error: No tree with name 'Validation' exists.
ToolBasedRawDigitPrepService:dtor: Event count: 0
ToolBasedRawDigitPrepService:dtor: Call count: 0
ToolBasedRawDigitPrepService:dtor: Time report for 7 tools.
ToolBasedRawDigitPrepService:dtor: digitReader :0.00 sec
ToolBasedRawDigitPrepService:dtor: vdcb_adcChannelRawRmsFiller :0.00 sec
ToolBasedRawDigitPrepService:dtor: adcSampleFiller :0.00 sec
ToolBasedRawDigitPrepService:dtor: vdbcb_adcScaleAdcToKe :0.00 sec
ToolBasedRawDigitPrepService:dtor: vdbcb_cnrw :0.00 sec
ToolBasedRawDigitPrepService:dtor: adcKeepAllSignalFinder :0.00 sec
ToolBasedRawDigitPrepService:dtor: vdbcb_adcScaleKeToAdc :0.00 sec
=== End last 100 lines of lar log file ===
lar exit code 0
Traceback (most recent call last):
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_00d00/bin/extractor_prod.py", line 434, in <module>
main()
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_00d00/bin/extractor_prod.py", line 373, in main
mddict = expSpecificMetadata.getmetadata()
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_00d00/bin/extractor_prod.py", line 344, in getmetadata
jobt = self.get_job(proc)
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_00d00/bin/extractor_prod.py", line 69, in get_job
raise RuntimeError('sam_metadata_dumper returned nonzero exit status {}.'.format(rc))
RuntimeError: sam_metadata_dumper returned nonzero exit status 1.
extractor_prod.py exit code 1
Error reading metadata from file: Expecting value: line 1 column 1 (char 0)
pdjson2metadata exit code 1
.:
total 108
-rw-r--r-- 1 pildune22 pildune 36864 Apr 1 01:50 mem.db
-rw-r--r-- 1 pildune22 pildune 33249 Apr 1 01:50 tut_np02bde_307160012_np02_bde_coldbox_run012352_0053_20211216T000148_reco_2025-04-01T_001439Z.log
-rw-r--r-- 1 pildune22 pildune 16384 Apr 1 01:18 time.db
-rw-r--r-- 1 pildune22 pildune 9829 Apr 1 01:50 jobscript.log
-rw-r--r-- 1 pildune22 pildune 519 Apr 1 01:50 tut_np02bde_307160012_np02_bde_coldbox_run012352_0053_20211216T000148_reco_hist.root
-rw-r--r-- 1 pildune22 pildune 182 Apr 1 01:14 all-input-dids.txt
-rw-r--r-- 1 pildune22 pildune 0 Apr 1 01:19 Pandora_Events.pndr
-rw-r--r-- 1 pildune22 pildune 0 Apr 1 01:16 debugprod.log
-rw-r--r-- 1 pildune22 pildune 0 Apr 1 01:50 tut_np02bde_307160012_np02_bde_coldbox_run012352_0053_20211216T000148_reco_data_2025-04-01T_001439Z.root.ext.json
-rw-r--r-- 1 pildune22 pildune 0 Apr 1 01:50 tut_np02bde_307160012_np02_bde_coldbox_run012352_0053_20211216T000148_reco_data_2025-04-01T_001439Z.root.json