Jobsub ID 148241.43@justin-prod-sched02.dune.hep.ac.uk

Jobsub ID: 148241.43@justin-prod-sched02.dune.hep.ac.uk
Workflow ID: 5271
Stage ID: 1
User name: lwhite86@fnal.gov
HTCondor Group: group_dune
Requested:
    Processors: 1
    GPU: No
    RSS bytes: 4194304000 (4000 MiB)
    Wall seconds limit: 3600 (1 hours)
Submitted time: 2025-02-24 14:04:20
Site: US_UConn-HPC
Entry: GLUEX_US_UConn-HPC_osgce
Last heartbeat: 2025-02-24 16:42:53
From worker node:
    Hostname: cn416
    cpuinfo: AMD EPYC 7452 32-Core Processor
    OS release: Scientific Linux release 7.9 (Nitrogen)
    Processors: 1
    RSS bytes: 4194304000 (4000 MiB)
    Wall seconds limit: 172800 (48 hours)
    GPU:
    Inner Apptainer?: True
Job state: outputting_failed
Allocator name: justin-allocator-pro.dune.hep.ac.uk
Started: 2025-02-24 16:40:57
Input files: fardet-hd:nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2.root
Jobscript:
    Exit code: 0
    Real time: 1m (105s)
    CPU time: 1m (60s = 57%)
    Max RSS bytes: 1157664768 (1104 MiB)
Outputting started: 2025-02-24 16:42:43
Output files:
Finished: 2025-02-24 16:42:53

Jobscript log (last 10,000 characters)

Geometry>: 67545 nodes/ 3831 volume UID's in Geometry imported from GDML
Info in <TGeoManager::CloseGeometry>: ----------------modeler ready----------------
GeoApaChannelGroupService::ctor: Group 0 (apa00) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 1 (apa01) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 2 (apa02) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 3 (apa03) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 4 (apa04) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 5 (apa05) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 6 (apa06) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 7 (apa07) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 8 (apa08) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 9 (apa09) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 10 (apa10) has 2560 channels from 4/4 readout planes.
GeoApaChannelGroupService::ctor: Group 11 (apa11) has 2560 channels from 4/4 readout planes.
DuneToolManager::fclFilename: Taking fcl name from command line: lar -c /cvmfs/fifeuser3.opensciencegrid.org/sw/dune/6dd14ff1981743720bdff232ee6561f784728ca6/runPandora.fcl root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/fardet-hd/7a/f8/nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2.root
AcdDigitReader::ctor:     LogLevel: 1
DuneToolManager::getPrivate: ERROR: Tool name is blank
StandardRawDigitExtractService::ctor: Retrieved digit read tool digitReader
StandardRawDigitExtractService::ctor: StandardRawDigitExtractService:
StandardRawDigitExtractService::ctor:         LogLevel: 1
StandardRawDigitExtractService::ctor:    DigitReadTool: digitReader
StandardRawDigitExtractService::ctor:   PedestalOption: 1
StandardRawDigitExtractService::ctor:     FlagStuckOff: 0
StandardRawDigitExtractService::ctor:      FlagStuckOn: 0
StandardRawDigitPrepService::ctor: Fetching extract service.
StandardRawDigitPrepService::ctor: Fetching channel status provider.
StandardRawDigitPrepService::ctor:   Channel status provider: @0xbca41b0
StandardRawDigitPrepService::ctor:   Extract service: @0xd380b00
StandardRawDigitPrepService::ctor: Fetching deconvolution service.
StandardRawDigitPrepService::ctor:   Deconvolution service: @0x438b840
StandardRawDigitPrepService::ctor: Fetching ROI building service.
StandardRawDigitPrepService::ctor:   ROI building service: @0x438b870
StandardRawDigitPrepService::ctor: Fetching wire building service.
StandardRawDigitPrepService::ctor:   Wire building service: @0xb7f99e0
StandardRawDigitPrepService::ctor: StandardRawDigitPrepService:
StandardRawDigitPrepService::ctor:              LogLevel: 1
StandardRawDigitPrepService::ctor:               SkipBad: 1
StandardRawDigitPrepService::ctor:             SkipNoisy: 0
StandardRawDigitPrepService::ctor:   ChannelStatusOnline: 0
StandardRawDigitPrepService::ctor:          DoMitigation: 0
StandardRawDigitPrepService::ctor:  DoEarlySignalFinding: 0
StandardRawDigitPrepService::ctor:        DoNoiseRemoval: 0
StandardRawDigitPrepService::ctor:       DoDeconvolution: 1
StandardRawDigitPrepService::ctor:  DoPedestalAdjustment: 0
StandardRawDigitPrepService::ctor:                 DoROI: 1
StandardRawDigitPrepService::ctor:               DoWires: 1
StandardRawDigitPrepService::ctor:                DoDump: 0
StandardRawDigitPrepService::ctor:  DoIntermediateStates: 0
StandardRawDigitPrepService::ctor: No display tools.
Warning in <TFile::Append>: Replacing existing TH1: FieldResponse_U (Potential memory leak).
Warning in <TFile::Append>: Replacing existing TH1: FieldResponse_V (Potential memory leak).
Warning in <TFile::Append>: Replacing existing TH1: FieldResponse_Y (Potential memory leak).
24-Feb-2025 11:41:29 EST  Initiating request to open input file "root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/fardet-hd/7a/f8/nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2.root"
24-Feb-2025 11:41:33 EST  Opened input file "root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/fardet-hd/7a/f8/nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2.root"
Begin processing the 1st record. run: 1434 subRun: 1 event: 1 at 24-Feb-2025 11:41:35 EST
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Accel_1_U_v04_06_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Accel_1_V_v04_06_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Accel_1_W_v04_06_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Accel_2_U_v04_06_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Accel_2_V_v04_06_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Accel_2_W_v04_06_00.pt'
PandoraMonitoring, only able to use default TApplication (limited functionality).
24-Feb-2025 11:41:38 EST  Opened output file with pattern "%ifb_reco2.root"
Begin processing the 2nd record. run: 1434 subRun: 1 event: 2 at 24-Feb-2025 11:41:42 EST
Begin processing the 3rd record. run: 1434 subRun: 1 event: 3 at 24-Feb-2025 11:41:49 EST
Begin processing the 4th record. run: 1434 subRun: 1 event: 4 at 24-Feb-2025 11:41:54 EST
Begin processing the 5th record. run: 1434 subRun: 1 event: 5 at 24-Feb-2025 11:42:01 EST
Begin processing the 6th record. run: 1434 subRun: 1 event: 6 at 24-Feb-2025 11:42:08 EST
Begin processing the 7th record. run: 1434 subRun: 1 event: 7 at 24-Feb-2025 11:42:15 EST
Begin processing the 8th record. run: 1434 subRun: 1 event: 8 at 24-Feb-2025 11:42:22 EST
Begin processing the 9th record. run: 1434 subRun: 1 event: 9 at 24-Feb-2025 11:42:27 EST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNEventClassification, STATUS_CODE_FAILURE
Begin processing the 10th record. run: 1434 subRun: 1 event: 10 at 24-Feb-2025 11:42:33 EST
24-Feb-2025 11:42:41 EST  Closed output file "nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2_reco2.root"
24-Feb-2025 11:42:41 EST  Closed input file "root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/fardet-hd/7a/f8/nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2.root"

================================================================================================================================
TimeTracker printout (sec)                        Min           Avg           Max         Median          RMS         nEvts   
================================================================================================================================
Full event                                      5.14087       6.14396       7.22092       6.24811       0.63433        10     
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                         0.0415617     0.069392      0.116681      0.0679417     0.0184753       10     
reco:pandora2:StandardPandora                   1.53916       2.00449       2.82249       1.93944      0.371049        10     
[art]:TriggerResults:TriggerResultInserter    1.1552e-05    2.11172e-05   3.9705e-05    1.91865e-05   7.66653e-06      10     
end_path:out1:RootOutput                       2.414e-06    3.9532e-06    1.1612e-05    2.8405e-06    2.64342e-06      10     
end_path:out1:RootOutput(write)                 3.42874       4.06985       4.93056       4.01088      0.470901        10     
================================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 2151.49 MB
  Peak resident set size usage (VmHWM): 1157.66 MB
====================================================================================================
Art has completed and will exit with status 0.
lar exit code 0
total 76020
-rw-r--r-- 1 osgusers domain users      206 Feb 24 11:40 all-input-dids.txt
-rw-r--r-- 1 osgusers domain users        0 Feb 24 11:41 debugprod.log
-rw-r--r-- 1 osgusers domain users    32185 Feb 24 11:42 jobscript.log
-rw-r--r-- 1 osgusers domain users      179 Feb 24 11:42 justin-processed-pfns.txt
drwxr-xr-x 4 osgusers domain users       48 Feb 24 11:40 larpandoracontent
-rw-r--r-- 1 osgusers domain users 77716237 Feb 24 11:42 nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2_reco2.root
-rw-r--r-- 1 osgusers domain users      519 Feb 24 11:42 reco2_hist.root
-rw-r--r-- 1 osgusers domain users    79806 Feb 24 11:42 trainingFile_nu_dune10kt_1x2x6_1434_0_20230828T093259Z_gen_g4_detsim_hitreco__20240223T003752Z_reco2.root
justIN time: 2025-04-03 08:26:30 UTC       justIN version: 01.03.00