Jobsub ID 108522.137@justin-prod-sched02.dune.hep.ac.uk

Jobsub ID: 108522.137@justin-prod-sched02.dune.hep.ac.uk
Workflow ID: 4213
Stage ID: 1
User name: ismerio@fnal.gov
HTCondor Group: group_dune
Requested:
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2024-11-22 22:48:20
Site: UK_Imperial
Entry: DUNE_T2_UK_London_IC_ceprod02
Last heartbeat: 2024-11-22 23:14:12
From worker node:
  Hostname: wj54.grid.hep.ph.ic.ac.uk
  cpuinfo: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 171000 (47 hours)
  Inner Apptainer?: True
Job state: jobscript_error
Allocator name: justin-allocator-pro.dune.hep.ac.uk
Started: 2024-11-22 23:01:45
Input files: fardet-hd:atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50568178_718_20231202T172037Z_gen_g4_detsim_hitreco__20240507T210055Z_reco2.root
Jobscript:
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
Outputting started:
Output files:
Finished: 2024-11-22 23:14:12
Saved logs: justin-logs:108522.137-justin-prod-sched02.dune.hep.ac.uk.logs.tgz

Jobscript log (last 10,000 characters)

iencegrid.org/products/larpandora/v09_22_09/slf7.x86_64.e26.prof/lib/liblarpandora_LArPandoraInterface_StandardPandora_module.so)
frame #19: lar_pandora::LArPandora::produce(art::Event&) + 0x54 (0x152200e923e4 in /cvmfs/larsoft.opensciencegrid.org/products/larpandora/v09_22_09/slf7.x86_64.e26.prof/lib/liblarpandora_LArPandoraInterface.so)
frame #20: art::EDProducer::produceWithFrame(art::Event&, art::ProcessingFrame const&) + 0x38 (0x15222b869138 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_Core.so)
frame #21: art::detail::Producer::doEvent(art::EventPrincipal&, art::ModuleContext const&, std::atomic<unsigned long>&, std::atomic<unsigned long>&, std::atomic<unsigned long>&) + 0x52 (0x15222b8dc062 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_Core.so)
frame #22: art::Worker::runWorker(art::EventPrincipal&, art::ModuleContext const&) + 0x215 (0x15222a040565 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_Principal.so)
frame #23: <unknown function> + 0x407e3 (0x15222a0407e3 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_Principal.so)
frame #24: tbb::detail::d1::function_task<hep::concurrency::SerialTaskQueue::QueuedTask>::execute(tbb::detail::d1::execution_data&) + 0x1f (0x152230803adf in /cvmfs/larsoft.opensciencegrid.org/products/hep_concurrency/v1_09_02/slf7.x86_64.e26.prof/lib/libhep_concurrency.so)
frame #25: <unknown function> + 0x424f1 (0x15222f4424f1 in /cvmfs/larsoft.opensciencegrid.org/products/tbb/v2021_9_0/Linux64bit+3.10-2.17-e26/lib/libtbb_debug.so.12)
frame #26: <unknown function> + 0x41437 (0x15222f441437 in /cvmfs/larsoft.opensciencegrid.org/products/tbb/v2021_9_0/Linux64bit+3.10-2.17-e26/lib/libtbb_debug.so.12)
frame #27: <unknown function> + 0x408ba (0x15222f4408ba in /cvmfs/larsoft.opensciencegrid.org/products/tbb/v2021_9_0/Linux64bit+3.10-2.17-e26/lib/libtbb_debug.so.12)
frame #28: tbb::detail::r1::execute_and_wait(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) + 0x43 (0x15222f4406fe in /cvmfs/larsoft.opensciencegrid.org/products/tbb/v2021_9_0/Linux64bit+3.10-2.17-e26/lib/libtbb_debug.so.12)
frame #29: void art::EventProcessor::process<(art::Level)4>() + 0x2b7 (0x15222bc2b537 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_EventProcessor.so)
frame #30: void art::EventProcessor::process<(art::Level)3>() + 0x75 (0x15222bc48845 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_EventProcessor.so)
frame #31: void art::EventProcessor::process<(art::Level)2>() + 0x75 (0x15222bc48a15 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_EventProcessor.so)
frame #32: void art::EventProcessor::process<(art::Level)1>() + 0x75 (0x15222bc48bb5 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_EventProcessor.so)
frame #33: void art::EventProcessor::process<(art::Level)0>() + 0x85 (0x15222bc48d65 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_EventProcessor.so)
frame #34: art::EventProcessor::runToCompletion() + 0x19 (0x15222bc325b9 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_EventProcessor.so)
frame #35: art::run_art(int, char**, boost::program_options::options_description&, std::vector<std::unique_ptr<art::OptionsHandler, std::default_delete<art::OptionsHandler> >, std::allocator<std::unique_ptr<art::OptionsHandler, std::default_delete<art::OptionsHandler> > > >&&) + 0x2762 (0x15223148c252 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_Art.so)
frame #36: artapp(int, char**, bool) + 0x732 (0x1522314859f2 in /cvmfs/larsoft.opensciencegrid.org/products/art/v3_14_04/slf7.x86_64.e26.prof/lib/libart_Framework_Art.so)
frame #37: main + 0xe (0x40124e in lar)
frame #38: __libc_start_main + 0xf5 (0x15222c822555 in /lib64/libc.so.6)
frame #39: lar() [0x4012d0]

AdaBoostDecisionTree::Initialize - Invalid xml file.
AdaBoostDecisionTree::Initialize - Invalid xml file.
AdaBoostDecisionTree::Initialize - Invalid xml file.
AdaBoostDecisionTree::Initialize - Invalid xml file.
Failure in algorithm Alg0002, LArDLVertexing, unknown exception
PandoraContentApi::GetList(*this, m_inputVertexListName, pVertexList) throw STATUS_CODE_NOT_INITIALIZED
    in function: GetHitRegion
    in file:     /exp/dune/app/users/ismerio/env-testreco/srcs/larpandoracontent/larpandoradlcontent/LArVertex/DlVertexingAlgorithm.cc line#: 624
Failure in algorithm Alg0003, LArDLVertexing, STATUS_CODE_NOT_INITIALIZED
> Running Algorithm: Alg0003, LArNeutrinoEventValidation
---RAW-MATCHING-OUTPUT--------------------------------------------------------------------------
Failure in algorithm Alg0003, LArNeutrinoEventValidation, STATUS_CODE_NOT_INITIALIZED
%MSG-e WireIDIntersectionCheck:  PMAlgTrackMaker:pmtrack@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
%MSG-e WireIDIntersectionCheck:  PMAlgTrackMaker:pmtrack@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
%MSG-e WireIDIntersectionCheck:  PMAlgTrackMaker:pmtracktc@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
%MSG-e WireIDIntersectionCheck:  PMAlgTrackMaker:pmtracktc@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
%MSG-e WireIDIntersectionCheck:  PMAlgTrackMaker:pmtracktc@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
%MSG-e WireIDIntersectionCheck:  PMAlgTrajFitter:pmtrajfittc@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
%MSG-e WireIDIntersectionCheck:  PMAlgTrajFitter:pmtrajfittc@BeginModule  22-Nov-2024 23:07:53 GMT run: 50568178 subRun: 1 event: 71801
Comparing two wires in the same plane: return failure
%MSG
Boundary wire vector sizes: 59, 55, 48
minwire 0: 541
minwire 1: 2036
minwire 2: 391
Used alternate method to get min and max wires due to vertex determination failure: 21, 220
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 199
Used alternate method to get min and max wires due to vertex determination failure: 0, 199
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 199
Used alternate method to get min and max wires due to vertex determination failure: 89, 288
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 199
Could not find serving_default in model signatures.
[libprotobuf FATAL /cvmfs/larsoft.opensciencegrid.org/products/protobuf/v3_21_12a/Linux64bit+3.10-2.17-e26/include/google/protobuf/map.h:1300] CHECK failed: it != end(): key not found: serving_default
22-Nov-2024 23:08:28 GMT  Opened output file with pattern "atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50568178_718_20231202T172037Z_gen_g4_detsim_hitreco__20240507T210055Z_reco2_reco_data_2024-11-22T_230150Z.root"
22-Nov-2024 23:09:26 GMT  Closed input file "root://mover.pp.rl.ac.uk:1094/pnfs/pp.rl.ac.uk/data/dune/fardet-hd/61/b8/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50568178_718_20231202T172037Z_gen_g4_detsim_hitreco__20240507T210055Z_reco2.root"
Malformed TimeTracker database.  The TimeEvent table is empty, but
the TimeModule table is not.  This can happen if an exception has
been thrown from a module while processing the first event.  Any
saved database file is suspect and should not be used.

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 7071.43 MB
  Peak resident set size usage (VmHWM): 1154.71 MB
====================================================================================================
PandoraMonitoring, only able to use default TApplication (limited functionality).
PandoraMonitoring::SaveTree, error: No tree with name 'Validation' exists.
%MSG-s ArtException:  PostEndJob 22-Nov-2024 23:09:26 GMT ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- StdException BEGIN
      An exception was thrown while processing module CVNEvaluator/cvneva run: 50568178 subRun: 1 event: 71801
      CHECK failed: it != end(): key not found: serving_default
    ---- StdException END
    Exception going through path reco
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 0 entries while sim::OpDetDivRecs_opdigi__detsim. has 100 entries.
  ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
=== End last 100 lines of lar log file ===
RootOutput-5d51-e58f-834c-d616.root
all-input-dids.txt
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50568178_718_20231202T172037Z_gen_g4_detsim_hitreco__20240507T210055Z_reco2_reco_2024-11-22T_230150Z.log
debugprod.log
jobscript.log
reco2_hist.root
MyPandoraSettings_Master_Atmos_DUNEFD.xml
MyPandoraSettings_Master_DUNEFD.xml
build_slf7.x86_64
localProducts_larsoft_v09_91_02_e26_prof
setup_env-testreco.sh
srcs
temp.txt
temp2.txt
test
work
justIN time: 2024-11-23 11:28:39 UTC       justIN version: 01.01.09