
Jobsub ID 290559.0@justin-prod-sched01.dune.hep.ac.uk

Jobsub ID: 290559.0@justin-prod-sched01.dune.hep.ac.uk
Workflow ID: 4103
Stage ID: 1
User name: amoor@fnal.gov
HTCondor Group: group_dune
Requested
  Processors: 1
  RSS bytes: 8388608000 (8000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2024-11-15 13:21:00
Site: NL_SURFsara
Entry: DUNE_SurfSARA_arc03
Last heartbeat: 2024-11-16 05:53:46
From worker node
  Hostname: wn-la-17.gina.surfsara.nl
  cpuinfo: AMD EPYC 9754 128-Core Processor
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 8388608000 (8000 MiB)
  Wall seconds limit: 129600 (36 hours)
  Inner Apptainer?: True
Job state: outputting_failed
Allocator name: justin-allocator-pro.dune.hep.ac.uk
Started: 2024-11-15 13:22:37
Input files: usertests:000953_reco_data_2024-11-14T_093605Z.root
Jobscript
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
Outputting started:
Output files:
Finished: 2024-11-16 05:53:46

Jobscript log (last 10,000 characters)

AFM g4 jobscript.
Input PFN = root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/52/0d/000953_reco_data_2024-11-14T_093605Z.root
Setting up larsoft UPS area... /cvmfs/larsoft.opensciencegrid.org
Setting up DUNE UPS area... /cvmfs/dune.opensciencegrid.org/products/dune/
/cvmfs/larsoft.opensciencegrid.org/products/xrootd/v5_4_3b/Linux64bit+3.10-2.17-e20-p3913-prof/lib/libXrdPosixPreload.so
../justin-jobscript: line 72:   866 Aborted                 lar -c $FCL_FILE $events_option -o $outFile "$pfn" > ${fname}_reco_${now}.log 2>&1
=== Start last 100 lines of lar log file ===
%MSG-i generatePrimaries:  larg4Main:largeant@BeginModule  15-Nov-2024 15:14:53 CET run: 20000031 subRun: 0 event: 779 MCTruthEventAction.cc:112
Generating 1 particles
%MSG
%MSG-i ParticleListActionService:  larg4Main:largeant@BeginModule  15-Nov-2024 15:25:05 CET run: 20000031 subRun: 0 event: 779
Not Stored Process summary:
	compt : 9612181
	annihil : 464020
	Pair : 136
	phot : 3087362
	Brem : 2834896
	conv : 463860
	Ion : 1205659
%MSG
%MSG-i endOfEventAction:  larg4Main:largeant@BeginModule  15-Nov-2024 15:25:05 CET run: 20000031 subRun: 0 event: 779 ParticleListAction.cc:701
MCTruth Handles Size: 1
%MSG
%MSG-i endOfEventAction:  larg4Main:largeant@BeginModule  15-Nov-2024 15:25:05 CET run: 20000031 subRun: 0 event: 779 ParticleListAction.cc:708
mclistHandle Size: 1
%MSG
%MSG-i endOfEventAction:  larg4Main:largeant@BeginModule  15-Nov-2024 15:25:05 CET run: 20000031 subRun: 0 event: 779 ParticleListAction.cc:711
Found 1 particles
%MSG
%MSG-i NuRandomService:  IonAndScint:IonAndScint@BeginModule  15-Nov-2024 15:25:06 CET run: 20000031 subRun: 0 event: 779
Random seed for this event, engine 'IonAndScint.ISCalcAlg': 694579352
%MSG
IonAndScint Module Producer
SimEnergyDeposit input module: largeant, instance name: LArG4DetectorServicevolTPCPlaneUInner
SimEnergyDeposit input module: largeant, instance name: LArG4DetectorServicevolTPCActiveInner
SimEnergyDeposit input module: largeant, instance name: LArG4DetectorServicevolTPCInner
SimEnergyDeposit input module: largeant, instance name: LArG4DetectorServicevolTPCPlaneVInner
SimEnergyDeposit input module: largeant, instance name: LArG4DetectorServicevolTPCPlaneZInner
SimEnergyDeposit input module: largeant, instance name: LArG4DetectorServicevolTPCActiveOuter
%MSG-i NuRandomService:  SimDriftElectrons:elecDrift@BeginModule  15-Nov-2024 15:26:09 CET run: 20000031 subRun: 0 event: 779
Random seed for this event, engine 'elecDrift': 539685752
%MSG
%MSG-i NuRandomService:  PDFastSimPAR:PDFastSim@BeginModule  15-Nov-2024 15:35:30 CET run: 20000031 subRun: 0 event: 779
Random seed for this event, engine 'PDFastSim.photon': 490704635
%MSG
%MSG-i NuRandomService:  PDFastSimPAR:PDFastSim@BeginModule  15-Nov-2024 15:35:30 CET run: 20000031 subRun: 0 event: 779
Random seed for this event, engine 'PDFastSim.scinttime': 222275824
%MSG
IonAndScint endJob.
16-Nov-2024 06:53:09 CET  Closed input file "root://xrootd.echo.stfc.ac.uk:1094/dune:/protodune/RSE/usertests/52/0d/000953_reco_data_2024-11-14T_093605Z.root"

================================================================================================================================
TimeTracker printout (sec)                        Min           Avg           Max         Median          RMS         nEvts   
================================================================================================================================
Full event                                    0.00994034      76.0726       56269.3      0.282024       2014.69        779    
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                        0.000105637   0.000721234    0.152591     0.000179086   0.00614402       779    
simulate:rns:RandomNumberSaver                2.4958e-05    4.09714e-05   0.000262081   3.8317e-05    1.20549e-05      779    
simulate:largeant:larg4Main                   0.00813382      1.26155       613.606      0.249407       21.9732        779    
simulate:IonAndScint:IonAndScint              0.00011438     0.0836959      62.7679     0.000163844     2.2474         779    
simulate:elecDrift:SimDriftElectrons           4.014e-05     0.753285       560.596     5.0394e-05      20.0723        779    
simulate:PDFastSim:PDFastSimPAR               5.8957e-05      73.8417       55032.4     7.9408e-05      1970.42        779    
[art]:TriggerResults:TriggerResultInserter    1.0085e-05    1.3702e-05    4.9133e-05    1.2599e-05    3.91254e-06      779    
end_path:out1:RootOutput                       3.855e-06    6.92558e-06    0.0012522     5.087e-06    4.47047e-05      779    
end_path:out1:RootOutput(write)               0.000742144    0.131085       10.0941     0.00428302     0.514454        778    
================================================================================================================================
%MSG-i NuRandomService:  RootOutput:out1@EndJob 16-Nov-2024 06:53:09 CET  ModuleEndJob

Summary of seeds computed by the NuRandomService
Random policy: 'perEvent'
  algorithm version: EventTimestamp_v1
   Configured value          Last value   ModuleLabel.InstanceName
        (per event)           694579352   IonAndScint.ISCalcAlg
        (per event)           490704635   PDFastSim.photon
        (per event)           222275824   PDFastSim.scinttime
        (per event)           539685752   elecDrift
        (per event)           230665672   largeant

%MSG

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 54923.9 MB
  Peak resident set size usage (VmHWM): 50535.5 MB
====================================================================================================

TrigReport ---------- Event summary -------------
TrigReport Events total = 779 passed = 779 failed = 0

TrigReport ---------- Modules in End-path ----------
TrigReport        Run    Success      Error Name
TrigReport        779        779          0 out1

TimeReport ---------- Time summary [sec] -------
TimeReport CPU = 59249.290433 Real = 59316.692171

MemReport  ---------- Memory summary [base-10 MB] ------
MemReport  VmPeak = 54923.9 VmHWM = 50535.5

terminate called after throwing an instance of 'cet::coded_exception<art::errors::ErrorCodes, &art::ExceptionDetail::translate[abi:cxx11]>'
  what():  ---- FatalRootError BEGIN
  Fatal Root Error: TBufferFile::AutoExpand
  Request to expand to a negative size, likely due to an integer overflow: 0x8000000a for a max of 0x7ffffffe.
  ROOT severity: 6000
---- FatalRootError END

=== End last 100 lines of lar log file ===
lar exit code 0
Traceback (most recent call last):
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 434, in <module>
    main()
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 373, in main
    mddict = expSpecificMetadata.getmetadata()
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 344, in getmetadata
    jobt = self.get_job(proc)
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 69, in get_job
    raise RuntimeError('sam_metadata_dumper returned nonzero exit status {}.'.format(rc))
RuntimeError: sam_metadata_dumper returned nonzero exit status 1.
extractor_prod.py exit code 1
Error reading metadata from file: Expecting value: line 1 column 1 (char 0)
pdjson2metadata exit code 1
.:
total 1635584
-rw-r--r--. 1 dune003 dune 1672002308 Nov 16 06:53 RootOutput-4729-9524-db38-61f3.root
-rw-r--r--. 1 dune003 dune    2816408 Nov 16 06:53 000953_reco_data_2024-11-14T_093605Z_reco_2024-11-15T_132252Z.log
-rw-r--r--. 1 dune003 dune       7709 Nov 16 06:53 jobscript.log
-rw-r--r--. 1 dune003 dune        274 Nov 15 14:23 TFileService-61f3-d1fb-2938-92e6.root
-rw-r--r--. 1 dune003 dune        104 Nov 15 14:22 all-input-dids.txt
-rw-r--r--. 1 dune003 dune          0 Nov 16 06:53 000953_reco_data_2024-11-14T_093605Z_reco_data_2024-11-15T_132252Z.root.ext.json
-rw-r--r--. 1 dune003 dune          0 Nov 16 06:53 000953_reco_data_2024-11-14T_093605Z_reco_data_2024-11-15T_132252Z.root.json
justIN time: 2024-11-17 08:28:49 UTC       justIN version: 01.01.09
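
The FatalRootError in the lar log reports a request to expand a buffer to 0x8000000a against a maximum of 0x7ffffffe, and describes the request as a negative size. The sketch below only reproduces that arithmetic. It assumes the size is held in a signed 32-bit integer, which the error text suggests but this log does not confirm, and the helper name as_int32 is made up for illustration.

```python
# Values copied from the FatalRootError message in the lar log.
requested = 0x8000000A  # size the buffer was asked to expand to
maximum = 0x7FFFFFFE    # largest size the message says is allowed


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer.

    Hypothetical helper: it assumes 32-bit two's-complement storage.
    """
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


overshoot = requested - maximum

print(f"requested        = {requested} bytes")
print(f"maximum          = {maximum} bytes")
print(f"overshoot        = {overshoot} bytes")
print(f"requested, int32 = {as_int32(requested)}")
```

Under that assumption the requested size exceeds the stated maximum by 12 bytes and wraps to a negative number, which matches the wording of the error message.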