Jobsub ID 289859.1@justin-prod-sched01.dune.hep.ac.uk
Jobsub ID | 289859.1@justin-prod-sched01.dune.hep.ac.uk |
Workflow ID | 4103 |
Stage ID | 1 |
User name | amoor@fnal.gov |
HTCondor Group | group_dune |
Requested | Processors | 1 |
RSS bytes | 8388608000 (8000 MiB) |
Wall seconds limit | 80000 (22 hours) |
Submitted time | 2024-11-14 20:30:07 |
Site | UK_Edinburgh |
Entry | DUNE_UK_SGridECDF_ce1_multicore |
Last heartbeat | 2024-11-14 21:29:06 |
From worker node | Hostname | node2A06.ecdf.ed.ac.uk |
cpuinfo | Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz |
OS release | Scientific Linux release 7.9 (Nitrogen) |
Processors | 1 |
RSS bytes | 8388608000 (8000 MiB) |
Wall seconds limit | 171000 (47 hours) |
Inner Apptainer? | True |
Job state | jobscript_error |
Allocator name | justin-allocator-pro.dune.hep.ac.uk |
Started | 2024-11-14 20:31:52 |
Input files | usertests:000790_reco_data_2024-11-14T_093429Z.root
|
Jobscript | Exit code | 1 |
Real time | 0m (0s) |
CPU time | 0m (0s = 0%) |
Outputting started | |
Output files | |
Finished | 2024-11-14 21:29:06 |
Saved logs | justin-logs:289859.1-justin-prod-sched01.dune.hep.ac.uk.logs.tgz |
List job events Wrapper job log |
Jobscript log (last 10,000 characters)
:08.198786 +0000][Debug ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Stream TTL elapsed, disconnecting...
[2024-11-14 20:53:08.199192 +0000][Debug ][AsyncSock ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 20:53:08.199293 +0000][Debug ][Poller ][ 1126] <[::ffff:172.16.43.7]:38182><--><[::ffff:134.158.209.218]:1094> Removing socket from the poller
[2024-11-14 20:53:08.199522 +0000][Error ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Forcing error on disconnect: [ERROR] Operation interrupted.
[2024-11-14 20:53:08.199546 +0000][Debug ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Reporting disconnection to queued message handlers.
[2024-11-14 20:53:08.199693 +0000][Debug ][TaskMgr ][ 1126] Requesting unregistration of: "TickGeneratorTask for: root://ccxrootdegee.in2p3.fr:1094"
[2024-11-14 20:53:08.199749 +0000][Debug ][AsyncSock ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 20:53:08.199772 +0000][Debug ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Destroying stream
[2024-11-14 20:53:08.199826 +0000][Debug ][AsyncSock ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 20:53:09.173319 +0000][Debug ][TaskMgr ][ 1126] Removing task: "TickGeneratorTask for: root://ccxrootdegee.in2p3.fr:1094"
IonAndScint endJob.
[2024-11-14 21:28:05.534728 +0000][Debug ][File ][ 1126] [0xa37c980@root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root?xrdcl.requuid=306743bb-24ef-4692-a53d-68c4992b2a01] Sending a close command for handle 0x0 to ccdcacli487.in2p3.fr:30288
[2024-11-14 21:28:05.534785 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] MsgHandler created: 0xad4abc0 (message: kXR_close (handle: 0x00000000) ).
[2024-11-14 21:28:05.534892 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] Moving MsgHandler: 0xad4abc0 (message: kXR_close (handle: 0x00000000) ) from out-queu to in-queue.
[2024-11-14 21:28:05.563907 +0000][Debug ][ExDbgMsg ][ 1126] [msg: 0x1958bd50] Assigned MsgHandler: 0xad4abc0.
[2024-11-14 21:28:05.563964 +0000][Debug ][ExDbgMsg ][ 1126] [handler: 0xad4abc0] Removed MsgHandler: 0xad4abc0 from the in-queue.
[2024-11-14 21:28:05.564138 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] Calling MsgHandler: 0xad4abc0 (message: kXR_close (handle: 0x00000000) ) with status: [SUCCESS] .
[2024-11-14 21:28:05.564249 +0000][Debug ][File ][ 1126] [0xa37c980@root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root?xrdcl.requuid=306743bb-24ef-4692-a53d-68c4992b2a01] Close returned from ccdcacli487.in2p3.fr:30288 with: [SUCCESS]
[2024-11-14 21:28:05.564308 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] Destroying MsgHandler: 0xad4abc0.
14-Nov-2024 21:28:05 GMT Closed input file "root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root"
================================================================================================================================
TimeTracker printout (sec) Min Avg Max Median RMS nEvts
================================================================================================================================
Full event 0.013778 42.4802 3093.14 0.264302 349.96 77
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read) 0.00010713 0.00244185 0.143079 0.000134789 0.0164899 77
simulate:rns:RandomNumberSaver 1.9534e-05 3.60529e-05 0.000470729 2.8422e-05 5.05447e-05 77
simulate:largeant:larg4Main 0.0119716 0.785623 34.867 0.207887 3.956 77
simulate:IonAndScint:IonAndScint 0.000106653 0.0406385 2.99057 0.000123045 0.3384 77
simulate:elecDrift:SimDriftElectrons 3.2105e-05 0.430655 31.7285 4.4805e-05 3.59033 77
simulate:PDFastSim:PDFastSimPAR 5.1476e-05 41.1371 3023.55 6.5449e-05 342.13 77
[art]:TriggerResults:TriggerResultInserter 9.208e-06 1.21482e-05 3.747e-05 1.0268e-05 4.49607e-06 77
end_path:out1:RootOutput 2.671e-06 3.50203e-06 1.8454e-05 3.122e-06 1.90951e-06 77
end_path:out1:RootOutput(write) 0.000780986 0.0842361 0.753352 0.00333373 0.168548 76
================================================================================================================================
%MSG-i NuRandomService: RootOutput:out1@EndJob 14-Nov-2024 21:28:05 GMT ModuleEndJob
Summary of seeds computed by the NuRandomService
Random policy: 'perEvent'
algorithm version: EventTimestamp_v1
Configured value Last value ModuleLabel.InstanceName
(per event) 42050063 IonAndScint.ISCalcAlg
(per event) 566897688 PDFastSim.photon
(per event) 212160827 PDFastSim.scinttime
(per event) 489545914 elecDrift
(per event) 721211580 largeant
%MSG
====================================================================================================
MemoryTracker summary (base-10 MB units used)
Peak virtual memory usage (VmPeak) : 10517.2 MB
Peak resident set size usage (VmHWM): 8057.83 MB
====================================================================================================
TrigReport ---------- Event summary -------------
TrigReport Events total = 77 passed = 77 failed = 0
TrigReport ---------- Modules in End-path ----------
TrigReport Run Success Error Name
TrigReport 77 77 0 out1
TimeReport ---------- Time summary [sec] -------
TimeReport CPU = 3282.693086 Real = 3296.457006
MemReport ---------- Memory summary [base-10 MB] ------
MemReport VmPeak = 10517.2 VmHWM = 8057.83
%MSG-s ArtException: PostEndJob 14-Nov-2024 21:28:40 GMT ModuleEndJob
---- EventProcessorFailure BEGIN
EventProcessor: an exception occurred during current event processing
---- FatalRootError BEGIN
Fatal Root Error: TBufferFile::WriteByteCount
bytecount too large (more than 1073741822)
ROOT severity: 3000
---- FatalRootError END
---- EventProcessorFailure END
---- FatalRootError BEGIN
Fatal Root Error: TTree::SetEntries
Tree branches have different numbers of entries, eg EventAuxiliary has 76 entries while art::TriggerResults_TriggerResults__MUSUNGen. has 1000 entries.
ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
[2024-11-14 21:28:40.896464 +0000][Debug ][JobMgr ][ 1126] Stopping the job manager...
[2024-11-14 21:28:40.897944 +0000][Debug ][JobMgr ][ 1126] Job manager stopped
[2024-11-14 21:28:40.898518 +0000][Debug ][TaskMgr ][ 1126] Stopping the task manager...
[2024-11-14 21:28:40.898725 +0000][Debug ][TaskMgr ][ 1126] Task manager stopped
[2024-11-14 21:28:40.898737 +0000][Debug ][Poller ][ 1126] Stopping the poller...
[2024-11-14 21:28:40.898873 +0000][Debug ][TaskMgr ][ 1126] Requesting unregistration of: "TickGeneratorTask for: root://ccdcacli487.in2p3.fr:30288"
[2024-11-14 21:28:40.898894 +0000][Debug ][AsyncSock ][ 1126] [ccdcacli487.in2p3.fr:30288.0] Closing the socket
[2024-11-14 21:28:40.898974 +0000][Debug ][Poller ][ 1126] <[::ffff:172.16.43.7]:43686><--><[::ffff:134.158.209.235]:30288> Removing socket from the poller
[2024-11-14 21:28:40.899052 +0000][Debug ][PostMaster ][ 1126] [ccdcacli487.in2p3.fr:30288] Destroying stream
[2024-11-14 21:28:40.899072 +0000][Debug ][AsyncSock ][ 1126] [ccdcacli487.in2p3.fr:30288.0] Closing the socket
=== End last 100 lines of lar log file ===
lar exit code 0
Traceback (most recent call last):
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 434, in <module>
main()
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 373, in main
mddict = expSpecificMetadata.getmetadata()
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 344, in getmetadata
jobt = self.get_job(proc)
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 69, in get_job
raise RuntimeError('sam_metadata_dumper returned nonzero exit status {}.'.format(rc))
RuntimeError: sam_metadata_dumper returned nonzero exit status 1.
extractor_prod.py exit code 1
Error reading metadata from file: Expecting value: line 1 column 1 (char 0)
pdjson2metadata exit code 1
.:
total 829800
-rw-r--r-- 1 gl05pi6 eddie_users 849067035 Nov 14 21:28 RootOutput-61ba-af72-7d37-826a.root
-rw-r--r-- 1 gl05pi6 eddie_users 617919 Nov 14 21:28 000790_reco_data_2024-11-14T_093429Z_reco_2024-11-14T_203158Z.log
-rw-r--r-- 1 gl05pi6 eddie_users 9962 Nov 14 21:28 jobscript.log
-rw-r--r-- 1 gl05pi6 eddie_users 519 Nov 14 21:28 g4_hist.root
-rw-r--r-- 1 gl05pi6 eddie_users 104 Nov 14 20:31 all-input-dids.txt
-rw-r--r-- 1 gl05pi6 eddie_users 0 Nov 14 21:28 000790_reco_data_2024-11-14T_093429Z_reco_data_2024-11-14T_203158Z.root.ext.json
-rw-r--r-- 1 gl05pi6 eddie_users 0 Nov 14 21:28 000790_reco_data_2024-11-14T_093429Z_reco_data_2024-11-14T_203158Z.root.json