Jobsub ID 289882.0@justin-prod-sched01.dune.hep.ac.uk
Jobsub ID | 289882.0@justin-prod-sched01.dune.hep.ac.uk |
Workflow ID | 4103 |
Stage ID | 1 |
User name | amoor@fnal.gov |
HTCondor Group | group_dune |
Requested | Processors | 1 |
RSS bytes | 8388608000 (8000 MiB) |
Wall seconds limit | 80000 (22 hours) |
Submitted time | 2024-11-14 21:23:40 |
Site | UK_Edinburgh |
Entry | DUNE_UK_SGridECDF_ce1_multicore |
Last heartbeat | 2024-11-14 22:29:26 |
From worker node | Hostname | node2A06.ecdf.ed.ac.uk |
cpuinfo | Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz |
OS release | Scientific Linux release 7.9 (Nitrogen) |
Processors | 1 |
RSS bytes | 8388608000 (8000 MiB) |
Wall seconds limit | 171000 (47 hours) |
Inner Apptainer? | True |
Job state | outputting_failed |
Allocator name | justin-allocator-pro.dune.hep.ac.uk |
Started | 2024-11-14 21:29:55 |
Input files | usertests:000790_reco_data_2024-11-14T_093429Z.root
|
Jobscript | Exit code | 1 |
Real time | 0m (0s) |
CPU time | 0m (0s = 0%) |
Outputting started | |
Output files | |
Finished | 2024-11-14 22:29:26 |
List job events Wrapper job log |
Jobscript log (last 10,000 characters)
1:08.890468 +0000][Debug ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Stream TTL elapsed, disconnecting...
[2024-11-14 21:51:08.890705 +0000][Debug ][AsyncSock ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 21:51:08.890749 +0000][Debug ][Poller ][ 1126] <[::ffff:172.16.43.7]:44540><--><[::ffff:134.158.209.218]:1094> Removing socket from the poller
[2024-11-14 21:51:08.890894 +0000][Error ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Forcing error on disconnect: [ERROR] Operation interrupted.
[2024-11-14 21:51:08.890903 +0000][Debug ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Reporting disconnection to queued message handlers.
[2024-11-14 21:51:08.890975 +0000][Debug ][TaskMgr ][ 1126] Requesting unregistration of: "TickGeneratorTask for: root://ccxrootdegee.in2p3.fr:1094"
[2024-11-14 21:51:08.891011 +0000][Debug ][AsyncSock ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 21:51:08.891023 +0000][Debug ][PostMaster ][ 1126] [ccxrootdegee.in2p3.fr:1094] Destroying stream
[2024-11-14 21:51:08.891057 +0000][Debug ][AsyncSock ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 21:51:09.542844 +0000][Debug ][TaskMgr ][ 1126] Removing task: "TickGeneratorTask for: root://ccxrootdegee.in2p3.fr:1094"
IonAndScint endJob.
[2024-11-14 22:28:26.321372 +0000][Debug ][File ][ 1126] [0xb52a4d0@root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root?xrdcl.requuid=7c5077c4-c641-4feb-a1cf-8b9f6c77466a] Sending a close command for handle 0x0 to ccdcacli487.in2p3.fr:30288
[2024-11-14 22:28:26.321425 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] MsgHandler created: 0xbef88e0 (message: kXR_close (handle: 0x00000000) ).
[2024-11-14 22:28:26.321644 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] Moving MsgHandler: 0xbef88e0 (message: kXR_close (handle: 0x00000000) ) from out-queu to in-queue.
[2024-11-14 22:28:26.351782 +0000][Debug ][ExDbgMsg ][ 1126] [msg: 0x814de90] Assigned MsgHandler: 0xbef88e0.
[2024-11-14 22:28:26.351868 +0000][Debug ][ExDbgMsg ][ 1126] [handler: 0xbef88e0] Removed MsgHandler: 0xbef88e0 from the in-queue.
[2024-11-14 22:28:26.352033 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] Calling MsgHandler: 0xbef88e0 (message: kXR_close (handle: 0x00000000) ) with status: [SUCCESS] .
[2024-11-14 22:28:26.352106 +0000][Debug ][File ][ 1126] [0xb52a4d0@root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root?xrdcl.requuid=7c5077c4-c641-4feb-a1cf-8b9f6c77466a] Close returned from ccdcacli487.in2p3.fr:30288 with: [SUCCESS]
[2024-11-14 22:28:26.352142 +0000][Debug ][ExDbgMsg ][ 1126] [ccdcacli487.in2p3.fr:30288] Destroying MsgHandler: 0xbef88e0.
14-Nov-2024 22:28:26 GMT Closed input file "root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root"
================================================================================================================================
TimeTracker printout (sec) Min Avg Max Median RMS nEvts
================================================================================================================================
Full event 0.0138242 44.3036 3232.05 0.2652 365.685 77
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read) 0.000103587 0.00108059 0.0384995 0.000131251 0.00549651 77
simulate:rns:RandomNumberSaver 1.932e-05 3.49898e-05 0.000446615 2.8629e-05 4.76678e-05 77
simulate:largeant:larg4Main 0.0120177 0.78532 34.7448 0.20731 3.94328 77
simulate:IonAndScint:IonAndScint 9.8708e-05 0.0404752 2.9772 0.000122458 0.336886 77
simulate:elecDrift:SimDriftElectrons 3.6571e-05 0.424792 31.3002 4.7562e-05 3.54186 77
simulate:PDFastSim:PDFastSimPAR 5.2909e-05 42.9683 3163.03 7.2423e-05 357.918 77
[art]:TriggerResults:TriggerResultInserter 9.009e-06 1.13855e-05 3.4195e-05 9.83e-06 3.91001e-06 77
end_path:out1:RootOutput 2.804e-06 3.64836e-06 1.4699e-05 3.301e-06 1.48831e-06 77
end_path:out1:RootOutput(write) 0.000699126 0.0841502 0.761836 0.00349942 0.168652 76
================================================================================================================================
%MSG-i NuRandomService: RootOutput:out1@EndJob 14-Nov-2024 22:28:26 GMT ModuleEndJob
Summary of seeds computed by the NuRandomService
Random policy: 'perEvent'
algorithm version: EventTimestamp_v1
Configured value Last value ModuleLabel.InstanceName
(per event) 42050063 IonAndScint.ISCalcAlg
(per event) 566897688 PDFastSim.photon
(per event) 212160827 PDFastSim.scinttime
(per event) 489545914 elecDrift
(per event) 721211580 largeant
%MSG
====================================================================================================
MemoryTracker summary (base-10 MB units used)
Peak virtual memory usage (VmPeak) : 10509.5 MB
Peak resident set size usage (VmHWM): 7993.62 MB
====================================================================================================
TrigReport ---------- Event summary -------------
TrigReport Events total = 77 passed = 77 failed = 0
TrigReport ---------- Modules in End-path ----------
TrigReport Run Success Error Name
TrigReport 77 77 0 out1
TimeReport ---------- Time summary [sec] -------
TimeReport CPU = 3422.374794 Real = 3437.005005
MemReport ---------- Memory summary [base-10 MB] ------
MemReport VmPeak = 10509.5 VmHWM = 7993.62
%MSG-s ArtException: PostEndJob 14-Nov-2024 22:29:02 GMT ModuleEndJob
---- EventProcessorFailure BEGIN
EventProcessor: an exception occurred during current event processing
---- FatalRootError BEGIN
Fatal Root Error: TBufferFile::WriteByteCount
bytecount too large (more than 1073741822)
ROOT severity: 3000
---- FatalRootError END
---- EventProcessorFailure END
---- FatalRootError BEGIN
Fatal Root Error: TTree::SetEntries
Tree branches have different numbers of entries, eg EventAuxiliary has 76 entries while art::TriggerResults_TriggerResults__MUSUNGen. has 1000 entries.
ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
[2024-11-14 22:29:02.492798 +0000][Debug ][JobMgr ][ 1126] Stopping the job manager...
[2024-11-14 22:29:02.494005 +0000][Debug ][JobMgr ][ 1126] Job manager stopped
[2024-11-14 22:29:02.494601 +0000][Debug ][TaskMgr ][ 1126] Stopping the task manager...
[2024-11-14 22:29:02.494750 +0000][Debug ][TaskMgr ][ 1126] Task manager stopped
[2024-11-14 22:29:02.494762 +0000][Debug ][Poller ][ 1126] Stopping the poller...
[2024-11-14 22:29:02.494943 +0000][Debug ][TaskMgr ][ 1126] Requesting unregistration of: "TickGeneratorTask for: root://ccdcacli487.in2p3.fr:30288"
[2024-11-14 22:29:02.494967 +0000][Debug ][AsyncSock ][ 1126] [ccdcacli487.in2p3.fr:30288.0] Closing the socket
[2024-11-14 22:29:02.494998 +0000][Debug ][Poller ][ 1126] <[::ffff:172.16.43.7]:49436><--><[::ffff:134.158.209.235]:30288> Removing socket from the poller
[2024-11-14 22:29:02.495070 +0000][Debug ][PostMaster ][ 1126] [ccdcacli487.in2p3.fr:30288] Destroying stream
[2024-11-14 22:29:02.495121 +0000][Debug ][AsyncSock ][ 1126] [ccdcacli487.in2p3.fr:30288.0] Closing the socket
=== End last 100 lines of lar log file ===
lar exit code 0
Traceback (most recent call last):
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 434, in <module>
main()
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 373, in main
mddict = expSpecificMetadata.getmetadata()
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 344, in getmetadata
jobt = self.get_job(proc)
File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 69, in get_job
raise RuntimeError('sam_metadata_dumper returned nonzero exit status {}.'.format(rc))
RuntimeError: sam_metadata_dumper returned nonzero exit status 1.
extractor_prod.py exit code 1
Error reading metadata from file: Expecting value: line 1 column 1 (char 0)
pdjson2metadata exit code 1
.:
total 829800
-rw-r--r-- 1 gl05pi6 eddie_users 849067003 Nov 14 22:29 RootOutput-0bfa-ea88-76f4-8587.root
-rw-r--r-- 1 gl05pi6 eddie_users 617921 Nov 14 22:29 000790_reco_data_2024-11-14T_093429Z_reco_2024-11-14T_213000Z.log
-rw-r--r-- 1 gl05pi6 eddie_users 9961 Nov 14 22:29 jobscript.log
-rw-r--r-- 1 gl05pi6 eddie_users 519 Nov 14 22:29 g4_hist.root
-rw-r--r-- 1 gl05pi6 eddie_users 104 Nov 14 21:29 all-input-dids.txt
-rw-r--r-- 1 gl05pi6 eddie_users 0 Nov 14 22:29 000790_reco_data_2024-11-14T_093429Z_reco_data_2024-11-14T_213000Z.root.ext.json
-rw-r--r-- 1 gl05pi6 eddie_users 0 Nov 14 22:29 000790_reco_data_2024-11-14T_093429Z_reco_data_2024-11-14T_213000Z.root.json