justIN           Dashboard       Workflows       Jobs       AWT       Sites       Storages       Docs       Login

Jobsub ID 289882.0@justin-prod-sched01.dune.hep.ac.uk

Jobsub ID289882.0@justin-prod-sched01.dune.hep.ac.uk
Workflow ID4103
Stage ID1
User nameamoor@fnal.gov
HTCondor Groupgroup_dune
RequestedProcessors1
RSS bytes8388608000 (8000 MiB)
Wall seconds limit80000 (22 hours)
Submitted time2024-11-14 21:23:40
SiteUK_Edinburgh
EntryDUNE_UK_SGridECDF_ce1_multicore
Last heartbeat2024-11-14 22:29:26
From worker nodeHostnamenode2A06.ecdf.ed.ac.uk
cpuinfoIntel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
OS releaseScientific Linux release 7.9 (Nitrogen)
Processors1
RSS bytes8388608000 (8000 MiB)
Wall seconds limit171000 (47 hours)
Inner Apptainer?True
Job stateoutputting_failed
Allocator namejustin-allocator-pro.dune.hep.ac.uk
Started2024-11-14 21:29:55
Input filesusertests:000790_reco_data_2024-11-14T_093429Z.root
JobscriptExit code1
Real time0m (0s)
CPU time0m (0s = 0%)
Outputting started 
Output files
Finished2024-11-14 22:29:26
List job events     Wrapper job log

Jobscript log (last 10,000 characters)

1:08.890468 +0000][Debug  ][PostMaster        ][ 1126] [ccxrootdegee.in2p3.fr:1094] Stream TTL elapsed, disconnecting...
[2024-11-14 21:51:08.890705 +0000][Debug  ][AsyncSock         ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 21:51:08.890749 +0000][Debug  ][Poller            ][ 1126] <[::ffff:172.16.43.7]:44540><--><[::ffff:134.158.209.218]:1094> Removing socket from the poller
[2024-11-14 21:51:08.890894 +0000][Error  ][PostMaster        ][ 1126] [ccxrootdegee.in2p3.fr:1094] Forcing error on disconnect: [ERROR] Operation interrupted.
[2024-11-14 21:51:08.890903 +0000][Debug  ][PostMaster        ][ 1126] [ccxrootdegee.in2p3.fr:1094] Reporting disconnection to queued message handlers.
[2024-11-14 21:51:08.890975 +0000][Debug  ][TaskMgr           ][ 1126] Requesting unregistration of: "TickGeneratorTask for: root://ccxrootdegee.in2p3.fr:1094"
[2024-11-14 21:51:08.891011 +0000][Debug  ][AsyncSock         ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 21:51:08.891023 +0000][Debug  ][PostMaster        ][ 1126] [ccxrootdegee.in2p3.fr:1094] Destroying stream
[2024-11-14 21:51:08.891057 +0000][Debug  ][AsyncSock         ][ 1126] [ccxrootdegee.in2p3.fr:1094.0] Closing the socket
[2024-11-14 21:51:09.542844 +0000][Debug  ][TaskMgr           ][ 1126] Removing task: "TickGeneratorTask for: root://ccxrootdegee.in2p3.fr:1094"
IonAndScint endJob.
[2024-11-14 22:28:26.321372 +0000][Debug  ][File              ][ 1126] [0xb52a4d0@root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root?xrdcl.requuid=7c5077c4-c641-4feb-a1cf-8b9f6c77466a] Sending a close command for handle 0x0 to ccdcacli487.in2p3.fr:30288
[2024-11-14 22:28:26.321425 +0000][Debug  ][ExDbgMsg          ][ 1126] [ccdcacli487.in2p3.fr:30288] MsgHandler created: 0xbef88e0 (message: kXR_close (handle: 0x00000000) ).
[2024-11-14 22:28:26.321644 +0000][Debug  ][ExDbgMsg          ][ 1126] [ccdcacli487.in2p3.fr:30288] Moving MsgHandler: 0xbef88e0 (message: kXR_close (handle: 0x00000000) ) from out-queu to in-queue.
[2024-11-14 22:28:26.351782 +0000][Debug  ][ExDbgMsg          ][ 1126] [msg: 0x814de90] Assigned MsgHandler: 0xbef88e0.
[2024-11-14 22:28:26.351868 +0000][Debug  ][ExDbgMsg          ][ 1126] [handler: 0xbef88e0] Removed MsgHandler: 0xbef88e0 from the in-queue.
[2024-11-14 22:28:26.352033 +0000][Debug  ][ExDbgMsg          ][ 1126] [ccdcacli487.in2p3.fr:30288] Calling MsgHandler: 0xbef88e0 (message: kXR_close (handle: 0x00000000) ) with status: [SUCCESS] .
[2024-11-14 22:28:26.352106 +0000][Debug  ][File              ][ 1126] [0xb52a4d0@root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root?xrdcl.requuid=7c5077c4-c641-4feb-a1cf-8b9f6c77466a] Close returned from ccdcacli487.in2p3.fr:30288 with: [SUCCESS] 
[2024-11-14 22:28:26.352142 +0000][Debug  ][ExDbgMsg          ][ 1126] [ccdcacli487.in2p3.fr:30288] Destroying MsgHandler: 0xbef88e0.
14-Nov-2024 22:28:26 GMT  Closed input file "root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/usertests/80/cf/000790_reco_data_2024-11-14T_093429Z.root"

================================================================================================================================
TimeTracker printout (sec)                        Min           Avg           Max         Median          RMS         nEvts   
================================================================================================================================
Full event                                     0.0138242      44.3036       3232.05       0.2652        365.685        77     
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                        0.000103587   0.00108059     0.0384995    0.000131251   0.00549651       77     
simulate:rns:RandomNumberSaver                 1.932e-05    3.49898e-05   0.000446615   2.8629e-05    4.76678e-05      77     
simulate:largeant:larg4Main                    0.0120177      0.78532       34.7448       0.20731       3.94328        77     
simulate:IonAndScint:IonAndScint              9.8708e-05     0.0404752      2.9772      0.000122458    0.336886        77     
simulate:elecDrift:SimDriftElectrons          3.6571e-05     0.424792       31.3002     4.7562e-05      3.54186        77     
simulate:PDFastSim:PDFastSimPAR               5.2909e-05      42.9683       3163.03     7.2423e-05      357.918        77     
[art]:TriggerResults:TriggerResultInserter     9.009e-06    1.13855e-05   3.4195e-05     9.83e-06     3.91001e-06      77     
end_path:out1:RootOutput                       2.804e-06    3.64836e-06   1.4699e-05     3.301e-06    1.48831e-06      77     
end_path:out1:RootOutput(write)               0.000699126    0.0841502     0.761836     0.00349942     0.168652        76     
================================================================================================================================
%MSG-i NuRandomService:  RootOutput:out1@EndJob 14-Nov-2024 22:28:26 GMT  ModuleEndJob

Summary of seeds computed by the NuRandomService
Random policy: 'perEvent'
  algorithm version: EventTimestamp_v1
   Configured value          Last value   ModuleLabel.InstanceName
        (per event)            42050063   IonAndScint.ISCalcAlg
        (per event)           566897688   PDFastSim.photon
        (per event)           212160827   PDFastSim.scinttime
        (per event)           489545914   elecDrift
        (per event)           721211580   largeant

%MSG

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 10509.5 MB
  Peak resident set size usage (VmHWM): 7993.62 MB
====================================================================================================

TrigReport ---------- Event summary -------------
TrigReport Events total = 77 passed = 77 failed = 0

TrigReport ---------- Modules in End-path ----------
TrigReport        Run    Success      Error Name
TrigReport         77         77          0 out1

TimeReport ---------- Time summary [sec] -------
TimeReport CPU = 3422.374794 Real = 3437.005005

MemReport  ---------- Memory summary [base-10 MB] ------
MemReport  VmPeak = 10509.5 VmHWM = 7993.62

%MSG-s ArtException:  PostEndJob 14-Nov-2024 22:29:02 GMT ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- FatalRootError BEGIN
    Fatal Root Error: TBufferFile::WriteByteCount
    bytecount too large (more than 1073741822)
    ROOT severity: 3000
  ---- FatalRootError END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 76 entries while art::TriggerResults_TriggerResults__MUSUNGen. has 1000 entries.
  ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
[2024-11-14 22:29:02.492798 +0000][Debug  ][JobMgr            ][ 1126] Stopping the job manager...
[2024-11-14 22:29:02.494005 +0000][Debug  ][JobMgr            ][ 1126] Job manager stopped
[2024-11-14 22:29:02.494601 +0000][Debug  ][TaskMgr           ][ 1126] Stopping the task manager...
[2024-11-14 22:29:02.494750 +0000][Debug  ][TaskMgr           ][ 1126] Task manager stopped
[2024-11-14 22:29:02.494762 +0000][Debug  ][Poller            ][ 1126] Stopping the poller...
[2024-11-14 22:29:02.494943 +0000][Debug  ][TaskMgr           ][ 1126] Requesting unregistration of: "TickGeneratorTask for: root://ccdcacli487.in2p3.fr:30288"
[2024-11-14 22:29:02.494967 +0000][Debug  ][AsyncSock         ][ 1126] [ccdcacli487.in2p3.fr:30288.0] Closing the socket
[2024-11-14 22:29:02.494998 +0000][Debug  ][Poller            ][ 1126] <[::ffff:172.16.43.7]:49436><--><[::ffff:134.158.209.235]:30288> Removing socket from the poller
[2024-11-14 22:29:02.495070 +0000][Debug  ][PostMaster        ][ 1126] [ccdcacli487.in2p3.fr:30288] Destroying stream
[2024-11-14 22:29:02.495121 +0000][Debug  ][AsyncSock         ][ 1126] [ccdcacli487.in2p3.fr:30288.0] Closing the socket
=== End last 100 lines of lar log file ===
lar exit code 0
Traceback (most recent call last):
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 434, in <module>
    main()
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 373, in main
    mddict = expSpecificMetadata.getmetadata()
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 344, in getmetadata
    jobt = self.get_job(proc)
  File "/cvmfs/dune.opensciencegrid.org/products/dune/duneutil/v09_75_03d00/bin/extractor_prod.py", line 69, in get_job
    raise RuntimeError('sam_metadata_dumper returned nonzero exit status {}.'.format(rc))
RuntimeError: sam_metadata_dumper returned nonzero exit status 1.
extractor_prod.py exit code 1
Error reading metadata from file: Expecting value: line 1 column 1 (char 0)
pdjson2metadata exit code 1
.:
total 829800
-rw-r--r-- 1 gl05pi6 eddie_users 849067003 Nov 14 22:29 RootOutput-0bfa-ea88-76f4-8587.root
-rw-r--r-- 1 gl05pi6 eddie_users    617921 Nov 14 22:29 000790_reco_data_2024-11-14T_093429Z_reco_2024-11-14T_213000Z.log
-rw-r--r-- 1 gl05pi6 eddie_users      9961 Nov 14 22:29 jobscript.log
-rw-r--r-- 1 gl05pi6 eddie_users       519 Nov 14 22:29 g4_hist.root
-rw-r--r-- 1 gl05pi6 eddie_users       104 Nov 14 21:29 all-input-dids.txt
-rw-r--r-- 1 gl05pi6 eddie_users         0 Nov 14 22:29 000790_reco_data_2024-11-14T_093429Z_reco_data_2024-11-14T_213000Z.root.ext.json
-rw-r--r-- 1 gl05pi6 eddie_users         0 Nov 14 22:29 000790_reco_data_2024-11-14T_093429Z_reco_data_2024-11-14T_213000Z.root.json
justIN time: 2024-11-22 08:17:05 UTC       justIN version: 01.01.09