Customized C4OpBoundaryProcess
BUT : need to change very nasty Geant4 impl:
Biggest problem with FastSim based POM : forces using unnatural geometry
body hidden under inner1 inner2
body is FastSim envelope volume
CURRENT : Unnatural 4-volume PMT (Pyrex+Pyrex+Vacuum+Vacuum) => Half Fake
Biggest advantage with Custom Boundary Process POM : enables natural geometry
Natural 2-volume PMT (Pyrex + Vacuum) : NOT COMPATIBLE WITH FastSim
Custom Boundary POM
+---------------pmt-Pyrex----------------+ | | | | | | | +~inner~Vacuum~~~~~~~~~~~+ | | ! ! | | ! ! | | ! ! | | ! ! | | ! ! | | + + | | | | | | | | | | | | | | | | | | | | | | +------------------------+ | | | | | | | +----------------------------------------+
OpSurfaceName[0] == '@'
Custom Boundary Process : Advantages
Current FastSim POM | 4 Solid, 4 LV, 4 PV |
Custom Boundary POM | 2 Solid, 2 LV, 2 PV |
Disadvantages
Advantages far outweigh disadvantages
Geant4 FastSim based junoPMTOpticalModel implementation has incorrect polarization and propagation time within PMT and also overcomplex 4 volume PMT geometry
junosw/-/tree/blyth-88-pivot-PMT-optical-model-from-FastSim-to-CustomG4OpBoundaryProcess
Pivot Branch
Merge Request
arr.view("datetime64[us]")
In [1]: a.rr Out[1]: array([1684168432404887, 1684168434567412], dtype=uint64) In [2]: np.c_[a.rr.view("datetime64[us]")] Out[2]: ## beginOfRun, endOfRun array([['2023-05-15T16:33:52.404887'], ['2023-05-15T16:33:54.567412']], dtype='datetime64[us]') In [5]: np.c_[a.s0.view("datetime64[us]")] Out[5]: ## beginPhoton stamps array([['2023-05-15T16:33:53.860482'], ['2023-05-15T16:33:53.860451'], ['2023-05-15T16:33:53.860421'], ..., ['2023-05-15T16:33:52.470429'], ['2023-05-15T16:33:52.470155'], ['2023-05-15T16:33:52.469285']], dtype='datetime64[us]') In [8]: a.t.shape ## pointPhoton stamps Out[8]: (10000, 32)
sysrap/stamp.h stamp::Now()
Collect uint64_t timestamps (microseconds since epoch (UTC))
Standalone FewPMT(1) test : No large stamp gaps ([us] microseconds)
TimeStamps : beginPhoton, pointPhoton, ..., endPhoton, OnePMT Standalone : UNIFORM : NO BIG GAPS (NO ProcessHits)
A,B Point counts consistent : Bump at 32 from recording truncation
Point 31 : at propagation time 4.3 ns (photon ends at: 10.9 ns, at [-237.9,0.,-200] -- 6.6ns of propagation not recorded
SD (+ some SA) photons : last step stamp "gaps" : slow ProcessHits?
Insitu (ntds2_cf), input photons targetting NNVT:0:1000
STAMP_TT=257400,600 STAMP_ANNO=1 ~/j/ntds/stamp.sh ## inpho => Hama:0:0
(s0) beginPhoton (s1) finalPhoton ( ) pointPhoton (h0) ProcessHits[ (i0) ProcessHits] -- ProcessHits taking ~56us for SD
junoSD_PMT_c2.cc
#include "SProfile.h" template<> std::vector<SProfile<16>> SProfile<16>::RECORD = {} ; ... G4bool junoSD_PMT_v2::ProcessHits(G4Step * step,G4TouchableHistory*) { m_profile->stamp(0); ... m_profile->stamp(1); const G4VTouchable* touch = track->GetTouchable(); int pmtID_1 = touch->GetReplicaNumber(1) ; if(pmtID_1 <= 0) pmtID_1 = touch->GetReplicaNumber(2) ; m_profile->stamp(2); int pmtID_2 = U4Touchable::AncestorReplicaNumber(touch) ; m_profile->stamp(3); int pmtID = get_pmtid(track); // TAKING 50-56 us m_profile->stamp(4);
#include <cstdint> #include <vector> #include <chrono> #include "NPX.h" template<int N> struct SProfile { uint64_t idx ; uint64_t t[N] ; static std::vector<SProfile<N>> RECORD ; void add(){ RECORD.push_back(*this) ; } void zero(){ idx = 0 ; for(int i=0 ; i < N ; i++) t[i] = 0 ; } static uint64_t Now() // microseconds[us] since UTC epoch { std::chrono::time_point<std::chrono::system_clock> t0 = std::chrono::system_clock::now(); return std::chrono::duration_cast<std::chrono::microseconds> (t0.time_since_epoch()).count() ; } void stamp(int i){ t[i] = Now(); } // stamp collection static constexpr const char* NAME = "SProfile.npy" ; static NP* Array(){ return NPX::ArrayFromVec<uint64_t,SProfile<N>>(RECORD,1+N); } static void Save(const char* dir, const char* reldir=nullptr){ NP* a = Array(); a->save(dir, reldir, NAME) ; } };
Timestamp collection + persisting implemented in the header
ProcessHits time fraction >10%
Sum of ProcessHits times/ event time
In [16]: np.sum(a.pfr)/a.ee[-1] Out[16]: 0.10450881266649384 In [17]: np.sum(b.pfr)/b.ee[-1] Out[17]: 0.11134881257006096
Select complete SProfile (ProcessHit hit candidates) in A,B:
// sevt.py : pf = f.junoSD_PMT_v2_SProfile In [30]: aa = a.pf[a.pf[:,6]>0,1:] In [38]: bb = b.pf[b.pf[:,6]>0,1:]
Subtract first timestamp for each ProcessHits call. Relative to start microseconds [us]:
In [35]: aa[:,:6] - aa[:,0,np.newaxis]
Out[35]:
array([[ 0, 551, 551, 571, 678, 1899],
[ 0, 1, 1, 1, 53, 96],
[ 0, 1, 1, 1, 54, 61],
[ 0, 0, 0, 1, 52, 57],
[ 0, 1, 1, 1, 57, 63],
...
In [39]: bb[:,:6] - bb[:,0,np.newaxis]
Out[39]:
array([[ 0, 0, 0, 1, 53, 1192],
[ 0, 0, 0, 1, 54, 99],
[ 0, 1, 1, 1, 54, 64],
[ 0, 1, 1, 1, 53, 59],
[ 0, 1, 1, 1, 53, 61],
...
## 0 1 2 3--->4 5
Most (50->56 us) ProcessHits time between stamps 3 and 4 in junoSD_PMT_v2::get_pmtid
"get_pmtid_fast" (~1us)
Only few SD volumes => use simpler approach :
const G4VTouchable* th = track->GetTouchable(); int ID(-1) ; ID = th->GetReplicaNumber(1); if( ID <= 0) ID = th->GetReplicaNumber(2);
int junoSD_PMT_v2::get_pmtid(G4Track* track) { int ipmt= -1; const G4VTouchable* touch= track->GetTouchable(); int nd= touch->GetHistoryDepth(); int id=0; for (id=0; idGetVolume(id)==track->GetVolume()) { int idid=1; G4VPhysicalVolume* tmp_pv=NULL; for (idid=1; idid < (nd-id); ++idid) { tmp_pv = touch->GetVolume(id+idid); G4LogicalVolume* mother_vol = tmp_pv->GetLogicalVolume(); G4LogicalVolume* daughter_vol = touch->GetVolume(id+idid-1)-> GetLogicalVolume(); int no_daugh = mother_vol -> GetNoDaughters(); if (no_daugh > 1) { int count = 0; for (int i=0; (count<2) &&(i < no_daugh); ++i) { if (daughter_vol->GetName() ==mother_vol->GetDaughter(i) ->GetLogicalVolume()->GetName()) { ++count; } } if (count > 1) break; } } ipmt= touch->GetReplicaNumber(id+idid-1); break; } } return ipmt; }
Looping over daughters at multiple levels and name matching to find copyNo : NO NEED FOR SUCH GENERALITY
spmt (old/new makes no difference):: junoSD_PMT_v2::ProcessHits pmtID 325044 pmtID_1 325044 pmtID_2 325044 pv PMT_3inch_body_phys junoSD_PMT_v2::ProcessHits C4Touchable::Desc HistoryDepth 7 TouchDepth 0 ReplicaDepth 1 ReplicaNumber 325044 i 0 cp 0 nd 2 so PMT_3inch_body_solid_ell_ell_helper pv PMT_3inch_body_phys i 1 cp 325044 nd 2 so PMT_3inch_pmt_solid pv PMT_3inch_log_phys i 2 cp 0 nd 46276 so sInnerWater pv pInnerWater i 3 cp 0 nd 1 so sReflectorInCD pv pCentralDetector i 4 cp 0 nd 4521 so sOuterWaterPool pv pOuterWaterPool i 5 cp 0 nd 1 so sPoolLining pv pPoolLining i 6 cp 0 nd 1 so sBottomRock pv pBtmRock LPMT_NEW (same depth to get copynumber as SPMT):: junoSD_PMT_v2::ProcessHits pmtID 1425 pmtID_1 1425 pmtID_2 1425 pv NNVTMCPPMT_PMT_20inch_log_phys junoSD_PMT_v2::ProcessHits C4Touchable::Desc HistoryDepth 7 TouchDepth 0 ReplicaDepth 1 ReplicaNumber 1425 i 0 cp 0 nd 1 so NNVTMCPPMT_PMT_20inch_pmt_solid_head pv NNVTMCPPMT_PMT_20inch_log_phys i 1 cp 1425 nd 3 so NNVTMCPPMTsMask_virtual pv pLPMT_NNVT_MCPPMT i 2 cp 0 nd 46276 so sInnerWater pv pInnerWater i 3 cp 0 nd 1 so sReflectorInCD pv pCentralDetector i 4 cp 0 nd 4521 so sOuterWaterPool pv pOuterWaterPool i 5 cp 0 nd 1 so sPoolLining pv pPoolLining i 6 cp 0 nd 1 so sBottomRock pv pBtmRock
Also checked with input photons targetting Hama:0:0 ## (pmtid == 0)
LPMT_OLD (one more level to get copynumber):: junoSD_PMT_v2::ProcessHits pmtID 1425 pmtID_1 1425 pmtID_2 1425 pv NNVTMCPPMT_PMT_20inch_body_phys junoSD_PMT_v2::ProcessHits C4Touchable::Desc HistoryDepth 8 TouchDepth 0 ReplicaDepth 2 ReplicaNumber 1425 i 0 cp 0 nd 2 so NNVTMCPPMT_PMT_20inch_body_solid_head pv NNVTMCPPMT_PMT_20inch_body_phys i 1 cp 0 nd 1 so NNVTMCPPMT_PMT_20inch_pmt_solid_head pv NNVTMCPPMT_PMT_20inch_log_phys i 2 cp 1425 nd 3 so NNVTMCPPMTsMask_virtual pv pLPMT_NNVT_MCPPMT i 3 cp 0 nd 46276 so sInnerWater pv pInnerWater i 4 cp 0 nd 1 so sReflectorInCD pv pCentralDetector WPMT (same depth to get copynumber as LPMT_OLD) junoSD_PMT_v2::ProcessHits pmtID 31000 pmtID_1 31000 pmtID_2 31000 pv PMT_20inch_veto_body_phys junoSD_PMT_v2::ProcessHits C4Touchable::Desc HistoryDepth 6 TouchDepth 0 ReplicaDepth 2 ReplicaNumber 31000 i 0 cp 0 nd 2 so PMT_20inch_veto_body_solid_1_2 pv PMT_20inch_veto_body_phys i 1 cp 0 nd 1 so PMT_20inch_veto_pmt_solid_1_2 pv PMT_20inch_veto_log_phys i 2 cp 31000 nd 2 so mask_PMT_20inch_vetosMask_virtual pv mask_PMT_20inch_vetolMaskVirtual_phys i 3 cp 0 nd 4521 so sOuterWaterPool pv pOuterWaterPool i 4 cp 0 nd 1 so sPoolLining pv pPoolLining i 5 cp 0 nd 1 so sBottomRock pv pBtmRock
OPTICKS_INPUT_PHOTON_FRAME=PMT_20inch_veto:0:1000 ntds2
N=1 MODE=3 ~/j/ntds/ntds.sh ana ## checking "get_pmtid_fast" for WPMT
Yaoguang : Event Time from Mean of 100 runs of 500 evt
--pmt-unnatural-geometry | 1.40s | |
--pmt-natural-geometry | 2.04s | (~45% longer with bug fixes ?) |
/junofs/users/wangyg/Software/juno-dev/Official/sim_dat
python ${RunPath}/tut_detsim.py --evtmax 500 --seed $1 ... (output opts skipped) ... --pmt-natural-geometry [ OR --pmt-unnatural-geometry ] gun --particles gamma --momentums 2.223 --momentums-interp KineticEnergy --positions 0 0 0
Simon : Comparing an implementation with many known bugs with one that fixes them
Natural always faster than Unnatural
A | N=0 | --pmt-unnatural-geometry |
B | N=1 | --pmt-natural-geometry |
B always faster than A in all config, variations tried
Debug build, with Opticks U4Recorder, only opticks-anamgr
GUN=2 ntds2_cf: Count Total(ms) A : DetSimAlg 10 55232.92090 B : DetSimAlg 10 48998.21484
Debug build, no Opticks, default anamgr
GUN=2 ntds0_cf: Count Total(ms) A : DetSimAlg 10 32182.09277 B : DetSimAlg 10 20433.68774
Release build, no Opticks, default anamgr
GUN=2 ntds0_cf: Count Total(ms) A : DetSimAlg 10 13037.99695 B : DetSimAlg 10 12551.25012
Release build, no Opticks, default anamgr, switch to "get_pmtid_fast" (B: ~6% faster)
GUN=2 ntds0_cf: Count Total(ms) A : DetSimAlg 10 12678.48804 B : DetSimAlg 10 11844.33704
Release build, no Opticks, default anamgr, switch to "get_pmtid_fast" (B: ~6% faster), disable all anamgr
GUN=2 ntds0_cf: Count Total(ms)
A : DetSimAlg 10 11973.54700
B : DetSimAlg 10 11426.45898 Workstation : : A,B get closer as memory decreases ?
A/B B/A ratios
A B array([[55232.921, 48998.215], [32182.093, 20433.688], [13037.997, 12551.25 ]]) a[:,1]/a[:,0] # B/A array([0.887, 0.635, 0.963]) a[:,0]/a[:,1] # A/B array([1.127, 1.575, 1.039])
0 | Both slow | recording every photon point is expensive (also fake skipping is expensive) |
1 | A,B most diff | default multiple anamgr => memory/IO pressure ?
|
2 | A,B closest | Release eases memory ?
|
cfmeta for 5 evt
NEVT=5 ./ntds.sh cfmeta msab.c_itab array([ [[26082, 25126, 25586, 26383, 26177], # A 0:ProcessHits_count [31874, 31477, 31544, 31319, 32353]],# B [[ 4022, 3967, 3934, 4193, 4137], 1:ProcessHits_true [ 3972, 4011, 3941, 4026, 3947]], [[22060, 21159, 21652, 22190, 22040], 2:ProcessHits_false [27902, 27466, 27603, 27293, 28406]], [[ 4022, 3967, 3934, 4193, 4137], 3:SaveNormHit_count [ 3972, 4011, 3941, 4026, 3947]], ... [[25249, 24311, 24723, 25561, 25368], 8:NEDEP [30964, 30552, 30654, 30408, 31459]], [[ 1, 1, 4, 0, 2], 9:NBOUND [ 99, 92, 94, 93, 81]],
A | fake volume 1e-3 mm larger than vacuum is Sensitive |
B | full 5mm larger than vacuum is Sensitive
|
Only Opticks anamgr
When using Opticks U4Recorder, only --opticks-anamgr is enabled:
anamgr(){ cat << EOU --opticks-anamgr --no-anamgr-normal --no-anamgr-genevt --no-anamgr-edm-v2 --no-anamgr-grdm --no-anamgr-deposit --no-anamgr-deposit-tt --no-anamgr-interesting-process --no-anamgr-optical-parameter --no-anamgr-timer EOU }
Suggestions to understand futher:
which has most impact on A/B performance ?
However : Pointless to spend much effort as A is WRONG:
(*) these actually fixed in branch
Detailed comparison only worthwhile between valid alternatives