JUNO Opticks/Geant4 Optical Photon Simulation Matching

Opticks replaces Geant4 optical photon simulation with an equivalent implementation that benefits from state-of-the-art GPU ray tracing from NVIDIA OptiX. All optically relevant aspects of Geant4 context must be translated+copied to GPU:

geometry : solids, structure, material+surface properties

generation : Cerenkov + Scintillation (using Gensteps from Geant4)

propagation : Rayleigh scattering, absorption, reemission, boundary

Achieving+maintaining equivalence is time consuming, BUT:

transformative performance benefits >1000x Geant4

Simon C Blyth, July 12, 2021

Geant4OpticksWorkflow

Feb 2021 Review : "More manpower to Opticks"

Simon Blyth, two "branches" of development throughout this year:
1. JUNO-Opticks simulation matching, the focus of this presentation
2. Opticks with all new : NVIDIA OptiX 7 API
  - demands reimplementation of all GPU code + GPU/CPU shared geometry model

Yuxiang Hu, brave first JUNO-Opticks user
- installed JUNO-Offline + Opticks onto new NVIDIA Quadro 8000 GPU workstation
- helped with finding rough edges in installation and usage
- finding hit discrepancies motivated work on simulation matching

Wang Zike, non-JUNO : applying Opticks to huge Cerenkov water detectors
- found Geant4/Opticks discrepancy in Cerenkov wavelength range (RINDEX vs standard)

Helpful addition of active users, BUT: JUNO-Opticks still needs many more active users

early adopter payback : drastically faster optical simulation

Outline 1 : Matching Machinery + Fixes

Simulation Matching Machinery
- p6: JUNO-Offline : Tools for Optical Photon Simulation Matching
- p7: Opticks : Tools for Optical Matching, Reemission Bookkeeping
- p8: G4OpticksRecorder : full details of Geant4 simulation passed to Opticks
- p9: JUNO Offline DsG4Scintillation : Reemission Bookkeeping
- p10: Comparing Opticks/Geant4 photon history counts : "tut_detsim.py gun" event

Examples of recently fixed issues
- p11: Overview of JUNO-Opticks issues and fixes
- p12: Photons not stopping at Tyvek, FIXED by adding explicit surfaces
- p13: Prepare an input photon that misses PMTs for checking the Water
- p14,15,18: Opticks : excess AB, lack SA in innerWater (FIX: remove degenerates)
- p16,17: PMT Volume GDML plots showing degenerates
- p19: Less scattering with 100k input photons in Water, FIX: X4MaterialWater

Outline 2 : Scintillation + Cerenkov + Geometry

Scintillation Wavelength Matching
- p20: Compare Geant4 and Opticks implementations
- p21: Explanations of following plots
- p22,23,24,25: wavelength comparison plots
- p26: Inverse Cumulative Distribution Function (ICDF)
- p27: "Multi-resolution" GPU texture (3,4096,1) : 20x Resolution for 3x bins

Cerenkov Wavelength Matching
- p28: JUNO Offline G4Cerenkov_modified BUG FIX
- p29: Cerenkov Cone Angle + Beta Requirements reminder
- p30,31: Cerenkov distributions ASIS and SKIP_CONTINUE
- p32: Cerenkov wavelength comparison : First Look after bug fix
- p33: Poor Chi2 Match : Few bins + interpolation difference ?

Remaining JUNO-Opticks Issues + Summary
- p34: Remaining JUNO-Opticks Issues, TODO List
- p35-41: Look into slow geometry
- p42: Back to OK/G4 photon history counts : small LS absorption excess ?
- p43: LS properties as function of wavelength
- p44: Summary and Links

Outline 3 : Extras on Opticks NVIDIA OptiX 7 future

Extras on Opticks future
- NVIDIA OptiX 7 : Entirely new thin API (Introduced Aug 2019)
- New "Foundry" Model : Sharef CPU/GPU Geometry Context
- 1st JUNO Opticks OptiX 7 Ray-trace
- Current JUNO Geometry : Auto-Factorized by "progeny digest"
- Vary Geom. Compare Render Times
- JUNO Geometry : OptiX 7 Ray Trace Times ~2M pixels : TITAN RTX
- JUNO ALL PMTs : 2M ray traced pixels in 0.0097 s : NVIDIA TITAN RTX, NVIDIA OptiX 7.0.0, Opticks
- LS properties as function of wavelenth

JUNO Offline : Tools for Optical Photon Simulation Matching

tut_detsim.py options:

--opticks-mode 1 : Opticks optical simulation only
--opticks-mode 2 : Geant4 optical simulation only
--opticks-mode 3 : both OK+G4 -> writes paired OpticksEvent, OK:1,2,.. G4:-1,-2,..
--opticks-anamgr : pass all G4Track G4Step to G4OpticksRecorder

Using GtOpticksTool GenTool to run from input photons : NumPy arrays (num_photons,4,4):

tut_detsim.py ...
   --opticks-mode 3 ... opticks
   --input-photon-path /path/to/input_photons.npy
   --input-photon-repeat 10000                         ## repeat photons
   --input-photon-wavelength 360,380,400,420,440,460   ## override

create photon arrays : ~/opticks/ana/input_photons.py,parallel_input_photons.py
focus on an issue by repeating a single photon
create parallel beam of offset photons to probe problem geometry

Opticks : Tools for Optical Photon Simulation Matching

G4OpticksRecorder/CManager/CRecorder/CWriter

records Geant4 propagations into OpticksEvent data structures
OpticksEvent are persisted to NumPy .npy files

~/opticks/ana/evt.py ab.py ... NumPy based analysis machinery

ab.py : load and compare paired events
- a:OK tags:1,2,3,..
- b:G4 tags:-1,-2,-3,...

C+S Gensteps : Geant4 -> Opticks

number of photons to generate, line segment to generate them along, any other parameters needed
-> number of photons always equal in OK and G4
Opticks bookkeeping treats reemission as "rebirth" of a fraction of the absorbed
- not treated as separate photons

G4OpticksRecorder : full details of Geant4 simulation passed to Opticks


 34 struct G4OK_API G4OpticksRecorder
 35 {
 ...
 47     G4OpticksRecorder();
 48     virtual ~G4OpticksRecorder();
 49
 50     void setGeometry(const GGeo* ggeo);
 51
 52     virtual void BeginOfRunAction(const G4Run*);
 53     virtual void EndOfRunAction(const G4Run*);
 54
 55     virtual void BeginOfEventAction(const G4Event*);
 56     virtual void EndOfEventAction(const G4Event*);
 57
 58     virtual void PreUserTrackingAction(const G4Track*);
 59     virtual void PostUserTrackingAction(const G4Track*);
 60
 61     virtual void UserSteppingAction(const G4Step*);
 62
 63     void ProcessHits( const G4Step* step, bool efficiency_collect );
 64
 65     static G4OpticksRecorder*  fInstance;
 66
 67 private:
 68     void TerminateEvent();
 69
 70 };

JUNO Offline DsG4Scintillation : Reemission Bookkeeping

Both simulations using same REEMISSIONPROB but different mechanics

 200 G4VParticleChange*
 201 DsG4Scintillation::PostStepDoIt(const G4Track& aTrack, const G4Step& aStep)
 ...
 235             flagReemission= doReemission
 236                 && aTrack.GetTrackStatus() == fStopAndKill
 237                 && aStep.GetPostStepPoint()->GetStepStatus() != fGeomBoundary;

Geant4 : bulk absorbed G4Track yield 0/1 reemission photon as secondary G4Track
Opticks : fraction of BULK_ABSORB (AB) photons reborn in the same CUDA thread, no "secondary" to handle

Geant4 WITH_G4OPTICKS photon_id

photon_id assigned at first S + C generation (attached to G4Track with G4VUserTrackInformation)
photon_id retained thru reemission generations

CRecorder/CWriter : REJOIN reemission secondary tracks with ancestors

continue writing reemission secondary G4Track steps to original photon_id record

Comparing OK/G4 photon history counts for "tut_detsim.py gun" event

 epsilon:ana blyth$ tds3gun.sh 1    ## seqhis: 64bit uint : 16x4bit step flags for each photon
 In [1]: ab.his[:20]   ##  OK:1    G4:-1     OK-G4   "c2" deviation

 .  n           seqhis        a        b      a-b     (a-b)^2/(a+b)   label                ## optickscore/OpticksPhoton.h enum
           ## hexstring
 0000               42     1822     1721      101            2.88     SI AB                ## AB : BULK_ABSORB 
 0001            7ccc2     1446     1406       40            0.56     SI BT BT BT SD       ## SD : SURFACE_DETECT 
 0002           7ccc62      672      666        6            0.03     SI SC BT BT BT SD    ## SC : BULK_SCATTER 
 0003            8ccc2      649      597       52            2.17     SI BT BT BT SA       ## BT : BOUNDARY_TRANSMIT 
 0004             8cc2      606      615       -9            0.07     SI BT BT SA          ## SI : SCINTILLATION 
 0005              452      538      536        2            0.00     SI RE AB             ## RE : BULK_REEMIT 
 0006           7ccc52      433      438       -5            0.03     SI RE BT BT BT SD
 0007              462      397      405       -8            0.08     SI SC AB
 0008           8ccc62      269      262        7            0.09     SI SC BT BT BT SA
 0009          7ccc662      242      222       20            0.86     SI SC SC BT BT BT SD
 0010            8cc62      217      212        5            0.06     SI SC BT BT SA
 0011          7ccc652      211      205        6            0.09     SI RE SC BT BT BT SD
 0012           8ccc52      200      201       -1            0.00     SI RE BT BT BT SA
 0013            8cc52      158      192      -34            3.30     SI RE BT BT SA
 0014             4552      181      165       16            0.74     SI RE RE AB
 0015               41      164      145       19            1.17     CK AB                ## CK : CERENKOV
 0016          7ccc552      135      160      -25            2.12     SI RE RE BT BT BT SD
 0017             4cc2      130      115       15            0.92     SI BT BT AB
 0018             4652      120      117        3            0.04     SI RE SC AB

 .             TOTALS:    11684    11684                    52.92     52.92/63 =  0.84   pvalue:P[C2>]:0.814  1-pvalue:P[C2<]:0.186

 In [2]: ab.his.ss[ab.his.c2 > 2.7]       ## largest chi-squared contributors -> next issue to look into  
 Out[2]:
 array(['0000               42     1822     1721      101            2.88     SI AB',
        '0013            8cc52      158      192      -34            3.30     SI RE BT BT SA',
        '0020          8ccc662       75      108      -33            5.95     SI SC SC BT BT BT SA',
        '0027           8cc552       49       70      -21            3.71     SI RE RE BT BT SA',
        '0029             4562       39       66      -27            6.94     SI SC RE AB'],

Overview of fixed JUNO-Opticks material property/geometry issues

photons not stopping at Tyvek
- FIX: add explicit surfaces for GPU corresponding to implicit RINDEX-NoRINDEX

excess BULK_ABSORB (AB), lack SURFACE_ABSORB (SA) in inner Water
- FIX: skip near degenerate(-0.001 mm) pointless Pyrex///Pyrex PMT solids for GPU
- must avoid degeneracy as Opticks material/surface properties are boundary based

much less scattering in the Water
- FIX: get Geant4 calculated RAYLEIGH property for "Water" to GPU
- Geant4 special casing of "Water" now matched with Opticks

poor scintillation wavelength chi2 match in tails
- FIX: adopt "multi-resolution" GPU texture to 20x resolution for tails
- Actually was non-issue : but the fix technique is useful anyhow

matching G4Cerenkov_modified deferred as found/investigated/reported Offline bug
- Tao : bug fixed in "Pre1" already

Photons not stopping at Tyvek, FIXED by adding explicit surfaces

Major problems -> huge chi2, from zeros in history table
G4: implicit RINDEX-NoRINDEX surface at Tyvek///Water boundary
OK: FIX with explicit surface Tyvek//Implicit_RINDEX_NoRINDEX_pInnerWater_pCentralDetector/Water

In [7]: ab.a.seqhis_ana.table.compare(ab.b.seqhis_ana.table, ordering="sum")
Out[7]:
.                   cfo:sum  1:g4live:tds3gun   -1:g4live:tds3gun
.                         11278     11278                  1766.28/69 = 25.60  (pval:0.000 prob:1.000)
0000               42      1653      1665    -12             0.04    [2 ] SI AB
0001            7ccc2      1292      1230     62             1.52    [5 ] SI BT BT BT SD
0002            8ccc2       590       674    -84             5.58    [5 ] SI BT BT BT SA
0003           7ccc62       581       552     29             0.74    [6 ] SI SC BT BT BT SD
0004              452       422       534   -112            13.12    [3 ] SI RE AB
0005           7ccc52       380       397    -17             0.37    [6 ] SI RE BT BT BT SD
0006              462       392       367     25             0.82    [3 ] SI SC AB
0007           8ccc62       251       267    -16             0.49    [6 ] SI SC BT BT BT SA
0008             8cc2         0       464   -464           464.00    [4 ] SI BT BT SA  ## LS->Acrylic->Water->SURFACE_ABSORB on Tyvek 
0009          7ccc662       219       213      6             0.08    [7 ] SI SC SC BT BT BT SD
0010             4cc2       278       127    151            56.30    [4 ] SI BT BT AB
0011           8ccc52       154       188    -34             3.38    [6 ] SI RE BT BT BT SA
0012          7ccc652       157       159     -2             0.01    [7 ] SI RE SC BT BT BT SD
0013               41       142       144     -2             0.01    [2 ] CK AB
0014            4cc62       197        71    126            59.24    [5 ] SI SC BT BT AB
0015             4552       124       142    -18             1.22    [4 ] SI RE RE AB
0016             4662       137       121     16             0.99    [4 ] SI SC SC AB
0017             4652       121       112      9             0.35    [4 ] SI RE SC AB
0018          7ccc552       102       124    -22             2.14    [7 ] SI RE RE BT BT BT SD
0019           7cccc2        53       154   -101            49.28    [6 ] SI BT BT BT BT SD
0020            8cc62         0       186   -186           186.00    [5 ] SI SC BT BT SA
0021          8ccc662        69       107    -38             8.20    [7 ] SI SC SC BT BT BT SA

Prepare an input photon that misses PMTs for checking the Water

epsilon:ana blyth$ tds3gun.sh 1      # IPython session using Opticks NumPy based analysis machinery  

In [1]: b.sel = "SI BT BT SA"    # selecting G4 photons that go direct to Tyvek

In [2]: b.ox.shape               # ox : final photons array honours the selection
Out[2]: (464, 4, 4)

In [3]: p = b.ox[0:1] ; p        # first photon
Out[3]:
A([[[ 14210.083 ,   5228.8896, -13142.859 ,    110.41  ],        # position, time
    [     0.7191,      0.2595,     -0.6447,      1.    ],        # direction, weight
    [    -0.4468,     -0.538 ,     -0.7148,    415.3976],        # polarization, wavelength 
    [     0.    ,      0.    ,      0.    ,      0.    ]]],      # uint flags: not visible as float
    dtype=float32)

In [4]: p[0,0,:3] -= p[0,1,:3]*2220.   # back up the photon by 2220mm to position 10mm into the water

In [5]: np.save("/tmp/check_innerwater.npy", p )   # save photon array with 1 photon to .npy file

Perform both Geant4 and Opticks simulations with 100k repeats of the input photon

tut_detsim.py ...
    --opticks-mode 3     # enable Geant4 + Opticks simulations
    --opticks-anamgr     # G4Track, G4Step --> G4OpticksRecorder
    opticks              # GtOpticksTool input photon "GenTool"
    --input-photon-path /tmp/check_innerwater.npy
    --input-photon-repeat 100000

Opticks : excess AB, lack SA in innerWater (FIX: remove degenerates)

In [2]: ab.his          # repr of opticks.ana.seq:SeqTable object 
Out[2]:
ab.his
.       seqhis_ana  cfo:sum  1:g4live:tds3gun   -1:g4live:tds3gun        c2        ab        ba
.                         11142     11142                  616.50/66 =  9.34
   n             iseq         a         b    a-b               c2    [ns] label
0000               42      1666      1621     45             0.62    [2 ] SI AB
0001            7ccc2      1264      1258      6             0.01    [5 ] SI BT BT BT SD
0002            8ccc2       581       766   -185            25.41    [5 ] SI BT BT BT SA
0003           7ccc62       579       543     36             1.16    [6 ] SI SC BT BT BT SD
0004             8cc2       570       496     74             5.14    [4 ] SI BT BT SA
0005              452       408       495    -87             8.38    [3 ] SI RE AB
0006              462       399       351     48             3.07    [3 ] SI SC AB
0007           7ccc52       360       385    -25             0.84    [6 ] SI RE BT BT BT SD
0008           8ccc62       239       258    -19             0.73    [6 ] SI SC BT BT BT SA
0009          7ccc662       218       195     23             1.28    [7 ] SI SC SC BT BT BT SD
0010            8cc62       195       180     15             0.60    [5 ] SI SC BT BT SA
0011             4cc2       268       104    164            72.30    [4 ] SI BT BT AB
0012           8ccc52       160       191    -31             2.74    [6 ] SI RE BT BT BT SA
0013          7ccc652       163       165     -2             0.01    [7 ] SI RE SC BT BT BT SD
0014               41       156       160     -4             0.05    [2 ] CK AB
0015             4552       118       152    -34             4.28    [4 ] SI RE RE AB
0016            8cc52       136       133      3             0.03    [5 ] SI RE BT BT SA
0017             4662       125       114     11             0.51    [4 ] SI SC SC AB
0018            4cc62       189        40    149            96.95    [5 ] SI SC BT BT AB
.                         11142     11142                  616.50/66 =  9.34

SI BT BT AB: photons that (BT) BOUNDARY_TRANSMIT from : LS -> Acrylic -> Water where they are (AB) BULK_ABSORB-ed
SI BT BT BT SA: LS -> Acrylic -> Water -> Pyrex where they are (SA) SURFACE_ABSORB-ed on Pyrex/Vacuum boundary

Opticks : excess AB, lack SA in innerWater (FIX: remove degenerates)

wrong boundary means Opticks using Pyrex properties, not Water, for a fraction of photons (5%)

In [3]: a.sel = "SI BT BT BT SA"      ## select photons with this history
In [15]: a.bn.reshape(-1,4).view(np.int8)[:20]     ## signed boundary at up to 16 steps of the photon 
Out[15]:
A([[ 18,  17, -23, -25,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
   [ 18,  17, -23, -27,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
   ...
   [ 18,  17, -23, -27,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
   [ 18,  17, -24, -25,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0]], dtype=int8)

In [17]: print(a.blib.format(a.bn.reshape(-1,4).view(np.int8)[0]))   ## 16*char(int8) packed into uint4 4*32bit

 18 : Acrylic///LS            ##  boundary : outer-material/outer-surface/inner-surface/inner-material  
 17 : Water///Acrylic         ## +ve boundary : photon direction is with the normal, exiting the shape  
-23 : Water///Pyrex           ## -ve boundary : photon direction against the normal, entering the shape 
-25 : Pyrex/NNVTMCPPMT_photocathode_logsurf2/NNVTMCPPMT_photocathode_logsurf1/Vacuum

In [18]: print(a.blib.format(a.bn.reshape(-1,4).view(np.int8)[1]))
 18 : Acrylic///LS
 17 : Water///Acrylic
-23 : Water///Pyrex
-27 : Pyrex/HamamatsuR12860_photocathode_logsurf2/HamamatsuR12860_photocathode_logsurf1/Vacuum

In [19]: print(a.blib.format(a.bn.reshape(-1,4).view(np.int8)[19]))
 18 : Acrylic///LS
 17 : Water///Acrylic
-24 : Pyrex///Pyrex   ## expect Water///Pyrex --> GEOMETRY PROBLEM 
-25 : Pyrex/NNVTMCPPMT_photocathode_logsurf2/NNVTMCPPMT_photocathode_logsurf1/Vacuum

All LPMTs : near degenerate Water///Pyrex and "pointless" same material Pyrex///Pyrex borders offset: 0.001mm

gpmt Hamamatsu jun28

gpmt NNVT jun28

Opticks : excess AB, lack SA in innerWater : FIX remove degenerates

Possible approaches to handle degenerate geometry:

fix source geometry removing the pointless Pyrex///Pyrex -0.001mm boundary

seems to cause no problem for Geant4 : get "microStep" of ~0.001-0.002mm

eg --dbgseqhis 0x7ccccd dumps Geant4 details for photons in category

experience shows is time consuming to get agreement on geometry change

runtime workaround trying to notice inconsistency by maintaining state of which material are in

need very strong reasons to add complexity to GPU code, so disqualify this

skip the near degenerate inner Pyrex solids in the translated GPU geometry

easy, causing no problems because are removing a pointless Pyrex///Pyrex border

BUT: requires suppression of Geant4 "microSteps" in CRecorder to facilitate history comparison

Currently using 3, with Opticks commandline option:

--skipsolidname
    NNVTMCPPMT_body_solid,
    HamamatsuR12860_body_solid_1_9,
    PMT_20inch_veto_body_solid_1_2

Less scattering with 100k input photons in Water, FIX: X4MaterialWater

epsilon:ana blyth$ tds3ip.sh 1
In [1]: ab.his[:15]
Out[1]:
ab.his
.       seqhis_ana  cfo:sum  1:g4live:tds3ip   -1:g4live:tds3ip
.                        100000    100000                  739.41/9 = 82.16  (pval:0.000 prob:1.000)
0000               8d     93766     92777    989             5.24   [2 ] TO SA
0001               4d      6031      5918    113             1.07   [2 ] TO AB
0002             7c6d        38       311   -273           213.55   [4 ] TO SC BT SD
0003              86d        33       236   -203           153.19   [3 ] TO SC SA
0004            4cc6d        10       212   -202           183.80   [5 ] TO SC BT BT AB
0005             8c6d        13        80    -67            48.27   [4 ] TO SC BT SA
0006              46d        20        63    -43            22.28   [3 ] TO SC AB
0007          8ccac6d         0        72    -72            72.00   [7 ] TO SC BT SR BT BT SA
0008           46cc6d         1        39    -38            36.10   [6 ] TO SC BT BT SC AB
0009             4c6d        10        21    -11             3.90   [4 ] TO SC BT AB
0010            7cc6d         2        27    -25             0.00   [5 ] TO SC BT BT SD
0011       ccacccac6d         0        26    -26             0.00   [10] TO SC BT SR BT BT BT SR BT BT
0012           4ccc6d         0        19    -19             0.00   [6 ] TO SC BT BT BT AB
0013         7ccccc6d         9         9      0             0.00   [8 ] TO SC BT BT BT BT BT SD
0014          466cc6d         1        17    -16             0.00   [7 ] TO SC BT BT SC SC AB
.                        100000    100000                  739.41/9 = 82.16  (pval:0.000 prob:1.000)

JUNO "Water" material has no RAYLEIGH scattering length property
G4OpRayleigh calculates RAYLEIGH for materials named "Water" with RINDEX and no RAYLEIGH
- BUT: G4Material is unchanged, only G4OpRayleigh::thePhysicsTable
- Opticks converts materials, not process physics tables
- https://bitbucket.org/simoncblyth/opticks/src/master/extg4/X4MaterialWater.cc
  - use G4OpRayleigh to calculate RAYLEIGH and add it to the G4Material so Opticks sees it

Scintillation Wavelength : Compare Geant4 and Opticks implementations

 200 G4VParticleChange*
 201 DsG4Scintillation::PostStepDoIt(
        const G4Track& aTrack, const G4Step& aStep)
 208 {
 ...
 540     G4PhysicsOrderedFreeVector* ScintillationIntegral = NULL;
 543     ScintillationIntegral =
 544       (G4PhysicsOrderedFreeVector*)
               ((*theFastIntegralTable)(materialIndex));
 ...
 579    for(G4int i = 0 ; i < NumPhoton ; i++) {
 ...
 583    G4double sampledEnergy;
 586       G4double CIIvalue = G4UniformRand()*
 587          ScintillationIntegral->GetMaxValue();
 588       sampledEnergy=
 589          ScintillationIntegral->GetEnergy(CIIvalue);  

 096 G4double G4PhysicsOrderedFreeVector::GetEnergy(G4double aValue)
  97 {
  98         G4double e;
 ...
 104           size_t closestBin = FindValueBinLocation(aValue);
 105           e = LinearInterpolationOfEnergy(aValue, closestBin);
 107         return e;
 108 }
 110 size_t G4PhysicsOrderedFreeVector::FindValueBinLocation(G4double aValue)
 111 {
 112         size_t bin = std::lower_bound(dataVector.begin(), dataVector.end(), aValue)
 113                    - dataVector.begin() - 1;
 114         bin = std::min(bin, numberOfNodes-2);
 115         return bin;
 116 }

Geant4 : find ICDF bin then interpolate, Opticks : direct linear interpolation on HD GPU texture : 4096 bins

Scintillation Wavelength Matching : Explanations of following plots

hd0 : simple 4096 bins
hd20 : 4096 bins with 20x bins for u < 0.05 and u > 0.95
cudaFilterModePoint : linear interpolation disabled

DsG4Scintillator compared with	chi2	note
hd0_cudaFilterModePoint	13.79	"artifact" issue in tails, terrible chi2
hd0	1.00	interpolation enabled, fixes chi2
hd20	0.88	effectively x20 bins, improves chi2
hd20_cudaFilterModePoint	1.29	XHD "HD20" even OK without interpolation

GPU textures can reproduce very precisely inverse cumulative distribution functions by using high definition
"hd20" multi-resolution technique can effect variable resolution suited to scintillator inverse-CDF

1M Wavelength Sample comparison hd0 no interpolation

1M Wavelength Sample comparison hd0

1M Wavelength Sample comparison hd20

1M Wavelength Sample comparison hd20 no interpolation

GScintillatorLib ICDF

"Multi-resolution" GPU texture (3,4096,1) : 20x Resolution for 3x bins

156 NPY<double>* X4Scintillation::CreateGeant4InterpolatedInverseCDF(
157        const G4PhysicsOrderedFreeVector* ScintillatorIntegral,
158        unsigned num_bins, unsigned hd_factor, ..
161 )
164     double mx = ScintillatorIntegral->GetMaxValue() ;
166     NPY<double>* icdf = NPY<double>::make(3, num_bins, 1 );
170     int nj = icdf->getShape(1);
180     double edge = 1./double(hd_factor) ;
...
202     for(int j=0 ; j < nj ; j++)
203     {
204     double u_all = double(j)/double(nj) ;
205     double u_lhs = double(j)/double(hd_factor*nj) ;
206     double u_rhs = 1. - edge + double(j)/double(hd_factor*nj) ;
207
208     double energy_all = ScintillatorIntegral->GetEnergy( u_all*mx );
209     double energy_lhs = ScintillatorIntegral->GetEnergy( u_lhs*mx );
210     double energy_rhs = ScintillatorIntegral->GetEnergy( u_rhs*mx );
211
212     double wavelength_all = h_Planck*c_light/energy_all/nm ;
213     double wavelength_lhs = h_Planck*c_light/energy_lhs/nm ;
214     double wavelength_rhs = h_Planck*c_light/energy_rhs/nm ;
215
217     icdf->setValue(0, j, 0, 0,  wavelength_all );
218     icdf->setValue(1, j, 0, 0,  wavelength_lhs );
219     icdf->setValue(2, j, 0, 0,  wavelength_rhs );
220     }
221     return icdf ;
222 }

https://bitbucket.org/simoncblyth/opticks/src/master/extg4/X4Scintillation.cc

https://bitbucket.org/simoncblyth/opticks/src/master/qudarap/qctx.h

JUNO Offline trunk G4Cerenkov_modified : BUG FIXED in "Pre1"

168 G4VParticleChange*
169 G4Cerenkov_modified::PostStepDoIt(
      const G4Track& aTrack, const G4Step& aStep)
...
252   G4double Pmin = Rindex->GetMinLowEdgeEnergy();
253   G4double Pmax = Rindex->GetMaxLowEdgeEnergy();
254   G4double dp = Pmax - Pmin;
...
268   G4double maxCos = BetaInverse / nMax;
270   G4double maxSin2 = (1.0 - maxCos) * (1.0 + maxCos);
...
315   for (G4int i = 0; i < fNumPhotons; i++) {
317
318       G4double rand;
319       G4double sampledEnergy, sampledRI;
320       G4double cosTheta, sin2Theta;
...
324       do {
325          rand = G4UniformRand();
326          sampledEnergy = Pmin + rand * dp;
327          sampledRI = Rindex->Value(sampledEnergy);
328
329          // check if n(E) > 1/beta
330          if (sampledRI < BetaInverse) {
331              continue;
332          }
333
334          cosTheta = BetaInverse / sampledRI;
341
342          sin2Theta = (1.0 - cosTheta)*(1.0 + cosTheta);
343          rand = G4UniformRand();
344
346       } while (rand*maxSin2 > sin2Theta);

https://bitbucket.org/simoncblyth/opticks/src/master/examples/Geant4/CerenkovStandalone/G4Cerenkov_modifiedTest.cc

Cerenkov Cone Angle + Beta requirements reminder


              \    B    /
          \    .   |   .    /                            AC     ct / n          1         i       BetaInverse
      \    C       |       C    /             cos th =  ---- =  --------   =  ------ =   ---  =  -------------
  \    .    \      |      /    .     /                   AB       bct           b n       n        sampledRI
   .         \    bct    /          .
              \    |    /                                  BetaInverse
               \   |   /  ct                  maxCos  =  -----------------
                \  |th/  ----                                nMax
                 \ | /    n
                  \|/
                   A

Particle travels AB, light travels AC,  ACB is right angle


 Only get Cerenkov radiation when

        cos th <= 1 ,

        beta >= beta_min = 1/n        BetaInverse <= BetaInverse_max = n


 At the small beta threshold AB = AC,   beta = beta_min = 1/n     eg for n = 1.333, beta_min = 0.75

        cos th = 1,  th = 0         light is in direction of the particle


 For ultra relativistic particle beta = 1, there is a maximum angle

        th = arccos( 1/n )

G4Cerenkov_modifiedTest_ASIS

G4Cerenkov_modifiedTest_SKIP_CONTINUE

1.5/1.8 = .83, 1.5/1.62 = 0.92

2.82M Sample Cerenkov Wavelength comparison

FIRST LOOK, AFTER BUG FIX. Fixed BetaInverse = 1.50

Poor Cerenkov Chi2 Match : Few Bins + Interpolation difference ?

 In [1]: ri = np.load(os.path.join(kd, "GScintillatorLib/LS_ori/RINDEX.npy")) ; ri[:,0] *= 1e6
 In [2]: ri_w = ri.copy() ; ri_w[:,0] = 1240./ri_w[:,0] ;
 In [2]: np.c_[ri, ri_w]
 Out[2]:
 array([[  1.55  ,   1.4781, 800.    ,   1.4781],
        [  1.7951,   1.48  , 690.7886,   1.48  ],
        [  2.105 ,   1.4842, 589.0764,   1.4842],
        [  2.2708,   1.4861, 546.0703,   1.4861],
        [  2.5511,   1.4915, 486.0629,   1.4915],
        [  2.845 ,   1.4955, 435.8554,   1.4955],
        [  3.0636,   1.4988, 404.7513,   1.4988],
        [  4.1328,   1.5264, 300.038 ,   1.5264],
        [  6.2   ,   1.6185, 200.    ,   1.6185],
        [  6.526 ,   1.6176, 190.0092,   1.6176],
        [  6.889 ,   1.527 , 179.9971,   1.527 ],
        [  7.294 ,   1.5545, 170.0027,   1.5545],
        [  7.75  ,   1.793 , 160.    ,   1.793 ],
        [  8.267 ,   1.7826, 149.994 ,   1.7826],
        [  8.857 ,   1.6642, 140.0023,   1.6642],
        [  9.538 ,   1.5545, 130.0063,   1.5545],
        [ 10.33  ,   1.4536, 120.0387,   1.4536],
        [ 15.5   ,   1.4536,  80.    ,   1.4536]])

GPU and CPU code very similar
RINDEX interpolation details likely to be cause of difference

Remaining JUNO-Opticks Issues, TODO List

Expected Straightforward Updates

currently using LocalG4Cerenkov1042 not G4Cerenkov_modified
- add Opticks genstep collection to G4Cerenkov_modified
recent changes to PMTParamSvc PMTSimParamSvc
- -> reimplementation of Opticks sensor efficiency collection

Uncertain Difficulty : JUNO Geometry Issues

slow "Double Greek Temple" geometry issue from "fasteners"
implement standalone test to understand leading causes of slow geometry:
1. complex CSG tree
2. CSG subtraction of huge acrylic sphere (huge bounding box)
separately instanced siblings : 590x sStrutBallhead, uni1, base_steel, uni_acrylic3
- unrelated volumes positioned together not being instanced together
- simpler to change Opticks geometry translation than change source geometry

Currently skipping LSExpDetectorConstruction::setupCD_Sticks to allow fair comparisons.

[17]lLowerChimney_phys_all_00000

compare render times varying geometry included from same viewpoint
render time is good indicator for simulation time

[11]lLowerChimney_phys8,00000

590 "uni_acrylic3" Fasteners : cost 65x more than all 45.6k PMTs

[4]lLowerChimney_phys6,00000

590 of these "uni1" render quickly
BUT : this shape is subtracted from "uni_acrylic3"

quicktime lAddition_uni_acrylic3

LV : lAddition, Solid : uni_acrylic3

quicktime 2 lAddition_uni_acrylic3

gplt lAddition_uni_acrylic3

LV : lAddition, Solid : uni_acrylic3, Class : AdditionAcrylicConstruction

Suspect : subtracting spherical segment, sagitta 5.68mm, is performance killer.
also complicated interior of Fastener irrelevant to optical photons

dotted line is tangent to acrylic sphere circle touching at (0,0) in uni_acrylic3 frame
sagitta : 17820. - np.sqrt( np.power(17820.,2) - np.power(450.,2)) = 5.68 mm

2d GDML matplotlib.pyplot with : opticks/ana/gplt.py opticks/analytic/GDML.py parser

Possible cause of slow geometry : Subtracting 17.82 m sphere

092 AdditionAcrylicConstruction::makeAdditionLogical(){
...
108         double ZNodes3[3];
109         double RminNodes3[3];
110         double RmaxNodes3[3];
111         ZNodes3[0] = 5.7*mm; RminNodes3[0] = 0*mm; RmaxNodes3[0] = 450.*mm;
112         ZNodes3[1] = 0.0*mm; RminNodes3[1] = 0*mm; RmaxNodes3[1] = 450.*mm;
113         ZNodes3[2] = -140.0*mm; RminNodes3[2] = 0*mm; RmaxNodes3[2] = 200.*mm;
114
115         solidAddition_down = new G4Polycone("solidAddition_down",0.0*deg,360.0*deg,3,ZNodes3,RminNodes3,RmaxNodes3);
...
122     solidAddition_up = new G4Sphere("solidAddition_up",0*mm,17820*mm,0.0*deg,360.0*deg,0.0*deg,180.*deg); 
123
124     uni_acrylic1 = new G4SubtractionSolid("uni_acrylic1",solidAddition_down,solidAddition_up,0,G4ThreeVector(0*mm,0*mm,+17820.0*mm));
125
126     solidAddition_up1 = new G4Tubs("solidAddition_up1",120*mm,208*mm,15.2*mm,0.0*deg,360.0*deg);
127     uni_acrylic2 = new G4SubtractionSolid("uni_acrylic2",uni_acrylic1,solidAddition_up1,0,G4ThreeVector(0.*mm,0.*mm,-20*mm));
128     solidAddition_up2 = new G4Tubs("solidAddition_up2",0,14*mm,52.5*mm,0.0*deg,360.0*deg);
129
130     for(int i=0;i<8;i++)
131     {
132     uni_acrylic3 = new G4SubtractionSolid("uni_acrylic3",uni_acrylic2,solidAddition_up2,0,G4ThreeVector(164.*cos(i*pi/4)*mm,164.*sin(i*pi/4)*mm,-87.5));
133     uni_acrylic2 = uni_acrylic3;
134
135     }
136
137

Subtraction of huge sphere from small polycone

huge combined bounding box (bbox) for this boolean solid (repeated x590)
terrible CSG performance on GPU

Back to OK/G4 photon history counts : small LS absorption excess ?

 epsilon:ana blyth$ tds3gun.sh 1    ## seqhis: 64bit uint : 16x4bit step flags for each photon
 In [1]: ab.his[:20]   ##  OK:1    G4:-1     OK-G4   "c2" deviation

 .  n           seqhis        a        b      a-b     (a-b)^2/(a+b)   label                ## optickscore/OpticksPhoton.h enum
           ## hexstring
 0000               42     1822     1721      101            2.88     SI AB                ## AB : BULK_ABSORB 
 0001            7ccc2     1446     1406       40            0.56     SI BT BT BT SD       ## SD : SURFACE_DETECT 
 0002           7ccc62      672      666        6            0.03     SI SC BT BT BT SD    ## SC : BULK_SCATTER 
 0003            8ccc2      649      597       52            2.17     SI BT BT BT SA       ## BT : BOUNDARY_TRANSMIT 
 0004             8cc2      606      615       -9            0.07     SI BT BT SA          ## SI : SCINTILLATION 
 0005              452      538      536        2            0.00     SI RE AB             ## RE : BULK_REEMIT 
 0006           7ccc52      433      438       -5            0.03     SI RE BT BT BT SD
 0007              462      397      405       -8            0.08     SI SC AB
 0008           8ccc62      269      262        7            0.09     SI SC BT BT BT SA
 0009          7ccc662      242      222       20            0.86     SI SC SC BT BT BT SD
 0010            8cc62      217      212        5            0.06     SI SC BT BT SA
 0011          7ccc652      211      205        6            0.09     SI RE SC BT BT BT SD
 0012           8ccc52      200      201       -1            0.00     SI RE BT BT BT SA
 0013            8cc52      158      192      -34            3.30     SI RE BT BT SA
 0014             4552      181      165       16            0.74     SI RE RE AB
 0015               41      164      145       19            1.17     CK AB                ## CK : CERENKOV
 0016          7ccc552      135      160      -25            2.12     SI RE RE BT BT BT SD
 0017             4cc2      130      115       15            0.92     SI BT BT AB
 0018             4652      120      117        3            0.04     SI RE SC AB

 .             TOTALS:    11684    11684                    52.92     52.92/63 =  0.84   pvalue:P[C2>]:0.814  1-pvalue:P[C2<]:0.186

Opticks using 1nm step domain for all properties, populated by Geant4 interpolation
perhaps not fine enough for fast moving LS absorption across peak region 410->430 nm
(but first increase statistics as excess not very significant)

LS Wavelength

Summary and Links

Opticks : state-of-the-art GPU ray traced optical simulation integrated with Geant4. Match with JUNO-Geant4 is converging. More pioneer users welcome

More JUNO-Opticks users essential to bring into production

https://bitbucket.org/simoncblyth/opticks	code repository
https://github.com/simoncblyth/opticks/releases	.zip .tar.gz archives
https://simoncblyth.bitbucket.io	presentations and videos
https://groups.io/g/opticks	forum/mailing list archive
email:opticks+subscribe@groups.io	subscribe to mailing list

"Extra" Slides Follow

NVIDIA OptiX 7 : Entirely new thin API (Introduced Aug 2019)

NVIDIA OptiX 6->7 : drastically slimmed down

headers only (no library, just Driver)
low-level CUDA-centric thin API (Vulkan-ized)
Minimal host state, All host functions are thread-safe
GPU launches : explicit, asynchronous (CUDA streams)
near perfect scaling to 4 GPUs, for free
Shared CPU/GPU geometry context
GPU memory management
Multi-GPU support

Advantages

More control/flexibility over everything.

Fully benefit from future GPUs
Keep pace with state-of-the-art GPU ray tracing

Disadvantages

Demands much more developer effort than OptiX 6

Major re-implementation of Opticks required

New "Foundry" Model : Shared CPU/GPU Geometry Context

Inst : 4x4 tran. + Solid ref. ( Inst -> 1 IAS )
Solid : 1 or more Prim : ( Solid -> GAS )
Prim : 1,3,7,15,31,... Node : (Prim ~ G4VSolid)

struct CSGFoundry
{
   void upload(); // to GPU 
...
   std::vector<CSGSolid>  solid ; // compounds (eg PMT)
   std::vector<CSGPrim>   prim ;
   std::vector<CSGNode>   node ; // shapes, operators

   std::vector<float4> plan ; // planes
   std::vector<qat4>   tran ; // CSG transforms
   std::vector<qat4>   itra ; // inverse CSG transforms
   std::vector<qat4>   inst ; // instance transforms

   // entire geometry in four GPU allocations
   CSGPrim*    d_prim ;
   CSGNode*    d_node ;
   float4*     d_plan ;
   qat4*       d_itra ;
 };

referencing by offset, count

replaces geometry context dropped in OptiX 6->7
array-based -> simple, inherent serialization + persisting
entire geometry in 4 GPU allocations

Simple intersect headers, common CPU/GPU types

use with : pre-7, 7 + testing on CPU

https://github.com/simoncblyth/CSG "Foundry" model

csg_intersect_tree.h/csg_intersect_node.h/...: simple headers common to pre-7/7/CPU-testing

https://github.com/simoncblyth/CSG_GGeo

Convert Opticks/GGeo -> CSGFoundry

https://github.com/simoncblyth/CSGOptiX

OptiX 7 + pre-7 rendering

GAS : Geometry Acceleration Structure

IAS : Instance Acceleration Structure

CSG : Constructive Solid Geometry

[9]cxr_i0_t8,_-1 : EXCLUDE SLOWEST

Current JUNO Geometry : Auto-Factorized by "progeny digest"

ridx	plc	prim	component	note
0	1	3084	3084:sWorld	non-repeated remainder
1	25600	5	5:PMT_3inch_pmt_solid	4 types of PMT
2	12612	5	5:NNVTMCPPMTsMask
3	5000	5	5:HamamatsuR12860sMask
4	2400	5	5:mask_PMT_20inch_vetosMask
5	590	1	1:sStrutBallhead	4 parts of same assembly, BUT not grouped as siblings (not parent-child)
6	590	1	1:uni1
7	590	1	1:base_steel
8	590	1	1:uni_acrylic3
9	504	130	130:sPanel	repeated parts of TT

ridx:0 "remainder" Prim
- Prim that did not pass instancing criteria, on number of repeats + complexity
- TODO: tune criteria to instance more, reducing remainder Prim (Expect: 3084->~ 84)
ridx:1,2,3,4
- four types of PMT, all with 5 Prim
ridx:5,6,7,8
- same 590x assembly but not grouped together : as siblings (not parent-child like PMTs)
- TODO: implement instancing of siblings, combining 4 -> 1

Increasing instancing : reduces memory for geometry -> improved performance

JUNO OptiX 7 : "Foundry" Geometry Scan

JUNO Geometry : OptiX 7 Ray Trace Times ~2M pixels : TITAN RTX

idx	-e	time(s)	relative	enabled geometry description
0	9,	0.0017	0.1702	ONLY: 130:sPanel
1	7,	0.0017	0.1714	ONLY: 1:base_steel
2	6,	0.0019	0.1923	ONLY: 1:uni1
3	5,	0.0027	0.2780	ONLY: 1:sStrutBallhead
4	4,	0.0032	0.3268	ONLY: 5:mask_PMT_20inch_vetosMask
5	1,	0.0032	0.3287	ONLY: 5:PMT_3inch_pmt_solid
6	2,	0.0055	0.5669	ONLY: 5:NNVTMCPPMTsMask
7	3,	0.0074	0.7582	ONLY: 5:HamamatsuR12860sMask
8	1,2,3,4	0.0097	1.0000	ONLY PMT
9	t8,0	0.0099	1.0179	EXCL: 1:uni_acrylic3 3084:sWorld
10	0,	0.1171	12.0293	ONLY: 3084:sWorld
11	t8,	0.1186	12.1769	EXCL: 1:uni_acrylic3
12	t0,	0.5278	54.2066	EXCL: 3084:sWorld
13	8,	0.5310	54.5298	ONLY: 1:uni_acrylic3
14	t3,	0.6017	61.7954	EXCL: 5:HamamatsuR12860sMask
15	t2,	0.6043	62.0620	EXCL: 5:NNVTMCPPMTsMask
16	t5,	0.6171	63.3787	EXCL: 1:sStrutBallhead
17	t6,	0.6196	63.6301	EXCL: 1:uni1
18	t7,	0.6226	63.9458	EXCL: 1:base_steel
19	t0	0.6240	64.0879	3084:sWorld
20	t4,	0.6243	64.1169	EXCL: 5:mask_PMT_20inch_vetosMask
21	t9,	0.6335	65.0636	EXCL: 130:sPanel
22	t1,	0.6391	65.6384	EXCL: 5:PMT_3inch_pmt_solid

JUNO ALL PMTs : 2M ray traced pixels in 0.0097 s : NVIDIA TITAN RTX, NVIDIA OptiX 7.0.0, Opticks

JUNO Opticks/Geant4 Optical Photon Simulation Matching

JUNO Opticks/Geant4 Optical Photon Simulation Matching

Geant4OpticksWorkflow

Feb 2021 Review : "More manpower to Opticks"

Outline 1 : Matching Machinery + Fixes

Outline 2 : Scintillation + Cerenkov + Geometry

Outline 3 : Extras on Opticks NVIDIA OptiX 7 future

JUNO Offline : Tools for Optical Photon Simulation Matching

Opticks : Tools for Optical Photon Simulation Matching

G4OpticksRecorder : full details of Geant4 simulation passed to Opticks

JUNO Offline DsG4Scintillation : Reemission Bookkeeping

Comparing OK/G4 photon history counts for "tut_detsim.py gun" event

Overview of fixed JUNO-Opticks material property/geometry issues

Photons not stopping at Tyvek, FIXED by adding explicit surfaces

Prepare an input photon that misses PMTs for checking the Water

Opticks : excess AB, lack SA in innerWater (FIX: remove degenerates)

Opticks : excess AB, lack SA in innerWater (FIX: remove degenerates)

gpmt Hamamatsu jun28

gpmt NNVT jun28

Opticks : excess AB, lack SA in innerWater : FIX remove degenerates

Less scattering with 100k input photons in Water, FIX: X4MaterialWater

Scintillation Wavelength : Compare Geant4 and Opticks implementations

Scintillation Wavelength Matching : Explanations of following plots

1M Wavelength Sample comparison hd0 no interpolation

1M Wavelength Sample comparison hd0

1M Wavelength Sample comparison hd20

1M Wavelength Sample comparison hd20 no interpolation

GScintillatorLib ICDF

"Multi-resolution" GPU texture (3,4096,1) : 20x Resolution for 3x bins

JUNO Offline trunk G4Cerenkov_modified : BUG FIXED in "Pre1"

Cerenkov Cone Angle + Beta requirements reminder

G4Cerenkov_modifiedTest_ASIS

G4Cerenkov_modifiedTest_SKIP_CONTINUE

2.82M Sample Cerenkov Wavelength comparison

Poor Cerenkov Chi2 Match : Few Bins + Interpolation difference ?

Remaining JUNO-Opticks Issues, TODO List

[17]lLowerChimney_phys_all_00000

[11]lLowerChimney_phys__8,__00000

[4]lLowerChimney_phys__6,__00000

quicktime lAddition_uni_acrylic3

quicktime 2 lAddition_uni_acrylic3

gplt lAddition_uni_acrylic3

Possible cause of slow geometry : Subtracting 17.82 m sphere

Back to OK/G4 photon history counts : small LS absorption excess ?

LS Wavelength

Summary and Links

"Extra" Slides Follow

NVIDIA OptiX 7 : Entirely new thin API (Introduced Aug 2019)

New "Foundry" Model : Shared CPU/GPU Geometry Context

[9]cxr_i0_t8,_-1 : EXCLUDE SLOWEST

Current JUNO Geometry : Auto-Factorized by "progeny digest"

JUNO OptiX 7 : "Foundry" Geometry Scan

JUNO Geometry : OptiX 7 Ray Trace Times ~2M pixels : TITAN RTX

JUNO ALL PMTs : 2M ray traced pixels in 0.0097 s : NVIDIA TITAN RTX, NVIDIA OptiX 7.0.0, Opticks

[11]lLowerChimney_phys8,00000

[4]lLowerChimney_phys6,00000