Integration of JUNO Offline + Opticks

Integration of JUNO Offline + Opticks :
GPU Optical Simulation with NVIDIA® OptiX™

Open source,

Simon C Blyth, IHEP, CAS — JUNO Collaboration Meeting, 6 July 2020, Virtual



JUNO Optical Photon Simulation Problem...

Ray-tracing vs Rasterization

/env/presentation/nvidia/nv_rasterization.png /env/presentation/nvidia/nv_raytrace.png

Optical Photon Simulation ≈ Ray Traced Image Rendering

Much in common : geometry, light sources, optical physics

Many Applications of ray tracing :


Spatial Index Acceleration Structure

NVIDIA® OptiX™ Ray Tracing Engine --

OptiX makes GPU ray tracing accessible

NVIDIA expertise:

User provides (Yellow):

[1] Turing+ GPUs eg NVIDIA TITAN RTX


Opticks : Translates G4 Optical Physics to CUDA/OptiX

OptiX : single-ray programming model -> line-by-line translation

CUDA Ports of Geant4 classes
  • G4Cerenkov (only generation loop)
  • G4Scintillation (only generation loop)
  • G4OpAbsorption
  • G4OpRayleigh
  • G4OpBoundaryProcess (only a few surface types)
Modify Cherenkov + Scintillation Processes
  • collect genstep, copy to GPU for generation
  • avoids copying millions of photons to GPU
Scintillator Reemission
  • fraction of bulk absorbed "reborn" within same thread
  • wavelength generated by reemission texture lookup
Opticks (OptiX/Thrust GPU interoperation)
  • OptiX : upload gensteps
  • Thrust : seeding, distribute genstep indices to photons
  • OptiX : launch photon generation and propagation
  • Thrust : pullback photons that hit PMTs
  • Thrust : index photon step sequences (optional)

G4Solid -> CUDA Intersect Functions for ~10 Primitives


Sphere, Cylinder, Disc, Cone, Convex Polyhedron, Hyperboloid, Torus, ...

G4Boolean -> CUDA/OptiX Intersection Program Implementing CSG

Complete Binary Tree, pick between pairs of nearest intersects:

UNION tA < tB Enter B Exit B Miss B
Enter A ReturnA LoopA ReturnA
Exit A ReturnA ReturnB ReturnA
Miss A ReturnB ReturnB ReturnMiss
[1] Ray Tracing CSG Objects Using Single Hit Intersections, Andrew Kensler (2006)
with corrections by author of XRT Raytracer
Similar to binary expression tree evaluation using postorder traverse.

Opticks : Translates G4 Geometry to GPU, Without Approximation

G4 Structure Tree -> Instance+Global Arrays -> OptiX

Group structure into repeated instances + global remainder:

instancing -> huge memory savings for JUNO PMTs




JUNO Geometry : Remove Torus to enable translation to GPU options (in SVN trunk r3990)

skip the guide tube torus

"cylinder - torus" -> polycone

  • Hamamatsu_R12860_PMTSolid
TODO: --pmt20inch-simpler-csg
simpler+faster CSG, same geometry

"tubs-torus" -> polycone : --pmt20inch-polycone-neck







Integration of Offline + Opticks : Opticks is now a JunoENV External

Add Opticks to an existing installation (SVN trunk r3990+):

bash junoenv libs all opticks

Four mandatory envvars :

export CMTEXTRATAGS=opticks 
# -> Opticks ON in JunoENV + CMT
export OPTICKS_COMPUTE_CAPABILITY=70  # -> nvcc flags
export OPTICKS_CUDA_PREFIX=/usr/local/cuda-10.1
export OPTICKS_OPTIX_PREFIX=/usr/local/OptiX_650
# location of CUDA and OptiX installs 

Build all JUNO externals, including Opticks (takes ~1hr, 13G) :

    mkdir -p $JUNOTOP && cd $JUNOTOP
    svn co && cd $JUNOTOP/junoenv
    local libs=$(bash junoenv libs list | perl -ne 'm, (\S*)@, && print "$1\n"' -)
    for lib in $libs ; do
        bash junoenv libs all $lib || return 1

Integration of Offline + Opticks : Where WITH_G4OPTICKS used ?

Offline Opticks : #ifdef WITH_G4OPTICKS ... #endif + if(m_opticksMode > 0){ ... }

cd $JUNOTOP/offline/Simulation/DetSimV2 ; find . -type f -exec grep -l WITH_G4OPTICKS {} \+

./DetSimOptions/src/                  ## initialize/finalize
./DetSimOptions/src/   ## translate geometry to GPU

./PhysiSim/src/    ## collect Scintillation gensteps, excluding reemission
./PhysiSim/src/  ## collect Cerenkov gensteps
./PhysiSim/src/    ## use LocalG4Cerenkov1042 rather than G4Cerenkov

./PMTSim/src/         ## setup separate PMTHitMerger m_pmthitmerger_opticks
./PMTSim/src/    ## bulk hit creation+collection at EndOfEvent
  • setup extra hitCollection_opticks (temporarily doing both GPU+CPU sim, with separate hits + merger)
  • invoke G4Opticks::propagateOpticalPhotons with the gensteps collected
  • bulk populate hitCollection_opticks

Integration of Offline + Opticks : options options relevant to Opticks usage

--opticks-mode 0
do not use Opticks GPU propagation, the default
--opticks-mode 1 (currently SVN trunk r3990)
performs both CPU and GPU optical simulations put GPU hits into separate hitCollectionOpticks
switch off CPU optical photons, put GPU hits into standard hitCollection
change geometry skipping the guide tube torus
change geometry of LPMT necks avoiding use of torus

tds : Example Bash Function :

tds-(){ python $JUNOTOP/offline/Examples/Tutorial/share/ $* ; }
tds(){ tds- --opticks-mode 1 --no-guide_tube --pmt20inch-polycone-neck --evtmax 2 gun ; }

Integration of Offline + Opticks : Next Step : Theta dep. CE -> GPU

(1) Complete CPU side bulk hit creation in PMTSim: junoSD_PMT_v2::EndOfEvent_Opticks

Hits currently missing:

(2) CUDA port: junoSD_PMT_v2::get_ce


[1] set in NormalTrackInfo by NormalAnaMgr::UserSteppingAction [2] PMTID int distinguishes NNVT/Hamamatsu !

[3] "GPU Prefilter for Accurate Cubic B-spline Interpolation"

Opticks + Geant4 News : Two Geant4 Task Force R&Ds Based on Opticks

Full integration of Opticks with Geant4 : LAr TPCs, DUNE

Creation of a Geant4 extended example G4OpticksTest

Advised Hans on Opticks use, adding Wavelength Shifting

Prototype of gamma or electron transport using GPU vendor library : HL-LHC ECAL

Examine the Opticks simulation of optical photons, to identify similarities/differences ... for gamma and/or electron interactions

Agreed to act in a consulting role for this task ; email exchanges with John Apostolakis

[1] HSF : HEP Software Foundation

scan-pf-1_Opticks_vs_Geant4 2

JUNO analytic, 400M photons from center Speedup
Geant4 Extrap. 95,600 s (26 hrs)  
Opticks RTX ON (i) 58 s 1650x

PMTID physvol/@copynumber -?-> PMTtype Hamamatsu/NNVT -- GDML:

014743       <physvol name="pLPMT_NNVT_MCPPMT0x329e330">       <-- zero is special ? NOT A GOOD ID  
 14744         <volumeref ref="NNVTMCPPMTlMaskVirtual0x32a56c0"/>
 14745         <position name="pLPMT_NNVT_MCPPMT0x329e330_pos" unit="mm" x="-930.2976" y="-111.8724" z="19365"/>
 14746         <rotation name="pLPMT_NNVT_MCPPMT0x329e330_rot" unit="deg" x="180" y="0" z="180"/>
 14747       </physvol>
 14748       <physvol copynumber="1" name="pLPMT_Hamamatsu_R128600x329e3b0">
 14749         <volumeref ref="HamamatsuR12860lMaskVirtual0x3290b70"/>
 14750         <position name="pLPMT_Hamamatsu_R128600x329e3b0_pos" unit="mm" x="-492.5657" y="-797.0872" z="19365"/>
 14751         <rotation name="pLPMT_Hamamatsu_R128600x329e3b0_rot" unit="deg" x="180" y="0" z="180"/>
 14752       </physvol>
 14753       <physvol copynumber="2" name="pLPMT_NNVT_MCPPMT0x329e4b0">   <-- forced to use name OR PMTParamSvc -> PMTType 
 14754         <volumeref ref="NNVTMCPPMTlMaskVirtual0x32a56c0"/>
 14755         <position name="pLPMT_NNVT_MCPPMT0x329e4b0_pos" unit="mm" x="316.0782" y="-882.0791" z="19365"/>
 14756         <rotation name="pLPMT_NNVT_MCPPMT0x329e4b0_rot" unit="deg" x="180" y="0" z="180"/>
 14757       </physvol>
102774       <physvol copynumber="17610" name="pLPMT_Hamamatsu_R128600x3a2a9d0">  Simple fix : separate NNVT/Hamamatsu offsets 
102775         <volumeref ref="HamamatsuR12860lMaskVirtual0x3290b70"/>
102776         <position name="pLPMT_Hamamatsu_R128600x3a2a9d0_pos" unit="mm" x="-468.5" y="811.4658" z="-19365"/>
102777       </physvol>
102778       <physvol copynumber="17611" name="pLPMT_NNVT_MCPPMT0x3a2aa80">
102779         <volumeref ref="NNVTMCPPMTlMaskVirtual0x32a56c0"/>
102780         <position name="pLPMT_NNVT_MCPPMT0x3a2aa80_pos" unit="mm" x="-926.5345" y="139.6526" z="-19365"/>
102781       </physvol>

102782       <physvol copynumber="300000" name="PMT_3inch_log_phys0x3a307c0">
102783         <volumeref ref="PMT_3inch_log0x3a2e430"/>
102784         <position name="PMT_3inch_log_phys0x3a307c0_pos" unit="mm" x="-3734.24733197939" y="1835.06563378074" z="18932.1786116475"/>
102785         <rotation name="PMT_3inch_log_phys0x3a307c0_rot" unit="deg" x="-174.463707019134" y="-11.1072339347848" z="-26.7089289722046"/>
102786       </physvol>
102787       <physvol copynumber="300001" name="PMT_3inch_log_phys0x3a30890">
102788         <volumeref ref="PMT_3inch_log0x3a2e430"/>
102789         <position name="PMT_3inch_log_phys0x3a30890_pos" unit="mm" x="-3470.82479867782" y="1705.61581563788" z="18994.3071973244"/>
102790         <rotation name="PMT_3inch_log_phys0x3a30890_rot" unit="deg" x="-174.86882108069" y="-10.3147892102431" z="-26.6336342093065"/>
102791       </physvol>

PMTParamSvc relies on large std::map<int,int> m_pmt_categories

unsigned long long pmtdat
64-bit bitfield can encode all PMTParamSvc info in CUDA friendly way
label sensitive volumes with pmtdat
bit twiddling then provides all the info, no big PMTParam buffers needed
43 class PMTParamSvc: public SvcBase {      //  Encode all this into : 64 bit pmtdat (unsigned long long)
44 public:                                  //
45   PMTParamSvc(const std::string& n)      //
..                                          //  16 bits : which PMT identifier (enough for 1..65,536)
53   bool isCD(int pmtid);                  //  16 bits : 8 bits of type flags + 8 bits spare
54   bool isWP(int pmtid);                  //  32 bits : float qe
56   bool is20inch(int pmtid);              //  ----------------------------------------------
57   bool is3inch(int pmtid);               //  64 bits : unsigned long long pmtdat
59   bool isNNVT(int pmtid);                //
60   bool isHamamatsu(int pmtid);           //
61   bool isHighQE(int pmtid);              //   Adhere to hex 4bit boundaries:
62   bool isHZC(int pmtid);                 //
                                            //                      Which PMT 16 bits
63   float getPMTQE(int pmtid);             //                       |
..                                          //   0x|....|....|..00|....|
74   std::map<int, int> m_pmt_categories;   //                 |
75   std::map<int, float> m_pmt_qe;         //       float qe  type flags
76 };                                       //
                                            //   unsigned long long pmtdata[npmt] ;