Opticks : GPU Optical Photon Simulation for Particle Physics with NVIDIA OptiX

Opticks : GPU Optical Photon Simulation for Particle Physics with NVIDIA® OptiX™

Simon C Blyth, National Taiwan University — https://bitbucket.org/simoncblyth/opticks — July 2017 IHEP

Opticks Benefits

Opticks Progress : Moving to Fully Analytic GPU Geometry

Geometry Status (Nov 2016)

Geometry Status (July 2017)

[1] LLR Workshop, https://simoncblyth.bitbucket.io/env/presentation/opticks_gpu_optical_photon_simulation_nov2016_llr.html

Constructive Solid Geometry (CSG)

Simple by construction definition, implicit geometry.

CSG expressions

3D Parametric Ray : ray(t) = r0 + t rDir

Ray Geometry Intersection

How to pick exactly ?

CSG : Which primitive intersect to pick ?

Classical Roth diagram approach

Computational requirements:

BUT : High performance on GPU requires:

Classical approach not appropriate on GPU

Ray Tracing CSG Objects Using Single Hit Intersections (A. Kensler) [*]

Union, tA < tB Enter B Exit B Miss B
Enter A ReturnA LoopA ReturnA
Exit A ReturnA ReturnB ReturnA
Miss A ReturnB ReturnB ReturnMiss
Union, tB < tA Enter B Exit B Miss B
Enter A ReturnB ReturnB ReturnA
Exit A LoopB ReturnA ReturnA
Miss A ReturnB ReturnB ReturnMiss

Recursive CSG tree python prototype of Kensler pseudocode worked after state table corrections/extensions

[*] Ray Tracing CSG Objects Using Single Hit Intersections, Andrew Kensler (2006)
with corrections by author of XRT Raytracer http://xrt.wikidot.com/doc:csg

CSG Complete Binary Tree Serialization -> simplifies GPU side

CSG Tree, leaf node primitives, internal node operators, 4x4 transforms on any node, serialized as complete binary tree:

Height 3 complete binary tree with level order indices:

                                                   depth     elevation

                     1                               0           3

          10                   11                    1           2

     100       101        110        111             2           1

 1000 1001  1010 1011  1100 1101  1110  1111         3           0

postorder_next(i,elevation) = i & 1 ? i >> 1 : (i << elevation) + (1 << elevation) ; // from pattern of bits

Postorder tree traverse visits all nodes, starting from leftmost, such that children are visited prior to their parents.

Evaluative CSG intersection Pseudocode : recursion emulated

fullTree = PACK( 1 << height, 1 >> 1 )  // leftmost, parent_of_root(=0)
tranche.push(fullTree, ray.tmin)

while (!tranche.empty)         // stack of begin/end indices 
{
    begin, end, tmin <- tranche.pop  ; node <- begin ;
    while( node != end )                   // over tranche of postorder traversal 
    {
        elevation = height - TREE_DEPTH(node) ;
        if(is_primitive(node)){ isect <- intersect_primitive(node, tmin) ;  csg.push(isect) }
        else{
            i_left, i_right = csg.pop, csg.pop            // csg stack of intersect normals, t 
            l_state = CLASSIFY(i_left, ray.direction, tmin)
            r_state = CLASSIFY(i_right, ray.direction, tmin)
            action = LUT(operator(node), leftIsCloser)(l_state, r_state)

            if(      action is ReturnLeft/Right)     csg.push(i_left or i_right)
            else if( action is LoopLeft/Right)
            {
                left = 2*node ; right = 2*node + 1 ;
                endTranche = PACK( node,  end );
                leftTranche = PACK(  left << (elevation-1), right << (elevation-1) )
                rightTranche = PACK(  right << (elevation-1),  node  )
                loopTranche = action ? leftTranche : rightTranche

                tranche.push(endTranche, tmin)
                tranche.push(loopTranche, tminAdvanced )  // subtree re-traversal with changed tmin 
                break ; // to next tranche
            }
        }
        node <- postorder_next(node, elevation)         // bit twiddling postorder 
    }
}
isect = csg.pop();         // winning intersect  

https://bitbucket.org/simoncblyth/opticks/src/tip/optixrap/cu/csg_intersect_boolean.h

Opticks CSG Primitives : Closed Solids, Consistent Normals

Closed Solid as: implementation requires otherside intersect, Rigidly attached normals

Type code Python name C++ nnode sub-struct
CSG_BOX3,CSG_BOX box3,box nbox
CSG_SPHERE,CSG_ZSPHERE sphere,zsphere nsphere,nzsphere
CSG_CYLINDER,CSG_DISC cylinder,disc ncylinder,ndisc
CSG_CONE cone ncone
CSG_CONVEXPOLYHEDRON convexpolyhedron nconvexpolyhedron
CSG_TRAPEZOID,CSG_SEGMENT trapezoid,segment nconvexpolyhedron

Non-primitives, high level CSG definition avoids loadsa code

Opticks CSG Primitives On Parade (GPU raytrace)

/env/presentation/tboolean_parade.png

Three on right (Trapezoid, Segment, Icosahedron) all represented by planes with nconvexpolyhedron

Opticks CSG Serialized into OpticksCSG format (numpy buffers, json)


// tboolean-parade

from opticks.ana.base import opticks_main
from opticks.analytic.csg import CSG

args = opticks_main(csgpath="$TMP/$FUNCNAME")

container = CSG("box", param=[0,0,0,1200], boundary=args.container, poly="MC", nx="20" )

a = CSG("sphere", param=[0,0,0,100])
b = CSG("zsphere", param=[0,0,0,100], param1=[-50,60,0,0])
c = CSG("box3",param=[100,50,70,0])
d = CSG("box",param=[0,0,10,50])
e = CSG("cylinder",param=[0,0,0,100], param1=[-100,100,0,0])
f = CSG("disc",param=[0,0,0,100], param1=[-1,1,0,0])
g = CSG("cone", param=[100,-100,50,100])
h = CSG.MakeTrapezoid(z=100, x1=80, y1=100, x2=100, y2=80)
i = CSG.MakeSegment(phi0=0,phi1=45,sz=100,sr=100)
j = CSG.MakeIcosahedron(scale=100.)

prims = [a,b,c,d,e,f,g,h,i,j]

...  // setting translations

CSG.Serialize([container] + prims, args.csgpath )     <-- write trees to file 

Opticks CSG Primitives : What is included

OptiX/CUDA functions providing:

C++/nnode sub-struct methods

4x4 Transforms on any node (translation/rotation/scaling)

Intersect inverse-transformed ray with un-transformed primitive

Supporting non-uniform scaling requires rayDir not be be normalized (or assumed to be normalized) by primitives.

Opticks CSG : Balancing Deep Trees Drastically Improves Performance

Intended for solids, not scenes (tree height <8, <256 nodes[*])

Dayabay TopESRCutHols lvidx:57  (height:9 totnodes:1023)
di(di(di(di(di(di(di(di(di(cy,cy),cy),cy),cy),cy),cy),cy),cy),cy)

                                                                    di
                                                             di          cy
                                                     di          cy
                                             di          cy
                                     di          cy
                             di          cy
                     di          cy
             di          cy
     di          cy
 cy      cy


Balanced Tree, height:4 totnodes:31
in(in(in(in(cy,!cy),in(!cy,!cy)),in(in(!cy,!cy),in(!cy,!cy))),!cy)

                                                             in
                             in                                 !cy
             in                              in
     in              in              in              in
 cy     !cy     !cy     !cy     !cy     !cy     !cy     !cy

[*] Algorithm has no inherent height limit, but use of complete binary tree imposes practical performance limitation

Dayabay ESR reflector : Deep CSG tree : disc with 9 holes

/env/presentation/lvidx_57_esr_deep_tree_disc_with_holes.png

Subtraction of thin CSG_CYLINDER -> speckle in the hole

CSG_DISC implemented to handle disc like cylinders : intersects at middle (z1+z2)/2 and offsets, avoids issue

/env/opticks_refs/speckle_lvTopESR.png /env/opticks_refs/tboolean_esr_speckle_centered_on_pole_delta_10-3.png

Coincident Faces are Primary Cause of Issues : Fake Intersects

Coincidences common (alignment too tempting?). To fix:

/env/opticks_refs/opticks_tscan_29_nzero_5_OcrGdsPrt.png

Signed Distance Functions (SDF) for Primitives and Composites

SDFs are composable using R-functions, eg min and max

Recursively applicable to CSG trees

-> analytic representation of arbitrary solid

SDFs enable geometry querying/debugging:

Rvachev-functions (R-functions) : sign uniquely determined by signs of arguments -> parallel CSG operations

SDF Isosurface Extracted Polgonization Mesh with offset GPU raytrace

/env/presentation/tboolean_interlocked_gpuoffset.png

Translated Solid Debugging : Using py/bash/c++ Code Generation

NScanTest : Find Problem Solids by SDF Scanning

Count SDF zero crossings along scanlines (expect even)

~5/249 odd solids : not yet handled by auto-uncoincidence

GScene::compareMeshes : Bounding Box comparisons

G4Poly(via G4DAE) vs Opticks parsurf (parametric surface)

NEXT STEP: direct intersect comparison between Opticks/G4 CSG implementations for definitive answers

(*) could also be G4DAE export of G4Poly

Full Analytic GDML Geometry Translation Workflow

General CSG ray tracing on GPU allows operation from GDML files

gdml2gltf.py (opticks.analytic.sc)

NScene

NCSG

GScene/OScene

glTF : (GL Transmission Format) "JPEG of 3D" : efficient transmission and loading of 3D scenes and models, https://www.khronos.org/gltf

Opticks Analytic Daya Bay Near Site, GPU Raytrace Screenshots

GPU Raytrace on following pages are for Daya Near Site Geometry, auto translated from GDML

General Translation Machinery

Scene Debugging : NSceneLoadTest

Opticks Analytic Daya Bay Near Site, GPU Raytrace (3)

Opticks Analytic Daya Bay Near Site, GPU Raytrace (1)

Opticks Analytic Daya Bay Near Site, GPU Raytrace (0)

Opticks Analytic Daya Bay Near Site, GPU Raytrace (2)

July 2017 Summary

/env/presentation/1px.png

Opticks enables Geant4 based simulations to benefit from optical photon simulation taking effectively zero time and zero CPU memory, due to massive parallelism made accessible by NVIDIA OptiX. GDML detector geometry is auto translated into a GPU optimized analytic form, fully equivalent to the source geometry.

/env/presentation/1px.png
  • The more photons the bigger the overall speedup (99% -> 100x)
  • Drastic speedup -> better detector understanding -> greater precision
  • Large PMT based neutrino experiments, such as JUNO, can benefit the most
/env/presentation/1px.png

https://bitbucket.org/simoncblyth/opticks

Context Pages

  • Optical Photon Simulation Problem
  • Ray Traced Image Synthesis ≈ Optical Photon Simulation
  • Open Source Opticks
  • YouTube Video

For more on Opticks see

https://bitbucket.org/simoncblyth/opticks

[1] LLR Workshop, https://simoncblyth.bitbucket.io/env/presentation/opticks_gpu_optical_photon_simulation_nov2016_llr.html

Summary from LLR Workshop (Nov 2016)

/env/presentation/1px.png

Opticks enables Geant4 based simulations to benefit from optical photon simulation taking effectively zero time and zero CPU memory, thanks to massive parallelism made accessible by NVIDIA OptiX.

/env/presentation/1px.png
  • The more photons the bigger the overall speedup (99% -> 100x)
  • Drastic speedup -> better detector understanding -> greater precision
  • Large PMT based neutrino experiments, such as JUNO, can benefit the most
/env/presentation/1px.png

https://bitbucket.org/simoncblyth/opticks

JPMT Before Contact 2

Ray Traced Image Synthesis ≈ Optical Photon Simulation

Geometry, light sources, optical physics ->
  • pixel values at image plane
  • photon parameters at detectors (eg PMTs)
Ray tracing has many applications :
  • advertising, design, entertainment, games,...
  • BUT : most ray tracers just render images
NVIDIA OptiX had foresight to be less specific:
  • general geometry intersection API
  • OptiX is to ray tracing what OpenGL is to rasterization


rasterization
project 3D primitives onto 2D image plane, combine fragments into pixel values
ray tracing
cast rays thru image pixels into scene, recursively reflect/refract with geometry intersected, combine returns into pixel values

OpticksDocs

Debugging Coincident Subtractions

Switching subtraction into union with complemented -> can see whats subtracted.

/env/opticks_refs/lvidx_69_ring_box_cuts_artifact.png

YouTube Video

https://www.youtube.com/watch?v=CBpOha4RzIs

Slides Booted to Backup

Various Polygonization Approaches

Several Integrations of Open Source Polygonizations

Marching Cubes
fill bbox with cubes, generate triangles for cubes with sign changes, fixed resolution (miss features or too many tris)
Dual Contouring
multiresolution using octree (slow, finnicky)
Implicit Mesher
finds and follows surface, fast, fixed resolution, (needs help in finding)

My own attempt (WIP)

Hybrid Mesher
combining SDF classification with parametric generation, (complicated mesh joining)