csg_intersect_boolean.h : General CSG intersection using evaluative_csg approach

Intersecting rays with general CSG shapes requires the appropriate primitive intersect to be selected depending on the origin and direction of the ray and the current t_min.

Traditional implementations of CSG intersection first calculate ray intervals with each primitive and then combine these intervals using the boolean operators to determine intersects. Efficient use of GPUs requires many thousands of simultaneously operational threads which disfavors the traditional approach due to the requirement to store intervals for all constituent primitives. A quite different approach described by Andrew Kensler avoids interval storage by instead selecting between the two candidate intersects at each level of the binary tree, which allows a recursive algorithm to be developed. The two candidate intersects at each level are classified as “Enter”, “Exit” or “Miss” using the angle between the ray direction and surface normal. Six decision tables corresponding to which side is closer and to the three boolean operators are used to determine an action from the classifications such as returning an intersect or advancing t_min and intersecting again.

Recursive function calls are a natural way to process self similar structures such as CSG trees, however recursion is a memory expensive technique which makes it inappropriate for GPU usage. Although NVIDIA OptiX supports recursive ray tracing in does not support recursion within intersect programs. The Opticks “evaluative” CSG implementation was inspired by the realization that CSG node tree intersection directly parallels binary expression tree evaluation and that techniques to simplify expression tree evaluation such as using postorder traversals could be applied. Binary expression trees are used to represent and evaluate mathematical expressions. Factoring out the postorder sequence allowed an iterative solution to be developed for a recursive problem.

The CSG implementation relies on selection of the closer of two intersects at each level of the node tree. When the faces of constituent shapes coincide the ambiguity regarding which is closer can cause spurious intersects. Modifying some constituents to prevent coincident faces avoids the issue without changing the intended geometry. As such coincidences are rather common Opticks includes detection and automated fixing for some common situations.

  • postorder traversal means that have always visited left and right subtrees before visiting a node
  • slavish python translations of the csg intersect implementationa are in dev/csg/slavish.py
  • postorder_sequence for four tree heights were prepared by opticks/dev/csg/node.py:Node.postOrderSequence
  • the sequence contains 1-based levelorder indices(nodeIdx) in right to left postorder
  • left,right child of nodeIdx are at (nodeIdx*2, nodeIdx*2+1)
  • for non-perfect trees, the height means the maximum

Below indices are postorder slice flavor, not levelorder

 Height 3 tree, numNodes = 15, halfNodes = 7, root at i = 14

    upTree       i      : numNodes    14        : 15      =  14:15
    leftTree     i - 2h : i - h       14 - 2*7  : 14 - 7  =   0:7
    rightTree    i -  h : i           14 -  7   : 14      =   7:14

NB all nodes including root needs an upTree tranche to evaluate left and right

CSG shape complete binary tree serialization

A complete binary tree serialization with array indices matching level order tree indices and zeros at missing nodes is used for the serialization of the CSG trees. This simple serialization allows tree navigation directly from bitwise manipulations of the serialized array index.


  • simple, with no tree overheads for child/parent etc..
  • no need to deserialize as can postorder tree traverse direct from the serialized buffer just by bit twiddling, as shown below


  • very inefficient for complex CSG shapes with many constituent nodes
  • must implement tree balancing to handle CSG shapes of “medium” complexity, this is done in geometry preprocessing
                                                  depth     elevation

                     1                               0           3

          10                   11                    1           2

     100       101        110        111             2           1

 1000 1001  1010 1011  1100 1101  1110  1111         3           0

postorder_next(i,elevation) = i & 1 ? i >> 1 : (i << elevation) + (1 << elevation) ;   // from pattern of bits

A postorder tree traverse visits all nodes, starting from leftmost, such that children are visited prior to their parents.

CSG Development code including python implementations of the CUDA

CSG Algorithm Prototyping

1-based binary tree indices in binary

Repeat above enumeration in nodeIdx lingo


          height 3 tree

                               1{1}                                                 elev:3    depth:0

            10{2}                                  11{3}                            elev:2    depth:1
             [6]                                   [13]

      100{4}           101{5}                110{6}              111{7}             elev:1    depth:2
       [2]              [5]                  [9]                 [12]

  1000{8} 1001{9}   1010{a}  1011{b}      1100{c} 1101{d}      1110{e}  1111{f}     elev:0    depth:3
  [0]     [1]       [3]      [4]          [7]     [8]          [10]     [11]

  upTree         1    :    0
                root     1 >> 1 ("parent" of root)

                 10{2} << 2      11{3} << 2   (leftmost of the rhs is one-past the leftTree)
               1000{8}         1100{c}

                 11{3} << 2        1      <-- one-past the end of the rightTree is root

  Algorithm to find leftTree of nodeIdx, is find leftIdx

      leftIdx  = (nodeIdx << 1)
      rightIdx = (nodeIdx << 1) + 1

leftTree      (leftIdx << (elevation-1))    :  (rightIdx << (elevation-1))
rightTree      (rightIdx << (elevation-1))  :   nodeIdx


  • parallelR variables
  • use quad-decker begin,end,beginR,endR inside the tranche slice so can see exactly where things diverge
  • maybe running into elev 0 and doing << -1

Recursive within intersect, nope

Hmm seems cannot do recursive within an intersect program anyhow ??

libc++abi.dylib: terminating with uncaught exception of type optix::Exception:
Parse error (Details: Function "RTresult _rtProgramCreateFromPTXFile(RTcontext, const char *, const char *, RTprogram *)"
caught exception: /usr/local/opticks/installcache/PTX/OptiXRap_generated_hemi-pmt.cu.ptx:
error: Recursion detected in function
_Z15recursive_csg_rjjjf( file /usr/local/opticks/installcache/PTX/OptiXRap_generated_hemi-pmt.cu.ptx ),
cannot inline. [5308892], [5308892])


Stack curr:

  • -1 : empty
  • 0 : one item
  • 1 : two items
  • SIZE - 1 : SIZE items, full stack

csg_intersect_boolean.h : struct Tranche

Postorder Tranch storing a stack of slices into the postorder sequence.

32 bit unsigned holding a pair of begin and end indices specifying a range over the postorder traversal sequence
push (slice, tmin) onto the small stack
pops off (slice, tmin)
representation of the stack of slices packed into a 64 bit unsigned long long

csg_intersect_boolean.h : struct CSG

Small stack of float4 isect (surface normals and t:distance_to_intersect).

push float4 isect and nodeIdx into stack
pop float4 isect and nodeIdx off the stack

csg_intersect_boolean.h : evaluative_csg

Recursive CSG intersection algorithm implemented iteratively using a stack of slices into the postorder traversal sequence of the complete binary tree, effectively emulating recursion.

The whole algoritm depends on the postorder traversal feature that the left and right children of a node are always visited before the node itself.


macro that is now standardly defined

bit-twiddle postorder is limited to trees of height 7, ie a maximum of 0xff (255) nodes (using 2-bytes with PACK2 would bump that to 0xffff (65535) nodes) In any case 0xff nodes are far more than this is expected to be used with