Optical Photon Marshalling ============================ Overview/Issues ------------------ Simplify OP lifecycle ? * maybe avoid intermediary `std::vector` using NPY format * consider memory impact of approach, do not want to require twice (or more) the memory of the raw photon data to be used * may necessitate chunking and sending chunks separately, dummyfication of killed OP objects * chunking can allow to use fixed chunk sizes, with special case-ing the last one * chunking may impact transport decision, eg SBE with fixed group size would provide a way to skip the intermediate vectors as can do the ordered write directly into the binary encoding * implications of blocking `ClassifyNewTrack` vs `NewStage` * need architecture for * easy swapping different compression/transport techniques fine grained timing the steps * speed of the python implementation less important, push on the cpp route Numpy Serialize into buffer ? ----------------------------------- * http://stackoverflow.com/questions/25116/binary-buffer-in-python * env/numpy/numpy_bytesio_load_save.py Efficient Selection of Numpy array, eg just the hits with PMTID > 0 ------------------------------------------------------------------------ * http://ipython-books.github.io/featured-01/ * np.take np.where ? Current lifecycle of optical photon data ------------------------------------------ #. `G4ThreeVector` for position, momentum, polarization + wavelength, time etc.. generated by Geant4 processes `G4Cerenkov` `G4Scintillation` and set as a member of a `G4Track` #. `G4Track*` passed to:: `G4ClassificationOfNewTrack DsChromaStackAction::ClassifyNewTrack (const G4Track* aTrack)` #. collected into `ChromaPhotonList` `std::vector` members `env/nuwa/detsim/src/DsChromaStackAction.cc`:: 083 void DsChromaStackAction::CollectPhoton(const G4Track* aPhoton ) 84 { 85 #ifdef WITH_CHROMA_ZMQ 86 G4ParticleDefinition* pd = aPhoton->GetDefinition(); 87 assert( pd->GetParticleName() == "opticalphoton" ); 88 89 G4String pname="-"; 90 const G4VProcess* process = aPhoton->GetCreatorProcess(); 91 if(process) pname = process->GetProcessName(); ... 101 102 assert( pname == "Cerenkov" || pname == "Scintillation" ); 103 104 G4ThreeVector pos = aPhoton->GetPosition()/mm ; 105 G4ThreeVector dir = aPhoton->GetMomentumDirection() ; 106 G4ThreeVector pol = aPhoton->GetPolarization() ; 107 float time = aPhoton->GetGlobalTime()/ns ; 108 float wavelength = (h_Planck * c_light / aPhoton->GetKineticEnergy()) / nanometer ; 109 110 fPhotonList->AddPhoton( 111 pos.x(), pos.y(), pos.z(), 112 dir.x(), dir.y(), dir.z(), 113 pol.x(), pol.y(), pol.z(), 114 time, 115 wavelength ); 116 117 #endif 118 } #. Serialized from `ChromaPhotonList` with ROOT `TObject` into a byte string #. Copied (still?) into ZMQRoot buffer and sent across the wire from ZMQ Client #. relayed at ZMQ broker on to ZMQ worker #. de-serialized by ZMQRoot back into a `ChromaPhotonList` object within CPLResponder #. data read from `ChromaPhotonList` into `chroma.event.Photon` numpy arrays:: 11 class DAEEventBase(object): .. 33 def setup_cpl(self, cpl): 34 """ 35 :param cpl: ChromaPhotonList instance 36 37 Convert serialization level ChromaPhotonList into operation level Photons 38 """ 39 photons = Photons.from_cpl(cpl, extend=True) #. `chroma.event.Photons` copies from `std::vector` into numpy arrays `chroma/chroma/event.py`:: 092 class Photons(object): 93 """ 94 A few enhancements: 95 96 #. from_cpl classmethod 97 #. timesorting 98 #. dump 99 """ 100 @classmethod 101 def from_cpl(cls, cpl, extend=True ): 102 """ 103 :param cpl: ChromaPhotonList instance, as obtained from file or MQ 104 :param extend: when True add pmtid attribute and sort by time 105 """ 106 xyz_ = lambda x,y,z,dtype:np.column_stack((np.array(x,dtype=dtype),np.array(y,dtype=dtype),np.array(z,dtype=dtype))) 107 108 pos = xyz_(cpl.x,cpl.y,cpl.z,np.float32) 109 dir = xyz_(cpl.px,cpl.py,cpl.pz,np.float32) 110 pol = xyz_(cpl.polx,cpl.poly,cpl.polz,np.float32) 111 wavelengths = np.array(cpl.wavelength, dtype=np.float32) 112 t = np.array(cpl.t, dtype=np.float32) 113 pass 114 obj = cls(pos,dir,pol,wavelengths,t) 115 116 if extend: 117 order = np.argsort(obj.t) 118 pmtid = np.array(cpl.pmtid, dtype=np.int32) 119 obj.sort(order) 120 obj.pmtid = pmtid[order] 121 pass 122 return obj #. `chroma.event.Photons` copied to GPU in `GPUPhotons` `chroma/chroma/gpu/photon.py`:: 13 class GPUPhotons(object): 14 def __init__(self, photons, ncopies=1): 15 """Load ``photons`` onto the GPU, replicating as requested. 16 17 Args: 18 - photons: chroma.Event.Photons 19 Photon state information to load onto GPU 20 - ncopies: int, *optional* 21 Number of times to replicate the photons 22 on the GPU. This is used if you want 23 to propagate the same event many times, 24 for example in a likelihood calculation. 25 26 The amount of GPU storage will be proportionally 27 larger if ncopies > 1, so be careful. 28 """ 29 nphotons = len(photons) 30 self.pos = ga.empty(shape=nphotons*ncopies, dtype=ga.vec.float3) 31 self.dir = ga.empty(shape=nphotons*ncopies, dtype=ga.vec.float3) 32 self.pol = ga.empty(shape=nphotons*ncopies, dtype=ga.vec.float3) 33 self.wavelengths = ga.empty(shape=nphotons*ncopies, dtype=np.float32) 34 self.t = ga.empty(shape=nphotons*ncopies, dtype=np.float32) 35 self.last_hit_triangles = ga.empty(shape=nphotons*ncopies, dtype=np.int32) 36 self.flags = ga.empty(shape=nphotons*ncopies, dtype=np.uint32) 37 self.weights = ga.empty(shape=nphotons*ncopies, dtype=np.float32) 38 39 # Assign the provided photons to the beginning (possibly 40 # the entire array if ncopies is 1 41 self.pos[:nphotons].set(to_float3(photons.pos)) 42 self.dir[:nphotons].set(to_float3(photons.dir)) 43 self.pol[:nphotons].set(to_float3(photons.pol)) 44 self.wavelengths[:nphotons].set(photons.wavelengths.astype(np.float32)) 45 self.t[:nphotons].set(photons.t.astype(np.float32)) 46 self.last_hit_triangles[:nphotons].set(photons.last_hit_triangles.astype(np.int32)) 47 self.flags[:nphotons].set(photons.flags.astype(np.uint32)) #. **all** photons read back from GPU after propagation to termination or max_steps #. back into `event.Photons` #. thence into `ChromaPhotonList` and serialized #. into byte string and replied over wire to waiting Geant4 process #. photons reaching PMT cathode converted into hits and collected into hit collections Photon Data ------------- #. per-photon data is ~12 float32 and ~2 int32 Using C++ placement new for dummy OP ? ------------------------------------------- * http://stackoverflow.com/questions/222557/what-uses-are-there-for-placement-new Position an object at a known address. * Could possibly replacing millions of Optical Photon `G4Track` with a single dummy placeholder object to avoid memory expense. Alternate Serialization/Compression to ROOT TObject ----------------------------------------------------- #. capnproto, pycapnproto * faster version of google protocol buffers, need to compile a schema spec to create stub objects * more flexibility needed, maybe with performance cost * https://groups.google.com/forum/m/#!topic/capnproto/BDpV6WEG5Bw performance comparisons show pycapnproto very slow #. cnpy : small C project * see :doc:`/numpy/numpy_persistency` * https://github.com/rogersce/cnpy #. SBE : Simple Binary Encoding * https://github.com/real-logic/simple-binary-encoding/wiki * compile XML schema into C++ stub classes, using a java tool * load data into those and serialize * has a repeating group construct (maybe photon) but seems to need to know how many entries * http://mechanical-sympathy.blogspot.tw/2014/05/simple-binary-encoding.html #. blosc, bloscpack, bcolz, carray : interesting blocked/shuffled compression algo * C api may be restricted to low level blosc, high level packing only exposed via Python interface implented in Cython making use from C/C++ problematic ? * http://www.blosc.org/blosc-in-depth.html * https://github.com/Blosc/bloscpack * http://bcolz.blosc.org/ * :google:`carray ctable numpy` compressible/exandable/chunked numpy ndarray #. npy and cnpy, uncompressed numpy serialization format * appears to allow serialized numpy arrays to be created in C, this would shortcut many of the lifecycle steps as could go straight from `std::vector` into npy serialization and send that over the wire * BUT, uncompressed nature may be problematic #. msgpack : very wide support * http://msgpack.org