Run models in separate processes to solve multiple issues #106

@HDembinski

Description

One of the major annoyances when working with impy interactively is that one cannot create several instances of most models without spawning separate processes. My plan for impy is to make it part of the API that a model always runs in a separate process. This solves a lot of issues.

With this design, creating a model spawns a new process, but the user never notices; it is an implementation detail. Finally, a model behaves like a normal Python class:

m1 = EposLHC()  # starts process 1 which initializes an independent instance of EposLHC
m2 = EposLHC()  # starts process 2 which initializes another independent instance
for event in m1(kin1, 10):
   ...
for event in m2(kin2, 10):
   ...

MCRun.__call__ communicates with the spawned process and runs the event generation there.
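
A rough sketch of how this delegation could look with multiprocessing is shown below; the worker loop, the `ProcessProxy` class, and the message format are illustrative assumptions rather than the actual MCRun implementation.

```python
import multiprocessing as mp

_DONE = "done"  # end-of-batch marker sent by the worker


def _worker(conn, make_generator):
    # Hypothetical worker loop: the underlying generator is created once
    # in the child process and then serves generation requests from the pipe.
    generator = make_generator()
    while True:
        request = conn.recv()
        if request is None:  # shutdown signal
            break
        kin, n_events = request
        for event in generator(kin, n_events):
            conn.send(event)
        conn.send(_DONE)


class ProcessProxy:
    """Illustrative stand-in for MCRun: forwards calls to a child process."""

    def __init__(self, make_generator):
        self._conn, child_conn = mp.Pipe()
        self._proc = mp.Process(target=_worker, args=(child_conn, make_generator))
        self._proc.start()

    def __call__(self, kin, n_events):
        # Forward the request and yield events as they arrive from the child.
        self._conn.send((kin, n_events))
        while True:
            event = self._conn.recv()
            if event == _DONE:
                break
            yield event

    def close(self):
        self._conn.send(None)
        self._proc.join()
```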

A caveat of this approach is the overhead of inter-process communication, which may slow impy down, and we want to avoid that (speed is something we care about). I expect that basic inter-process communication (sending input data to the other process and waiting for its output) has a negligible impact for most models, but this needs to be measured. For SIBYLL, which is incredibly fast, it may be noticeable.
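
One way to check this assumption is a simple round-trip benchmark over a pipe; the payload below is only a placeholder for a small event record, and the resulting time per round trip can be compared against the per-event generation time of a fast model like SIBYLL.

```python
import multiprocessing as mp
import time


def _echo(conn):
    # Child process: echo back every message until told to stop.
    while True:
        obj = conn.recv()
        if obj is None:
            break
        conn.send(obj)


if __name__ == "__main__":
    parent, child = mp.Pipe()
    proc = mp.Process(target=_echo, args=(child,))
    proc.start()

    payload = list(range(1000))  # placeholder for a small event record
    n = 10_000
    t0 = time.perf_counter()
    for _ in range(n):
        parent.send(payload)
        parent.recv()
    dt = time.perf_counter() - t0
    print(f"round trip: {dt / n * 1e6:.1f} microseconds per event")

    parent.send(None)
    proc.join()
```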

The main overhead, pickling and unpickling the event object, can be avoided by using shared memory. Shared memory allows two processes to read and write the same section of memory, so that data can be transferred without copying or serialization.
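
A minimal sketch of the idea using the standard library's multiprocessing.shared_memory (Python 3.8+); the particle cap and the four-momentum layout are illustrative assumptions, and the "attach" step is shown in the same process for brevity.

```python
import numpy as np
from multiprocessing import shared_memory

# Allocate one block large enough for the four-momenta of the largest event
# (the 10000-particle cap is an illustrative assumption, not impy's value).
MAX_PARTICLES = 10_000
NBYTES = 4 * MAX_PARTICLES * np.dtype(np.float64).itemsize
shm = shared_memory.SharedMemory(create=True, size=NBYTES)
p_main = np.ndarray((4, MAX_PARTICLES), dtype=np.float64, buffer=shm.buf)

# A second process would attach to the same block by name; shown here
# in-process for brevity.
shm_other = shared_memory.SharedMemory(name=shm.name)
p_worker = np.ndarray((4, MAX_PARTICLES), dtype=np.float64, buffer=shm_other.buf)
p_worker[:, 0] = [0.1, -0.2, 0.05, 1.0]  # the generator fills this per event

print(p_main[:, 0])  # the main process sees the data without pickling or copying

shm_other.close()
shm.close()
shm.unlink()
```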

Implementing this requires some changes to EventData, which will be allocated in shared memory, and to the models, because they need to write into the EventData allocated in Python instead of their own Fortran buffers wrapped by MCEvent. Most models already call a Fortran routine to copy data from their internal buffers into a HEPEVT-like buffer. This should be replaced by new routines that copy the data into the EventData object instead. If the original buffers are readable from Python, this can probably be implemented with Numba, requiring no new Fortran code.
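
Such a copy routine could look roughly like the sketch below, assuming the generator's internal buffers are exposed to Python as NumPy views (e.g. via f2py); the argument names and the array layout are hypothetical.

```python
import numba as nb
import numpy as np


@nb.njit
def copy_event(n, src_px, src_py, src_pz, src_en, dst):
    # Copy the first n particles from the generator's internal buffers
    # (hypothetically exposed as NumPy views by f2py) into the flat
    # array that backs EventData in shared memory.
    for i in range(n):
        dst[0, i] = src_px[i]
        dst[1, i] = src_py[i]
        dst[2, i] = src_pz[i]
        dst[3, i] = src_en[i]


# Tiny usage example with stand-in buffers.
px, py, pz, en = (np.random.rand(50) for _ in range(4))
event_data = np.empty((4, 100))
copy_event(50, px, py, pz, en, event_data)
```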

Since the models run in another process, we may even be able to solve the issue that most models produce far too much output: we should be able to redirect the stdout of the other process into buffers that we control from the main process.
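
A minimal sketch of the redirection, assuming a Unix fork start method: the child redirects file descriptor 1 into a pipe before the model would be initialized, and the parent reads the captured text from the other end.

```python
import multiprocessing as mp
import os


def _worker(write_fd):
    # Redirect the process-wide stdout (file descriptor 1) into the pipe
    # before initializing the model, so banner and per-event output from
    # the Fortran code never reaches the terminal.
    os.dup2(write_fd, 1)
    os.close(write_fd)
    os.write(1, b"chatty generator banner\n")  # stands in for model output


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    ctx = mp.get_context("fork")  # fork keeps the fd valid in the child (Unix)
    proc = ctx.Process(target=_worker, args=(write_fd,))
    proc.start()
    os.close(write_fd)  # the parent keeps only the read end
    with os.fdopen(read_fd) as captured:
        text = captured.read()  # buffered here; show or discard as needed
    proc.join()
    print("captured:", repr(text))
```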
