OpenMP state #5
For my understanding:
I've implemented a simple algebra in 1725e45 that just uses a random access container like std::vector or an array from multiple threads, but it doesn't support a dispatcher. The system function is called once from the main thread, and the user has to take care of multi-threading there. To get to a common parallel interface, I think the system function should be called transparently from multiple threads by odeint, with partial views of the state; then the user wouldn't have to worry about parallelization, and MPI would later look the same.
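For illustration, a minimal sketch of such a parallel range algebra, assuming an odeint-style for_each3 signature; the struct name and details are hypothetical, not the actual code from 1725e45:

```cpp
#include <cstddef>

// Hypothetical sketch: a range algebra whose for_each3 applies the
// element-wise operation in parallel over a random access container
// such as std::vector.
struct openmp_range_algebra_sketch
{
    template< class S1 , class S2 , class S3 , class Op >
    static void for_each3( S1 &s1 , S2 &s2 , S3 &s3 , Op op )
    {
        const std::ptrdiff_t n = static_cast< std::ptrdiff_t >( s1.size() );
        // signed loop index keeps pre-3.0 OpenMP compilers happy
        #pragma omp parallel for schedule(static)
        for( std::ptrdiff_t i = 0 ; i < n ; ++i )
            op( s1[i] , s2[i] , s3[i] );
    }
};
```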
A special OpenMP state type may be required in order to initialize the underlying memory properly. This is important for performance on NUMA systems: each OpenMP thread should be the first to touch its chunk of memory (see e.g. this presentation for an explanation). This should probably be handled by odeint's resizer implementation for the state.
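A minimal sketch of what such a first-touch-aware resize could look like, assuming a split state that stores one contiguous chunk per thread; all names here are hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>
#include <omp.h>

// Hypothetical split state: one inner vector per OpenMP thread.
typedef std::vector< std::vector< double > > openmp_state_sketch;

// First-touch resize: each thread allocates and zeroes its own chunk, so on a
// NUMA machine the pages land on the memory node of the thread that will
// later work on them.
inline void resize_first_touch( openmp_state_sketch &s , std::size_t total )
{
    const std::size_t parts = omp_get_max_threads();
    s.resize( parts );
    #pragma omp parallel
    {
        const std::size_t t     = omp_get_thread_num();
        const std::size_t chunk = ( total + parts - 1 ) / parts;
        const std::size_t begin = std::min( total , t * chunk );
        const std::size_t end   = std::min( total , begin + chunk );
        s[t].assign( end - begin , 0.0 );   // the first touch happens here
    }
}
```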
and you don't even need a separate state type :)

On 06/14/2013 11:39 PM, neapel wrote:
I do think an OpenMP state is a good idea, to have more control over the parallelization. How else could you specialize the resizer if you don't have an OpenMP state type to specialize on?
On 17.06.2013 10:29, Mario Mulansky wrote:
Ok, you are right. I think there are possibilities with SFINAE and …
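For context, dispatching without a dedicated state type could look roughly like the following enable_if sketch; the trait and function names are hypothetical and only illustrate the SFINAE route being weighed against a dedicated openmp_state:

```cpp
#include <boost/type_traits/integral_constant.hpp>
#include <boost/utility/enable_if.hpp>

// Hypothetical trait: mark container types whose memory should be resized in
// parallel (first touch), without introducing a dedicated openmp_state type.
template< class State >
struct is_openmp_parallelized : boost::false_type {};

// This overload participates in overload resolution only for marked states;
// everything else keeps the ordinary single-threaded resize.
template< class State >
typename boost::enable_if< is_openmp_parallelized< State > >::type
parallel_resize( State &out , const State &in )
{
    // ... first-touch initialization as sketched above ...
}
```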
The State splits a given Range into one InnerState per thread. The algebra's for_eachN calls for_eachN in parallel on each part, using the InnerState's algebra. There's an openmp_wrapper to parallelize the system function; it needs a way to pass on each part's offset. The idea is that this design should allow using OpenMP on each MPI node with a single-threaded inner state: mpi_state< openmp_state< inner_state > > with mpi_wrapper(openmp_wrapper(system_function)).
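A rough sketch of the two pieces described here, with hypothetical names; the real signatures differ, this only illustrates the nesting and the offset handling:

```cpp
#include <cstddef>

// Hypothetical nested algebra: run the inner algebra's for_each3 on each
// per-thread part in parallel.
template< class InnerAlgebra >
struct openmp_nested_algebra_sketch
{
    template< class S1 , class S2 , class S3 , class Op >
    static void for_each3( S1 &s1 , S2 &s2 , S3 &s3 , Op op )
    {
        const std::ptrdiff_t n = static_cast< std::ptrdiff_t >( s1.size() );
        #pragma omp parallel for schedule(static)
        for( std::ptrdiff_t i = 0 ; i < n ; ++i )
            InnerAlgebra::for_each3( s1[i] , s2[i] , s3[i] , op );
    }
};

// Hypothetical wrapper: call the user's system function once per part, passing
// the part's global offset so the user can locate the chunk in the full state.
// The four-argument system signature (chunk, derivative chunk, time, offset)
// is an assumption made for this sketch.
template< class System >
struct openmp_wrapper_sketch
{
    System m_system;
    explicit openmp_wrapper_sketch( System sys ) : m_system( sys ) {}

    template< class State , class Deriv , class Time >
    void operator()( const State &x , Deriv &dxdt , Time t ) const
    {
        const std::ptrdiff_t n = static_cast< std::ptrdiff_t >( x.size() );
        #pragma omp parallel for schedule(static)
        for( std::ptrdiff_t i = 0 ; i < n ; ++i )
        {
            std::size_t offset = 0;
            for( std::ptrdiff_t j = 0 ; j < i ; ++j )
                offset += x[j].size();
            m_system( x[i] , dxdt[i] , t , offset );
        }
    }
};
```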
- openmp_range_algebra: parallel for over a random access container.
- openmp_nested_algebra: processes parts of a split container in parallel.
- openmp_state: a split container based on vector<vector<>>.
- openmp_algebra: uses a range_algebra on each part of that container.
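These names correspond to the OpenMP support that later shipped with Boost.odeint; assuming that wiring (the algebra dispatch for openmp_state, the split helper, and the header path below), a usage sketch for a system of independent ODEs might look like this:

```cpp
#include <cstddef>
#include <vector>
#include <boost/numeric/odeint.hpp>
// header path as in Boost.odeint's OpenMP support (assumption)
#include <boost/numeric/odeint/external/openmp/openmp.hpp>

using namespace boost::numeric::odeint;

typedef openmp_state< double > state_type;   // effectively vector< vector<double> >

// N independent relaxation equations dx/dt = -x: each chunk can be handled by
// its own thread, with no boundary communication between chunks.
void relax( const state_type &x , state_type &dxdt , double /*t*/ )
{
    #pragma omp parallel for schedule(static)
    for( long p = 0 ; p < static_cast< long >( x.size() ) ; ++p )
        for( std::size_t i = 0 ; i < x[p].size() ; ++i )
            dxdt[p][i] = -x[p][i];
}

int main()
{
    std::vector< double > orig( 1024 , 1.0 );
    state_type x;
    split( orig , x );   // distribute the flat state into per-thread chunks

    runge_kutta4< state_type > stepper;   // picks the OpenMP algebra via dispatch
    integrate_const( stepper , relax , x , 0.0 , 10.0 , 0.01 );
}
```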