Improving MAP Performance for Complex Workflows (e.g., nnU-Net)
Hello MONAI team and community,
I'd like to start a discussion around optimizing the performance of MONAI Application Packages (MAPs), particularly for complex, multi-stage workflows similar to nnU-Net. As these models become more prevalent, ensuring efficient execution within the MAP framework is crucial.
I have a few specific points I'd like to explore:
1. Holoscan Operator Optimizations vs. Container Spin-up Time
In the current MAP architecture, each inference request for a single image typically spins up a new container. Given this overhead, how much can we realistically expect optimizations at the Holoscan operator level to impact end-to-end performance? Is the container initialization time the primary bottleneck for these complex, single-instance workflows, potentially overshadowing the gains from operator-level enhancements?
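To make the amortization question concrete, here is a back-of-envelope sketch. The timings are made-up placeholders, not measurements, and the function is purely illustrative:

```python
def overhead_fraction(container_startup_s: float,
                      pipeline_s: float,
                      images_per_container: int = 1) -> float:
    """Fraction of end-to-end time spent on container spin-up,
    amortized over the images one container instance processes."""
    total = container_startup_s + images_per_container * pipeline_s
    return container_startup_s / total

# Hypothetical numbers: a 30 s container spin-up against a 20 s
# single-image nnU-Net-style pipeline.
print(overhead_fraction(30.0, 20.0, images_per_container=1))   # 0.6
print(overhead_fraction(30.0, 20.0, images_per_container=10))  # ~0.13
```

If the one-container-per-image pattern really puts us in the left column, operator-level savings of a few hundred milliseconds would be hard to observe end-to-end.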
2. Optimizing the TotalSegmentator MAP
The AIDE/TotalSegmentator MAP currently invokes the TotalSegmentator tool as a subprocess. This is effective but may not be the most performant approach. Should we consider re-implementing this MAP to leverage a more native, operator-based architecture? This could potentially reduce overhead and improve data flow efficiency. What would be the challenges and benefits of creating a more deeply integrated TotalSegmentator MAP using native operators?
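To illustrate the boundary in question, here is a minimal sketch of the subprocess-style invocation the current MAP wraps. The flag names follow the TotalSegmentator CLI; the wrapper function itself is a hypothetical stand-in, and the command is returned rather than executed so the process boundary is visible:

```python
from pathlib import Path

def totalsegmentator_cmd(input_path: Path, output_dir: Path,
                         fast: bool = False) -> list[str]:
    """Build the TotalSegmentator CLI call that the current MAP runs
    in a child process. A native operator-based design would replace
    this whole boundary with in-process data passing."""
    cmd = ["TotalSegmentator", "-i", str(input_path), "-o", str(output_dir)]
    if fast:
        cmd.append("--fast")
    # The operator would then do roughly: subprocess.run(cmd, check=True)
    return cmd

print(totalsegmentator_cmd(Path("ct.nii.gz"), Path("seg_out")))
```

Every crossing of this boundary implies serializing the volume to disk and reloading it on the other side, which is exactly the overhead a native operator pipeline could avoid.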
3. Graph Execution Engine (GXF) Benefits for Single vs. Batch Processing
The Graph Execution Engine (GXF) in Holoscan is designed to manage and optimize complex dataflows. However, are its benefits, such as zero-copy and efficient resource management, as significant when processing a single input image compared to processing a batch of images? For many clinical use cases, inference is performed on a per-study basis. It would be valuable to understand the performance trade-offs in this context.
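As a loose analogy for the zero-copy question (plain Python, not GXF): the saving from avoiding copies scales with how many buffer hand-offs occur, which is precisely what a single-image, per-study run minimizes:

```python
data = bytearray(64 * 1024 * 1024)  # stand-in for a 64 MiB CT volume

# Copying hand-off: each pipeline stage materializes a new buffer,
# so the cost repeats for every stage and every image in a batch.
copied = bytes(data)        # O(n) data movement

# Zero-copy hand-off: a view over the same underlying memory.
view = memoryview(data)     # O(1), no data movement
assert view.obj is data     # same buffer, not a copy
```

With one image passing through the graph once, the copies avoided are few; with a batch streaming through many operators, the same mechanism is applied many times over, which is why I suspect the benefit profile differs between the two cases.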
4. GXF Benefits with a Remote Inference Server (Triton)
When a MAP is configured to use a remote inference service like Triton, the model execution happens outside the local graph. In this scenario, what are the primary benefits of using the Graph Execution Engine within the MAP itself? Does it still offer significant advantages in managing the pre- and post-processing steps that run locally before and after the call to the remote server?
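To frame the question: even with the model call moved to a remote server, the MAP still owns a local chain like the one below. All function names here are hypothetical stand-ins, and the remote call is a stub rather than a real tritonclient request:

```python
def preprocess(volume):
    """Local stage: e.g. resampling/normalization before the remote call."""
    peak = max(volume)
    return [v / peak for v in volume]

def remote_infer(tensor):
    """Stand-in for the call to a remote Triton inference server."""
    return [1 if v > 0.5 else 0 for v in tensor]

def postprocess(mask):
    """Local stage: e.g. component filtering, DICOM SEG writing."""
    return sum(mask)

# The graph engine only ever schedules the local stages; the question
# is how much its scheduling and zero-copy machinery buys for a chain
# whose middle link is a network round-trip.
volume = [10, 40, 90, 20]
result = postprocess(remote_infer(preprocess(volume)))
print(result)  # 1
```

My intuition is that the network round-trip dominates, but the engine may still help with overlapping local stages or managing GPU buffers for pre/post-processing; measurements would settle this.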
I'm looking forward to hearing your thoughts and insights on these topics before we start experimenting with these ideas.