The Habitat-Sim backend module is implemented in C++ and leverages the magnum graphics middleware library to support cross-platform deployment on a broad variety of hardware configurations. The architecture of the main abstraction classes is shown below. The design of this module ensures a few key properties:
- Memory-efficient management of 3D environment resources (triangle mesh geometry, textures, shaders) ensuring shared resources are cached and re-used
- Flexible, structured representation of 3D environments using SceneGraphs, allowing for programmatic manipulation of object state, and combination of objects from different environments
- High-efficiency rendering engine with multi-attachment render passes for reduced overhead when multiple sensors are active
- Arbitrary numbers of Agents and corresponding Sensors that can be linked to a 3D environment by attachment to a SceneGraph.
Architecture of Habitat-Sim
main classes
The Simulator delegates management of all resources related to 3D environments to a ResourceManager that is responsible for loading and caching 3D environment data from a variety of on-disk formats. These resources are used within SceneGraphs at the level of individual SceneNodes that represent distinct objects or regions in a particular Scene. Agents and their Sensors are instantiated by being attached to SceneNodes in a particular SceneGraph.
Example rendered sensor observations
The table below reports performance statistics for a test scene from the Matterport3D dataset (id 17DRP5sb8fy
) on a Xeon E5-2690 v4 CPU
and Nvidia Titan Xp
. Single-thread performance reaches several thousand frames per second, while multi-process operation with several independent simulation backends can reach more than 10,000 frames per second on a single GPU!
1 proc | 3 procs | 5 procs | |||||||
---|---|---|---|---|---|---|---|---|---|
Sensors / Resolution | 128 | 256 | 512 | 128 | 256 | 512 | 128 | 256 | 512 |
RGB | 4093 | 1987 | 848 | 10638 | 3428 | 2068 | 10592 | 3574 | 2629 |
RGB + depth | 2050 | 1042 | 423 | 5024 | 1715 | 1042 | 5223 | 1774 | 1348 |
RGB + depth + semantics* | 709 | 596 | 394 | 1312 | 1219 | 979 | 1521 | 1429 | 1291 |
Previous simulation platforms that have operated on similar datasets typically produce on the order of a couple hundred frames per second. For example Gibson reports up to about 150 fps with 8 processes, and MINOS reports up to about 167 fps with 4 threads.
*Note: The semantic sensor in MP3D houses currently requires the use of additional house 3D meshes with orders of magnitude more geometric complexity leading to reduced performance. We expect this to be addressed in future versions leading to speeds comparable to RGB + depth; stay tuned.
We support transferring rendering results directly to a PyTorch tensor via CUDA-GL Interop.
This feature is built by when Habitat-Sim is compiled with CUDA, i.e. built with --with-cuda
. To enable it, set the
gpu2gpu_transfer
flag of the sensor specification(s) to True
This is implemented in a way that is reasonably agnostic to the exact GPU-Tensor library being used, but we currently have only implemented support for PyTorch.