Releases: isarandi/poseviz
v0.3.2 — ModernGL Rendering Backend
What's New in v0.3
This is a major release that replaces the rendering backend: Mayavi is gone, and in its place is a custom OpenGL renderer built on ModernGL and GLFW. The new renderer is faster, more interactive, and supports GPU-accelerated video encoding.
New Rendering Engine
The entire rendering pipeline has been rewritten from scratch:
- ModernGL + GLFW: Replaces the Mayavi/VTK dependency with a lightweight OpenGL stack. Faster startup, lower memory usage, no Qt dependency.
- Instanced rendering: Skeleton joints (spheres) and limbs (tubes) are rendered with GPU instancing — one draw call per color group instead of one per joint.
- MSAA antialiasing: 4x multisampled framebuffer for smooth edges.
- Multiple mesh color sources: Body meshes support uniform color, per-vertex RGB, scalar-to-colormap mapping, and UV-mapped textures — each with its own shader variant.
- Raymond 3-point lighting: Camera-relative Lambertian lighting for consistent illumination from any angle.
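The batching idea behind the instanced rendering item can be shown without any GPU code. The sketch below is not taken from the PoseViz renderer; it only illustrates how instances might be grouped by color so that each group could be submitted as a single instanced draw call. The helper name group_instances_by_color and the data layout are hypothetical.

```python
from collections import defaultdict


def group_instances_by_color(positions, colors):
    """Group instance positions by color (hypothetical helper, not PoseViz API).

    Each returned group corresponds to what could be one instanced draw call:
    a shared color plus a list of per-instance positions.
    """
    if len(positions) != len(colors):
        raise ValueError("positions and colors must have the same length")
    groups = defaultdict(list)
    for position, color in zip(positions, colors):
        groups[tuple(color)].append(tuple(position))
    return dict(groups)


# Five joints in two colors: two batches instead of five separate draws.
joint_positions = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (1, 1, 0), (-1, 1, 0)]
joint_colors = [(1, 0, 0), (1, 0, 0), (1, 0, 0), (0, 0, 1), (0, 0, 1)]
batches = group_instances_by_color(joint_positions, joint_colors)
num_draw_calls = len(batches)
```

In a real renderer each batch would be uploaded as a per-instance buffer and drawn once; here the grouping step alone is shown.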
Interactive Camera
The new free-fly terrain camera provides full interactive navigation:
- Orbit (left drag): Rotate around a pivot point in the scene.
- Look around (Shift + left drag): Rotate the viewing direction in place — the camera stays fixed while the view rotates. Useful for surveying a scene from a fixed vantage point.
- Pan (middle drag): Move the pivot parallel to the view plane.
- Zoom (right drag or scroll wheel): Change distance from pivot.
- Fly (arrow keys, Page Up/Down): Move through the scene in the camera's look direction.
- Field of view (+/−): Adjust the camera's FOV.
- Snap to camera (number keys 1–9): Jump to a displayed camera and track it. Any manual movement unsnaps.
- Camera history (mouse back/forward buttons): Navigate between previous camera positions, like a browser's back/forward.
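The browser-like camera history can be modeled as a list of poses plus a cursor. The following is a minimal sketch under assumptions, not the PoseViz implementation: the class name CameraHistory is hypothetical, poses are represented by plain strings, and it is assumed that recording a new pose discards the "forward" entries, as a browser does.

```python
class CameraHistory:
    """Browser-style back/forward list of camera poses (illustrative only)."""

    def __init__(self, initial_pose):
        self._poses = [initial_pose]
        self._index = 0

    @property
    def current(self):
        return self._poses[self._index]

    def push(self, pose):
        # Assumption: a new pose drops any "forward" entries, as in a browser.
        del self._poses[self._index + 1:]
        self._poses.append(pose)
        self._index += 1

    def back(self):
        # Step back one pose; stay put if already at the oldest entry.
        if self._index > 0:
            self._index -= 1
        return self.current

    def forward(self):
        # Step forward one pose; stay put if already at the newest entry.
        if self._index < len(self._poses) - 1:
            self._index += 1
        return self.current


history = CameraHistory("overview")
history.push("camera 1")
history.push("camera 2")
history.back()              # cursor moves to "camera 1"
history.push("free view")   # "camera 2" is no longer reachable via forward()
```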
Split-Screen Mode
Press Tab to toggle split-screen mode:
- Left pane: The original camera view (locked to the recording camera or scripted viz_camera).
- Right pane: The free-fly camera, fully interactive and independent of the left pane.
This lets you see exactly what the camera sees while simultaneously exploring the scene from a custom angle. Mouse and keyboard input affects only the right pane.
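For readers curious how a two-pane layout maps onto a window, here is a small sketch of the viewport arithmetic. It assumes an even left/right split, which these notes do not state, and the function name split_viewports is hypothetical rather than part of PoseViz.

```python
def split_viewports(width, height):
    """Return (left, right) viewports as (x, y, w, h) pixel rectangles.

    Assumes an even horizontal split; with an odd width the right pane
    gets the extra pixel column so the two panes tile the window exactly.
    """
    left_width = width // 2
    left = (0, 0, left_width, height)
    right = (left_width, 0, width - left_width, height)
    return left, right


left_pane, right_pane = split_viewports(1920, 1080)
```

Each rectangle could then be used as the viewport for one render pass: the original camera into the left rectangle, the free-fly camera into the right.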
GPU Video Encoding
- Zero-copy NVENC encoding: The rendered framebuffer texture is passed directly to the hardware encoder via CUDA-OpenGL interop. No pixel readback to CPU.
- CPU fallback: Set gpu_encode=False for machines without NVENC support.
- Independent render resolution: Set render_resolution=(1920, 1080) to render at high resolution for video output while the window displays at a lower resolution.
- Audio copying: Pass audio_path to copy the audio track from the source video into the output.
- Mid-session recording: Start and stop recording with new_sequence_output() and finalize_sequence_output(). Record multiple segments to different files in a single session.
Auto-Headless Detection
The headless parameter now defaults to None (auto-detect). If neither DISPLAY nor WAYLAND_DISPLAY is set (e.g., SSH without X forwarding), PoseViz automatically switches to headless mode. The same code works on a desktop (opens a window) and on a remote server (renders offscreen to video) without changes.
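The auto-detection rule described above can be expressed in a few lines. This is a sketch of the logic only, not PoseViz's actual code: the helper name resolve_headless is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```python
import os


def resolve_headless(headless=None, environ=None):
    """Resolve a tri-state headless flag (hypothetical helper, not PoseViz API).

    True and False are respected as given. None triggers auto-detection:
    headless mode is chosen when neither DISPLAY nor WAYLAND_DISPLAY is set.
    """
    if headless is not None:
        return bool(headless)
    env = os.environ if environ is None else environ
    # Assumption: an empty string counts as "not set".
    has_display = bool(env.get("DISPLAY")) or bool(env.get("WAYLAND_DISPLAY"))
    return not has_display


# Simulated environments, so the example does not depend on the real machine.
on_ssh_server = resolve_headless(None, environ={})
on_x11_desktop = resolve_headless(None, environ={"DISPLAY": ":0"})
```

Passing the environment as a parameter keeps the rule easy to test on any machine.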
GPU Frame Input
The new gpu_frames=True flag enables passing GPU tensors directly as frames, avoiding CPU round-trips:
- Accepts PyTorch CUDA tensors and any DLPack-compatible object (CuPy, JAX, hardware video decoder outputs, etc.).
- Image downscaling and undistortion run on the GPU via deltacamera.pt.reproject_image (single grid_sample kernel).
- Enables a fully GPU-resident pipeline: decode (NVDEC) → resize/undistort (CUDA) → render (OpenGL) → encode (NVENC).
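"DLPack-compatible" in Python usually means that an object exposes the __dlpack__ and __dlpack_device__ methods. The sketch below shows a duck-typing check for that protocol. How PoseViz itself validates its inputs is not described in these notes, so supports_dlpack and FakeGpuFrame are hypothetical names used only for illustration.

```python
def supports_dlpack(obj):
    """Return True if obj exposes the Python DLPack exchange protocol."""
    return callable(getattr(obj, "__dlpack__", None)) and callable(
        getattr(obj, "__dlpack_device__", None)
    )


class FakeGpuFrame:
    """Stand-in for a GPU tensor so the example runs without a GPU."""

    def __dlpack__(self, stream=None):
        raise NotImplementedError("placeholder: a real tensor returns a capsule")

    def __dlpack_device__(self):
        # (device type, device id); device type 2 is kDLCUDA in DLPack.
        return (2, 0)


frame_ok = supports_dlpack(FakeGpuFrame())
list_ok = supports_dlpack([1, 2, 3])
```

PyTorch CUDA tensors and CuPy arrays implement both methods, which is why they can be exchanged without a copy through host memory.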
Scripted Camera with Interactive Override
When a viz_camera is passed to update(), it drives the view. The user can take over at any time by dragging the mouse, and the terrain camera picks up from the current view position. After the mouse is released, the next update() call that passes a viz_camera resumes the scripted camera path.
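The hand-off between the scripted and the interactive camera amounts to a small state machine. The sketch below models it under assumptions: the class CameraArbiter and its method names are hypothetical, views are represented by plain strings, and only the precedence rule described above is captured, not PoseViz's internals.

```python
class CameraArbiter:
    """Chooses between a scripted camera and a user-driven one (sketch only)."""

    def __init__(self, initial_view):
        self.view = initial_view
        self.dragging = False

    def on_mouse_press(self):
        # The interactive camera takes over from wherever the view currently is.
        self.dragging = True

    def on_mouse_drag(self, new_view):
        if self.dragging:
            self.view = new_view

    def on_mouse_release(self):
        self.dragging = False

    def update(self, viz_camera=None):
        # A scripted camera drives the view unless the user is mid-drag.
        if viz_camera is not None and not self.dragging:
            self.view = viz_camera
        return self.view


arbiter = CameraArbiter("start")
scripted_first = arbiter.update("script-0")   # scripted camera drives the view
arbiter.on_mouse_press()
arbiter.on_mouse_drag("user-view")
during_drag = arbiter.update("script-1")      # ignored while dragging
arbiter.on_mouse_release()
after_release = arbiter.update("script-2")    # scripted path resumes
```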
Documentation
New Sphinx documentation with Diátaxis structure:
- How-to guides: Usage tips, video output, headless rendering, GPU frames, multi-view visualization.
- Explanations: Architecture (process model, shared memory), coordinate systems, rendering pipeline (scene graph, instancing, lighting, camera controls, split-screen viewports).
Breaking Changes
- Mayavi is no longer a dependency. The rendering backend is entirely new. If you were using Mayavi-specific features or accessing internal Mayavi objects, those no longer exist.
- cameravision replaced by deltacamera for camera intrinsics/extrinsics.
- torch_frames renamed to gpu_frames to reflect that any DLPack-compatible GPU object is accepted, not just PyTorch tensors.
- Python ≥ 3.10 required (was ≥ 3.8).
- New dependencies: moderngl, glfw, deltacamera, framepump, simplepyutils, rlemasklib, numba. Removed: mayavi, cameravision, boxlib.