Releases: isarandi/poseviz

v0.3.2 — ModernGL Rendering Backend

21 Mar 11:35

What's New in v0.3

This is a major release that replaces the Mayavi rendering backend with a custom OpenGL renderer built on ModernGL and GLFW. The new renderer is faster, more interactive, and supports GPU-accelerated video encoding.

New Rendering Engine

The entire rendering pipeline has been rewritten from scratch:

  • ModernGL + GLFW: Replaces the Mayavi/VTK dependency with a lightweight OpenGL stack. Faster startup, lower memory usage, no Qt dependency.
  • Instanced rendering: Skeleton joints (spheres) and limbs (tubes) are rendered with GPU instancing — one draw call per color group instead of one per joint.
  • MSAA antialiasing: 4x multisampled framebuffer for smooth edges.
  • Multiple mesh color sources: Body meshes support uniform color, per-vertex RGB, scalar-to-colormap mapping, and UV-mapped textures — each with its own shader variant.
  • Raymond 3-point lighting: Camera-relative Lambertian lighting for consistent illumination from any angle.
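
To illustrate the "one draw call per color group" idea, here is a minimal sketch in plain Python. It is not PoseViz's actual code: the helper name group_instances_by_color and the (position, color) data layout are assumptions made for illustration. Joints that share a color are collected into one batch, and each batch could then be submitted as a single instanced draw.

```python
from collections import defaultdict


def group_instances_by_color(joints):
    """Batch joint positions by color (hypothetical helper, not PoseViz API).

    joints: iterable of (position, color) pairs, where position is an
    (x, y, z) tuple and color is an (r, g, b) tuple.
    Returns a dict mapping each color to the positions that share it.
    """
    batches = defaultdict(list)
    for position, color in joints:
        batches[color].append(position)
    return dict(batches)


RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
joints = [
    ((0.0, 0.0, 0.0), RED),
    ((0.1, 0.5, 0.0), BLUE),
    ((0.2, 1.0, 0.0), RED),
]
batches = group_instances_by_color(joints)
# One instanced draw per batch: three joints need only two draws here.
num_draw_calls = len(batches)
```

With per-joint drawing the example above would need three draw calls; grouping by color reduces that to the number of distinct colors.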

Interactive Camera

The new free-fly terrain camera provides full interactive navigation:

  • Orbit (left drag): Rotate around a pivot point in the scene.
  • Look around (Shift + left drag): Rotate the viewing direction in place — the camera stays fixed while the view rotates. Useful for surveying a scene from a fixed vantage point.
  • Pan (middle drag): Move the pivot parallel to the view plane.
  • Zoom (right drag or scroll wheel): Change distance from pivot.
  • Fly (arrow keys, Page Up/Down): Move through the scene in the camera's look direction.
  • Field of view (+/−): Adjust the camera's FOV.
  • Snap to camera (number keys 1–9): Jump to a displayed camera and track it. Any manual movement unsnaps.
  • Camera history (mouse back/forward buttons): Navigate between previous camera positions, like a browser's back/forward.
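
The browser-like camera history can be pictured with a small sketch. The CameraHistory class below is hypothetical and only models the back/forward behaviour described above; poses are represented by plain strings, and the rule that a new pose discards the forward entries is an assumption borrowed from how browsers behave.

```python
class CameraHistory:
    """Browser-style back/forward history (hypothetical, not a PoseViz class)."""

    def __init__(self, initial):
        self._entries = [initial]
        self._index = 0

    @property
    def current(self):
        return self._entries[self._index]

    def push(self, pose):
        # Assumption: a new pose discards any "forward" entries, as a browser does.
        del self._entries[self._index + 1:]
        self._entries.append(pose)
        self._index += 1

    def back(self):
        # Step back one entry; stay put if already at the oldest pose.
        if self._index > 0:
            self._index -= 1
        return self.current

    def forward(self):
        # Step forward one entry; stay put if already at the newest pose.
        if self._index < len(self._entries) - 1:
            self._index += 1
        return self.current


history = CameraHistory("front")
history.push("side")
history.push("top")
```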

Split-Screen Mode

Press Tab to toggle split-screen mode:

  • Left pane: The original camera view (locked to the recording camera or scripted viz_camera).
  • Right pane: The free-fly camera — fully interactive, independent of the left pane.

This lets you see exactly what the camera sees while simultaneously exploring the scene from a custom angle. Mouse and keyboard input affects only the right pane.
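
As a rough picture of how a window could be divided into two panes, the sketch below computes pixel rectangles for each viewport. The function split_viewports is hypothetical, and the equal left/right split is an assumption; the release notes do not state the actual pane proportions.

```python
def split_viewports(width, height, split):
    """Return (x, y, w, h) viewport rectangles in pixels (hypothetical helper).

    Assumes the window is divided into two equal-width panes when split is
    True; an odd pixel column goes to the right pane.
    """
    if not split:
        return [(0, 0, width, height)]
    left_w = width // 2
    return [
        (0, 0, left_w, height),               # left pane: original camera view
        (left_w, 0, width - left_w, height),  # right pane: free-fly camera
    ]


single = split_viewports(1280, 720, split=False)
panes = split_viewports(1280, 720, split=True)
```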

GPU Video Encoding

  • Zero-copy NVENC encoding: The rendered framebuffer texture is passed directly to the hardware encoder via CUDA-OpenGL interop. No pixel readback to CPU.
  • CPU fallback: Set gpu_encode=False for machines without NVENC support.
  • Independent render resolution: Set render_resolution=(1920, 1080) to render at high resolution for video output while the window displays at a lower resolution.
  • Audio copying: Pass audio_path to copy the audio track from the source video into the output.
  • Mid-session recording: Start and stop recording with new_sequence_output() and finalize_sequence_output(). Record multiple segments to different files in a single session.

Auto-Headless Detection

The headless parameter now defaults to None (auto-detect). If neither DISPLAY nor WAYLAND_DISPLAY is set (e.g., SSH without X forwarding), PoseViz automatically switches to headless mode. The same code works on a desktop (opens a window) and on a remote server (renders offscreen to video) without changes.
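
The detection rule described above can be sketched in a few lines. The function resolve_headless is a hypothetical illustration, not PoseViz's implementation; treating an empty variable the same as an unset one is an assumption.

```python
import os


def resolve_headless(headless=None, environ=None):
    """Decide whether to run headless (hypothetical helper, not PoseViz API).

    An explicit True/False is respected. With None, headless mode is chosen
    when neither DISPLAY nor WAYLAND_DISPLAY is set. Assumption: an empty
    value counts as unset.
    """
    if headless is not None:
        return headless
    env = os.environ if environ is None else environ
    return not (env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


# Example: a bare SSH session without X forwarding has neither variable set.
ssh_session = resolve_headless(environ={})
desktop_session = resolve_headless(environ={"DISPLAY": ":0"})
```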

GPU Frame Input

The new gpu_frames=True flag enables passing GPU tensors directly as frames, avoiding CPU round-trips:

  • Accepts PyTorch CUDA tensors and any DLPack-compatible object (CuPy, JAX, hardware video decoder outputs, etc.).
  • Image downscaling and undistortion run on the GPU via deltacamera.pt.reproject_image (single grid_sample kernel).
  • Enables a fully GPU-resident pipeline: decode (NVDEC) → resize/undistort (CUDA) → render (OpenGL) → encode (NVENC).
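
What "DLPack-compatible" means in practice can be shown with a small duck-typing check. This is a sketch, not PoseViz's actual validation logic: the function is_cuda_dlpack and the stand-in classes are hypothetical. It relies only on the standard DLPack Python protocol, in which an object exposes __dlpack__ and __dlpack_device__, and the device type code for CUDA is 2.

```python
KDL_CUDA = 2  # DLPack device type code for CUDA (kDLCUDA in dlpack.h)


def is_cuda_dlpack(obj):
    """True if obj exposes the DLPack protocol and reports a CUDA device."""
    if not (hasattr(obj, "__dlpack__") and hasattr(obj, "__dlpack_device__")):
        return False
    device_type, _device_id = obj.__dlpack_device__()
    return int(device_type) == KDL_CUDA


class FakeCudaArray:
    """Stand-in for a GPU array; only the device query is implemented."""

    def __dlpack__(self, stream=None):
        raise NotImplementedError("illustration only")

    def __dlpack_device__(self):
        return (KDL_CUDA, 0)  # (device type, device index)


class FakeCpuArray(FakeCudaArray):
    """Stand-in for a host array (device type 1 is kDLCPU)."""

    def __dlpack_device__(self):
        return (1, 0)
```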

Scripted Camera with Interactive Override

When a viz_camera is passed to update(), it drives the view, but the user can take over at any time by dragging the mouse: the terrain camera picks up from the current view position. After the mouse is released, the next update() call that includes a viz_camera resumes the scripted camera path.
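
The hand-over between scripted and interactive control can be modelled as a tiny state machine. The ViewController class below is a hypothetical sketch of the behaviour described above, not PoseViz's internal code; views are represented by plain strings.

```python
class ViewController:
    """Scripted camera with interactive override (hypothetical sketch)."""

    def __init__(self, initial_view):
        self.view = initial_view
        self.dragging = False

    def on_mouse_press(self):
        # The interactive camera starts from whatever view is current.
        self.dragging = True

    def on_mouse_drag(self, new_view):
        if self.dragging:
            self.view = new_view

    def on_mouse_release(self):
        self.dragging = False

    def update(self, viz_camera=None):
        # The scripted camera drives the view unless the user is dragging.
        if viz_camera is not None and not self.dragging:
            self.view = viz_camera
        return self.view


controller = ViewController("initial")
scripted_view = controller.update(viz_camera="scripted-1")
controller.on_mouse_press()
controller.on_mouse_drag("user-view")
view_while_dragging = controller.update(viz_camera="scripted-2")
controller.on_mouse_release()
view_after_release = controller.update(viz_camera="scripted-3")
```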

Documentation

New Sphinx documentation with Diátaxis structure:

  • How-to guides: Usage tips, video output, headless rendering, GPU frames, multi-view visualization.
  • Explanations: Architecture (process model, shared memory), coordinate systems, rendering pipeline (scene graph, instancing, lighting, camera controls, split-screen viewports).

Breaking Changes

  • Mayavi is no longer a dependency. The rendering backend is entirely new. If you were using Mayavi-specific features or accessing internal Mayavi objects, those no longer exist.
  • cameravision replaced by deltacamera for camera intrinsics/extrinsics.
  • torch_frames renamed to gpu_frames to reflect that any DLPack-compatible GPU object is accepted, not just PyTorch tensors.
  • Python ≥ 3.10 required (was ≥ 3.8).
  • New dependencies: moderngl, glfw, deltacamera, framepump, simplepyutils, rlemasklib, numba. Removed: mayavi, cameravision, boxlib.

v0.2.1

21 May 22:15

Add docs, update for mesh support, PyPI