diff --git a/examples/fiftyone-visualization/demo.ipynb b/examples/fiftyone-visualization/demo.ipynb new file mode 100644 index 0000000..566fe59 --- /dev/null +++ b/examples/fiftyone-visualization/demo.ipynb @@ -0,0 +1,354 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Visualizing ZOD in FiftyOne\n", + "\n", + "The ZOD is a large multi-modal autonomous driving dataset collected over a 2-year period in 14 different European countries, designed to support various aspects of autonomous driving research. This example facilitates the conversion of ZOD into FiftyOne format, enabling users to leverage FiftyOne's powerful tools for dataset exploration and analysis.\n", + "\n", + "In this notebook we provide instructions on how to use the [Zenseact Open Dataset (ZOD)](https://zod.zenseact.com) with [FiftyOne](https://voxel51.com), a powerful open-source tool for computer vision dataset exploration and analysis. \n", + "\n", + "![](./docs/zod-fiftyone.gif)\n", + "\n", + "## Requirements\n", + "\n", + "Please install the packages in `requirements.txt` before getting started:\n", + "```\n", + "zod==0.3.6 \n", + "fiftyone==0.23.2\n", + "open3d==0.17.0\n", + "pyyaml==6.0.1 \n", + "tqdm==4.66.1\n", + "```\n", + "\n", + "Once the required packages are installed, we will check for any existing datasets." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import fiftyone as fo\n", + "\n", + "fo.list_datasets()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If this list is empty, we will start by importing ZOD into FiftyOne by creating a dataset.\n", + "\n", + "## Create a new dataset\n", + "\n", + "With the requirements already installed, we have to:\n", + "\n", + "### 1. Update [`config.yaml`](./src/config.yaml)\n", + "\n", + "Specify the necessary configurations for your dataset. 
Here's a breakdown of the configuration options:\n", + "\n", + "- `dataset_root`: Point this to the root directory of the ZOD dataset. If you haven't downloaded it yet, see the download options [here](https://zod.zenseact.com/download/).\n", + "- `pcd_files_dir`: Directory where converted PCD files will be stored. If it does not already exist, it will be created.\n", + "- `dataset_version`: Specify whether you're using the \"mini\" or \"full\" version of the dataset.\n", + "- `dataset_split`: Choose the dataset split (\"train\", \"val\", or \"all\"). For the \"mini\" version, please use \"all\".\n", + "- `dataset_name`: The name you want to give to your dataset.\n", + "- `dataset_persistent`: Set to True if you want to save the dataset locally for later use (recommended).\n", + "- `test_run`: Set to True for a test run on the first 10 samples.\n", + "- `mapbox_token`: Optional. Provide a [Mapbox](https://www.mapbox.com) API token for map functionality in FiftyOne (you can create one for free [here](https://account.mapbox.com/access-tokens/create)).\n", + "\n", + "### 2. Run [`zod_to_fiftyone.py`](./src/zod_to_fiftyone.py)\n", + "\n", + "The script takes the path to the config file as its only positional argument. Please note that due to a FiftyOne limitation, we need to create `.pcd` files to visualize point clouds in the app. This conversion process will take a while due to the size of the dataset." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load the dataset\n", + "\n", + "Once you have created a dataset, choose an existing one from the list to load it. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import fiftyone as fo\n", + "\n", + "fo.list_datasets()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset = fo.load_dataset('zod_val')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Launch the app\n", + "\n", + "You can use the app inside a notebook, but also launch in your browser. Learn more [here](https://docs.voxel51.com/user_guide/app.html)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "session = fo.launch_app(dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Add new field to each sample\n", + "\n", + "This will take a few minutes to run depending on the size of the dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for sample in dataset.iter_samples(progress=True):\n", + " sample[\"split\"] = \"Validation\"\n", + " sample.save()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset.media_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can read more about grouped datasets in the FiftyOne documentation [here](https://docs.voxel51.com/user_guide/groups.html)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset.stats(include_media=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get random sample\n", + "\n", + "This is useful to see the structure of your samples." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sample = dataset.take(1).first()\n", + "sample" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# lists the names of the fields in the sample\n", + "sample.field_names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get coordinates for random samples" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset.take(5).values(\"location.point\", _raw=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Filtering\n", + "\n", + "You can easily filter the samples in your dataset in the app GUI, but you can also do it programmatically. Here are some examples.\n", + "\n", + "### by label" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from fiftyone import ViewField as F" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "just_animals = dataset.filter_labels(\n", + " \"detections\",\n", + " F(\"label\").contains_str(\"Animal\"),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can also count how many samples exist in the filtered view we just created. You can read more about views in FiftyOne [here](https://docs.voxel51.com/user_guide/using_views.html)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# number of samples in the filtered view with a non-None `detections` field\n", + "just_animals.count(\"detections\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### by file path" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset.match(F(\"filepath\").contains_str(\"india\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Saving a view" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# will save a view with all samples that contain an animal\n", + "dataset.save_view(\"detections_animals\", just_animals)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Export view as KITTI dataset\n", + "\n", + "You can export data in many common formats (YOLO, KITTI, COCO, etc.) or build [custom exporters](https://docs.voxel51.com/user_guide/export_datasets.html#custom-formats). Here is an example of exporting a view as a KITTI object detection dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "export_dir = \"./kitti\"\n", + "label_field = \"detections\"\n", + "\n", + "os.makedirs(export_dir, exist_ok=True)\n", + "\n", + "just_animals.export(\n", + " export_dir=export_dir,\n", + " dataset_type=fo.types.KITTIDetectionDataset,\n", + " label_field=label_field,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is just a preview of how FiftyOne can be used to interact with the ZOD dataset programmatically. For more information please refer to the FiftyOne [documentation](https://docs.voxel51.com)." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "zod", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/examples/fiftyone-visualization/docs/zod-fiftyone.gif b/examples/fiftyone-visualization/docs/zod-fiftyone.gif new file mode 100644 index 0000000..552b306 Binary files /dev/null and b/examples/fiftyone-visualization/docs/zod-fiftyone.gif differ diff --git a/examples/fiftyone-visualization/requirements.txt b/examples/fiftyone-visualization/requirements.txt new file mode 100644 index 0000000..2fe21fb --- /dev/null +++ b/examples/fiftyone-visualization/requirements.txt @@ -0,0 +1,5 @@ +zod==0.3.6 +fiftyone==0.23.2 +open3d==0.17.0 +pyyaml==6.0.1 +tqdm==4.66.1 \ No newline at end of file diff --git a/examples/fiftyone-visualization/src/config.yaml b/examples/fiftyone-visualization/src/config.yaml new file mode 100644 index 0000000..cbc8437 --- /dev/null +++ b/examples/fiftyone-visualization/src/config.yaml @@ -0,0 +1,8 @@ +dataset_root: "/mnt/zod" # point this to the root directory of the zod dataset you are using +pcd_files_dir: "/mnt/zod/pcd_files" # directory where converted pcd files will be stored +dataset_version: "full" # "mini" or "full" +dataset_split: "val" # "train", "val", or "all" +dataset_name: "zod_val" # you can change this to anything you would like +dataset_persistent: True # if true, dataset will be saved locally for later use +test_run: True # if true, it will run for only the first 10 samples +mapbox_token: "" # optional, provide a Mapbox API token for map functionality in FiftyOne. 
\ No newline at end of file diff --git a/examples/fiftyone-visualization/src/utils.py b/examples/fiftyone-visualization/src/utils.py new file mode 100644 index 0000000..bdf7433 --- /dev/null +++ b/examples/fiftyone-visualization/src/utils.py @@ -0,0 +1,53 @@ +from __future__ import annotations + +import numpy as np + + +def quaternion_to_euler(qx: float, qy: float, qz: float, qw: float) -> list[float]: + """ + Convert a quaternion into euler angles (roll, pitch, yaw). + + Parameters: + qx (float): The x component of the quaternion. + qy (float): The y component of the quaternion. + qz (float): The z component of the quaternion. + qw (float): The w component of the quaternion. + + Returns: + list: The roll, pitch, and yaw euler angles in radians. + """ + t0 = +2.0 * (qw * qx + qy * qz) + t1 = +1.0 - 2.0 * (qx * qx + qy * qy) + roll_x = np.arctan2(t0, t1) + + t2 = +2.0 * (qw * qy - qz * qx) + t2 = np.clip(t2, a_min=-1.0, a_max=1.0) + pitch_y = np.arcsin(t2) + + t3 = +2.0 * (qw * qz + qx * qy) + t4 = +1.0 - 2.0 * (qy * qy + qz * qz) + yaw_z = np.arctan2(t3, t4) + + return [roll_x, pitch_y, yaw_z] + + +def normalize_bbox(bbox: list, image_width=3848, image_height=2168) -> list[float]: + """ + Normalize the bounding box coordinates. 
+ + Parameters: + bbox (list or tuple): The bounding box in the format [x, y, width, height] + image_width (int): Width of the image + image_height (int): Height of the image + + Returns: + list: Normalized bounding box [x_min, y_min, width, height] + """ + x, y, width, height = bbox + + x_min = x / image_width + y_min = y / image_height + norm_width = width / image_width + norm_height = height / image_height + + return [x_min, y_min, norm_width, norm_height] diff --git a/examples/fiftyone-visualization/src/zod_to_fiftyone.py b/examples/fiftyone-visualization/src/zod_to_fiftyone.py new file mode 100644 index 0000000..abed5c3 --- /dev/null +++ b/examples/fiftyone-visualization/src/zod_to_fiftyone.py @@ -0,0 +1,258 @@ +from __future__ import annotations + +import argparse +import os +import sys + +import fiftyone as fo +import open3d as o3d +import yaml +from tqdm import tqdm +from utils import normalize_bbox, quaternion_to_euler + +import zod.constants as constants +from zod import ZodFrame, ZodFrames +from zod.constants import AnnotationProject, Anonymization + + +def get_dataset_config(config_path: str) -> dict: + """ + Loads configuration from a YAML file. + + Args: + config_path (str): Path to the configuration YAML file. + + Returns: + dict: The loaded configuration dictionary. + """ + with open(config_path, "r") as f: + config = yaml.safe_load(f) + return config + + +def filter_zod_frames(zod_frames: ZodFrames, dataset_split: str) -> list[str]: + """ + Filters ZOD frames based on the specified dataset split. + + Args: + zod_frames (ZodFrames): The ZodFrames object containing all frames. + dataset_split (str): The desired dataset split ("all", "train", or "val"). + + Returns: + list: A list of ZOD frame IDs based on the split. 
+ """ + if dataset_split not in ["all", "train", "val"]: + print("Invalid dataset_split specified.") + sys.exit() + + if dataset_split == "all": + return list(zod_frames.get_all_ids()) + else: + return list( + zod_frames.get_split(constants.TRAIN if dataset_split == "train" else constants.VAL) + ) + + +def process_zod_frame(zod_frame: ZodFrame, pcd_files_dir: str) -> tuple[str, list, str]: + """ + Processes a single ZOD frame. Gets image path, annotations and converts the lidar data + from provided .npy file into a .pcd file. + + Args: + zod_frame (ZodFrame): The ZOD frame object. + pcd_files_dir (str): Path to the directory for storing point cloud files. + + Returns: + tuple: A tuple containing (core_image_path, annotations, pcd_filename). + """ + camera_core_frame = zod_frame.info.get_key_camera_frame(Anonymization.BLUR) + core_image_path = camera_core_frame.filepath + annotations = zod_frame.get_annotation(AnnotationProject.OBJECT_DETECTION) + pcd_filename = f"{pcd_files_dir}/{zod_frame.info.id}.pcd" + + if not os.path.exists(pcd_filename): + core_lidar = zod_frame.get_lidar()[0] + pcd = o3d.geometry.PointCloud() + pcd.points = o3d.utility.Vector3dVector(core_lidar.points) + o3d.io.write_point_cloud(pcd_filename, pcd) + return core_image_path, annotations, pcd_filename + return core_image_path, annotations, pcd_filename + + +def convert_annotations(annotations: list) -> tuple[list[fo.Detection], list[fo.Detection]]: + """ + Converts 2D and 3D ZOD annotations to FiftyOne detection formats. + + Args: + annotations (list): A list of ZOD annotations. + + Returns: + tuple: A tuple containing two lists (detections_3d, detections_2d). 
+ """ + detections_3d, detections_2d = [], [] + for anno in annotations: + if anno.box3d is not None: + location = anno.box3d.center + dimensions = anno.box3d.size + qw = anno.box3d.orientation[0] + qx = anno.box3d.orientation[1] + qy = anno.box3d.orientation[2] + qz = anno.box3d.orientation[3] + rotation = quaternion_to_euler(qx, qy, qz, qw) + + detection_3d = fo.Detection( + dimensions=list(dimensions), + location=list(location), + rotation=list(rotation), + label=anno.name, + ) + detections_3d.append(detection_3d) + + detection_2d = fo.Detection( + bounding_box=normalize_bbox(anno.box2d.xywh), + label=anno.name, + ) + detections_2d.append(detection_2d) + else: + pass + + return detections_3d, detections_2d + + +def create_dataset_samples(zod_frame: ZodFrame, pcd_files_dir: str) -> list: + """ + Creates a group of FiftyOne samples (image and point cloud) with their detections. + + Args: + zod_frame (ZodFrame): The ZOD frame object. + pcd_files_dir (str): Path to the directory where converted + point cloud (.pcd) files are stored. + + Returns: + list: A list of FiftyOne Sample objects. 
+ """ + core_image_path, annotations, pcd_filename = process_zod_frame(zod_frame, pcd_files_dir) + detections_3d, detections_2d = convert_annotations(annotations) + + group = fo.Group() + samples = [ + fo.Sample( + filepath=core_image_path, + group=group.element("image"), + detections=fo.Detections(detections=detections_2d), + ), + fo.Sample( + filepath=pcd_filename, + group=group.element("pcd"), + detections=fo.Detections(detections=detections_3d), + ), + ] + + def add_metadata(slice_idx): + samples[slice_idx]["frame_id"] = zod_frame.metadata.frame_id + samples[slice_idx]["time_of_day"] = zod_frame.metadata.time_of_day + samples[slice_idx]["country_code"] = zod_frame.metadata.country_code + samples[slice_idx]["collection_car"] = zod_frame.metadata.collection_car + samples[slice_idx]["road_type"] = zod_frame.metadata.road_type + samples[slice_idx]["road_condition"] = zod_frame.metadata.road_condition + samples[slice_idx]["num_vehicles"] = zod_frame.metadata.num_vehicles + samples[slice_idx]["location"] = fo.GeoLocation( + point=[zod_frame.metadata.longitude, zod_frame.metadata.latitude] + ) + + add_metadata(0) # add metadata to images + add_metadata(1) # add metadata to point clouds (optional) + + return samples + + +def create_fiftyone_dataset(config: dict, samples: list) -> None: + """ + Creates a FiftyOne dataset from extracted ZOD frame samples. + + Args: + config (dict): Configuration dictionary. + samples (list): List of FiftyOne samples created from ZOD frames. 
+ """ + dataset = fo.Dataset(name=config["dataset_name"]) + dataset.add_samples(samples) + + # Colour by label values by default + # and change to colour blind friendly colour scheme + dataset.app_config.color_scheme = fo.ColorScheme( + color_by="value", + color_pool=[ + "#E69F00", + "#56b4e9", + "#009e74", + "#f0e442", + "#0072b2", + "#d55e00", + "#cc79a7", + ], + ) + + if config["mapbox_token"]: + print("Mapbox token found, enabling map plugin.") + dataset.app_config.plugins["map"] = {"mapboxAccessToken": config["mapbox_token"]} + else: + print("Mapbox token not found, map plugin not enabled.") + + dataset.save() + + dataset.persistent = config["dataset_persistent"] + + +def create_zod_to_fiftyone_dataset(config_path: str) -> None: + """ + Creates a FiftyOne dataset from ZOD frames with point clouds and annotations. + + Args: + config_path (str): Path to the configuration YAML file. + """ + config = get_dataset_config(config_path) + # Creates directory to store .pcd files + os.makedirs(config["pcd_files_dir"], exist_ok=True) + + zod_frames = ZodFrames(dataset_root=config["dataset_root"], version=config["dataset_version"]) + zod_frame_list = filter_zod_frames(zod_frames, config["dataset_split"]) + if config["test_run"]: + zod_frame_list = zod_frame_list[:10] + + samples = [] + + for idx in tqdm(zod_frame_list): + zod_frame = zod_frames[idx] + sample_list = create_dataset_samples(zod_frame, config["pcd_files_dir"]) + samples.extend(sample_list) + + create_fiftyone_dataset(config=config, samples=samples) + + +def parse_arguments(): + """Parses command-line arguments using argparse. + + Returns: + Namespace: An object containing parsed arguments. 
+ """ + + parser = argparse.ArgumentParser( + add_help=False, + description="A script that converts the ZOD dataset \ + into a FiftyOne dataset.", + ) + parser.add_argument("path", type=str, help="The path for the config file.") + parser.add_argument( + "-h", + "--help", + action="help", + default=argparse.SUPPRESS, + help="Show this help message and exit.", + ) + + return parser.parse_args() + + +if __name__ == "__main__": + args = parse_arguments() + create_zod_to_fiftyone_dataset(config_path=args.path)
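A note on the two helpers in `src/utils.py`: the sketch below re-implements them with the standard `math` module instead of NumPy, purely so it runs on its own, and checks them on made-up inputs. The orientation and box values are hypothetical; the `[w, x, y, z]` ordering follows how `convert_annotations` indexes `anno.box3d.orientation`, and the default image size is the one used in the diff.

```python
import math


def quaternion_to_euler(qx, qy, qz, qw):
    """Return [roll, pitch, yaw] in radians, using the same formulas as utils.py."""
    roll = math.atan2(2.0 * (qw * qx + qy * qz), 1.0 - 2.0 * (qx * qx + qy * qy))
    # Clamp before asin so rounding error cannot push the argument outside [-1, 1].
    pitch = math.asin(max(-1.0, min(1.0, 2.0 * (qw * qy - qz * qx))))
    yaw = math.atan2(2.0 * (qw * qz + qx * qy), 1.0 - 2.0 * (qy * qy + qz * qz))
    return [roll, pitch, yaw]


def normalize_bbox(bbox, image_width=3848, image_height=2168):
    """Scale a pixel-space [x, y, width, height] box to relative [0, 1] coordinates."""
    x, y, width, height = bbox
    return [x / image_width, y / image_height, width / image_width, height / image_height]


# Hypothetical orientation stored as [w, x, y, z]: a 90 degree rotation about the z axis.
orientation = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
qw, qx, qy, qz = orientation
# Note the argument order: the helper expects (qx, qy, qz, qw), not (qw, qx, qy, qz).
roll, pitch, yaw = quaternion_to_euler(qx, qy, qz, qw)
print(roll, pitch, yaw)  # roll and pitch are 0, yaw is approximately pi / 2

# Hypothetical pixel box covering the central region of a 3848x2168 image.
box = normalize_bbox([962, 542, 1924, 1084])
print(box)  # [0.25, 0.25, 0.5, 0.5]
```

The argument order is the main thing to watch: passing the stored orientation straight through as `quaternion_to_euler(*orientation)` would silently treat `w` as `x`.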