FiftyOne integration

Introduction

FiftyOne is an open-source visualization and analysis tool for machine learning datasets, particularly useful in computer vision projects. It facilitates detailed dataset examination and the fine-tuning of model performance.

Albumentations can be used in FiftyOne through the FiftyOne Albumentations plugin.

With the FiftyOne Albumentations plugin, you can transform any and all labels of type Detections, Keypoints, Segmentation, and Heatmap, or just the images themselves.
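
To make concrete what transforming a Detections label involves: FiftyOne stores each bounding box as [top-left-x, top-left-y, width, height] in coordinates relative to the image size, so a geometric transform such as a horizontal flip has to move the box together with the pixels. The following is a minimal pure-Python sketch of that idea, not the plugin's implementation; `hflip_box` is a hypothetical helper introduced only for illustration.

```python
def hflip_box(box):
    """Horizontally flip a box given as [x, y, width, height] in [0, 1].

    Hypothetical helper for illustration; not part of FiftyOne or Albumentations.
    """
    x, y, w, h = box
    # The box's old right edge (x + w) becomes its new left edge,
    # measured from the opposite side of the image.
    return [1.0 - (x + w), y, w, h]


box = [0.125, 0.5, 0.25, 0.25]  # a box in the left part of the image
flipped = hflip_box(box)
print(flipped)  # [0.625, 0.5, 0.25, 0.25]
```

Flipping twice returns the original box, which is a quick sanity check for any label transform of this kind.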

Info

This tutorial is almost entirely based on the FiftyOne Documentation and serves as an overview of the functionality of the FiftyOne Albumentations plugin.

For more up-to-date information, check the original source.

This integration guide will focus on the setup process and the functionality of the plugin.

For a tutorial on how to curate your augmentations, check out the Data Augmentation Tutorial in the FiftyOne Documentation.

Overview

Albumentations supports 80+ transforms, spanning pixel-level transforms, geometric transforms, and more.

As of April 29, 2024 FiftyOne supports:

Functionality

The FiftyOne Albumentations plugin provides the following functionality:

  • Apply Albumentations transformations to your dataset, your current view, or selected samples
  • Visualize the effects of these transformations directly within the FiftyOne App
  • View samples generated by the last applied transformation
  • Save augmented samples to the dataset
  • Get info about the last applied transformation
  • Save transformation pipelines to the dataset for reproducibility

Setup

Make sure you have FiftyOne and Albumentations installed:

Bash
pip install -U fiftyone albumentations

Next, install the FiftyOne Albumentations plugin:

Bash
fiftyone plugins download https://github.com/jacobmarks/fiftyone-albumentations-plugin

Note

If you have the FiftyOne Plugin Utils plugin installed, you can also install the Albumentations plugin via the install_plugin operator, selecting the Albumentations plugin from the community dropdown menu.

You will also need to load (and download if necessary) a dataset to apply the augmentations to. For this guide, we'll use the quickstart dataset:

Python
import fiftyone as fo
import fiftyone.zoo as foz

# only take 5 samples for quick demonstration
dataset = foz.load_zoo_dataset("quickstart", max_samples=5)

# only keep the ground truth labels
dataset.select_fields("ground_truth").keep_fields()

session = fo.launch_app(dataset)

Note

The quickstart dataset only contains Detections labels. If you want to test Albumentations transformations on other label types, here are some quick examples to get you started, using FiftyOne's Hugging Face Transformers and Ultralytics integrations:

Bash
pip install -U transformers ultralytics

Python
import fiftyone as fo
import fiftyone.zoo as foz

from ultralytics import YOLO

# Keypoints
model = YOLO("yolov8l-pose.pt")
dataset.apply_model(model, label_field="keypoints")

# Instance Segmentation
model = YOLO("yolov8l-seg.pt")
dataset.apply_model(model, label_field="instances")

# Semantic Segmentation
model = foz.load_zoo_model(
    "segmentation-transformer-torch",
    name_or_path="Intel/dpt-large-ade",
)
dataset.apply_model(model, label_field="mask")

# Heatmap
model = foz.load_zoo_model(
    "depth-estimation-transformer-torch",
    name_or_path="LiheYoung/depth-anything-small-hf",
)
dataset.apply_model(model, label_field="depth_map")

Apply transformations

To apply Albumentations transformations to your dataset, you can use the augment_with_albumentations operator. Press the backtick key to open the operator modal, and select the augment_with_albumentations operator from the dropdown menu.

You can then configure the transformations to apply:

  • Number of augmentations per sample: The number of augmented samples to generate for each input sample. The default is 1, which is sufficient for deterministic transformations, but for probabilistic transformations, you may want to generate multiple samples to see the range of possible outputs.
  • Number of transforms: The number of transformations to compose into the pipeline to be applied to each sample. The default is 1, but you can set this as high as you'd like — the more transformations, the more complex the augmentations will be. You will be able to configure each transform separately.
  • Target view: The view to which the transformations will be applied. The default is dataset, but you can also apply the transformations to the current view or to currently selected samples within the app.
  • Execution mode: If you set delegated=False, the operation will be executed immediately. If you set delegated=True, the operation will be queued as a job, which you can then run in the background from your terminal with:
Bash
fiftyone delegated launch
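
The first two options above can be illustrated with a small pure-Python sketch. It is not the plugin's code and does not call Albumentations: the `(name, fn, p)` pipeline format and the helpers `hflip`, `vflip`, `run_pipeline`, and `augment` are assumptions made for illustration. It shows why one augmentation per sample is enough for a deterministic pipeline, while a probabilistic pipeline benefits from several draws.

```python
import random


def hflip(image):
    """Mirror a row-major image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror a row-major image top to bottom."""
    return image[::-1]


def run_pipeline(image, pipeline, rng):
    """Apply each (name, fn, p) step with probability p.

    Returns the output image and the names of the steps that fired.
    """
    applied = []
    for name, fn, p in pipeline:
        if rng.random() < p:
            image = fn(image)
            applied.append(name)
    return image, applied


def augment(image, pipeline, num_augs, seed=0):
    """Generate num_augs outputs for a single input sample."""
    rng = random.Random(seed)
    return [run_pipeline(image, pipeline, rng) for _ in range(num_augs)]


image = [[1, 2], [3, 4]]

# Deterministic pipeline (p=1.0 for every step): all outputs are identical,
# so a single augmentation per sample is sufficient.
deterministic = [("hflip", hflip, 1.0), ("vflip", vflip, 1.0)]
det_outputs = augment(image, deterministic, num_augs=3)

# Probabilistic pipeline: outputs can differ between draws, so several
# augmentations per sample show the range of possible results.
probabilistic = [("hflip", hflip, 0.5), ("vflip", vflip, 0.5)]
prob_outputs = augment(image, probabilistic, num_augs=8)
```

Each output is paired with the list of steps that actually fired, which is the kind of per-sample record that makes a probabilistic pipeline inspectable afterwards.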

For each transformation, you can select either a "primitive" transformation from the Albumentations library, or a "saved" transformation pipeline that you have previously saved to the dataset. These saved pipelines can consist of one or more transformations.

When you apply a primitive transformation, you can configure the parameters of the transformation directly within the app. The available parameters, their default values, types, and docstrings are all integrated directly from the Albumentations library.
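
As a rough illustration of how parameter names, defaults, types, and docstrings can be read from a Python class, here is a minimal sketch using the standard library's `inspect` module. How the plugin actually gathers this metadata is not described here, so treat this as an assumption; `FakeBlur` is a hypothetical stand-in for an Albumentations transform, and `describe_parameters` is a hypothetical helper.

```python
import inspect


class FakeBlur:
    """Hypothetical stand-in for an Albumentations transform class."""

    def __init__(self, blur_limit: int = 7, p: float = 0.5):
        self.blur_limit = blur_limit
        self.p = p


def describe_parameters(transform_cls):
    """Collect the default and annotated type of each constructor argument."""
    signature = inspect.signature(transform_cls.__init__)
    params = {}
    for name, param in signature.parameters.items():
        if name == "self":
            continue
        has_default = param.default is not inspect.Parameter.empty
        has_type = param.annotation is not inspect.Parameter.empty
        params[name] = {
            "default": param.default if has_default else None,
            # Fall back to str() for annotations without a __name__
            "type": getattr(param.annotation, "__name__", str(param.annotation))
            if has_type
            else None,
        }
    return params


info = describe_parameters(FakeBlur)
doc = inspect.getdoc(FakeBlur)
print(info)
print(doc)
```

A form built from `info` would show one input per parameter, pre-filled with its default, with `doc` available as help text.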

When you apply a saved pipeline, there will not be any parameters to configure.

Visualize transformations

Once you've applied the transformations, you can visualize the effects of the transformations directly within the FiftyOne App. All augmented samples will be added to the dataset, and will be tagged as augmented so that you can easily filter for just augmented or non-augmented samples in the app.

You can also filter for augmented samples programmatically with the match_tags() method:

Python
# get just the augmented samples
augmented_view = dataset.match_tags("augmented")

# get just the non-augmented samples
non_augmented_view = dataset.match_tags("augmented", bool=False)

However, matching on these tags returns every sample that was generated by an augmentation, not just the samples generated by the last applied transformation, because augmentations can be saved to the dataset (see "Save augmentations" below). To get only the samples generated by the last applied transformation, use the view_last_albumentations_run operator.

Note

For all samples added to the dataset by the FiftyOne Albumentations plugin, there will be a field "transform", which contains the information not just about the pipeline that was applied, but also about the specific parameters that were used for this application of the pipeline. For example, if you had a HorizontalFlip transformation with an application probability of p=0.5, the contents of the "transform" field tell you whether or not this transformation was applied to the sample!

Save augmentations

By default, all augmentations are temporary, as the FiftyOne Albumentations plugin is primarily designed for rapid prototyping and experimentation. This means that when you generate a new batch of augmented samples, the previous batch of augmented samples will be removed from the dataset, and the image files will be deleted from disk.

However, if you want to save the augmented samples to the dataset, you can use the save_albumentations_augmentations operator, which will save the augmented samples to the dataset while keeping the augmented tag on the samples.

Get last transformation info

When you apply a transformation pipeline to samples in your dataset using the FiftyOne Albumentations plugin, this information is captured and stored using FiftyOne's custom runs. This means that you can easily access the information about the last applied transformation.

In the FiftyOne App, you can use the get_last_albumentations_run_info operator to display a formatted summary of the relevant information:

Note

You can also access this information programmatically by getting info about the custom run that the information is stored in. For the Albumentations plugin, this info is stored via the key '_last_albumentations_run':

Python
last_run_info = dataset.get_run_info("_last_albumentations_run")
print(last_run_info)

Save transformations

If you are satisfied with the transformation pipeline you have created, you can save the entire composition of transformations to the dataset, hyperparameters and all. This means that after your rapid prototyping phase, you can easily move to a more reproducible workflow, and you can share your transformations or port them to other datasets.

To save a transformation pipeline, you can use the save_albumentations_transform operator.

After doing so, you will be able to view the information about this saved transformation pipeline using the get_albumentations_run_info operator.

Additionally, you will have access to this saved transformation pipeline under the "saved" tab for each transformation in the augment_with_albumentations operator modal.