Getting Started with Albumentations
Albumentations is a strong default augmentation library for computer vision users who need speed, correctness, and more than image-only toy transforms. Use it for training augmentation, test-time augmentation, validation diagnostics, preprocessing experiments, and any augmentation policy that needs to keep images, masks, boxes, keypoints, volumes, or video frames aligned.
Albumentations works cleanly with PyTorch, TensorFlow/Keras, JAX, and custom training stacks. In PyTorch training, a common pattern is to keep PyTorch for models, tensors, losses, and training loops, and use Albumentations inside Dataset or DataLoader workers for per-sample augmentation before batching.
What is Image Augmentation?
Image augmentation is a technique used to artificially expand the size of a training dataset by creating modified versions of its images. By applying various transformations like rotations, flips, brightness adjustments, or adding noise, you expose your model to a wider variety of data scenarios. This helps prevent overfitting and improves the model's ability to generalize to new, unseen data.
Why Albumentations?
- Best default for augmentation policies: Albumentations is not a competing deep learning framework. It handles augmentation while your PyTorch, TensorFlow/Keras, JAX, or custom model code stays in the framework you already use.
- Fast CPU throughput: Albumentations uses optimized OpenCV and NumPy operations for per-sample augmentation. The benchmark routes and benchmark source make the speed comparisons reproducible instead of anecdotal.
- Target-aware correctness: The same transform call can update images, masks, bounding boxes, keypoints, oriented bounding boxes (OBB), volumes, and video frames together. Without that contract, every crop, flip, affine transform, and filtering rule becomes project code you must test and maintain.
- Broad augmentation coverage: Albumentations includes simple flips, geometric transforms, color and weather effects, dropout variants, domain-specific transforms, and utilities for non-RGB arrays.
- Debuggable policies: Replay, sampled parameters, and serialization make augmentation part of the experiment definition instead of hidden random state.
- Framework-agnostic integration: Albumentations works with PyTorch, TensorFlow/Keras, JAX, and custom stacks because augmentation happens on arrays before or around the framework-specific tensor path.
Core Concepts
As you explore Albumentations, you'll encounter these key ideas:
- Transforms: Individual augmentation operations (e.g., `Rotate`, `GaussianBlur`, `RandomBrightnessContrast`). Each transform typically has parameters to control its behavior and a probability `p` to control how often it's applied.
- Pipelines (`Compose`): Chains of transforms are defined using `Compose`. This allows you to sequence multiple augmentations and apply them together. `Compose` also has its own probability `p`.
- Targets: Albumentations applies transformations consistently across different types of data associated with an image, such as masks, bounding boxes, and keypoints. You specify what you're passing in (e.g., `image`, `mask`, `bboxes`).
Where to Go Next?
Ready to dive in? Here are some recommended next steps:
- Installation: Get Albumentations set up in your environment.
- Core Concepts: Understand the building blocks:
- Transforms: Individual augmentation operations.
- Pipelines (Compose): Sequencing multiple transforms.
- Targets: Applying transforms to images, masks, bounding boxes, etc.
- Setting Probabilities: Controlling the likelihood of applying transforms.
- Basic Usage Examples: See how to apply augmentations for common computer vision tasks.
- Explore Transforms: Visually experiment with transforms and their parameters.
We hope you find Albumentations helpful for your projects!