AlbumentationsX vs PIL/Pillow
Compare AlbumentationsX with PIL/Pillow for image augmentation: API differences, CPU benchmark results, and migration examples.
What Is Different?
Pillow is an image toolkit. AlbumentationsX is an augmentation pipeline library. That sounds subtle, but it changes the shape of the code: Pillow gives you image operations; AlbumentationsX gives you randomized, reproducible transforms that keep images, masks, bounding boxes, and keypoints synchronized.
- Pillow works with PIL Image objects; AlbumentationsX works with NumPy arrays and returns a dictionary of augmented targets.
- Pillow is great for loading, saving, drawing, and simple image edits; AlbumentationsX is built for training-time augmentation pipelines.
- AlbumentationsX has first-class random composition, probabilities, target synchronization, and bbox/keypoint parameter handling.
- Pillow code usually becomes manual orchestration when masks or labels must follow the image; AlbumentationsX keeps that in the pipeline contract.
Benchmark vs PIL/Pillow
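The benchmark below compares single-threaded RGB image augmentation throughput on CPU. Pillow receives PIL images; AlbumentationsX receives OpenCV-loaded RGB NumPy arrays. Values are images per second, so higher is better. A speedup below 1.0x means Pillow was faster, and a dash marks a transform with no Pillow result in this benchmark.
| Transform | AlbumentationsX 2.2.5 (img/s, CPU, macOS arm64) | PIL/Pillow 12.2.0 (img/s, CPU, macOS arm64) | Speedup, AlbumentationsX / Pillow (range) |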
|---|---|---|---|
| MedianBlur | 1546 ± 16 | 11 ± 0 | 140-147x |
| Contrast | 10045 ± 119 | 1055 ± 6 | 9.4-9.7x |
| Brightness | 9849 ± 99 | 1340 ± 6 | 7.2-7.5x |
| UnsharpMask | 3063 ± 37 | 478 ± 2 | 6.3-6.5x |
| Invert | 31753 ± 1327 | 5503 ± 17 | 5.5-6.0x |
| Posterize | 28724 ± 3259 | 5429 ± 10 | 4.7-5.9x |
| Blur | 7544 ± 134 | 1870 ± 13 | 3.9-4.1x |
| Resize | 3542 ± 11 | 1087 ± 9 | 3.2-3.3x |
| GaussianBlur | 2462 ± 11 | 765 ± 3 | 3.2-3.2x |
| Shear | 1322 ± 7 | 502 ± 2 | 2.6-2.7x |
| Solarize | 13505 ± 442 | 5403 ± 13 | 2.4-2.6x |
| Affine | 1456 ± 23 | 613 ± 1 | 2.3-2.4x |
| Pad | 34979 ± 3274 | 27167 ± 282 | 1.2-1.4x |
| Saturation | 1389 ± 27 | 1324 ± 6 | 1.0-1.1x |
| Colorize | 3858 ± 11 | 3697 ± 18 | 1.0-1.1x |
| JpegCompression | 1351 ± 11 | 1305 ± 5 | 1.0-1.0x |
| Grayscale | 19593 ± 350 | 19267 ± 61 | 1.0-1.0x |
| HorizontalFlip | 13200 ± 430 | 14680 ± 194 | 0.9-0.9x |
| Transpose | 8184 ± 199 | 11038 ± 172 | 0.7-0.8x |
| Rotate | 2996 ± 12 | 4101 ± 119 | 0.7-0.8x |
| AutoContrast | 1619 ± 44 | 2239 ± 3 | 0.7-0.7x |
| VerticalFlip | 29169 ± 2657 | 41794 ± 189 | 0.6-0.8x |
| Equalize | 1086 ± 12 | 2204 ± 2 | 0.5-0.5x |
| Dithering | 6 ± 0 | 1426 ± 16 | <0.01x |
| CLAHE | 644 ± 5 | — | — |
| CenterCrop128 | 95346 ± 1281 | — | — |
| ChannelDropout | 11971 ± 434 | — | — |
| ChannelShuffle | 8235 ± 86 | — | — |
| ColorJiggle | 1208 ± 16 | — | — |
| ColorJitter | 1221 ± 10 | — | — |
| CornerIllumination | 866 ± 28 | — | — |
| Elastic | 453 ± 2 | — | — |
| Erasing | 27849 ± 4028 | — | — |
| GaussianIllumination | 773 ± 21 | — | — |
| GaussianNoise | 328 ± 20 | — | — |
| Hue | 1908 ± 18 | — | — |
| LinearIllumination | 557 ± 18 | — | — |
| LongestMaxSize | 3847 ± 62 | — | — |
| MotionBlur | 3847 ± 49 | — | — |
| Normalize | 1642 ± 26 | — | — |
| OpticalDistortion | 395 ± 4 | — | — |
| Perspective | 1185 ± 9 | — | — |
| PhotoMetricDistort | 1070 ± 19 | — | — |
| PlankianJitter | 3278 ± 13 | — | — |
| PlasmaBrightness | 394 ± 9 | — | — |
| PlasmaContrast | 250 ± 6 | — | — |
| PlasmaShadow | 526 ± 8 | — | — |
| RGBShift | 5025 ± 48 | — | — |
| Rain | 2169 ± 27 | — | — |
| RandomCrop128 | 93574 ± 1964 | — | — |
| RandomGamma | 14482 ± 424 | — | — |
| RandomJigsaw | 9413 ± 136 | — | — |
| RandomResizedCrop | 4354 ± 22 | — | — |
| RandomRotate90 | 8652 ± 167 | — | — |
| SaltAndPepper | 946 ± 4 | — | — |
| Sharpen | 2221 ± 35 | — | — |
| SmallestMaxSize | 2676 ± 7 | — | — |
| Snow | 754 ± 4 | — | — |
| ThinPlateSpline | 92 ± 1 | — | — |
See the aggregate image benchmark or inspect the benchmark source code.
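The speedup ranges in the table appear consistent with dividing the pessimistic AlbumentationsX throughput by the optimistic Pillow throughput for the low end, and the reverse for the high end. This is an inference from the published numbers, not a documented formula, and `speedup_range` is a hypothetical helper name. A small sketch that reproduces the Contrast row:

```python
def speedup_range(a_mean, a_err, b_mean, b_err):
    """Return (low, high) speedup of library A over library B.

    Assumed formula (inferred from the table, not from the benchmark source):
    low  = (a_mean - a_err) / (b_mean + b_err)
    high = (a_mean + a_err) / (b_mean - b_err)
    """
    low = (a_mean - a_err) / (b_mean + b_err)
    high = (a_mean + a_err) / (b_mean - b_err)
    return low, high


# Contrast row: AlbumentationsX 10045 ± 119, Pillow 1055 ± 6.
low, high = speedup_range(10045, 119, 1055, 6)
print(f"{low:.1f}-{high:.1f}x")  # 9.4-9.7x
```

Rows whose displayed values are heavily rounded, such as MedianBlur, may not reproduce exactly from the table alone.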
Conversion Guide
The main conversion is to move from calling Pillow methods one at a time to defining an AlbumentationsX Compose pipeline.
- Convert PIL images to NumPy arrays before augmentation.
- Replace manual randomness with transform-level p values.
- Keep image-adjacent targets in the same Compose call instead of transforming them separately.
- Convert back to PIL only if downstream code specifically needs PIL objects.
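Pillow, with manual randomness around each call:

    from PIL import Image, ImageEnhance
    import random

    image = Image.open("image.jpg").convert("RGB")
    # Each operation needs its own explicit coin flip.
    if random.random() < 0.5:
        image = image.transpose(Image.Transpose.FLIP_LEFT_RIGHT)
    if random.random() < 0.5:
        # Fixed brightness factor of 1.2.
        image = ImageEnhance.Brightness(image).enhance(1.2)

AlbumentationsX, with probabilities declared on the transforms:

    import albumentations as A
    import cv2

    transform = A.Compose([
        A.HorizontalFlip(p=0.5),
        # Samples a brightness change from [-0.2, 0.2] rather than a fixed factor.
        A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.0, p=0.5),
    ])
    # OpenCV loads images as BGR, so convert to RGB before augmenting.
    image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)
    image = transform(image=image)["image"]

Use AlbumentationsX When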
- You need training-time augmentation where randomness, replayability, and target synchronization matter.
- You work on segmentation, detection, keypoint, OCR, document, satellite, medical, or any other multi-target computer vision workflow.
- You run CPU data-loader pipelines where augmentation speed can become the training bottleneck.
Use PIL/Pillow When
- You need image I/O, format conversion, drawing, thumbnails, or lightweight one-off image manipulation.
- You are writing a small script that touches a single image and does not need labels, masks, or reproducible random policies.