AlbumentationsX vs PIL/Pillow

Compare AlbumentationsX with PIL/Pillow for image augmentation: API differences, CPU benchmark results, and migration examples.

What Is Different?

Pillow is an image toolkit. AlbumentationsX is an augmentation pipeline library. That sounds subtle, but it changes the shape of the code: Pillow gives you image operations; AlbumentationsX gives you randomized, reproducible transforms that keep images, masks, bounding boxes, and keypoints synchronized.

Pillow works with PIL Image objects; AlbumentationsX works with NumPy arrays and returns a dictionary of augmented targets.
Pillow is great for loading, saving, drawing, and simple image edits; AlbumentationsX is built for training-time augmentation pipelines.
AlbumentationsX has first-class random composition, probabilities, target synchronization, and bbox/keypoint parameter handling.
Pillow code usually becomes manual orchestration when masks or labels must follow the image; AlbumentationsX keeps that in the pipeline contract.

Benchmark vs PIL/Pillow

The benchmark data is generated during the site build from the published benchmark artifacts. Results are shown for the full PyTorch DataLoader pipeline and for the single-operation RGB micro benchmark. Published GPU artifacts are shown in the DataLoader table where they exist; the RGB micro table on this page is CPU.

DataLoader Pipeline

End-to-end in-memory DataLoader throughput with 8 workers and batch size 256; normalization, tensor conversion, batching, and collation are included in the timing. Each row names only the transform that varies between pipelines. Higher images/second is better. CPU and GPU columns are shown where published artifacts exist.

Pipelines where AlbumentationsX is faster: 26 / 26
Median speedup: 1.41x (IQR 1.35x-1.70x)
Library versions: AlbumentationsX 2.2.6 vs PIL/Pillow 12.2.0 (from generated benchmark metadata)

Transform	Speedup (AlbumentationsX / Pillow)	AlbumentationsX 2.2.6, CPU (images/second)	PIL/Pillow 12.2.0, CPU (images/second)
Resize	≥67x	1350 ± 0	≤20
MedianBlur	24x	4038 ± 0	167 ± 0
UnsharpMask	2.2x	4635 ± 0	2129 ± 0
GaussianBlur	2.0x	5029 ± 0	2474 ± 0
EnhanceEdge	1.8x	4923 ± 0	2688 ± 0
EnhanceDetail	1.7x	5033 ± 0	2922 ± 0
RandomResizedCrop	1.7x	5056 ± 0	2961 ± 0
Contrast	1.7x	5435 ± 0	3244 ± 0
Blur	1.7x	5507 ± 0	3326 ± 0
Shear	1.6x	4141 ± 0	2621 ± 0
Affine	1.6x	4528 ± 0	2902 ± 0
Brightness	1.5x	5310 ± 0	3551 ± 0
Invert	1.4x	5569 ± 0	3890 ± 0
Posterize	1.4x	5431 ± 0	3897 ± 0
Solarize	1.4x	5407 ± 0	3890 ± 0
Pad	1.4x	4996 ± 0	3603 ± 0
JpegCompression	1.4x	4267 ± 0	3140 ± 0
Saturation	1.4x	4646 ± 0	3436 ± 0
VerticalFlip	1.4x	5512 ± 0	4078 ± 0
Grayscale	1.3x	5378 ± 0	3984 ± 0
Transpose	1.3x	5338 ± 0	4081 ± 0
AutoContrast	1.3x	4646 ± 0	3577 ± 0
Equalize	1.3x	4511 ± 0	3528 ± 0
RandomCrop224	1.3x	5130 ± 0	4034 ± 0
HorizontalFlip	1.2x	5002 ± 0	4074 ± 0
Rotate	1.2x	4865 ± 0	3981 ± 0
CLAHE	—	3505 ± 0	—
ChannelDropout	—	5314 ± 0	—
ChannelShuffle	—	5467 ± 0	—
ColorJiggle	—	4255 ± 0	—
ColorJitter	—	4299 ± 0	—
CornerIllumination	—	3969 ± 0	—
Elastic	—	2878 ± 0	—
Erasing	—	5415 ± 0	—
GaussianIllumination	—	3656 ± 0	—
GaussianNoise	—	3415 ± 0	—
Hue	—	4723 ± 0	—
LinearIllumination	—	4179 ± 0	—
LongestMaxSize	—	1363 ± 0	—
MotionBlur	—	4785 ± 0	—
OpticalDistortion	—	3579 ± 0	—
Perspective	—	4132 ± 0	—
PhotoMetricDistort	—	4269 ± 0	—
PlankianJitter	—	4979 ± 0	—
PlasmaBrightness	—	2727 ± 0	—
PlasmaContrast	—	2291 ± 0	—
PlasmaShadow	—	2797 ± 0	—
RGBShift	—	4977 ± 0	—
Rain	—	4198 ± 0	—
RandomGamma	—	5450 ± 0	—
RandomJigsaw	—	4863 ± 0	—
RandomRotate90	—	5087 ± 0	—
SaltAndPepper	—	4508 ± 0	—
Sharpen	—	4978 ± 0	—
SmallestMaxSize	—	1370 ± 0	—
Snow	—	4135 ± 0	—
ThinPlateSpline	—	743 ± 0	—
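
The summary figures for the DataLoader table can be recomputed from its rows. The sketch below assumes the speedup is the plain ratio of the two throughput columns and that the quartiles use linear interpolation. Both assumptions are inferred from the published numbers, not taken from the benchmark source. Only the 26 rows that have a Pillow measurement are included.

```python
from statistics import median, quantiles

# (AlbumentationsX images/s, Pillow images/s), copied from the DataLoader table.
# Resize uses Pillow's "≤20" bound, so its ratio is only a lower bound.
rows = {
    "Resize": (1350, 20), "MedianBlur": (4038, 167),
    "UnsharpMask": (4635, 2129), "GaussianBlur": (5029, 2474),
    "EnhanceEdge": (4923, 2688), "EnhanceDetail": (5033, 2922),
    "RandomResizedCrop": (5056, 2961), "Contrast": (5435, 3244),
    "Blur": (5507, 3326), "Shear": (4141, 2621),
    "Affine": (4528, 2902), "Brightness": (5310, 3551),
    "Invert": (5569, 3890), "Posterize": (5431, 3897),
    "Solarize": (5407, 3890), "Pad": (4996, 3603),
    "JpegCompression": (4267, 3140), "Saturation": (4646, 3436),
    "VerticalFlip": (5512, 4078), "Grayscale": (5378, 3984),
    "Transpose": (5338, 4081), "AutoContrast": (4646, 3577),
    "Equalize": (4511, 3528), "RandomCrop224": (5130, 4034),
    "HorizontalFlip": (5002, 4074), "Rotate": (4865, 3981),
}

speedups = sorted(albx / pillow for albx, pillow in rows.values())
faster = sum(s > 1.0 for s in speedups)

# method="inclusive" interpolates linearly between the observed values.
q1, _, q3 = quantiles(speedups, n=4, method="inclusive")

print(f"{faster} / {len(speedups)} pipelines faster")
print(f"median {median(speedups):.2f}x, IQR {q1:.2f}x-{q3:.2f}x")
# prints:
# 26 / 26 pipelines faster
# median 1.41x, IQR 1.35x-1.70x
```

The two outliers (Resize and MedianBlur) barely move the median or the quartiles, so the median is a fairer one-number summary here than a mean would be.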

Micro Benchmark

The benchmark below compares single-threaded RGB image augmentation throughput. Pillow receives PIL images; AlbumentationsX receives OpenCV-loaded RGB NumPy arrays. Higher images/second is better.
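
As a rough illustration of what an images/second figure with a "±" spread means, here is a minimal single-threaded timing sketch. It is not the benchmark's actual harness. `measure_throughput` and the stand-in `hflip` workload are hypothetical names, and reading "±" as the standard deviation across runs is an assumption.

```python
import statistics
import time


def measure_throughput(transform, images, runs=5):
    """Return (mean, stdev) of images/second over several timed runs."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        for image in images:
            transform(image)
        elapsed = time.perf_counter() - start
        rates.append(len(images) / elapsed)
    return statistics.mean(rates), statistics.stdev(rates)


def hflip(image):
    """Stand-in workload: reverse every row of a nested-list 'image'."""
    return [row[::-1] for row in image]


# 1000 tiny 4x4 "images"; a real run would use decoded RGB arrays instead.
images = [[[0, 1, 2, 3] for _ in range(4)] for _ in range(1000)]

mean_rate, std_rate = measure_throughput(hflip, images)
print(f"{mean_rate:.0f} ± {std_rate:.0f} images/second")
```

Because the loop runs in one thread with no batching or collation, numbers from a harness like this are not comparable to the DataLoader figures.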

Transforms where AlbumentationsX is faster: 23 / 24
Median speedup: 4.56x (IQR 3.01x-7.69x)
Library versions: AlbumentationsX 2.2.6 vs PIL/Pillow 12.2.0 (from generated benchmark metadata)

Transform	Speedup (AlbumentationsX / Pillow)	AlbumentationsX 2.2.6 (images/second)	PIL/Pillow 12.2.0 (images/second)
MedianBlur	≥42x	843 ± 4	≤20
Contrast	16x	6933 ± 30	443 ± 1
GaussianBlur	14x	2343 ± 4	169 ± 1
Brightness	11x	6912 ± 13	609 ± 4
Blur	11x	4449 ± 17	409 ± 3
EnhanceDetail	7.7-7.9x	2148 ± 13	275 ± 1
Invert	7.5-7.8x	15095 ± 61	1974 ± 26
Posterize	7.2-7.3x	14399 ± 58	1977 ± 7
UnsharpMask	6.7-6.8x	906 ± 2	134 ± 0
EnhanceEdge	6.2-6.4x	1373 ± 16	219 ± 0
Resize	6.1-6.4x	2463 ± 37	396 ± 4
Solarize	4.9-5.0x	9760 ± 34	1966 ± 6
Pad	4.1-4.2x	13181 ± 118	3167 ± 37
VerticalFlip	3.8-3.9x	14051 ± 55	3670 ± 21
Shear	3.6x	784 ± 6	217 ± 0
Affine	3.3-3.4x	872 ± 8	264 ± 2
Grayscale	3.2-3.3x	5194 ± 1	1591 ± 15
HorizontalFlip	3.2-3.3x	8416 ± 19	2612 ± 21
Transpose	2.3-2.4x	4627 ± 26	1934 ± 35
Saturation	1.7x	847 ± 17	500 ± 3
AutoContrast	1.4x	1243 ± 19	899 ± 4
Rotate	1.3-1.4x	1408 ± 40	1045 ± 13
JpegCompression	1.3-1.4x	692 ± 7	515 ± 1
Equalize	0.9x	807 ± 3	882 ± 12
CLAHE	—	283 ± 1	—
ChannelDropout	—	6810 ± 65	—
ChannelShuffle	—	4337 ± 13	—
ColorJiggle	—	639 ± 5	—
ColorJitter	—	641 ± 1	—
CornerIllumination	—	425 ± 2	—
Elastic	—	191 ± 0	—
Erasing	—	9511 ± 74	—
GaussianIllumination	—	388 ± 1	—
GaussianNoise	—	225 ± 0	—
Hue	—	967 ± 1	—
LinearIllumination	—	521 ± 1	—
LongestMaxSize	—	2825 ± 42	—
MotionBlur	—	1953 ± 21	—
OpticalDistortion	—	274 ± 1	—
Perspective	—	559 ± 2	—
PhotoMetricDistort	—	581 ± 4	—
PlankianJitter	—	2253 ± 17	—
PlasmaBrightness	—	267 ± 1	—
PlasmaContrast	—	143 ± 0	—
PlasmaShadow	—	420 ± 3	—
RGBShift	—	2292 ± 3	—
Rain	—	1259 ± 2	—
RandomCrop224	—	38380 ± 192	—
RandomGamma	—	9938 ± 46	—
RandomJigsaw	—	5172 ± 16	—
RandomResizedCrop	—	7150 ± 19	—
RandomRotate90	—	5990 ± 85	—
SaltAndPepper	—	738 ± 10	—
Sharpen	—	1388 ± 5	—
SmallestMaxSize	—	2017 ± 25	—
Snow	—	489 ± 3	—
ThinPlateSpline	—	52 ± 0	—
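
Some rows in the micro benchmark table report the speedup as a range such as 7.7-7.9x. One reading that reproduces the published values is to divide the pessimistic AlbumentationsX figure by the optimistic Pillow figure for the low end, and the reverse for the high end. This is inferred from the numbers, not confirmed by the benchmark source, and `speedup_range` is a hypothetical helper.

```python
def speedup_range(albx, albx_err, pillow, pillow_err):
    """Return (low, high) speedup bounds, rounded to one decimal place."""
    low = (albx - albx_err) / (pillow + pillow_err)
    high = (albx + albx_err) / (pillow - pillow_err)
    return round(low, 1), round(high, 1)


# (AlbumentationsX mean, ±, Pillow mean, ±) copied from the table above.
rows = {
    "EnhanceDetail": (2148, 13, 275, 1),
    "Invert": (15095, 61, 1974, 26),
    "Resize": (2463, 37, 396, 4),
    "Rotate": (1408, 40, 1045, 13),
}

for name, values in rows.items():
    low, high = speedup_range(*values)
    print(f"{name}: {low}-{high}x")
# prints:
# EnhanceDetail: 7.7-7.9x
# Invert: 7.5-7.8x
# Resize: 6.1-6.4x
# Rotate: 1.3-1.4x
```

When both bounds round to the same value, as with Shear (3.6x), the table shows a single number.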

See the aggregate image benchmark or inspect the benchmark source code.

Conversion Guide

The main conversion is to move from calling Pillow methods one at a time to defining an AlbumentationsX Compose pipeline.

Convert PIL images to NumPy arrays before augmentation.
Replace manual randomness with transform-level p values.
Keep image-adjacent targets in the same Compose call instead of transforming them separately.
Convert back to PIL only if downstream code specifically needs PIL objects.

Pillow

from PIL import Image, ImageEnhance
import random

image = Image.open("image.jpg").convert("RGB")

if random.random() < 0.5:
    image = image.transpose(Image.Transpose.FLIP_LEFT_RIGHT)

if random.random() < 0.5:
    image = ImageEnhance.Brightness(image).enhance(1.2)

AlbumentationsX

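import albumentations as A
import cv2

# Each transform carries its own probability, so no manual random.random() checks are needed.
transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    # brightness_limit=0.2 samples a brightness change from [-0.2, 0.2] on each call;
    # it is a randomized range, not the fixed 1.2 factor used in the Pillow snippet.
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.0, p=0.5),
])

# OpenCV loads images as BGR, so convert to RGB before augmenting.
image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)
# Compose returns a dictionary of targets; the augmented image is under "image".
image = transform(image=image)["image"]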

Use AlbumentationsX When

Training-time augmentation where randomness, replayability, and target synchronization matter.
Segmentation, detection, keypoint, OCR, document, satellite, medical, or any multi-target computer vision workflow.
CPU data-loader pipelines where augmentation speed can become the training bottleneck.

Use PIL/Pillow When

Image IO, format conversion, drawing, thumbnails, and lightweight one-off image manipulation.
Small scripts where you only touch a single image and do not need labels, masks, or reproducible random policies.