AlbumentationsX vs Kornia
Compare AlbumentationsX with Kornia for image augmentation: CPU/GPU tradeoffs, RGB benchmark results, and conversion examples.
What Is Different?
Kornia is a differentiable computer vision library that operates on PyTorch tensors. AlbumentationsX is a fast CPU augmentation library that operates on NumPy arrays before the data reaches the model.
- Kornia is tensor-first and shines on batched GPU augmentation; AlbumentationsX is NumPy-first and shines in CPU data-loading pipelines.
- Kornia transforms can be differentiable and participate in model graphs; AlbumentationsX transforms are preprocessing/data augmentation steps.
- AlbumentationsX has broad task-level target handling for images, masks, boxes, keypoints, volumes, and multiple related inputs.
- Kornia is a better fit when augmentation must happen on-device after batching; AlbumentationsX is usually simpler before batching.
Benchmark vs Kornia
The benchmark data is generated during the site build from the published benchmark artifacts. Results cover two setups: the full PyTorch DataLoader pipeline and a single-operation RGB micro benchmark. GPU results appear in the DataLoader table where published artifacts exist; the RGB micro table on this page is CPU-only.
DataLoader Pipeline
End-to-end, in-memory DataLoader throughput with 8 workers and batch size 256; normalization, tensor conversion, batching, and collation are included in the measurement. Each row is labeled only by the transform that varies; the rest of the pipeline stays the same. Higher images/second is better. CPU and GPU columns are shown where published artifacts exist.
| Transform | Speedup (AlbumentationsX / best Kornia) | AlbumentationsX CPU · 2.2.6 | Kornia CPU · 0.8.2 | Kornia GPU · 0.8.2 |
|---|---|---|---|---|
| MedianBlur | 12x | 4038 ± 0 | 89 ± 0 | 327 ± 2 |
| Elastic | 9.0-9.2x | 2878 ± 0 | 103 ± 0 | 316 ± 2 |
| JpegCompression | 5.5x | 4267 ± 0 | 771 ± 0 | 590 ± 2 |
| ColorJiggle | 5.3x | 4255 ± 0 | 800 ± 0 | 529 ± 2 |
| PlasmaBrightness | 4.7-5.1x | 2727 ± 0 | 458 ± 0 | 562 ± 22 |
| CLAHE | 4.4x | 3505 ± 0 | 789 ± 0 | 165 ± 0 |
| ColorJitter | 4.4x | 4299 ± 0 | 987 ± 0 | 573 ± 3 |
| Blur | 4.2x | 5507 ± 0 | 1321 ± 0 | 651 ± 2 |
| Hue | 4.1x | 4723 ± 0 | 1144 ± 0 | 586 ± 6 |
| PlasmaContrast | 4.1x | 2291 ± 0 | 455 ± 0 | 557 ± 2 |
| Saturation | 4.1x | 4646 ± 0 | 1141 ± 0 | 587 ± 4 |
| Sharpen | 3.9x | 4978 ± 0 | 1267 ± 0 | 627 ± 5 |
| GaussianBlur | 3.9x | 5029 ± 0 | 1289 ± 0 | 650 ± 5 |
| MotionBlur | 3.9x | 4785 ± 0 | 1238 ± 0 | 657 ± 2 |
| Snow | 3.8x | 4135 ± 0 | 1102 ± 0 | 596 ± 3 |
| Erasing | 3.6x | 5415 ± 0 | 1522 ± 0 | 503 ± 2 |
| Solarize | 3.4x | 5407 ± 0 | 1609 ± 0 | 644 ± 2 |
| RandomGamma | 3.4x | 5450 ± 0 | 1624 ± 0 | 653 ± 5 |
| Equalize | 3.4x | 4511 ± 0 | 1344 ± 0 | 321 ± 0 |
| RandomRotate90 | 3.3x | 5087 ± 0 | 1520 ± 0 | 632 ± 8 |
| Posterize | 3.3x | 5431 ± 0 | 1660 ± 0 | 562 ± 3 |
| Rotate | 3.3x | 4865 ± 0 | 1488 ± 0 | 642 ± 5 |
| ChannelShuffle | 3.1x | 5467 ± 0 | 1739 ± 0 | 643 ± 2 |
| Perspective | 3.1x | 4132 ± 0 | 1325 ± 0 | 635 ± 1 |
| SaltAndPepper | 3.1x | 4508 ± 0 | 1462 ± 0 | 375 ± 11 |
| ChannelDropout | 3.0x | 5314 ± 0 | 1742 ± 0 | 660 ± 3 |
| RandomResizedCrop | 3.0x | 5056 ± 0 | 1659 ± 0 | 905 ± 7 |
| Contrast | 3.0x | 5435 ± 0 | 1829 ± 0 | 649 ± 5 |
| Invert | 2.9x | 5569 ± 0 | 1898 ± 0 | 659 ± 3 |
| PlasmaShadow | 2.9x | 2797 ± 0 | 956 ± 0 | 612 ± 22 |
| RandomJigsaw | 2.9x | 4863 ± 0 | 1664 ± 0 | 636 ± 2 |
| VerticalFlip | 2.9x | 5512 ± 0 | 1887 ± 0 | 669 ± 2 |
| Grayscale | 2.9x | 5378 ± 0 | 1845 ± 0 | 655 ± 2 |
| Brightness | 2.9x | 5310 ± 0 | 1832 ± 0 | 651 ± 5 |
| Affine | 2.9x | 4528 ± 0 | 1566 ± 0 | 646 ± 2 |
| PlankianJitter | 2.8x | 4979 ± 0 | 1763 ± 0 | 654 ± 8 |
| RGBShift | 2.8x | 4977 ± 0 | 1770 ± 0 | 658 ± 3 |
| RandomCrop224 | 2.8x | 5130 ± 0 | 1856 ± 0 | 959 ± 9 |
| HorizontalFlip | 2.8x | 5002 ± 0 | 1819 ± 0 | 670 ± 2 |
| Rain | 2.7x | 4198 ± 0 | 1541 ± 0 | 306 ± 2 |
| AutoContrast | 2.7x | 4646 ± 0 | 1721 ± 0 | 651 ± 4 |
| Shear | 2.7x | 4141 ± 0 | 1557 ± 0 | — |
| CornerIllumination | 2.6x | 3969 ± 0 | 1502 ± 0 | 395 ± 2 |
| GaussianIllumination | 2.5x | 3656 ± 0 | 1458 ± 0 | — |
| Resize | 2.5x | 1350 ± 0 | 538 ± 0 | 181 ± 0 |
| OpticalDistortion | 2.5x | 3579 ± 0 | 1445 ± 0 | 641 ± 6 |
| SmallestMaxSize | 2.5x | 1370 ± 0 | 555 ± 0 | 179 ± 0 |
| LongestMaxSize | 2.4x | 1363 ± 0 | 560 ± 0 | 179 ± 0 |
| LinearIllumination | 2.4x | 4179 ± 0 | 1726 ± 0 | 503 ± 2 |
| GaussianNoise | 2.1x | 3415 ± 0 | 1634 ± 0 | 660 ± 7 |
| ThinPlateSpline | 0.9x | 743 ± 0 | 793 ± 0 | 598 ± 2 |
| EnhanceDetail | — | 5033 ± 0 | — | — |
| EnhanceEdge | — | 4923 ± 0 | — | — |
| Pad | — | 4996 ± 0 | — | — |
| PhotoMetricDistort | — | 4269 ± 0 | — | — |
| Transpose | — | 5338 ± 0 | — | — |
| UnsharpMask | — | 4635 ± 0 | — | — |
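The speedup column can be read as AlbumentationsX throughput divided by the faster of the two Kornia columns. The sketch below reproduces that point estimate for a few rows copied from the table. The helper name `speedup_vs_best` is hypothetical, and the published ranges (for example 9.0-9.2x) appear to fold in the ± values in a way this sketch does not attempt to reproduce.

```python
from typing import Optional


def speedup_vs_best(albx: float, *kornia: Optional[float]) -> Optional[float]:
    """Divide AlbumentationsX throughput by the fastest available Kornia throughput.

    Hypothetical helper for illustration; None stands for a missing table cell.
    """
    available = [value for value in kornia if value is not None]
    if not available:
        return None  # no Kornia result to compare against
    return albx / max(available)


# (AlbumentationsX CPU, Kornia CPU, Kornia GPU) in images/second, copied from the table.
rows = {
    "MedianBlur": (4038, 89, 327),
    "JpegCompression": (4267, 771, 590),
    "Shear": (4141, 1557, None),
    "ThinPlateSpline": (743, 793, 598),
    "Pad": (4996, None, None),
}

for name, (albx, kornia_cpu, kornia_gpu) in rows.items():
    ratio = speedup_vs_best(albx, kornia_cpu, kornia_gpu)
    print(name, "n/a" if ratio is None else f"{ratio:.1f}x")
```

A ratio below 1, as for ThinPlateSpline, means the best Kornia configuration was faster for that transform.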
Micro Benchmark
The benchmark below compares single-threaded RGB image augmentation throughput. Kornia receives normalized tensors; AlbumentationsX receives OpenCV-loaded RGB NumPy arrays. Higher images/second is better.
| Transform | Speedup (AlbumentationsX / Kornia) | AlbumentationsX 2.2.6 | Kornia 0.8.2 |
|---|---|---|---|
| Blur | 78-79x | 4449 ± 17 | 57 ± 0 |
| Posterize | 48-51x | 14399 ± 58 | 290 ± 9 |
| Solarize | 45-46x | 9760 ± 34 | 214 ± 1 |
| MedianBlur | ≥42x | 843 ± 4 | ≤20 |
| GaussianBlur | 41x | 2343 ± 4 | 57 ± 0 |
| RandomCrop224 | 39-40x | 38380 ± 192 | 981 ± 5 |
| RandomGamma | 32-33x | 9938 ± 46 | 308 ± 2 |
| Erasing | 32x | 9511 ± 74 | 298 ± 1 |
| MotionBlur | 24-25x | 1953 ± 21 | 81 ± 1 |
| Sharpen | 24x | 1388 ± 5 | 58 ± 0 |
| RandomJigsaw | 23-24x | 5172 ± 16 | 219 ± 2 |
| ColorJiggle | 19x | 639 ± 5 | 34 ± 0 |
| RandomRotate90 | 18x | 5990 ± 85 | 333 ± 4 |
| JpegCompression | 16x | 692 ± 7 | 43 ± 0 |
| Invert | 15x | 15095 ± 61 | 1015 ± 2 |
| Hue | 15x | 967 ± 1 | 66 ± 0 |
| PlasmaBrightness | ≥13x | 267 ± 1 | ≤20 |
| VerticalFlip | 13x | 14051 ± 55 | 1067 ± 2 |
| Saturation | 12-13x | 847 ± 17 | 67 ± 0 |
| Grayscale | 12x | 5194 ± 1 | 418 ± 1 |
| ColorJitter | 12-13x | 641 ± 1 | 52 ± 1 |
| RandomResizedCrop | 11-12x | 7150 ± 19 | 622 ± 3 |
| Elastic | ≥9.5x | 191 ± 0 | ≤20 |
| SmallestMaxSize | 9.3-9.6x | 2017 ± 25 | 214 ± 1 |
| HorizontalFlip | 9.0-9.3x | 8416 ± 19 | 920 ± 10 |
| Resize | 8.9-9.3x | 2463 ± 37 | 271 ± 1 |
| Brightness | 8.9-9.1x | 6912 ± 13 | 766 ± 7 |
| Contrast | 8.8-9.1x | 6933 ± 30 | 771 ± 9 |
| ChannelShuffle | 8.9-9.0x | 4337 ± 13 | 487 ± 2 |
| LongestMaxSize | 8.4-8.7x | 2825 ± 42 | 330 ± 1 |
| ChannelDropout | 8.1-8.4x | 6810 ± 65 | 828 ± 6 |
| PlasmaShadow | 7.9-8.0x | 420 ± 3 | 53 ± 0 |
| Snow | 7.8-7.9x | 489 ± 3 | 62 ± 0 |
| PlasmaContrast | ≥7.1x | 143 ± 0 | ≤20 |
| Equalize | 6.3x | 807 ± 3 | 128 ± 0 |
| AutoContrast | 5.3-5.5x | 1243 ± 19 | 231 ± 1 |
| SaltAndPepper | 4.7-4.9x | 738 ± 10 | 154 ± 1 |
| GaussianNoise | 4.6x | 225 ± 0 | 49 ± 0 |
| CLAHE | 4.6x | 283 ± 1 | 62 ± 0 |
| Rotate | 4.2-4.5x | 1408 ± 40 | 325 ± 2 |
| PlankianJitter | 3.8-3.9x | 2253 ± 17 | 580 ± 2 |
| RGBShift | 3.8-3.9x | 2292 ± 3 | 597 ± 3 |
| Perspective | 3.1x | 559 ± 2 | 181 ± 1 |
| CornerIllumination | 2.7x | 425 ± 2 | 157 ± 0 |
| Rain | 2.4x | 1259 ± 2 | 527 ± 5 |
| Affine | 2.1-2.2x | 872 ± 8 | 402 ± 3 |
| GaussianIllumination | 2.1x | 388 ± 1 | 188 ± 0 |
| Shear | 1.9-2.0x | 784 ± 6 | 403 ± 1 |
| LinearIllumination | 1.6x | 521 ± 1 | 327 ± 3 |
| ThinPlateSpline | 1.4x | 52 ± 0 | 36 ± 0 |
| OpticalDistortion | 1.4x | 274 ± 1 | 201 ± 1 |
| EnhanceDetail | — | 2148 ± 13 | — |
| EnhanceEdge | — | 1373 ± 16 | — |
| Pad | — | 13181 ± 118 | — |
| PhotoMetricDistort | — | 581 ± 4 | — |
| Transpose | — | 4627 ± 26 | — |
| UnsharpMask | — | 906 ± 2 | — |
See the aggregate image benchmark or inspect the benchmark source code.
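For readers who want to sanity-check a single transform on their own machine, a single-threaded throughput measurement can be sketched as follows. This is not the published benchmark harness: `measure_throughput` and the `invert` stand-in are hypothetical, the images are plain Python lists rather than NumPy arrays or tensors, and treating the ± value as a standard deviation across runs is an assumption.

```python
import statistics
import time
from typing import Callable, List, Sequence, Tuple


def measure_throughput(
    transform: Callable[[list], list],
    images: Sequence[list],
    runs: int = 5,
) -> Tuple[float, float]:
    """Return (mean, standard deviation) of images/second over several timed passes."""
    if runs < 2:
        raise ValueError("need at least two runs to report a spread")
    for image in images:  # warm-up pass, not timed
        transform(image)
    rates: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        for image in images:
            transform(image)
        elapsed = time.perf_counter() - start
        rates.append(len(images) / elapsed)
    return statistics.mean(rates), statistics.stdev(rates)


def invert(pixels: list) -> list:
    """Hypothetical stand-in for a real augmentation call."""
    return [255 - p for p in pixels]


# Synthetic "images": 200 flat lists of 1024 pixel values each.
images = [[i % 256 for i in range(1024)] for _ in range(200)]
mean_rate, spread = measure_throughput(invert, images)
print(f"{mean_rate:.0f} ± {spread:.0f} images/second")
```

To compare libraries fairly, each one should receive its native input type, as described above for the micro benchmark.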
Conversion Guide
Converting from Kornia to AlbumentationsX usually moves augmentation out of the training step, where it runs on tensor batches, and into the Dataset/DataLoader preprocessing path, where it runs on individual samples.
- Apply AlbumentationsX before converting the sample to a tensor.
- Use ToTensorV2 at the end of the pipeline if the model expects PyTorch tensors.
- Move per-batch GPU-only transforms to AlbumentationsX only when they do not require differentiability or batched tensor semantics.
- Keep Kornia for model-integrated or differentiable computer vision operations.
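The first point can be sketched as a map-style dataset that applies the transform to each sample inside `__getitem__`, before any batching happens. The class name `AugmentedDataset` and the `flip_rows` stand-in are hypothetical; the only convention borrowed from AlbumentationsX is the call shape, keyword arguments in and a dict out.

```python
from typing import Any, Callable, Dict, Optional, Sequence, Tuple


class AugmentedDataset:
    """Map-style dataset that augments each sample before batching (illustrative)."""

    def __init__(
        self,
        images: Sequence[Any],
        labels: Sequence[int],
        transform: Optional[Callable[..., Dict[str, Any]]] = None,
    ) -> None:
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, index: int) -> Tuple[Any, int]:
        image = self.images[index]
        if self.transform is not None:
            # Same call shape as an AlbumentationsX pipeline: transform(image=...)["image"]
            image = self.transform(image=image)["image"]
        return image, self.labels[index]


def flip_rows(*, image: list) -> Dict[str, list]:
    """Hypothetical stand-in for a composed augmentation pipeline."""
    return {"image": image[::-1]}


dataset = AugmentedDataset(images=[[1, 2, 3], [4, 5, 6]], labels=[0, 1], transform=flip_rows)
image, label = dataset[0]
print(image, label)
```

In a real pipeline the images would be NumPy arrays, `transform` would be an `A.Compose([...])` pipeline ending in `ToTensorV2()`, and the class would typically subclass `torch.utils.data.Dataset` and be wrapped in a `DataLoader`, so the augmentation runs in the worker processes.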
Kornia (per batch, in the training step):

```python
import kornia.augmentation as K
import torch.nn as nn

augment = nn.Sequential(
    K.RandomHorizontalFlip(p=0.5),
    K.ColorJitter(brightness=0.2, contrast=0.2, p=0.5),
)
batch = augment(batch)
```

AlbumentationsX (per sample, before batching):

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
    ToTensorV2(),
])
sample = transform(image=image_np)
image = sample["image"]
```

Use AlbumentationsX When
- You augment on the CPU before batching.
- You run classic supervised CV pipelines with masks, bounding boxes, keypoints, or multiple aligned images.
- Augmentation speed in the input pipeline matters more than differentiability.
Use Kornia When
- You need differentiable image processing inside PyTorch models.
- You want batched GPU augmentation, especially when the input pipeline is already tensor-native.
- Your research code needs gradients through geometric or photometric image operations.