
Video Benchmark Results

Video Benchmark Summary

Video Augmentation Benchmarks

This directory contains benchmark results for video augmentation libraries.

Overview

The video benchmarks measure the throughput of augmentation libraries on video transforms. They compare CPU-based processing (Albumentations) with GPU-accelerated processing (Kornia and torchvision).

Dataset

The benchmarks use the UCF101 dataset, which contains 13,320 videos from 101 action categories. The videos are realistic, collected from YouTube, and include a wide variety of camera motion, object appearance, pose, scale, viewpoint, and background. This makes it an excellent dataset for benchmarking video augmentation performance across diverse real-world scenarios.

You can download the dataset from: https://www.crcv.ucf.edu/data/UCF101/UCF101.rar
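The metadata later in this document reports `num_videos: 200`, so each run works on a subset of the dataset. The following is a minimal sketch of how such a subset could be drawn from an extracted copy of UCF101. It assumes the archive unpacks to `.avi` files in per-class folders; the function name `sample_videos` and the fixed seed are illustrative and are not taken from the benchmark code.

```python
import random
from pathlib import Path


def sample_videos(root, num_videos=200, seed=0, suffix=".avi"):
    """Return a reproducible sample of video paths found under `root`."""
    # Sort first so the sample does not depend on filesystem ordering.
    paths = sorted(p for p in Path(root).rglob(f"*{suffix}") if p.is_file())
    if len(paths) <= num_videos:
        return paths
    # A seeded generator gives the same subset on every run.
    rng = random.Random(seed)
    return sorted(rng.sample(paths, num_videos))
```

Fixing the seed keeps the subset identical across libraries, which matters when their throughput numbers are compared side by side.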

Methodology

  1. Video Loading: Videos are loaded using library-specific loaders:
    • OpenCV for Albumentations
    • PyTorch tensors for Kornia
  2. Warmup Phase:
    • Performs adaptive warmup until performance variance stabilizes
    • Uses configurable parameters for stability detection
    • Implements early stopping for slow transforms
  3. Measurement Phase:
    • Multiple runs of each transform
    • Measures throughput (videos/second)
    • Calculates statistical metrics (median, standard deviation)
  4. Environment Control:
    • CPU benchmarks are run single-threaded
    • GPU benchmarks utilize the specified GPU device
    • Thread settings are controlled for consistent results
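The warmup and measurement phases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark's actual code. The parameter names mirror the `benchmark_params` in the metadata below, but the stability rule (relative spread of window means under the threshold) is a guess at how they are used. The helper names `is_stable` and `benchmark` are hypothetical, and early stopping for slow transforms is left out.

```python
import statistics
import time


def is_stable(times, window=5, threshold=0.05, min_windows=3):
    """True once the last `min_windows` windows have similar mean times."""
    if len(times) < window * min_windows:
        return False
    recent = times[-window * min_windows:]
    means = [statistics.mean(recent[i:i + window])
             for i in range(0, len(recent), window)]
    overall = statistics.mean(means)
    if overall == 0:
        return True
    # Relative spread of the window means, compared with the threshold.
    return statistics.pstdev(means) / overall < threshold


def benchmark(transform, videos, num_runs=5, max_warmup_iterations=100,
              window=5, threshold=0.05, min_windows=3,
              timer=time.perf_counter):
    """Warm up adaptively, then return (median, std) throughput in videos/s.

    `num_runs` must be at least 2 so that a standard deviation exists.
    """
    warmup_times = []
    for _ in range(max_warmup_iterations):
        start = timer()
        transform(videos[0])
        warmup_times.append(timer() - start)
        if is_stable(warmup_times, window, threshold, min_windows):
            break

    throughputs = []
    for _ in range(num_runs):
        start = timer()
        for video in videos:
            transform(video)
        elapsed = timer() - start
        throughputs.append(len(videos) / elapsed)
    return statistics.median(throughputs), statistics.stdev(throughputs)
```

For GPU transforms, a harness like this would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the timer. Otherwise it only measures how long it takes to queue the work.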

Hardware Comparison

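The benchmarks compare:

  • Albumentations: CPU-based processing (single thread, recorded on a macOS arm64 machine)
  • Kornia: GPU-accelerated processing (NVIDIA GeForce RTX 4090, Linux x86_64)
  • torchvision: GPU-accelerated processing (NVIDIA GeForce RTX 4090, Linux x86_64)

This provides insight into the trade-offs between CPU and GPU processing for video augmentation.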
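The single-thread setup used for the CPU runs can be approximated as below. The environment variable names are the ones listed under `thread_settings` in the metadata later in this document. The function name `limit_threads` is hypothetical, and whether the benchmark scripts configure OpenCV and PyTorch in exactly this way is an assumption.

```python
import os

THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "VECLIB_MAXIMUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def limit_threads(num_threads=1):
    """Pin common numeric backends to `num_threads` threads."""
    # These variables are read when the numeric libraries load,
    # so call this before importing NumPy, OpenCV, or PyTorch.
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(num_threads)
    try:
        import cv2
        cv2.setNumThreads(num_threads)
        cv2.ocl.setUseOpenCL(False)
    except ImportError:
        pass  # OpenCV is optional in this sketch.
    try:
        import torch
        torch.set_num_threads(num_threads)
    except ImportError:
        pass  # PyTorch is optional in this sketch.
    return {name: os.environ[name] for name in THREAD_ENV_VARS}
```

Calling `limit_threads(1)` at the very top of a script, before other imports, is the safest order, since some backends fix their thread pools at import time.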

Running the Benchmarks

To run the video benchmarks:

./run_video_single.sh -l albumentations -d /path/to/videos -o /path/to/output

To run all libraries and generate a comparison:

./run_video_all.sh -d /path/to/videos -o /path/to/output

Benchmark Results

Video Benchmark Results

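Each number is the throughput in videos per second, shown with its ± variation across runs; larger is better. The Speedup column is the Albumentations throughput divided by that of the fastest other library for the same transform, so values below 1x mean another library was faster. N/A means no result is available for that library.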

| Transform | albumentations (videos per second), arm (1 core) | kornia (videos per second), NVIDIA GeForce RTX 4090 | torchvision (videos per second), NVIDIA GeForce RTX 4090 | Speedup (Alb/fastest other) |
|---|---|---|---|---|
| Affine | 4.05 ± 0.16 | 21.39 ± 0.05 | 452.58 ± 0.14 | 0.01x |
| AutoContrast | 19.51 ± 0.21 | 21.41 ± 0.02 | 577.72 ± 16.86 | 0.03x |
| Blur | 43.58 ± 1.95 | 20.61 ± 0.06 | N/A | 2.11x |
| Brightness | 177.48 ± 8.78 | 21.85 ± 0.02 | 755.52 ± 435.17 | 0.23x |
| CLAHE | 8.49 ± 0.24 | N/A | N/A | N/A |
| CenterCrop128 | 781.08 ± 31.02 | 70.12 ± 1.29 | 1133.39 ± 234.60 | 0.69x |
| ChannelDropout | 59.40 ± 1.73 | 21.81 ± 0.03 | N/A | 2.72x |
| ChannelShuffle | 22.11 ± 0.13 | 19.99 ± 0.03 | 958.35 ± 0.20 | 0.02x |
| CoarseDropout | 299.85 ± 7.78 | N/A | N/A | N/A |
| ColorJitter | 10.38 ± 0.39 | 18.79 ± 0.03 | 68.75 ± 0.13 | 0.15x |
| Contrast | 175.39 ± 6.60 | 21.69 ± 0.04 | 546.55 ± 13.23 | 0.32x |
| CornerIllumination | 4.86 ± 0.15 | 2.60 ± 0.07 | N/A | 1.87x |
| Elastic | 4.23 ± 0.02 | N/A | 126.83 ± 1.28 | 0.03x |
| Equalize | 12.90 ± 0.37 | 4.21 ± 0.00 | 191.55 ± 1.25 | 0.07x |
| Erasing | 365.07 ± 7.73 | N/A | 254.59 ± 6.57 | 1.43x |
| GaussianBlur | 25.92 ± 0.26 | 21.61 ± 0.05 | 543.44 ± 11.50 | 0.05x |
| GaussianIllumination | 6.47 ± 0.38 | 20.33 ± 0.08 | N/A | 0.32x |
| GaussianNoise | 9.00 ± 0.28 | 22.38 ± 0.08 | N/A | 0.40x |
| Grayscale | 147.73 ± 2.65 | 22.24 ± 0.04 | 838.40 ± 466.76 | 0.18x |
| HSV | 6.62 ± 0.07 | N/A | N/A | N/A |
| HorizontalFlip | 26.98 ± 0.18 | 21.86 ± 0.07 | 977.87 ± 49.03 | 0.03x |
| Hue | 13.68 ± 0.50 | 19.53 ± 0.02 | N/A | 0.70x |
| Invert | 306.14 ± 23.13 | 21.91 ± 0.23 | 843.27 ± 176.00 | 0.36x |
| JpegCompression | 19.85 ± 0.25 | N/A | N/A | N/A |
| LinearIllumination | 4.74 ± 0.17 | 4.29 ± 0.19 | N/A | 1.10x |
| MedianBlur | 13.15 ± 0.15 | 8.39 ± 0.09 | N/A | 1.57x |
| MotionBlur | 40.11 ± 0.83 | N/A | N/A | N/A |
| Normalize | 21.13 ± 0.31 | 21.82 ± 0.02 | 460.80 ± 0.18 | 0.05x |
| OpticalDistortion | 4.62 ± 0.02 | N/A | N/A | N/A |
| Pad | 217.81 ± 1.31 | N/A | 759.68 ± 337.78 | 0.29x |
| Perspective | 4.14 ± 0.13 | N/A | 434.75 ± 0.14 | 0.01x |
| PlankianJitter | 24.86 ± 1.67 | 10.85 ± 0.01 | N/A | 2.29x |
| PlasmaBrightness | 3.32 ± 0.05 | 16.94 ± 0.36 | N/A | 0.20x |
| PlasmaContrast | 2.50 ± 0.02 | 16.97 ± 0.03 | N/A | 0.15x |
| PlasmaShadow | 5.74 ± 0.24 | 19.03 ± 0.50 | N/A | 0.30x |
| Posterize | 58.97 ± 1.09 | N/A | 631.46 ± 14.74 | 0.09x |
| RGBShift | 31.06 ± 0.72 | 22.27 ± 0.04 | N/A | 1.39x |
| Rain | 24.82 ± 0.29 | 3.77 ± 0.00 | N/A | 6.58x |
| RandomCrop128 | 755.55 ± 24.20 | 65.33 ± 0.35 | 1132.79 ± 15.23 | 0.67x |
| RandomGamma | 184.46 ± 2.85 | 21.63 ± 0.02 | N/A | 8.53x |
| RandomResizedCrop | 15.95 ± 0.92 | 6.29 ± 0.03 | 182.09 ± 15.75 | 0.09x |
| Resize | 14.08 ± 0.61 | 5.87 ± 0.03 | 139.96 ± 35.04 | 0.10x |
| Rotate | 28.29 ± 1.82 | 21.53 ± 0.05 | 534.18 ± 0.16 | 0.05x |
| SaltAndPepper | 10.00 ± 0.06 | 8.82 ± 0.12 | N/A | 1.13x |
| Saturation | 9.01 ± 0.05 | 36.56 ± 0.12 | N/A | 0.25x |
| Sharpen | 24.77 ± 0.47 | 17.86 ± 0.03 | 420.09 ± 8.99 | 0.06x |
| Shear | 4.47 ± 0.02 | N/A | N/A | N/A |
| Snow | 12.62 ± 0.28 | N/A | N/A | N/A |
| Solarize | 51.74 ± 1.30 | 20.73 ± 0.02 | 628.42 ± 5.91 | 0.08x |
| ThinPlateSpline | 4.33 ± 0.02 | 44.90 ± 0.67 | N/A | 0.10x |
| VerticalFlip | 394.56 ± 5.96 | 21.96 ± 0.24 | 977.92 ± 5.22 | 0.40x |
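As a worked example of the Speedup column, the sketch below recomputes it for three rows using the throughputs listed in the table. The function name `speedup` and the use of `None` for N/A entries are illustrative choices, not part of the benchmark code.

```python
def speedup(albumentations, others):
    """Albumentations throughput divided by the fastest other library.

    `others` holds the other libraries' throughputs; None marks N/A.
    Returns None when no other library has a result for the transform.
    """
    available = [value for value in others if value is not None]
    if not available:
        return None
    return albumentations / max(available)


# Throughputs (videos/s) copied from three rows of the table:
# (albumentations, kornia, torchvision)
rows = {
    "Affine": (4.05, 21.39, 452.58),
    "Blur": (43.58, 20.61, None),
    "CLAHE": (8.49, None, None),
}

for name, (alb, kornia, tv) in rows.items():
    value = speedup(alb, [kornia, tv])
    print(name, "N/A" if value is None else f"{value:.2f}x")
```

This prints `0.01x` for Affine, `2.11x` for Blur, and `N/A` for CLAHE, matching the table.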

Torchvision Metadata

system_info:
  python_version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:56:27)
    [GCC 11.2.0]
  platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.31
  processor: x86_64
  cpu_count: '64'
  timestamp: '2025-03-11T11:14:57.765540+00:00'
library_versions:
  torchvision: 0.21.0
  numpy: 2.2.3
  pillow: 11.1.0
  opencv-python-headless: not installed
  torch: 2.6.0
  opencv-python: not installed
thread_settings:
  environment:
    OMP_NUM_THREADS: '1'
    OPENBLAS_NUM_THREADS: '1'
    MKL_NUM_THREADS: '1'
    VECLIB_MAXIMUM_THREADS: '1'
    NUMEXPR_NUM_THREADS: '1'
  opencv: not installed
  pytorch:
    threads: 32
    gpu_available: true
    gpu_device: 0
    gpu_name: NVIDIA GeForce RTX 4090
    gpu_memory_total: 23.55084228515625
    gpu_memory_allocated: 15.05643081665039
  pillow:
    threads: unknown
    simd: false
benchmark_params:
  num_videos: 200
  num_runs: 10
  max_warmup_iterations: 100
  warmup_window: 5
  warmup_threshold: 0.05
  min_warmup_windows: 3
precision: torch.float16

Kornia Metadata

system_info:
  python_version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:56:27)
    [GCC 11.2.0]
  platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.31
  processor: x86_64
  cpu_count: '64'
  timestamp: '2025-03-11T00:46:14.791885+00:00'
library_versions:
  kornia: 0.8.0
  numpy: 2.2.3
  pillow: 11.1.0
  opencv-python-headless: not installed
  torch: 2.6.0
  opencv-python: not installed
thread_settings:
  environment:
    OMP_NUM_THREADS: '1'
    OPENBLAS_NUM_THREADS: '1'
    MKL_NUM_THREADS: '1'
    VECLIB_MAXIMUM_THREADS: '1'
    NUMEXPR_NUM_THREADS: '1'
  opencv: not installed
  pytorch:
    threads: 32
    gpu_available: true
    gpu_device: 0
    gpu_name: NVIDIA GeForce RTX 4090
    gpu_memory_total: 23.55084228515625
    gpu_memory_allocated: 15.05643081665039
  pillow:
    threads: unknown
    simd: false
benchmark_params:
  num_videos: 200
  num_runs: 5
  max_warmup_iterations: 100
  warmup_window: 5
  warmup_threshold: 0.05
  min_warmup_windows: 3
precision: torch.float16

Albumentations Metadata

system_info:
  python_version: 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 10:37:40)
    [Clang 14.0.6 ]
  platform: macOS-15.5-arm64-arm-64bit
  processor: arm
  cpu_count: '16'
  timestamp: '2025-05-26T23:29:19.852599+00:00'
library_versions:
  albumentations: 2.0.7
  numpy: 2.2.6
  pillow: 11.2.1
  opencv-python-headless: 4.11.0.86
  torch: 2.7.0
  opencv-python: not installed
thread_settings:
  environment:
    OMP_NUM_THREADS: '1'
    OPENBLAS_NUM_THREADS: '1'
    MKL_NUM_THREADS: '1'
    VECLIB_MAXIMUM_THREADS: '1'
    NUMEXPR_NUM_THREADS: '1'
  opencv:
    threads: 1
    opencl: false
  pytorch:
    threads: 1
    gpu_available: false
    gpu_device: null
  pillow:
    threads: unknown
    simd: false
benchmark_params:
  num_videos: 200
  num_runs: 5
  max_warmup_iterations: 100
  warmup_window: 5
  warmup_threshold: 0.05
  min_warmup_windows: 3

Analysis

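The benchmark results show trade-offs between CPU and GPU processing. In the results table, single-core Albumentations is faster than Kornia on the GPU for many transforms (for example RandomGamma at 8.53x and Rain at 6.58x), while torchvision on the GPU is faster than Albumentations for every transform it supports except Erasing. In general: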

  • CPU Advantages:
    • Better for simple transformations with low computational complexity
    • No data transfer overhead between CPU and GPU
    • More consistent performance across different transform types
  • GPU Advantages:
    • Significantly faster for complex transformations
    • Better scaling with video resolution
    • More efficient for batch processing

Recommendations

Based on the benchmark results, we recommend:

  1. For simple transformations on a small number of videos, CPU processing may be sufficient
  2. For complex transformations or batch processing, GPU acceleration provides significant benefits
  3. Check the per-transform numbers for the transforms you actually use, since the relative performance of CPU and GPU varies by transform and by library