Full API Reference on a single page¶
Pixel-level transforms¶
Here is a list of all available pixel-level transforms. A pixel-level transform can be used in a pipeline with any targets: it changes only the input image and returns all other input targets, such as masks, bounding boxes, or keypoints, unchanged.
- AdvancedBlur
- Blur
- CLAHE
- ChannelDropout
- ChannelShuffle
- ColorJitter
- Defocus
- Downscale
- Emboss
- Equalize
- FDA
- FancyPCA
- FromFloat
- GaussNoise
- GaussianBlur
- GlassBlur
- HistogramMatching
- HueSaturationValue
- ISONoise
- ImageCompression
- InvertImg
- MedianBlur
- MotionBlur
- MultiplicativeNoise
- Normalize
- PixelDistributionAdaptation
- Posterize
- RGBShift
- RandomBrightnessContrast
- RandomFog
- RandomGamma
- RandomGravel
- RandomRain
- RandomShadow
- RandomSnow
- RandomSunFlare
- RandomToneCurve
- RingingOvershoot
- Sharpen
- Solarize
- Spatter
- Superpixels
- TemplateTransform
- ToFloat
- ToGray
- ToRGB
- ToSepia
- UnsharpMask
- ZoomBlur
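The pass-through behavior of pixel-level transforms can be illustrated with a small sketch. It uses plain Python lists instead of Albumentations, and `invert_pixels` and `apply_pixel_level` are hypothetical helpers written for this illustration, not library functions.

```python
def invert_pixels(image, max_value=255):
    """Toy pixel-level operation: invert every pixel of a 2D grayscale image."""
    return [[max_value - px for px in row] for row in image]


def apply_pixel_level(fn, image, **other_targets):
    """Apply `fn` to the image only and return every other target untouched."""
    result = {"image": fn(image)}
    result.update(other_targets)  # masks, bboxes, keypoints pass through as-is
    return result


image = [[0, 10], [200, 255]]
mask = [[0, 1], [1, 0]]
bboxes = [(0.0, 0.0, 0.5, 0.5)]

out = apply_pixel_level(invert_pixels, image, mask=mask, bboxes=bboxes)
```

Only `out["image"]` differs from the input; `out["mask"]` and `out["bboxes"]` are the same objects that were passed in.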
Spatial-level transforms¶
Here is a table with spatial-level transforms and targets they support. If you try to apply a spatial-level transform to an unsupported target, Albumentations will raise an error.
Transform | Image | Masks | BBoxes | Keypoints |
---|---|---|---|---|
Affine | ✓ | ✓ | ✓ | ✓ |
BBoxSafeRandomCrop | ✓ | ✓ | ✓ | |
CenterCrop | ✓ | ✓ | ✓ | ✓ |
CoarseDropout | ✓ | ✓ | | ✓ |
Crop | ✓ | ✓ | ✓ | ✓ |
CropAndPad | ✓ | ✓ | ✓ | ✓ |
CropNonEmptyMaskIfExists | ✓ | ✓ | ✓ | ✓ |
ElasticTransform | ✓ | ✓ | ✓ | |
Flip | ✓ | ✓ | ✓ | ✓ |
GridDistortion | ✓ | ✓ | ✓ | |
GridDropout | ✓ | ✓ | | |
HorizontalFlip | ✓ | ✓ | ✓ | ✓ |
Lambda | ✓ | ✓ | ✓ | ✓ |
LongestMaxSize | ✓ | ✓ | ✓ | ✓ |
MaskDropout | ✓ | ✓ | | |
NoOp | ✓ | ✓ | ✓ | ✓ |
OpticalDistortion | ✓ | ✓ | ✓ | |
PadIfNeeded | ✓ | ✓ | ✓ | ✓ |
Perspective | ✓ | ✓ | ✓ | ✓ |
PiecewiseAffine | ✓ | ✓ | ✓ | ✓ |
PixelDropout | ✓ | ✓ | ✓ | ✓ |
RandomCrop | ✓ | ✓ | ✓ | ✓ |
RandomCropFromBorders | ✓ | ✓ | ✓ | ✓ |
RandomCropNearBBox | ✓ | ✓ | ✓ | ✓ |
RandomGridShuffle | ✓ | ✓ | ✓ | |
RandomResizedCrop | ✓ | ✓ | ✓ | ✓ |
RandomRotate90 | ✓ | ✓ | ✓ | ✓ |
RandomScale | ✓ | ✓ | ✓ | ✓ |
RandomSizedBBoxSafeCrop | ✓ | ✓ | ✓ | |
RandomSizedCrop | ✓ | ✓ | ✓ | ✓ |
Resize | ✓ | ✓ | ✓ | ✓ |
Rotate | ✓ | ✓ | ✓ | ✓ |
SafeRotate | ✓ | ✓ | ✓ | ✓ |
ShiftScaleRotate | ✓ | ✓ | ✓ | ✓ |
SmallestMaxSize | ✓ | ✓ | ✓ | ✓ |
Transpose | ✓ | ✓ | ✓ | ✓ |
VerticalFlip | ✓ | ✓ | ✓ | ✓ |
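A spatial-level transform has to update every supported target consistently, which is why each transform lists the targets it supports. The sketch below shows this for a horizontal flip in plain Python. The helper names are hypothetical, and the coordinate conventions (normalized `(x_min, y_min, x_max, y_max)` boxes, pixel-coordinate keypoints) are assumptions made for the illustration.

```python
def hflip_image(image):
    """Flip a 2D image (list of rows) left to right. Masks use the same rule."""
    return [row[::-1] for row in image]


def hflip_bbox(bbox):
    """Flip a bbox given as (x_min, y_min, x_max, y_max), normalized to [0, 1]."""
    x_min, y_min, x_max, y_max = bbox
    # The old right edge becomes the new left edge and vice versa.
    return (1 - x_max, y_min, 1 - x_min, y_max)


def hflip_keypoint(keypoint, cols):
    """Flip an (x, y) keypoint in pixel coordinates; `cols` is the image width."""
    x, y = keypoint
    return ((cols - 1) - x, y)
```

If a target has no such rule for a given transform, the only safe behavior is to refuse it, which matches the error described above.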
albumentations.augmentations
special
¶
albumentations.augmentations.blur
special
¶
albumentations.augmentations.blur.transforms
¶
class
albumentations.augmentations.blur.transforms.AdvancedBlur
(blur_limit=(3, 7), sigmaX_limit=(0.2, 1.0), sigmaY_limit=(0.2, 1.0), rotate_limit=90, beta_limit=(0.5, 8.0), noise_limit=(0.9, 1.1), always_apply=False, p=0.5)
[view source on GitHub]
¶
Blur the input image using a Generalized Normal filter with randomly selected parameters. This transform also adds multiplicative noise to the generated kernel before convolution.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
maximum Gaussian kernel size for blurring the input image.
Must be zero or odd and in range [0, inf). If set to 0 it will be computed from sigma
as |
|
sigmaX_limit |
Gaussian kernel standard deviation. Must be in range [0, inf).
If set single value |
|
sigmaY_limit |
Same as |
|
rotate_limit |
Range from which a random angle used to rotate Gaussian kernel is picked. If limit is a single int an angle is picked from (-rotate_limit, rotate_limit). Default: (-90, 90). |
|
beta_limit |
Distribution shape parameter, 1 is the normal distribution. Values below 1.0 make distribution tails heavier than normal, values above 1.0 make it lighter than normal. Default: (0.5, 8.0). |
|
noise_limit |
Multiplicative factor that controls the strength of kernel noise. Must be positive and preferably
centered around 1.0. If set single value |
|
p |
float |
probability of applying the transform. Default: 0.5. |
Reference: https://arxiv.org/abs/2107.10833
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.blur.transforms.Blur
(blur_limit=7, always_apply=False, p=0.5)
[view source on GitHub]
¶
Blur the input image using a random-sized kernel.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
int, [int, int] |
maximum kernel size for blurring the input image. Should be in range [3, inf). Default: (3, 7). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
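As a rough illustration of blurring with a random-sized kernel, the sketch below implements a mean (box) filter on a 2D list and draws the kernel size from `blur_limit`. It is a pure-Python approximation, not the library's implementation; the edge-replication border handling and the helper names are assumptions.

```python
import random


def box_blur(image, ksize):
    """Mean-filter a 2D grayscale image (list of rows) with a ksize x ksize window.

    Coordinates outside the image are clamped to the nearest edge pixel.
    """
    rows, cols = len(image), len(image[0])
    lo = -(ksize // 2)
    hi = lo + ksize  # the window covers offsets lo .. hi - 1
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            total = 0
            for dy in range(lo, hi):
                for dx in range(lo, hi):
                    yy = min(max(y + dy, 0), rows - 1)
                    xx = min(max(x + dx, 0), cols - 1)
                    total += image[yy][xx]
            out_row.append(total / (ksize * ksize))
        out.append(out_row)
    return out


def random_blur(image, blur_limit=(3, 7), rng=random):
    """Pick a kernel size uniformly from blur_limit (inclusive) and blur."""
    ksize = rng.randint(blur_limit[0], blur_limit[1])
    return box_blur(image, ksize), ksize
```

Because the kernel size changes from call to call, the same image is blurred by a different amount each time the transform fires.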
class
albumentations.augmentations.blur.transforms.Defocus
(radius=(3, 10), alias_blur=(0.1, 0.5), always_apply=False, p=0.5)
[view source on GitHub]
¶
Apply defocus transform. See https://arxiv.org/abs/1903.12261.
Parameters:
Name | Type | Description |
---|---|---|
radius |
[int, int] or int |
range for radius of defocusing. If limit is a single int, the range will be [1, limit]. Default: (3, 10). |
alias_blur |
[float, float] or float |
range for alias_blur of defocusing (sigma of gaussian blur). If limit is a single float, the range will be (0, limit). Default: (0.1, 0.5). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: Any
class
albumentations.augmentations.blur.transforms.GaussianBlur
(blur_limit=(3, 7), sigma_limit=0, always_apply=False, p=0.5)
[view source on GitHub]
¶
Blur the input image using a Gaussian filter with a random kernel size.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
int, [int, int] |
maximum Gaussian kernel size for blurring the input image.
Must be zero or odd and in range [0, inf). If set to 0 it will be computed from sigma
as |
sigma_limit |
float, [float, float] |
Gaussian kernel standard deviation. Must be in range [0, inf).
If set single value |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.blur.transforms.GlassBlur
(sigma=0.7, max_delta=4, iterations=2, always_apply=False, mode='fast', p=0.5)
[view source on GitHub]
¶
Apply glass noise to the input image.
Parameters:
Name | Type | Description |
---|---|---|
sigma |
float |
standard deviation for Gaussian kernel. |
max_delta |
int |
max distance between pixels which are swapped. |
iterations |
int |
number of repeats. Should be in range [1, inf). Default: 2. |
mode |
str |
mode of computation: fast or exact. Default: "fast". |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
Reference:
https://arxiv.org/abs/1903.12261
https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py
class
albumentations.augmentations.blur.transforms.MedianBlur
(blur_limit=7, always_apply=False, p=0.5)
[view source on GitHub]
¶
Blur the input image using a median filter with a random aperture linear size.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
int |
maximum aperture linear size for blurring the input image. Must be odd and in range [3, inf). Default: (3, 7). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
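The requirement that the aperture be odd follows from the filter needing a center pixel. A minimal pure-Python sketch of a median filter with a randomly chosen odd aperture is shown below; the border handling (edge replication) and the helper names are assumptions, and the library's own implementation may differ.

```python
import random
from statistics import median


def median_blur(image, ksize):
    """Median-filter a 2D grayscale image (list of rows) with an odd aperture."""
    if ksize < 3 or ksize % 2 == 0:
        raise ValueError("ksize must be odd and >= 3")
    rows, cols = len(image), len(image[0])
    r = ksize // 2
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            # Collect the window, clamping coordinates at the image border.
            window = [
                image[min(max(y + dy, 0), rows - 1)][min(max(x + dx, 0), cols - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out_row.append(median(window))
        out.append(out_row)
    return out


def sample_odd_ksize(blur_limit=(3, 7), rng=random):
    """Choose an odd aperture size between the two limits (inclusive)."""
    return rng.choice(range(blur_limit[0], blur_limit[1] + 1, 2))
```

A single outlier pixel is removed entirely by a 3x3 median, which is why this filter is often preferred over a mean filter for salt-and-pepper noise.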
class
albumentations.augmentations.blur.transforms.MotionBlur
(blur_limit=7, allow_shifted=True, always_apply=False, p=0.5)
[view source on GitHub]
¶
Apply motion blur to the input image using a random-sized kernel.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
int |
maximum kernel size for blurring the input image. Should be in range [3, inf). Default: (3, 7). |
allow_shifted |
bool |
if set to True, randomly shifted kernels may be created; if set to False, only non-shifted (centered) kernels are created. Default: True. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.blur.transforms.ZoomBlur
(max_factor=1.31, step_factor=(0.01, 0.03), always_apply=False, p=0.5)
[view source on GitHub]
¶
Apply zoom blur transform. See https://arxiv.org/abs/1903.12261.
Parameters:
Name | Type | Description |
---|---|---|
max_factor |
[float, float] or float |
range for max factor for blurring. If max_factor is a single float, the range will be (1, limit). Default: (1, 1.31). All max_factor values should be larger than 1. |
step_factor |
[float, float] or float |
If single float will be used as step parameter for np.arange.
If tuple of float step_factor will be in range |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: Any
albumentations.augmentations.crops
special
¶
albumentations.augmentations.crops.functional
¶
def
albumentations.augmentations.crops.functional.bbox_crop (bbox, x_min, y_min, x_max, y_max, rows, cols)
[view source on GitHub]¶
Crop a bounding box.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box |
x_min |
int |
|
y_min |
int |
|
x_max |
int |
|
y_max |
int |
|
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
tuple |
A cropped bounding box |
def
albumentations.augmentations.crops.functional.crop_bbox_by_coords (bbox, crop_coords, crop_height, crop_width, rows, cols)
[view source on GitHub]¶
Crop a bounding box using the provided coordinates of bottom-left and top-right corners in pixels and the required height and width of the crop.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A cropped box |
crop_coords |
Tuple[int, int, int, int] |
Crop coordinates |
crop_height |
int |
|
crop_width |
int |
|
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
tuple |
A cropped bounding box |
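A minimal sketch of the coordinate arithmetic behind cropping a bounding box is given below. It assumes the box is normalized to [0, 1] relative to the original image and that `crop_coords` is `(x1, y1, x2, y2)` in pixels; both conventions, and the name `crop_bbox_sketch`, are assumptions for illustration rather than statements about the library's internals.

```python
def crop_bbox_sketch(bbox, crop_coords, crop_height, crop_width, rows, cols):
    """Re-express a normalized bbox relative to a crop window."""
    x_min, y_min, x_max, y_max = bbox
    x1, y1, _x2, _y2 = crop_coords
    # Normalized -> pixel coordinates in the original image.
    px_min, px_max = x_min * cols, x_max * cols
    py_min, py_max = y_min * rows, y_max * rows
    # Shift the origin to the crop's corner, then normalize by the crop size.
    return (
        (px_min - x1) / crop_width,
        (py_min - y1) / crop_height,
        (px_max - x1) / crop_width,
        (py_max - y1) / crop_height,
    )
```

The result can fall outside [0, 1] when the box extends past the crop window; in this sketch, clipping or discarding such boxes is left to the caller.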
def
albumentations.augmentations.crops.functional.crop_keypoint_by_coords (keypoint, crop_coords)
[view source on GitHub]¶
Crop a keypoint using the provided coordinates of bottom-left and top-right corners in pixels and the required height and width of the crop.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
crop_coords |
Tuple[int, int, int, int] |
Crop box coords |
Returns:
Type | Description |
---|---|
|
A keypoint |
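Cropping a keypoint amounts to shifting its coordinates by the crop's origin. The sketch below assumes the four-element keypoint is `(x, y, angle, scale)` in pixel coordinates and that `crop_coords` is `(x1, y1, x2, y2)`; these layouts and the function name are assumptions made for the example.

```python
def crop_keypoint_sketch(keypoint, crop_coords):
    """Shift a keypoint into the coordinate frame of the crop window."""
    x, y, angle, scale = keypoint
    x1, y1, _x2, _y2 = crop_coords
    # Angle and scale are unaffected by a pure translation.
    return (x - x1, y - y1, angle, scale)
```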
def
albumentations.augmentations.crops.functional.keypoint_center_crop (keypoint, crop_height, crop_width, rows, cols)
[view source on GitHub]¶
Keypoint center crop.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
crop_height |
int |
Crop height. |
crop_width |
int |
Crop width. |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
tuple |
A keypoint |
def
albumentations.augmentations.crops.functional.keypoint_random_crop (keypoint, crop_height, crop_width, h_start, w_start, rows, cols)
[view source on GitHub]¶
Keypoint random crop.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
crop_height |
int |
Crop height. |
crop_width |
int |
Crop width. |
h_start |
float |
Crop height start. |
w_start |
float |
Crop width start. |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
|
A keypoint |
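The `h_start` and `w_start` arguments are floats, which suggests they are fractions that select where the crop window begins. The sketch below shows one plausible mapping from a fraction in [0, 1) to a pixel offset, followed by the keypoint shift. The mapping formula, the `(x, y, angle, scale)` keypoint layout, and the helper names are assumptions, not confirmed library behavior.

```python
def random_crop_coords(rows, cols, crop_height, crop_width, h_start, w_start):
    """Map fractional starts in [0, 1) to an (x1, y1, x2, y2) crop window."""
    # A fraction of 0 gives offset 0; values close to 1 give the largest valid offset.
    y1 = int((rows - crop_height + 1) * h_start)
    x1 = int((cols - crop_width + 1) * w_start)
    return x1, y1, x1 + crop_width, y1 + crop_height


def keypoint_random_crop_sketch(keypoint, crop_height, crop_width,
                                h_start, w_start, rows, cols):
    """Shift a keypoint into the frame of the crop selected by the fractions."""
    x1, y1, _x2, _y2 = random_crop_coords(
        rows, cols, crop_height, crop_width, h_start, w_start
    )
    x, y, angle, scale = keypoint
    return (x - x1, y - y1, angle, scale)
```

Passing the same `h_start` and `w_start` to the image, mask, and keypoint functions keeps all targets aligned with one crop window.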
albumentations.augmentations.crops.transforms
¶
class
albumentations.augmentations.crops.transforms.BBoxSafeRandomCrop
(erosion_rate=0.0, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop a random part of the input without loss of bboxes.
Parameters:
Name | Type | Description |
---|---|---|
erosion_rate |
float |
erosion rate applied on input image height before crop. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes
Image types: uint8, float32
class
albumentations.augmentations.crops.transforms.CenterCrop
(height, width, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop the central part of the input.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
height of the crop. |
width |
int |
width of the crop. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
Note: It is recommended to use uint8 images as input. Otherwise the operation will require internal conversion float32 -> uint8 -> float32 that causes worse performance.
class
albumentations.augmentations.crops.transforms.Crop
(x_min=0, y_min=0, x_max=1024, y_max=1024, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop region from image.
Parameters:
Name | Type | Description |
---|---|---|
x_min |
int |
Minimum upper left x coordinate. |
y_min |
int |
Minimum upper left y coordinate. |
x_max |
int |
Maximum lower right x coordinate. |
y_max |
int |
Maximum lower right y coordinate. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.crops.transforms.CropAndPad
(px=None, percent=None, pad_mode=0, pad_cval=0, pad_cval_mask=0, keep_size=True, sample_independently=True, interpolation=1, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop and pad images by pixel amounts or fractions of image sizes.
Cropping removes pixels at the sides (i.e. extracts a subimage from a given full image).
Padding adds pixels to the sides (e.g. black pixels).
This transformation will never crop images below a height or width of 1.
Note:
This transformation automatically resizes images back to their original size. To deactivate this, add the parameter keep_size=False.
Parameters:
Name | Type | Description |
---|---|---|
px |
int or tuple |
The number of pixels to crop (negative values) or pad (positive values)
on each side of the image. Either this or the parameter |
percent |
float or tuple |
The number of pixels to crop (negative values) or pad (positive values)
on each side of the image given as a fraction of the image
height/width. E.g. if this is set to |
pad_mode |
int |
OpenCV border mode. |
pad_cval |
number, Sequence[number] |
The constant value to use if the pad mode is |
pad_cval_mask |
number, Sequence[number] |
Same as pad_cval but only for masks. |
keep_size |
bool |
After cropping and padding, the result image will usually have a
different height/width compared to the original input image. If this
parameter is set to |
sample_independently |
bool |
If |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
Targets: image, mask, bboxes, keypoints
Image types: any
class
albumentations.augmentations.crops.transforms.CropNonEmptyMaskIfExists
(height, width, ignore_values=None, ignore_channels=None, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop area with mask if mask is non-empty, else make random crop.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
vertical size of crop in pixels |
width |
int |
horizontal size of crop in pixels |
ignore_values |
list of int |
values to ignore in mask, |
ignore_channels |
list of int |
channels to ignore in mask
(e.g. if background is a first channel set |
p |
float |
probability of applying the transform. Default: 1.0. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.crops.transforms.RandomCrop
(height, width, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop a random part of the input.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
height of the crop. |
width |
int |
width of the crop. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.crops.transforms.RandomCropFromBorders
(crop_left=0.1, crop_right=0.1, crop_top=0.1, crop_bottom=0.1, always_apply=False, p=1.0)
[view source on GitHub]
¶
Randomly cut parts from the borders of the image, without resizing at the end.
Parameters:
Name | Type | Description |
---|---|---|
crop_left |
float |
single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut |
crop_right |
float |
single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut |
crop_top |
float |
single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut |
crop_bottom |
float |
single float value in (0.0, 1.0) range. Default 0.1. Image will be randomly cut |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.crops.transforms.RandomCropNearBBox
(max_part_shift=(0.3, 0.3), cropping_box_key='cropping_bbox', always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop a bbox from the image with a random shift along the x and y coordinates.
Parameters:
Name | Type | Description |
---|---|---|
max_part_shift |
float, [float, float] |
Max shift in |
cropping_box_key |
str |
Additional target key for cropping box. Default |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
Examples:
>>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5), cropping_box_key='test_box')],
...               bbox_params=BboxParams("pascal_voc"))
>>> result = aug(image=image, bboxes=bboxes, test_box=[0, 5, 10, 20])
class
albumentations.augmentations.crops.transforms.RandomResizedCrop
(height, width, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=1, always_apply=False, p=1.0)
[view source on GitHub]
¶
Torchvision's variant of cropping a random part of the input and rescaling it to a given size.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
height after crop and resize. |
width |
int |
width after crop and resize. |
scale |
[float, float] |
range of the cropped region's size as a fraction of the original size. |
ratio |
[float, float] |
range of aspect ratios of the cropped region. |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
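The way `scale` and `ratio` interact can be sketched as follows. The code follows the sampling scheme commonly associated with torchvision's `RandomResizedCrop` (area fraction drawn uniformly, aspect ratio drawn log-uniformly, a limited number of attempts, then a central fallback). Whether Albumentations matches it step for step is an assumption, and `sample_resized_crop_box` is a hypothetical helper.

```python
import math
import random


def sample_resized_crop_box(img_h, img_w, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                            rng=random, attempts=10):
    """Return (top, left, height, width) of the region to crop before resizing."""
    area = img_h * img_w
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(attempts):
        target_area = rng.uniform(*scale) * area
        aspect = math.exp(rng.uniform(*log_ratio))  # width / height
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= img_w and 0 < h <= img_h:
            top = rng.randint(0, img_h - h)
            left = rng.randint(0, img_w - w)
            return top, left, h, w
    # Fallback: a central crop whose aspect ratio is clamped into `ratio`.
    in_ratio = img_w / img_h
    if in_ratio < ratio[0]:
        w = img_w
        h = int(round(w / ratio[0]))
    elif in_ratio > ratio[1]:
        h = img_h
        w = int(round(h * ratio[1]))
    else:
        w, h = img_w, img_h
    return (img_h - h) // 2, (img_w - w) // 2, h, w
```

The sampled region is then resized to the requested `height` and `width`, so the output size is fixed even though the cropped region varies.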
class
albumentations.augmentations.crops.transforms.RandomSizedBBoxSafeCrop
(height, width, erosion_rate=0.0, interpolation=1, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop a random part of the input and rescale it to some size without loss of bboxes.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
height after crop and resize. |
width |
int |
width after crop and resize. |
erosion_rate |
float |
erosion rate applied on input image height before crop. |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes
Image types: uint8, float32
class
albumentations.augmentations.crops.transforms.RandomSizedCrop
(min_max_height, height, width, w2h_ratio=1.0, interpolation=1, always_apply=False, p=1.0)
[view source on GitHub]
¶
Crop a random part of the input and rescale it to some size.
Parameters:
Name | Type | Description |
---|---|---|
min_max_height |
[int, int] |
crop size limits. |
height |
int |
height after crop and resize. |
width |
int |
width after crop and resize. |
w2h_ratio |
float |
aspect ratio of crop. |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
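A short sketch of how `min_max_height` and `w2h_ratio` could determine the crop window before resizing is given below. It assumes the crop height is drawn uniformly from `min_max_height` and the crop width is derived from it through `w2h_ratio`; the helper name and the error handling are illustrative only.

```python
import random


def sample_sized_crop(img_h, img_w, min_max_height, w2h_ratio=1.0, rng=random):
    """Return (top, left, crop_height, crop_width) for a random sized crop."""
    crop_h = rng.randint(min_max_height[0], min_max_height[1])
    crop_w = int(crop_h * w2h_ratio)  # width follows from the aspect ratio
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("requested crop is larger than the image")
    top = rng.randint(0, img_h - crop_h)
    left = rng.randint(0, img_w - crop_w)
    return top, left, crop_h, crop_w
```

For a 100x200 image, `sample_sized_crop(100, 200, (40, 60), w2h_ratio=2.0)` yields a window between 40x80 and 60x120 pixels, which would then be resized to `height` x `width`.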
albumentations.augmentations.domain_adaptation
¶
class
albumentations.augmentations.domain_adaptation.FDA
(reference_images, beta_limit=0.1, read_fn=read_rgb_image, always_apply=False, p=0.5)
[view source on GitHub]
¶
Fourier Domain Adaptation from https://github.com/YanchaoYang/FDA. Simple "style transfer".
Parameters:
Name | Type | Description |
---|---|---|
reference_images |
List[str] or List[np.ndarray] |
List of file paths for reference images or list of reference images. |
beta_limit |
float or tuple of float |
coefficient beta from paper. Recommended to be less than 0.3. |
read_fn |
Callable |
User-defined function to read image. Function should get image path and return numpy array of image pixels. |
Targets: image
Image types: uint8, float32
Reference:
https://github.com/YanchaoYang/FDA
https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf
Examples:
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])
>>> result = aug(image=image)
class
albumentations.augmentations.domain_adaptation.HistogramMatching
(reference_images, blend_ratio=(0.5, 1.0), read_fn=read_rgb_image, always_apply=False, p=0.5)
[view source on GitHub]
¶
Apply histogram matching. It manipulates the pixels of an input image so that its histogram matches the histogram of the reference image. If the images have multiple channels, the matching is done independently for each channel, as long as the number of channels is equal in the input image and the reference.
Histogram matching can be used as a lightweight normalisation for image processing, such as feature matching, especially in circumstances where the images have been taken from different sources or in different conditions (e.g. lighting).
See: https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html
Parameters:
Name | Type | Description |
---|---|---|
reference_images |
List[str] or List[np.ndarray] |
List of file paths for reference images or list of reference images. |
blend_ratio |
[float, float] |
Tuple of min and max blend ratio. Matched image will be blended with original with random blend factor for increased diversity of generated images. |
read_fn |
Callable |
User-defined function to read image. Function should get image path and return numpy array of image pixels. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, uint16, float32
class
albumentations.augmentations.domain_adaptation.PixelDistributionAdaptation
(reference_images, blend_ratio=(0.25, 1.0), read_fn=read_rgb_image, transform_type='pca', always_apply=False, p=0.5)
[view source on GitHub]
¶
Another naive and quick pixel-level domain adaptation. It fits a simple transform (such as PCA, StandardScaler or MinMaxScaler) on both original and reference image, transforms original image with transform trained on this image and then performs inverse transformation using transform fitted on reference image.
Parameters:
Name | Type | Description |
---|---|---|
reference_images |
List[str] or List[np.ndarray] |
List of file paths for reference images or list of reference images. |
blend_ratio |
[float, float] |
Tuple of min and max blend ratio. Matched image will be blended with original with random blend factor for increased diversity of generated images. |
read_fn |
Callable |
User-defined function to read image. Function should get image path and return numpy
array of image pixels. Usually it's default |
transform_type |
str |
type of transform; "pca", "standard", "minmax" are allowed. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
See also: https://github.com/arsenyinfo/qudida
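The fit, transform, and inverse-transform idea can be sketched for the "standard" case with a few lines of pure Python. The example works on a flat list of pixel values with scalar statistics, which is a simplification of per-channel scaling, and `adapt_standard` is a hypothetical helper rather than the library's or qudida's code.

```python
from statistics import fmean, pstdev


def adapt_standard(source, reference, blend=1.0):
    """Give `source` the mean and standard deviation of `reference`, then blend."""
    mu_s, sd_s = fmean(source), pstdev(source)
    mu_r, sd_r = fmean(reference), pstdev(reference)
    if sd_s == 0:
        sd_s = 1.0  # constant source: avoid division by zero
    # "Transform" with the scaler fitted on the source, then apply the
    # inverse of the scaler fitted on the reference.
    adapted = [((v - mu_s) / sd_s) * sd_r + mu_r for v in source]
    # Blend the adapted values with the originals, as blend_ratio does.
    return [blend * a + (1 - blend) * v for a, v in zip(adapted, source)]
```

With `blend=1.0` the output has the reference statistics; with `blend=0.0` the source is returned unchanged. A blend factor drawn from `blend_ratio` lands between the two.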
def
albumentations.augmentations.domain_adaptation.fourier_domain_adaptation (img, target_img, beta)
[view source on GitHub]¶
Fourier Domain Adaptation from https://github.com/YanchaoYang/FDA
Parameters:
Name | Type | Description |
---|---|---|
img |
ndarray |
source image |
target_img |
ndarray |
target image for domain adaptation |
beta |
float |
coefficient from source paper |
Returns:
Type | Description |
---|---|
ndarray |
transformed image |
albumentations.augmentations.dropout
special
¶
albumentations.augmentations.dropout.channel_dropout
¶
class
albumentations.augmentations.dropout.channel_dropout.ChannelDropout
(channel_drop_range=(1, 1), fill_value=0, always_apply=False, p=0.5)
[view source on GitHub]
¶
Randomly drop channels in the input image.
Parameters:
Name | Type | Description |
---|---|---|
channel_drop_range |
[int, int] |
range from which we choose the number of channels to drop. |
fill_value |
int, float |
pixel value for the dropped channel. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, uint16, uint32, float32
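A minimal sketch of channel dropout on an image stored as rows of pixel tuples is shown below. The rule that at least one channel must survive is an assumption of this sketch, and `channel_dropout` here is a stand-alone illustration, not the library class.

```python
import random


def channel_dropout(image, channel_drop_range=(1, 1), fill_value=0, rng=random):
    """Replace a random subset of channels with `fill_value`.

    `image` is a list of rows, each row a list of per-pixel channel tuples.
    Returns the new image and the set of dropped channel indices.
    """
    num_channels = len(image[0][0])
    if channel_drop_range[1] >= num_channels:
        raise ValueError("cannot drop all channels")
    n_drop = rng.randint(*channel_drop_range)  # how many channels to drop
    dropped = set(rng.sample(range(num_channels), n_drop))
    out = [
        [tuple(fill_value if c in dropped else v for c, v in enumerate(px))
         for px in row]
        for row in image
    ]
    return out, dropped
```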
albumentations.augmentations.dropout.coarse_dropout
¶
class
albumentations.augmentations.dropout.coarse_dropout.CoarseDropout
(max_holes=8, max_height=8, max_width=8, min_holes=None, min_height=None, min_width=None, fill_value=0, mask_fill_value=None, always_apply=False, p=0.5)
[view source on GitHub]
¶
CoarseDropout of the rectangular regions in the image.
Parameters:
Name | Type | Description |
---|---|---|
max_holes |
int |
Maximum number of regions to zero out. |
max_height |
int, float |
Maximum height of the hole. |
max_width |
int, float |
Maximum width of the hole. |
min_holes |
int |
Minimum number of regions to zero out. If |
min_height |
int, float |
Minimum height of the hole. Default: None. If |
min_width |
int, float |
Minimum width of the hole. If |
fill_value |
int, float, list of int, list of float |
value for dropped pixels. |
mask_fill_value |
int, float, list of int, list of float |
fill value for dropped pixels
in mask. If |
Targets: image, mask, keypoints
Image types: uint8, float32
Reference:
https://arxiv.org/abs/1708.04552
https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/arithmetic.py
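The following pure-Python sketch shows the basic mechanics of dropping rectangular regions. The explicit minimum defaults of 1 are chosen for the sketch only and do not describe how the library resolves `None`; the function name and the returned list of holes are likewise illustrative.

```python
import random


def coarse_dropout(image, max_holes=8, max_height=8, max_width=8,
                   min_holes=1, min_height=1, min_width=1,
                   fill_value=0, rng=random):
    """Fill random rectangles of a 2D image (list of rows) with `fill_value`.

    Returns the new image and the holes as (x1, y1, x2, y2), with x2/y2 exclusive.
    """
    rows, cols = len(image), len(image[0])
    out = [list(row) for row in image]  # copy so the input stays untouched
    holes = []
    for _ in range(rng.randint(min_holes, max_holes)):
        h = rng.randint(min_height, min(max_height, rows))
        w = rng.randint(min_width, min(max_width, cols))
        y1 = rng.randint(0, rows - h)
        x1 = rng.randint(0, cols - w)
        for y in range(y1, y1 + h):
            for x in range(x1, x1 + w):
                out[y][x] = fill_value
        holes.append((x1, y1, x1 + w, y1 + h))
    return out, holes
```

Applying the same `holes` to a mask with a separate fill value mirrors the role of `mask_fill_value`.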
albumentations.augmentations.dropout.cutout
¶
class
albumentations.augmentations.dropout.cutout.Cutout
(num_holes=8, max_h_size=8, max_w_size=8, fill_value=0, always_apply=False, p=0.5)
[view source on GitHub]
¶
CoarseDropout of the square regions in the image.
Parameters:
Name | Type | Description |
---|---|---|
num_holes |
int |
number of regions to zero out |
max_h_size |
int |
maximum height of the hole |
max_w_size |
int |
maximum width of the hole |
fill_value |
int, float, list of int, list of float |
value for dropped pixels. |
Targets: image
Image types: uint8, float32
Reference:
https://arxiv.org/abs/1708.04552
https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/arithmetic.py
albumentations.augmentations.dropout.grid_dropout
¶
class
albumentations.augmentations.dropout.grid_dropout.GridDropout
(ratio=0.5, unit_size_min=None, unit_size_max=None, holes_number_x=None, holes_number_y=None, shift_x=0, shift_y=0, random_offset=False, fill_value=0, mask_fill_value=None, always_apply=False, p=0.5)
[view source on GitHub]
¶
GridDropout, drops out rectangular regions of an image and the corresponding mask in a grid fashion.
Parameters:
Name | Type | Description |
---|---|---|
ratio |
float |
the ratio of the mask holes to the unit_size (same for horizontal and vertical directions). Must be between 0 and 1. Default: 0.5. |
unit_size_min |
int |
minimum size of the grid unit. Must be between 2 and the image shorter edge.
If 'None', holes_number_x and holes_number_y are used to setup the grid. Default: |
unit_size_max |
int |
maximum size of the grid unit. Must be between 2 and the image shorter edge.
If 'None', holes_number_x and holes_number_y are used to setup the grid. Default: |
holes_number_x |
int |
the number of grid units in x direction. Must be between 1 and image width//2.
If 'None', grid unit width is set as image_width//10. Default: |
holes_number_y |
int |
the number of grid units in y direction. Must be between 1 and image height//2.
If |
shift_x |
int |
offsets of the grid start in x direction from (0,0) coordinate. Clipped between 0 and grid unit_width - hole_width. Default: 0. |
shift_y |
int |
offsets of the grid start in y direction from (0,0) coordinate. Clipped between 0 and grid unit height - hole_height. Default: 0. |
random_offset |
bool |
whether to offset the grid randomly between 0 and grid unit size - hole size
If 'True', entered shift_x, shift_y are ignored and set randomly. Default: |
fill_value |
int |
value for the dropped pixels. Default = 0 |
mask_fill_value |
int |
value for the dropped pixels in mask.
If |
Targets: image, mask
Image types: uint8, float32
References: https://arxiv.org/abs/2001.04086
albumentations.augmentations.dropout.mask_dropout
¶
class
albumentations.augmentations.dropout.mask_dropout.MaskDropout
(max_objects=1, image_fill_value=0, mask_fill_value=0, always_apply=False, p=0.5)
[view source on GitHub]
¶
Image & mask augmentation that zeroes out mask and image regions corresponding to randomly chosen object instances from the mask.
The mask must be a single-channel image; zero values are treated as background. The image can have any number of channels.
Inspired by https://www.kaggle.com/c/severstal-steel-defect-detection/discussion/114254
Parameters:
Name | Type | Description |
---|---|---|
max_objects |
Maximum number of labels that can be zeroed out. Can be tuple, in this case it's [min, max] |
|
image_fill_value |
Fill value to use when filling image. Can be 'inpaint' to apply inpainting (works only for 3-channel images) |
|
mask_fill_value |
Fill value to use when filling mask. |
Targets: image, mask
Image types: uint8, float32
albumentations.augmentations.functional
¶
def
albumentations.augmentations.functional.add_fog (img, fog_coef, alpha_coef, haze_list)
[view source on GitHub]¶
Add fog to the image.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
Image. |
fog_coef |
float |
Fog coefficient. |
alpha_coef |
float |
Alpha coefficient. |
haze_list |
list |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Image. |
def
albumentations.augmentations.functional.add_gravel (img, gravels)
[view source on GitHub]¶
Add gravel to the image.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
img |
ndarray |
image to add gravel to |
gravels |
list |
list of gravel parameters. (float, float, float, float): (top-left x, top-left y, bottom-right x, bottom right y) |
Returns:
Type | Description |
---|---|
numpy.ndarray |
def
albumentations.augmentations.functional.add_rain (img, slant, drop_length, drop_width, drop_color, blur_value, brightness_coefficient, rain_drops)
[view source on GitHub]¶
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
Image. |
slant |
int |
|
drop_length |
||
drop_width |
||
drop_color |
||
blur_value |
int |
Rainy views are blurry. |
brightness_coefficient |
float |
Rainy days are usually shady. |
rain_drops |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Image. |
def
albumentations.augmentations.functional.add_shadow (img, vertices_list)
[view source on GitHub]¶
Add shadows to the image.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
|
vertices_list |
list |
Returns:
Type | Description |
---|---|
numpy.ndarray |
def
albumentations.augmentations.functional.add_snow (img, snow_point, brightness_coeff)
[view source on GitHub]¶
Bleaches out pixels, imitation snow.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
Image. |
snow_point |
Number of snow points. |
|
brightness_coeff |
Brightness coefficient. |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Image. |
def
albumentations.augmentations.functional.add_sun_flare (img, flare_center_x, flare_center_y, src_radius, src_color, circles)
[view source on GitHub]¶
Add sun flare.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
|
flare_center_x |
float |
|
flare_center_y |
float |
|
src_radius |
||
src_color |
int, int, int |
|
circles |
list |
Returns:
Type | Description |
---|---|
numpy.ndarray |
def
albumentations.augmentations.functional.bbox_from_mask (mask)
[view source on GitHub]¶
Create bounding box from binary mask (fast version)
Parameters:
Name | Type | Description |
---|---|---|
mask |
numpy.ndarray |
binary mask. |
Returns:
Type | Description |
---|---|
tuple |
A bounding box tuple |
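As a rough illustration of what deriving a box from a binary mask involves, here is a minimal pure-Python sketch. It is not the library's implementation: the function name `bbox_from_mask_sketch`, the (x_min, y_min, x_max, y_max) ordering with exclusive maxima, and the (-1, -1, -1, -1) sentinel for an empty mask are all assumptions made for this example.

```python
def bbox_from_mask_sketch(mask):
    """Return (x_min, y_min, x_max, y_max) for the non-zero pixels of a 2-D mask."""
    # Rows that contain at least one non-zero pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return (-1, -1, -1, -1)  # assumption: sentinel for an empty mask
    # Column index of every non-zero pixel.
    xs = [x for row in mask for x, v in enumerate(row) if v]
    # Assumption: max coordinates are exclusive (last non-zero index + 1).
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bbox_from_mask_sketch(mask)
```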
def
albumentations.augmentations.functional.equalize (img, mask=None, mode='cv', by_channels=True)
[view source on GitHub]¶
Equalize the image histogram.
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
RGB or grayscale image. |
mask |
numpy.ndarray |
An optional mask. If given, only the pixels selected by the mask are included in the analysis. May be a 1-channel or 3-channel array. |
mode |
str |
{'cv', 'pil'}. Use OpenCV or Pillow equalization method. |
by_channels |
bool |
If True, use equalization by channels separately,
else convert image to YCbCr representation and use equalization by |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Equalized image. |
def
albumentations.augmentations.functional.fancy_pca (img, alpha=0.1)
[view source on GitHub]¶
Perform 'Fancy PCA' augmentation from: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
numpy array with (h, w, rgb) shape, as ints between 0-255 |
alpha |
float |
how much to perturb/scale the eigenvectors and eigenvalues; the paper used std=0.1 |
Returns:
Type | Description |
---|---|
numpy.ndarray |
numpy image-like array as uint8 range(0, 255) |
def
albumentations.augmentations.functional.iso_noise (image, color_shift=0.05, intensity=0.5, random_state=None, **kwargs)
[view source on GitHub]¶
Apply poisson noise to image to simulate camera sensor noise.
Parameters:
Name | Type | Description |
---|---|---|
image |
numpy.ndarray |
Input image, currently, only RGB, uint8 images are supported. |
color_shift |
float |
|
intensity |
float |
Multiplication factor for noise values. Values of ~0.5 produce a noticeable, yet acceptable level of noise. |
random_state |
||
**kwargs |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Noised image |
def
albumentations.augmentations.functional.mask_from_bbox (img, bbox)
[view source on GitHub]¶
Create binary mask from bounding box
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
input image |
bbox |
A bounding box tuple |
Returns:
Type | Description |
---|---|
mask (numpy.ndarray) |
binary mask |
def
albumentations.augmentations.functional.move_tone_curve (img, low_y, high_y)
[view source on GitHub]¶
Rescales the relationship between bright and dark areas of the image by manipulating its tone curve.
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
RGB or grayscale image. |
low_y |
float |
y-position of a Bezier control point used to adjust the tone curve, must be in range [0, 1] |
high_y |
float |
y-position of a Bezier control point used to adjust image tone curve, must be in range [0, 1] |
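One way to picture the tone curve is as a 256-entry lookup table built from a cubic Bezier curve. The sketch below assumes the curve's end values are fixed at 0 and 1 and that `low_y` and `high_y` are the two inner control values; the name `tone_curve_lut` and the rounding choice are hypothetical, not taken from the library.

```python
def tone_curve_lut(low_y, high_y):
    """Build a 256-entry lookup table from a cubic Bezier tone curve."""
    lut = []
    for i in range(256):
        t = i / 255
        # Cubic Bezier with control values 0, low_y, high_y, 1 (assumed layout).
        value = (
            3 * (1 - t) ** 2 * t * low_y
            + 3 * (1 - t) * t ** 2 * high_y
            + t ** 3
        )
        lut.append(round(value * 255))
    return lut


identity = tone_curve_lut(1 / 3, 2 / 3)  # these control values give a straight line
darker = tone_curve_lut(0.0, 0.0)        # pulls mid-tones down
```

Applying the table to a uint8 image is then a per-pixel lookup, `lut[pixel]`.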
def
albumentations.augmentations.functional.multiply (img, multiplier)
[view source on GitHub]¶
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
Image. |
multiplier |
numpy.ndarray |
Multiplier coefficient. |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Image multiplied by |
def
albumentations.augmentations.functional.posterize (img, bits)
[view source on GitHub]¶
Reduce the number of bits for each color channel.
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
image to posterize. |
bits |
int |
number of high bits. Must be in range [0, 8] |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Image with reduced color channels. |
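Keeping only the high bits of an 8-bit value is a single bitwise AND. The following is a minimal sketch of that arithmetic on plain integers; `posterize_value` is a hypothetical helper name, and the real function operates on whole arrays.

```python
def posterize_value(value, bits):
    """Keep the `bits` most significant bits of an 8-bit value."""
    if not 0 <= bits <= 8:
        raise ValueError("bits must be in range [0, 8]")
    # For bits=3 the mask is 0b11100000: the three high bits survive.
    mask = (0xFF << (8 - bits)) & 0xFF
    return value & mask


row = [200, 37, 255]
posterized = [posterize_value(v, 3) for v in row]
```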
def
albumentations.augmentations.functional.solarize (img, threshold=128)
[view source on GitHub]¶
Invert all pixel values above a threshold.
Parameters:
Name | Type | Description |
---|---|---|
img |
numpy.ndarray |
The image to solarize. |
threshold |
int |
All pixels above this greyscale level are inverted. |
Returns:
Type | Description |
---|---|
numpy.ndarray |
Solarized image. |
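The inversion rule can be sketched on single 8-bit values as follows. Whether a value exactly equal to the threshold is inverted is an assumption here (this sketch inverts it), and `solarize_value` is a hypothetical name.

```python
def solarize_value(value, threshold=128, max_val=255):
    """Invert a pixel value if it is at or above the threshold (assumed boundary)."""
    return value if value < threshold else max_val - value


row = [100, 128, 200]
solarized = [solarize_value(v) for v in row]
```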
def
albumentations.augmentations.functional.swap_tiles_on_image (image, tiles)
[view source on GitHub]¶
Swap tiles on image.
Parameters:
Name | Type | Description |
---|---|---|
image |
np.ndarray |
Input image. |
tiles |
np.ndarray |
array of tuples( current_left_up_corner_row, current_left_up_corner_col, old_left_up_corner_row, old_left_up_corner_col, height_tile, width_tile) |
Returns:
Type | Description |
---|---|
np.ndarray |
Output image. |
albumentations.augmentations.geometric
special
¶
albumentations.augmentations.geometric.functional
¶
def
albumentations.augmentations.geometric.functional.bbox_flip (bbox, d, rows, cols)
[view source on GitHub]¶
Flip a bounding box either vertically, horizontally or both depending on the value of d.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box |
d |
int |
dimension. 0 for vertical flip, 1 for horizontal flip, -1 for both vertical and horizontal flip |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A bounding box |
Exceptions:
Type | Description |
---|---|
ValueError |
if value of |
def
albumentations.augmentations.geometric.functional.bbox_hflip (bbox, rows, cols)
[view source on GitHub]¶
Flip a bounding box horizontally around the y-axis.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A bounding box |
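For boxes stored as normalized (x_min, y_min, x_max, y_max) coordinates in [0, 1], a flip only mirrors one pair of coordinates. The sketch below assumes that normalized format, which is why `rows` and `cols` do not appear; the helper names are hypothetical.

```python
def hflip_bbox(bbox):
    """Mirror a normalized bounding box left-to-right."""
    x_min, y_min, x_max, y_max = bbox[:4]
    # The old right edge becomes the new left edge, and vice versa.
    return (1 - x_max, y_min, 1 - x_min, y_max)


def vflip_bbox(bbox):
    """Mirror a normalized bounding box top-to-bottom."""
    x_min, y_min, x_max, y_max = bbox[:4]
    return (x_min, 1 - y_max, x_max, 1 - y_min)


box = (0.25, 0.125, 0.5, 0.75)
h = hflip_bbox(box)
v = vflip_bbox(box)
```

Applying the same flip twice returns the original box, which is a convenient sanity check.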
def
albumentations.augmentations.geometric.functional.bbox_rot90 (bbox, factor, rows, cols)
[view source on GitHub]¶
Rotates a bounding box by 90 degrees CCW (see np.rot90)
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box tuple (x_min, y_min, x_max, y_max). |
factor |
int |
Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90. |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
tuple: A bounding box tuple (x_min, y_min, x_max, y_max). |
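A counter-clockwise quarter turn maps a normalized point (x, y) to (y, 1 - x), so a box can be rotated by remapping its edges. This is a sketch assuming normalized coordinates; `rot90_bbox` is a hypothetical name and the factor handling is written for illustration only.

```python
def rot90_bbox(bbox, factor):
    """Rotate a normalized (x_min, y_min, x_max, y_max) box by factor * 90 degrees CCW."""
    if factor not in {0, 1, 2, 3}:
        raise ValueError("factor must be in {0, 1, 2, 3}")
    x_min, y_min, x_max, y_max = bbox[:4]
    for _ in range(factor):
        # One CCW quarter turn: new x range is the old y range,
        # new y range is the mirrored old x range.
        x_min, y_min, x_max, y_max = y_min, 1 - x_max, y_max, 1 - x_min
    return (x_min, y_min, x_max, y_max)


box = (0.25, 0.125, 0.5, 0.75)
quarter = rot90_bbox(box, 1)
half = rot90_bbox(box, 2)
```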
def
albumentations.augmentations.geometric.functional.bbox_rotate (bbox, angle, method, rows, cols)
[view source on GitHub]¶
Rotates a bounding box by angle degrees.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box |
angle |
float |
Angle of rotation in degrees. |
method |
str |
Rotation method used. Should be one of: "largest_box", "ellipse". Default: "largest_box". |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A bounding box |
References: https://arxiv.org/abs/2109.13488
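The "largest_box" idea can be sketched as: rotate the four corners of the box around the image center, then take the axis-aligned box that encloses them. The code below is an illustrative approximation under several assumptions (normalized coordinates, rotation about (0.5, 0.5), aspect-ratio correction via cols / rows, and this particular sign convention for the angle); `rotate_bbox_largest_box` is a hypothetical name.

```python
import math


def rotate_bbox_largest_box(bbox, angle, rows, cols):
    """Enclose the rotated corners of a normalized box in an axis-aligned box."""
    x_min, y_min, x_max, y_max = bbox[:4]
    scale = cols / rows  # compensates for non-square images
    theta = math.radians(angle)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    xs, ys = [], []
    for x, y in corners:
        x, y = x - 0.5, y - 0.5  # move the image center to the origin
        xs.append((cos_t * x * scale + sin_t * y) / scale + 0.5)
        ys.append(-sin_t * x * scale + cos_t * y + 0.5)
    return (min(xs), min(ys), max(xs), max(ys))


box = (0.25, 0.125, 0.5, 0.75)
unchanged = rotate_bbox_largest_box(box, 0, rows=100, cols=100)
turned = rotate_bbox_largest_box(box, 90, rows=100, cols=100)
```

For angles that are not multiples of 90 degrees the enclosing box is larger than the object it contains, which is the trade-off the "ellipse" method tries to reduce.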
def
albumentations.augmentations.geometric.functional.bbox_shift_scale_rotate (bbox, angle, scale, dx, dy, rotate_method, rows, cols, **kwargs)
[view source on GitHub]¶
Rotates, shifts and scales a bounding box. Rotation is made by angle degrees, scaling is made by scale factor and shifting is made by dx and dy.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
tuple |
A bounding box |
angle |
int |
Angle of rotation in degrees. |
scale |
int |
Scale factor. |
dx |
int |
Shift along x-axis in pixel units. |
dy |
int |
Shift along y-axis in pixel units. |
rotate_method |
str |
Rotation method used. Should be one of: "largest_box", "ellipse". Default: "largest_box". |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
|
A bounding box |
def
albumentations.augmentations.geometric.functional.bbox_transpose (bbox, axis, rows, cols)
[view source on GitHub]¶
Transposes a bounding box along given axis.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box |
axis |
int |
0 - main axis, 1 - secondary axis. |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A bounding box tuple |
Exceptions:
Type | Description |
---|---|
ValueError |
If axis not equal to 0 or 1. |
def
albumentations.augmentations.geometric.functional.bbox_vflip (bbox, rows, cols)
[view source on GitHub]¶
Flip a bounding box vertically around the x-axis.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Tuple[float, float, float, float] |
A bounding box |
rows |
int |
Image rows. |
cols |
int |
Image cols. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
tuple: A bounding box |
def
albumentations.augmentations.geometric.functional.elastic_transform (img, alpha, sigma, alpha_affine, interpolation=1, border_mode=4, value=None, random_state=None, approximate=False, same_dxdy=False)
[view source on GitHub]¶
Elastic deformation of images as described in [Simard2003]_ (with modifications). Based on https://gist.github.com/ernestum/601cdf56d2b424757de5
.. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for Convolutional Neural Networks applied to Visual Document Analysis", in Proc. of the International Conference on Document Analysis and Recognition, 2003.
def
albumentations.augmentations.geometric.functional.elastic_transform_approx (img, alpha, sigma, alpha_affine, interpolation=1, border_mode=4, value=None, random_state=None)
[view source on GitHub]¶
Elastic deformation of images as described in [Simard2003]_ (with modifications for speed). Based on https://gist.github.com/ernestum/601cdf56d2b424757de5
.. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for Convolutional Neural Networks applied to Visual Document Analysis", in Proc. of the International Conference on Document Analysis and Recognition, 2003.
def
albumentations.augmentations.geometric.functional.from_distance_maps (distance_maps, inverted, if_not_found_coords, threshold=None)
[view source on GitHub]¶
Convert outputs of to_distance_maps() to KeypointsOnImage.
This is the inverse of to_distance_maps.
Parameters:
Name | Type | Description |
---|---|---|
distance_maps |
ndarray |
The distance maps. |
inverted |
bool |
Whether the given distance maps were generated in inverted mode
(i.e. :func: |
if_not_found_coords |
Union[Sequence[int], dict] |
Coordinates to use for keypoints that cannot be found in
|
threshold |
Optional[float] |
The search for keypoints works by searching for the
argmin (non-inverted) or argmax (inverted) in each channel. This
parameter contains the maximum (non-inverted) or minimum (inverted) value to accept in order to view a hit
as a keypoint. Use |
nb_channels |
None, int |
Number of channels of the image on which the keypoints are placed.
Some keypoint augmenters require that information. If set to |
def
albumentations.augmentations.geometric.functional.grid_distortion (img, num_steps=10, xsteps=(), ysteps=(), interpolation=1, border_mode=4, value=None)
[view source on GitHub]¶
Perform a grid distortion of an input image.
Reference: http://pythology.blogspot.sg/2014/03/interpolation-on-regular-distorted-grid.html
def
albumentations.augmentations.geometric.functional.keypoint_flip (keypoint, d, rows, cols)
[view source on GitHub]¶
Flip a keypoint either vertically, horizontally or both depending on the value of d.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
d |
int |
Number of flip. Must be -1, 0 or 1: * 0 - vertical flip, * 1 - horizontal flip, * -1 - vertical and horizontal flip. |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A keypoint |
Exceptions:
Type | Description |
---|---|
ValueError |
if value of |
def
albumentations.augmentations.geometric.functional.keypoint_hflip (keypoint, rows, cols)
[view source on GitHub]¶
Flip a keypoint horizontally around the y-axis.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A keypoint |
def
albumentations.augmentations.geometric.functional.keypoint_rot90 (keypoint, factor, rows, cols, **params)
[view source on GitHub]¶
Rotates a keypoint by 90 degrees CCW (see np.rot90)
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
factor |
int |
Number of CCW rotations. Must be in set {0, 1, 2, 3}. See np.rot90. |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
tuple: A keypoint |
Exceptions:
Type | Description |
---|---|
ValueError |
if factor not in set {0, 1, 2, 3} |
def
albumentations.augmentations.geometric.functional.keypoint_rotate (keypoint, angle, rows, cols, **params)
[view source on GitHub]¶
Rotate a keypoint by angle.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
tuple |
A keypoint |
angle |
float |
Rotation angle. |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
tuple |
A keypoint |
def
albumentations.augmentations.geometric.functional.keypoint_scale (keypoint, scale_x, scale_y)
[view source on GitHub]¶
Scales a keypoint by scale_x and scale_y.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
scale_x |
float |
Scale coefficient x-axis. |
scale_y |
float |
Scale coefficient y-axis. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A keypoint |
def
albumentations.augmentations.geometric.functional.keypoint_transpose (keypoint)
[view source on GitHub]¶
Transpose a keypoint by swapping its x and y coordinates.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
A keypoint |
def
albumentations.augmentations.geometric.functional.keypoint_vflip (keypoint, rows, cols)
[view source on GitHub]¶
Flip a keypoint vertically around the x-axis.
Parameters:
Name | Type | Description |
---|---|---|
keypoint |
Tuple[float, float, float, float] |
A keypoint |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
Tuple[float, float, float, float] |
tuple: A keypoint |
def
albumentations.augmentations.geometric.functional.optical_distortion (img, k=0, dx=0, dy=0, interpolation=1, border_mode=4, value=None)
[view source on GitHub]¶
Barrel / pincushion distortion. Unconventional augment.
Reference: | https://stackoverflow.com/questions/6199636/formulas-for-barrel-pincushion-distortion | https://stackoverflow.com/questions/10364201/image-transformation-in-opencv | https://stackoverflow.com/questions/2477774/correcting-fisheye-distortion-programmatically | http://www.coldvision.io/2017/03/02/advanced-lane-finding-using-opencv/
def
albumentations.augmentations.geometric.functional.py3round (number)
[view source on GitHub]¶
Unified rounding in all python versions.
def
albumentations.augmentations.geometric.functional.rotation2DMatrixToEulerAngles (matrix, y_up=False)
[view source on GitHub]¶
Parameters:
Name | Type | Description |
---|---|---|
matrix |
ndarray |
Rotation matrix |
y_up |
bool |
whether the Y axis points up or down |
def
albumentations.augmentations.geometric.functional.to_distance_maps (keypoints, height, width, inverted=False)
[view source on GitHub]¶
Generate a (H,W,N) array of distance maps for N keypoints.
The n-th distance map contains at every location (y, x) the euclidean distance to the n-th keypoint.
This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.
Parameters:
Name | Type | Description |
---|---|---|
keypoints |
keypoint coordinates |
|
height |
int |
image height |
width |
int |
image width |
inverted |
bool |
If |
Returns:
Type | Description |
---|---|
ndarray |
(H, W, N) ndarray
A |
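To make the distance-map idea concrete, here is a small pure-Python sketch that builds an H x W x N nested list and then recovers a keypoint by taking the argmin of its channel (the non-inverted case). Function names are hypothetical and keypoints are assumed to be (x, y) pairs; the library works on NumPy arrays and supports options not shown here.

```python
import math


def distance_maps_sketch(keypoints, height, width):
    """Return maps[y][x][n]: euclidean distance from pixel (x, y) to keypoint n."""
    maps = [[[0.0] * len(keypoints) for _ in range(width)] for _ in range(height)]
    for n, (kx, ky) in enumerate(keypoints):
        for y in range(height):
            for x in range(width):
                maps[y][x][n] = math.hypot(x - kx, y - ky)
    return maps


def argmin_keypoint(maps, n):
    """Recover keypoint n as the location with the smallest distance."""
    candidates = (
        (maps[y][x][n], x, y)
        for y in range(len(maps))
        for x in range(len(maps[0]))
    )
    _, x, y = min(candidates)
    return x, y


maps = distance_maps_sketch([(0, 0), (4, 2)], height=5, width=6)
recovered = argmin_keypoint(maps, 1)
```

Because the maps are ordinary image-like arrays, they can be passed through an image-only augmentation and the keypoints read back afterwards.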
albumentations.augmentations.geometric.resize
¶
class
albumentations.augmentations.geometric.resize.LongestMaxSize
(max_size=1024, interpolation=1, always_apply=False, p=1)
[view source on GitHub]
¶
Rescale an image so that maximum side is equal to max_size, keeping the aspect ratio of the initial image.
Parameters:
Name | Type | Description |
---|---|---|
max_size |
int, list of int |
maximum size of the image after the transformation. When using a list, max size will be randomly selected from the values in the list. |
interpolation |
OpenCV flag |
interpolation method. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
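The output shape follows from a single scale factor. A minimal sketch of that arithmetic, assuming the scaled sides are rounded to the nearest integer (the exact rounding used by the library is not stated here) and using the hypothetical name `longest_max_size_shape`:

```python
def longest_max_size_shape(height, width, max_size):
    """Return (new_height, new_width) so the longest side equals max_size."""
    scale = max_size / max(height, width)
    # Assumption: each side is rounded to the nearest integer.
    return round(height * scale), round(width * scale)


landscape = longest_max_size_shape(600, 800, 400)
portrait = longest_max_size_shape(1920, 1080, 1024)
```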
class
albumentations.augmentations.geometric.resize.RandomScale
(scale_limit=0.1, interpolation=1, always_apply=False, p=0.5)
[view source on GitHub]
¶
Randomly resize the input. Output image size is different from the input image size.
Parameters:
Name | Type | Description |
---|---|---|
scale_limit |
[float, float] or float |
scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1). |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
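The "biased by 1" rule for `scale_limit` can be sketched as below. `scale_range` is a hypothetical helper that mirrors the description above, and sampling with `random.uniform` is an assumption about how a factor might be drawn from that range.

```python
import random


def scale_range(scale_limit):
    """Turn scale_limit into the (low, high) range of scaling factors."""
    if isinstance(scale_limit, (int, float)):
        low, high = -scale_limit, scale_limit
    else:
        low, high = scale_limit
    # The limits are offsets around 1, so 0.1 means "between 0.9x and 1.1x".
    return 1 + low, 1 + high


default_range = scale_range(0.1)
custom_range = scale_range((-0.5, 0.25))
factor = random.uniform(*default_range)  # one possible scaling factor
```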
class
albumentations.augmentations.geometric.resize.Resize
(height, width, interpolation=1, always_apply=False, p=1)
[view source on GitHub]
¶
Resize the input to the given height and width.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
desired height of the output. |
width |
int |
desired width of the output. |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.geometric.resize.SmallestMaxSize
(max_size=1024, interpolation=1, always_apply=False, p=1)
[view source on GitHub]
¶
Rescale an image so that minimum side is equal to max_size, keeping the aspect ratio of the initial image.
Parameters:
Name | Type | Description |
---|---|---|
max_size |
int, list of int |
maximum size of smallest side of the image after the transformation. When using a list, max size will be randomly selected from the values in the list. |
interpolation |
OpenCV flag |
interpolation method. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 1. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
albumentations.augmentations.geometric.rotate
¶
class
albumentations.augmentations.geometric.rotate.RandomRotate90
[view source on GitHub]
¶
Randomly rotate the input by 90 degrees zero or more times.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
albumentations.augmentations.geometric.rotate.RandomRotate90.apply (self, img, factor=0, **params)
¶
Parameters:
Name | Type | Description |
---|---|---|
factor |
int |
number of times the input will be rotated by 90 degrees. |
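What `factor` means can be shown on a tiny nested list: each step is one counter-clockwise quarter turn, in the same direction as np.rot90. This is an illustrative sketch only; `rot90_ccw` is a hypothetical name.

```python
def rot90_ccw(image, factor):
    """Rotate a 2-D nested list by factor * 90 degrees counter-clockwise."""
    for _ in range(factor % 4):
        # Transpose, then reverse the row order: one CCW quarter turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


image = [
    [1, 2, 3],
    [4, 5, 6],
]
once = rot90_ccw(image, 1)
twice = rot90_ccw(image, 2)
```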
class
albumentations.augmentations.geometric.rotate.Rotate
(limit=90, interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, always_apply=False, p=0.5)
[view source on GitHub]
¶
Rotate the input by an angle selected randomly from the uniform distribution.
Parameters:
Name | Type | Description |
---|---|---|
limit |
[int, int] or int |
range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: (-90, 90) |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode |
OpenCV flag |
flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
value |
int, float, list of ints, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of ints,
list of float |
padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
rotate_method |
str |
rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse". Default: "largest_box" |
crop_border |
bool |
If True would make a largest possible crop within rotated image |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.geometric.rotate.SafeRotate
(limit=90, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)
[view source on GitHub]
¶
Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.
The resulting image may contain artifacts. After rotation, the image may have a different aspect ratio; it is then resized back to its original shape and aspect ratio, which can introduce artifacts.
Parameters:
Name | Type | Description |
---|---|---|
limit |
[int, int] or int |
range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: (-90, 90) |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode |
OpenCV flag |
flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
value |
int, float, list of ints, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of ints,
list of float |
padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
albumentations.augmentations.geometric.transforms
¶
class
albumentations.augmentations.geometric.transforms.Affine
(scale=None, translate_percent=None, translate_px=None, rotate=None, shear=None, interpolation=1, mask_interpolation=0, cval=0, cval_mask=0, mode=0, fit_output=False, keep_ratio=False, rotate_method='largest_box', always_apply=False, p=0.5)
[view source on GitHub]
¶
Augmentation to apply affine transformations to images. This is mostly a wrapper around the corresponding classes and functions in OpenCV.
Affine transformations involve:
- Translation ("move" image on the x-/y-axis)
- Rotation
- Scaling ("zoom" in/out)
- Shear (move one side of the image, turning a square into a parallelogram)
All such transformations can create "new" pixels in the image without defined content, e.g. if the image is translated to the left, pixels are created on the right. A method has to be defined to deal with these pixel values. The parameters cval and mode of this class deal with this.
Some transformations involve interpolation between several pixels of the input image to generate output pixel values. The parameters interpolation and mask_interpolation deal with the method of interpolation used for this.
Parameters:
Name | Type | Description |
---|---|---|
scale |
number, tuple of number or dict |
Scaling factor to use, where |
translate_percent |
None, number, tuple of number or dict |
Translation as a fraction of the image height/width
(x-translation, y-translation), where |
translate_px |
None, int, tuple of int or dict |
Translation in pixels.
* If |
rotate |
number or tuple of number |
Rotation in degrees (NOT radians), i.e. expected value range is
around |
shear |
number, tuple of number or dict |
Shear in degrees (NOT radians), i.e. expected value range is
around |
interpolation |
int |
OpenCV interpolation flag. |
mask_interpolation |
int |
OpenCV interpolation flag. |
cval |
number or sequence of number |
The constant value to use when filling in newly created pixels.
(E.g. translating by 1px to the right will create a new 1px-wide column of pixels
on the left of the image).
The value is only used when |
cval_mask |
number or tuple of number |
Same as cval but only for masks. |
mode |
int |
OpenCV border flag. |
fit_output |
bool |
If True, the image plane size and position will be adjusted to tightly capture
the whole image after affine transformation ( |
keep_ratio |
bool |
When True, the original aspect ratio will be kept when the random scale is applied. Default: False. |
rotate_method |
str |
rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse"[1]. Default: "largest_box" |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, keypoints, bboxes
Image types: uint8, float32
Reference: [1] https://arxiv.org/abs/2109.13488
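To illustrate how scaling, rotation and translation combine into one affine map, the sketch below builds a 2x3 matrix and applies it to a point. It rotates about the origin and omits shear; the library's actual conventions (center of rotation, sign of the angle, order of operations) may differ, and both function names are hypothetical.

```python
import math


def affine_matrix(scale=1.0, rotate_deg=0.0, translate=(0.0, 0.0)):
    """Build a 2x3 matrix: scale and rotate about the origin, then translate."""
    theta = math.radians(rotate_deg)
    c, s = math.cos(theta) * scale, math.sin(theta) * scale
    tx, ty = translate
    return [[c, -s, tx], [s, c, ty]]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    (a, b, tx), (c, d, ty) = matrix
    return (a * x + b * y + tx, c * x + d * y + ty)


m = affine_matrix(scale=2.0, rotate_deg=90.0, translate=(10.0, 0.0))
moved = apply_affine(m, (1.0, 0.0))
```

Output pixels whose source location falls outside the input have no defined content, which is the situation the cval and mode parameters address.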
class
albumentations.augmentations.geometric.transforms.ElasticTransform
(alpha=1, sigma=50, alpha_affine=50, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, approximate=False, same_dxdy=False, p=0.5)
[view source on GitHub]
¶
Elastic deformation of images as described in [Simard2003]_ (with modifications). Based on https://gist.github.com/ernestum/601cdf56d2b424757de5
.. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for Convolutional Neural Networks applied to Visual Document Analysis", in Proc. of the International Conference on Document Analysis and Recognition, 2003.
Parameters:
Name | Type | Description |
---|---|---|
alpha |
float |
|
sigma |
float |
Gaussian filter parameter. |
alpha_affine |
float |
The range will be (-alpha_affine, alpha_affine) |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode |
OpenCV flag |
flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
value |
int, float, list of ints, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of ints,
list of float |
padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
approximate |
boolean |
Whether to smooth displacement map with fixed kernel size. Enabling this option gives ~2X speedup on large images. |
same_dxdy |
boolean |
Whether to use same random generated shift for x and y. Enabling this option gives ~2X speedup. |
Targets: image, mask, bbox
Image types: uint8, float32
class
albumentations.augmentations.geometric.transforms.Flip
[view source on GitHub]
¶
Flip the input either horizontally, vertically or both horizontally and vertically.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
albumentations.augmentations.geometric.transforms.Flip.apply (self, img, d=0, **params)
¶
d (int): code that specifies how to flip the input. 0 for vertical flipping, 1 for horizontal flipping, -1 for both vertical and horizontal flipping (which can also be seen as rotating the input by 180 degrees).
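The three flip codes can be sketched on a nested list as follows. This is only an illustration of the meaning of d, written without the library; `flip` is a hypothetical helper.

```python
def flip(image, d):
    """Flip a 2-D nested list: d=0 vertical, d=1 horizontal, d=-1 both."""
    if d == 0:
        return [list(row) for row in image[::-1]]  # reverse row order
    if d == 1:
        return [row[::-1] for row in image]        # reverse each row
    if d == -1:
        return [row[::-1] for row in image[::-1]]  # both: a 180 degree turn
    raise ValueError("d must be -1, 0 or 1")


image = [
    [1, 2],
    [3, 4],
]
vertical = flip(image, 0)
horizontal = flip(image, 1)
both = flip(image, -1)
```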
class
albumentations.augmentations.geometric.transforms.GridDistortion
(num_steps=5, distort_limit=0.3, interpolation=1, border_mode=4, value=None, mask_value=None, normalized=False, always_apply=False, p=0.5)
[view source on GitHub]
¶
Parameters:
Name | Type | Description |
---|---|---|
num_steps |
int |
count of grid cells on each side. |
distort_limit |
float, [float, float] |
If distort_limit is a single float, the range will be (-distort_limit, distort_limit). Default: (-0.3, 0.3). |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode |
OpenCV flag |
flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
value |
int, float, list of ints, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of ints,
list of float |
padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
normalized |
bool |
if True, distortion will be normalized so that it does not go outside the image. Default: False. For more information, see: https://github.com/albumentations-team/albumentations/pull/722 |
Targets: image, mask
Image types: uint8, float32
class
albumentations.augmentations.geometric.transforms.HorizontalFlip
[view source on GitHub]
¶
Flip the input horizontally around the y-axis.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
class
albumentations.augmentations.geometric.transforms.OpticalDistortion
(distort_limit=0.05, shift_limit=0.05, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)
[view source on GitHub]
¶
Parameters:
Name | Type | Description |
---|---|---|
distort_limit |
float, [float, float] |
If distort_limit is a single float, the range will be (-distort_limit, distort_limit). Default: (-0.05, 0.05). |
shift_limit |
float, [float, float] |
If shift_limit is a single float, the range will be (-shift_limit, shift_limit). Default: (-0.05, 0.05). |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode |
OpenCV flag |
flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
value |
int, float, list of ints, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of ints,
list of float |
padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
Targets: image, mask, bbox
Image types: uint8, float32
class
albumentations.augmentations.geometric.transforms.PadIfNeeded
(min_height=1024, min_width=1024, pad_height_divisor=None, pad_width_divisor=None, position=<PositionType.CENTER: 'center'>, border_mode=4, value=None, mask_value=None, always_apply=False, p=1.0)
¶
Pad the sides of the image / mask if a side is smaller than the desired size.
Parameters:
Name | Type | Description |
---|---|---|
min_height |
int |
minimal result image height. |
min_width |
int |
minimal result image width. |
pad_height_divisor |
int |
if not None, ensures the image height is divisible by the value of this argument. |
pad_width_divisor |
int |
if not None, ensures the image width is divisible by the value of this argument. |
position |
Union[str, PositionType] |
Position of the image. Should be one of PositionType.CENTER, PositionType.TOP_LEFT, PositionType.TOP_RIGHT, PositionType.BOTTOM_LEFT, PositionType.BOTTOM_RIGHT, or PositionType.RANDOM. Default: PositionType.CENTER. |
border_mode |
OpenCV flag |
OpenCV border mode. |
value |
int, float, list of int, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of int,
list of float |
padding value for mask if border_mode is cv2.BORDER_CONSTANT. |
p |
float |
probability of applying the transform. Default: 1.0. |
Targets: image, mask, bbox, keypoints
Image types: uint8, float32
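The padding arithmetic implied by min_height, min_width and the divisor arguments can be sketched in plain Python. This is a minimal sketch, assuming that exactly one of a minimum size or a divisor is given per side and that centered placement splits the padding as evenly as possible; the helper names are hypothetical.

```python
def pad_amount(size, min_size=None, divisor=None):
    """Return how many pixels to add along one side length.

    Assumption for this sketch: exactly one of min_size / divisor is given.
    """
    if min_size is not None:
        return max(min_size - size, 0)
    remainder = size % divisor
    return 0 if remainder == 0 else divisor - remainder


def split_center(pad):
    """Split padding into (before, after) parts for centered placement."""
    before = pad // 2
    return before, pad - before


# A 600x800 image padded to at least 1024x1024, image kept in the center.
top, bottom = split_center(pad_amount(600, min_size=1024))
left, right = split_center(pad_amount(800, min_size=1024))

# The same height padded up to the next multiple of 32 instead.
rows_to_add = pad_amount(600, divisor=32)
```

A side that already meets the requirement gets zero padding, so the transform leaves sufficiently large images unchanged.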
class
albumentations.augmentations.geometric.transforms.Perspective
(scale=(0.05, 0.1), keep_size=True, pad_mode=0, pad_val=0, mask_pad_val=0, fit_output=False, interpolation=1, always_apply=False, p=0.5)
¶
Perform a random four point perspective transform of the input.
Parameters:
Name | Type | Description |
---|---|---|
scale |
float or [float, float] |
standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Default: (0.05, 0.1). |
keep_size |
bool |
Whether to resize images back to their original size after applying the perspective transform. If set to False, the resulting images may end up having different shapes and will always be a list, never an array. Default: True |
pad_mode |
OpenCV flag |
OpenCV border mode. |
pad_val |
int, float, list of int, list of float |
padding value if pad_mode is cv2.BORDER_CONSTANT. Default: 0 |
mask_pad_val |
int, float, list of int, list of float |
padding value for mask if pad_mode is cv2.BORDER_CONSTANT. Default: 0 |
fit_output |
bool |
If True, the image plane size and position will be adjusted to still capture the whole image after perspective transformation. (Followed by image resizing if keep_size is set to True.) Otherwise, parts of the transformed image may be outside of the image plane. This setting should not be set to True when using large scale values as it could lead to very large images. Default: False |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, keypoints, bboxes
Image types: uint8, float32
class
albumentations.augmentations.geometric.transforms.PiecewiseAffine
(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, interpolation=1, mask_interpolation=0, cval=0, cval_mask=0, mode='constant', absolute_scale=False, always_apply=False, keypoints_threshold=0.01, p=0.5)
¶
Apply affine transformations that differ between local neighbourhoods. This augmentation places a regular grid of points on an image and randomly moves the neighbourhood of these points around via affine transformations. This leads to local distortions.
This is mostly a wrapper around scikit-image's PiecewiseAffine. See also Affine for a similar technique.
Note: This augmenter is very slow. Try to use ElasticTransform instead, which is at least 10x faster.
Note: For coordinate-based inputs (keypoints, bounding boxes, polygons, ...), this augmenter still has to perform an image-based augmentation, which makes it significantly slower than other transforms and not fully correct for such inputs.
Parameters:
Name | Type | Description |
---|---|---|
scale |
float, tuple of float |
Each point on the regular grid is moved around via a normal distribution.
This scale factor is equivalent to the normal distribution's sigma.
Note that the jitter (how far each point is moved in which direction) is multiplied by the height/width of
the image if |
nb_rows |
int, tuple of int |
Number of rows of points that the regular grid should have.
Must be at least |
nb_cols |
int, tuple of int |
Number of columns. Analogous to |
interpolation |
int |
The order of interpolation. The order has to be in the range 0-5: - 0: Nearest-neighbor - 1: Bi-linear (default) - 2: Bi-quadratic - 3: Bi-cubic - 4: Bi-quartic - 5: Bi-quintic |
mask_interpolation |
int |
same as interpolation but for mask. |
cval |
number |
The constant value to use when filling in newly created pixels. |
cval_mask |
number |
Same as cval but only for masks. |
mode |
str |
{'constant', 'edge', 'symmetric', 'reflect', 'wrap'}, optional
Points outside the boundaries of the input are filled according
to the given mode. Modes match the behaviour of |
absolute_scale |
bool |
Take |
keypoints_threshold |
float |
Used as threshold in conversion from distance maps to keypoints.
The search for keypoints works by searching for the
argmin (non-inverted) or argmax (inverted) in each channel. This
parameter contains the maximum (non-inverted) or minimum (inverted) value to accept in order to view a hit
as a keypoint. Use |
Targets: image, mask, keypoints, bboxes
Image types: uint8, float32
class
albumentations.augmentations.geometric.transforms.ShiftScaleRotate
(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, interpolation=1, border_mode=4, value=None, mask_value=None, shift_limit_x=None, shift_limit_y=None, rotate_method='largest_box', always_apply=False, p=0.5)
¶
Randomly apply affine transforms: translate, scale and rotate the input.
Parameters:
Name | Type | Description |
---|---|---|
shift_limit |
[float, float] or float |
shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [0, 1]. Default: (-0.0625, 0.0625). |
scale_limit |
[float, float] or float |
scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1). |
rotate_limit |
[int, int] or int |
rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45). |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode |
OpenCV flag |
flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
value |
int, float, list of int, list of float |
padding value if border_mode is cv2.BORDER_CONSTANT. |
mask_value |
int, float,
list of int,
list of float |
padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
shift_limit_x |
[float, float] or float |
shift factor range for width. If it is set then this value instead of shift_limit will be used for shifting width. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [0, 1]. Default: None. |
shift_limit_y |
[float, float] or float |
shift factor range for height. If it is set then this value instead of shift_limit will be used for shifting height. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [0, 1]. Default: None. |
rotate_method |
str |
rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse". Default: "largest_box" |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, keypoints, bboxes
Image types: uint8, float32
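The way the limit arguments expand into sampling ranges can be sketched as follows. This is a minimal plain-Python sketch of the rules described in the table above (a scalar becomes a symmetric range, and scale_limit is biased by 1); the helper name to_range and the uniform sampling are assumptions of the example.

```python
import random


def to_range(limit, bias=0.0):
    """Expand a scalar limit to (-limit, limit), or pass a pair through, then add a bias."""
    if isinstance(limit, (int, float)):
        low, high = -limit, limit
    else:
        low, high = limit
    return (low + bias, high + bias)


shift_range = to_range(0.0625)
scale_range = to_range(0.1, bias=1.0)  # scale_limit is biased by 1
rotate_range = to_range(45)

rng = random.Random(0)  # seeded only to make the sketch repeatable
params = {
    "dx": rng.uniform(*shift_range),      # fraction of image width
    "dy": rng.uniform(*shift_range),      # fraction of image height
    "scale": rng.uniform(*scale_range),   # multiplicative scale factor
    "angle": rng.uniform(*rotate_range),  # degrees
}
```

With the defaults, the sampled scale factor therefore lies between 0.9 and 1.1 rather than between -0.1 and 0.1.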
class
albumentations.augmentations.geometric.transforms.Transpose
¶
Transpose the input by swapping rows and columns.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
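Swapping rows and columns can be illustrated on a nested-list image. The sketch below is only an illustration; the keypoint format (x, y) and the helper names are assumptions of the example.

```python
def transpose_image(image):
    """Swap rows and columns of a row-major nested-list image."""
    return [list(column) for column in zip(*image)]


def transpose_keypoint(keypoint):
    """Swap the x and y coordinates of an (x, y) keypoint."""
    x, y = keypoint
    return (y, x)


image = [
    [1, 2, 3],
    [4, 5, 6],
]
transposed = transpose_image(image)

# The pixel with value 3 sits at x=2, y=0 before the transpose.
new_x, new_y = transpose_keypoint((2, 0))
```

A 2x3 image becomes a 3x2 image, and the keypoint still lands on the same pixel value.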
class
albumentations.augmentations.geometric.transforms.VerticalFlip
¶
Flip the input vertically around the x-axis.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask, bboxes, keypoints
Image types: uint8, float32
albumentations.augmentations.transforms
¶
class
albumentations.augmentations.transforms.ChannelShuffle
¶
Randomly rearrange channels of the input RGB image.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.CLAHE
(clip_limit=4.0, tile_grid_size=(8, 8), always_apply=False, p=0.5)
¶
Apply Contrast Limited Adaptive Histogram Equalization to the input image.
Parameters:
Name | Type | Description |
---|---|---|
clip_limit |
float or [float, float] |
upper threshold value for contrast limiting. If clip_limit is a single float value, the range will be (1, clip_limit). Default: (1, 4). |
tile_grid_size |
[int, int] |
size of grid for histogram equalization. Default: (8, 8). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8
class
albumentations.augmentations.transforms.ColorJitter
(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.2, always_apply=False, p=0.5)
¶
Randomly change the brightness, contrast, saturation, and hue of an image. Compared to ColorJitter from torchvision, this transform gives slightly different results because Pillow (used in torchvision) and OpenCV (used in Albumentations) convert an image to HSV format using different formulas. Another difference: Pillow uses uint8 overflow, whereas Albumentations uses value saturation.
Parameters:
Name | Type | Description |
---|---|---|
brightness |
float or tuple of float (min, max) |
How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non-negative numbers. |
contrast |
float or tuple of float (min, max) |
How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non-negative numbers. |
saturation |
float or tuple of float (min, max) |
How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non-negative numbers. |
hue |
float or tuple of float (min, max) |
How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5. |
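The factor ranges described in the table can be written out directly. The following is a minimal sketch of that arithmetic in plain Python; the helper name jitter_range and its keyword arguments are hypothetical and only restate the documented intervals.

```python
def jitter_range(value, center=1.0, clip_at_zero=True):
    """Turn a scalar jitter amount into a (min, max) sampling range; pass pairs through."""
    if isinstance(value, (int, float)):
        low, high = center - value, center + value
        if clip_at_zero:
            low = max(0.0, low)
        return (low, high)
    return tuple(value)


# brightness, contrast and saturation: [max(0, 1 - x), 1 + x]
brightness_range = jitter_range(0.2)
wide_contrast_range = jitter_range(1.5)  # lower end is clipped at 0

# hue: [-hue, hue], centered on 0 and not clipped
hue_range = jitter_range(0.2, center=0.0, clip_at_zero=False)

# an explicit (min, max) pair is used as given
explicit_range = jitter_range((0.8, 1.0))
```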
class
albumentations.augmentations.transforms.Downscale
(scale_min=0.25, scale_max=0.25, interpolation=None, always_apply=False, p=0.5)
¶
Decreases image quality by downscaling and upscaling back.
Parameters:
Name | Type | Description |
---|---|---|
scale_min |
float |
lower bound on the image scale. Should be < 1. |
scale_max |
float |
upper bound on the image scale. Should be < 1. |
interpolation |
cv2 interpolation method. Could be: - single cv2 interpolation flag - selected method will be used for downscale and upscale. - dict(downscale=flag, upscale=flag) - Downscale.Interpolation(downscale=flag, upscale=flag) - Default: Interpolation(downscale=cv2.INTER_NEAREST, upscale=cv2.INTER_NEAREST) |
Targets: image
Image types: uint8, float32
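The quality loss from downscaling and upscaling back can be seen on a tiny nested-list image. This is a minimal sketch using a simple nearest-neighbor index mapping written for the example; OpenCV's cv2.INTER_NEAREST may pick slightly different source pixels, and the helper names are hypothetical.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a nested-list image by picking the nearest source pixel (floor mapping)."""
    h, w = len(image), len(image[0])
    return [
        [image[min(int(r * h / new_h), h - 1)][min(int(c * w / new_w), w - 1)]
         for c in range(new_w)]
        for r in range(new_h)
    ]


def downscale_and_restore(image, scale):
    """Shrink the image by `scale` (< 1), then enlarge it back to its original size."""
    h, w = len(image), len(image[0])
    small = resize_nearest(image, max(1, int(h * scale)), max(1, int(w * scale)))
    return resize_nearest(small, h, w)


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
degraded = downscale_and_restore(image, scale=0.5)
```

The output keeps the original shape, but each 2x2 block now repeats a single source pixel, which is the loss of detail the transform is meant to simulate.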
class
albumentations.augmentations.transforms.Emboss
(alpha=(0.2, 0.5), strength=(0.2, 0.7), always_apply=False, p=0.5)
¶
Emboss the input image and overlay the result with the original image.
Parameters:
Name | Type | Description |
---|---|---|
alpha |
[float, float] |
range to choose the visibility of the embossed image. At 0, only the original image is visible, at 1.0 only its embossed version is visible. Default: (0.2, 0.5). |
strength |
[float, float] |
strength range of the embossing. Default: (0.2, 0.7). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
class
albumentations.augmentations.transforms.Equalize
(mode='cv', by_channels=True, mask=None, mask_params=(), always_apply=False, p=0.5)
¶
Equalize the image histogram.
Parameters:
Name | Type | Description |
---|---|---|
mode |
str |
{'cv', 'pil'}. Use OpenCV or Pillow equalization method. |
by_channels |
bool |
If True, use equalization by channels separately,
else convert image to YCbCr representation and use equalization by |
mask |
np.ndarray, callable |
If given, only the pixels selected by
the mask are included in the analysis. May be a 1-channel or 3-channel array or a callable.
Function signature must include |
mask_params |
list of str |
Params for mask function. |
Targets: image
Image types: uint8
class
albumentations.augmentations.transforms.FancyPCA
(alpha=0.1, always_apply=False, p=0.5)
¶
Augment RGB image using FancyPCA from Krizhevsky's paper "ImageNet Classification with Deep Convolutional Neural Networks"
Parameters:
Name | Type | Description |
---|---|---|
alpha |
float |
how much to perturb/scale the eigenvectors and eigenvalues. The scale is sampled from a Gaussian distribution (mu=0, sigma=alpha) |
Targets: image
Image types: 3-channel uint8 images only
Credit: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf https://deshanadesai.github.io/notes/Fancy-PCA-with-Scikit-Image https://pixelatedbrian.github.io/2018-04-29-fancy_pca/
class
albumentations.augmentations.transforms.FromFloat
(dtype='uint16', max_value=None, always_apply=False, p=1.0)
¶
Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value and then cast the resulting values to the type specified by dtype. If max_value is None, the transform will try to infer the maximum value for the data type from the dtype argument.
This is the inverse transform for albumentations.augmentations.transforms.ToFloat.
Parameters:
Name | Type | Description |
---|---|---|
max_value |
float |
maximum possible input value. Default: None. |
dtype |
string or numpy data type |
data type of the output. See the |
p |
float |
probability of applying the transform. Default: 1.0. |
Targets: image
Image types: float32
'Data types' page from the NumPy docs: https://docs.scipy.org/doc/numpy/user/basics.types.html
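The multiply-then-cast step can be sketched without NumPy. This is a minimal sketch, assuming a small lookup table of integer maxima keyed by dtype name and a truncating integer conversion; both the table and the helper name are hypothetical stand-ins for the example.

```python
# Largest representable value per integer dtype name (illustrative subset).
MAX_VALUES = {"uint8": 255, "uint16": 65535, "uint32": 4294967295}


def from_float(values, dtype="uint16", max_value=None):
    """Scale floats in [0, 1] by max_value and convert them to integers."""
    if max_value is None:
        max_value = MAX_VALUES[dtype]  # inferred from the dtype name
    # int() truncates toward zero, standing in for the cast to the target type.
    return [int(v * max_value) for v in values]


as_uint16 = from_float([0.0, 0.5, 1.0])
as_uint8 = from_float([0.0, 1.0], dtype="uint8")
custom = from_float([0.5], max_value=100)
```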
class
albumentations.augmentations.transforms.GaussNoise
(var_limit=(10.0, 50.0), mean=0, per_channel=True, always_apply=False, p=0.5)
¶
Apply Gaussian noise to the input image.
Parameters:
Name | Type | Description |
---|---|---|
var_limit |
[float, float] or float |
variance range for noise. If var_limit is a single float, the range will be (0, var_limit). Default: (10.0, 50.0). |
mean |
float |
mean of the noise. Default: 0 |
per_channel |
bool |
if set to True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Default: True |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
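How var_limit and mean turn into noise values can be sketched with the standard library. This is a minimal sketch, assuming the variance is drawn uniformly from var_limit; the helper name and the count argument are hypothetical, and per-channel handling is left out.

```python
import random


def sample_gauss_noise(var_limit=(10.0, 50.0), mean=0.0, count=3, rng=None):
    """Draw a variance from var_limit, then `count` noise values with that variance."""
    rng = rng or random.Random()
    if isinstance(var_limit, (int, float)):
        var_limit = (0.0, float(var_limit))  # a single value means (0, var_limit)
    variance = rng.uniform(*var_limit)
    sigma = variance ** 0.5  # standard deviation is the square root of the variance
    return variance, [rng.gauss(mean, sigma) for _ in range(count)]


variance, noise = sample_gauss_noise(rng=random.Random(0))
single_variance, _ = sample_gauss_noise(var_limit=20, rng=random.Random(1))
```

Because var_limit is a variance, the typical noise amplitude with the defaults is roughly 3 to 7 intensity levels, not 10 to 50.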
class
albumentations.augmentations.transforms.HueSaturationValue
(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, always_apply=False, p=0.5)
¶
Randomly change hue, saturation and value of the input image.
Parameters:
Name | Type | Description |
---|---|---|
hue_shift_limit |
[int, int] or int |
range for changing hue. If hue_shift_limit is a single int, the range will be (-hue_shift_limit, hue_shift_limit). Default: (-20, 20). |
sat_shift_limit |
[int, int] or int |
range for changing saturation. If sat_shift_limit is a single int, the range will be (-sat_shift_limit, sat_shift_limit). Default: (-30, 30). |
val_shift_limit |
[int, int] or int |
range for changing value. If val_shift_limit is a single int, the range will be (-val_shift_limit, val_shift_limit). Default: (-20, 20). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.ImageCompression
(quality_lower=99, quality_upper=100, compression_type=<ImageCompressionType.JPEG: 0>, always_apply=False, p=0.5)
¶
Decrease image quality by applying JPEG or WebP compression to an image.
Parameters:
Name | Type | Description |
---|---|---|
quality_lower |
float |
lower bound on the image quality. Should be in [0, 100] range for jpeg and [1, 100] for webp. |
quality_upper |
float |
upper bound on the image quality. Should be in [0, 100] range for jpeg and [1, 100] for webp. |
compression_type |
ImageCompressionType |
should be ImageCompressionType.JPEG or ImageCompressionType.WEBP. Default: ImageCompressionType.JPEG |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.ImageCompression.ImageCompressionType
¶
An enumeration.
class
albumentations.augmentations.transforms.InvertImg
¶
Invert the input image by subtracting pixel values from 255.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.ISONoise
(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), always_apply=False, p=0.5)
¶
Apply camera sensor noise.
Parameters:
Name | Type | Description |
---|---|---|
color_shift |
[float, float] |
variance range for color hue change. Measured as a fraction of 360 degree Hue angle in HLS colorspace. |
intensity |
[float, float] |
Multiplicative factor that controls the strength of color and luminance noise. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8
class
albumentations.augmentations.transforms.JpegCompression
(quality_lower=99, quality_upper=100, always_apply=False, p=0.5)
¶
Decrease image quality by applying JPEG compression to an image.
Parameters:
Name | Type | Description |
---|---|---|
quality_lower |
float |
lower bound on the jpeg quality. Should be in [0, 100] range |
quality_upper |
float |
upper bound on the jpeg quality. Should be in [0, 100] range |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.Lambda
(image=None, mask=None, keypoint=None, bbox=None, name=None, always_apply=False, p=1.0)
¶
A flexible transformation class for applying user-defined transformation functions per target. Each function signature must include **kwargs to accept optional arguments such as the interpolation method, image size, etc.
Parameters:
Name | Type | Description |
---|---|---|
image |
callable |
Image transformation function. |
mask |
callable |
Mask transformation function. |
keypoint |
callable |
Keypoint transformation function. |
bbox |
callable |
BBox transformation function. |
always_apply |
bool |
Indicates whether this transformation should be always applied. |
p |
float |
probability of applying the transform. Default: 1.0. |
Targets: image, mask, bboxes, keypoints
Image types: Any
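The **kwargs requirement can be illustrated with two small target functions. This is a minimal sketch on nested-list data; the function names and the keyword arguments passed in the call (interpolation, rows, cols) are hypothetical stand-ins for whatever optional arguments a caller supplies.

```python
def brighten(image, **kwargs):
    """Image function: add 10 to every pixel, clipped to 255; extra keyword arguments are ignored."""
    return [[min(pixel + 10, 255) for pixel in row] for row in image]


def keep_mask(mask, **kwargs):
    """Mask function: return the mask unchanged; extra keyword arguments are ignored."""
    return mask


# Stand-in for a caller that passes optional arguments alongside the target.
image = [[0, 100], [250, 255]]
result = brighten(image, interpolation=1, rows=2, cols=2)
same_mask = keep_mask([[0, 1], [1, 0]], rows=2, cols=2)
```

Functions shaped like these would be passed as the image and mask arguments listed above. Without **kwargs, the extra keyword arguments in the call would raise a TypeError.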
class
albumentations.augmentations.transforms.MultiplicativeNoise
(multiplier=(0.9, 1.1), per_channel=False, elementwise=False, always_apply=False, p=0.5)
¶
Multiply the image by a random number or an array of numbers.
Parameters:
Name | Type | Description |
---|---|---|
multiplier |
float or tuple of floats |
If a single float, the image will be multiplied by this number.
If tuple of float multiplier will be in range |
per_channel |
bool |
If |
elementwise |
bool |
If |
Targets: image
Image types: Any
class
albumentations.augmentations.transforms.Normalize
(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, always_apply=False, p=1.0)
¶
Normalization is applied by the formula: img = (img - mean * max_pixel_value) / (std * max_pixel_value)
Parameters:
Name | Type | Description |
---|---|---|
mean |
float, list of float |
mean values |
std |
float, list of float |
std values |
max_pixel_value |
float |
maximum possible pixel value |
Targets: image
Image types: uint8, float32
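The normalization formula above can be checked on single pixels. The sketch below applies it per channel in plain Python with the default mean, std and max_pixel_value from the signature; the helper name normalize_pixel is hypothetical.

```python
def normalize_pixel(pixel, mean, std, max_pixel_value=255.0):
    """Apply (value - mean * max_pixel_value) / (std * max_pixel_value) per channel."""
    return tuple(
        (value - m * max_pixel_value) / (s * max_pixel_value)
        for value, m, s in zip(pixel, mean, std)
    )


mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)

white = normalize_pixel((255, 255, 255), mean, std)
black = normalize_pixel((0, 0, 0), mean, std)
# A pixel exactly at the (scaled) mean maps to zero in every channel.
at_mean = normalize_pixel(tuple(m * 255.0 for m in mean), mean, std)
```

Since mean and std are expressed on a [0, 1] scale, the formula is equivalent to dividing the image by max_pixel_value first and then standardizing.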
class
albumentations.augmentations.transforms.PixelDropout
(dropout_prob=0.01, per_channel=False, drop_value=0, mask_drop_value=None, always_apply=False, p=0.5)
¶
Set pixels to 0 with some probability.
Parameters:
Name | Type | Description |
---|---|---|
dropout_prob |
float |
pixel drop probability. Default: 0.01 |
per_channel |
bool |
if set to |
drop_value |
number or sequence of numbers or None |
Value that will be set in dropped place. If set to None value will be sampled randomly, default ranges will be used: - uint8 - [0, 255] - uint16 - [0, 65535] - uint32 - [0, 4294967295] - float, double - [0, 1] Default: 0 |
mask_drop_value |
number or sequence of numbers or None |
Value that will be set in dropped place in masks. If set to None masks will be unchanged. Default: None |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask
Image types: any
class
albumentations.augmentations.transforms.Posterize
(num_bits=4, always_apply=False, p=0.5)
¶
Reduce the number of bits for each color channel.
Parameters:
Name | Type | Description |
---|---|---|
num_bits |
[int, int] or int,
or list of ints [r, g, b],
or list of ints [[r1, r1], [g1, g2], [b1, b2]] |
number of high bits. If num_bits is a single value, the range will be [num_bits, num_bits]. Must be in range [0, 8]. Default: 4. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8
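Keeping only the high bits of each channel value amounts to a bit mask. The sketch below shows this for 8-bit values in plain Python; the helper name posterize_value is hypothetical.

```python
def posterize_value(value, num_bits):
    """Keep only the num_bits highest bits of an 8-bit value."""
    mask = (0xFF << (8 - num_bits)) & 0xFF  # e.g. num_bits=4 gives 0b11110000
    return value & mask


four_bits = [posterize_value(v, 4) for v in (15, 200, 255)]
one_bit = posterize_value(200, 1)
unchanged = posterize_value(200, 8)
```

With num_bits=4, each channel can take only 16 distinct values, which produces the banding typical of posterized images.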
class
albumentations.augmentations.transforms.RandomBrightness
(limit=0.2, always_apply=False, p=0.5)
¶
Randomly change brightness of the input image.
Parameters:
Name | Type | Description |
---|---|---|
limit |
[float, float] or float |
factor range for changing brightness. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomBrightnessContrast
(brightness_limit=0.2, contrast_limit=0.2, brightness_by_max=True, always_apply=False, p=0.5)
¶
Randomly change brightness and contrast of the input image.
Parameters:
Name | Type | Description |
---|---|---|
brightness_limit |
[float, float] or float |
factor range for changing brightness. If brightness_limit is a single float, the range will be (-brightness_limit, brightness_limit). Default: (-0.2, 0.2). |
contrast_limit |
[float, float] or float |
factor range for changing contrast. If contrast_limit is a single float, the range will be (-contrast_limit, contrast_limit). Default: (-0.2, 0.2). |
brightness_by_max |
bool |
If True, adjust brightness by the image dtype maximum; otherwise adjust brightness by the image mean. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
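One plausible reading of how the sampled factors act on pixel values is sketched below. It assumes, for illustration only, that a contrast factor alpha scales the values and a brightness factor beta adds an offset proportional either to the dtype maximum or to the image mean, depending on brightness_by_max; the helper name and the clipping to [0, max_value] are assumptions of the example.

```python
def adjust_brightness_contrast(values, alpha, beta, brightness_by_max=True, max_value=255):
    """Scale values by alpha (contrast) and add a brightness offset driven by beta.

    Assumption of this sketch: the offset is beta * max_value when
    brightness_by_max is True, and beta * mean(values) otherwise.
    """
    reference = max_value if brightness_by_max else sum(values) / len(values)
    offset = beta * reference
    return [min(max(v * alpha + offset, 0), max_value) for v in values]


pixels = [0, 100, 200]
by_max = adjust_brightness_contrast(pixels, alpha=1.0, beta=0.2)
by_mean = adjust_brightness_contrast(pixels, alpha=1.0, beta=0.2, brightness_by_max=False)
```

Under this reading, the same beta produces a much larger shift with brightness_by_max=True on dark images, because the offset does not depend on the image content.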
class
albumentations.augmentations.transforms.RandomContrast
(limit=0.2, always_apply=False, p=0.5)
¶
Randomly change contrast of the input image.
Parameters:
Name | Type | Description |
---|---|---|
limit |
[float, float] or float |
factor range for changing contrast. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomFog
(fog_coef_lower=0.3, fog_coef_upper=1, alpha_coef=0.08, always_apply=False, p=0.5)
¶
Simulates fog for the image
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
fog_coef_lower |
float |
lower limit for fog intensity coefficient. Should be in [0, 1] range. |
fog_coef_upper |
float |
upper limit for fog intensity coefficient. Should be in [0, 1] range. |
alpha_coef |
float |
transparency of the fog circles. Should be in [0, 1] range. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomGamma
(gamma_limit=(80, 120), eps=None, always_apply=False, p=0.5)
¶
Parameters:
Name | Type | Description |
---|---|---|
gamma_limit |
float or [float, float] |
If gamma_limit is a single float value, the range will be (-gamma_limit, gamma_limit). Default: (80, 120). |
eps |
Deprecated. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomGravel
(gravel_roi=(0.1, 0.4, 0.9, 0.9), number_of_patches=2, always_apply=False, p=0.5)
¶
Add gravel patches to the image.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
gravel_roi |
float, float, float, float |
(top-left x, top-left y, bottom-right x, bottom-right y). Should be in [0, 1] range |
number_of_patches |
int |
number of gravel patches required |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomGridShuffle
(grid=(3, 3), always_apply=False, p=0.5)
¶
Randomly shuffle the grid's cells on the image.
Parameters:
Name | Type | Description |
---|---|---|
grid |
[int, int] |
size of grid for splitting image. |
Targets: image, mask, keypoints
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomRain
(slant_lower=-10, slant_upper=10, drop_length=20, drop_width=1, drop_color=(200, 200, 200), blur_value=7, brightness_coefficient=0.7, rain_type=None, always_apply=False, p=0.5)
¶
Adds rain effects.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
slant_lower |
|
should be in range [-20, 20]. |
slant_upper |
|
should be in range [-20, 20]. |
drop_length |
|
should be in range [0, 100]. |
drop_width |
|
should be in range [1, 5]. |
drop_color |
list of (r, g, b) |
rain lines color. |
blur_value |
int |
rainy views are blurry |
brightness_coefficient |
float |
rainy days are usually shady. Should be in range [0, 1]. |
rain_type |
One of [None, "drizzle", "heavy", "torrential"] |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomShadow
(shadow_roi=(0, 0.5, 1, 1), num_shadows_lower=1, num_shadows_upper=2, shadow_dimension=5, always_apply=False, p=0.5)
¶
Simulates shadows for the image
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
shadow_roi |
float, float, float, float |
region of the image where shadows will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1]. |
num_shadows_lower |
int |
Lower limit for the possible number of shadows.
Should be in range [0, |
num_shadows_upper |
int |
Upper limit for the possible number of shadows.
Should be in range [ |
shadow_dimension |
int |
number of edges in the shadow polygons |
Targets: image
Image types: uint8, float32
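Region arguments such as shadow_roi are given as fractions of the image size. The sketch below shows how such a normalized (x_min, y_min, x_max, y_max) tuple maps to pixel coordinates; the helper name and the truncation to integers are assumptions of the example.

```python
def roi_to_pixels(roi, height, width):
    """Convert a normalized (x_min, y_min, x_max, y_max) region to pixel coordinates."""
    x_min, y_min, x_max, y_max = roi
    return (
        int(x_min * width),
        int(y_min * height),
        int(x_max * width),
        int(y_max * height),
    )


# The default shadow_roi covers the lower half of the image.
default_region = roi_to_pixels((0, 0.5, 1, 1), height=480, width=640)
```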
class
albumentations.augmentations.transforms.RandomSnow
(snow_point_lower=0.1, snow_point_upper=0.3, brightness_coeff=2.5, always_apply=False, p=0.5)
¶
Bleach out some pixel values simulating snow.
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
snow_point_lower |
float |
lower bound of the amount of snow. Should be in [0, 1] range |
snow_point_upper |
float |
upper bound of the amount of snow. Should be in [0, 1] range |
brightness_coeff |
float |
a larger number will lead to more snow on the image. Should be >= 0 |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomSunFlare
(flare_roi=(0, 0, 1, 0.5), angle_lower=0, angle_upper=1, num_flare_circles_lower=6, num_flare_circles_upper=10, src_radius=400, src_color=(255, 255, 255), always_apply=False, p=0.5)
¶
Simulates Sun Flare for the image
From https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Parameters:
Name | Type | Description |
---|---|---|
flare_roi |
float, float, float, float |
region of the image where flare will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1]. |
angle_lower |
float |
should be in range [0, |
angle_upper |
float |
should be in range [ |
num_flare_circles_lower |
int |
lower limit for the number of flare circles.
Should be in range [0, |
num_flare_circles_upper |
int |
upper limit for the number of flare circles.
Should be in range [ |
src_radius |
int |
|
src_color |
int, int, int |
color of the flare |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.RandomToneCurve
(scale=0.1, always_apply=False, p=0.5)
¶
Randomly change the relationship between bright and dark areas of the image by manipulating its tone curve.
Parameters:
Name | Type | Description |
---|---|---|
scale |
float |
standard deviation of the normal distribution. Used to sample random distances to move two control points that modify the image's curve. Values should be in range [0, 1]. Default: 0.1 |
Targets: image
Image types: uint8
class
albumentations.augmentations.transforms.RGBShift
(r_shift_limit=20, g_shift_limit=20, b_shift_limit=20, always_apply=False, p=0.5)
¶
Randomly shift values for each channel of the input RGB image.
Parameters:
Name | Type | Description |
---|---|---|
r_shift_limit |
[int, int] or int |
range for changing values for the red channel. If r_shift_limit is a single int, the range will be (-r_shift_limit, r_shift_limit). Default: (-20, 20). |
g_shift_limit |
[int, int] or int |
range for changing values for the green channel. If g_shift_limit is a single int, the range will be (-g_shift_limit, g_shift_limit). Default: (-20, 20). |
b_shift_limit |
[int, int] or int |
range for changing values for the blue channel. If b_shift_limit is a single int, the range will be (-b_shift_limit, b_shift_limit). Default: (-20, 20). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
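A per-channel shift on a single 8-bit pixel can be sketched as follows. The sketch assumes the shifted values are clipped to [0, max_value] rather than wrapped around; the helper name shift_rgb is hypothetical.

```python
def shift_rgb(pixel, r_shift, g_shift, b_shift, max_value=255):
    """Add a separate shift to each channel and clip the result to [0, max_value]."""
    shifts = (r_shift, g_shift, b_shift)
    return tuple(min(max(c + s, 0), max_value) for c, s in zip(pixel, shifts))


shifted = shift_rgb((250, 10, 128), r_shift=20, g_shift=-20, b_shift=5)
```

In the transform itself, each shift would be drawn from the corresponding r_shift_limit, g_shift_limit, or b_shift_limit range.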
class
albumentations.augmentations.transforms.RingingOvershoot
(blur_limit=(7, 15), cutoff=(0.7853981633974483, 1.5707963267948966), always_apply=False, p=0.5)
¶
Create ringing or overshoot artefacts by convolving the image with a 2D sinc filter.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
int, [int, int] |
maximum kernel size for sinc filter. Should be in range [3, inf). Default: (7, 15). |
cutoff |
float, [float, float] |
range to choose the cutoff frequency in radians. Should be in range (0, np.pi) Default: (np.pi / 4, np.pi / 2). |
p |
float |
probability of applying the transform. Default: 0.5. |
Reference: dsp.stackexchange.com/questions/58301/2-d-circularly-symmetric-low-pass-filter https://arxiv.org/abs/2107.10833
Targets: image
class
albumentations.augmentations.transforms.Sharpen
(alpha=(0.2, 0.5), lightness=(0.5, 1.0), always_apply=False, p=0.5)
[view source on GitHub]
¶
Sharpen the input image and overlay the result with the original image.
Parameters:
Name | Type | Description |
---|---|---|
alpha |
[float, float] |
range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5). |
lightness |
[float, float] |
range to choose the lightness of the sharpened image. Default: (0.5, 1.0). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
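The role of alpha can be sketched as a linear blend between the original and the sharpened pixel values. This is an assumption that matches the two endpoints described above (0 shows only the original, 1.0 only the sharpened version). The `overlay` helper is hypothetical and not part of the library.

```python
def overlay(original, sharpened, alpha):
    """Blend two equally sized pixel rows; alpha is the weight of the sharpened row."""
    return [(1 - alpha) * o + alpha * s for o, s in zip(original, sharpened)]


original = [100, 120, 140]
sharpened = [80, 130, 180]  # made-up values standing in for a sharpened row

only_original = overlay(original, sharpened, 0.0)
only_sharpened = overlay(original, sharpened, 1.0)
half = overlay(original, sharpened, 0.5)
```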
class
albumentations.augmentations.transforms.Solarize
(threshold=128, always_apply=False, p=0.5)
[view source on GitHub]
¶
Invert all pixel values above a threshold.
Parameters:
Name | Type | Description |
---|---|---|
threshold |
[int, int] or int, or [float, float] or float |
range for solarizing threshold. If threshold is a single value, the range will be [threshold, threshold]. Default: 128. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: any
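As a rough illustration of solarization on 8-bit values, the sketch below inverts every value at or above the threshold. Treating the threshold itself as inverted and using 255 as the maximum are assumptions. The `solarize_row` helper is hypothetical.

```python
def solarize_row(values, threshold=128, max_value=255):
    """Invert values at or above the threshold, leave the rest untouched."""
    return [max_value - v if v >= threshold else v for v in values]


row = [0, 100, 128, 200, 255]
solarized = solarize_row(row)
```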
class
albumentations.augmentations.transforms.Spatter
(mean=0.65, std=0.3, gauss_sigma=2, cutout_threshold=0.68, intensity=0.6, mode='rain', color=None, always_apply=False, p=0.5)
[view source on GitHub]
¶
Apply spatter transform. It simulates corruption which can occlude a lens in the form of rain or mud.
Parameters:
Name | Type | Description |
---|---|---|
mean |
float, or tuple of floats |
Mean value of normal distribution for generating liquid layer.
If single float it will be used as mean.
If tuple of float mean will be sampled from range |
std |
float, or tuple of floats |
Standard deviation value of normal distribution for generating liquid layer.
If single float it will be used as std.
If tuple of float std will be sampled from range |
gauss_sigma |
float, or tuple of floats |
Sigma value for gaussian filtering of liquid layer.
If single float it will be used as gauss_sigma.
If tuple of float gauss_sigma will be sampled from range |
cutout_threshold |
float, or tuple of floats |
Threshold for filtering liquid layer
(determines number of drops). If single float it will be used as cutout_threshold.
If tuple of float cutout_threshold will be sampled from range |
intensity |
float, or tuple of floats |
Intensity of corruption.
If single float it will be used as intensity.
If tuple of float intensity will be sampled from range |
mode |
string, or list of strings |
Type of corruption. Currently supported options are 'rain' and 'mud'. If a list is provided, the type of corruption will be sampled from the list. Default: ("rain"). |
color |
list of (r, g, b) or dict or None |
Corruption elements color. If list uses provided list as color for specified mode. If dict uses provided color for specified mode. Color for each specified mode should be provided in dict. If None uses default colors (rain: (238, 238, 175), mud: (20, 42, 63)). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
Reference: https://arxiv.org/pdf/1903.12261.pdf https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py
class
albumentations.augmentations.transforms.Superpixels
(p_replace=0.1, n_segments=100, max_size=128, interpolation=1, always_apply=False, p=0.5)
[view source on GitHub]
¶
Transform images partially/completely to their superpixel representation. This implementation uses skimage's version of the SLIC algorithm.
Parameters:
Name | Type | Description |
---|---|---|
p_replace |
float or tuple of float |
Defines for any segment the probability that the pixels within that
segment are replaced by their average color (otherwise, the pixels are not changed).
Examples:
* A probability of |
n_segments |
int, or tuple of int |
Rough target number of how many superpixels to generate (the algorithm
may deviate from this number). Lower value will lead to coarser superpixels.
Higher values are computationally more intensive and will hence lead to a slowdown
* If a single |
max_size |
int or None |
Maximum image size at which the augmentation is performed.
If the width or height of an image exceeds this value, it will be
downscaled before the augmentation so that the longest side matches |
interpolation |
OpenCV flag |
flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
class
albumentations.augmentations.transforms.TemplateTransform
(templates, img_weight=0.5, template_weight=0.5, template_transform=None, name=None, always_apply=False, p=0.5)
[view source on GitHub]
¶
Apply blending of input image with specified templates
Parameters:
Name | Type | Description |
---|---|---|
templates |
numpy array or list of numpy arrays |
Images as template for transform. |
img_weight |
[float, float] or float |
If single float will be used as weight for input image.
If tuple of float img_weight will be in range |
template_weight |
[float, float] or float |
If single float will be used as weight for template.
If tuple of float template_weight will be in range |
template_transform |
transformation object which could be applied to template, must produce template the same size as input image. |
|
name |
string |
(Optional) Name of transform, used only for deserialization. |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.ToFloat
(max_value=None, always_apply=False, p=1.0)
[view source on GitHub]
¶
Divide pixel values by max_value
to get a float32 output array where all values lie in the range [0, 1.0].
If max_value
is None the transform will try to infer the maximum value by inspecting the data type of the input
image.
See Also:
albumentations.augmentations.transforms.FromFloat
Parameters:
Name | Type | Description |
---|---|---|
max_value |
float |
maximum possible input value. Default: None. |
p |
float |
probability of applying the transform. Default: 1.0. |
Targets: image
Image types: any type
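The division by max_value, with a fallback that infers the maximum from the input data type, might look like the sketch below. The lookup table and the `to_float_sketch` helper are assumptions for illustration. Plain lists stand in for image arrays.

```python
# Assumed lookup table: maximum representable value per input dtype name.
MAX_VALUES_BY_DTYPE = {
    "uint8": 255,
    "uint16": 65535,
    "uint32": 4294967295,
    "float32": 1.0,
}


def to_float_sketch(values, dtype_name, max_value=None):
    """Scale values into [0, 1.0]; infer max_value from the dtype name if omitted."""
    if max_value is None:
        if dtype_name not in MAX_VALUES_BY_DTYPE:
            raise RuntimeError(
                f"Cannot infer max_value for dtype {dtype_name!r}; pass it explicitly."
            )
        max_value = MAX_VALUES_BY_DTYPE[dtype_name]
    return [v / max_value for v in values]


scaled = to_float_sketch([0, 51, 255], "uint8")
scaled_explicit = to_float_sketch([0, 512, 1024], "int64", max_value=1024)
```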
class
albumentations.augmentations.transforms.ToGray
[view source on GitHub]
¶
Convert the input RGB image to grayscale. If the mean pixel value for the resulting image is greater than 127, invert the resulting grayscale image.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.ToRGB
(always_apply=True, p=1.0)
[view source on GitHub]
¶
Convert the input grayscale image to RGB.
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 1. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.ToSepia
(always_apply=False, p=0.5)
[view source on GitHub]
¶
Applies sepia filter to the input RGB image
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
Image types: uint8, float32
class
albumentations.augmentations.transforms.UnsharpMask
(blur_limit=(3, 7), sigma_limit=0.0, alpha=(0.2, 0.5), threshold=10, always_apply=False, p=0.5)
[view source on GitHub]
¶
Sharpen the input image using unsharp masking and overlay the result with the original image.
Parameters:
Name | Type | Description |
---|---|---|
blur_limit |
int, [int, int] |
maximum Gaussian kernel size for blurring the input image.
Must be zero or odd and in range [0, inf). If set to 0 it will be computed from sigma
as |
sigma_limit |
float, [float, float] |
Gaussian kernel standard deviation. Must be in range [0, inf).
If set single value |
alpha |
float, [float, float] |
range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5). |
threshold |
int |
Value to limit sharpening only for areas with high pixel difference between original image and its smoothed version. Higher threshold means less sharpening on flat areas. Must be in range [0, 255]. Default: 10. |
p |
float |
probability of applying the transform. Default: 0.5. |
Reference: arxiv.org/pdf/2107.10833.pdf
Targets: image
albumentations.augmentations.utils
¶
def
albumentations.augmentations.utils.ensure_contiguous (func)
[view source on GitHub]¶
Ensure that input img is contiguous.
def
albumentations.augmentations.utils.get_opencv_dtype_from_numpy (value)
[view source on GitHub]¶
Return the OpenCV dtype that corresponds to a numpy dtype.
Parameters: value - input dtype of the numpy array.
Returns: the corresponding dtype for OpenCV.
def
albumentations.augmentations.utils.preserve_channel_dim (func)
[view source on GitHub]¶
Preserve dummy channel dim.
def
albumentations.augmentations.utils.preserve_shape (func)
[view source on GitHub]¶
Preserve shape of the image
albumentations.core
special
¶
albumentations.core.bbox_utils
¶
class
albumentations.core.bbox_utils.BboxParams
(format, label_fields=None, min_area=0.0, min_visibility=0.0, min_width=0.0, min_height=0.0, check_each_transform=True)
[view source on GitHub]
¶
Parameters of bounding boxes
Parameters:
Name | Type | Description |
---|---|---|
format |
str |
format of bounding boxes. Should be 'coco', 'pascal_voc', 'albumentations' or 'yolo'. The |
label_fields |
list |
list of fields that are joined with boxes, e.g. labels. Should be same type as boxes. |
min_area |
float |
minimum area of a bounding box. All bounding boxes whose visible area in pixels is less than this value will be removed. Default: 0.0. |
min_visibility |
float |
minimum fraction of a bounding box's area that must remain visible for the box to be kept in the list. Default: 0.0. |
min_width |
float |
Minimum width of a bounding box. All bounding boxes whose width is less than this value will be removed. Default: 0.0. |
min_height |
float |
Minimum height of a bounding box. All bounding boxes whose height is less than this value will be removed. Default: 0.0. |
check_each_transform |
bool |
if |
def
albumentations.core.bbox_utils.calculate_bbox_area (bbox, rows, cols)
[view source on GitHub]¶
Calculate the area of a bounding box in (fractional) pixels.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]] |
A bounding box |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
float |
Area in (fractional) pixels of the (denormalized) bounding box. |
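The area computation can be sketched as follows, assuming the box is given in normalized (x_min, y_min, x_max, y_max) form and is denormalized before multiplying. `bbox_area_pixels` is a hypothetical name for this illustration.

```python
def bbox_area_pixels(bbox, rows, cols):
    """Area of a normalized (x_min, y_min, x_max, y_max, ...) box in pixels."""
    x_min, y_min, x_max, y_max = bbox[:4]  # extra fields such as labels are ignored
    return (x_max - x_min) * cols * (y_max - y_min) * rows


# Half the width of a 200 px wide image times half the height of a 100 px tall image.
area = bbox_area_pixels((0.25, 0.5, 0.75, 1.0), rows=100, cols=200)
```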
def
albumentations.core.bbox_utils.check_bbox (bbox)
[view source on GitHub]¶
Check if bbox boundaries are in the range [0, 1] and minimums are less than maximums
def
albumentations.core.bbox_utils.check_bboxes (bboxes)
[view source on GitHub]¶
Check if bboxes boundaries are in the range [0, 1] and minimums are less than maximums
def
albumentations.core.bbox_utils.convert_bbox_from_albumentations (bbox, target_format, rows, cols, check_validity=False)
[view source on GitHub]¶
Convert a bounding box from the format used by albumentations to a format, specified in target_format
.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]] |
An albumentations bounding box |
target_format |
str |
required format of the output bounding box. Should be 'coco', 'pascal_voc' or 'yolo'. |
rows |
int |
Image height. |
cols |
int |
Image width. |
check_validity |
bool |
Check if all boxes are valid boxes. |
Returns:
Type | Description |
---|---|
Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]] |
tuple: A bounding box. |
Note:
The coco
format of a bounding box looks like [x_min, y_min, width, height]
, e.g. [97, 12, 150, 200].
The pascal_voc
format of a bounding box looks like [x_min, y_min, x_max, y_max]
, e.g. [97, 12, 247, 212].
The yolo
format of a bounding box looks like [x, y, width, height]
, e.g. [0.3, 0.1, 0.05, 0.07].
Exceptions:
Type | Description |
---|---|
ValueError |
if |
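The three target formats listed in the note can be illustrated with a small converter. This is a sketch under the assumption that the input is a normalized (x_min, y_min, x_max, y_max) box with optional trailing fields. `from_albumentations_sketch` is a hypothetical helper, not the library function.

```python
def from_albumentations_sketch(bbox, target_format, rows, cols):
    """Convert a normalized corner box to 'coco', 'pascal_voc' or 'yolo'."""
    x_min, y_min, x_max, y_max = bbox[:4]
    extra = tuple(bbox[4:])  # e.g. a class label carried along unchanged

    if target_format == "yolo":
        # YOLO stays normalized: box center followed by width and height.
        width, height = x_max - x_min, y_max - y_min
        return ((x_min + x_max) / 2, (y_min + y_max) / 2, width, height) + extra

    # Pixel formats: scale x by the image width and y by the image height.
    px = (x_min * cols, y_min * rows, x_max * cols, y_max * rows)
    if target_format == "pascal_voc":
        return px + extra
    if target_format == "coco":
        return (px[0], px[1], px[2] - px[0], px[3] - px[1]) + extra
    raise ValueError(f"Unknown target_format {target_format!r}")


box = (0.25, 0.125, 0.75, 0.625, "cat")
as_voc = from_albumentations_sketch(box, "pascal_voc", rows=400, cols=200)
as_coco = from_albumentations_sketch(box, "coco", rows=400, cols=200)
as_yolo = from_albumentations_sketch(box, "yolo", rows=400, cols=200)
```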
def
albumentations.core.bbox_utils.convert_bbox_to_albumentations (bbox, source_format, rows, cols, check_validity=False)
[view source on GitHub]¶
Convert a bounding box from a format specified in source_format
to the format used by albumentations:
normalized coordinates of the top-left and bottom-right corners of the bounding box in the form of
(x_min, y_min, x_max, y_max)
e.g. (0.15, 0.27, 0.67, 0.5)
.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]] |
A bounding box tuple. |
source_format |
str |
format of the bounding box. Should be 'coco', 'pascal_voc', or 'yolo'. |
check_validity |
bool |
Check if all boxes are valid boxes. |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]] |
tuple: A bounding box |
Note:
The coco
format of a bounding box looks like (x_min, y_min, width, height)
, e.g. (97, 12, 150, 200).
The pascal_voc
format of a bounding box looks like (x_min, y_min, x_max, y_max)
, e.g. (97, 12, 247, 212).
The yolo
format of a bounding box looks like (x, y, width, height)
, e.g. (0.3, 0.1, 0.05, 0.07);
where x
, y
are the coordinates of the center of the box, with all values normalized to 1 by image height and width.
Exceptions:
Type | Description |
---|---|
ValueError |
if |
ValueError |
If the format is 'yolo' and not all bounding box values are in the range (0, 1). |
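The conversion in the opposite direction can be sketched the same way, reusing the example boxes from the note. `to_albumentations_sketch` is a hypothetical helper. The YOLO range check with an inclusive upper bound is an assumption.

```python
def to_albumentations_sketch(bbox, source_format, rows, cols):
    """Convert a 'coco', 'pascal_voc' or 'yolo' box to normalized corners."""
    a, b, c, d = bbox[:4]
    extra = tuple(bbox[4:])

    if source_format == "yolo":
        # Assumption: every YOLO value must lie in (0, 1].
        if not all(0 < v <= 1 for v in (a, b, c, d)):
            raise ValueError("YOLO values must be in the range (0, 1]")
        # (a, b) is the box center, (c, d) is its width and height.
        return (a - c / 2, b - d / 2, a + c / 2, b + d / 2) + extra

    if source_format == "coco":
        x_min, y_min, x_max, y_max = a, b, a + c, b + d
    elif source_format == "pascal_voc":
        x_min, y_min, x_max, y_max = a, b, c, d
    else:
        raise ValueError(f"Unknown source_format {source_format!r}")

    # Normalize pixel coordinates by image width (x) and height (y).
    return (x_min / cols, y_min / rows, x_max / cols, y_max / rows) + extra


# The coco and pascal_voc examples from the note describe the same box.
from_coco = to_albumentations_sketch((97, 12, 150, 200), "coco", rows=400, cols=500)
from_voc = to_albumentations_sketch((97, 12, 247, 212), "pascal_voc", rows=400, cols=500)
from_yolo = to_albumentations_sketch((0.5, 0.5, 0.5, 0.25), "yolo", rows=400, cols=500)
```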
def
albumentations.core.bbox_utils.convert_bboxes_from_albumentations (bboxes, target_format, rows, cols, check_validity=False)
[view source on GitHub]¶
Convert a list of bounding boxes from the format used by albumentations to a format, specified
in target_format
.
Parameters:
Name | Type | Description |
---|---|---|
bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
List of albumentations bounding boxes |
target_format |
str |
required format of the output bounding box. Should be 'coco', 'pascal_voc' or 'yolo'. |
rows |
int |
Image height. |
cols |
int |
Image width. |
check_validity |
bool |
Check if all boxes are valid boxes. |
Returns:
Type | Description |
---|---|
List[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
List of bounding boxes. |
def
albumentations.core.bbox_utils.convert_bboxes_to_albumentations (bboxes, source_format, rows, cols, check_validity=False)
[view source on GitHub]¶
Convert a list of bounding boxes from a format specified in source_format
to the format used by albumentations
def
albumentations.core.bbox_utils.denormalize_bbox (bbox, rows, cols)
[view source on GitHub]¶
Denormalize coordinates of a bounding box. Multiply x-coordinates by image width and y-coordinates
by image height. This is the inverse operation of albumentations.core.bbox_utils.normalize_bbox
.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
~TBox |
Normalized bounding box |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
~TBox |
Denormalized bounding box |
Exceptions:
Type | Description |
---|---|
ValueError |
If rows or cols is less than or equal to zero |
def
albumentations.core.bbox_utils.denormalize_bboxes (bboxes, rows, cols)
[view source on GitHub]¶
Denormalize a list of bounding boxes.
Parameters:
Name | Type | Description |
---|---|---|
bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
Normalized bounding boxes |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
List[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
List: Denormalized bounding boxes |
def
albumentations.core.bbox_utils.filter_bboxes (bboxes, rows, cols, min_area=0.0, min_visibility=0.0, min_width=0.0, min_height=0.0)
[view source on GitHub]¶
Remove bounding boxes that either lie outside of the visible area by more than min_visibility
or whose area in pixels is under the threshold set by min_area
. It also crops boxes to the final image size.
Parameters:
Name | Type | Description |
---|---|---|
bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
List of albumentations bounding boxes |
rows |
int |
Image height. |
cols |
int |
Image width. |
min_area |
float |
Minimum area of a bounding box. All bounding boxes whose visible area in pixels is less than this value will be removed. Default: 0.0. |
min_visibility |
float |
Minimum fraction of a bounding box's area that must remain visible for the box to be kept in the list. Default: 0.0. |
min_width |
float |
Minimum width of a bounding box. All bounding boxes whose width is less than this value will be removed. Default: 0.0. |
min_height |
float |
Minimum height of a bounding box. All bounding boxes whose height is less than this value will be removed. Default: 0.0. |
Returns:
Type | Description |
---|---|
List[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
List of bounding boxes. |
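The filtering rules can be sketched in plain Python. This follows the parameter descriptions above (clip to the image, then compare visibility, area, width and height against the thresholds). The exact boundary handling in the library may differ. `filter_bboxes_sketch` is a hypothetical name.

```python
def _clip01(value):
    """Clamp a normalized coordinate to the image, i.e. to [0, 1]."""
    return min(max(value, 0.0), 1.0)


def filter_bboxes_sketch(bboxes, rows, cols, min_area=0.0, min_visibility=0.0,
                         min_width=0.0, min_height=0.0):
    """Clip normalized boxes to the image and drop those under the thresholds."""
    kept = []
    for bbox in bboxes:
        x_min, y_min, x_max, y_max = bbox[:4]
        full_area = (x_max - x_min) * cols * (y_max - y_min) * rows

        cx_min, cy_min, cx_max, cy_max = (_clip01(v) for v in bbox[:4])
        width = (cx_max - cx_min) * cols
        height = (cy_max - cy_min) * rows
        visible_area = width * height

        # Drop degenerate boxes and boxes that are mostly outside the image.
        if full_area <= 0 or visible_area / full_area < min_visibility:
            continue
        # Drop boxes whose visible part is too small.
        if visible_area < min_area or width < min_width or height < min_height:
            continue
        kept.append((cx_min, cy_min, cx_max, cy_max) + tuple(bbox[4:]))
    return kept


boxes = [
    (0.25, 0.25, 0.75, 0.75, "inside"),
    (0.75, 0.0, 1.25, 0.5, "half_out"),   # only 50% of this box is visible
    (0.0, 0.0, 0.0625, 0.0625, "tiny"),   # about 39 square pixels
]
kept = filter_bboxes_sketch(boxes, rows=100, cols=100, min_area=100, min_visibility=0.6)
```

With these thresholds only the first box survives: the second fails the visibility check and the third fails the area check.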
def
albumentations.core.bbox_utils.filter_bboxes_by_visibility (original_shape, bboxes, transformed_shape, transformed_bboxes, threshold=0.0, min_area=0.0)
[view source on GitHub]¶
Filter bounding boxes and return only those boxes whose visibility after transformation is above the threshold and whose area in pixels is more than min_area.
Parameters:
Name | Type | Description |
---|---|---|
original_shape |
Sequence[int] |
Original image shape |
bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
Original bounding boxes |
transformed_shape |
Sequence[int] |
Transformed image shape |
transformed_bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
Transformed bounding boxes |
threshold |
float |
visibility threshold. Should be a value in the range [0.0, 1.0]. |
min_area |
float |
Minimal area threshold. |
Returns:
Type | Description |
---|---|
List[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
Filtered bounding boxes |
def
albumentations.core.bbox_utils.normalize_bbox (bbox, rows, cols)
[view source on GitHub]¶
Normalize coordinates of a bounding box. Divide x-coordinates by image width and y-coordinates by image height.
Parameters:
Name | Type | Description |
---|---|---|
bbox |
~TBox |
Denormalized bounding box |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
~TBox |
Normalized bounding box |
Exceptions:
Type | Description |
---|---|
ValueError |
If rows or cols is less than or equal to zero |
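Normalization and its inverse can be checked with a round trip. The sketch below uses hypothetical helper names and follows the description above: x-coordinates are divided or multiplied by the image width, y-coordinates by the image height, and non-positive dimensions are rejected.

```python
def normalize_bbox_sketch(bbox, rows, cols):
    """Divide x by image width and y by image height."""
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    x_min, y_min, x_max, y_max = bbox[:4]
    return (x_min / cols, y_min / rows, x_max / cols, y_max / rows) + tuple(bbox[4:])


def denormalize_bbox_sketch(bbox, rows, cols):
    """Multiply x by image width and y by image height."""
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    x_min, y_min, x_max, y_max = bbox[:4]
    return (x_min * cols, y_min * rows, x_max * cols, y_max * rows) + tuple(bbox[4:])


pixel_box = (50, 100, 150, 300, "dog")
normalized = normalize_bbox_sketch(pixel_box, rows=400, cols=200)
restored = denormalize_bbox_sketch(normalized, rows=400, cols=200)
```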
def
albumentations.core.bbox_utils.normalize_bboxes (bboxes, rows, cols)
[view source on GitHub]¶
Normalize a list of bounding boxes.
Parameters:
Name | Type | Description |
---|---|---|
bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
Denormalized bounding boxes |
rows |
int |
Image height. |
cols |
int |
Image width. |
Returns:
Type | Description |
---|---|
List[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
Normalized bounding boxes |
def
albumentations.core.bbox_utils.union_of_bboxes (height, width, bboxes, erosion_rate=0.0)
[view source on GitHub]¶
Calculate union of bounding boxes.
Parameters:
Name | Type | Description |
---|---|---|
height |
int |
Height of image or space. |
width |
int |
Width of image or space. |
bboxes |
Sequence[Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]]] |
List like bounding boxes. Format is |
erosion_rate |
float |
How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume. |
Returns:
Type | Description |
---|---|
Union[Tuple[float, float, float, float], Tuple[float, float, float, float, Any]] |
tuple: A bounding box |
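One way to picture the effect of erosion_rate is the sketch below. It shrinks each box towards its center by that fraction of its width and height on every side, then takes the smallest box that encloses the shrunken boxes. Both the box format (x_min, y_min, x_max, y_max) and the way erosion is applied are assumptions. The sketch also omits the height and width arguments.

```python
def union_of_bboxes_sketch(bboxes, erosion_rate=0.0):
    """Enclosing box of all boxes after shrinking each one by erosion_rate."""
    x1 = y1 = float("inf")
    x2 = y2 = float("-inf")
    for bbox in bboxes:
        x_min, y_min, x_max, y_max = bbox[:4]
        w, h = x_max - x_min, y_max - y_min
        # Move every side inwards by erosion_rate times the box size.
        x1 = min(x1, x_min + erosion_rate * w)
        y1 = min(y1, y_min + erosion_rate * h)
        x2 = max(x2, x_max - erosion_rate * w)
        y2 = max(y2, y_max - erosion_rate * h)
    return x1, y1, x2, y2


boxes = [(10, 20, 50, 60), (30, 10, 90, 40)]
plain_union = union_of_bboxes_sketch(boxes)
eroded_union = union_of_bboxes_sketch(boxes, erosion_rate=0.5)
```

With erosion_rate=0.5 each box collapses to its center point, so the result spans only the two centers.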
albumentations.core.composition
¶
class
albumentations.core.composition.Compose
(transforms, bbox_params=None, keypoint_params=None, additional_targets=None, p=1.0, is_check_shapes=True)
[view source on GitHub]
¶
Compose transforms and handle all transformations regarding bounding boxes
Parameters:
Name | Type | Description |
---|---|---|
transforms |
list |
list of transformations to compose. |
bbox_params |
BboxParams |
Parameters for bounding boxes transforms |
keypoint_params |
KeypointParams |
Parameters for keypoints transforms |
additional_targets |
dict |
Dict with keys - new target name, values - old target name. ex: {'image2': 'image'} |
p |
float |
probability of applying the whole list of transforms. Default: 1.0. |
is_check_shapes |
bool |
If True, the shape consistency of image/mask/masks is checked on each call. To disable this check, pass False (do so only if you are sure your data is consistent). |
class
albumentations.core.composition.OneOf
(transforms, p=0.5)
[view source on GitHub]
¶
Select one of transforms to apply. Selected transform will be called with force_apply=True
.
Transform probabilities will be normalized to sum to 1, so in this case they work as weights.
Parameters:
Name | Type | Description |
---|---|---|
transforms |
list |
list of transformations to compose. |
p |
float |
probability of applying selected transform. Default: 0.5. |
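The normalization of transform probabilities into selection weights can be sketched with the standard library. Transform names are plain strings here, and `normalize_weights` and `pick_one` are hypothetical helpers rather than library functions.

```python
import random


def normalize_weights(probabilities):
    """Rescale the per-transform probabilities so that they sum to 1."""
    total = sum(probabilities)
    return [p / total for p in probabilities]


def pick_one(names, probabilities, rng):
    """Choose exactly one name, using the normalized probabilities as weights."""
    weights = normalize_weights(probabilities)
    return rng.choices(names, weights=weights, k=1)[0]


names = ["HorizontalFlip", "VerticalFlip", "Blur"]
weights = normalize_weights([1.0, 0.5, 0.5])
choice = pick_one(names, [1.0, 0.5, 0.5], random.Random(0))
```

In this sketch a transform with p=1.0 is twice as likely to be selected as one with p=0.5. The individual p values no longer decide whether the selected transform runs.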
class
albumentations.core.composition.OneOrOther
(first=None, second=None, transforms=None, p=0.5)
[view source on GitHub]
¶
Select one or another transform to apply. Selected transform will be called with force_apply=True
.
class
albumentations.core.composition.PerChannel
(transforms, channels=None, p=0.5)
[view source on GitHub]
¶
Apply transformations per-channel
Parameters:
Name | Type | Description |
---|---|---|
transforms |
list |
list of transformations to compose. |
channels |
sequence |
channels to apply the transform to. Pass None to apply to all. Default: None (apply to all) |
p |
float |
probability of applying the transform. Default: 0.5. |
class
albumentations.core.composition.Sequential
(transforms, p=0.5)
[view source on GitHub]
¶
Sequentially applies all transforms to targets.
Note:
This transform is not intended to be a replacement for Compose
. Instead, it should be used inside Compose
the same way OneOf
or OneOrOther
are used. For instance, you can combine OneOf
with Sequential
to
create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly
chosen sequence to the input data (see the Example
section for an example definition of such pipeline).
Examples:
>>> import albumentations as A
>>> transform = A.Compose([
...     A.OneOf([
...         A.Sequential([
...             A.HorizontalFlip(p=0.5),
...             A.ShiftScaleRotate(p=0.5),
...         ]),
...         A.Sequential([
...             A.VerticalFlip(p=0.5),
...             A.RandomBrightnessContrast(p=0.5),
...         ]),
...     ], p=1)
... ])
class
albumentations.core.composition.SomeOf
(transforms, n, replace=True, p=1)
[view source on GitHub]
¶
Select N transforms to apply. Selected transforms will be called with force_apply=True
.
Transform probabilities will be normalized to sum to 1, so in this case they work as weights.
Parameters:
Name | Type | Description |
---|---|---|
transforms |
list |
list of transformations to compose. |
n |
int |
number of transforms to apply. |
replace |
bool |
Whether the transforms are sampled with replacement. Default: True. |
p |
float |
probability of applying selected transform. Default: 1. |
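The difference between sampling with and without replacement can be illustrated on transform indices. This is a standard-library sketch with a hypothetical `sample_indices` helper. It is not how the library draws its sample.

```python
import random


def sample_indices(weights, n, replace, rng):
    """Draw n transform indices using the given weights."""
    if replace:
        # The same index may be drawn more than once.
        return rng.choices(range(len(weights)), weights=weights, k=n)
    if n > len(weights):
        raise ValueError("Cannot draw more items than available without replacement")
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(n):
        pick = rng.choices(remaining, weights=[weights[i] for i in remaining], k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)  # each index can be chosen only once
    return chosen


rng = random.Random(0)
weights = [0.5, 0.25, 0.25]
with_replacement = sample_indices(weights, n=5, replace=True, rng=rng)
without_replacement = sample_indices(weights, n=3, replace=False, rng=rng)
```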
albumentations.core.keypoints_utils
¶
class
albumentations.core.keypoints_utils.KeypointParams
(format, label_fields=None, remove_invisible=True, angle_in_degrees=True, check_each_transform=True)
[view source on GitHub]
¶
Parameters of keypoints
Parameters:
Name | Type | Description |
---|---|---|
format |
str |
format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas' or 'xysa', where x is the X coordinate, y is the Y coordinate, s is the keypoint scale, and a is the keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees). |
label_fields |
list |
list of fields that are joined with keypoints, e.g. labels. Should be same type as keypoints. |
remove_invisible |
bool |
whether to remove invisible points after the transform |
angle_in_degrees |
bool |
whether the angle in 'xya', 'xyas', 'xysa' keypoints is given in degrees (True) or radians (False) |
check_each_transform |
bool |
if |
def
albumentations.core.keypoints_utils.check_keypoint (kp, rows, cols)
[view source on GitHub]¶
Check if keypoint coordinates are less than image shapes
def
albumentations.core.keypoints_utils.check_keypoints (keypoints, rows, cols)
[view source on GitHub]¶
Check if keypoints boundaries are less than image shapes
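A bounds check of this kind might look like the following sketch. It assumes keypoints in 'xy' order and also rejects negative coordinates, which is an assumption beyond the description above. The helper names are hypothetical.

```python
def check_keypoint_sketch(kp, rows, cols):
    """Raise ValueError if x is outside [0, cols) or y is outside [0, rows)."""
    for name, value, size in (("x", kp[0], cols), ("y", kp[1], rows)):
        if not 0 <= value < size:
            raise ValueError(
                f"Expected {name} of keypoint {kp} to be in [0, {size}), got {value}"
            )


def check_keypoints_sketch(keypoints, rows, cols):
    """Apply the single-keypoint check to every keypoint in the list."""
    for kp in keypoints:
        check_keypoint_sketch(kp, rows, cols)


check_keypoints_sketch([(10, 20), (49, 99)], rows=100, cols=50)  # passes silently
```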
albumentations.core.serialization
¶
class
albumentations.core.serialization.Serializable
[view source on GitHub]
¶
albumentations.core.serialization.Serializable.to_dict (self, on_not_implemented_error='raise')
¶
Take a transform pipeline and convert it to a serializable representation that uses only standard python data types: dictionaries, lists, strings, integers, and floats.
Parameters:
Name | Type | Description |
---|---|---|
self |
A transform that should be serialized. If the transform doesn't implement the |
|
on_not_implemented_error |
str |
|
class
albumentations.core.serialization.SerializableMeta
[view source on GitHub]
¶
A metaclass that is used to register classes in SERIALIZABLE_REGISTRY
or NON_SERIALIZABLE_REGISTRY
so they can be found later, when a transformation pipeline is deserialized, using the classes' full names.
albumentations.core.serialization.SerializableMeta.__new__ (mcs, name, bases, *args, **kwargs)
special
staticmethod
¶
Create and return a new object. See help(type) for accurate signature.
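The registration idea can be sketched with a small metaclass that records every subclass under its full name, so a serialized dictionary can be mapped back to a class. All names below (`REGISTRY`, `RegisteringMeta`, `ToyTransform`, `ToyFlip`, the `class_fullname` key) are hypothetical. The sketch does not reproduce the library's registry or dictionary layout.

```python
import json

REGISTRY = {}  # hypothetical stand-in for a registry of serializable classes


class RegisteringMeta(type):
    """Metaclass that records each subclass under its full dotted name."""

    def __new__(mcs, name, bases, namespace, **kwargs):
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)
        if bases:  # skip the root base class itself
            REGISTRY[f"{cls.__module__}.{cls.__qualname__}"] = cls
        return cls


class ToyTransform(metaclass=RegisteringMeta):
    def __init__(self, p=0.5):
        self.p = p

    def to_dict(self):
        """Use only standard Python types so the result can be dumped as JSON."""
        cls = type(self)
        return {"class_fullname": f"{cls.__module__}.{cls.__qualname__}", "p": self.p}


class ToyFlip(ToyTransform):
    pass


def from_dict_sketch(data):
    """Look the class up by its full name and rebuild the object."""
    params = dict(data)
    cls = REGISTRY[params.pop("class_fullname")]
    return cls(**params)


payload = json.dumps(ToyFlip(p=0.25).to_dict())
restored = from_dict_sketch(json.loads(payload))
```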
def
albumentations.core.serialization.from_dict (transform_dict, nonserializable=None, lambda_transforms='deprecated')
[view source on GitHub]¶
Parameters:
Name | Type | Description |
---|---|---|
transform_dict |
Dict[str, Any] |
A dictionary with serialized transform pipeline. |
nonserializable |
Optional[Dict[str, Any]] |
A dictionary that contains non-serializable transforms.
This dictionary is required when you are restoring a pipeline that contains non-serializable transforms.
Keys in that dictionary should be named same as |
lambda_transforms |
Union[Dict[str, Any], NoneType, str] |
Deprecated. Use 'nonserializable' instead. |
def
albumentations.core.serialization.load (filepath, data_format='json', nonserializable=None, lambda_transforms='deprecated')
[view source on GitHub]¶
Load a serialized pipeline from a json or yaml file and construct a transform pipeline.
Parameters:
Name | Type | Description |
---|---|---|
filepath |
str |
Filepath to read from. |
data_format |
str |
Serialization format. Should be either |
nonserializable |
Optional[Dict[str, Any]] |
A dictionary that contains non-serializable transforms.
This dictionary is required when you are restoring a pipeline that contains non-serializable transforms.
Keys in that dictionary should be named same as |
lambda_transforms |
Union[Dict[str, Any], NoneType, str] |
Deprecated. Use 'nonserializable' instead. |
def
albumentations.core.serialization.register_additional_transforms ()
[view source on GitHub]¶
Register transforms that are not imported directly into the albumentations
module.
def
albumentations.core.serialization.save (transform, filepath, data_format='json', on_not_implemented_error='raise')
[view source on GitHub]¶
Take a transform pipeline, serialize it and save a serialized version to a file using either json or yaml format.
Parameters:
Name | Type | Description |
---|---|---|
transform |
Serializable |
Transform to serialize. |
filepath |
str |
Filepath to write to. |
data_format |
str |
Serialization format. Should be either |
on_not_implemented_error |
str |
Parameter that describes what to do if a transform doesn't implement
the |
def
albumentations.core.serialization.to_dict (transform, on_not_implemented_error='raise')
[view source on GitHub]¶
Take a transform pipeline and convert it to a serializable representation that uses only standard python data types: dictionaries, lists, strings, integers, and floats.
Parameters:
Name | Type | Description |
---|---|---|
transform |
Serializable |
A transform that should be serialized. If the transform doesn't implement the |
on_not_implemented_error |
str |
|
albumentations.core.transforms_interface
¶
class
albumentations.core.transforms_interface.BasicTransform
(always_apply=False, p=0.5)
[view source on GitHub]
¶
albumentations.core.transforms_interface.BasicTransform.add_targets (self, additional_targets)
¶
Add targets that are transformed the same way as one of the existing targets, e.g. {'target_image': 'image'} or {'obj1_mask': 'mask', 'obj2_mask': 'mask'}. Note that you must have at least one object with the key 'image'.
Parameters:
Name | Type | Description |
---|---|---|
additional_targets |
Dict[str, str] |
keys - new target name, values - old target name. ex: {'image2': 'image'} |
class
albumentations.core.transforms_interface.DualTransform
[view source on GitHub]
¶
Transform for segmentation task.
class
albumentations.core.transforms_interface.ImageOnlyTransform
[view source on GitHub]
¶
Transform applied to image only.
def
albumentations.core.transforms_interface.to_tuple (param, low=None, bias=None)
[view source on GitHub]¶
Convert input argument to min-max tuple
Parameters:
Name | Type | Description |
---|---|---|
param |
scalar, tuple or list of 2+ elements |
Input value. If value is scalar, return value would be (offset - value, offset + value). If value is tuple, return value would be value + offset (broadcasted). |
low |
Second element of tuple can be passed as optional argument |
|
bias |
An offset factor added to each element |
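The conversion rules above can be summarized in a short sketch. `to_tuple_sketch` is a hypothetical, simplified re-implementation written from the parameter descriptions. Treating low and bias as mutually exclusive is an assumption.

```python
def to_tuple_sketch(param, low=None, bias=None):
    """Turn a scalar or a pair into a (min, max) tuple, optionally shifted by bias."""
    if low is not None and bias is not None:
        raise ValueError("low and bias are treated as mutually exclusive in this sketch")
    if param is None:
        return None
    if isinstance(param, (int, float)):
        if low is None:
            result = (-param, param)  # symmetric range around zero
        else:
            # low supplies the other element of the tuple; keep the pair ordered.
            result = (low, param) if low < param else (param, low)
    else:
        result = tuple(param)
    if bias is not None:
        result = tuple(bias + x for x in result)  # shift every element by the offset
    return result


symmetric = to_tuple_sketch(20)
with_low = to_tuple_sketch(0.5, low=0)
shifted_scalar = to_tuple_sketch(10, bias=100)
shifted_pair = to_tuple_sketch((1, 2), bias=1)
```

The first call shows how a single int limit such as r_shift_limit=20 expands to the range (-20, 20).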
albumentations.imgaug
special
¶
albumentations.imgaug.transforms
¶
class
albumentations.imgaug.transforms.IAAAdditiveGaussianNoise
(loc=0, scale=(2.5500000000000003, 12.75), per_channel=False, always_apply=False, p=0.5)
[view source on GitHub]
¶
Add gaussian noise to the input image.
This augmentation is deprecated. Please use GaussNoise instead.
Parameters:
Name | Type | Description |
---|---|---|
loc |
int |
mean of the normal distribution that generates the noise. Default: 0. |
scale |
[float, float] |
standard deviation of the normal distribution that generates the noise. Default: (0.01 * 255, 0.05 * 255). |
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image
class
albumentations.imgaug.transforms.IAAAffine
(scale=1.0, translate_percent=None, translate_px=None, rotate=0.0, shear=0.0, order=1, cval=0, mode='reflect', always_apply=False, p=0.5)
[view source on GitHub]
¶
Place a regular grid of points on the input and randomly move the neighbourhood of these points around via affine transformations.
This augmentation is deprecated. Please use Affine instead.
Note: This class introduces interpolation artifacts into the mask if it has values other than {0;1}
Parameters:
Name | Type | Description |
---|---|---|
p |
float |
probability of applying the transform. Default: 0.5. |
Targets: image, mask
class
albumentations.imgaug.transforms.IAACropAndPad
(px=None, percent=None, pad_mode='constant', pad_cval=0, keep_size=True, always_apply=False, p=1)
[view source on GitHub]
¶
This augmentation is deprecated. Please use CropAndPad instead.
class
albumentations.imgaug.transforms.IAAEmboss
(alpha=(0.2, 0.5), strength=(0.2, 0.7), always_apply=False, p=0.5)
[view source on GitHub]
¶
Emboss the input image and overlay the result with the original image. This augmentation is deprecated. Please use Emboss instead.
Parameters:
Name | Type | Description |
---|---|---|
alpha | [float, float] | range to choose the visibility of the embossed image. At 0, only the original image is visible, at 1.0 only its embossed version is visible. Default: (0.2, 0.5). |
strength | [float, float] | strength range of the embossing. Default: (0.2, 0.7). |
p | float | probability of applying the transform. Default: 0.5. |
Targets: image
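The role of `alpha` as a visibility factor can be illustrated with a simple linear blend. This is a sketch under the assumption that the overlay is a per-pixel weighted sum; the function name `blend` is hypothetical and the filtered values are made up for the example. The same reading applies to the `alpha` parameter of IAASharpen.

```python
def blend(original, processed, alpha):
    """Hypothetical helper: mix original and filtered pixel values.

    alpha = 0 keeps only the original, alpha = 1.0 keeps only the filtered image.
    """
    return [(1.0 - alpha) * o + alpha * p for o, p in zip(original, processed)]


original = [10, 20]
embossed = [110, 220]  # made-up filter output, for illustration only
half = blend(original, embossed, 0.5)
```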
class albumentations.imgaug.transforms.IAAFliplr(always_apply=False, p=0.5)¶
This augmentation is deprecated. Please use HorizontalFlip instead.
class albumentations.imgaug.transforms.IAAFlipud(always_apply=False, p=0.5)¶
This augmentation is deprecated. Please use VerticalFlip instead.
class albumentations.imgaug.transforms.IAAPerspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5)¶
Perform a random four-point perspective transform of the input. This augmentation is deprecated. Please use Perspective instead.
Note: This class introduces interpolation artifacts into the mask if it has values other than {0;1}.
Parameters:
Name | Type | Description |
---|---|---|
scale | [float, float] | standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. Default: (0.05, 0.1). |
p | float | probability of applying the transform. Default: 0.5. |
Targets: image, mask
class albumentations.imgaug.transforms.IAAPiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, order=1, cval=0, mode='constant', always_apply=False, p=0.5)¶
Place a regular grid of points on the input and randomly move the neighbourhood of these points around via affine transformations.
This augmentation is deprecated. Please use PiecewiseAffine instead.
Note: This class introduces interpolation artifacts into the mask if it has values other than {0;1}.
Parameters:
Name | Type | Description |
---|---|---|
scale | [float, float] | factor range that determines how far each point is moved. Default: (0.03, 0.05). |
nb_rows | int | number of rows of points that the regular grid should have. Default: 4. |
nb_cols | int | number of columns of points that the regular grid should have. Default: 4. |
p | float | probability of applying the transform. Default: 0.5. |
Targets: image, mask
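The grid-and-jitter idea behind `nb_rows`, `nb_cols`, and `scale` can be sketched as follows. This only builds the control points, not the warp itself, and it rests on two assumptions: the factor drawn from `scale` acts as a standard deviation, and displacements are expressed relative to the image height and width. The function name `jittered_grid` is hypothetical.

```python
import random


def jittered_grid(height, width, nb_rows=4, nb_cols=4, scale=(0.03, 0.05), rng=None):
    """Hypothetical helper: return (source, destination) control point pairs.

    Requires nb_rows >= 2 and nb_cols >= 2.
    """
    if rng is None:
        rng = random.Random()
    # Assumption: one displacement factor is sampled from the range per call.
    sigma = rng.uniform(*scale)
    points = []
    for r in range(nb_rows):
        for c in range(nb_cols):
            # Regular grid spanning the whole image.
            y = r * (height - 1) / (nb_rows - 1)
            x = c * (width - 1) / (nb_cols - 1)
            # Random displacement, scaled by the image size.
            dy = rng.gauss(0.0, sigma) * height
            dx = rng.gauss(0.0, sigma) * width
            points.append(((y, x), (y + dy, x + dx)))
    return points
```

A piecewise affine warp would then map each source point to its displaced destination and interpolate between neighbouring points.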
class albumentations.imgaug.transforms.IAASharpen(alpha=(0.2, 0.5), lightness=(0.5, 1.0), always_apply=False, p=0.5)¶
Sharpen the input image and overlay the result on the original image. This augmentation is deprecated. Please use Sharpen instead.
Parameters:
Name | Type | Description |
---|---|---|
alpha | [float, float] | range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5). |
lightness | [float, float] | range to choose the lightness of the sharpened image. Default: (0.5, 1.0). |
p | float | probability of applying the transform. Default: 0.5. |
Targets: image
class albumentations.imgaug.transforms.IAASuperpixels(p_replace=0.1, n_segments=100, always_apply=False, p=0.5)¶
Completely or partially transform the input image to its superpixel representation. Uses skimage's version of the SLIC algorithm. May be slow.
This augmentation is deprecated. Please use Superpixels instead.
Parameters:
Name | Type | Description |
---|---|---|
p_replace | float | defines the probability of any superpixel area being replaced by the superpixel, i.e. by the average pixel color within its area. Default: 0.1. |
n_segments | int | target number of superpixels to generate. Default: 100. |
p | float | probability of applying the transform. Default: 0.5. |
Targets: image
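The deprecation notes in this module each name a replacement transform. Collected into a lookup table, they might be used like this when migrating old pipelines. The dictionary and the helper `suggest_replacement` are hypothetical conveniences and not part of the library.

```python
# Replacements named in the deprecation notes of albumentations.imgaug.transforms.
IAA_REPLACEMENTS = {
    "IAAAdditiveGaussianNoise": "GaussNoise",
    "IAAAffine": "Affine",
    "IAACropAndPad": "CropAndPad",
    "IAAEmboss": "Emboss",
    "IAAFliplr": "HorizontalFlip",
    "IAAFlipud": "VerticalFlip",
    "IAAPerspective": "Perspective",
    "IAAPiecewiseAffine": "PiecewiseAffine",
    "IAASharpen": "Sharpen",
    "IAASuperpixels": "Superpixels",
}


def suggest_replacement(name):
    """Hypothetical helper: return the replacement class name, or None if unknown."""
    return IAA_REPLACEMENTS.get(name)
```

The table covers class names only. Constructor arguments may differ between a deprecated class and its replacement, so each replacement's own entry should be checked before swapping.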
albumentations.pytorch¶
albumentations.pytorch.transforms¶
class albumentations.pytorch.transforms.ToTensor(num_classes=1, sigmoid=True, normalize=None)¶
Convert image and mask to torch.Tensor and divide by 255 if the image or mask is of uint8 type.
This transform has been removed from Albumentations. If you need it, downgrade the library to version 0.5.2.
Parameters:
Name | Type | Description |
---|---|---|
num_classes | int | only for segmentation |
sigmoid | bool | only for segmentation, transform mask to LongTensor or not. |
normalize | dict | dict with keys [mean, std] to pass it into torchvision.normalize |
class albumentations.pytorch.transforms.ToTensorV2(transpose_mask=False, always_apply=True, p=1.0)¶
Convert image and mask to torch.Tensor. A NumPy HWC image is converted to a PyTorch CHW tensor. If the image is in HW format (grayscale image), it will be converted to a PyTorch HW tensor.
This is a simplified and improved version of the old ToTensor transform. ToTensor was deprecated and is no longer present in Albumentations; use ToTensorV2 instead.
Parameters:
Name | Type | Description |
---|---|---|
transpose_mask | bool | If True and an input mask has three dimensions, this transform will transpose dimensions so the shape [height, width, num_channels] becomes [num_channels, height, width]. The latter format is a standard format for PyTorch Tensors. Default: False. |
always_apply | bool | Indicates whether this transformation should be always applied. Default: True. |
p | float | Probability of applying the transform. Default: 1.0. |
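The HWC to CHW layout change described above can be illustrated without any dependencies. The sketch below uses nested lists and a hypothetical function name `hwc_to_chw`; it does not call the library or produce a torch.Tensor. With a NumPy array, the same reordering is a transpose with axes (2, 0, 1).

```python
def hwc_to_chw(image):
    """Hypothetical helper: reorder a nested-list image from HWC to CHW."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    # Build one height x width plane per channel.
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A 1 x 2 image with 3 channels per pixel.
hwc = [[[1, 2, 3], [4, 5, 6]]]
chw = hwc_to_chw(hwc)
```

Here `chw` holds three planes of shape 1 x 2, one per channel.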