albumentations.augmentations.geometric.functional
Functional implementations of geometric image transformations. This module provides low-level functions for geometric operations such as rotation, resizing, flipping, perspective transforms, and affine transformations on images, bounding boxes and keypoints.
Members
- function adjust_padding_by_position
- function almost_equal_intervals
- function apply_affine_to_points
- function bbox_distort_image
- function bboxes_affine
- function bboxes_affine_ellipse
- function bboxes_affine_largest_box
- function bboxes_d4
- function bboxes_grid_shuffle
- function bboxes_hflip
- function bboxes_piecewise_affine
- function bboxes_rot90
- function bboxes_transpose
- function bboxes_vflip
- function calculate_affine_transform_padding
- function center
- function center_bbox
- function compute_affine_warp_output_shape
- function compute_pairwise_distances
- function compute_perspective_params
- function compute_tps_weights
- function compute_transformed_image_bounds
- function copy_make_border_with_value_extension
- function create_affine_transformation_matrix
- function create_piecewise_affine_maps
- function create_shape_groups
- function d4
- function distort_image
- function distort_image_keypoints
- function expand_transform
- function extend_value
- function find_keypoint
- function flip_bboxes
- function flip_keypoints
- function from_distance_maps
- function generate_control_points
- function generate_displacement_fields
- function generate_distorted_grid_polygons
- function generate_grid
- function generate_inverse_distortion_map
- function generate_perspective_points
- function generate_reflected_bboxes
- function generate_reflected_keypoints
- function generate_shuffled_splits
- function get_camera_matrix_distortion_maps
- function get_dimension_padding
- function get_fisheye_distortion_maps
- function get_pad_grid_dimensions
- function get_padding_params
- function is_identity_matrix
- function is_valid_component
- function keypoints_affine
- function keypoints_d4
- function keypoints_hflip
- function keypoints_rot90
- function keypoints_scale
- function keypoints_transpose
- function keypoints_vflip
- function normalize_grid_distortion_steps
- function order_points
- function pad
- function pad_bboxes
- function pad_keypoints
- function pad_with_params
- function perspective
- function perspective_bboxes
- function perspective_keypoints
- function remap
- function remap_bboxes
- function remap_keypoints
- function remap_keypoints_via_mask
- function resize
- function rot90
- function rotation2d_matrix_to_euler_angles
- function scale
- function shift_bboxes
- function shift_keypoints
- function shuffle_tiles_within_shape_groups
- function split_uniform_grid
- function swap_tiles_on_image
- function swap_tiles_on_keypoints
- function to_distance_maps
- function tps_transform
- function transpose
- function validate_bboxes
- function validate_if_not_found_coords
- function validate_keypoints
- function volume_hflip
- function volume_rot90
- function volume_vflip
- function volumes_hflip
- function volumes_rot90
- function volumes_vflip
- function warp_affine
- function warp_affine_with_value_extension
adjust_padding_by_position function
adjust_padding_by_position(
h_top: int,
h_bottom: int,
w_left: int,
w_right: int,
position: Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'],
py_random: np.random.RandomState
)
Adjust padding values based on desired position.
Parameters
Name | Type | Default | Description |
---|---|---|---|
h_top | int | - | - |
h_bottom | int | - | - |
w_left | int | - | - |
w_right | int | - | - |
position | Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'] | - | - |
py_random | np.random.RandomState | - | - |
almost_equal_intervals function
almost_equal_intervals(
n: int,
parts: int
)
Generates an array of nearly equal integer intervals that sum up to `n`. This function divides the number `n` into `parts` nearly equal parts. It ensures that the sum of all parts equals `n`, and the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.
Parameters
Name | Type | Default | Description |
---|---|---|---|
n | int | - | The total value to be split. |
parts | int | - | The number of parts to split into. |
Returns
- np.ndarray: An array of integers where each integer represents the size of a part.
Example
>>> almost_equal_intervals(20, 3)
array([7, 7, 6]) # Splits 20 into three parts: 7, 7, and 6
>>> almost_equal_intervals(16, 4)
array([4, 4, 4, 4]) # Splits 16 into four equal parts
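The splitting rule described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the name `split_into_intervals` is hypothetical, and it returns a list rather than an `np.ndarray`.

```python
def split_into_intervals(n: int, parts: int) -> list[int]:
    """Split n into `parts` integers that sum to n and differ by at most one."""
    base, remainder = divmod(n, parts)
    # The first `remainder` parts each absorb one extra unit.
    return [base + 1 if i < remainder else base for i in range(parts)]


print(split_into_intervals(20, 3))  # [7, 7, 6]
print(split_into_intervals(16, 4))  # [4, 4, 4, 4]
```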
apply_affine_to_points function
apply_affine_to_points(
points: np.ndarray,
matrix: np.ndarray
)
Apply affine transformation to a set of points. This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.
Parameters
Name | Type | Default | Description |
---|---|---|---|
points | np.ndarray | - | Array of points with shape (N, 2). |
matrix | np.ndarray | - | 3x3 affine transformation matrix. |
Returns
- np.ndarray: Transformed points with shape (N, 2).
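As a rough sketch of the idea, the points are lifted to homogeneous coordinates, multiplied by the matrix, and divided by the third coordinate. The helper name `affine_points` and the value of `EPS` are assumptions made for this illustration; the library's epsilon may differ.

```python
EPS = 1e-8  # hypothetical guard value, not taken from the library


def affine_points(points, matrix):
    """Apply a 3x3 matrix to (x, y) points via homogeneous coordinates."""
    out = []
    for x, y in points:
        # Lift (x, y) to (x, y, 1) and multiply by each matrix row.
        tx, ty, w = (row[0] * x + row[1] * y + row[2] for row in matrix)
        if w == 0:
            w = EPS  # replace a zero homogeneous coordinate before dividing
        out.append((tx / w, ty / w))
    return out


scale_and_shift = [[2, 0, 5], [0, 2, 5], [0, 0, 1]]
print(affine_points([(10, 10), (20, 20)], scale_and_shift))
# [(25.0, 25.0), (45.0, 45.0)]
```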
bbox_distort_image function
bbox_distort_image(
bboxes: np.ndarray,
generated_mesh: np.ndarray,
image_shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | - |
generated_mesh | np.ndarray | - | - |
image_shape | tuple[int, int] | - | - |
bboxes_affine function
bboxes_affine(
bboxes: np.ndarray,
matrix: np.ndarray,
rotate_method: Literal['largest_box', 'ellipse'],
image_shape: tuple[int, int],
border_mode: int,
output_shape: tuple[int, int]
)
Apply an affine transformation to bounding boxes. For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:
1. Calculates necessary padding to avoid information loss
2. Applies padding to the bounding boxes
3. Adjusts the transformation matrix to account for padding
4. Applies the affine transformation
5. Validates the transformed bounding boxes
For other border modes, it directly applies the affine transformation without padding.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Input bounding boxes |
matrix | np.ndarray | - | Affine transformation matrix |
rotate_method | Literal['largest_box', 'ellipse'] | - | Method for rotating bounding boxes ('largest_box' or 'ellipse') |
image_shape | tuple[int, int] | - | Shape of the input image |
border_mode | int | - | OpenCV border mode |
output_shape | tuple[int, int] | - | Shape of the output image |
Returns
- np.ndarray: Transformed and normalized bounding boxes
bboxes_affine_ellipse function
bboxes_affine_ellipse(
bboxes: np.ndarray,
matrix: np.ndarray
)
Apply an affine transformation to bounding boxes using an ellipse approximation method. This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels). |
matrix | np.ndarray | - | The 3x3 affine transformation matrix to apply. |
Returns
- np.ndarray: An array of transformed bounding boxes with the same shape as the input.
Notes
- This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
- The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
- 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
- Any additional attributes beyond the first 4 coordinates are preserved unchanged.
- This method may be more suitable for objects that are roughly elliptical in shape.
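The ellipse approach can be sketched for a single box as follows. This is an illustration under stated assumptions, not the library's code: `bbox_affine_ellipse` and `rotation_about` are hypothetical helpers, and sampling the ellipse at 360 evenly spaced angles is an assumption about how the points are chosen.

```python
import math


def bbox_affine_ellipse(bbox, matrix, num_points=360):
    """Transform the ellipse inscribed in bbox and return its enclosing box."""
    x_min, y_min, x_max, y_max = bbox
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    rx, ry = (x_max - x_min) / 2, (y_max - y_min) / 2
    xs, ys = [], []
    for i in range(num_points):
        angle = 2 * math.pi * i / num_points
        # Point on the ellipse inscribed in the original box.
        px, py = cx + rx * math.cos(angle), cy + ry * math.sin(angle)
        xs.append(matrix[0][0] * px + matrix[0][1] * py + matrix[0][2])
        ys.append(matrix[1][0] * px + matrix[1][1] * py + matrix[1][2])
    return min(xs), min(ys), max(xs), max(ys)


def rotation_about(cx, cy, degrees):
    """3x3 matrix rotating by `degrees` around the point (cx, cy)."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, cx - c * cx + s * cy], [s, c, cy - s * cx - c * cy], [0, 0, 1]]


box = (0.0, 0.0, 2.0, 2.0)
rotated = bbox_affine_ellipse(box, rotation_about(1.0, 1.0, 45))
print(rotated)  # approximately (0, 0, 2, 2): a rotated circle keeps its extent
```

For this square box the inscribed ellipse is a circle, so a 45-degree rotation about the center leaves the enclosing box essentially unchanged, whereas transforming the four corners would widen it.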
bboxes_affine_largest_box function
bboxes_affine_largest_box(
bboxes: np.ndarray,
matrix: np.ndarray
)
Apply an affine transformation to bounding boxes and return the largest enclosing boxes. This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels). |
matrix | np.ndarray | - | The 3x3 affine transformation matrix to apply. |
Returns
- np.ndarray: An array of transformed bounding boxes with the same shape as the input.
Example
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]]) # Two boxes with class labels
>>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]]) # Scale by 2 and translate by (5, 5)
>>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)
>>> print(transformed_bboxes)
[[ 25. 25. 45. 45. 1.]
[ 65. 65. 85. 85. 2.]]
Notes
- This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
- The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
- Any additional attributes beyond the first 4 coordinates are preserved unchanged.
- This method is called "largest box" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.
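A minimal single-box sketch of the corner-based method is shown here. The helper name `bbox_affine_largest_box` is hypothetical and works on plain lists; it is meant to illustrate the corner transform, not to reproduce the library's vectorized code.

```python
import math


def bbox_affine_largest_box(bbox, matrix):
    """Transform the four corners of a box and return the enclosing box."""
    x_min, y_min, x_max, y_max, *extra = bbox
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    xs = [matrix[0][0] * x + matrix[0][1] * y + matrix[0][2] for x, y in corners]
    ys = [matrix[1][0] * x + matrix[1][1] * y + matrix[1][2] for x, y in corners]
    # Extra attributes (for example a class label) pass through unchanged.
    return [min(xs), min(ys), max(xs), max(ys), *extra]


scale_and_shift = [[2, 0, 5], [0, 2, 5], [0, 0, 1]]
print(bbox_affine_largest_box([10, 10, 20, 20, 1], scale_and_shift))
# [25, 25, 45, 45, 1]

# Rotating a 2x2 square by 45 degrees about its center (1, 1) widens the box.
c = s = math.sqrt(0.5)
rot45 = [[c, -s, 1 - c + s], [s, c, 1 - s - c], [0, 0, 1]]
wide = bbox_affine_largest_box([0, 0, 2, 2], rot45)
print(wide[2] - wide[0])  # close to 2 * sqrt(2)
```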
bboxes_d4 function
bboxes_d4(
bboxes: np.ndarray,
group_member: Literal['e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't']
)
Applies a `D_4` symmetry group transformation to bounding boxes. The function transforms the boxes according to the specified member of the `D_4` group, which consists of rotations and reflections. A ValueError is raised if an invalid group member is specified.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). |
group_member | Literal['e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'] | - | A string identifier for the `D_4` group transformation to apply. |
Returns
- BoxInternalType: The transformed bounding box.
bboxes_grid_shuffle function
bboxes_grid_shuffle(
bboxes: np.ndarray,
tiles: np.ndarray,
mapping: list[int],
image_shape: tuple[int, int],
min_area: float,
min_visibility: float
)
Shuffle bounding boxes according to grid mapping.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (num_boxes, 4+) |
tiles | np.ndarray | - | Array of grid tiles |
mapping | list[int] | - | Mapping of tile indices |
image_shape | tuple[int, int] | - | Shape of the image (height, width) |
min_area | float | - | Minimum area of a bounding box to keep |
min_visibility | float | - | Minimum visibility ratio of a bounding box to keep |
Returns
- np.ndarray: Shuffled bounding boxes
bboxes_hflip function
bboxes_hflip(
bboxes: np.ndarray
)
Flip bounding boxes horizontally.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (num_boxes, 4+) |
Returns
- np.ndarray: Horizontally flipped bounding boxes
bboxes_piecewise_affine function
bboxes_piecewise_affine(
bboxes: np.ndarray,
map_x: np.ndarray,
map_y: np.ndarray,
border_mode: int,
image_shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | - |
map_x | np.ndarray | - | - |
map_y | np.ndarray | - | - |
border_mode | int | - | - |
image_shape | tuple[int, int] | - | - |
bboxes_rot90 function
bboxes_rot90(
bboxes: np.ndarray,
factor: int
)
Rotates bounding boxes counterclockwise in 90-degree steps (see np.rot90).
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (num_boxes, 4+) |
factor | int | - | Number of 90-degree rotations (1, 2, or 3) |
Returns
- np.ndarray: Rotated bounding boxes
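For intuition, a counterclockwise quarter turn maps a point (x, y) to (y, 1 - x) when coordinates are normalized to [0, 1]. The sketch below applies that rule to one box; the helper name `bbox_rot90` and the assumption of normalized coordinates are made for this illustration and are not confirmed by the text above.

```python
def bbox_rot90(bbox, factor):
    """Rotate one normalized box counterclockwise by factor * 90 degrees."""
    x_min, y_min, x_max, y_max, *extra = bbox
    for _ in range(factor % 4):
        # New x range is the old y range; new y range is the mirrored old x range.
        x_min, y_min, x_max, y_max = y_min, 1 - x_max, y_max, 1 - x_min
    return (x_min, y_min, x_max, y_max, *extra)


print(bbox_rot90((0.25, 0.125, 0.5, 0.375), 1))  # (0.125, 0.5, 0.375, 0.75)
```

Applying the quarter turn four times returns the original box, which is a quick sanity check for the rule.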
bboxes_transpose function
bboxes_transpose(
bboxes: np.ndarray
)
Transpose bounding boxes along the main diagonal.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (num_boxes, 4+) |
Returns
- np.ndarray: Transposed bounding boxes
bboxes_vflip function
bboxes_vflip(
bboxes: np.ndarray
)
Flip bounding boxes vertically.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (num_boxes, 4+) |
Returns
- np.ndarray: Vertically flipped bounding boxes
calculate_affine_transform_padding function
calculate_affine_transform_padding(
matrix: np.ndarray,
image_shape: tuple[int, int]
)
Calculate the necessary padding for an affine transformation to avoid empty spaces.
Parameters
Name | Type | Default | Description |
---|---|---|---|
matrix | np.ndarray | - | - |
image_shape | tuple[int, int] | - | - |
center function
center(
image_shape: tuple[int, int]
)
Calculate the center coordinates of the image. Used by images, masks and keypoints.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | The shape of the image. |
Returns
- tuple[float, float]: center_x, center_y
center_bbox function
center_bbox(
image_shape: tuple[int, int]
)
Calculate the center coordinates of the image for bounding boxes.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | The shape of the image. |
Returns
- tuple[float, float]: center_x, center_y
compute_affine_warp_output_shape function
compute_affine_warp_output_shape(
matrix: np.ndarray,
input_shape: tuple[int, ...]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
matrix | np.ndarray | - | - |
input_shape | tuple[int, ...] | - | - |
compute_pairwise_distances function
compute_pairwise_distances(
points1: np.ndarray,
points2: np.ndarray
)
Compute pairwise distances between two sets of points.
Parameters
Name | Type | Default | Description |
---|---|---|---|
points1 | np.ndarray | - | First set of points with shape (N, 2) |
points2 | np.ndarray | - | Second set of points with shape (M, 2) |
Returns
- np.ndarray: Matrix of pairwise distances with shape (N, M)
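The shape contract (N points against M points gives an N x M matrix) can be illustrated with a small pure-Python sketch. Euclidean distance is assumed here, and `pairwise_distances` is a hypothetical stand-in rather than the library function.

```python
import math


def pairwise_distances(points1, points2):
    """Return an N x M nested list of Euclidean distances."""
    return [
        [math.hypot(x1 - x2, y1 - y2) for x2, y2 in points2]
        for x1, y1 in points1
    ]


distances = pairwise_distances([(0, 0), (3, 4)], [(0, 0), (3, 0), (0, 4)])
print(distances)  # [[0.0, 3.0, 4.0], [5.0, 4.0, 3.0]]
```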
compute_perspective_params function
compute_perspective_params(
points: np.ndarray,
image_shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
points | np.ndarray | - | - |
image_shape | tuple[int, int] | - | - |
compute_tps_weights function
compute_tps_weights(
src_points: np.ndarray,
dst_points: np.ndarray
)
Compute Thin Plate Spline weights.
Parameters
Name | Type | Default | Description |
---|---|---|---|
src_points | np.ndarray | - | Source control points with shape (num_points, 2) |
dst_points | np.ndarray | - | Destination control points with shape (num_points, 2) |
Returns
- tuple[np.ndarray, np.ndarray]: Tuple of (nonlinear_weights, affine_weights)
compute_transformed_image_bounds function
compute_transformed_image_bounds(
matrix: np.ndarray,
image_shape: tuple[int, int]
)
Compute the bounds of an image after applying an affine transformation.
Parameters
Name | Type | Default | Description |
---|---|---|---|
matrix | np.ndarray | - | The 3x3 affine transformation matrix. |
image_shape | tuple[int, int] | - | The shape of the image as (height, width). |
Returns
- tuple[np.ndarray, np.ndarray]: A tuple of two arrays describing the bounds of the transformed image.
copy_make_border_with_value_extension function
copy_make_border_with_value_extension(
img: np.ndarray,
top: int,
bottom: int,
left: int,
right: int,
border_mode: int,
value: tuple[float, ...] | float
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | - |
top | int | - | - |
bottom | int | - | - |
left | int | - | - |
right | int | - | - |
border_mode | int | - | - |
value | tuple[float, ...] or float | - | - |
create_affine_transformation_matrix function
create_affine_transformation_matrix(
translate: Mapping[str, float],
shear: dict[str, float],
scale: dict[str, float],
rotate: float,
shift: tuple[float, float]
)
Create an affine transformation matrix combining translation, shear, scale, and rotation.
Parameters
Name | Type | Default | Description |
---|---|---|---|
translate | Mapping[str, float] | - | Translation in x and y directions. |
shear | dict[str, float] | - | Shear in x and y directions (in degrees). |
scale | dict[str, float] | - | Scale factors for x and y directions. |
rotate | float | - | Rotation angle in degrees. |
shift | tuple[float, float] | - | Shift to apply before and after transformations. |
Returns
- np.ndarray: The resulting 3x3 affine transformation matrix.
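One way to picture how the pieces combine is to chain 3x3 matrices: shift the pivot to the origin, scale, rotate, translate, then shift back. The sketch below is an assumption-laden illustration: the composition order, the `"x"`/`"y"` dictionary keys, the omission of shear, and all helper names are hypothetical and may not match the library.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def rotation(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def compose_affine(translate, scale, rotate, shift):
    """Chain the transforms around the pivot `shift` (shear left out for brevity)."""
    m = translation(-shift[0], -shift[1])           # move the pivot to the origin
    m = matmul(scaling(scale["x"], scale["y"]), m)  # scale about the origin
    m = matmul(rotation(rotate), m)                 # rotate about the origin
    m = matmul(translation(translate["x"], translate["y"]), m)
    return matmul(translation(shift[0], shift[1]), m)  # move the pivot back


def apply(m, x, y):
    return (m[0][0] * x + m[0][1] * y + m[0][2], m[1][0] * x + m[1][1] * y + m[1][2])


m = compose_affine({"x": 0, "y": 0}, {"x": 1, "y": 1}, 90, (50, 50))
print(apply(m, 50, 50))  # the pivot stays where it is, up to rounding
print(apply(m, 51, 50))  # close to (50, 51)
```

Shifting before and after the other steps is what makes the scale and rotation act around the chosen pivot instead of the top-left corner.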
create_piecewise_affine_maps function
create_piecewise_affine_maps(
image_shape: tuple[int, int],
grid: tuple[int, int],
scale: float,
absolute_scale: bool,
random_generator: np.random.Generator
)
Create maps for piecewise affine transformation using OpenCV's remap function.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | - |
grid | tuple[int, int] | - | - |
scale | float | - | - |
absolute_scale | bool | - | - |
random_generator | np.random.Generator | - | - |
create_shape_groups function
create_shape_groups(
tiles: np.ndarray
)
Groups tiles by their shape and stores the indices for each shape.
Parameters
Name | Type | Default | Description |
---|---|---|---|
tiles | np.ndarray | - | - |
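Grouping by shape can be sketched with a dictionary keyed on (height, width). The tile layout `(y_min, x_min, y_max, x_max)` and the helper name `group_tiles_by_shape` are assumptions for this example; the text above does not specify the tile format.

```python
from collections import defaultdict


def group_tiles_by_shape(tiles):
    """Map each (height, width) to the indices of tiles with that shape."""
    groups = defaultdict(list)
    for index, (y_min, x_min, y_max, x_max) in enumerate(tiles):
        groups[(y_max - y_min, x_max - x_min)].append(index)
    return dict(groups)


tiles = [(0, 0, 50, 50), (0, 50, 50, 100), (50, 0, 100, 60), (50, 60, 100, 100)]
print(group_tiles_by_shape(tiles))
# {(50, 50): [0, 1], (50, 60): [2], (50, 40): [3]}
```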
d4 function
d4(
img: np.ndarray,
group_member: Literal['e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't']
)
Applies a `D_4` symmetry group transformation to an image array. This function manipulates an image using transformations such as rotations and flips, corresponding to the `D_4` dihedral group symmetry operations. Each transformation is identified by a unique group member code.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | The input image array to transform. |
group_member | Literal['e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'] | - | A string identifier indicating the specific transformation to apply. Valid codes: 'e': identity (no transformation); 'r90': rotate 90 degrees counterclockwise; 'r180': rotate 180 degrees; 'r270': rotate 270 degrees counterclockwise; 'v': vertical flip; 'hvt': transpose over the second diagonal; 'h': horizontal flip; 't': transpose (reflect over the main diagonal). |
Returns
- np.ndarray: The transformed image array.
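The eight group members can be illustrated on a tiny nested-list "image". This is a pure-Python sketch with hypothetical helper names; it follows the `np.rot90` convention for counterclockwise rotation and treats 'hvt' as a transpose over the second diagonal, as described above.

```python
def hflip(img):
    return [row[::-1] for row in img]


def vflip(img):
    return [row[:] for row in img[::-1]]


def transpose(img):
    return [list(col) for col in zip(*img)]


def rot90(img, k=1):
    """Rotate counterclockwise k times (same direction as np.rot90)."""
    for _ in range(k % 4):
        img = transpose(hflip(img))
    return img


D4 = {
    "e": lambda img: [row[:] for row in img],
    "r90": lambda img: rot90(img, 1),
    "r180": lambda img: rot90(img, 2),
    "r270": lambda img: rot90(img, 3),
    "v": vflip,
    "hvt": lambda img: transpose(rot90(img, 2)),  # reflect over the second diagonal
    "h": hflip,
    "t": transpose,
}


def d4_sketch(img, group_member):
    if group_member not in D4:
        raise ValueError(f"Invalid group member: {group_member}")
    return D4[group_member](img)


image = [[1, 2], [3, 4]]
print(d4_sketch(image, "r90"))  # [[2, 4], [1, 3]]
print(d4_sketch(image, "hvt"))  # [[4, 2], [3, 1]]
```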
distort_image function
distort_image(
image: np.ndarray,
generated_mesh: np.ndarray,
interpolation: int
)
Apply perspective distortion to an image based on a generated mesh. This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image | np.ndarray | - | The input image to be distorted. Can be a 2D grayscale image or a 3D color image. |
generated_mesh | np.ndarray | - | A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral. |
interpolation | int | - | Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR). |
Returns
- np.ndarray: The distorted image with the same shape and dtype as the input image.
Example
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
>>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])
>>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)
>>> distorted.shape
(100, 100, 3)
Notes
- The function preserves the channel dimension of the input image.
- Each cell of the generated mesh is transformed independently and then blended into the output image.
- The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.
distort_image_keypoints function
distort_image_keypoints(
keypoints: np.ndarray,
generated_mesh: np.ndarray,
image_shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | - |
generated_mesh | np.ndarray | - | - |
image_shape | tuple[int, int] | - | - |
expand_transform function
expand_transform(
matrix: np.ndarray,
shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
matrix | np.ndarray | - | - |
shape | tuple[int, int] | - | - |
extend_value function
extend_value(
value: tuple[float, ...] | float,
num_channels: int
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
value | tuple[float, ...] or float | - | - |
num_channels | int | - | - |
find_keypoint function
find_keypoint(
position: tuple[int, int],
distance_map: np.ndarray,
threshold: float | None,
inverted: bool
)
Determine if a valid keypoint can be found at the given position.
Parameters
Name | Type | Default | Description |
---|---|---|---|
position | tuple[int, int] | - | - |
distance_map | np.ndarray | - | - |
threshold | float or None | - | - |
inverted | bool | - | - |
flip_bboxes function
flip_bboxes(
bboxes: np.ndarray,
flip_horizontal: bool = False,
flip_vertical: bool = False,
image_shape: tuple[int, int] = (0, 0)
)
Flip bounding boxes horizontally and/or vertically.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...]. |
flip_horizontal | bool | False | Whether to flip horizontally. |
flip_vertical | bool | False | Whether to flip vertically. |
image_shape | tuple[int, int] | (0, 0) | Shape of the image as (height, width). |
Returns
- np.ndarray: Flipped bounding boxes.
flip_keypoints function
flip_keypoints(
keypoints: np.ndarray,
flip_horizontal: bool = False,
flip_vertical: bool = False,
image_shape: tuple[int, int] = (0, 0)
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | - |
flip_horizontal | bool | False | - |
flip_vertical | bool | False | - |
image_shape | tuple[int, int] | (0, 0) | - |
from_distance_maps function
from_distance_maps(
distance_maps: np.ndarray,
inverted: bool,
if_not_found_coords: Sequence[int] | dict[str, Any] | None = None,
threshold: float | None = None
)
Convert distance maps back to keypoints coordinates. This function is the inverse of `to_distance_maps`. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.
Parameters
Name | Type | Default | Description |
---|---|---|---|
distance_maps | np.ndarray | - | A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint. |
inverted | bool | - | If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity). |
if_not_found_coords | Sequence[int] or dict[str, Any] or None | None | Coordinates to use for keypoints that are not found or fall outside the threshold. Can be: None (drop keypoints that are not found); a sequence of two integers (use these as (x, y) coordinates for not found keypoints); or a dict with 'x' and 'y' keys (use these values for not found keypoints). Defaults to None. |
threshold | float or None | None | A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None. |
Returns
- np.ndarray: A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the keypoints.
Example
>>> distance_maps = np.random.rand(100, 100, 3) # 3 keypoints
>>> inverted = True
>>> if_not_found_coords = [0, 0]
>>> threshold = 0.5
>>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)
>>> print(keypoints.shape)
(3, 2)
Notes
- The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
- When `threshold` is None, all keypoints are considered valid, and `if_not_found_coords` is not used.
- The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.
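The round trip can be sketched for a single keypoint and a regular (non-inverted) map: build the distance of every pixel to the keypoint, then recover the keypoint as the location of the minimum. Both helpers below are hypothetical pure-Python stand-ins that only mirror the idea, not the library's vectorized implementation.

```python
import math


def to_distance_map(keypoint, height, width):
    """Distance from every pixel (x, y) to the keypoint, as a nested list."""
    kx, ky = keypoint
    return [[math.hypot(x - kx, y - ky) for x in range(width)] for y in range(height)]


def from_distance_map(distance_map, threshold=None, if_not_found=None):
    """Recover (x, y) as the position of the smallest distance."""
    best_value, best_xy = min(
        (value, (x, y))
        for y, row in enumerate(distance_map)
        for x, value in enumerate(row)
    )
    # For regular maps, a minimum above the threshold counts as "not found".
    if threshold is not None and best_value > threshold:
        return if_not_found
    return best_xy


print(from_distance_map(to_distance_map((3, 5), height=10, width=8)))  # (3, 5)

# A keypoint far outside the image is reported as not found under a tight threshold.
outside = to_distance_map((20, 5), height=10, width=8)
print(from_distance_map(outside, threshold=1.0, if_not_found=(0, 0)))  # (0, 0)
```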
generate_control_points function
generate_control_points(
num_control_points: int
)
Generate control points for TPS transformation.
Parameters
Name | Type | Default | Description |
---|---|---|---|
num_control_points | int | - | Number of control points per side |
Returns
- np.ndarray: Control points with shape (N, 2)
generate_displacement_fields function
generate_displacement_fields(
image_shape: tuple[int, int],
alpha: float,
sigma: float,
same_dxdy: bool,
kernel_size: tuple[int, int],
random_generator: np.random.Generator,
noise_distribution: Literal['gaussian', 'uniform']
)
Generate displacement fields for elastic transform.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | - |
alpha | float | - | - |
sigma | float | - | - |
same_dxdy | bool | - | - |
kernel_size | tuple[int, int] | - | - |
random_generator | np.random.Generator | - | - |
noise_distribution | Literal['gaussian', 'uniform'] | - | - |
generate_distorted_grid_polygons function
generate_distorted_grid_polygons(
dimensions: np.ndarray,
magnitude: int,
random_generator: np.random.Generator
)
Generate distorted grid polygons based on input dimensions and magnitude. This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.
Parameters
Name | Type | Default | Description |
---|---|---|---|
dimensions | np.ndarray | - | A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell. |
magnitude | int | - | Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude]. |
random_generator | np.random.Generator | - | A random number generator. |
Returns
- np.ndarray: A 2D array of shape (total_cells, 8) where each row represents a distorted polygon
Example
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],
... [[0, 50, 50, 100], [50, 50, 100, 100]]])
>>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10, random_generator=np.random.default_rng(0))
>>> distorted.shape
(4, 8)
Notes
- Only internal grid points are distorted; boundary points remain fixed.
- The function ensures consistent distortion across shared vertices of adjacent cells.
- The distortion is applied to the following points of each internal cell:
  * Bottom-right of the cell above and to the left
  * Bottom-left of the cell above
  * Top-right of the cell to the left
  * Top-left of the current cell
- Each square represents a cell, and the X marks indicate the coordinates where displacement occurs.
    +--+--+--+--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--+--+--+--+
- For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.
generate_grid function
generate_grid(
image_shape: tuple[int, int],
steps_x: list[float],
steps_y: list[float],
num_steps: int
)
Generate a distorted grid for image transformation based on given step sizes. This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | The shape of the image as (height, width). |
steps_x | list[float] | - | List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction. |
steps_y | list[float] | - | List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction. |
num_steps | int | - | The number of steps to divide each axis into. This determines the granularity of the distortion grid. |
Returns
- tuple[np.ndarray, np.ndarray]: A tuple containing two 2D numpy arrays, (map_x, map_y).
Example
>>> image_shape = (100, 100)
>>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]
>>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]
>>> num_steps = 5
>>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)  # placeholder input image
>>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)
Notes
- The function generates a grid where each cell can be distorted independently.
- The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
- The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
- The distortion is applied smoothly across each grid cell using linear interpolation.
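To make the role of the step sizes concrete, here is a one-axis sketch: each cell of the axis is filled by linear interpolation between the previous grid line and the next, whose position is advanced by the cell width times the relative step. The helper names and the exact cell arithmetic are assumptions for illustration and may differ from the library.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (num >= 1)."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def distorted_axis_map(size, steps, num_steps):
    """Source coordinate for every output pixel along one axis."""
    cell = size // num_steps
    axis_map = [0.0] * size
    prev = 0.0
    for idx, step in enumerate(steps):
        start = idx * cell
        end = min(start + cell, size)
        cur = prev + cell * step  # a step above 1 stretches, below 1 compresses
        if end > start:
            axis_map[start:end] = linspace(prev, cur, end - start)
        prev = cur
    return axis_map


row = distorted_axis_map(100, [1.1, 0.9, 1.0, 1.2, 0.95, 1.05], num_steps=5)
print(len(row), row[0], round(row[19], 6))  # 100 0.0 22.0
```

A full `map_x` would repeat such a row for every image row, and `map_y` would repeat the corresponding column for every image column.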
generate_inverse_distortion_map function
generate_inverse_distortion_map(
map_x: np.ndarray,
map_y: np.ndarray,
shape: tuple[int, int]
)
Generate inverse mapping for strong distortions.
Parameters
Name | Type | Default | Description |
---|---|---|---|
map_x | np.ndarray | - | - |
map_y | np.ndarray | - | - |
shape | tuple[int, int] | - | - |
generate_perspective_points function
generate_perspective_points(
image_shape: tuple[int, int],
scale: float,
random_generator: np.random.Generator
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | - |
scale | float | - | - |
random_generator | np.random.Generator | - | - |
generate_reflected_bboxes function
generate_reflected_bboxes(
bboxes: np.ndarray,
grid_dims: dict[str, tuple[int, int]],
image_shape: tuple[int, int],
center_in_origin: bool = False
)
Generate reflected bounding boxes for the entire reflection grid.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Original bounding boxes. |
grid_dims | dict[str, tuple[int, int]] | - | Grid dimensions and original position. |
image_shape | tuple[int, int] | - | Shape of the original image as (height, width). |
center_in_origin | bool | False | If True, center the grid at the origin. Default is False. |
Returns
- np.ndarray: Array of reflected and shifted bounding boxes for the entire grid.
generate_reflected_keypoints function
generate_reflected_keypoints(
keypoints: np.ndarray,
grid_dims: dict[str, tuple[int, int]],
image_shape: tuple[int, int],
center_in_origin: bool = False
)
Generate reflected keypoints for the entire reflection grid. This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the `center_in_origin` parameter.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...). |
grid_dims | dict[str, tuple[int, int]] | - | A dictionary containing grid dimensions and original position. It should have the following keys: - "grid_shape": tuple[int, int] representing (grid_rows, grid_cols) - "original_position": tuple[int, int] representing (original_row, original_col) |
image_shape | tuple[int, int] | - | Shape of the original image as (height, width). |
center_in_origin | bool | False | If True, center the grid at the origin. Default is False. |
Returns
- np.ndarray: Array of reflected and shifted keypoints for the entire grid. The shape is
Notes
- The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
- It preserves the angle and scale information of the keypoints during transformations.
- The resulting grid can be either centered at the origin or positioned based on the original grid.
generate_shuffled_splitsfunction
generate_shuffled_splits(
size: int,
divisions: int,
random_generator: np.random.Generator
)
Generate shuffled splits for a given dimension size and number of divisions.
Parameters
Name | Type | Default | Description |
---|---|---|---|
size | int | - | Total size of the dimension (height or width). |
divisions | int | - | Number of divisions (rows or columns). |
random_generator | np.random.Generator | - | The random generator to use for shuffling the splits. If None, the splits are not shuffled. |
Returns
- np.ndarray: Cumulative edges of the shuffled intervals.
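A minimal sketch of the idea, assuming the dimension is first divided into near-equal intervals that are then shuffled and accumulated into edges. The helper name `shuffled_split_edges` is hypothetical, and the library's implementation may differ in detail.

```python
import numpy as np


def shuffled_split_edges(size, divisions, rng=None):
    """Hypothetical helper: cumulative edges of shuffled near-equal intervals."""
    base, remainder = divmod(size, divisions)
    # The first `remainder` intervals get one extra pixel so the total equals `size`.
    intervals = np.full(divisions, base, dtype=int)
    intervals[:remainder] += 1
    if rng is not None:
        rng.shuffle(intervals)  # in-place shuffle of the interval sizes
    # Prepend 0 so the result holds divisions + 1 edges.
    return np.concatenate(([0], np.cumsum(intervals)))


edges = shuffled_split_edges(10, 3, np.random.default_rng(0))
print(edges)  # first edge is 0, last edge is 10
```

Whatever the shuffle order, the edges always start at 0 and end at `size`, so the intervals tile the dimension exactly.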
get_camera_matrix_distortion_mapsfunction
get_camera_matrix_distortion_maps(
image_shape: tuple[int, int],
k: float
)
Generate distortion maps using camera matrix model.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | Image shape (height, width) |
k | float | - | Distortion coefficient |
Returns
- tuple[np.ndarray, np.ndarray]: Tuple of (map_x, map_y) distortion maps
get_dimension_paddingfunction
get_dimension_padding(
current_size: int,
min_size: int | None,
divisor: int | None
)
Calculate padding for a single dimension.
Parameters
Name | Type | Default | Description |
---|---|---|---|
current_size | int | - | Current size of the dimension |
min_size | One of: int, None | - | Minimum size requirement, if any |
divisor | One of: int, None | - | Divisor for padding to make size divisible, if any |
Returns
- tuple[int, int]: (pad_before, pad_after)
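The arithmetic can be sketched as follows. This is an illustrative re-implementation under two assumptions not stated above: `min_size` takes precedence over `divisor`, and any odd pixel of padding goes after the content. The name `dimension_padding` is hypothetical.

```python
def dimension_padding(current_size, min_size=None, divisor=None):
    """Hypothetical sketch returning (pad_before, pad_after) for one dimension."""
    if min_size is not None:
        # Pad up to the minimum size; never crop.
        total = max(min_size - current_size, 0)
    elif divisor is not None:
        # Pad up to the next multiple of `divisor`.
        total = (-current_size) % divisor
    else:
        total = 0
    pad_before = total // 2
    return pad_before, total - pad_before


print(dimension_padding(100, min_size=128))  # (14, 14)
print(dimension_padding(101, min_size=128))  # (13, 14)
print(dimension_padding(100, divisor=32))    # (14, 14)
```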
get_fisheye_distortion_mapsfunction
get_fisheye_distortion_maps(
image_shape: tuple[int, int],
k: float
)
Generate distortion maps using fisheye model.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | Image shape (height, width) |
k | float | - | Distortion coefficient |
Returns
- tuple[np.ndarray, np.ndarray]: Tuple of (map_x, map_y) distortion maps
get_pad_grid_dimensionsfunction
get_pad_grid_dimensions(
pad_top: int,
pad_bottom: int,
pad_left: int,
pad_right: int,
image_shape: tuple[int, int]
)
Calculate the dimensions of the grid needed for reflection padding and the position of the original image.
Parameters
Name | Type | Default | Description |
---|---|---|---|
pad_top | int | - | Number of pixels to pad above the image. |
pad_bottom | int | - | Number of pixels to pad below the image. |
pad_left | int | - | Number of pixels to pad to the left of the image. |
pad_right | int | - | Number of pixels to pad to the right of the image. |
image_shape | tuple[int, int] | - | Shape of the original image as (height, width). |
Returns
- dict[str, tuple[int, int]]: A dictionary containing "grid_shape" as (grid_rows, grid_cols) and "original_position" as (original_row, original_col).
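One way to picture the calculation, as a sketch only: count how many whole-image tiles are needed to cover the padding on each side, then add one for the original image. The function name `pad_grid_dimensions` and the use of `math.ceil` are assumptions made for illustration; the dictionary keys follow the `grid_dims` description used elsewhere in this module.

```python
import math


def pad_grid_dimensions(pad_top, pad_bottom, pad_left, pad_right, image_shape):
    """Hypothetical sketch of the reflection-grid size calculation."""
    rows, cols = image_shape
    # Number of image-sized tiles needed to cover the padding on each side.
    tiles_above = math.ceil(pad_top / rows)
    tiles_below = math.ceil(pad_bottom / rows)
    tiles_left = math.ceil(pad_left / cols)
    tiles_right = math.ceil(pad_right / cols)
    return {
        "grid_shape": (tiles_above + 1 + tiles_below, tiles_left + 1 + tiles_right),
        "original_position": (tiles_above, tiles_left),
    }


print(pad_grid_dimensions(10, 10, 10, 10, (100, 100)))
```

With 10 pixels of padding on every side of a 100x100 image, one reflected tile per side is enough, so the grid is 3x3 with the original image in the middle cell.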
get_padding_paramsfunction
get_padding_params(
image_shape: tuple[int, int],
min_height: int | None,
min_width: int | None,
pad_height_divisor: int | None,
pad_width_divisor: int | None
)
Calculate padding parameters based on target dimensions.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | (height, width) of the image |
min_height | One of: int, None | - | Minimum height requirement, if any |
min_width | One of: int, None | - | Minimum width requirement, if any |
pad_height_divisor | One of: int, None | - | Divisor for height padding, if any |
pad_width_divisor | One of: int, None | - | Divisor for width padding, if any |
Returns
- tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)
is_identity_matrixfunction
is_identity_matrix(
matrix: np.ndarray
)
Check if the given matrix is an identity matrix.
Parameters
Name | Type | Default | Description |
---|---|---|---|
matrix | np.ndarray | - | A 3x3 affine transformation matrix. |
Returns
- bool: True if the matrix is an identity matrix, False otherwise.
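A minimal equivalent check in NumPy, assuming a tolerance-based comparison is acceptable (whether the library compares exactly or within a tolerance is not stated here). The name `looks_like_identity` is hypothetical.

```python
import numpy as np


def looks_like_identity(matrix):
    """Hypothetical check: is a 3x3 matrix (numerically) the identity?"""
    return bool(np.allclose(matrix, np.eye(3)))


translation = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(looks_like_identity(np.eye(3)), looks_like_identity(translation))
```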
is_valid_componentfunction
is_valid_component(
component_area: float,
original_area: float,
min_area: float | None,
min_visibility: float | None
)
Validate if a component meets the minimum requirements.
Parameters
Name | Type | Default | Description |
---|---|---|---|
component_area | float | - | - |
original_area | float | - | - |
min_area | One of: float, None | - | - |
min_visibility | One of: float, None | - | - |
keypoints_affinefunction
keypoints_affine(
keypoints: np.ndarray,
matrix: np.ndarray,
image_shape: tuple[int, int],
scale: dict[str, float],
border_mode: int
)
Apply an affine transformation to keypoints. This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (N, 4+) where N is the number of keypoints. Each keypoint is represented as [x, y, angle, scale, ...]. |
matrix | np.ndarray | - | The 2x3 or 3x3 affine transformation matrix. |
image_shape | tuple[int, int] | - | Shape of the image (height, width). |
scale | dict[str, float] | - | Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'. |
border_mode | int | - | Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc. |
Returns
- np.ndarray: Transformed keypoints array with the same shape as input.
Example
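>>> import cv2
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_affine
>>> keypoints = np.array([[100, 100, 0, 1]])
>>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])
>>> scale = {'x': 1.5, 'y': 1.2}
>>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)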
Notes
- The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
- Coordinates (x, y) are transformed using the affine matrix.
- Angles are adjusted based on the rotation component of the affine transformation.
- Scales are multiplied by the maximum of x and y scale factors.
- The @angle_2pi_range decorator ensures angles remain in the [0, 2π] range.
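The coordinate, angle, and scale updates described in the notes can be sketched in plain NumPy. This is an illustration under assumptions, not the library's code: reflection padding is omitted, the rotation is read from the matrix with `arctan2`, and the sign convention of the angle update is assumed. The name `affine_keypoints_sketch` is hypothetical.

```python
import numpy as np


def affine_keypoints_sketch(keypoints, matrix, scale):
    """Hypothetical sketch: transform [x, y, angle, scale, ...] keypoints."""
    keypoints = np.asarray(keypoints, dtype=float).copy()
    matrix = np.asarray(matrix, dtype=float)[:2]  # accepts 2x3 or 3x3 input
    # Homogeneous coordinates [x, y, 1] multiplied by the 2x3 matrix.
    xy1 = np.column_stack([keypoints[:, :2], np.ones(len(keypoints))])
    keypoints[:, :2] = xy1 @ matrix.T
    # Rotation encoded in the linear part of the matrix (assumed convention).
    rotation = np.arctan2(matrix[1, 0], matrix[0, 0])
    keypoints[:, 2] = (keypoints[:, 2] + rotation) % (2 * np.pi)
    # Keypoint scale follows the larger of the two axis scale factors.
    keypoints[:, 3] *= max(scale["x"], scale["y"])
    return keypoints


kps = affine_keypoints_sketch(
    [[100, 100, 0, 1]],
    [[1.5, 0, 10], [0, 1.2, 20]],
    {"x": 1.5, "y": 1.2},
)
print(kps)
```

For the inputs of the example above, the point (100, 100) maps to (1.5 * 100 + 10, 1.2 * 100 + 20) = (160, 140), the angle stays 0 because the matrix has no rotation, and the scale becomes 1.5.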
keypoints_d4function
keypoints_d4(
keypoints: np.ndarray,
group_member: Literal['e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'],
image_shape: tuple[int, int]
)
Applies a `D_4` symmetry group transformation to keypoints. This function adjusts keypoint coordinates according to the specified `D_4` group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoints remain within its boundaries.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...). |
group_member | One of: 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't' | - | A string identifier for the `D_4` group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'. |
image_shape | tuple[int, int] | - | The shape of the image. |
Returns
- KeypointInternalType: The transformed keypoint.
keypoints_hflipfunction
keypoints_hflip(
keypoints: np.ndarray,
cols: int
)
Flip keypoints horizontally.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (num_keypoints, 2+) |
cols | int | - | Number of columns in the image |
Returns
- np.ndarray: Horizontally flipped keypoints
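A small sketch of the coordinate update, checked against flipping an actual image. It assumes pixel-index coordinates, so that x maps to `(cols - 1) - x`; the library's exact convention and its handling of the angle column are not shown here. The name `hflip_xy` is hypothetical.

```python
import numpy as np


def hflip_xy(keypoints, cols):
    """Hypothetical sketch: mirror the x coordinate, leave other columns as-is."""
    flipped = np.asarray(keypoints, dtype=float).copy()
    flipped[:, 0] = (cols - 1) - flipped[:, 0]
    return flipped


image = np.zeros((4, 6), dtype=np.uint8)
image[1, 2] = 255  # marker at x=2, y=1
x, y = hflip_xy([[2, 1]], cols=6)[0].astype(int)
# The flipped keypoint should land on the marker in the flipped image.
print(np.fliplr(image)[y, x] == 255)
```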
keypoints_rot90function
keypoints_rot90(
keypoints: np.ndarray,
factor: Literal[0, 1, 2, 3],
image_shape: tuple[int, int]
)
Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...). |
factor | One of: 0, 1, 2, 3 | - | The number of 90 degree CCW rotations to apply. Must be in the range [0, 3]. |
image_shape | tuple[int, int] | - | The shape of the image (height, width). |
Returns
- np.ndarray: The rotated keypoints with the same shape as the input.
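To make the coordinate mapping concrete, the sketch below rotates (x, y) coordinates and checks the result against `np.rot90` applied to an image. It assumes pixel-index coordinates and ignores the angle and scale columns; `rot90_xy` is a hypothetical name, not the library function.

```python
import numpy as np


def rot90_xy(keypoints, factor, image_shape):
    """Hypothetical sketch: rotate (x, y) by 90 degrees CCW `factor` times."""
    height, width = image_shape
    xy = np.asarray(keypoints, dtype=float)[:, :2].copy()
    for _ in range(factor % 4):
        x, y = xy[:, 0].copy(), xy[:, 1].copy()
        # One CCW turn: the old row becomes the column, columns are mirrored.
        xy[:, 0] = y
        xy[:, 1] = (width - 1) - x
        height, width = width, height  # the image dimensions swap as well
    return xy


image = np.zeros((4, 6), dtype=np.uint8)
image[1, 2] = 255  # marker at x=2, y=1
for k in range(4):
    x, y = rot90_xy([[2, 1]], k, image.shape)[0].astype(int)
    # The mapped point should land on the marker in the rotated image.
    assert np.rot90(image, k)[y, x] == 255
```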
keypoints_scalefunction
keypoints_scale(
keypoints: np.ndarray,
scale_x: float,
scale_y: float
)
Scale keypoints by given factors.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (num_keypoints, 2+) |
scale_x | float | - | Scale factor for x coordinates |
scale_y | float | - | Scale factor for y coordinates |
Returns
- np.ndarray: Scaled keypoints
keypoints_transposefunction
keypoints_transpose(
keypoints: np.ndarray
)
Transpose keypoints along the main diagonal.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (num_keypoints, 2+) |
Returns
- np.ndarray: Transposed keypoints
keypoints_vflipfunction
keypoints_vflip(
keypoints: np.ndarray,
rows: int
)
Flip keypoints vertically.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (num_keypoints, 2+) |
rows | int | - | Number of rows in the image |
Returns
- np.ndarray: Vertically flipped keypoints
normalize_grid_distortion_stepsfunction
normalize_grid_distortion_steps(
image_shape: tuple[int, int],
num_steps: int,
x_steps: list[float],
y_steps: list[float]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | - |
num_steps | int | - | - |
x_steps | list[float] | - | - |
y_steps | list[float] | - | - |
order_pointsfunction
order_points(
pts: np.ndarray
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
pts | np.ndarray | - | - |
padfunction
pad(
img: np.ndarray,
min_height: int,
min_width: int,
border_mode: int,
value: tuple[float, ...] | float | None
)
Pad an image to ensure minimum dimensions. This function adds padding to an image if its dimensions are smaller than the specified minimum dimensions. Padding is added evenly on all sides.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input image to pad. |
min_height | int | - | Minimum height of the output image. |
min_width | int | - | Minimum width of the output image. |
border_mode | int | - | OpenCV border mode for padding. |
value | One of: tuple[float, ...], float, None | - | Value(s) to fill the border pixels. |
Returns
- np.ndarray: Padded image with dimensions at least (min_height, min_width).
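The even split of padding can be illustrated with `np.pad`. This sketch swaps in NumPy's constant-mode padding for the OpenCV border modes and assumes any odd pixel goes to the bottom or right; `pad_to_min_size` is a hypothetical name.

```python
import numpy as np


def pad_to_min_size(img, min_height, min_width, value=0):
    """Hypothetical sketch: constant padding up to a minimum height and width."""
    height, width = img.shape[:2]
    pad_h = max(min_height - height, 0)
    pad_w = max(min_width - width, 0)
    top, left = pad_h // 2, pad_w // 2
    pad_width = [(top, pad_h - top), (left, pad_w - left)]
    pad_width += [(0, 0)] * (img.ndim - 2)  # leave channel axes untouched
    return np.pad(img, pad_width, mode="constant", constant_values=value)


padded = pad_to_min_size(np.ones((100, 90, 3), dtype=np.uint8), 128, 128)
print(padded.shape)  # (128, 128, 3)
```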
pad_bboxesfunction
pad_bboxes(
bboxes: np.ndarray,
pad_top: int,
pad_bottom: int,
pad_left: int,
pad_right: int,
border_mode: int,
image_shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | - |
pad_top | int | - | - |
pad_bottom | int | - | - |
pad_left | int | - | - |
pad_right | int | - | - |
border_mode | int | - | - |
image_shape | tuple[int, int] | - | - |
pad_keypointsfunction
pad_keypoints(
keypoints: np.ndarray,
pad_top: int,
pad_bottom: int,
pad_left: int,
pad_right: int,
border_mode: int,
image_shape: tuple[int, int]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | - |
pad_top | int | - | - |
pad_bottom | int | - | - |
pad_left | int | - | - |
pad_right | int | - | - |
border_mode | int | - | - |
image_shape | tuple[int, int] | - | - |
pad_with_paramsfunction
pad_with_params(
img: np.ndarray,
h_pad_top: int,
h_pad_bottom: int,
w_pad_left: int,
w_pad_right: int,
border_mode: int,
value: tuple[float, ...] | float | None
)
Pad an image with explicitly defined padding on each side. This function adds specified amounts of padding to each side of the image.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input image to pad. |
h_pad_top | int | - | Number of pixels to add at the top. |
h_pad_bottom | int | - | Number of pixels to add at the bottom. |
w_pad_left | int | - | Number of pixels to add on the left. |
w_pad_right | int | - | Number of pixels to add on the right. |
border_mode | int | - | OpenCV border mode for padding. |
value | One of: tuple[float, ...], float, None | - | Value(s) to fill the border pixels. |
Returns
- np.ndarray: Padded image.
perspectivefunction
perspective(
img: np.ndarray,
matrix: np.ndarray,
max_width: int,
max_height: int,
border_val: float | list[float] | np.ndarray,
border_mode: int,
keep_size: bool,
interpolation: int
)
Apply perspective transformation to an image. This function warps an image according to a perspective transformation matrix. It can either maintain the original dimensions or use the specified max dimensions.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input image to transform. |
matrix | np.ndarray | - | 3x3 perspective transformation matrix. |
max_width | int | - | Maximum width of the output image if keep_size is False. |
max_height | int | - | Maximum height of the output image if keep_size is False. |
border_val | One of: float, list[float], np.ndarray | - | Border value(s) to fill areas outside the transformed image. |
border_mode | int | - | OpenCV border mode (e.g., cv2.BORDER_CONSTANT, cv2.BORDER_REFLECT). |
keep_size | bool | - | If True, maintain the original image dimensions. |
interpolation | int | - | Interpolation method for resampling (cv2 interpolation flag). |
Returns
- np.ndarray: Perspective-transformed image.
perspective_bboxesfunction
perspective_bboxes(
bboxes: np.ndarray,
image_shape: tuple[int, int],
matrix: np.ndarray,
max_width: int,
max_height: int,
keep_size: bool
)
Applies perspective transformation to bounding boxes. This function transforms bounding boxes using the given perspective transformation matrix. It handles bounding boxes with additional attributes beyond the standard coordinates.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | An array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). Additional columns beyond the first 4 are preserved unchanged. |
image_shape | tuple[int, int] | - | The shape of the image (height, width). |
matrix | np.ndarray | - | The perspective transformation matrix. |
max_width | int | - | The maximum width of the output image. |
max_height | int | - | The maximum height of the output image. |
keep_size | bool | - | If True, maintains the original image size after transformation. |
Returns
- np.ndarray: An array of transformed bounding boxes with the same shape as input.
Example
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import perspective_bboxes
>>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])
>>> image_shape = (100, 100)
>>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])
>>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)
Notes
- This function modifies only the coordinate columns (first 4) of the input bounding boxes.
- Any additional attributes (columns beyond the first 4) are kept unchanged.
- The function handles denormalization and renormalization of coordinates internally.
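The denormalize, transform, renormalize sequence mentioned in the notes can be sketched as below. It is a simplified illustration: the four corners of each box are warped with the 3x3 matrix and the axis-aligned hull is taken, while the `keep_size` rescaling and any clipping are left out. The name `perspective_bboxes_sketch` and the `output_shape` argument are assumptions for this example.

```python
import numpy as np


def perspective_bboxes_sketch(bboxes, image_shape, matrix, output_shape):
    """Hypothetical sketch: warp normalized boxes with a 3x3 homography."""
    height, width = image_shape
    out_height, out_width = output_shape
    bboxes = np.asarray(bboxes, dtype=float)
    result = bboxes.copy()  # extra columns are carried over unchanged
    # Denormalize to pixel coordinates.
    x_min, y_min, x_max, y_max = (bboxes[:, :4] * [width, height, width, height]).T
    xs = np.stack([x_min, x_max, x_max, x_min], axis=1)      # (N, 4)
    ys = np.stack([y_min, y_min, y_max, y_max], axis=1)      # (N, 4)
    corners = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (N, 4, 3)
    warped = corners @ np.asarray(matrix, dtype=float).T
    warped_xy = warped[..., :2] / warped[..., 2:3]           # perspective divide
    # Axis-aligned hull of the warped corners, renormalized to the output size.
    result[:, 0] = warped_xy[..., 0].min(axis=1) / out_width
    result[:, 1] = warped_xy[..., 1].min(axis=1) / out_height
    result[:, 2] = warped_xy[..., 0].max(axis=1) / out_width
    result[:, 3] = warped_xy[..., 1].max(axis=1) / out_height
    return result


shift_right = np.array([[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
moved = perspective_bboxes_sketch([[0.1, 0.1, 0.3, 0.3, 1]], (100, 100), shift_right, (100, 100))
print(moved)
```

A pure 10-pixel translation on a 100-pixel-wide image shifts the normalized x coordinates by 0.1 and leaves the label column untouched.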
perspective_keypointsfunction
perspective_keypoints(
keypoints: np.ndarray,
image_shape: tuple[int, int],
matrix: np.ndarray,
max_width: int,
max_height: int,
keep_size: bool
)
Apply perspective transformation to keypoints.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of shape (N, 5+) in format [x, y, z, angle, scale, ...] |
image_shape | tuple[int, int] | - | Original image shape (height, width) |
matrix | np.ndarray | - | 3x3 perspective transformation matrix |
max_width | int | - | Maximum width after transformation |
max_height | int | - | Maximum height after transformation |
keep_size | bool | - | Whether to keep original size |
Returns
- np.ndarray: Transformed keypoints array with same shape as input
remapfunction
remap(
img: np.ndarray,
map_x: np.ndarray,
map_y: np.ndarray,
interpolation: int,
border_mode: int,
value: tuple[float, ...] | float | None = None
)
Remap an image according to given coordinate maps. This function applies a generic geometrical transformation using mapping functions that specify the position of each pixel in the output image.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input image to transform. |
map_x | np.ndarray | - | Map of x-coordinates with same height and width as the input image. |
map_y | np.ndarray | - | Map of y-coordinates with same height and width as the input image. |
interpolation | int | - | Interpolation method for resampling. |
border_mode | int | - | OpenCV border mode for handling pixels outside the image boundaries. |
value | One of: tuple[float, ...], float, None | None | Border value(s) if border_mode is BORDER_CONSTANT. |
Returns
- np.ndarray: Remapped image with the same shape as the input image.
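A nearest-neighbour sketch of what remapping does, assuming the `cv2.remap` convention in which output pixel (y, x) is sampled from the input at (map_y[y, x], map_x[y, x]). Interpolation and the OpenCV border modes are replaced here by rounding and clamping, and `remap_nearest` is a hypothetical name.

```python
import numpy as np


def remap_nearest(img, map_x, map_y):
    """Hypothetical sketch: nearest-neighbour remap with clamped borders."""
    height, width = img.shape[:2]
    src_x = np.clip(np.rint(map_x).astype(int), 0, width - 1)
    src_y = np.clip(np.rint(map_y).astype(int), 0, height - 1)
    # Fancy indexing gathers one source pixel per output location.
    return img[src_y, src_x]


img = np.arange(12).reshape(3, 4)
ys, xs = np.mgrid[0:3, 0:4]
flipped = remap_nearest(img, map_x=3 - xs, map_y=ys)
print(flipped)
```

Here the maps send each output column x to input column 3 - x, so the result is the horizontally mirrored image.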
remap_bboxesfunction
remap_bboxes(
bboxes: np.ndarray,
map_x: np.ndarray,
map_y: np.ndarray,
image_shape: tuple[int, int]
)
Remap bounding boxes using displacement maps.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | - |
map_x | np.ndarray | - | - |
map_y | np.ndarray | - | - |
image_shape | tuple[int, int] | - | - |
remap_keypointsfunction
remap_keypoints(
keypoints: np.ndarray,
map_x: np.ndarray,
map_y: np.ndarray,
image_shape: tuple[int, int]
)
Transform keypoints using coordinate mapping functions. This function applies the inverse of the mapping defined by map_x and map_y to keypoint coordinates. The inverse mapping is necessary because the mapping functions define how pixels move from the source to the destination image, while keypoints need to be transformed from the destination back to the source.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (N, 2+), where the first two columns are x and y coordinates. |
map_x | np.ndarray | - | Map of x-coordinates with shape equal to image_shape. |
map_y | np.ndarray | - | Map of y-coordinates with shape equal to image_shape. |
image_shape | tuple[int, int] | - | Shape (height, width) of the original image. |
Returns
- np.ndarray: Transformed keypoints with the same shape as the input keypoints.
remap_keypoints_via_maskfunction
remap_keypoints_via_mask(
keypoints: np.ndarray,
map_x: np.ndarray,
map_y: np.ndarray,
image_shape: tuple[int, int]
)
Remap keypoints using mask and cv2.remap method.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | - |
map_x | np.ndarray | - | - |
map_y | np.ndarray | - | - |
image_shape | tuple[int, int] | - | - |
resizefunction
resize(
img: np.ndarray,
target_shape: tuple[int, int],
interpolation: int
)
Resize an image to the specified dimensions. This function resizes an input image to the target shape using the specified interpolation method. If the image is already the target size, it is returned unchanged.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input image to resize. |
target_shape | tuple[int, int] | - | Target (height, width) dimensions. |
interpolation | int | - | Interpolation method to use (cv2 interpolation flag). Examples: cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_NEAREST, etc. |
Returns
- np.ndarray: Resized image with shape target_shape + original channel dimensions.
rot90function
rot90(
img: np.ndarray,
factor: Literal[0, 1, 2, 3]
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | - |
factor | One of: 0, 1, 2, 3 | - | - |
rotation2d_matrix_to_euler_anglesfunction
rotation2d_matrix_to_euler_angles(
matrix: np.ndarray,
y_up: bool
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
matrix | np.ndarray | - | Rotation matrix. |
y_up | bool | - | Whether the Y axis points up or down. |
scalefunction
scale(
img: np.ndarray,
scale: float,
interpolation: int
)
Scale an image by a factor while preserving aspect ratio. This function scales both height and width dimensions of the image by the same factor.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input image to scale. |
scale | float | - | Scale factor. Values > 1 will enlarge the image, values < 1 will shrink it. |
interpolation | int | - | Interpolation method to use (cv2 interpolation flag). |
Returns
- np.ndarray: Scaled image.
shift_bboxesfunction
shift_bboxes(
bboxes: np.ndarray,
shift_vector: np.ndarray
)
Shift bounding boxes by a given vector.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (n, m) where n is the number of bboxes and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max]. |
shift_vector | np.ndarray | - | Vector to shift the bounding boxes by, with shape (4,) for [shift_x, shift_y, shift_x, shift_y]. |
Returns
- np.ndarray: Shifted bounding boxes with the same shape as input.
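In NumPy terms the operation is a broadcast addition on the first four columns. The sketch below uses a hypothetical helper name, `shift_boxes`, and normalized example values.

```python
import numpy as np


def shift_boxes(bboxes, shift_vector):
    """Hypothetical sketch: add the shift to the coordinate columns only."""
    shifted = np.asarray(bboxes, dtype=float).copy()
    shifted[:, :4] += shift_vector  # extra columns such as labels are untouched
    return shifted


boxes = np.array([[0.1, 0.2, 0.3, 0.4, 7.0]])
shifted = shift_boxes(boxes, np.array([0.5, -0.1, 0.5, -0.1]))
print(shifted)
```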
shift_keypointsfunction
shift_keypoints(
keypoints: np.ndarray,
shift_vector: np.ndarray
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | - |
shift_vector | np.ndarray | - | - |
shuffle_tiles_within_shape_groupsfunction
shuffle_tiles_within_shape_groups(
shape_groups: dict[tuple[int, int], list[int]],
random_generator: np.random.Generator
)
Shuffles indices within each group of similar shapes and creates a list where each index points to the index of the tile it should be mapped to.
Parameters
Name | Type | Default | Description |
---|---|---|---|
shape_groups | dict[tuple[int, int], list[int]] | - | Groups of tile indices categorized by shape. |
random_generator | np.random.Generator | - | The random generator to use for shuffling the indices. If None, a new random generator will be used. |
Returns
- list[int]: A list where each index is mapped to the new index of the tile after shuffling.
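A sketch of the grouping logic, assuming the tile indices across all groups cover 0 to N-1 exactly once. The name `shuffle_within_groups` is hypothetical and the library's shuffling details may differ.

```python
import numpy as np


def shuffle_within_groups(shape_groups, rng):
    """Hypothetical sketch: permute tile indices, but only within each shape group."""
    num_tiles = sum(len(indices) for indices in shape_groups.values())
    mapping = list(range(num_tiles))
    for indices in shape_groups.values():
        shuffled = rng.permutation(indices)  # shuffled copy of this group's indices
        for old, new in zip(indices, shuffled):
            mapping[old] = int(new)
    return mapping


groups = {(50, 50): [0, 1, 2], (50, 60): [3, 4]}
mapping = shuffle_within_groups(groups, np.random.default_rng(42))
print(mapping)
```

Because indices only move within their own group, a tile is always replaced by a tile of the same shape.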
split_uniform_gridfunction
split_uniform_grid(
image_shape: tuple[int, int],
grid: tuple[int, int],
random_generator: np.random.Generator
)
Splits an image shape into a uniform grid specified by the grid dimensions.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image_shape | tuple[int, int] | - | The shape of the image as (height, width). |
grid | tuple[int, int] | - | The grid size as (rows, columns). |
random_generator | np.random.Generator | - | The random generator to use for shuffling the splits. If None, the splits are not shuffled. |
Returns
- np.ndarray: An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).
Notes
The function uses `generate_shuffled_splits` to generate the splits for the height and width of the image. The splits are then used to calculate the coordinates of the tiles.
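How edges turn into tile coordinates can be sketched as follows. For simplicity this example uses fixed, evenly spaced edges from `np.linspace` rather than shuffled splits, and the name `uniform_grid_tiles` is hypothetical.

```python
import numpy as np


def uniform_grid_tiles(image_shape, grid):
    """Hypothetical sketch: tiles as (start_y, start_x, end_y, end_x) rows."""
    height, width = image_shape
    rows, cols = grid
    # Evenly spaced integer edges along each axis (no shuffling in this sketch).
    y_edges = np.linspace(0, height, rows + 1).astype(int)
    x_edges = np.linspace(0, width, cols + 1).astype(int)
    tiles = [
        (y_edges[i], x_edges[j], y_edges[i + 1], x_edges[j + 1])
        for i in range(rows)
        for j in range(cols)
    ]
    return np.array(tiles)


tiles = uniform_grid_tiles((100, 90), (2, 3))
print(tiles)
```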
swap_tiles_on_imagefunction
swap_tiles_on_image(
image: np.ndarray,
tiles: np.ndarray,
mapping: list[int] | None = None
)
Swap tiles on the image according to the given mapping.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image | np.ndarray | - | Input image. |
tiles | np.ndarray | - | Array of tiles with each tile as [start_y, start_x, end_y, end_x]. |
mapping | One of: list[int], None | None | List of new tile indices. |
Returns
- np.ndarray: Output image with tiles swapped according to the random shuffle.
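A sketch of the tile swap, assuming `mapping[i]` names the tile whose pixels are copied into the position of tile `i` and that swapped tiles have identical shapes. The direction of the mapping is an assumption, and `swap_tiles` is a hypothetical name.

```python
import numpy as np


def swap_tiles(image, tiles, mapping):
    """Hypothetical sketch: copy tile mapping[i] into the slot of tile i."""
    result = image.copy()
    for target, source in enumerate(mapping):
        sy0, sx0, sy1, sx1 = tiles[source]
        ty0, tx0, ty1, tx1 = tiles[target]
        # Read from the untouched input so earlier copies do not interfere.
        result[ty0:ty1, tx0:tx1] = image[sy0:sy1, sx0:sx1]
    return result


image = np.array([[1, 1, 2, 2], [1, 1, 2, 2]])
tiles = np.array([[0, 0, 2, 2], [0, 2, 2, 4]])
swapped = swap_tiles(image, tiles, mapping=[1, 0])
print(swapped)
```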
swap_tiles_on_keypointsfunction
swap_tiles_on_keypoints(
keypoints: np.ndarray,
tiles: np.ndarray,
mapping: np.ndarray
)
Swap the positions of keypoints based on a tile mapping. This function takes a set of keypoints and repositions them according to a mapping of tile swaps. Keypoints are moved from their original tiles to new positions in the swapped tiles.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | A 2D numpy array of shape (N, 2) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates. |
tiles | np.ndarray | - | A 2D numpy array of shape (M, 4) where M is the number of tiles. Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates. |
mapping | np.ndarray | - | A 1D numpy array of shape (M,) where M is the number of tiles. Each element i contains the index of the tile that tile i should be swapped with. |
Returns
- np.ndarray: A 2D numpy array of the same shape as the input keypoints, containing the new positions
Notes
- Keypoints that do not fall within any tile will remain unchanged.
- The function assumes that the tiles do not overlap and cover the entire image space.
to_distance_mapsfunction
to_distance_maps(
keypoints: np.ndarray,
image_shape: tuple[int, int],
inverted: bool = False
)
Generate a ``(H,W,N)`` array of distance maps for ``N`` keypoints. The ``n``-th distance map contains at every location ``(y, x)`` the euclidean distance to the ``n``-th keypoint. This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates. |
image_shape | tuple[int, int] | - | Shape of the image (height, width) |
inverted | bool | False | If ``True``, inverted distance maps are returned where each distance value d is replaced by ``1/(d+1)``, i.e. the distance maps have values in the range ``(0.0, 1.0]`` with ``1.0`` denoting exactly the position of the respective keypoint. |
Returns
- np.ndarray: A float32 array of shape (H, W, N) containing ``N`` distance maps for ``N`` keypoints.
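The construction can be sketched with broadcasting. This is an illustration rather than the library's code; it assumes the inverted form is `1 / (d + 1)`, which is the mapping that yields values in `(0.0, 1.0]` with `1.0` at the keypoint. The name `distance_maps` is hypothetical.

```python
import numpy as np


def distance_maps(keypoints, image_shape, inverted=False):
    """Hypothetical sketch: (H, W, N) Euclidean distance maps for N keypoints."""
    height, width = image_shape
    keypoints = np.asarray(keypoints, dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float32)
    # Broadcast (H, W, 1) pixel grids against (N,) keypoint coordinates.
    dx = xs[..., None] - keypoints[:, 0]
    dy = ys[..., None] - keypoints[:, 1]
    maps = np.sqrt(dx**2 + dy**2)
    if inverted:
        maps = 1.0 / (maps + 1.0)  # assumed inverted form: 1.0 at the keypoint
    return maps.astype(np.float32)


maps = distance_maps([[2, 1], [0, 0]], (3, 4))
inverted_maps = distance_maps([[2, 1], [0, 0]], (3, 4), inverted=True)
print(maps.shape)  # (3, 4, 2)
```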
tps_transformfunction
tps_transform(
target_points: np.ndarray,
control_points: np.ndarray,
nonlinear_weights: np.ndarray,
affine_weights: np.ndarray
)
Apply TPS transformation with consistent types.
Parameters
Name | Type | Default | Description |
---|---|---|---|
target_points | np.ndarray | - | - |
control_points | np.ndarray | - | - |
nonlinear_weights | np.ndarray | - | - |
affine_weights | np.ndarray | - | - |
transposefunction
transpose(
img: np.ndarray
)
Transposes the first two dimensions of an array of any dimensionality. Retains the order of any additional dimensions.
Parameters
Name | Type | Default | Description |
---|---|---|---|
img | np.ndarray | - | Input array. |
Returns
- np.ndarray: Transposed array.
validate_bboxesfunction
validate_bboxes(
bboxes: np.ndarray,
image_shape: Sequence[int]
)
Validate bounding boxes and remove invalid ones.
Parameters
Name | Type | Default | Description |
---|---|---|---|
bboxes | np.ndarray | - | Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max]. |
image_shape | Sequence[int] | - | Shape of the image as (height, width). |
Returns
- np.ndarray: Array of valid bounding boxes, potentially with fewer boxes than the input.
Example
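>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import validate_bboxes
>>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])
>>> valid_bboxes = validate_bboxes(bboxes, (100, 100))
>>> print(valid_bboxes)
[[10 20 30 40]]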
validate_if_not_found_coordsfunction
validate_if_not_found_coords(
if_not_found_coords: Sequence[int] | dict[str, Any] | None
)
Validate and process `if_not_found_coords` parameter.
Parameters
Name | Type | Default | Description |
---|---|---|---|
if_not_found_coords | One of: Sequence[int], dict[str, Any], None | - | - |
validate_keypointsfunction
validate_keypoints(
keypoints: np.ndarray,
image_shape: tuple[int, int]
)
Validate keypoints and remove those that fall outside the image boundaries.
Parameters
Name | Type | Default | Description |
---|---|---|---|
keypoints | np.ndarray | - | Array of keypoints with shape (N, M) where N is the number of keypoints and M >= 2. The first two columns represent x and y coordinates. |
image_shape | tuple[int, int] | - | Shape of the image as (height, width). |
Returns
- np.ndarray: Array of valid keypoints that fall within the image boundaries.
Notes
This function only checks the x and y coordinates (first two columns) of the keypoints. Any additional columns (e.g., angle, scale) are preserved for valid keypoints.
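A boolean-mask sketch of the filtering, assuming a keypoint counts as inside when 0 <= x < width and 0 <= y < height (whether the library treats the upper bounds as inclusive is not stated here). The name `keep_inside` is hypothetical.

```python
import numpy as np


def keep_inside(keypoints, image_shape):
    """Hypothetical sketch: drop keypoints whose (x, y) lies outside the image."""
    height, width = image_shape
    x, y = keypoints[:, 0], keypoints[:, 1]
    inside = (x >= 0) & (x < width) & (y >= 0) & (y < height)
    return keypoints[inside]  # whole rows are kept, including extra columns


kps = np.array([[10, 20, 0.0, 1.0], [-5, 30, 0.0, 1.0], [640, 100, 0.0, 1.0]])
valid = keep_inside(kps, (480, 640))
print(valid)
```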
volume_hflipfunction
volume_hflip(
volume: np.ndarray
)
Perform horizontal flip on a volume (numpy array). Flips the volume along the width axis (axis=2). Handles inputs with shapes (D, H, W) or (D, H, W, C).
Parameters
Name | Type | Default | Description |
---|---|---|---|
volume | np.ndarray | - | Input volume. |
Returns
- np.ndarray: Horizontally flipped volume.
volume_rot90function
volume_rot90(
volume: np.ndarray,
factor: Literal[0, 1, 2, 3]
)
Rotate a volume 90 degrees counter-clockwise multiple times. Rotates the volume in the height-width plane (axes 1 and 2). Handles inputs with shapes (D, H, W) or (D, H, W, C).
Parameters
Name | Type | Default | Description |
---|---|---|---|
volume | np.ndarray | - | Input volume. |
factor | One of: 0, 1, 2, 3 | - | Number of 90-degree rotations. |
Returns
- np.ndarray: Rotated volume.
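The same effect can be reproduced with `np.rot90` by naming the height and width axes explicitly. This is a NumPy illustration of the axis convention described above, not the library call itself.

```python
import numpy as np

volume = np.arange(2 * 3 * 4).reshape(2, 3, 4)  # (D, H, W)
# Rotate once counter-clockwise in the height-width plane.
rotated = np.rot90(volume, k=1, axes=(1, 2))
print(rotated.shape)  # (2, 4, 3)
```

Each depth slice is rotated independently, so `rotated[d]` equals `np.rot90(volume[d])` for every `d`.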
volume_vflipfunction
volume_vflip(
volume: np.ndarray
)
Perform vertical flip on a volume (numpy array). Flips the volume along the height axis (axis=1). Handles inputs with shapes (D, H, W) or (D, H, W, C).
Parameters
Name | Type | Default | Description |
---|---|---|---|
volume | np.ndarray | - | Input volume. |
Returns
- np.ndarray: Vertically flipped volume.
volumes_hflipfunction
volumes_hflip(
volumes: np.ndarray
)
Perform horizontal flip on a batch of volumes (numpy array). Flips the volumes along the width axis (axis=3). Handles inputs with shapes (B, D, H, W) or (B, D, H, W, C).
Parameters
Name | Type | Default | Description |
---|---|---|---|
volumes | np.ndarray | - | Input batch of volumes. |
Returns
- np.ndarray: Horizontally flipped batch of volumes.
volumes_rot90function
volumes_rot90(
volumes: np.ndarray,
factor: Literal[0, 1, 2, 3]
)
Rotate a batch of volumes 90 degrees counter-clockwise multiple times. Rotates the volumes in the height-width plane (axes 2 and 3). Handles inputs with shapes (B, D, H, W) or (B, D, H, W, C).
Parameters
Name | Type | Default | Description |
---|---|---|---|
volumes | np.ndarray | - | Input batch of volumes. |
factor | One of: 0, 1, 2, 3 | - | Number of 90-degree rotations. |
Returns
- np.ndarray: Rotated batch of volumes.
volumes_vflipfunction
volumes_vflip(
volumes: np.ndarray
)
Perform vertical flip on a batch of volumes (numpy array). Flips the volumes along the height axis (axis=2). Handles inputs with shapes (B, D, H, W) or (B, D, H, W, C).
Parameters
Name | Type | Default | Description |
---|---|---|---|
volumes | np.ndarray | - | Input batch of volumes. |
Returns
- np.ndarray: Vertically flipped batch of volumes.
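For batched input the spatial axes shift by one because of the leading batch dimension. A short NumPy illustration of the axis choice, using `np.flip` as a stand-in for the library functions:

```python
import numpy as np

volumes = np.arange(2 * 2 * 3 * 4).reshape(2, 2, 3, 4)  # (B, D, H, W)
vflipped = np.flip(volumes, axis=2)  # height axis of a batch
hflipped = np.flip(volumes, axis=3)  # width axis of a batch
print(vflipped.shape, hflipped.shape)
```

Flipping the batch along axis 2 gives the same result as flipping each single volume along axis 1.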
warp_affinefunction
warp_affine(
image: np.ndarray,
matrix: np.ndarray,
interpolation: int,
fill: tuple[float, ...] | float,
border_mode: int,
output_shape: tuple[int, int]
)
Apply an affine transformation to an image. This function transforms an image using the specified affine transformation matrix. If the transformation matrix is an identity matrix, the original image is returned.
Parameters
Name | Type | Default | Description |
---|---|---|---|
image | np.ndarray | - | Input image to transform. |
matrix | np.ndarray | - | 2x3 or 3x3 affine transformation matrix. |
interpolation | int | - | Interpolation method for resampling. |
fill | One of: tuple[float, ...], float | - | Border value(s) to fill areas outside the transformed image. |
border_mode | int | - | OpenCV border mode for handling pixels outside the image boundaries. |
output_shape | tuple[int, int] | - | Shape (height, width) of the output image. |
Returns
- np.ndarray: Affine-transformed image with dimensions specified by output_shape.
warp_affine_with_value_extensionfunction
warp_affine_with_value_extension(
image: np.ndarray,
matrix: np.ndarray,
dsize: tuple[int, int],
flags: int,
border_mode: int,
border_value: tuple[float, ...] | float
)
Parameters
Name | Type | Default | Description |
---|---|---|---|
image | np.ndarray | - | - |
matrix | np.ndarray | - | - |
dsize | tuple[int, int] | - | - |
flags | int | - | - |
border_mode | int | - | - |
border_value | One of: tuple[float, ...], float | - | - |