Domain adaptation transforms¶

It is possible to perform style transfer without the use of Neural networks.

Resulting imagews do not look as ideal but could be generated on the fly in a reasonable time.

Python

import random

import cv2
from matplotlib import pyplot as plt
from pathlib import Path
import numpy as np
import cv2

import albumentations as A

Python

def visualize(image):
    plt.figure(figsize=(10, 5))
    plt.axis('off')
    plt.imshow(image)

Python

def load_rgb(image_path):
    image = cv2.imread(image_path)
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

Python

image1 = load_rgb("../images/park1.jpeg")
image2 = load_rgb("../images/rain2.jpeg")

Python

visualize(image1)

png

Python

visualize(image2)

png

Historgram matching¶

This process adjusts the pixel values of an input image to align its histogram with that of a reference image. When dealing with multi-channel images, this alignment occurs separately for each channel, provided that both the input and the reference image have an identical number of channels.

Documentation

Python

transform = A.Compose([A.HistogramMatching(reference_images=[image2], 
                                           read_fn = lambda x: x, 
                                           p=1,
                                           blend_ratio=(0.3, 0.3)
                                          )], p=1)

Python

transformed = transform(image=image1)["image"]

Python

visualize(transformed)

png

Fourier Domain Adaptation (FDA)¶

Fourier Domain Adaptation (FDA) for simple "style transfer" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

Documentation

Python

transform = A.Compose([A.FDA(reference_images=[image1], read_fn = lambda x: x, p=1, beta_limit=(0.2, 0.2))], p=1)

Python

transformed = transform(image=image1)["image"]
visualize(transformed)

png

Python

(transformed - image1).mean()

55.401878356933594

PixelDistribution¶

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps: 1. Adjusting the original image to a standard distribution space using a selected transform. 2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.

Documentation

Python

transform = A.Compose([A.PixelDistributionAdaptation(reference_images=[image1], read_fn = lambda x: x, p=1,  transform_type="pca")], p=1)

Python

transformed = transform(image=image1)["image"]
visualize(transformed)

png

Python

beta = 0.1

Python

height, width = image1.shape[:2]
border = int(np.floor(min(height, width) * beta))
center_y, center_x = height // 2, width // 2

# Define region for amplitude substitution
y1, y2 = center_y - border, center_y + border

x1, x2 = center_x - border, center_x + border

Python

x1, x2, y1, y2

(205, 307, 205, 307)

Python

def low_freq_mutate_np(amp_src, amp_trg, L, h, w):
    b = int(np.floor(min(h, w) * L))
    c_h, c_w = h // 2, w // 2
    h1, h2 = max(0, c_h - b), min(c_h + b, h - 1)
    w1, w2 = max(0, c_w - b), min(c_w + b, w - 1)
    amp_src[h1:h2, w1:w2] = amp_trg[h1:h2, w1:w2]
    return amp_src

def fourier_domain_adaptation(src_img, trg_img, beta=0.1):
    assert src_img.shape == trg_img.shape, "Source and target images must have the same shape."
    src_img = src_img.astype(np.float32)
    trg_img = trg_img.astype(np.float32)

    height, width, num_channels = src_img.shape

    # Prepare container for the output image
    src_in_trg = np.zeros_like(src_img)

    for c in range(num_channels):
        # Perform FFT on each channel
        fft_src = np.fft.fft2(src_img[:, :, c])
        fft_trg = np.fft.fft2(trg_img[:, :, c])

        # Shift the zero frequency component to the center
        fft_src_shifted = np.fft.fftshift(fft_src)
        fft_trg_shifted = np.fft.fftshift(fft_trg)

        # Extract amplitude and phase
        amp_src, pha_src = np.abs(fft_src_shifted), np.angle(fft_src_shifted)
        amp_trg = np.abs(fft_trg_shifted)

        # Mutate the amplitude part of the source with the target

        mutated_amp = low_freq_mutate_np(amp_src.copy(), amp_trg, beta, height, width)

        # Combine the mutated amplitude with the original phase
        fft_src_mutated = np.fft.ifftshift(mutated_amp * np.exp(1j * pha_src))

        # Perform inverse FFT
        src_in_trg_channel = np.fft.ifft2(fft_src_mutated)

        # Store the result in the corresponding channel of the output image
        src_in_trg[:, :, c] = np.real(src_in_trg_channel)

    return np.clip(src_in_trg, 0, 255)

Python

visualize(fourier_domain_adaptation(image2, image1, 0.01).astype(np.uint8))

png

Python