Last Updated: February 13, 2022

# Multi-Resolution Image Blending

This article takes a close look at seamlessly merging two images in a way that looks more natural than alpha matting. The technique combines Gaussian and Laplacian pyramids. It does not go deep into the theory behind Gaussian or Laplacian pyramids, as the focus here is on image blending.

## Problem

To begin with, we have two images that we'd like to blend, shown below.

Let's load the images using OpenCV as follows:

```
import cv2 as cv

# Load the two colour images and the grayscale mask at reduced size
A = cv.imread('Hand.png', cv.IMREAD_REDUCED_COLOR_4)
B = cv.imread('Veles-Mask-Template.png', cv.IMREAD_REDUCED_COLOR_4)
M = cv.imread('Mask.png', cv.IMREAD_REDUCED_GRAYSCALE_4)
```

The `cv.IMREAD_REDUCED_COLOR_4` and `cv.IMREAD_REDUCED_GRAYSCALE_4` flags tell OpenCV to load the image at one quarter of its original width and height.

Image B is the source image that we will transfer onto image A, the hand. We'll use the binary mask to cut the relevant region out of the source image.

A binary mask is not the same as an alpha mask. A binary mask contains only two values, 0 and 1 (stored as 0 and 255 in an 8-bit image), while an alpha mask can take any value from 0 to 1 (any value from 0 to 255 in an 8-bit image).
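
To make the distinction concrete, here is a small NumPy sketch. The array values and the cut-off of 128 are illustrative assumptions and are not taken from the images used in this article.

```python
import numpy as np

# A hypothetical 1x5 alpha mask: any value from 0 to 255 is allowed.
alpha_mask = np.array([[0, 64, 128, 192, 255]], dtype=np.uint8)

# Thresholding turns it into a binary mask that only contains 0 or 255.
# The cut-off of 128 is an arbitrary choice for this example.
binary_mask = np.where(alpha_mask >= 128, 255, 0).astype(np.uint8)

print(np.unique(alpha_mask))
print(np.unique(binary_mask))
```

The alpha mask keeps its intermediate values, while the binary mask is left with only the two extremes.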

# Creating a Gaussian Pyramid

Inspired by actual pyramids, an image pyramid is simply a collection of images in decreasing order of size, with the largest image at the bottom and the smallest at the top.

There is no single definition of an image pyramid: some texts mean Gaussian pyramids, others Laplacian pyramids, and others simply a stack of downscaled images with no other transformation applied to each layer.

## Constructing an Image pyramid

First, we'll construct an image pyramid, specifically a Gaussian image pyramid. OpenCV provides a built-in function, `cv2.pyrDown`, which blurs an image and downsamples it to half its width and height, producing the next level of a Gaussian pyramid.

We can use this to write a function that returns a pyramid when given an input image and the number of levels in the pyramid (`scale`).

```
import cv2

def cv_pyramid(A, scale) -> list:
    # Level 0 is the full-resolution image
    gp = [A]
    for i in range(1, scale):
        # Blur and halve the size to get the next (smaller) level
        A = cv2.pyrDown(A)
        gp.append(A)
    return gp
```

With this function, we can construct gaussian image pyramids for the 3 images previously imported by running

```
gpA = cv_pyramid(A.copy(), scale=5)
gpB = cv_pyramid(B.copy(), scale=5)
gpM = cv_pyramid(M.copy(), scale=5)
```

An illustration of the hand image is shown below

# Creating a Laplacian pyramid

In terms of frequency, a Laplacian pyramid is a multi-scale representation of the high-frequency content of an image, while a Gaussian pyramid is a low-frequency representation. What does this mean? Each Laplacian level keeps the fine detail (edges and texture) that is lost when moving to the next, smaller Gaussian level, so it behaves somewhat like an edge detector.

It can also be seen as a set of difference images, since we construct each level by taking the difference between two consecutive images in the Gaussian pyramid, after upscaling the smaller one.

## Constructing a Laplacian pyramid using OpenCV

We'll use the OpenCV functions **pyrUp** and **subtract** to create a Laplacian pyramid. A function to perform this is shown below.

```
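def cv2_same_size(a, b):
    # Resize both images to the larger of the two heights and widths
    maxH = max(a.shape[0], b.shape[0])
    maxW = max(a.shape[1], b.shape[1])
    a = cv2.resize(a, (maxW, maxH))
    b = cv2.resize(b, (maxW, maxH))
    return a, b

def cv_laplacian(gp, scale) -> list:
    # The tip of the Laplacian pyramid is the smallest Gaussian level
    lp = [gp[-1].copy()]
    for i in reversed(range(scale-1)):
        # Upscale the next (smaller) level to compare it with the current one
        gExp = cv2.pyrUp(gp[i+1].copy())
        gpi = gp[i].copy()
        gpi, gExp = cv2_same_size(gpi, gExp)
        # Note: cv2.subtract saturates, so negative differences are clipped to 0 for uint8 images
        li = cv2.subtract(gpi, gExp)
        lp.insert(0, li)
    return lp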
```

Sometimes the two image sizes differ by a single pixel: `pyrDown` rounds odd dimensions up when halving, while the `pyrUp` operation upscales the image by exactly a factor of 2. The `cv2_same_size` helper function simply ensures both images are the same size.

The input to our **cv_laplacian** function is the gaussian pyramid created from the previous step and a scale (which can also be inferred from the number of levels in the supplied gaussian pyramid).

Similar to the Gaussian pyramid, one level of the Laplacian pyramid is kept as it is: the smallest Gaussian level is copied in as the tip. For each remaining level, the following operations are performed:

- An upscaled version of the next (smaller) level, called *gExp*, is created using the *pyrUp* function. We create the 'difference' for the current level by 'borrowing' the image from the next level.
- The Laplacian for the current level is simply the difference between this level and the upscaled next level, i.e. $gpi - gExp$

Let's visualise what the Laplacian of the hand image looks like by running the following and viewing the image.

```
lpA = cv_laplacian(gpA, scale=5)
lpB = cv_laplacian(gpB, scale=5)
lpM = cv_laplacian(gpM, scale=5)
```

The Laplacian images are mostly black, so we'll apply a small visualisation trick to brighten the high-frequency parts of the image.

```
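# Brighten the laplacian pyramid for visualisation purposes.
# cv2pil is assumed to be a helper that converts an OpenCV array to a PIL image.
apy = [cv2pil((x + 100).astype('uint8')) for x in lpA[:-1]]
# The tip is a regular (Gaussian) image, so it is left unbrightened
apy.append(cv2pil(lpA[-1]))
for idx, x in enumerate(apy[:-1]):
    x.save(f'hand_laplacian_level_{idx}.png')
apy[-1].save('hand_laplacian_level_4.png')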
```
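
The `cv2pil` helper used above is not defined in the article. A minimal sketch of what it might look like is given below, assuming its job is to turn an 8-bit OpenCV array (BGR channel order) into a Pillow image. The body is an assumption and may differ from the helper the author actually used.

```python
import numpy as np
from PIL import Image

def cv2pil(image: np.ndarray) -> Image.Image:
    """Convert an OpenCV-style uint8 array to a PIL image (assumed behaviour)."""
    if image.ndim == 2:
        # Single-channel arrays map directly to a grayscale PIL image.
        return Image.fromarray(image)
    # OpenCV stores colour channels as BGR and PIL expects RGB, so reverse them.
    return Image.fromarray(np.ascontiguousarray(image[:, :, ::-1]))
```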

The results are shown as

## Reconstructing the original image from the Laplacian pyramid

The original image can be reconstructed from the laplacian pyramid using the following function

```
def cv_reconstruct_laplacian(pyramid):
    scale = len(pyramid)
    up = pyramid[-1]  # start with the tip, which is the smallest-scale image
    for i in range(scale-1, 0, -1):
        nxt = pyramid[i-1].copy()  # next larger level (named to avoid shadowing the builtin `next`)
        up = cv2.pyrUp(up)
        # the width/height can be off by a pixel after `cv2.pyrUp`
        up, nxt = cv2_same_size(up, nxt)
        up = cv2.add(nxt, up)
    return up
```

# The Multi-Resolution blending algorithm

```
import numpy as np

def multiply_nn_mnn(g, rgb):
    # multiply an rgb image by a single channel image without modifying the input
    return rgb * g[:, :, np.newaxis]

def cv_multiresolution_blend(gm, la, lb) -> list:
    # map the 0..255 mask to 0/1; any value below 255 becomes 0
    gm = [x // 255 for x in gm]
    blended = []
    for i in range(len(gm)):
        gmi, lbi = cv2_same_size(gm[i], lb[i])
        # take image B where the mask is 1 and image A where it is 0
        bi = multiply_nn_mnn(gmi, lbi) + multiply_nn_mnn((1-gmi), la[i])
        bi = bi.astype(np.uint8)
        blended.append(bi)
    return blended
```
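
In symbols, the loop in `cv_multiresolution_blend` computes the following for each pyramid level $i$. The notation is introduced here for illustration: $G^M_i$ is level $i$ of the mask's Gaussian pyramid scaled to the range 0 to 1, $L^A_i$ and $L^B_i$ are level $i$ of the Laplacian pyramids of images A and B, $S_i$ is the blended level, and $\odot$ denotes pixel-wise multiplication.

$$
S_i = G^M_i \odot L^B_i + \left(1 - G^M_i\right) \odot L^A_i
$$

Where the mask is 1 the blended level takes its detail from image B, and where it is 0 it takes it from image A. The blended pyramid $S$ is then collapsed back into a single image with the reconstruction function.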

With the blending function, we can create the blended image as follows

```
blended_pyramid = cv_multiresolution_blend(gpM, lpA, lpB)
blended_image = cv_reconstruct_laplacian(blended_pyramid)
```

and the result is shown as
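
As a possible variation, the mask levels can be kept as fractional weights instead of being reduced to 0 and 1 with `// 255`, so that the blur introduced by the mask's Gaussian pyramid contributes to the transition. The sketch below is one way this might be written with NumPy. The function name `blend_soft` is hypothetical, it assumes every level of the three pyramids already has matching height and width, and it is not the method used in this article.

```python
import numpy as np

def blend_soft(gm, la, lb):
    """Blend two pyramids level by level using the mask as float weights in [0, 1]."""
    blended = []
    for mask, a, b in zip(gm, la, lb):
        # Scale the 0..255 mask to [0, 1].
        w = mask.astype(np.float32) / 255.0
        if a.ndim == 3:
            # Add a channel axis so the weights broadcast over colour images.
            w = w[:, :, np.newaxis]
        blended.append(w * b.astype(np.float32) + (1.0 - w) * a.astype(np.float32))
    return blended
```

The result is a list of `float32` levels, so after reconstruction the image would still need to be clipped to the range 0 to 255 and converted back to `uint8` before saving.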