This post is about image stitching methods used to make a panoramic image. Panoramic images have become important in the digital age. Initially, panoramic images were developed to increase the field of view on the photograph. In the digital age, because one cannot print out pictures with resolutions less than 200 dots per inch (explained here), the method to take print outs for posters is to take a number of photographs with at least 15% overlap and stitch them together later using some software. In order to take the individual pictures that make a panoramic picture, the best technique involves using a tripod so that the camera lens only moves on a sphere eliminating parallax error. In addition, the aperture and shutter speed should not vary between the various pictures. More tips on the techniques of panoramic pictures can be found all over Google or by sending an email to me. This post is more about the science behind stitching the images of a panoramic picture.
The idea of image stitching is to take multiple images and to make a single image from them with an invisible seam and such that the mosaic pictures remains true to the individual images (in other words, does not change the lighting effects too much). This is different from just placing the images side by side because there will be differences in the lighting between the 2 images and that would lead to a prominent seam in the mosaic picture.
This Figure shows 3 photos and the locations of the seams are shown in black boxes on each picture and the final mosaic formed from all three pictures.
The first step is to find points that are equivalent in 2 overlapping pictures [1]. This can be done by taking into consideration a certain amount of pixels in the neighborhood of a pixel from 2 pictures and finding the regions that overlap in colors between the 2 pictures. Then the images are placed or warped on a surface such as a cylinder (because the panoramic picture is a 2-dimensional representation of the overlapping pictures in a cylinder quite often). After this step the curve is found that gives the most amount of overlap between the equivalent pixels on both images. Then the images are stitched together with color correction. I will deal in this post with the various algorithms for color correction.
2. Pyramid Blending: In addition to the pixel representation of images, images can also be stored as pyramids. This is a data compression method in which a the image is stored as a hierarchy or pyramid of low-pass filtered versions of the original image so that successive levels correspond to lower frequencies (dividing the images into different layers that vary over a smaller or larger region of space so that the sum of it gives you the original image). During the blending method described above, the lower frequencies (which vary over a larger distance) are blended over spatially larger distance and the higher frequencies are blended over a spatially lower distance [1] causing a more realistic blended image to be formed. Here during the pyramid forming process, the 2nd derivatives of the images (Laplacian) are taken into consideration while forming the pyramid and the blended pyramid is formed and reintegrated to form the final blended image.
This figure shows the pyramid representation of the pixels in an image and the pyramid blending approach.
3. Gradient Domain Blending: Instead of making a low resolution mapping of the image as above, the gradient domain blending method requires the calculation of the 1st derivative of the images. Hence, the image resolution is not reduced before the blending process, but the idea is the same as above. This method is also developed to find the optimal window size for alpha blending and is adaptive to regions that vary fast or slower.
This figure shows the gradient blend approach.