Image registration, the process of finding the transformation that best maps one image to another, is an important tool in document image processing. Having properly-aligned microfilm images can help in manual and automated content extraction, zoning, and batch compression of images. An image registration algorithm is presented that quickly identifies the global affine transformation (rotation, scale, translation and/or shear) that maps one tabular document image to another, using the Fourier-Mellin Transform. Each component of the affine transform is recovered independantly from the others, dramatically reducing the parameter space of the problem, and improving upon standard Fourier-Mellin Image Registration (FMIR), which only directly separates translation from the other components. FMIR is also extended to handle shear, as well as different scale factors for each document axis. This registration method deals with all transform components in a uniform way, by working in the frequency domain. Registration is limited to foreground pixels (the document form and printed text) through the introduction of a novel, locally adaptive foreground-background segmentation algorithm, based on the median filter. The background removal algorithm is also demonstrated as a useful tool to remove ambient signal noise during correlation. Common problems with FMIR are eliminated by background removal, meaning that apodization (tapering down to zero at the edge of the image) is not needed for accurate recovery of the rotation parameter, allowing the entire image to be used for registration. An effective new optimization to the median filter is presented. Rotation and scale parameter detection is less susceptible to problems arising from the non-commutativity of rotation and "tiling" (periodicity) than for standard FMIR, because only the regions of the frequency domain directly corresponding to tabular features are used in registration. An original method is also presented for automatically obtaining blank document templates from a set of registered document images, by computing the "pointwise median" of a set of registered documents. Finally, registration is demonstrated as an effective tool for predictive image compression. The presented registration algorithm is reliable and robust, and handles a wider range of transformation types than most document image registration systems (which typically only perform deskewing).
College and Department
Physical and Mathematical Sciences; Computer Science
BYU ScholarsArchive Citation
Hutchison, Luke Alexander Daysh, "Fast Registration of Tabular Document Images Using the Fourier-Mellin Transform" (2004). Theses and Dissertations. 4.
document image registration, deskewing, transformation reversal, Fourier-Mellin Transform, Fourier transform, rotation, scale, translation, shear, background removal, thresholding, locally-adaptive thresholding, image segmentation, microfilm processing, tabular document images, image processing, family history technology