Image Fusion



1. INTRODUCTION

1.1 Information Fusion

Pixel-level image fusion is the process of fusing visual information from
a number of registered images into a single fused image. It is part of the much
broader subject of multisensor information fusion, which has attracted a considerable
amount of research attention in the last two decades.

Multisensor information fusion utilizes information obtained from a number
of different sensors surveying an environment. The aim is to achieve better situation
assessment and more rapid and accurate completion of a pre-defined task than would
be possible using any of the sensors individually. The only formal definition of
information fusion (data fusion) to date is that given by the U.S. Department of
Defense Joint Directors of Laboratories Data Fusion Subpanel, the first formal body
explicitly dealing with the process of data fusion. Their definition reads: a
multilevel, multifaceted process dealing with the automatic detection, association,
correlation, estimation and combination of data and information from multiple
sources.

Image fusion represents a specific case of multisensor information fusion in
which all the information sources used represent imaging sensors. Information fusion
can be achieved at any level of the image information representation. Image fusion is
usually performed at one of the three different processing levels: signal, feature and
decision. Image level image fusion, also known as pixel-level image fusion,
represents fusion at the lowest level, where a number of raw input image signals are
combined to produce a single fused image signal. Object level image fusion, also
called feature level image fusion, fuses feature and object labels and property
descriptor information that have already been extracted from individual input images.
Finally, the highest level, decision or symbol level image fusion represents fusion of
probabilistic decision information obtained by local decision makers operating on the
results of feature level processing on image data produced from individual sensors.

Figure 1.1 illustrates a system using image fusion at all three levels of processing.






[Figure 1.1 diagram: two sensors observe the scene; their image signals feed pixel-level fusion and feature extraction; the extracted feature vectors feed feature-level fusion; decision makers produce decision vectors that feed symbol-level fusion, yielding a fused decision.]
Figure 1.1: An example of a system using information fusion at all three processing levels.

The aim would be to detect and correctly classify objects in a presented scene.
The two sensors (1 and 2) survey the scene and register their observations in the form
of image signals. Two images are then pixel-level fused to produce a third, fused
image and are also passed independently to local feature extraction processes. The
fused image can be directly displayed for a human operator to aid better scene
understanding or used in a further local feature extractor.



Feature extractors act as simple automatic target detection systems, including
processing elements such as segmentation, region characterization, morphological
processing and even neural networks to locate regions of interest in the scene.

Decision level fusion is performed on the decisions reached by the local
classifiers, on the basis of the relative reliability of individual sensor outputs and the
fused feature set. Fusion is achieved using statistical methods such as Bayesian
inference and the Dempster-Shafer method, with the aim of maximizing the
probability of correct classification for each object of interest. The output of the
whole system is a set of classification decisions associated with the objects found in the
observed scene.

1.2 Project Objectives

The objectives of the project work are:

1. The design of pixel-level image fusion algorithms with improved performance
compared with existing schemes, in terms of:

i) Minimizing information loss and distortion effects and

ii) Reducing overall computational complexity.

2. The design of perceptually meaningful objective measures of pixel-level
image fusion performance.

1.3 Types of Image Fusion Technique

Image fusion methods can be broadly classified into two groups: spatial domain
fusion and transform domain fusion. Fusion methods such as averaging, the Brovey
method, principal component analysis (PCA) and IHS based methods fall under the
spatial domain approaches. Another important spatial domain fusion method is the
high pass filtering based technique, in which the high frequency details are injected
into an upsampled version of the multispectral (MS) images. The disadvantage of
spatial domain approaches is that they produce spatial distortion in the fused image.
Spectral distortion becomes a negative factor when the fused image is used for further
processing, such as classification. Spatial distortion can be handled well by transform
domain approaches to image fusion. Multiresolution analysis has become a very useful tool for analysing
remote sensing images. The discrete wavelet transform has become a very useful tool
for fusion. Some other fusion methods also exist, such as Laplacian pyramid based
and curvelet transform based methods. These methods show better performance in the
spatial and spectral quality of the fused image compared to other spatial-domain
methods of fusion.

The images used in image fusion should already be registered. Misregistration is a
major source of error in image fusion. Some well-known image fusion methods are:

1. High pass filtering technique

2. IHS transform based image fusion

3. PCA based image fusion

4. Wavelet transform image fusion

5. Pair-wise spatial frequency matching


1.4 Application of Image fusion


1. Image Classification

2. Aerial and Satellite imaging

3. Medical imaging

4. Robot vision

5. Concealed weapon detection

6. Multi-focus image fusion

7. Digital camera application

8. Battle field monitoring


1.5 Medical Image Fusion


Medical imaging has become increasingly important in medical analysis and diagnosis.
Different medical imaging techniques such as X-rays, computed tomography (CT), magnetic
resonance imaging (MRI), and positron emission tomography (PET) provide different
perspectives on the human body that are important in the diagnosis of diseases or physical
disorders. For example, CT scans provide high-resolution information on bone structure while
MRI scans provide detailed information on tissue types within the body. Therefore, an improved


understanding of a patient’s condition can be achieved through the use of different imaging modalities. A powerful technique
used in medical imaging analysis is medical image fusion, where streams of information from medical images of different
modalities are combined into a single fused image.


In a fused image of an MRI scan and a CT scan, both the bone structure and the
tissue structure can be clearly identified in a single image. Therefore, image fusion
allows a physician to obtain a better visualization of the patient’s overall condition.

1.6 Pixel-Level Image Fusion

Medical image fusion usually employs the pixel level fusion techniques.
Pixel-level image fusion represents fusion of visual information of the same scene,
from any number of registered image signals, obtained using different sensors. The
goal of pixel-level image fusion can broadly be defined as:

To represent the visual information present in any number of input images, in
a single fused image without the introduction of distortion or loss of information.

In simpler terms, the main condition for successful fusion is that “all” visible
information in the input images should also appear visible in the fused image. In
practice, however, the complete representation of all of the visual information from a
number of input images into a single one is almost impossible.

Thus, the practical goal of pixel-level image fusion is modified to the fusion,
or preservation in the output fused image, of the “most important” visual information
that exists in the input image set.

The main requirement of the fusion process then, is to identify the most
significant features in the input images and to transfer them without loss into the
fused image. What defines important visual information is generally application
dependent. In most applications, and in image fusion for display purposes in
particular, it means perceptually important information.

A simple diagram of a system using pixel-level image fusion is shown in the
block diagram in Figure 1.2. For simplicity, only two imaging sensors survey the
environment, producing two different representations of the same scene.





The representations of the environment are, again, in the form of image
signals which are corrupted by noise arising from the atmospheric aberrations, sensor
design, quantization, etc.

The image signals produced by the sensors are input into a registration
process, which ensures that the input images to the fusion process correspond
spatially, by geometrically warping one of them.

Multisensor image registration is another widely researched area. In Figure
1.2, the registered input images are fused and the resulting fused image can then be
used directly for display purposes or passed on for further processing (see
Figure 1.1).






[Figure 1.2 diagram: the environment is observed by Sensor 1 and Sensor 2, producing noisy images A and B; image registration spatially aligns them, image fusion produces the fused image, which is then displayed or passed on for further processing.]
Figure 1.2: Basic structure of a multisensor system using pixel-level image fusion.

The pixel-level image fusion work presented in this report assumes that the
input images meet a number of requirements. Firstly, input images must be of the
same scene, i.e. the fields of view of the sensors must contain a spatial overlap.
Furthermore, inputs are assumed to be spatially registered and of equal size and
spatial resolution. In practice, resampling one of the input images often satisfies size
and resolution constraints.



2. LITERATURE SURVEY

[1] Firooz Sadjadi Lockheed Martin Corporation, [email protected]

“Comparative image fusion analysis”

This paper presents the results of a study providing a quantitative
comparative analysis of a typical set of image fusion algorithms. The results were
based on the application of these algorithms on two sets of collocated visible (electro-
optic) and infrared (IR) imagery. The quantitative comparative analysis of their
performances was based on using 5 different measures of effectiveness. These metrics
were based on measuring information content and/or measures of contrast. The results
of this study indicate that the comparative merit of each fusion method is very much
dependent on the measures of effectiveness being used. However, many of the fusion
methods produced results that had lower measures of effectiveness than their input
imagery. The highest relative MOE values were associated with the Fechner-Weber
and Entropy measures in both sets. Fisher metrics showed large values mainly due to
low pixel variances in the target background areas.

[2] V.P.S. Naidu and J.R. Raol National Aerospace Laboratories, Bangalore.

“Pixel-level Image Fusion using Wavelets and Principal Component Analysis”

Pixel-level image fusion using wavelet transform and principal component
analysis are implemented in PC MATLAB. Different image fusion performance
metrics with and without reference image have been evaluated. The simple averaging
fusion algorithm shows degraded performance. Image fusion using wavelets with
higher level of decomposition shows better performance in some metrics while in
other metrics, the PCA shows better performance.

[3] Stavri Nikolov, Paul Hill, David Bull, Nishan Canagarajah Image
Communications Group ,Centre for Communications Research University of
Bristol,Merchant Venturers Building, Woodland Road ,Bristol BS8

1UB, UK, [email protected], [email protected] “Wavelets for Image Fusion”

This paper compares some newly developed wavelet transform fusion
methods with existing fusion techniques.




For an effective fusion of images, a technique should aim to retain important
features from all input images. These features often appear at different positions and
scales.

Multiresolution analysis tools such as the wavelet transform are therefore
ideally suited to image fusion. Simple non-multiresolution methods for image fusion,
such as averaging and PCA, have produced limited results, whereas wavelet fusion
schemes have many specific advantages and benefit from a well understood
theoretical background. Many image processing tasks, for example denoising,
contrast enhancement, edge detection, segmentation, texture analysis and compression,
can be easily and successfully performed in the wavelet domain. Wavelet techniques
thus provide a powerful set of tools for image enhancement and analysis, together
with a common framework for various fusion tasks.








































3. SYSTEM ANALYSIS

3.1 Existing Work

Fusion of two images (medical and normal) using the wavelet transform.

3.1.1 Drawback In Existing Work

The wavelet transform has two main disadvantages:

1. Lack of shift invariance: small shifts in the input signal can cause major
variations in the distribution of energy between DWT coefficients at different
scales.

2. Poor directional selectivity for diagonal features, because the wavelet filters
are separable and real.

3.2 Proposed Work

Fusion of two images (medical and normal) using a multiresolution algorithm
such as gradient fusion, in order to overcome these drawbacks of the wavelet
transform and to establish which method is best suited for medical and normal images.


































4. DIGITAL IMAGE PROCESSING

4.1 Image

An image is a two-dimensional function that represents a measure of some
characteristic, such as brightness or colour, of a viewed scene. An image is a
projection of a 3D scene onto a 2D plane. It can be defined as a two-variable function
f(x,y) where, for each position (x,y) in the projection plane, f(x,y) defines the light
intensity at this point.

4.2 Analog Image

An analog image can be mathematically represented as a continuous range of
values representing position and intensity. An analog image is characterized by a
physical magnitude varying continuously in space. For example, the image produced
on screen of a CRT monitor is analog in nature.

4.3 Digital Image

A digital image is composed of picture elements called pixels. Pixels are the
smallest sample of an image. A pixel represents the brightness at one point.
Conversion of an analog image into a digital image involves two important
operations, namely sampling and quantization, which are illustrated in Figure 4.1.




Analog image → Sampling → Quantisation → Digital image


Figure 4.1: Digital Image From Analog Image
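
A minimal Python sketch of these two operations (not taken from the report; the array shape, sampling step and number of gray levels are arbitrary assumptions, and the input is taken to be a float-valued image in [0, 1]):

import numpy as np

def sample_and_quantize(analog, step=4, levels=256):
    """Sampling: keep every `step`-th sample in each direction.
    Quantization: map the continuous values to `levels` integer gray levels."""
    sampled = analog[::step, ::step]              # spatial sampling
    quantized = np.round(sampled * (levels - 1))  # amplitude quantization
    return quantized.astype(np.uint8)

# example: digital = sample_and_quantize(np.random.rand(512, 512))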

4.4 Advantages Of Digital Images

The advantages of digital images are summarized below:

1. The processing of images is faster and more cost-effective.

2. Digital images can be effectively stored and efficiently transmitted from one
place to another.

3. When shooting a digital image, one can immediately see whether the image is
good or not.




4. Copying a digital image is easy. The quality of a digital image will not be
degraded even if it is copied several times.

5. Whenever the image is in digital format, the reproduction of the image is both
faster and cheaper.

6. Digital technology offers plenty of scope for versatile image manipulation.

4.5 Drawback of Digital Images

Some of the drawbacks of digital images are:

1. Misuse of copyright has become easier because an image can be copied from
the internet just by clicking the mouse a couple of times.

2. A digital file cannot be enlarged beyond a certain size without compromising
on quality

3. The memory required to store and process good quality digital images is very
high.

4. For real time implementation of digital image processing algorithms, the
processor has to be very fast because the volume of the data is very high

4.6 Digital Image Processing

The processing of an image by means of a computer is generally termed
digital image processing. The advantages of using computers for the processing of
images are summarized below:

1. Flexibility and adaptability

The main advantage of digital computers, when compared to analog
electronic and optical information processing devices, is that no hardware
modifications are necessary in order to reprogram digital computers to solve
different tasks. This feature makes digital computers ideal devices for
processing image signals adaptively.













2. Data storage and transmission

With the development of different image-compression algorithms, the digital
data can be effectively stored. The digital data within the computer can be
easily transmitted from one place to another.


The only limitations of digital imaging and digital image processing are the
memory and processing speed capabilities of computers. Different image
processing techniques include image enhancement, image restoration, image
fusion and image watermarking.

4.7 Types of digital image processing

1. Binary image processing

2. Grayscale image processing

3. Colour image processing

4. Wavelet based image processing


4.7.1 Binary image processing


The simplest type of image which is used widely in a variety of industrial and
medical applications is binary, i.e. a black-and-white or silhouette image. Binary
image processing has several advantages but some corresponding drawbacks:

Advantages


- Easy to acquire: simple digital cameras can be used together with very simple
framestores, or low-cost scanners, or thresholding may be applied to grey-
level images.

- Low storage: no more than 1 bit/pixel, often this can be reduced as such
images are very amenable to compression (e.g. run-length coding).

- Simple processing: the algorithms are in most cases much simpler than those
applied to grey-level images.






Disadvantages


- Limited application: as the representation is only a silhouette, application is
restricted to tasks where internal detail is not required as a distinguishing
characteristic.

- Does not extend to 3D: the 3D nature of objects can rarely be represented by
silhouettes. (The 3D equivalent of binary processing uses voxels, spatial
occupancy of small cubes in 3D space).

- Specialised lighting is required for silhouettes: it is difficult to obtain reliable
binary images without restricting the environment. The simplest example is an
overhead projector or light box.

4.7.2 Gray scale image processing


Grayscale images are distinct from one-bit bi-tonal black-and-white images,
which, in the context of computer imaging, are images with only two colors, black
and white (also called bi-level or binary images). Grayscale images have many shades
of gray in between. Grayscale images are also called monochromatic, denoting the
presence of only one (mono) color (chrome).

Grayscale images are often the result of measuring the intensity of light at
each pixel in a single band of the electromagnetic spectrum (e.g. infrared, visible
light, ultraviolet), and in such cases they are monochromatic proper when only a
given frequency is captured. They can also be synthesized from a full colour image
by converting it to grayscale.
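
A minimal sketch of such a conversion in Python, using the common Rec. 601 luminance weights (the exact weights are a design choice, not something specified in this report):

import numpy as np

def rgb_to_gray(rgb):
    """Synthesize a grayscale image from an (H, W, 3) RGB array by taking a
    weighted sum of the colour channels (Rec. 601 luma weights assumed)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights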

4.7.3 Color image processing


Colour image processing encompasses methods and applications for the design and
implementation of various image and video processing tasks for cutting-edge
applications.





Features:


1. Details recent advances in digital color image acquisition, analysis, processing, and
display

2. Explains the latest techniques, algorithms, and solutions for digital color imaging

3. Provides comprehensive coverage of system design, implementation, and application
aspects of digital color imaging

4. Explores new color image, video, multimedia, and biomedical processing applications


5. Contains numerous examples, illustrations, online access to full-color results, and
tables summarizing results from quantitative studies

Application

1. Secure imaging

2. Object recognition and feature detection

3. Facial and retinal image analysis

4. Digital camera image processing

5. Spectral and superresolution imaging

6. Image and video colorization

7. Virtual restoration of artwork

8. Video shot segmentation and surveillance


















4.7.4 Wavelet image processing

A wavelet is a wave-like oscillation with an amplitude that starts out at zero,
increases, and then decreases back to zero. It can typically be visualized as a "brief
oscillation" like one might see recorded by a seismograph or heart monitor.
Generally, wavelets are purposefully crafted to have specific properties that make
them useful for signal processing. Wavelets can be combined, using a "reverse, shift,
multiply and sum" technique called convolution, with portions of an unknown signal
to extract information from the unknown signal.

For example, a wavelet could be created to have a frequency of Middle C and a short
duration of roughly a 32nd note. If this wavelet were to be convolved at periodic
intervals with a signal created from the recording of a song, then the results of these
convolutions would be useful for determining when the Middle C note was being
played in the song. Mathematically, the wavelet will resonate if the unknown signal
contains information of similar frequency - just as a tuning fork physically resonates
with sound waves of its specific tuning frequency. This concept of resonance is at the
core of many practical applications of wavelet theory.

As a mathematical tool, wavelets can be used to extract information from many
different kinds of data, including - but certainly not limited to - audio signals and
images. Sets of wavelets are generally needed to analyze data fully. A set of
"complementary" wavelets will deconstruct data without gaps or overlap so that the
deconstruction process is mathematically reversible. Thus, sets of complementary
wavelets are useful in wavelet based compression/decompression algorithms where it
is desirable to recover the original information with minimal loss.

In formal terms, this representation is a wavelet series representation of a square-
integrable function with respect to either a complete, orthonormal set of basis
functions, or an overcomplete set or frame of a vector space, for the Hilbert space of
square-integrable functions.







5. IMAGE FUSION

5.1 Introduction

Multisensor image fusion has attracted a considerable amount of research
attention in the last ten years. Soon after the introduction of the first multisensor
arrays in image dependent systems, researchers began considering image fusion as a
necessity to solve the growing problem of information overload. Since the end of the
1980s and throughout the 1990s image, and in particular pixel-level, fusion was
established as a subject through a stream of publications presenting fusion algorithms.



[Figure 5.1 diagram: input image 1 and input image 2 are wavelet transformed and combined by pixel-level fusion to produce the fused image.]

Figure 5.1: Block diagram of the image fusion technique

Furthermore, towards the end of the last decade, research attention was also
beginning to focus on the problem of performance evaluation of different image
fusion systems. In this chapter the literature published on the subject of pixel-level
image fusion, from its beginnings in the late 1980s to the present day, is reviewed.

5.2 General Pixel-level Image Fusion Techniques

Although multiresolution and multiscale methods dominate the field of pixel-level
fusion, arithmetic fusion algorithms are the simplest and sometimes effective fusion
methods. Arithmetic fusion algorithms produce the fused image pixel by pixel, as an
arithmetic combination of the corresponding pixels in the input images. Arithmetic

fusion can be summarized by the expression given in Equation (5.1):

F(n,m) = k_A A(n,m) + k_B B(n,m) + C                                   (5.1)

where A, B and F represent the two input images and the fused image respectively at
location (n,m). k_A, k_B and C are constants defining the fusion method, with k_A
and k_B defining the relative influence of the individual inputs on the fused image and
C the mean offset. Image averaging is the most commonly used example of such
fusion methods. In this case, the fused signal is evaluated as the average value of the
inputs, i.e. k_A = ½, k_B = ½ and C = 0. In general, averaging produces reasonable
image quality in areas where the input images are similar, but the quality rapidly
decreases in regions where the inputs differ.
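
A minimal Python sketch of the arithmetic fusion of Equation (5.1); with the default weights it reduces to simple image averaging:

import numpy as np

def arithmetic_fusion(a, b, k_a=0.5, k_b=0.5, c=0.0):
    """Pixel-wise arithmetic fusion F(n,m) = k_A*A(n,m) + k_B*B(n,m) + C.
    k_a = k_b = 0.5 and c = 0 gives plain image averaging."""
    return k_a * a.astype(float) + k_b * b.astype(float) + c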

The Intensity-Hue-Saturation (IHS) colour representation is another format
suitable for information fusion. It relates to the principles of human colour perception
and is easily obtained by a simple arithmetic transformation from the more common
RGB space. In IHS fusion the intensity channel of the colour input image is replaced
by the monochrome input image. The fused colour image is then obtained by reversed
transformation to the RGB space. Contrast stretching is commonly applied to the IHS
channels prior to inverse transformation to obtain enhanced colour images.
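
A rough sketch of this intensity-substitution idea, using the HSV colour space of scikit-image as a stand-in for the IHS transform described above (the contrast stretch here is a simple min-max normalization, an assumption made for illustration):

import numpy as np
from skimage.color import rgb2hsv, hsv2rgb  # scikit-image assumed available

def ihs_like_fusion(rgb_low, pan_high):
    """Replace the intensity/value channel of the colour image with the
    stretched monochrome image and transform back to RGB."""
    hsv = rgb2hsv(rgb_low)                                  # (H, W, 3), floats in [0, 1]
    pan = (pan_high - pan_high.min()) / (pan_high.ptp() + 1e-12)
    hsv[..., 2] = pan                                       # substitute the intensity channel
    return hsv2rgb(hsv)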

Principal component analysis (PCA) is another powerful tool used for
merging remotely sensed images. It is a statistical technique that transforms a set of
intercorrelated variables into a set of new uncorrelated linear combinations of the
original variables. Evaluation of the principal components (PCs) of an image signal
involves calculation of the covariance matrix and its eigenvalues and eigenvectors. An
inverse PCA transforms the data back to the original image space. In pixel-level image
fusion, PCA is used in a component-substitution scheme: the low-resolution colour
image is transformed to principal components, PC1 is substituted by the
high-resolution monochrome data, and inverse PCA is applied to get the fused image.
Multiresolution fusion algorithms, discussed next, are potentially more robust than
these single-resolution approaches.
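
A sketch of the PCA component-substitution scheme in Python; the eigen-decomposition of the band covariance matrix plays the role of the PCA transform, and matching the panchromatic image to the mean and variance of PC1 before substitution is an assumption made here for illustration:

import numpy as np

def pca_substitution_fusion(ms, pan):
    """Fuse a low-resolution multispectral image `ms` (H, W, bands) with a
    high-resolution panchromatic image `pan` (H, W) by substituting PC1."""
    h, w, bands = ms.shape
    x = ms.reshape(-1, bands).astype(float)
    mean = x.mean(axis=0)
    xc = x - mean
    vals, vecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    vecs = vecs[:, np.argsort(vals)[::-1]]       # eigenvectors, strongest first
    pcs = xc @ vecs                              # forward PCA
    p = pan.reshape(-1).astype(float)
    p = (p - p.mean()) / (p.std() + 1e-12) * pcs[:, 0].std() + pcs[:, 0].mean()
    pcs[:, 0] = p                                # substitute the first component
    return (pcs @ vecs.T + mean).reshape(h, w, bands)   # inverse PCA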











5.3 Multiresolution Image Fusion Based on the Gaussian Pyramid

Representation

Multiresolution processing methods enable an image fusion system to fuse
image information in a suitable pyramid format.

Image pyramids are made up of a series of sub-band signals, organized into
pyramid levels, of decreasing resolution (or size) each representing a portion of the
original image spectrum. Information contained within the individual sub-band
signals corresponds to a particular scale range, i.e. each sub-band contains features of
a certain size. By fusing information in the pyramid domain, superposition of features
from different input images is achieved with a much smaller loss of information than
in the case of single resolution processing.

Fusing images in their pyramid representation therefore, enables the fusion
system to consider image features of different scales separately even when they
overlap in the original image.

Furthermore, this scale separability also limits damage of sub-optimal fusion
decisions, made during the feature selection process, to a small portion of the
spectrum.



















Figure 5.2: The structure of multiresolution pixel-level image fusion systems based on the derivatives of the Gaussian pyramid








Multiresolution image processing was first applied to pixel-level image fusion
using derivatives of the Gaussian pyramid representation in which the information
from the original image signal is represented through a series of (coarser) low-pass
approximations of decreasing resolution. The pyramid is formed by iterative
application of low-pass filtering, usually with a 5x5 pixel Gaussian template,
followed by subsampling with a factor 2, a process also known as reduction.

All multiresolution image fusion systems based on this general approach
exhibit a very similar structure, which is shown in the block diagram of Figure 5.2.
Input images obtained from different sensors are first decomposed into their Gaussian
pyramid representations.

Gaussian pyramids are then used as a basis for another type of high pass (HP)
pyramids, such as the Laplacian, which contain, at each level, only information
exclusive to the corresponding level of the Gaussian pyramid.

HP pyramids represent a suitable representation for image fusion. Important
features from the input images are identified as significant coefficients in the HP
pyramids and they are transferred (fused) into the fused image by producing a new,
fused, HP pyramid from the coefficients of the input pyramids.

The process of selecting significant information from the input pyramids is
usually referred to as feature selection and the whole process of forming a new
composite pyramid is known as pyramid fusion. The fused pyramid is transformed
into the fused image using a multiresolution reconstruction process. This process is
dual to the decomposition and involves iterative expansion (up-sampling) of the
successive levels of the fused Gaussian pyramid and combination (addition in the
case of Laplacian pyramids) with the corresponding levels of the fused HP pyramid,
known as expand operation.
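
A compact sketch of this decompose-fuse-reconstruct structure using OpenCV's pyrDown/pyrUp for the Gaussian and Laplacian pyramids (OpenCV is assumed available and the images single-channel; absolute-maximum selection for the detail levels and averaging of the base band are illustrative choices, not the exact rules discussed in this report):

import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    """Gaussian pyramid by repeated reduce (blur + subsample); Laplacian levels
    are the differences between each level and the expanded next level."""
    gauss = [img.astype(np.float32)]
    for _ in range(levels):
        gauss.append(cv2.pyrDown(gauss[-1]))
    details = [gauss[i] - cv2.pyrUp(gauss[i + 1], dstsize=gauss[i].shape[::-1])
               for i in range(levels)]
    return details + [gauss[-1]]                 # detail levels + base band

def fuse_laplacian(img_a, img_b, levels=4):
    pa, pb = laplacian_pyramid(img_a, levels), laplacian_pyramid(img_b, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))        # average the coarsest approximation
    out = fused[-1]
    for detail in reversed(fused[:-1]):          # expand and add back each level
        out = cv2.pyrUp(out, dstsize=detail.shape[::-1]) + detail
    return out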

Another interesting fusion approach, presented by Toet et al., uses the contrast
pyramid and preserves local luminance contrast in the sensor images. The technique is
based on selection of image features with maximum contrast rather than maximum
magnitude.






A contrast pyramid is formed by dividing each level of the Gaussian low-pass
pyramid with the expanded version of the next, coarser, level. Each level of the
contrast pyramid contains only information exclusive to the corresponding level of
the Gaussian pyramid.

5.4 Multiresolution Image Fusion Based on the Wavelet Transform

The Discrete Wavelet Transform (DWT) was successfully applied in
the field of image processing with the appearance of Mallat’s algorithm that enabled
the implementation of two dimensional DWT using one dimensional filter banks.
This significant multiresolution approach is discussed in more detail in the next
chapter.



















Figure 5.3: The structure of an image fusion system based on wavelet multiresolution analysis

Its general structure, briefly described here, is very similar to that of the
Gaussian pyramid based approach. The structure of a wavelet based image fusion
system is shown in Figure 5.3. Input signals are transformed using the wavelet
decomposition process into the wavelet pyramid representation.

In contrast to Gaussian pyramid based methods, high pass information is also
separated into different sub-band signals according to orientation as well as scale. The
scale structure remains logarithmic, i.e. for every new pyramid level the scale is
reduced by a factor of 2 in both directions.






The wavelet pyramid representation has three different sub-band signals
containing information in the horizontal, vertical and diagonal orientation at each
pyramid level. The size of the pyramid coefficients corresponds to “contrast” at that
particular scale in the original signal, and can therefore, be used directly as a
representation of saliency. In addition, wavelet representation is compact.

One of the first wavelet based fusion systems was presented by Li et al. in
1995. It uses Mallat's technique to decompose the input images and an area based
feature selection for pyramid fusion.

In the proposed system, Li et al. use a 3x3 or a 5x5 neighbourhood to evaluate
a local activity measure associated with the centre pixel.

It is given as the largest absolute coefficient size within the neighbourhood. In
case of coefficients from the two input pyramids exhibiting dissimilar values, the
coefficient with the largest activity associated with it is chosen for the fused pyramid.
Otherwise, similar coefficients are simply averaged to get the fused value. Finally,
after the selection process, a majority filter is applied to the binary decision map to
remove bad selection decisions caused by noise “hot-spots”.
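
A sketch, in the spirit of this scheme, of area-based selection for a single pair of corresponding sub-bands; the maximum filter supplies the neighbourhood activity measure and a mean filter acts as the majority filter on the decision map (the averaging of similar coefficients described above is omitted for brevity):

import numpy as np
from scipy.ndimage import maximum_filter, uniform_filter

def area_based_select(sub_a, sub_b, window=3):
    """Area-based coefficient selection: activity = largest absolute coefficient
    in a window; the binary decision map is then smoothed by a majority vote."""
    act_a = maximum_filter(np.abs(sub_a), size=window)
    act_b = maximum_filter(np.abs(sub_b), size=window)
    choose_a = act_a >= act_b
    choose_a = uniform_filter(choose_a.astype(float), size=window) > 0.5
    return np.where(choose_a, sub_a, sub_b)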

This fusion technique works well at lower pyramid levels, but for coarser
resolution levels, the area selection and majority filtering, especially with larger
neighbourhood sizes, can significantly bias feature selection towards one of the
inputs.
























6. MULTIRESOLUTION WAVELET IMAGE FUSION

6.1 Introduction

Multiresolution analysis represents image signals in a multiresolution pyramid
form. This means that performing image fusion in this “pyramid” domain enables the
fusion of features from different input images at various scales even when they
occupy overlapping areas of the observed scene. Segmentation of the image spectrum
into pyramid levels corresponding to narrow ranges of scale and the use of selective
pyramid fusion techniques introduces robustness to the fusion system by minimizing
the information loss produced when applying fusion algorithms on a single resolution
basis. As a result, a number of multiresolution image processing techniques have been
proposed in the field of pixel-level image fusion.

The DWT multiresolution representation, or wavelet pyramid, has a number
of advantages over the Gaussian pyramid based multiresolution techniques. One of
the most fundamental issues is that wavelet functions used in this type of
multiresolution image analysis form an orthonormal basis that results in a
nonredundant signal representation. In other words, the size of the multiresolution
pyramid is exactly the same as that of the original image; Gaussian based pyramid
representations are 4/3 of the original image size. Also, in addition to their redundant
signal representation, the computational complexity of the multiresolution analysis
process used to obtain Gaussian based pyramids, far exceeds that of the wavelet
decomposition process, which can be implemented using one-dimensional filters
only.

The reconstruction process, which transforms the fused multiresolution
pyramid back to the original image representation, is the dual of the decomposition
process. In this chapter, we describe a new system for pixel-level image fusion of
gray-level image modalities, using the DWT multiresolution approach. It is based on
a novel cross-sub-band feature selection and fusion mechanism that is used to fuse
information from different input pyramids. Results obtained with this new method,
show that this form of pyramid fusion significantly reduces information loss and
ringing artifacts exhibited by more conventional wavelet based fusion systems.



6.2 Wavelet transform

The basic idea of the wavelet transform is to represent an arbitrary signal f as
a weighted superposition of wavelets. Wavelets are functions generated by dilations
and translations of a single prototype function, called the mother wavelet ψ(t):

ψ_a,b(t) = |a|^(-1/2) ψ((t − b) / a)                                   (6.1)

The wavelet transform is useful in image fusion applications due to its good
localization properties in both the spectral and spatial domains. These properties arise
from the way the wavelets are produced from the prototype function. Dilations of the
orthogonal wavelet ensure that the signal is analyzed at different spectral ranges,
providing spectral localization, while translations provide the spatial analysis,
resulting in good spatial domain localization. The reconstruction of the original signal
from the wavelet representation is possible if the wavelet prototype function satisfies
the decay (admissibility) condition:

∫ |Ψ(ω)|² / |ω| dω < ∞                                                 (6.2)

where Ψ(ω) represents the Fourier transform of ψ(t).

The integral wavelet transform of a signal f(t) with respect to the analyzing
wavelet ψ is defined as

W_ψ f(a,b) = ∫ f(t) ψ_a,b(t) dt                                        (6.3)



The parameters a and b are called dilation and translation parameters
respectively.

The equations above relate to continuous wavelets and the continuous wavelet
transform (CWT); however, for practical reasons, and in the applications of interest in
this work, a discrete version, the Discrete Wavelet Transform (DWT), is preferred.

6.3 Two Dimensional QMF Multiresolution Decomposition

The multiresolution image analysis technique used in our fusion system is
based on the Quadrature Mirror Filter (QMF) implementation of the discrete wavelet
transform, embodied in Mallat’s algorithm.


Quadrature Mirror Filters represent a class of wavelet
decomposition/reconstruction filters developed independently for subband coding and
compression of discrete signals.

They satisfy the conditions defining the relationships between the low and
high-pass analysis and synthesis impulse responses (changing the sign of every other
sample between the LP and HP responses, and mirroring the analysis filters in time to
produce the synthesis bank), and are capable of pyramid decomposition and perfect
reconstruction of the original signal from its decomposed pyramid representation.

QM filter banks used in multiresolution signal processing are made up of two
pairs of power-complementary conjugate FIR filters.

Signal decomposition is performed in the analysis filter bank by the analysis
QMF pair h0(n) and h1(n). Signal reconstruction takes place in a synthesis bank
consisting of the QMF synthesis pair g0(n) and g1(n).














Figure 6.1 QMF decomposition structure: analysis bank

















Figure 6.2 QMF reconstruction structure: synthesis bank





Two-dimensional signals are decomposed with one-dimensional FIR filters by
applying the filters in both directions independently.

The structure of the QMF analysis and synthesis filter banks is shown in
Figure 6.1 and Figure 6.2. In the analysis bank, a one-dimensional decomposition
filter bank is first applied in the horizontal direction to the input image or its
approximation A_LL^(k+1). The image is filtered along the rows with the low and
high-pass analysis filters, H0 and H1, and the resulting signals are critically decimated
in the horizontal direction by keeping one column out of two.

The two half images produced in this way are themselves inputs into identical
filter banks which operate in vertical direction. Signals are filtered along their
columns and only every other row of the processed signals is kept.

Image reconstruction from the multiresolution pyramid is through a series of
synthesis filter banks. The reconstruction process is dual to the decomposition
process. In each stage of the reconstruction, all the sub-band signals of the same
resolution level are input into the synthesis bank to produce the low-pass
approximation of the higher resolution level. Initially, all signals are interpolated in
the vertical direction by inserting a row of zeros after each sub-band row. Interpolated
signals are then filtered along the columns with the QMF synthesis pair G0 and G1.
The results of the two one dimensional synthesis banks are input into a further
synthesis bank where they are processed in the horizontal dimension by inserting
columns of zeros followed by filtering along the rows. Finally, the reconstructed
signal is obtained as the sum of the outputs of the low and high-pass filtering
branches of the last filter bank.
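
A rough Python sketch of one such separable analysis stage; scipy is assumed available, h0 and h1 are 1-D numpy arrays (the low and high-pass analysis filters), and the simple "same" boundary handling stands in for the boundary treatment a real QMF implementation would need:

import numpy as np
from scipy.signal import convolve

def qmf_analysis_stage(img, h0, h1):
    """One 2-D analysis stage with separable 1-D filters: filter the rows with
    h0/h1 and keep every other column, then filter the two half-images along
    the columns and keep every other row, giving LL, LH, HL and HH sub-bands."""
    def rows(x, h):
        return convolve(x, h[np.newaxis, :], mode="same")[:, ::2]
    def cols(x, h):
        return convolve(x, h[:, np.newaxis], mode="same")[::2, :]
    low, high = rows(img, h0), rows(img, h1)
    return {"LL": cols(low, h0), "LH": cols(low, h1),
            "HL": cols(high, h0), "HH": cols(high, h1)}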

6.4 Wavelet Fusion Structure

Wavelet based pixel-level image fusion schemes increase the information
content of fused images by selecting the most significant features from input images
and transferring them into the composite image.

This process takes place in the multiresolution pyramid domain reached by the
process of multiresolution analysis. Information fusion is achieved by creating a new,
fused pyramid representation that contains all the significant information from the
multiresolution pyramids of the input images.


Input images A and B, are first decomposed into multiresolution pyramids
using a series of multiresolution QMF Analysis filter banks.

Then, a new pyramid array is initialized containing no information, i.e. it is
filled with zeros. The pyramid fusion algorithm then considers, in a systematic way,
individual or groups of pixels from the multiresolution pyramid representations of the
input images, and forms values for the corresponding pixels of the new pyramid. The
coefficients of the new pyramid are formed either by transferring the input coefficient
values directly or as arithmetic combinations of the corresponding coefficients from
the input pyramids. Criteria for the selection and fusion of input pyramid coefficients
are determined in the design of the feature selection process.

Thus the feature selection process searches the input pyramids and identifies
the most significant image features at each position and scale. The aim then, is to
transfer these features from the input image pyramids into the fused without loss of
information. For each level of scale (resolution) all spatial positions have to be
considered and features from input images compared with each other. The pyramid
fusion process used in the proposed system is based on a cross-band feature
evaluation and selection approach. It integrates feature information from a number of
sub-band signals and levels at once, to make a decision on how to fuse particular
input pyramid pixels. When the pyramid fusion process is completed and all the fused
pyramid coefficients have been produced, the fused pyramid is input into the wavelet
reconstruction process to obtain the final fused image.
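
A compact sketch of this decompose-fuse-reconstruct pipeline using PyWavelets (assumed available); note that the pyramid fusion rule below is a simple pixel-based absolute-maximum selection with base-band averaging, not the cross-band criterion proposed in the next section:

import numpy as np
import pywt

def dwt_fusion(img_a, img_b, wavelet="db8", levels=3):
    """Decompose both images with a 2-D DWT, fuse the detail coefficients by
    absolute-maximum selection, average the approximations, and reconstruct."""
    ca = pywt.wavedec2(img_a, wavelet, level=levels)
    cb = pywt.wavedec2(img_b, wavelet, level=levels)
    fused = [0.5 * (ca[0] + cb[0])]                      # coarsest approximation
    for subs_a, subs_b in zip(ca[1:], cb[1:]):           # (LH, HL, HH) per level
        fused.append(tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                           for x, y in zip(subs_a, subs_b)))
    return pywt.waverec2(fused, wavelet)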

6.5 Conventional Feature Selection and Pyramid Fusion Mechanisms

Feature selection mechanisms can be broadly divided into pixel and area
based schemes, and pyramid coefficient fusion methods are purely selective, purely
arithmetic or composite, a combination of the first two.

Pixel based feature selection systems make a fusion decision for each pyramid
pixel individually, based on its value. In contrast, area based methods use a
neighbourhood of coefficient values to form a selection criterion for the center pixel.
A diagram illustrating these two types of coefficient selection mechanisms is shown
for a single pyramid sub-band fusion in Figure 6.3.




In terms of pyramid fusion, selective schemes form the composite pyramid by
direct transfer of coefficient values from the input pyramids into the fused, according
to a selection map produced by the feature selection process.

Arithmetic methods on the other hand, evaluate fused pyramid coefficients as
an arithmetic combination, usually a weighted sum, equation (5.1), of the input
pyramid values. Composite methods use both of the above approaches.

























Figure 6.3: Pixel-based and Area-based selection method



Robustness can be added to the system by the use of area based selection
criteria, such as those used in the schemes of Burt and Kolczynski and Li et al.

Decisions based on a neighbourhood around the center coefficient (Figure 6.3)
remove most of the selection map randomness due to noise and random large values
in the sub-band signal.

They also reduce the contrast loss by ensuring that all the coefficients
belonging to a particular dominant feature are selected. The performance of these
methods however, depends on the image content and the size of the neighbourhood
used.






The feature selection mechanism proposed in this chapter is based on a
crossband coefficient selection criterion that exploits the high level of correlation
present between the different levels and sub-bands of input pyramids, to form a more
robust and complete evaluation of input image features. The processes of forming a
selection decision based upon information from multiple sub-bands and selecting
multiple input coefficients at once are also referred to as the integration of selection
information. Integration of selection information refers to the process of using
information from more than one resolution level of the input pyramids to aid pyramid
coefficient selection and fusion.

By including coefficient values from more than one resolution level in our selection
criterion, we gain an even better evaluation of the saliency of the original feature. In
the top-down approach used in the proposed system, the values of the “father”
coefficients are used in the selection criterion of their “children”. The selection
criterion thus becomes

(F_L^LH, F_L^HL, F_L^HH) =  (A_L^LH, A_L^HL, A_L^HH),   if A_L + A_(L+1) > B_L + B_(L+1)
                            (B_L^LH, B_L^HL, B_L^HH),   otherwise                        (6.4)






















where

A_L = |A_L^LH| + |A_L^HL| + |A_L^HH|                                   (6.5)

B_L = |B_L^LH| + |B_L^HL| + |B_L^HH|                                   (6.6)

A_(L+1) = |A_(L+1)^LH| + |A_(L+1)^HL| + |A_(L+1)^HH|                   (6.7)

B_(L+1) = |B_(L+1)^LH| + |B_(L+1)^HL| + |B_(L+1)^HH|                   (6.8)

A_L^sb, B_L^sb and F_L^sb represent the coefficients of sub-band sb on level L, with
A_(L+1)^sb and B_(L+1)^sb being the corresponding “father” coefficients in the input
pyramids. The cross-band feature selection and pyramid fusion mechanism described
above has one constraint: it can be implemented only on pyramid levels which are
not at the end of the pyramid.
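
A minimal sketch of this cross-band selection rule for one pyramid level; it assumes the "father" sub-bands from level L+1 have already been upsampled to the size of level L so that the comparison of Equation (6.4) can be made pixel by pixel:

import numpy as np

def crossband_select(a_subs, b_subs, a_parent, b_parent):
    """a_subs, b_subs: (LH, HL, HH) sub-band arrays of inputs A and B at level L.
    a_parent, b_parent: corresponding father sub-bands from level L+1,
    upsampled to the size of level L.  Returns the fused (LH, HL, HH)."""
    act_a = sum(np.abs(s) for s in a_subs) + sum(np.abs(s) for s in a_parent)
    act_b = sum(np.abs(s) for s in b_subs) + sum(np.abs(s) for s in b_parent)
    choose_a = act_a > act_b                     # the criterion of Equation (6.4)
    return tuple(np.where(choose_a, sa, sb) for sa, sb in zip(a_subs, b_subs))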













































7. GRADIENT BASED MULTIRESOLUTION IMAGE FUSION

7.1 Introduction

In this chapter, a novel approach to multiresolution image analysis, designed
specifically for use in pixel-level image fusion systems, is presented. The aim is to
eliminate the main problems encountered in conventional multiresolution fusion
approaches: i.e. reconstruction errors, loss of contrast information and prohibitively
high computational complexity. At the same time, the ability to operate successfully
across a wide range of pixel-level fusion applications has also been an important
objective in this part of the program.

As mentioned in Chapters 5 and 6, most of the previously proposed
pixel-level image fusion systems have been based on wavelet signal analysis
techniques. These multiresolution fusion systems that employ the Discrete Wavelet
Transform (DWT) achieve high fused image quality and robust performance at
reasonable computational cost.

However, the multiresolution structure of the wavelet analysis also introduces
a number of characteristic problems in the image fusion domain. The most important
of these is certainly the problem of reconstruction errors or “ringing” artifacts.

Ringing artifacts are the result of compromising the perfect reconstruction property of
the wavelet multiresolution analysis, by introducing discontinuities into the sub-band
signals. This is almost unavoidable in the process of image fusion.

In this chapter a novel approach to multiresolution wavelet analysis is
described. The approach is based on a Gradient (edge) signal representation of image
information which is particularly well suited to pixel-level image fusion. Edge signal
representation is compressed into a related Gradient (edge) map representation that is
easily derived from the original image signal.

Edge maps express the information contained in the original image signal as
changes in the signal value, rather than absolute signal values. This edge map
representation can be incorporated into the multiresolution decomposition process by
using alternative Gradient (edge) filters.






Information fusion is performed in the multiresolution gradient map domain,
resulting in a new decomposition-fusion processing structure. At each level of scale
input signals are first transformed into their edge map representations, and fused to
produce fused edge maps. High-pass information from fused edge map signals is then
decomposed into a simplified wavelet pyramid representation using edge filters.

The basic multiresolution structure is preserved in the system and the fused
image is obtained through a conventional multiresolution reconstruction process. The
method provides clear advantages over conventional wavelet fusion systems both in
terms of a more robust feature selection and also a significant reduction in the amount
of ringing artifacts and information loss.

7.2 Gradient Map Representation of Image Information

The practical goal of pixel-level image fusion algorithms is to identify
(detect), compare and transfer the most important visual information from the inputs
into the fused image. Visual information, contained within image signals, is mainly in
the form of edges, i.e. changes or uncertainties of the signal rather than the absolute
gray level value of each pixel. Larger, more perceptually meaningful, information
carrying image structures such as patterns, features and objects can be considered as
collections of basic edge elements of different scales and orientations with specific
spatial arrangements. The aim of image information fusion is to transfer, without loss,
all the most important edge information from any number of registered input images
into the fused image. Indeed, an ideally fused image can be defined as an image that

contains all the edge information of all input images.

Edge signals form a representation of image information that enables the
fusion process to avoid most problems described in the previous section. It is
particularly well suited for image fusion applications in that it operates directly on the
input signal (image) level, rather than on a sub-band at a time. Moreover, the edge
signal representation is possible at any resolution, which allows the preservation of
the multiresolution structure. In this way, information from the entire spectrum is
used to make feature selection decisions. The Gradient map of a one dimensional

signal x is defined as

∂x(n) = x(n) − x(n−1),   for all n                                     (7.1)






In a gradient signal representation, an image signal is expressed as a sum of
appropriately translated and weighted edge signals. Basically, each edge signal
captures a single gray level change, an edge element, from the original image and is
constant elsewhere.

For two-dimensional (2-D) image signals, gradient maps are defined in the
horizontal and vertical directions independently.

They are 2-D signals that represent, at each position, the horizontal and vertical
gradient information as the difference between the corresponding pixel and the pixel
directly to the left or directly above. The horizontal and vertical image gradient map
signals are defined as

∂x_H(n,m) = x(n,m) − x(n,m−1),   for all n,m                           (7.2)

∂x_V(n,m) = x(n,m) − x(n−1,m),   for all n,m                           (7.3)




















Figure 7.1: Two-dimensional gradient map signal representation: a) input signal, b) vertical gradient map representation, c) horizontal gradient map representation.

An example of an image and its horizontal and vertical edge map
representations is shown in Figure 7.1. The input image (in Figure 7.1 a) contains
significant features (information) in all possible directions.





Its horizontal edge map representation (shown in Figure 7.1 b) contains mainly
vertical edges and, to a certain extent, diagonal ones. Large grayish areas indicate
regions where there is a considerable amount of small detail, which is usually
omnidirectional.

Horizontally oriented patterns are mostly visible in the vertical edge map
(shown in Figure 7.1 c), as are, to a certain extent, the diagonal ones. In both edge map
representations the signal takes both positive and negative values and the images
displayed in Figure 7.1 are scaled absolute values of the original edge map signals.

Finally, the gradient map representation is a spatially non-redundant image
signal representation. In other words, edge map signals contain all the information
from the input signal using the same number of pixels. Compared to the gradient
signal representation this enables a significant reduction in complexity.

The original image can be perfectly reconstructed from the edge map
representation through the cumulative sum:

x(n,m) = Σ_(k=1..m) ∂x_H(n,k)                                          (7.4)
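
A minimal Python sketch of Equations (7.2)-(7.4); zero values outside the image (first row and column) are an assumption made here so that the cumulative sum reproduces the original image exactly:

import numpy as np

def gradient_maps(img):
    """Horizontal and vertical gradient maps: differences with the pixel to
    the left and the pixel above, with zero padding at the borders."""
    x = img.astype(float)
    grad_h = x - np.pad(x, ((0, 0), (1, 0)))[:, :-1]   # x(n,m) - x(n,m-1)
    grad_v = x - np.pad(x, ((1, 0), (0, 0)))[:-1, :]   # x(n,m) - x(n-1,m)
    return grad_h, grad_v

def reconstruct_from_horizontal(grad_h):
    """Perfect reconstruction by cumulative summation along the rows (Eq. 7.4)."""
    return np.cumsum(grad_h, axis=1)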


7.3 Gradient Filters

The gradient map image information domain is particularly relevant to image
fusion. However, gradient maps contain information from the entire spectrum. At the
same time, it is the multiresolution structure, in which inputs are fused at a range of
scales independently, that ensures the robustness of fusion performance. Such a
multiresolution structure is directly obtainable from gradient map signals by using
gradient filters.

In each QMF analysis stage, the upper half of the image spectrum in a
particular direction is decomposed into a “detail” subband signal. This is achieved by
filtering the input signal along that axis with the high-pass QMF H1.










The same result is obtained when one filters the corresponding gradient map
with a gradient filter H_e. The equivalence is best demonstrated for 1-D signals in the
z-transform domain:

X(z) H1(z) = A_x(z) H_e(z)                                             (7.5)

The gradient map signal A_x(z) is derived from the input signal X(z) according to the
z-transform of the expression in (7.1):

A_x(z) = X(z) − z^(−1) X(z) = (1 − z^(−1)) X(z)                        (7.6)

and by using (7.6) and (7.5) the gradient filter is defined in terms of the impulse
response of the QMF high-pass filter h1(n):

H_e(z) = H1(z) / (1 − z^(−1))   ⇔   h_e(n) = h1(n) * u(n)              (7.7)

h_e(n) is therefore obtained by convolving the QMF high-pass impulse
response with a step function u(n). Using the causal convolution sum, and exploiting
the fact that the step function u(n−k) is zero for k > n and one otherwise, we get a
simplified expression for h_e(n):

h_e(n) = Σ_(k=0..n) h1(k)                                              (7.8)

Thus, the gradient filter h_e(n) defined in (7.8) is a finite impulse response
(FIR) filter with N coefficients, where N is the length of the QMF h1(n), symmetrical
about the central coefficient.
















Figure 7.2: Impulse response of a) the high-pass QMF filter and b) the corresponding edge filter h_e(n)




As an example, the impulse responses of the Johnston 16A QMF high-pass
filter and the gradient filter derived from it are plotted in Figure. 7.2 (a) and (b),
respectively.
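
The derivation can be checked numerically with a short Python snippet; the random filter below is only a zero-mean stand-in for a real QMF high-pass filter such as the Johnston 16A (whose coefficients are not reproduced here):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)                 # arbitrary 1-D test signal
h1 = rng.standard_normal(16)
h1 -= h1.mean()                              # QMF high-pass filters have zero DC gain
h_e = np.cumsum(h1)                          # gradient filter, Equation (7.8)
grad_x = np.diff(x, prepend=0.0, append=0.0) # full first-difference sequence of x
lhs = np.convolve(x, h1)                     # X(z) H1(z)
rhs = np.convolve(grad_x, h_e)               # A_x(z) H_e(z)
print(np.allclose(lhs, rhs[:lhs.size]))      # True: the equivalence of Equation (7.5)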

7.4 Gradient Fusion Structure

Information fusion in the proposed gradient-based fusion system takes place
in the gradient map domain. The system uses an alternative, fuse-then-decompose
strategy, which yields a novel fusion-decomposition architecture based on gradient
filters and the gradient map representation of image signals. Information fusion is
achieved by applying sophisticated feature selection and fusion algorithms to gradient
maps.

















Figure 7.3: Simplified spectral decomposition for a) gradient based fusion and b) the resulting pyramid structure with sub-bands of different sizes.

Image fusion in the gradient-based fusion system is performed within the
general framework of the conventional, logarithmic, multiresolution structure.
Information in horizontal and vertical directions is fused half-band at a time.

At each resolution level only the upper ¾ of the image spectrum is fused
(Figure 7.3). Decomposition is extended by applying further stages of analysis banks
until the vast majority of the information contained in the input spectra is fused. The
remaining base band residuals are fused last using alternative methods.









7.4.1 Block diagram



[Figure 7.4 diagram: each input image is converted to a horizontal gradient map representation and a horizontal low-pass approximation; the fused horizontal gradient map is gradient filtered to give the horizontal sub-band, the low-pass approximations are processed in the same way in the vertical direction to give the vertical sub-band, and the remaining base-band low-pass approximations are fused by base-band fusion into the fused approximation.]
Figure 7.4: Structure of a single resolution level of the gradient-based multiresolution fusion system

The general structure of a single resolution level of the gradient-based
multiresolution fusion system, illustrated in Figure 7.4, is based on combined
fusion-analysis filter banks; for simplicity, only two input images A and B are fused.

At each resolution level, the input image signals are transformed into their
horizontal gradient map representations, which are in turn fused into a single
horizontal gradient map signal. Gradient filters are then applied to this map and the
resulting signal is decimated (by a factor of 2) to produce the fused horizontal subband
at the k-th resolution level. This subband contains fused information exclusive to the
horizontal upper half of the input signal spectrum.

At the same time, the input image signals are filtered with low-pass filters in
the horizontal direction, which produces their low-pass approximations containing
only the lower half of the spectrum in this direction.

In the second stage, these low-pass approximations are processed in the same
manner as the input signals but in the vertical direction.








This produces the vertical fused subband signal and the quarter-band low-pass input image approximations, A1 and B1. The low-pass approximations created by the structure are further input into an equivalent bank operating on the (k+1)-th resolution level. Finally, when all the high-pass information from the input spectra has been fused and decomposed, or a certain decomposition depth has been reached, the remaining input image basebands A1 and B1 are fused using arithmetic fusion methods. The gradient-based multiresolution image fusion architecture of Figure 7.4 uses gradient maps and gradient filters to effectively implement the QMF high-pass filtering branches of multiresolution analysis filter banks.

The detailed block diagram of a single resolution stage of this analysis process is shown in Figure 7.5. Both input images are initially processed using the horizontal delay elements in the high-pass filtering branches H_e to produce horizontal gradient maps. These gradient maps are fused into a single horizontal gradient map, which is then filtered along the rows with the H_e filter. The filtered signal is decimated by a factor of 2 to produce the fused horizontal subband. Input signals are also filtered along the rows using H_o and are decimated by 2.

The resulting low-pass approximations are processed in the vertical direction
by using vertical delay elements; input signals are expressed as vertical gradient maps
that are then fused.

Figure 7.5: Implementation structure of the gradient-based fusion-decomposition process



The resulting fused gradient map is gradient filtered along the columns and decimated by ignoring every other row of the filtered signal, to produce the vertical subband signal. The half-band approximations are also low-pass filtered and decimated in the vertical direction, resulting in ¼-band low-pass subband signals A1 and B1. These are used as inputs into further decomposition stages.
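The following MATLAB sketch shows one possible implementation of a single resolution level of this fuse-then-decompose structure. It is only an illustrative sketch under stated assumptions: the gradient maps are formed with first-order differences (the delay-element branches), he and ho stand for the gradient and low-pass filters of the bank, fuse_maps is any gradient-map fusion rule (for example the select-max rule of Section 7.5), and border handling is ignored.

```matlab
function [subH, subV, A1, B1] = gradient_fusion_level(A, B, he, ho, fuse_maps)
% One resolution level of the gradient-based fuse-then-decompose structure
% (cf. Figures 7.4 and 7.5). A, B: input images; he, ho: gradient and
% low-pass filters; fuse_maps: handle to a gradient-map fusion rule.

% Horizontal gradient maps (delay-element branches, first-order differences).
dA = [A(:,1), diff(A, 1, 2)];
dB = [B(:,1), diff(B, 1, 2)];

% Fuse the horizontal gradient maps, filter along the rows with the gradient
% filter and decimate the columns by 2 -> fused horizontal subband.
dF   = fuse_maps(dA, dB);
subH = conv2(dF, he(:).', 'same');
subH = subH(:, 1:2:end);

% Half-band low-pass approximations along the rows, decimated by 2.
Alp = conv2(A, ho(:).', 'same');  Alp = Alp(:, 1:2:end);
Blp = conv2(B, ho(:).', 'same');  Blp = Blp(:, 1:2:end);

% Repeat the same process in the vertical direction.
vA = [Alp(1,:); diff(Alp, 1, 1)];
vB = [Blp(1,:); diff(Blp, 1, 1)];
vF   = fuse_maps(vA, vB);
subV = conv2(vF, he(:), 'same');
subV = subV(1:2:end, :);

% Quarter-band low-pass approximations, passed to the next resolution level.
A1 = conv2(Alp, ho(:), 'same');  A1 = A1(1:2:end, :);
B1 = conv2(Blp, ho(:), 'same');  B1 = B1(1:2:end, :);
end
```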

A high-resolution fused image is obtained from this fused multiresolution
pyramid by applying a modified version of the conventional QMF pyramid
reconstruction process. Image reconstruction is implemented through a series of
cascaded, two-dimensional synthesis filter banks (same as conventional wavelet
reconstruction).

7.5 Gradient information fusion

In gradient-based multiresolution image fusion, information fusion is

performed in the gradient map domain.

Unlike wavelet pyramid coefficients, whose size is only an indication of the
saliency of features collocated within a neighborhood, the absolute size of the
gradient map elements is a spatially accurate direct measure of feature contrast.
Furthermore, gradient map signals contain the information from the entire spectrum,

which adds reliability to the process of feature selection and fusion.

This also enables the fusion system to transfer into the fused pyramid all the
high frequency information of a particular feature. Due to these properties, gradient-
based fusion exhibits improved performance in terms of robust feature selection and

achieves significant reductions in fused visual information distortion.

The simplest method of feature selection and fusion is the pixel-based select
max approach. In this approach, the fused gradient map pixel takes the value of the

corresponding input gradient map pixel with the largest absolute value, i.e.,
$$d_F(n,m) = \begin{cases} d_A(n,m), & |d_A(n,m)| > |d_B(n,m)| \\ d_B(n,m), & \text{otherwise} \end{cases} \qquad (7.9)$$

where d_A, d_B and d_F denote corresponding elements of the input and fused gradient maps.
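A minimal MATLAB realization of this select-max rule, applied element-wise to two corresponding gradient maps dA and dB, could look as follows.

```matlab
% Pixel-based select-max fusion of gradient maps, equation (7.9): each fused
% element takes the input value with the larger absolute value.
select_max = @(dA, dB) dA .* (abs(dA) > abs(dB)) + dB .* (abs(dA) <= abs(dB));

dF = select_max(dA, dB);   % dA, dB: gradient maps of input images A and B
```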

However, this method is not always as reliable as more complex subband
fusion techniques. The cross-band fusion method used in the edge based fusion
system employs the same principles as the one presented in chapter 9.


In this case however, sub-band coefficients of the wavelet pyramid are
replaced with the edge elements (pixels) of the edge map.

Furthermore, there is no straightforward integration of selection decisions
since there is no direct spatial correspondence between pixels of the horizontal and
vertical edge maps (they are of different size, Figure 7.3). The basic feature selection
used in horizontal and vertical edge map fusion is expressed as:

$$d^{F}_{x}(n,m) = \begin{cases} d^{L}_{Ax}(n,m), & S^{L}_{Ax}(n,m) > S^{L}_{Bx}(n,m) \\ d^{L}_{Bx}(n,m), & \text{otherwise} \end{cases}$$

$$d^{F}_{y}(n,m) = \begin{cases} d^{L}_{Ay}(n,m), & S^{L}_{Ay}(n,m) > S^{L}_{By}(n,m) \\ d^{L}_{By}(n,m), & \text{otherwise} \end{cases} \qquad (7.10)$$

where S is the selection measure associated with each edge element, computed from edge map information at the current and coarser resolution levels.


k is a constant, experimentally determined to be k=3, and L and L+1 indicate
edge map information from the current and coarser resolution levels respectively.


Consistency verification of selection decisions can change the edge element fusion
method from selective to arithmetic fusion, if the majority of corresponding selection
decisions made on the higher resolution level (L-1) do not agree with the current
decision.

The spatial correspondence between edge elements at neighbouring resolution
levels is the same as in the conventional wavelet pyramid case.

Exact weighting coefficients of the arithmetic fusion method are again based
on the distance between the edge elements:




$$D(n,m) = \frac{\bigl|\,|d^{L}_{Ax}(n,m)| - |d^{L}_{Bx}(n,m)|\,\bigr|}{\max\bigl(|d^{L}_{Ax}(n,m)|,\;|d^{L}_{Bx}(n,m)|\bigr)} \qquad (7.11)$$









Weighting coefficients of the arithmetic fusion are evaluated according to the size of the difference D compared to a threshold T: if the distance between the coefficients is very large, D > T, the input edge elements are added to form the fused value. Otherwise they are considered similar and their average value is taken for the
fused edge map. The optimal value for the threshold parameter T was experimentally
determined to be in the region of 0.8. The complete cross-band feature selection and
edge map fusion method is illustrated in graphical form in Figure 7.6.
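As a sketch, assuming the distance D of (7.11) as reconstructed above and the threshold T ≈ 0.8 quoted in the text, the arithmetic combination of two corresponding edge elements a and b could be implemented as follows.

```matlab
% Arithmetic fusion of corresponding edge elements a and b, steered by the
% normalized distance D of (7.11): very different elements are added,
% similar elements are averaged.
T = 0.8;                                                  % threshold from the text
D = abs(abs(a) - abs(b)) ./ (max(abs(a), abs(b)) + eps);  % eps avoids 0/0
fused = (a + b) .* (D > T) + ((a + b) / 2) .* (D <= T);
```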

7.6 Baseband Fusion

Baseband signals are the residual, low-pass approximations of the input
signals. These baseband signals contain only the very large-scale features that form

the background of input images and are important for their natural appearance.

In the proposed fusion system, baseband fusion is performed using arithmetic combinations of the input basebands as follows:

$$F_k(n,m) = A1_k(n,m) + B1_k(n,m) - \frac{\mu_A + \mu_B}{2} \qquad (7.13)$$


where F_k, A1_k and B1_k are the fused and input baseband signals, μ_A and μ_B are the mean values of the two input basebands, and k represents the coarsest resolution level. Generally, baseband fusion methods have little influence on the overall fusion performance.
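A direct MATLAB transcription of the baseband rule (7.13) might be:

```matlab
% Arithmetic baseband fusion, equation (7.13): sum of the two baseband
% approximations with their mean brightness subtracted, so that the overall
% intensity level of the fused baseband stays comparable to the inputs.
muA   = mean(A1(:));
muB   = mean(B1(:));
Fbase = A1 + B1 - (muA + muB) / 2;
```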

7.7 Fusion Complexity

In terms of computational complexity, the edge-based multiresolution decomposition-fusion approach proposed in this chapter offers a reduction in the computational effort required to fuse two images when compared to the conventional QMF implementation approach. The most significant portion of the reduction in complexity comes from the reduction in the number of filters used in the decomposition and reconstruction (analysis and synthesis) filter banks. In both analysis and synthesis banks, the elimination of a one-dimensional filter at the second and first stages of filtering, respectively, reduces the complexity by around ¼ compared with the direct implementation.





8. OBJECTIVE EVALUATION OF PIXEL-LEVEL IMAGE FUSION PERFORMANCE

8.1 Introduction

This chapter addresses the issue of objectively measuring pixel-level image
fusion performance. Multisensor image fusion is widely recognized as valuable in
image based application areas such as remote and airborne sensing and medical
imaging. As a consequence, with the constant improvements in the availability of
multispectral/ multisensor equipment, considerable research effort has been directed
towards the development of advanced image fusion techniques. Fusion performance
metrics are used in this context to identify suitable and robust fusion approaches and
to optimize the system parameters.

In this chapter, a framework for the objective evaluation of pixel-level image fusion performance is proposed. The framework models the amount of and accuracy with
which visual information is transferred from the inputs to the fused image by the
fusion process. It is based on the principle that visual information conveyed by an
image signal relates to edge information. Therefore, by comparing the edge
information of the inputs to that of the fused image, the success of information
transfer from the input images into the fused output image can be measured. This
quantity then represents a measure of fusion performance. Perceptual importance of
different regions within the image is also taken into account in the form of perceptual
weighting coefficients associated with each gradient (edge) point in the inputs. The
objective fusion performance measure produces a single, numerical, fusion
performance score obtained as a sum of perceptually weighted measures of local
information fusion success.
















8.2 Edge Information Extraction

As mentioned previously, human observers are motivated by resolving the
uncertainties (i.e. gray level changes) in the image. In real image signals, these
changes are not concentrated in any predefined region but are commonly distributed
according to content throughout the image signal. Spatial locations where the signal
changes value form a part of the uncertainty associated with the image signal.

An observer searches the visual stimulus (image signal) for these areas of

“uncertainty” and extracts information from them.

However, information is not only contained in the detectable changes of the
signal value fixated by the observer. The lack of signal change (zero edge) carries a
small but finite amount of information, i.e. that there is no edge there.

Therefore, in order to capture all the information contained within an image,
all possible “uncertainties” of that signal have to be considered. This is done by
measuring edge (gradient) information at all spatial locations within the presented
image.



$$\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \qquad\qquad \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$$

Figure 8.1: a) Horizontal and b) vertical Sobel templates

Visual information from the image signal is represented, at each position, through edge strength and orientation parameters. These parameters are extracted using a simple Sobel edge operator, defined in its basic form by the two 3×3 templates shown in Figure 8.1. These templates represent the horizontal and vertical edge operators that measure edge components in the horizontal and vertical directions respectively.

For the purpose of edge information extraction in the proposed objective measure, all three images, A, B and F, are two-dimensionally filtered with the two Sobel templates. The result of filtering each image is a pair of images, s_x and s_y, that contain the edge components in the x and y directions.

From these components, the edge strength, g(n,m), and orientation, α(n,m), information is easily obtained for each pixel p(n,m) of an input image (say image A) according to:

$$g_A(n,m) = \sqrt{s_x^A(n,m)^2 + s_y^A(n,m)^2} \qquad (8.1)$$

$$\alpha_A(n,m) = \tan^{-1}\!\left(\frac{s_x^A(n,m)}{s_y^A(n,m)}\right) \qquad (8.2)$$

for 1 ≤ n ≤ N and 1 ≤ m ≤ M, where N and M are the dimensions of the input image.
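A minimal MATLAB sketch of this extraction step, shown for one input image A (and applied identically to B and F), is given below; the assignment of the two templates to the x and y components follows one common convention and is an assumption of the sketch, while the strength in (8.1) is unaffected by that choice.

```matlab
% Sobel-based edge strength and orientation extraction (equations 8.1, 8.2).
Sh = [-1 -2 -1;  0 0 0;  1 2 1];    % horizontal template (Figure 8.1a)
Sv = [-1  0 1; -2 0 2; -1 0 1];     % vertical template  (Figure 8.1b)

sx = conv2(double(A), Sv, 'same');  % edge component in the x direction
sy = conv2(double(A), Sh, 'same');  % edge component in the y direction

gA     = sqrt(sx.^2 + sy.^2);       % edge strength, equation (8.1)
alphaA = atan(sx ./ (sy + eps));    % edge orientation as in (8.2); eps avoids 0/0
```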

8.3 Perceptual Loss of Edge Strength and Orientation

The edge information preservation estimator is a crucial part of the objective fusion performance measure. It provides a measure of how well the edge information in the fused image represents the edge information that can be found in the inputs. This measurement represents a comparison with the theoretical aim of the fusion process, which is to preserve, as truthfully as possible, all input information in a single fused image. This comparison is the basis of the measurement of image fusion performance achieved by the fusion system.

Edge information extracted from the input and fused images is in the form of edge strength and orientation maps: g_A(n,m), g_B(n,m) and g_F(n,m), and α_A(n,m), α_B(n,m) and α_F(n,m).

The change in edge strength is evaluated as the ratio between the strength of the fused and of the input gradient for the case when there is a loss of contrast, i.e. the input gradient is larger than the fused one. In the opposite case, when the fused gradient is larger than the input, we have unintended contrast enhancement, which is treated in the same way as an inverted loss in contrast and the ratio is inverted. The strength change parameter of information in F with respect to A, G^AF, can therefore be expressed as:









$$G^{AF}(n,m) = \begin{cases} \dfrac{g_F(n,m)}{g_A(n,m)}, & \text{if } g_A(n,m) > g_F(n,m) \\[8pt] \dfrac{g_A(n,m)}{g_F(n,m)}, & \text{otherwise} \end{cases} \qquad (8.3)$$


From the expression in equation (8.3), it can be seen that the parameter G^AF has a value of unity when the fused gradient strength g_F(n,m) is a perfect representation of, i.e. it is equal to, the input gradient strength g_A(n,m). For an increasing difference between the two values, G^AF decreases linearly towards zero.

The change of orientation information in F with respect to A, A^AF, can be expressed as a normalized relative distance between the input and fused edge orientations:

$$A^{AF}(n,m) = \frac{\bigl|\,|\alpha_A(n,m) - \alpha_F(n,m)| - \pi/2\,\bigr|}{\pi/2} \qquad (8.4)$$



These are used to derive the edge strength and orientation preservation values:

$$Q_g^{AF}(n,m) = \frac{\Gamma_g}{1+\exp\bigl(\kappa_g\,(G^{AF}(n,m) - \sigma_g)\bigr)} \qquad (8.5)$$

$$Q_\alpha^{AF}(n,m) = \frac{\Gamma_\alpha}{1+\exp\bigl(\kappa_\alpha\,(A^{AF}(n,m) - \sigma_\alpha)\bigr)} \qquad (8.6)$$


Q_g^AF(n,m) and Q_α^AF(n,m) model the perceptual loss of information in F, in terms of how well the strength and orientation values of a pixel p(n,m) in A are represented in the fused image. The constants Γ_g, κ_g, σ_g and Γ_α, κ_α, σ_α determine the exact shape of the sigmoid functions used to form the edge strength and orientation preservation values; see equations (8.5) and (8.6).


Edge information preservation values are then defined as

$$Q^{AF}(n,m) = Q_g^{AF}(n,m)\,Q_\alpha^{AF}(n,m),$$

with 0 ≤ Q^AF(n,m) ≤ 1. A value of 0 corresponds to the complete loss of edge information, at location (n,m), as transferred from A into F, while Q^AF(n,m) = 1 indicates fusion from A to F with no loss of information.

The overall objective fusion performance measurement of an image fusion
process p, operating on input images A and B to produce a fused image F, is
evaluated as a perceptually weighted, normalized sum of edge information
preservation coefficients across the input image set:


$$Q^{AB/F} = \frac{\displaystyle\sum_{n=1}^{N}\sum_{m=1}^{M}\bigl[\,Q^{AF}(n,m)\,w_A(n,m) + Q^{BF}(n,m)\,w_B(n,m)\,\bigr]}{\displaystyle\sum_{n=1}^{N}\sum_{m=1}^{M}\bigl[\,w_A(n,m) + w_B(n,m)\,\bigr]} \qquad (8.7)$$

The edge preservation values Q^AF(n,m) and Q^BF(n,m) are weighted by w_A(n,m) = [g_A(n,m)]^L and w_B(n,m) = [g_B(n,m)]^L respectively, where L is a constant. A reasonable importance distribution is obtained only with L in the region 0.8 < L < 1.2; higher and lower values place excessive emphasis on strong or weak edges respectively.
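To make the computation concrete, the following MATLAB sketch assembles the complete measure from the strength and orientation maps of the two inputs and the fused image. The sigmoid constants and the exponent L used here are illustrative placeholder values chosen only to demonstrate the structure, not the values used in the original study.

```matlab
function Q = fusion_quality(gA, aA, gB, aB, gF, aF)
% Objective fusion performance measure Q^{AB/F}, equations (8.3)-(8.7).
% gX, aX: edge strength and orientation maps of images A, B and fused F.

% Placeholder sigmoid constants and perceptual weighting exponent (assumed).
Gg = 1; kg = -10; sg = 0.5;      % strength sigmoid
Ga = 1; ka = -20; sa = 0.75;     % orientation sigmoid
L  = 1;                          % weighting exponent, in the 0.8..1.2 region

QAF = preservation(gA, aA, gF, aF, Gg, kg, sg, Ga, ka, sa);
QBF = preservation(gB, aB, gF, aF, Gg, kg, sg, Ga, ka, sa);

wA = gA.^L;  wB = gB.^L;                                       % perceptual weights
Q  = sum(QAF(:).*wA(:) + QBF(:).*wB(:)) / sum(wA(:) + wB(:));  % equation (8.7)
end

function Qxf = preservation(g, a, gF, aF, Gg, kg, sg, Ga, ka, sa)
% Edge strength change (8.3): ratio of the smaller to the larger strength.
G = min(g, gF) ./ (max(g, gF) + eps);
% Orientation change (8.4): normalized relative orientation distance.
A = abs(abs(a - aF) - pi/2) / (pi/2);
% Sigmoid preservation values (8.5), (8.6) and their product.
Qg  = Gg ./ (1 + exp(kg * (G - sg)));
Qa  = Ga ./ (1 + exp(ka * (A - sa)));
Qxf = Qg .* Qa;
end
```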























9. SOFTWARE DESCRIPTION:


9.1 Introduction


MATLAB is a programming environment for algorithm development, data analysis, visualization, and numerical computation. Using MATLAB, you can solve technical computing problems faster than with traditional programming languages, such as C, C++, and Fortran. MATLAB is used in a wide range of applications, including signal and image processing, communications, control design, test and measurement, financial modeling and analysis, and computational biology.

9.2 Structures

MATLAB supports structure data types. Since all variables in MATLAB are arrays, a more accurate name is "structure array", where each element of the array has the same field names. In addition, MATLAB supports dynamic field names. Note, however, that the MATLAB JIT accelerator does not fully support structures, so simply bundling various variables into a structure can come at a performance cost.

9.3 Function handles

MATLAB supports elements of lambda calculus by introducing function handles, or function references, which are implemented either in .m files or as anonymous/nested functions.
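For illustration, a small fragment that combines a structure array, dynamic field names and an anonymous function handle might look like this.

```matlab
% Structure array: every element shares the same field names.
imgs(1).name = 'input1.png';  imgs(1).focus = 'left';
imgs(2).name = 'input2.png';  imgs(2).focus = 'right';

field = 'focus';
disp(imgs(2).(field));        % dynamic field name access

% Anonymous function handle, e.g. a pixel-wise select-max fusion rule.
selectMax = @(a, b) a .* (abs(a) >= abs(b)) + b .* (abs(a) < abs(b));
```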

9.4 MATLAB Fundamentals


- Working with the MATLAB user interface

- Entering commands and creating variables

- Performing analysis on vectors and matrices

- Visualizing vector and matrix data

- Working with data files

- Working with data types

- Automating commands with scripts

- Writing programs with logic and flow control




9.5 ADVANTAGES OF MATLAB:

Algorithm Development


Develop algorithms using the high-level language and development tools in
MATLAB.

Data Analysis


Analyze, visualize, and explore data with MATLAB.


Data Visualization


Visualize engineering and scientific data with a wide variety of plotting
functions in MATLAB.

Numeric Computation


Perform mathematical operations and analyze data with MATLAB functions.


Publishing and Deploying


Share your work by publishing MATLAB code from the Editor to HTML and
other formats.























10. RESULTS

Figure 10.1: Image fusion of input image 1 (focus on left part) and input image 2 (focus on right part) with the image averaging and wavelet fusion methods.

Figure 10.2: Image fusion of input image 1 (focus on left part) and input image 2 (focus on right part) with gradient-based image fusion.

Figure 10.3: Image fusion of input image 1 (CT image) and input image 2 (MRI image) with the wavelet fusion method.

Figure 10.4: Image fusion of input image 1 (CT image) and input image 2 (MRI image) with gradient-based image fusion.
11. CONCLUSION

This chapter summarizes and concludes the investigation of pixel-level image fusion presented in this report. A novel multiresolution signal-level image fusion method, whose architecture belongs to the same broad system class as the DWT, has been presented. The method uses an alternative gradient map image information representation and a new "fuse-then-decompose" approach within the framework of a novel, combined fusion/decomposition multiresolution architecture. Furthermore, the image information representation in the form of gradient map signals allows for reliable feature selection, realized here using cross-band information fusion. Thus, the proposed fusion system significantly reduces reconstruction error artefacts and the loss of contrast information, conditions which are commonly observed in conventional DWT-based fusion. The objective performance evaluation results demonstrate the superiority of gradient-based multiresolution image fusion with respect to more complex multiresolution fusion approaches.

Further Enhancement

The biggest effort required to take this work further is connected with the practical side of image fusion development, such as data gathering. Future work could use neural networks to identify objects in the fused images and fuzzy logic to generate a knowledge base that helps the physician diagnose the patient more effectively.




























12. REFERENCES








[1] A. Abd-el-Kader, H. El-Din Moustafa, and S. Rehan, "Performance measure for image fusion based on wavelet transform and curvelet transform," National Telecommunication Institute, April 26-28, 2011.

[2] A. Malviya and S. G. Bhirud, "Objective criterion for performance evaluation of image fusion techniques," International Journal of Computer Applications (0975-8887), vol. 1, no. 25, 2010.

[3] Yi Zheng-jun, Li Hua-feng, and Song Rui-jing, "Spatial frequency ratio image fusion method based on improved lifting wavelet transform," Opto-Electronic Engineering, vol. 36, no. 7, pp. 65-70, 2009.

[4] Oliver Rockinger's image fusion collection, http://www.imagefusion.org/, accessed 19 March 2010.

[5] Xu Kai-yu and Li Shuang-yi, "An image fusion algorithm based on wavelet transform," Infrared Technology, vol. 29, no. 8, pp. 455-458, 2007.

[6] Sun Yan-kui, Wavelet Analysis and Its Application, Machinery Industry Press, Beijing, 2005.

[7] M. I. Smith and J. P. Heather, "Review of image fusion technology in 2005," Proceedings of the SPIE, vol. 5782, pp. 29-45, 2005.

[8] I. De and B. Chanda, "A simple and efficient algorithm for multi-focus image fusion using morphological wavelets," Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, India, 2005.

[9] P. Hill, N. Canagarajah, and D. Bull, "Image fusion using complex wavelets," Dept. of Electrical and Electronic Engineering, University of Bristol, UK, 2002.

[10] A. Toet and J. Ijspeert, "Perceptual evaluation of different image fusion schemes," Proc. SPIE, vol. 4380, pp. 427-435, Aug. 2001.

[11] C. Xydeas and V. Petrović, "Objective image fusion performance measure," Electronics Letters, vol. 36, no. 4, pp. 308-309, Feb. 2000.

[12] H. Li, B. S. Manjunath, and S. K. Mitra, "Multisensor image fusion using the wavelet transform," Graphical Models and Image Processing, vol. 57, no. 3, pp. 235-245, 1995.

[13] D. Esteban and C. Galand, "Application of quadrature mirror filters to split band voice coding schemes," Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Hartford, May 1977, pp. 191-195.

[14] Z. Zhang, "Investigations of image fusion," www.eecs.lehigh.edu/SPCRL/spcrl.htm, Lehigh University, May 2000.

[15] P. Burt and R. Kolczynski, "Enhanced image capture through fusion," Proc. 4th International Conference on Computer Vision, Berlin, 1993, pp. 173-182.

[16] W. Hendee and P. Wells, The Perception of Visual Information, Springer, New York, 1997.

[17] J. Johnston, "A filter family designed for use in quadrature mirror filter banks," Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 1980, pp. 291-294.

[18] V. Petrović and C. Xydeas, "Multiresolution image fusion using cross band feature selection," Proc. SPIE, vol. 3719, pp. 319-326, Apr. 1999.

[19] M. Sonka, V. Hlavac, and R. Boyle, Image Processing, Analysis and Machine Vision, PWS Publishing, Pacific Grove, 1998.