IMAGE FUSION


INTRODUCTION
Image fusion is the process by which two or more images are combined into a single image that retains the important features from each of the original images. Fusion is often required for images of the same scene or objects acquired with different instrument modalities or capture techniques. Important applications of image fusion include medical imaging, microscopic imaging, remote sensing, computer vision, and robotics. Fusion techniques range from the simplest method of pixel averaging to more complicated methods such as principal component analysis and wavelet transform fusion. Several approaches to image fusion can be distinguished, depending on whether the images are fused in the spatial domain or are transformed into another domain and their transforms fused.

With the development of new imaging sensors arises the need for a meaningful combination of all employed imaging sources. The actual fusion process can take place at different levels of information representation; a generic categorization is to consider the levels, in ascending order of abstraction, as signal, pixel, feature and symbolic level. This work focuses on the so-called pixel-level fusion process, where a composite image has to be built from several input images. To date, the result of pixel-level image fusion is intended primarily to be presented to a human observer, especially in image sequence fusion (where the input data consists of image sequences). A possible application is the fusion of forward-looking infrared (FLIR) and low-light visible (LLTV) images obtained by an airborne sensor platform to aid a pilot navigating in poor weather conditions or darkness.

In pixel-level image fusion, some generic requirements can be imposed on the fusion result. The fusion process should preserve all relevant information of the input imagery in the composite image (pattern conservation). The fusion scheme should not introduce any artifacts or inconsistencies which would distract the human observer or subsequent processing stages. The fusion process should be shift and rotation invariant, i.e. the fusion result should not depend on the location or orientation of an object in the input imagery. In the case of image sequence fusion, the additional problem of temporal stability and consistency of the fused image sequence arises. The human visual system is primarily sensitive to moving light stimuli, so moving artifacts or time-dependent contrast changes introduced by the fusion process are highly distracting to the human observer. So, in the case of image sequence fusion, two additional requirements apply. Temporal stability: the fused image sequence should be temporally stable, i.e. gray-level changes in the fused sequence must only be caused by gray-level changes in the input sequences and must not be introduced by the fusion scheme itself. Temporal consistency: gray-level changes occurring in the input sequences must be present in the fused sequence without any delay or contrast change.

1.1 FUSION METHODS

1.1.1 Introduction
The following summarizes several approaches to the pixel-level fusion of spatially registered input images. Most of these methods have been developed for the fusion of stationary input images (such as multispectral satellite imagery). Due to the static nature of the input data, temporal aspects arising in the fusion of image sequences, e.g. stability and consistency, are not addressed. A generic categorization of image fusion methods is the following:
 linear superposition
 nonlinear methods
 optimization approaches
 artificial neural networks
 image pyramids
 wavelet transform
 generic multiresolution fusion scheme


1.1.2 Linear Superposition
Probably the most straightforward way to build a fused image from several input frames is to perform the fusion as a weighted superposition of all input frames. The optimal weighting coefficients, with respect to information content and redundancy removal, can be determined by a principal component analysis (PCA) of all input intensities. By performing a PCA of the covariance matrix of the input intensities, the weighting for each input frame is obtained from the eigenvector corresponding to the largest eigenvalue. A similar procedure is the linear combination of all inputs in a pre-chosen colorspace (e.g. R-G-B or H-S-V), leading to a false color representation of the fused image.

1.1.3 Nonlinear Methods
Another simple approach to image fusion is to build the fused image by the application of a simple nonlinear operator such as max or min. If the bright objects are of interest in all input images, a good choice is to compute the fused image by a pixel-by-pixel application of the maximum operator. An extension of this approach follows from the introduction of morphological operators such as opening or closing. One application is the use of conditional morphological operators, with the definition of highly reliable 'core' features present in both images and a set of 'potential' features present in only one source, where the actual fusion is performed by the application of conditional erosion and dilation operators. A further extension of this approach is image algebra, a high-level algebraic extension of image morphology designed to describe all image processing operations. The basic types defined in image algebra are value sets, coordinate sets (which allow the integration of different resolutions and tessellations), images and templates. For each basic type, binary and unary operations are defined which range from the basic set operations to more complex ones for the operations on images and templates.
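As a concrete illustration of these two simplest schemes, the following MATLAB sketch fuses two registered greyscale images by a PCA-weighted superposition and by the pixel-wise maximum operator. The file names are placeholders and the Image Processing Toolbox is assumed for image input; this is only an illustrative sketch, not an implementation prescribed by the methods above.

    % Hypothetical file names; any pair of registered, same-size greyscale images will do.
    A = im2double(imread('input1.png'));
    B = im2double(imread('input2.png'));

    % Linear superposition (Section 1.1.2): weights come from the eigenvector of the
    % 2x2 covariance matrix of the input intensities belonging to the largest eigenvalue.
    C = cov([A(:) B(:)]);
    [V, D] = eig(C);
    [~, idx] = max(diag(D));
    w = abs(V(:, idx));
    w = w / sum(w);                    % normalise so that the weights sum to one
    fusedPCA = w(1) * A + w(2) * B;

    % Nonlinear fusion (Section 1.1.3): pixel-by-pixel maximum keeps the brighter pixel.
    fusedMax = max(A, B);

    imshow([fusedPCA, fusedMax]);      % PCA-weighted result (left), maximum rule (right)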


1.1.4 Optimization Approaches
In this approach, the fusion task is expressed as a Bayesian optimization problem. Using the multisensor image data and an a-priori model of the fusion result, the goal is to find the fused image which maximizes the a-posteriori probability. Since this problem cannot be solved in general, some simplifications are introduced: all input images are modeled as Markov random fields to define an energy function which describes the fusion goal. Due to the equivalence of Gibbs random fields and Markov random fields, this energy function can be expressed as a sum of so-called clique potentials, where only pixels in a predefined neighborhood affect the actual pixel. The fusion task then consists of a maximization of the energy function. Since this energy function is in general non-convex, stochastic optimization procedures such as simulated annealing, or modifications like iterated conditional modes, are typically used.

1.1.5 Artificial Neural Networks
Inspired by the fusion of different sensor signals in biological systems, many researchers have employed artificial neural networks for pixel-level image fusion. The most popular example of the fusion of different imaging sensors in biological systems was described by Newman and Hartline in the 1980s: rattlesnakes (and the general family of pit vipers) possess so-called pit organs which are sensitive to thermal radiation through a dense network of nerve fibers. The output of these pit organs is fed to the optic tectum, where it is combined with the nerve signals obtained from the eyes. Newman and Hartline distinguished six different types of bimodal neurons merging the two signals based on a sophisticated combination of suppression and enhancement. Several researchers have modeled this fusion process.


1.1.6 Image Pyramids
Image pyramids were initially described for multiresolution image analysis and as a model of binocular fusion in human vision. A generic image pyramid is a sequence of images in which each image is constructed by low-pass filtering and subsampling its predecessor. Due to the subsampling, the image size is halved in both spatial directions at each level of the decomposition process, leading to a multiresolution signal representation. The difference between the input image and the filtered image is necessary to allow an exact reconstruction from the pyramidal representation. The image pyramid approach thus leads to a signal representation with two pyramids: the smoothing pyramid, containing the averaged pixel values, and the difference pyramid, containing the pixel differences, i.e. the edges. The difference pyramid can therefore be viewed as a multiresolution edge representation of the input image. The actual fusion process can be described by a generic multiresolution fusion scheme which is applicable both to image pyramids and to the wavelet approach. There are several modifications of the generic pyramid construction method described above. Some authors propose the computation of nonlinear pyramids, such as the ratio and contrast pyramids, where the multiscale edge representation is computed by a pixel-by-pixel division of neighboring resolutions. A further modification is to substitute the linear filters by morphological nonlinear filters, resulting in the morphological pyramid. Another type of image pyramid, the gradient pyramid, results if the input image is decomposed into its directional edge representation using directional derivative filters.
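The two-pyramid representation just described can be sketched in a few lines of MATLAB. The Gaussian kernel size, the number of levels and the Image Processing Toolbox test image are illustrative assumptions; the reconstruction loop simply reverses the decomposition and recovers the input exactly.

    img = im2double(imread('cameraman.tif'));     % any greyscale image (illustrative test image)
    nLevels = 3;
    smoothPyr = cell(nLevels + 1, 1);             % smoothing pyramid (averaged pixel values)
    diffPyr   = cell(nLevels, 1);                 % difference pyramid (edges)
    smoothPyr{1} = img;
    for k = 1:nLevels
        low = imfilter(smoothPyr{k}, fspecial('gaussian', 5, 1), 'replicate');  % low-pass filter
        smoothPyr{k+1} = low(1:2:end, 1:2:end);                                 % subsample by two
        up = imresize(smoothPyr{k+1}, size(smoothPyr{k}), 'bilinear');          % expand back
        diffPyr{k} = smoothPyr{k} - up;           % difference = multiresolution edge image
    end
    % Exact reconstruction: start at the coarsest level and add the differences back.
    rec = smoothPyr{nLevels + 1};
    for k = nLevels:-1:1
        rec = imresize(rec, size(diffPyr{k}), 'bilinear') + diffPyr{k};
    end
    max(abs(rec(:) - img(:)))                     % zero, i.e. perfect reconstruction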

1.1.7 Wavelet Transform
A signal analysis method similar to image pyramids is the discrete wavelet transform. The main difference is that, while image pyramids lead to an overcomplete set of transform coefficients, the wavelet transform results in a nonredundant image representation. The discrete two-dimensional wavelet transform is computed by the recursive application of low-pass and high-pass filters in each direction of the input image (i.e. rows and columns), followed by subsampling. Details of this scheme can be found in the reference section. One major drawback of the wavelet transform when applied to image fusion is its well-known shift dependency, i.e. a simple shift of the input signal may lead to completely different transform coefficients. This results in inconsistent fused images when invoked in image sequence fusion. To overcome the shift dependency of the wavelet fusion scheme, the input images must be decomposed into a shift-invariant representation. There are several ways to achieve this. The straightforward way is to compute the wavelet transform for all possible circular shifts of the input signal; in this case not all shifts are necessary, and it is possible to develop an efficient computation scheme for the resulting wavelet representation. Another simple approach is to drop the subsampling in the decomposition process and instead modify the filters at each decomposition level, resulting in a highly redundant signal representation. The actual fusion process can be described by a generic multiresolution fusion scheme which is applicable both to image pyramids and to the wavelet approach.

1.1.8 Generic Multiresolution Fusion Scheme
The basic idea of the generic multiresolution fusion scheme is motivated by the fact that the human visual system is primarily sensitive to local contrast changes, i.e. edges. Given this insight, and bearing in mind that both image pyramids and the wavelet transform result in a multiresolution edge representation, it is straightforward to build the fused image as a fused multiscale edge representation. The fusion process is summarized as follows: in the first step, the input images are decomposed into their multiscale edge representation, using either an image pyramid or a wavelet transform. The actual fusion takes place in the difference (resp. wavelet) domain, where the fused multiscale representation is built by a pixel-by-pixel selection of the coefficients with maximum magnitude. Finally, the fused image is computed by applying the appropriate reconstruction scheme.
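One possible realisation of this scheme with the MATLAB Wavelet Toolbox is sketched below, using a single-level dwt2 decomposition, the maximum-magnitude selection rule for the detail bands and simple averaging for the residual approximation band. The wavelet name and file names are illustrative choices, not values fixed by the scheme.

    A = im2double(imread('imgA.png'));            % hypothetical registered source images
    B = im2double(imread('imgB.png'));
    wname = 'db2';                                % illustrative wavelet choice

    [aA, hA, vA, dA] = dwt2(A, wname);            % multiscale edge representation of A
    [aB, hB, vB, dB] = dwt2(B, wname);            % multiscale edge representation of B

    % Pixel-by-pixel selection of the coefficient with maximum magnitude.
    pick = @(x, y) x .* (abs(x) >= abs(y)) + y .* (abs(x) < abs(y));
    fH = pick(hA, hB);  fV = pick(vA, vB);  fD = pick(dA, dB);
    fA = (aA + aB) / 2;                           % approximation band: simple averaging

    fused = idwt2(fA, fH, fV, fD, wname);         % appropriate reconstruction scheme
    imshow(fused, []);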


Fig. 1 Block Diagram Of Basic Image Fusion Process


AIM OF THE PROJECT
2.1 NEW IMAGE FUSION ALGORITHM
This work adopts the multiresolution discrete wavelet frame transform and a fuzzy region-feature fusion scheme to implement the selection of the source-image wavelet coefficients. Fig. 1 shows the framework of the proposed image fusion algorithm. The first step is to choose as object image the image that reflects the object and background more clearly than the other image. The second step is to decompose the source images into a multiresolution representation; the low-frequency band obtained at each level is decomposed further at the next level. The low-frequency bands of the object image are segmented into region images. The third step is to define the attributes of the regions by some region features, such as the mean grey level in a region; in this way each pixel obtains a membership value. Then, using a region fusion scheme for the chosen attribute combined with the membership value of each pixel, the multiresolution representation of the fusion result is obtained through a defuzzification process. The final step is to apply the inverse discrete wavelet frame transform, which yields the final fusion result.
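A much-simplified MATLAB sketch of this region-feature idea is given below, with a single-level stationary wavelet transform (swt2) standing in for the discrete wavelet frame transform. The file names, the db2 wavelet, the threshold-based segmentation and the way the region mean is turned into a membership value are all illustrative assumptions, not the exact scheme of the proposed algorithm.

    obj = im2double(imread('object.png'));        % hypothetical "object" image (clearer scene)
    oth = im2double(imread('other.png'));         % hypothetical second source image
    wname = 'db2';                                % image sides must be even for swt2

    [Ao, Ho, Vo, Do] = swt2(obj, 1, wname);       % wavelet-frame (undecimated) decomposition
    [At, Ht, Vt, Dt] = swt2(oth, 1, wname);

    % Segment the low-frequency band of the object image into regions and use the
    % mean grey level of each region as its attribute.
    An     = mat2gray(Ao);
    labels = bwlabel(An > graythresh(An));        % crude two-class segmentation
    stats  = regionprops(labels, Ao, 'MeanIntensity');
    means  = [stats.MeanIntensity];
    memb   = (means - min(means)) ./ (max(means) - min(means) + eps);   % region memberships

    mu = zeros(size(Ao));                         % per-pixel membership value
    for r = 1:numel(stats)
        mu(labels == r) = memb(r);
    end

    % "Defuzzified" combination: membership-weighted blend of the approximation bands,
    % maximum-magnitude selection for the detail bands, then inverse transform.
    pick  = @(x, y) x .* (abs(x) >= abs(y)) + y .* (abs(x) < abs(y));
    fused = iswt2(mu .* Ao + (1 - mu) .* At, ...
                  pick(Ho, Ht), pick(Vo, Vt), pick(Do, Dt), wname);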


The fusion of images is the process of combining two or more images into a single image that retains the important features from each. Fusion is an important technique in many disparate fields such as remote sensing, robotics and medical applications. Wavelet-based fusion techniques have been reasonably effective in combining perceptually important image features. Shift invariance of the wavelet transform is important in ensuring robust subband fusion. Therefore the novel application of the shift-invariant and directionally selective Dual-Tree Complex Wavelet Transform (DT-CWT) to image fusion is introduced here. This technique provides improved qualitative and quantitative results compared to previous wavelet fusion methods.


The goals of this project have been the following. One goal has been to compile an introduction to the subject of image fusion: there exist a number of studies on various algorithms, but complete treatments at a technical level are not as common, so material from papers, journals and conference proceedings that best describes the various parts is used. Another goal has been to search for algorithms that can be implemented for image fusion in various applications. A third goal is to evaluate their performance with different image quality metrics; these metrics were chosen because they have the greatest impact on the assessment of image fusion algorithms. A final goal has been to design and implement the wavelet-based fuzzy and neural approaches using MATLAB.

2.2 SCOPE OF THE PROJECT

2.2.1 DWT versus DT-CWT
Figures 2.1(a) and 2.1(b) show a pair of multifocus test images that were fused for a closer comparison of the DWT and DT-CWT methods. Figures 2.1(d) and 2.1(e) show the results of a simple MS method using the DWT and DT-CWT, respectively. These results are clearly superior to the simple pixel-averaging result shown in 2.1(c). They both retain a perceptually acceptable combination of the two "in focus" areas from each input image. An edge fusion result is also shown for comparison (Figure 2.1(f)) [8]. Upon closer inspection, however, there are residual ringing artefacts in the DWT fused image that are not found in the DT-CWT fused image. Using more sophisticated coefficient fusion rules (such as WBV or WA), the DWT and DT-CWT results were much more difficult to distinguish. However, the above comparison using a simple MS method reflects the ability of the DT-CWT to retain edge details without ringing.


Figure 2.1: (a) First image of the multifocus test set. (b) Second image of the multi focus test set. (c) Fused image using average pixel values. (d) Fused image using DWT with an MS fuse rule. (e) Fused image using DT-CWT with an MS fuse rule. (f) Fused image using multiscale edge fusion (point representations).


2.2.2 Quantitative Comparisons
Often the perceptual quality of the resulting fused image is of prime importance. In these circumstances, comparisons of quantitative quality can often be misleading or meaningless. However, a few authors [1, 7, 10] have attempted to generate such measures for applications where their meaning is clearer. Figures 2.1(a) and 2.1(b) reflect such an application: fusion of two images of differing focus to produce an image of maximum focus. Firstly, a "ground truth" image needs to be created that can be quantitatively compared to the fusion result images. This is produced using a simple cut-and-paste technique, physically taking the "in focus" areas from each image and combining them. The quantitative measure used to compare the cut-and-paste image to each fused image was taken from [1].

Figure 2.2: (a) First image (MR) of the medical test set. (b) Second image (CT) of the medical test set. (c) Fused image using average pixel values. (d) Fused image using DWT with an MS fuse rule. (e) Fused image using DT-CWT with an MS fuse rule. (f) Fused image using multiscale edge fusion (point representations).

where Igt is the cut-and-paste "ground truth" image, Ifd is the fused image and N is the size of the image. Lower values of the measure indicate greater similarity between the images Igt and Ifd, and therefore more successful fusion in terms of quantitatively measurable similarity. Table 1 shows the results for the various methods used. The average pixel value method gives a baseline result; the PCA method gave an equivalent but slightly worse result. These methods have poor results relative to the others, which was expected as they have no scale selectivity. Results were obtained for the DWT methods using all the biorthogonal wavelets available within the MATLAB (5.0) Wavelet Toolbox. Similarly, results were obtained for the DT-CWT methods using all the shift-invariant wavelet filters described in [3]. Results were also calculated for the SIDWT using the Haar wavelet and the bior2.2 wavelet. Table 1 shows the best results over all filters for each method. For all filters, the DWT results were worse than their DT-CWT equivalents. Similarly, all the DWT results were worse than their SIDWT equivalents. This demonstrates the importance of shift invariance in wavelet transform fusion. The DT-CWT results were also better than the equivalent results using the SIDWT, indicating the improvement gained from the added directional selectivity of the DT-CWT over the SIDWT. The WBV and WA methods performed better than MS with equivalent transforms, as expected, with WBV performing best in both cases. All of the wavelet transform results were decomposed to four levels. In addition, the residual low-pass images were fused using simple averaging, and the window for the WA and WBV methods was set to 3×3.


Table 2.1: Quantitative results for various fusion methods.

2.3 EFFECT OF WAVELET FILTER CHOICE FOR DWT AND DT-CWT BASED FUSION
There are many different choices of filters with which to implement the DWT. In order not to introduce phase distortions, using filters with a linear phase response is a sensible choice; to retain the perfect reconstruction property, this necessitates the use of biorthogonal filters. MS fusion results were compared for all the images in Figures 2.1 and 2.2 using all the biorthogonal filters included in the MATLAB (5.0) Wavelet Toolbox. Likewise, there are also many different choices of filters with which to implement the DT-CWT; MS fusion results were compared for the same image pairs using all the specially designed filters given in [3]. Qualitatively, all the DWT results gave more ringing artifacts than the equivalent DT-CWT results. Different choices of DWT filters gave ringing artifacts at different image locations and scales, whereas the choice of filters for the DT-CWT did not seem to alter or move the ringing artifacts found within the fused images. The perceived higher quality of the DT-CWT fusion results compared to the DWT fusion results was also reflected by a quantitative comparison.


WAVELET TRANSFORM OVERVIEW
3.1 WAVELET TRANSFORM
Wavelets are mathematical functions, defined over a finite interval and having an average value of zero, that transform data into different frequency components and represent each component with a resolution matched to its scale. The basic idea of the wavelet transform is to represent an arbitrary function as a superposition of a set of such wavelets or basis functions. These basis functions, or baby wavelets, are obtained from a single prototype wavelet called the mother wavelet by dilations or contractions (scaling) and translations (shifts). They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes. Many new wavelet applications, such as image compression, turbulence, human vision, radar and earthquake prediction, have been developed in recent years. In the wavelet transform the basis functions are wavelets. Wavelets tend to be irregular and asymmetric. All wavelet functions, w(2^k t − m), are derived from a single mother wavelet, w(t). This wavelet is a small wave or pulse like the one shown in Fig. 3.1.

Fig. 3.1 Mother wavelet w(t)

Normally the mother wavelet starts at time t = 0 and ends at t = T. The shifted wavelet w(t − m) starts at t = m and ends at t = m + T. The scaled wavelets w(2^k t) start at t = 0 and end at t = T/2^k; their graphs are w(t) compressed by the factor 2^k, as shown in Fig. 3.2. For example, when k = 1 the wavelet is shown in Fig. 3.2(a); for k = 2 and 3 it is shown in (b) and (c), respectively.


Fig. 3.2 Scaled wavelets: (a) w(2t), (b) w(4t), (c) w(8t)

The wavelets are called orthogonal when their inner products are zero. The smaller the scaling factor is, the wider the wavelet is. Wide wavelets are comparable to low-frequency sinusoids, and narrow wavelets are comparable to high-frequency sinusoids.

3.1.1 Scaling
Wavelet analysis produces a time-scale view of a signal. Scaling a wavelet simply means stretching (or compressing) it. The scale factor, often denoted by the letter a, is used to express the compression of wavelets. The smaller the scale factor, the more "compressed" the wavelet. The scale is inversely related to the frequency of the signal in wavelet analysis.

3.1.2 Shifting
Shifting a wavelet simply means delaying (or hastening) its onset. Mathematically, delaying a function f(t) by k is represented by f(t − k); the schematic is shown in Fig. 3.3.

Fig. 3.3 Shifted wavelets: (a) wavelet function Ψ(t), (b) shifted wavelet function Ψ(t − k)


3.1.3 Scale and Frequency
The higher scales correspond to the most "stretched" wavelets. The more stretched the wavelet, the longer the portion of the signal with which it is being compared, and thus the coarser the signal features being measured by the wavelet coefficients. The relation between scale and frequency is shown in Fig. 3.4.

Fig. 3.4 Scale and frequency (low scale versus high scale)

Thus, there is a correspondence between wavelet scales and frequency as revealed by wavelet analysis:
• Low scale a: compressed wavelet; rapidly changing details; high frequency.
• High scale a: stretched wavelet; slowly changing, coarse features; low frequency.

3.2 DISCRETE WAVELET TRANSFORM
Calculating wavelet coefficients at every possible scale is a fair amount of work, and it generates an awful lot of data. If the scales and positions are chosen based on powers of two, the so-called dyadic scales and positions, then calculating the wavelet coefficients is efficient and just as accurate. This is obtained from the discrete wavelet transform (DWT).

3.2.1 One-Stage Filtering
For many signals, the low-frequency content is the most important part; it is the identity of the signal. The high-frequency content, on the other hand, imparts detail to the signal. In wavelet analysis, the approximations and details are obtained after filtering. The approximations are the high-scale, low-frequency components of the signal.

The details are the low-scale, high-frequency components. The filtering process is schematically represented in Fig. 3.5.

Fig. 3.5 Single-stage filtering

The original signal, S, passes through two complementary filters and emerges as two signals. Unfortunately, this may result in a doubling of samples; to avoid this, downsampling is introduced. The process on the right, which includes downsampling, produces the DWT coefficients. The schematic diagram with real signals inserted is shown in Fig. 3.6.

Fig. 3.6 Decomposition and decimation
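The single-stage scheme can be tried directly with the Wavelet Toolbox; the test signal and the db2 wavelet below are arbitrary illustrative choices.

    t = linspace(0, 1, 256);
    S = sin(2*pi*5*t) + 0.2*randn(1, 256);        % a noisy, mostly low-frequency test signal
    [cA, cD] = dwt(S, 'db2');                     % approximation (low-pass) and detail (high-pass)
    % After filtering and downsampling, each branch holds roughly half the samples.
    fprintf('signal: %d samples, cA: %d, cD: %d\n', numel(S), numel(cA), numel(cD));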

3.2.2 Multiple-Level Decomposition
The decomposition process can be iterated, with successive approximations being decomposed in turn, so that one signal is broken down into many lower-resolution components. This is called the wavelet decomposition tree and is depicted in Fig. 3.7.

Fig. 3.7 Multilevel decomposition

3.2.3 Wavelet Reconstruction
The reconstruction of the image is achieved by the inverse discrete wavelet transform (IDWT). The values are first upsampled and then passed to the filters. This is represented as shown in Fig. 3.8.

Fig. 3.8 Wavelet reconstruction

Wavelet analysis involves filtering and downsampling, whereas the wavelet reconstruction process consists of upsampling and filtering. Upsampling is the process of lengthening a signal component by inserting zeros between samples, as shown in Fig. 3.9.


Fig. 3.9 Reconstruction using upsampling

3.2.4 Reconstructing Approximations and Details
It is possible to reconstruct the original signal from the coefficients of the approximations and details. The process yields a reconstructed approximation which has the same length as the original signal and which is a real approximation of it. The reconstructed details and approximations are true constituents of the original signal. Since the details and approximations are produced by downsampling and are only half the length of the original signal, they cannot be directly combined to reproduce the signal; it is necessary to reconstruct the approximations and details before combining them. The reconstructed signal is schematically represented in Fig. 3.10.

Fig. 3.10 Reconstructed signal components
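The corresponding reconstruction step can be checked with idwt: reconstructing from the approximation coefficients alone (details set to zero) gives the full-length approximation A1, the details alone give D1, and their sum returns the original signal. Again the signal and wavelet are illustrative choices only.

    S = cos(2*pi*3*linspace(0, 1, 200)) + 0.1*randn(1, 200);   % illustrative test signal
    [cA, cD] = dwt(S, 'db2');
    A1 = idwt(cA, zeros(size(cD)), 'db2', numel(S));   % upsample + filter the approximation
    D1 = idwt(zeros(size(cA)), cD, 'db2', numel(S));   % upsample + filter the detail
    max(abs(S - (A1 + D1)))                            % near machine precision: A1 + D1 = S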

3.2.5 1-D Wavelet Transform
The generic form of a one-dimensional (1-D) wavelet transform is shown in Fig. 3.11. Here a signal is passed through a lowpass and a highpass filter, h and g respectively, and then downsampled by a factor of two, constituting one level of the transform.

Fig. 3.11 1-D wavelet decomposition

Repeating the filtering and decimation process on the lowpass branch outputs only creates multiple levels or "scales" of the wavelet transform. The process is typically carried out for a finite number of levels K, and the resulting coefficients are called wavelet coefficients. The one-dimensional forward wavelet transform is defined by a pair of filters s and t that are convolved with the data at either the even or the odd locations. The filters s and t used for the forward transform are called analysis filters:

    l_i = Σ_{j=-nL}^{nL} s_j x_{2i+j}    and    h_i = Σ_{j=-nH}^{nH} t_j x_{2i+1+j}

Although l and h are two separate output streams, together they have the same total number of coefficients as the original data. The output stream l, commonly referred to as the low-pass data, may then have the identical process applied again repeatedly. The other output stream, h (or high-pass data), generally remains untouched. The inverse process expands the two separate low- and high-pass data streams by inserting zeros between every other sample, convolves the resulting data streams with two new synthesis filters s' and t', and adds them together to regenerate the original, double-size data stream:

    y_i = Σ_{j=-nL}^{nL} s'_j l'_{i+j} + Σ_{j=-nH}^{nH} t'_j h'_{i+j}

where l'_{2i} = l_i, l'_{2i+1} = 0, h'_{2i+1} = h_i and h'_{2i} = 0.

To meet the definition of a wavelet transform, the analysis and synthesis filters s, t, s' and t' must be chosen so that the inverse transform perfectly reconstructs the original data. Since the wavelet transform maintains the same number of coefficients as the original data, the transform itself does not provide any compression. However, the structure provided by the transform and the expected values of the coefficients give a form that is much more amenable to compression than the original data. Since the filters s, t, s' and t' are chosen to be perfectly invertible, the wavelet transform itself is lossless; later application of the quantization step will cause some data loss and can be used to control the degree of compression.

The forward wavelet-based transform uses a 1-D subband decomposition process: a 1-D set of samples is converted into the low-pass subband (Li) and the high-pass subband (Hi). The low-pass subband represents a downsampled, low-resolution version of the original image. The high-pass subband represents the residual information of the original image, needed for the perfect reconstruction of the original image from the low-pass subband.

3.3 2-D TRANSFORM HIERARCHY
The 1-D wavelet transform can be extended to a two-dimensional (2-D) wavelet transform using separable wavelet filters. With separable filters, the 2-D transform can be computed by applying a 1-D transform to all the rows of the input and then repeating on all of the columns.
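A literal, self-contained reading of this separable scheme is sketched below: the 1-D analysis filters s (low-pass) and t (high-pass) are applied to the rows, the result is downsampled by two, and the process is repeated on the columns. Haar filters and the Image Processing Toolbox test image are used purely for illustration.

    X = im2double(imread('cameraman.tif'));       % illustrative test image
    s = [1  1] / sqrt(2);                         % Haar analysis low-pass (illustrative choice)
    t = [1 -1] / sqrt(2);                         % Haar analysis high-pass (illustrative choice)

    rowL = conv2(X, s, 'same');  rowL = rowL(:, 2:2:end);   % low-pass along rows, downsample
    rowH = conv2(X, t, 'same');  rowH = rowH(:, 2:2:end);   % high-pass along rows, downsample

    LL = conv2(rowL, s', 'same');  LL = LL(2:2:end, :);     % low-low (low-resolution subband)
    LH = conv2(rowL, t', 'same');  LH = LH(2:2:end, :);     % low-high (horizontal subband)
    HL = conv2(rowH, s', 'same');  HL = HL(2:2:end, :);     % high-low (vertical subband)
    HH = conv2(rowH, t', 'same');  HH = HH(2:2:end, :);     % high-high (diagonal subband)

    imshow([mat2gray(LL) mat2gray(HL); mat2gray(LH) mat2gray(HH)]);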

Fig. 3.12 Subband labeling scheme for a one-level, 2-D wavelet transform (subbands LL1, HL1, LH1, HH1)

The subband notation for a one-level (K = 1) 2-D wavelet transform of the original image is shown in Fig. 3.12. The example is repeated for a three-level (K = 3) wavelet expansion in Fig. 3.13. In all of the discussion, K represents the highest level of the decomposition of the wavelet transform.

Fig. 3.13 Subband labeling scheme for a three-level, 2-D wavelet transform

The 2-D subband decomposition is just an extension of the 1-D subband decomposition. The entire process is carried out by executing the 1-D subband decomposition twice, first in one direction (horizontal), then in the orthogonal (vertical) direction. For example, the low-pass subband (Li) resulting from the horizontal direction is further decomposed in the vertical direction, leading to the LLi and LHi subbands. Similarly, the high-pass subband (Hi) is further decomposed into HLi and HHi. After one level of transform, the image can be further decomposed by applying the 2-D subband decomposition to the existing LLi subband. This iterative process results in multiple "transform levels". In Fig. 3.13 the first level of transform results in LH1, HL1 and HH1, in addition to LL1, which is further decomposed into LH2, HL2, HH2 and LL2 at the second level, and the information of LL2 is used for the third-level transform. The subband LLi is a low-resolution subband, and the high-pass subbands LHi, HLi and HHi are the horizontal, vertical and diagonal subbands respectively, since they represent the horizontal, vertical and diagonal residual information of the original image. An example of a three-level decomposition into subbands of the image CASTLE is illustrated in Fig. 3.14.


Fig. 3.14 The process of 2-D wavelet transform applied through three transform levels
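The same iterative decomposition is available directly in the Wavelet Toolbox; the sketch below performs a three-level (K = 3) decomposition and extracts the coarsest approximation and the finest detail subbands. The wavelet and test image are illustrative choices.

    X = im2double(imread('cameraman.tif'));       % illustrative test image
    [C, S] = wavedec2(X, 3, 'db2');               % C: coefficient vector, S: bookkeeping matrix
    LL3 = appcoef2(C, S, 'db2', 3);               % low-resolution subband at the coarsest level
    [H1, V1, D1] = detcoef2('all', C, S, 1);      % horizontal, vertical, diagonal details, level 1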

To obtain a two-dimensional wavelet transform, the one-dimensional transform is applied first along the rows and then along the columns to produce four subbands: low-resolution, horizontal, vertical, and diagonal. (The vertical subband is created by applying a horizontal high-pass, which yields vertical edges.) At each level, the wavelet transform can be reapplied to the low-resolution subband to further decorrelate the image. Fig. 3.15 illustrates the image decomposition, defining the level and subband conventions used in the AWIC algorithm. The final configuration contains a small low-resolution subband. In addition to the various transform levels, the phrase "level 0" is used to refer to the original image data. When the user requests zero levels of transform, the original image data (level 0) is treated as a low-pass band and processing follows its natural flow.

Fig. 3.15 Image decomposition using wavelets (low-resolution subband, horizontal subband LH, vertical subband HL and diagonal subband HH across the transform levels)

The wavelet transform is first performed on each source image, then a fusion decision map is generated based on a set of fusion rules. The fused wavelet coefficient map can be constructed from the wavelet coefficients of the source images according to the fusion decision map. Finally, the fused image is obtained by performing the inverse wavelet transform. From the above, it can be seen that the fusion rules play a very important role in the fusion process. Some frequently used fusion rules from previous work are the following.


When constructing each wavelet coefficient for the fused image, we have to determine which source image describes this coefficient better. This information is kept in the fusion decision map. The fusion decision map has the same size as the original image, and each value is the index of the source image which may be more informative for the corresponding wavelet coefficient; thus, a decision is actually made on each coefficient. There are two frequently used methods in previous research. In order to make the decision on one of the coefficients of the fused image, one way is to consider only the corresponding coefficients in the source images; this is called the pixel-based fusion rule. The other way is to consider not only the corresponding coefficients but also their close neighbors, say a 3x3 or 5x5 window; this is called the window-based fusion rule. This method exploits the fact that there is usually high correlation among neighboring pixels. In our research, we consider that objects carry the information of interest, and that each pixel or small neighborhood of pixels is just one part of an object. Thus, we propose a region-based fusion scheme: when making the decision on each coefficient, we consider not only the corresponding coefficients and their close neighborhood, but also the regions the coefficients belong to. We regard the regions as representing the objects of interest. More details of the scheme are provided in the following.
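A sketch of the window-based rule mentioned above, for one detail subband, is given below: the local activity of each coefficient is measured as the 3x3 average of absolute values in each source, and the decision map records which source wins. The file names, wavelet and activity measure are illustrative assumptions.

    A = im2double(imread('imgA.png'));  B = im2double(imread('imgB.png'));   % hypothetical sources
    [~, cH1] = dwt2(A, 'db2');          [~, cH2] = dwt2(B, 'db2');           % one detail subband each

    w = ones(3) / 9;                                   % 3x3 averaging window
    act1 = conv2(abs(cH1), w, 'same');                 % local activity in source 1
    act2 = conv2(abs(cH2), w, 'same');                 % local activity in source 2
    decision = act1 >= act2;                           % decision map: 1 -> source 1, 0 -> source 2
    fusedH = cH1 .* decision + cH2 .* ~decision;       % coefficients selected according to the map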


3.4 PROPOSED SCHEME
A neural network and fuzzy logic approach can be used for sensor fusion. Such a sensor fusion belongs to a class of fusion in which the features are the input and the decision is the output. Sensor fusion can be achieved with the help of neuro-fuzzy or fuzzy systems, and the system can be trained from the input data obtained from the sensors. The basic concept is to associate the given sensory inputs with some decision outputs. After developing the system, another group of input data is used to evaluate its performance. The following algorithm and .M file for pixel-level image fusion using fuzzy logic illustrate the process of defining membership functions and rules for the image fusion process using the FIS (Fuzzy Inference System) editor of the Fuzzy Logic Toolbox in MATLAB; a sketch after Step 6 below puts the main calls together.

3.5 PROPOSED ALGORITHM

STEP 1
 Read the first image into variable M1 and find its size (rows: z1, columns: s1).
 Read the second image into variable M2 and find its size (rows: z2, columns: s2).
 Variables M1 and M2 are images in matrix form where each pixel value is in the range 0-255. Use a gray colormap.
 Compare the rows and columns of both input images. If the two images are not of the same size, select the portions which are of the same size.

STEP 2
 Apply wavelet decomposition and form spatial decomposition trees.
 Convert the images into column form, which has C = z1*s1 entries.


STEP 3
Create a fuzzy inference system of type Mamdani with the following specification:
Name: 'c7'
Type: 'mamdani'
AndMethod: 'min'
OrMethod: 'max'
DefuzzMethod: 'centroid'
ImpMethod: 'min'
AggMethod: 'max'

STEP 4
 Decide the number and type of membership functions for both input images by tuning the membership functions.
 The input images in the antecedent are resolved to a degree of membership ranging from 0 to 255.
 Make rules for the input images which resolve the two antecedents to a single number from 0 to 255.


STEP 5
For num = 1 to C in steps of one, apply fuzzification using the rules developed above on the corresponding pixel values of the input images; this gives a fuzzy set represented by a membership function and results in the output image in column format.

Check the rules using rule viewer and surface viewer


STEP 6 Convert the column form to matrix form and display the fused image.
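The sketch below puts Steps 3-6 together using the classic Fuzzy Logic Toolbox API (newfis/addvar/addmf/addrule/evalfis; newer MATLAB releases use mamfis, addInput and evalfis(fis, input) instead). The file names, membership functions and rules are deliberately simple placeholders, and the wavelet decomposition of Step 2 is omitted for brevity, so this is only an outline of the procedure, not the tuned system described above.

    M1 = double(imread('imgA.png'));  M2 = double(imread('imgB.png'));   % placeholder greyscale images, 0-255
    [z1, s1] = size(M1);
    in = [reshape(M1, [], 1), reshape(M2, [], 1)];     % column form with C = z1*s1 rows

    % Step 3: Mamdani FIS with the specification listed above.
    fis = newfis('c7', 'mamdani', 'min', 'max', 'min', 'max', 'centroid');

    % Step 4: membership functions over 0-255 for both inputs and the output (placeholders).
    fis = addvar(fis, 'input',  'img1',  [0 255]);
    fis = addmf(fis, 'input',  1, 'dark',   'trimf', [0 0 255]);
    fis = addmf(fis, 'input',  1, 'bright', 'trimf', [0 255 255]);
    fis = addvar(fis, 'input',  'img2',  [0 255]);
    fis = addmf(fis, 'input',  2, 'dark',   'trimf', [0 0 255]);
    fis = addmf(fis, 'input',  2, 'bright', 'trimf', [0 255 255]);
    fis = addvar(fis, 'output', 'fused', [0 255]);
    fis = addmf(fis, 'output', 1, 'dark',   'trimf', [0 0 255]);
    fis = addmf(fis, 'output', 1, 'bright', 'trimf', [0 255 255]);

    % Rules: [img1 img2 fused weight connective], connective 1 = AND, 2 = OR.
    fis = addrule(fis, [1 1 1 1 1;     % both dark         -> dark
                        2 2 2 1 1;     % both bright       -> bright
                        1 2 2 1 2;     % either one bright -> bright
                        2 1 2 1 2]);

    % Steps 5 and 6: fuzzify every pixel pair (slow for large images), reshape and display.
    out   = evalfis(in, fis);
    fused = reshape(out, z1, s1);
    imshow(uint8(fused));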

3.7 ALGORITHM USING NEURO FUZZY


STEP 1
 Read the first image into variable M1 and find its size (rows: z1, columns: s1).
 Read the second image into variable M2 and find its size (rows: z2, columns: s2).
 Variables M1 and M2 are images in matrix form where each pixel value is in the range 0-255. Use a gray colormap.
 Compare the rows and columns of both input images. If the two images are not of the same size, select the portions which are of the same size.

STEP 2
 Apply wavelet decomposition and form spatial decomposition trees.
 Convert the images into column form, which has C = z1*s1 entries.

STEP 3
 Form the training data, which is a matrix with three columns whose entries in each column run from 0 to 255 in steps of 1.
 Form the check data, which is a matrix of the pixels of the two input images in column format.
 Decide the number and type of membership functions.
 Create a fuzzy inference system of type Mamdani with the following specification:
Name: 'c7'
Type: 'mamdani'
AndMethod: 'min'
OrMethod: 'max'
DefuzzMethod: 'centroid'
ImpMethod: 'min'
AggMethod: 'max'

STEP 4
 Decide the number and type of membership functions for both input images by tuning the membership functions.
 The input images in the antecedent are resolved to a degree of membership ranging from 0 to 255.
 Make rules for the input images which resolve the two antecedents to a single number from 0 to 255.


STEP 5
For num = 1 to C in steps of one, apply fuzzification using the rules developed above on the corresponding pixel values of the input images; this gives a fuzzy set represented by a membership function and results in the output image in column format.

STEP 6
 Start training the generated fuzzy inference system with ANFIS using the training data.
 Apply fuzzification using the trained system and the check data.
 Convert the column form to matrix form and display the fused image (a sketch follows this step).
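A corresponding sketch of the neuro-fuzzy variant is given below, again with the classic toolbox API. Because anfis works on Sugeno-type systems, the initial FIS is generated from the training data with genfis1 rather than built by hand; the training data, number of membership functions, epoch count and file names are illustrative placeholders.

    % Step 3: training data (three columns, grey levels 0-255) and check data (pixel pairs).
    g = (0:255)';
    trnData = [g, g, g];                           % input1, input2, target

    M1 = double(imread('imgA.png'));  M2 = double(imread('imgB.png'));   % placeholder images
    [z1, s1] = size(M1);
    chkData = [reshape(M1, [], 1), reshape(M2, [], 1)];

    % Step 6: train with ANFIS and apply the trained system to the check data.
    initFis    = genfis1(trnData, 2, 'gbellmf');   % 2 bell-shaped MFs per input
    trainedFis = anfis(trnData, initFis, 20);      % 20 training epochs

    out   = evalfis(chkData, trainedFis);
    fused = reshape(out, z1, s1);
    imshow(uint8(fused));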


QUANTITATIVE COMPARISONS
4.1 PERFORMANCE EVALUATION OF FUSION
It has been common to evaluate the result of fusion visually. In visual evaluation, human judgment determines the quality of the image: some independent and objective observers grade the corresponding image, and the final grade is obtained by taking the average or weighted mean of the individual grades. Obviously this evaluation method has drawbacks, namely that it is not accurate and depends on the observer's experience. For an accurate and truthful assessment of the fusion product, a quantitative measure (indicator) is required. Two different measures are used in this project to evaluate the results of the fusion process: information entropy and root mean square error.

4.2 ENTROPY
One of the quantitative measures in digital image processing is entropy. Claude Shannon introduced the entropy concept for quantifying the information content of messages. Although he used entropy in communication, it can also be employed to measure and quantify the information content of digital images. A digital image consists of pixels arranged in rows and columns, each pixel defined by its position and its grey-scale level. For an image consisting of L grey levels, the entropy is defined as

    E = −Σ_{i=0}^{L−1} p_i log2(p_i)

where p_i is the probability (here, the relative frequency) of each grey-scale level. As an example, a digital image of type uint8 (unsigned integer 8) has 256 different levels, from 0 (black) to 255 (white). It must be noted that in combined images the number of levels is very large and the grey-level intensity of each pixel is a decimal, double-precision number, but the entropy equation above is still valid. For images with high information content the entropy is large. Larger alterations and changes in an image give larger entropy, and sharp, focused images have more changes than blurred and misfocused images.

Hence, the entropy is a measure with which to assess the quality of different aligned images of the same scene. The root mean square error between the reference image I and the fused image F is defined as

    RMSE = sqrt( (1/(M·N)) Σ_{i=1}^{M} Σ_{j=1}^{N} [I(i,j) − F(i,j)]^2 )

where (i, j) denotes the spatial position of a pixel and M and N are the dimensions of the images. This measure is appropriate for a pair of images containing two objects. First a reference, everywhere-in-focus image I is taken; then two images are produced from this original image. In one image the first object is focused and the second one is blurred; in the other image the first object is blurred and the second remains focused. The fused image should contain both well-focused objects.
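Both measures are straightforward to compute in MATLAB; the sketch below assumes 8-bit greyscale images and placeholder file names.

    ref = double(imread('reference.png'));         % placeholder: everywhere-in-focus reference I
    fus = double(imread('fused.png'));             % placeholder: fusion result F
    [M, N] = size(ref);

    % Information entropy of the fused image (L = 256 grey levels).
    p = imhist(uint8(fus)) / numel(fus);           % grey-level probabilities
    p = p(p > 0);                                  % skip empty bins to avoid log2(0)
    E = -sum(p .* log2(p));

    % Root mean square error between reference and fused image.
    rmse = sqrt(sum((ref(:) - fus(:)).^2) / (M * N));
    fprintf('Entropy: %.3f bits, RMSE: %.3f\n', E, rmse);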

Often the perceptual quality of the resulting fused image is of prime importance. In these circumstances, comparisons of quantitative quality can often be misleading or meaningless. However, a few authors [1, 8, 9] have attempted to generate such measures for applications where their meaning is clearer. Figure 2 reflects such an application: fusion of two images of differing focus to produce an image of maximum focus. Firstly, a "ground truth" image needs to be created that can be quantitatively compared to the fusion result images. This is produced using a simple cut-and-paste technique, physically taking the "in focus" areas from each image and combining them. The quantitative measure used to compare the cut-and-paste image to each fused image was taken from [1],

where Igt is the cut-and-paste "ground truth" image, Ifd is the fused image and N is the size of the image. Lower values of the measure indicate greater similarity between the images Igt and Ifd, and therefore more successful fusion in terms of quantitatively measurable similarity.

Table 1 shows the results for the various methods used. The average pixel value method, the pixel-based PCA and the DWT methods give poor results relative to the others, as expected. The DT-CWT methods give roughly equivalent results, although the New-CWT method gave slightly worse results. The results were however very close and should not be taken as indicative, as this is just one experiment and the transforms produce essentially the same subband forms. The WBV and WA methods performed better than MS with equivalent transforms, as expected in most cases. The residual low-pass images were fused using simple averaging, and the window for the WA and WBV methods was set to 3×3. Table 1 shows the best results over all filters available for each method.

4.3 APPLICATIONS AND TRENDS

4.3.1 Navigation Aid
To allow helicopter pilots to navigate under poor visibility conditions (such as fog or heavy rain), helicopters are equipped with several imaging sensors which can be viewed by the pilot in a helmet-mounted display. A typical sensor suite includes both a low-light-television (LLTV) sensor and a thermal imaging forward-looking infrared (FLIR) sensor. In the current configuration, the pilot can choose one of the two sensors to watch in the display. A possible improvement is to combine both imaging sources into a single fused image which contains the relevant image information of both imaging devices. The images in result 1.1 illustrate this application.

4.3.2 Merging Out-Of-Focus Images
Due to the limited depth of focus of optical lenses (especially those with long focal lengths), it is often not possible to obtain an image which contains all relevant objects in focus. One possibility to overcome this problem is to take several pictures with different focus points and combine them into a single frame which contains the focused regions of all input images. The images in result 1.2 illustrate this approach.


4.3.3 Medical Imaging
With the development of new imaging methods in medical diagnostics arises the need for a meaningful (and spatially correct) combination of all available image datasets. Examples of imaging devices include computed tomography (CT), magnetic resonance imaging (MRI) and the newer positron emission tomography (PET). The images in result 1.3 illustrate the fusion of a CT and an MRI image.

4.3.4 Remote Sensing
Remote sensing is a typical application for image fusion: modern spectral scanners gather up to several hundred spectral bands, which can be either visualized and processed individually or fused into a single image, depending on the image analysis task. The images in result 1.4 illustrate the fusion of two bands of a multispectral scanner.


4.4 RESULTS

Fig 4.1 Fusion by Averaging

Fig 4.2 Fusion by Maximum


Fig 4.3 Fusion by Minimum

Fig 4.4 Fusion by PCA


Fig 4.5 Fusion by averaging

Fig 4.6 Fusion by Averaging


Result images: fusion by averaging, fusion by maximum, fusion by minimum, fusion by PCA.


CONCLUSION
In this project, the use of the Discrete Wavelet Transform (DWT), fuzzy and neuro-fuzzy approaches for the fusion of images taken by a digital camera was studied. A pixel-level fusion mechanism was applied to sets of images. All the results obtained by these methods are valid when using aligned source images of the same scene. In order to evaluate the results and compare these methods, two quantitative assessment criteria, information entropy and root mean square error, were employed.

Experimental results indicated that there are no considerable differences in performance between these two methods. In fact, if the result of fusion at each level of decomposition is separately evaluated visually and quantitatively in terms of entropy, no considerable difference is observed (Fig. 5, 6, 7, 9 and 11 and Tables 2 and 4). Although some differences were identified at lower levels, DWT and LPT demonstrated similar results from level three of decomposition onwards. Both techniques reach their best result in terms of information entropy at a decomposition level of three. The experimental results in Tables 2 and 4 also indicate that the LPT algorithm reaches its best quality in terms of entropy at lower levels than the DWT. The RMSE values in Table 6 show that neither LPT nor DWT performs better at all levels, although the best result belongs to the LPT method. However, the RMSE results, compared to the quality and entropy of the fused images, indicate that RMSE cannot be used as a proper criterion to evaluate and compare fusion results. Finally, the experiments showed that the LPT approach runs faster than the DWT; LPT takes less than half the time of the DWT, and given the approximately similar performance, LPT is preferred for real-time applications.

Fuzzy and neuro-fuzzy algorithms have been implemented to fuse a variety of images. The results of the proposed fusion process are given in terms of entropy and variance. The fusions have been implemented for medical images and remote sensing images. It is hoped that the techniques can be extended to colored images and to the fusion of multiple sensor images.


5.1 DWT Fusion
The DWT fusion methods provide computationally efficient image fusion techniques. Various fusion rules for the selection and combination of subband coefficients increase the quality (perceptual and quantitatively measurable) of image fusion in specific applications.

5.2 DT-CWT Fusion
The developed DT-CWT fusion techniques provide better quantitative and qualitative results than the DWT at the expense of increased computation. The DT-CWT method is able to retain edge information without significant ringing artifacts. It is also good at faithfully retaining textures from the input images. All of these features can be attributed to the increased shift invariance and orientation selectivity of the DT-CWT compared to the DWT. A previously developed shift-invariant wavelet transform (the SIDWT) has been used for image fusion [7]. However, the SIDWT suffers from excessive redundancy and also lacks the directional selectivity of the DT-CWT. This is reflected in the superior quantitative results of the DT-CWT (see Table 1). Various fusion rules for the selection and combination of subband coefficients increase the quality (perceptual and quantitatively measurable) of image fusion in specific applications. The DT-CWT has the further advantage that phase information is available for analysis; however, after an initial set of experiments using the notion of phase coherence, no improvement in fusion performance was achieved.


REFERENCES
[1] Shutao Li, James T. Kwok, Ivor W. Tsang, Yaonan Wang, "Fusing images with different focuses using support vector machines", IEEE Transactions on Neural Networks, 15(6):1555-1561, Nov. 2004.
[2] P. J. Burt and R. J. Kolczynski, "Enhanced image capture through fusion", in Proc. 4th Intl. Conf. on Computer Vision, pages 173-182, Berlin, Germany, May 1993.
[3] Z. Zhang and R. Blum, "A categorization of multiscale-decomposition-based image fusion schemes with a performance study for a digital camera application", Proceedings of the IEEE, pages 1315-1328, August 1999.
[4] P. J. Burt, E. H. Adelson, "The Laplacian pyramid as a compact image code", IEEE Transactions on Communications, 31, pp. 532-540, April 1983.
[5] Shutao Li, James T. Kwok, Yaonan Wang, "Combination of images with diverse focuses using the spatial frequency", Information Fusion, 2(3):169-176, 2001.
[6] Z. Zhang and R. S. Blum, "Multisensor image fusion using a region-based wavelet transform approach", Proc. of the DARPA IUW, pp. 1447-1451, 1997.
[7] Pajares, G., De La Cruz, J. M., "A wavelet-based image fusion tutorial", Pattern Recognition, 37, pp. 1855-1872, 2004.
[8] MATLAB, Wavelet Toolbox User's Guide, The MathWorks, Inc., http://www.mathworks.com, August 2005.
[9] H. Wang, J. Peng and W. Wu, "Fusion algorithm for multisensor images based on discrete multiwavelet transform", IEE Proceedings - Vision, Image and Signal Processing, Vol. 149, no. 5, October 2002.
