Feature Extraction

In classification studies involving data of high spectral dimension, it is desirable to select the optimum subset of channels for analysis in order to avoid the Hughes phenomenon [5] and parameter estimation problems due to interband correlation, as well as to reduce computational requirements. The AVIRIS data have known responses to the reflectances of many minerals for which extensive laboratory spectrometer studies have been performed, and the observed signal can be modeled as a linear mixture of the known spectral end members. The response of the system to vegetation is not nearly as well characterized. Several of the bands exhibit characteristic patterns for different vegetation types, although the responses are often complex and discrimination involves multiple bands. Because the analysis of the KSC AVIRIS data is exploratory, both feature-based and statistical approaches to band selection were investigated.
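The linear mixture model mentioned above can be illustrated with a minimal sketch. It assumes an unconstrained least-squares fit and purely synthetic end-member spectra; the function name `unmix`, the array sizes, and the noise level are hypothetical choices made for illustration, and constraints often used in practice (non-negative abundances that sum to one) are omitted.

```python
import numpy as np

def unmix(pixel, endmembers):
    """Least-squares abundance estimate for one pixel.

    endmembers: (n_bands, n_endmembers), one column per end-member spectrum.
    pixel:      (n_bands,) observed spectrum.
    """
    abundances, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
    return abundances

rng = np.random.default_rng(0)
n_bands, n_end = 50, 3

# Synthetic end-member spectra (assumption: not real mineral spectra).
E = rng.uniform(0.05, 0.9, size=(n_bands, n_end))

# Observed pixel = linear mixture of end members plus a little sensor noise.
true_a = np.array([0.6, 0.3, 0.1])
pixel = E @ true_a + rng.normal(0.0, 1e-3, n_bands)

est_a = unmix(pixel, E)
print("estimated abundances:", np.round(est_a, 3))
```

The fit is only meaningful when the end-member spectra are known and linearly independent, which is the situation described for minerals but not for vegetation.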

Principal Components

Eigenanalysis approaches for band selection involve transformations of the original data to a new coordinate system where the channels are statistically uncorrelated. Principal Component Analysis (PCA), which is the most commonly used statistical approach for variable selection in a system where inputs are highly correlated [6], computes the mutually orthogonal normalized linear combinations of the vector of observations with maximum variance.
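As a minimal sketch of the transformation just described, the following computes principal components from the eigen-decomposition of the band covariance matrix. The data are synthetic and the function name `pca` is a hypothetical helper, not the software used in the study.

```python
import numpy as np

def pca(pixels):
    """pixels: (n_pixels, n_bands). Returns eigenvalues, eigenvectors, scores."""
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                 # data in the new coordinate system
    return eigvals, eigvecs, scores

rng = np.random.default_rng(0)
# Six highly correlated synthetic "bands": one shared signal plus small noise.
base = rng.normal(size=(2000, 1))
pixels = base @ np.ones((1, 6)) + 0.1 * rng.normal(size=(2000, 6))

eigvals, eigvecs, scores = pca(pixels)
score_cov = np.cov(scores, rowvar=False)        # diagonal: components are uncorrelated
print("variance explained by PC 1:", eigvals[0] / eigvals.sum())
```

The eigenvectors are orthonormal, the transformed channels are uncorrelated, and the components are ordered by variance, which is not necessarily the same as ordering by usefulness for class discrimination.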

Although PCA has been successfully applied to many remotely sensed data sets, it is not scale invariant, is quite variable with respect to the information content of a particular image, and does not guarantee good class separation in the transformed space.
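The lack of scale invariance can be seen in a small synthetic experiment. In this sketch (the band count, variances, and the factor of 100 are arbitrary assumptions), rescaling a single band, as would happen with a change of units or gain, changes which band dominates the first principal component.

```python
import numpy as np

def leading_pc(pixels):
    """Eigenvector of the band covariance matrix with the largest eigenvalue."""
    cov = np.cov(pixels, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, -1]                       # eigh sorts ascending, so take the last

rng = np.random.default_rng(1)
# Three independent synthetic bands; band 0 has the largest variance.
pixels = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 1.0])

before = leading_pc(pixels)
rescaled = pixels * np.array([1.0, 100.0, 1.0])  # band 1 expressed in different units
after = leading_pc(rescaled)

dominant_before = int(np.argmax(np.abs(before)))
dominant_after = int(np.argmax(np.abs(after)))
print("dominant band before rescaling:", dominant_before)
print("dominant band after rescaling: ", dominant_after)
```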

Subjective analysis of the PCs computed for the KSC AVIRIS scene yielded the interpretation of their information content, relative to the classes, listed in Table 1. The ordering of these PCs illustrates the problem that they do not necessarily represent decreasing information content.

Table 1. Principal Components and their Information Content
PC Features Discriminated
1 Land and water are the predominant features and are well separated
2 Areas of high and low (urban) biomass are represented
3 Salt Marsh is well discriminated, having very low values
4-8 Noise
9 Some mudflats appear to be discriminated as low values vs. an overall bright image
10 Graminoids are dark, mud is very bright
11 Cattail is recognized as having a consistently low signature. Linear features of the urban areas and dikes are strongly enhanced.
12 Contains "texture" information which cannot be directly related to known phenomena
13-14 Noise
15 No discernible information which can be related to features
16 Willow and cattail are bright, while CP hammock is very dark.
17-21 Noise
22 Impoundments are well discriminated

Eight PC bands were initially selected as input for classification: 1, 2, 3, 9, 10, 11, 16, and 22. As discussed in the following sections, a hierarchical approach was used for classification of the AVIRIS data. Bands transformed by PCA were utilized in this scheme for final classification of the uplands and wetlands classes.

Minimum Noise Fraction (MNF) Transformation

The Minimum Noise Fraction (MNF) transform computes the normalized linear combinations of the original bands which maximize the signal-to-noise ratio [7]. The approach was developed specifically for the analysis of multiband remotely sensed data, with the aim of producing orthogonal bands ordered by their information content. It can also be used for noise removal by applying filters matched to the noise characteristics of the transformed bands and then inverting the transform.

Because the transform maximizes a ratio, it is invariant with respect to scale changes in the bands. The signal and noise of the transformed bands are also orthogonal. The approach requires that the covariance of the noise be known, which is not generally the case for remotely sensed data. When the signal is highly correlated across bands, a reasonable estimate of the noise in each band can be obtained by adapting a procedure called the maximum autocorrelation factor, which exploits the correlation of signals in spatial neighborhoods [8].
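The following is a minimal sketch of an MNF-style transform, not the procedure used in the study. It assumes that the noise covariance can be approximated from differences between horizontally adjacent pixels (a simple stand-in for the neighborhood-based estimate of [8]), and it solves the generalized eigenproblem by first whitening the noise. The function name `mnf` and the synthetic cube are hypothetical.

```python
import numpy as np

def mnf(cube):
    """cube: (rows, cols, bands). Returns transformed cube, ratios, transform."""
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands)
    pixels = pixels - pixels.mean(axis=0)
    cov_total = np.cov(pixels, rowvar=False)

    # Assumption: the signal is spatially smooth, so differences between
    # horizontal neighbours are mostly noise; halve to undo the doubling
    # of variance caused by differencing two independent noise samples.
    diff = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands)
    cov_noise = np.cov(diff, rowvar=False) / 2.0

    # Whiten the noise: whiten.T @ cov_noise @ whiten = identity.
    n_vals, n_vecs = np.linalg.eigh(cov_noise)
    whiten = n_vecs / np.sqrt(n_vals)

    # Eigen-decompose the total covariance in the noise-whitened space.
    vals, vecs = np.linalg.eigh(whiten.T @ cov_total @ whiten)
    order = np.argsort(vals)[::-1]              # largest total-to-noise ratio first
    vals, vecs = vals[order], vecs[:, order]

    transform = whiten @ vecs
    return (pixels @ transform).reshape(rows, cols, bands), vals, transform

rng = np.random.default_rng(0)
rows, cols, bands = 40, 40, 5
yy, xx = np.mgrid[0:rows, 0:cols] / 40.0
sources = np.stack([np.sin(2 * np.pi * xx), yy], axis=-1)   # two smooth spatial patterns
mixing = rng.normal(size=(2, bands))
cube = sources @ mixing + rng.normal(scale=0.05, size=(rows, cols, bands))

mnf_cube, ratios, transform = mnf(cube)
band_cov = np.cov(mnf_cube.reshape(-1, bands), rowvar=False)

# Rescaling one input band leaves the ratios unchanged (scale invariance).
_, ratios_scaled, _ = mnf(cube * np.array([1.0, 10.0, 1.0, 1.0, 1.0]))
print("total-to-noise variance ratios:", np.round(ratios, 2))
```

Each returned ratio is the total variance of a transformed band divided by its estimated noise variance, which under an additive, uncorrelated noise model equals the signal-to-noise ratio plus one. Unlike PCA, the ratios do not change when an input band is rescaled.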

Subjective analysis of the MNF bands computed for the KSC AVIRIS scene yielded the interpretation of their information content, relative to the classes, listed in Table 2.

Table 2. Minimum Noise Fraction (MNF) Bands and their Information Content
MNF Features Discriminated
1 Land and water are the predominant features and are well separated
2 Discriminates between urban and impounded waters
3 Separates the shallow waters/sediments in the lagoon.
4 Separates the upland vegetation from the wetlands.
5 Strong signature in cattail region around T24-D and bright signature of upland vegetation.
6 Separates the cattail and salt marsh well
7 Salt marsh and cattail stand out
8 Detects differences in the marshes (possibly due to moisture)
9 Willow swamps and mud flats stand out.
10 Vegetation which lines the dikes (Mangrove and Buttonwood) stands out.
11 Isolates Cabbage Palm Hammocks.
12 Discriminates other stands of trees (Oaks, willow, and slash pine) along with CP Hammock.
13 Hardwood and willow swamps show up well.
14 Noise

Decision Boundary Feature Extraction

An approach to feature extraction developed by Lee and Landgrebe [9] focuses on robust selection of bands, independent of the relative relationships between the means and covariances of the various classes, and development of an objective capability to predict intrinsic dimensionality of a space.

The decision boundary approach selects the minimum number of features (bands) required to achieve the "same" classification accuracy as in the original space. This is enforced using a Bayesian decision rule.

When the data are Gaussian, and a maximum likelihood classifier is employed, the decision boundary is linear (a hyperplane) if the covariance matrices of the classes are equal and quadratic (a hyperquadric surface, such as an ellipsoid) if they are different. A piecewise linear approximation to the effective portion of the boundary is used to separate two classes. For more than two classes, the total decision boundary feature matrix is computed as the sum of the matrices for each pair of classes (i,j), where prior probabilities are used for weighting.
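The pairwise summation can be illustrated for the simplest case only. The sketch below assumes Gaussian classes that share one covariance matrix, so each pairwise boundary is a hyperplane and the corresponding feature matrix reduces to the outer product of the boundary's unit normal. It also assumes that each pair is weighted by the product of the two prior probabilities and counts each unordered pair once, which affects only the overall scale. The function names, class means, priors, and covariance are hypothetical, and the unequal-covariance case, which requires sampling the effective portion of a curved boundary, is not covered.

```python
import numpy as np
from itertools import combinations

def pairwise_dbfm_equal_cov(mean_i, mean_j, cov):
    """Feature matrix for two Gaussian classes with a shared covariance.

    The boundary is a hyperplane whose normal is cov^{-1} (mean_i - mean_j),
    so the matrix is the rank-one outer product of the unit normal.
    """
    normal = np.linalg.solve(cov, mean_i - mean_j)
    normal = normal / np.linalg.norm(normal)
    return np.outer(normal, normal)

def total_dbfm(means, priors, cov):
    """Prior-weighted sum of the pairwise matrices over all class pairs."""
    n_bands = cov.shape[0]
    total = np.zeros((n_bands, n_bands))
    for i, j in combinations(range(len(means)), 2):
        total += priors[i] * priors[j] * pairwise_dbfm_equal_cov(means[i], means[j], cov)
    return total

rng = np.random.default_rng(0)
# Three hypothetical classes in a five-band space.
means = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],
                  [2.0, 1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 3.0, 0.0, 0.0]])
priors = np.array([0.5, 0.3, 0.2])
A = rng.normal(size=(5, 5))
cov = A @ A.T + 5.0 * np.eye(5)                 # symmetric positive definite

total = total_dbfm(means, priors, cov)
eigvals = np.linalg.eigvalsh(total)[::-1]       # descending
intrinsic_dim = int(np.linalg.matrix_rank(total))
print("eigenvalues:", np.round(eigvals, 4))
print("estimated intrinsic dimensionality:", intrinsic_dim)
```

The eigenvectors associated with the non-zero eigenvalues span the directions needed to reproduce the decision boundaries, so the rank of the total matrix serves as the estimate of intrinsic dimensionality. With three classes and equal covariances that rank cannot exceed two, regardless of the number of bands.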

As with the MNF transform, the decision boundary approach was applied to the wetlands and uplands data separately for selection of features. A Gaussian distribution was assumed, with the means and covariance matrices computed from training data.