Thematic Mapper Classification Results
Classification can be performed on the selected principal components using either unsupervised or supervised techniques. Trial clustering is typically performed first to determine how well the clusters associated with the various components separate and to select the relevant principal components. Training data are then selected for use by the supervised techniques. Three classification methods were investigated for the preliminary land cover classification of the 1987 Landsat-TM data: isodata clustering, maximum likelihood, and neural networks. These methods were chosen because they are standard methods in classification packages; they therefore serve as benchmarks for future classification results and give an initial indication of appropriate procedures for the KSC environment.
A. Unsupervised Clustering - Isodata
The standard isodata clustering algorithm [Richards] was chosen as the unsupervised classification method for this dataset. The algorithm starts by randomly selecting cluster centers in the multidimensional input data space. Each pixel is then assigned to a candidate cluster by minimizing a distance function between that pixel and the cluster centers. After each iteration, the cluster means are updated, and clusters may be split or merged depending on the size and spread of the data points in each cluster.
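The assign, update, split, and merge steps described above can be sketched as follows. This is a simplified illustration, not the implementation used for the KSC data: the function name `isodata`, every threshold (`max_std`, `min_dist`, `min_size`), the Euclidean distance, and the unweighted merge are assumptions chosen for brevity.

```python
import numpy as np

def isodata(pixels, k_init=3, k_min=2, k_max=4, max_std=1.0,
            min_dist=1.0, min_size=2, n_iter=10, seed=0):
    """Simplified ISODATA-style clustering; all thresholds are hypothetical.

    pixels: (n_samples, n_bands) float array. Assumes k_init <= k_max.
    Returns (centers, labels).
    """
    rng = np.random.default_rng(seed)
    # Start from randomly chosen pixels as the initial cluster centers.
    start = rng.choice(len(pixels), size=k_init, replace=False)
    centers = pixels[start].astype(float)
    for _ in range(n_iter):
        # Assign each pixel to the nearest center (Euclidean distance).
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update the means, discarding clusters with too few pixels.
        groups = [pixels[labels == j] for j in range(len(centers))]
        groups = [g for g in groups if len(g) >= min_size]
        centers = np.array([g.mean(axis=0) for g in groups])
        # Split any cluster whose largest per-band spread exceeds max_std.
        k = len(centers)
        split = []
        for g, c in zip(groups, centers):
            std = g.std(axis=0)
            if std.max() > max_std and k < k_max and len(g) >= 2 * min_size:
                offset = np.zeros_like(c)
                offset[std.argmax()] = std.max()
                split += [c + offset, c - offset]
                k += 1
            else:
                split.append(c)
        centers = np.array(split)
        # Merge the closest pair of centers if they are nearer than min_dist.
        # (An unweighted midpoint is used here for simplicity.)
        if len(centers) > k_min:
            d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
            np.fill_diagonal(d, np.inf)
            i, j = np.unravel_index(d.argmin(), d.shape)
            if d[i, j] < min_dist:
                merged = (centers[i] + centers[j]) / 2.0
                centers = np.vstack([np.delete(centers, [i, j], axis=0), merged])
    # Final assignment against the last set of centers.
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

# Example on synthetic two-band data (three well-separated blobs).
rng = np.random.default_rng(1)
blobs = np.vstack([rng.normal(m, 0.5, size=(30, 2))
                   for m in ([0, 0], [10, 10], [20, 0])])
centers, labels = isodata(blobs, max_std=2.0, min_dist=3.0)
```

The minimum and maximum cluster counts play the same role as the bounds specified for the classification reported here; the remaining parameters would need tuning for real imagery.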
Table 1: Isodata Clustering Results
| Class Type | Area (sq km) | % of Scene | GIS % of Scene |
|---|---|---|---|
| Man Made | 42.31 | 3.47 | 1.43 |
| Upland Vegetation | 140.17 | 11.51 | 13.02 |
| Water | 885.69 | 72.74 | 73.46 |
| Fresh Marsh | 77.07 | 6.33 | 5.42 |
| Salt Marsh | 68.43 | 5.62 | 3.81 |
| Coastal Dunes | 3.94 | 0.32 | 0.38 |
Given that isodata is an unsupervised method, it did reasonably well at separating the general environments when compared with the GIS map. For the isodata classification, a minimum of eight and a maximum of ten clusters were specified, and the final result produced nine clusters. Four of the clusters correspond to man-made areas such as roads, buildings, and runways. Isodata did very well in clustering the water bodies, which included ocean, lagoon, and impounded waters. Upland vegetation, such as scrub and trees, was clustered as a single group, and marshes and fireburns were likewise clustered together. The isodata clustering was also able to identify the coastal beaches.
There were, however, some problems with the isodata clustering. As indicated above, the algorithm had difficulty separating the marsh areas from the fireburns that occurred in 1987. It also had difficulty separating the salt water marshes from the fresh water marshes.
B. Maximum Likelihood Classifier
Maximum Likelihood (ML) classification is the most common supervised classification technique. Pixels are assigned to pre-selected classes based on a decision rule that maximizes the likelihood of having obtained the observed values given the overall assignment of classes to the image. The goal is to assign each pixel to the class that has the greatest probability of occurrence given the observed data. Although it is possible to use ML classification with data drawn from a population with any parametric or nonparametric distribution, virtually all commercial packages assume that the data are normally distributed. This assumption should be checked for each application. Variations in the implementation of ML classification allow selection of the probability thresholds required for assignment of classes and of the separation requirements for individual classes. Pixels not satisfying these requirements are assigned to the "unclassified" class.
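The decision rule can be illustrated with a minimal sketch that assumes normally distributed classes, as most packages do. The function names `train_ml` and `classify_ml`, the class names, and the form of the threshold (a cutoff on the log discriminant score) are assumptions for illustration; they are not taken from any particular package.

```python
import numpy as np

def train_ml(samples_by_class):
    """Estimate a mean vector and covariance matrix per class from training pixels."""
    stats = {}
    for name, x in samples_by_class.items():
        x = np.asarray(x, dtype=float)
        stats[name] = (x.mean(axis=0), np.cov(x, rowvar=False))
    return stats

def classify_ml(pixels, stats, priors=None, threshold=None):
    """Assign each pixel to the class with the largest Gaussian log discriminant.

    priors: dict of a priori probabilities (equal if omitted).
    threshold: hypothetical cutoff on the best score; pixels scoring
    below it are labeled "unclassified".
    """
    names = list(stats)
    if priors is None:
        priors = {n: 1.0 / len(names) for n in names}
    pixels = np.asarray(pixels, dtype=float)
    n_bands = pixels.shape[1]
    scores = np.empty((len(pixels), len(names)))
    for j, n in enumerate(names):
        mean, cov = stats[n]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        # Log prior plus multivariate normal log-likelihood.
        scores[:, j] = np.log(priors[n]) - 0.5 * (
            logdet + maha + n_bands * np.log(2.0 * np.pi))
    best = scores.argmax(axis=1)
    labels = [names[b] for b in best]
    if threshold is not None:
        top = scores.max(axis=1)
        labels = [l if s >= threshold else "unclassified"
                  for l, s in zip(labels, top)]
    return labels

# Example with two synthetic classes in two bands.
rng = np.random.default_rng(0)
training = {
    "water": rng.normal([10.0, 10.0], 2.0, size=(50, 2)),
    "trees": rng.normal([50.0, 60.0], 2.0, size=(50, 2)),
}
stats = train_ml(training)
result = classify_ml([[10, 10], [50, 60], [30, 35]], stats, threshold=-50.0)
```

With equal priors the log prior term is the same for every class and does not affect the assignment; it matters only when class probabilities are supplied from another source.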
Table 2: Maximum Likelihood Classification Results
| Class Type | Area (sq km) | % of Scene | GIS % of Scene |
|---|---|---|---|
| Man Made | 53.75 | 4.41 | 1.43 |
| Scrub Vegetation | 27.85 | 2.28 | 3.49 |
| Water | 860.63 | 70.68 | 73.46 |
| Trees | 98.18 | 8.06 | 9.53 |
| Fresh Marsh | 32.78 | 2.69 | 5.42 |
| Salt Marsh | 120.50 | 9.89 | 3.81 |
| Fireburn | 20.31 | 1.66 | --- |
| Coastal Dunes | 3.62 | 0.29 | 0.38 |
The a priori probabilities were set equal for each of the eight classes; subsequent classification will use probabilities from either the clustering algorithm or some other independent source. All pixels in the KSC dataset were classified; none were declared to be "unclassified".
Overall, the ML classifier performed better than the unsupervised isodata clustering. It did reasonably well in delineating the boundaries between fireburns, upland vegetation, and marshlands.
However, the ML classifier overclassified man-made features, assigning them about three times their actual surface area. It also did poorly in detecting the coastal dunes and beach, which it classified as man-made. In addition, there appears to be some misclassification between the trees and scrub vegetation, as well as between the freshwater and saltwater marshes.
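The degree of over- or under-classification can be read from Table 2 by dividing each class's share of the scene by its GIS share. The short sketch below does that arithmetic with the tabulated values; the variable names are illustrative, and the row with no GIS percentage is left out.

```python
# Percent of scene from the ML classification (Table 2) and from the GIS map.
ml_pct = {
    "Man Made": 4.41, "Scrub Vegetation": 2.28, "Water": 70.68,
    "Trees": 8.06, "Fresh Marsh": 2.69, "Salt Marsh": 9.89,
    "Coastal Dunes": 0.29,
}
gis_pct = {
    "Man Made": 1.43, "Scrub Vegetation": 3.49, "Water": 73.46,
    "Trees": 9.53, "Fresh Marsh": 5.42, "Salt Marsh": 3.81,
    "Coastal Dunes": 0.38,
}

# A ratio above 1 means the classifier assigned more area than the GIS map shows.
ratios = {name: ml_pct[name] / gis_pct[name] for name in ml_pct}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:18s} {ratio:5.2f}")
```

Ratios well above or below 1 flag the classes where the area estimates diverge most from the GIS map, although they say nothing about where in the scene the errors occur.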
C. Neural Networks
The application of neural networks to remotely sensed data is another supervised classification method which is becoming increasingly popular. A neural network consists of a series of interconnected layers: an input layer, an output layer, and hidden layer(s). The nodes of each layer are connected by a series of weights which are trained to select a particular class given a certain set of inputs, independent of the probability distributions of the data. At each iteration of the training process, a minimization of an error function between the actual output and the target values is performed. Based upon this training, as well as the numbers of hidden units and hidden layers, decision boundaries are generated to separate one output class from another in the input data space.
Table 3: Neural Network Classification Results
| Class Type | Area (sq km) | % of Scene | GIS % of Scene |
|---|---|---|---|
| Man Made | 13.06 | 1.07 | 1.43 |
| Scrub Vegetation | 65.62 | 5.38 | 3.49 |
| Water | 857.89 | 70.45 | 73.46 |
| Trees | 95.57 | 7.84 | 9.53 |
| Fresh Marsh | 81.96 | 6.73 | 5.42 |
| Salt Marsh | 87.42 | 7.18 | 3.81 |
| Fireburn | 9.92 | 0.81 | --- |
| Coastal Dunes | 6.17 | 0.51 | 0.38 |
The neural network consisted of five inputs, eight outputs, and one hidden layer with 13 hidden units. The network was trained for 50 iterations using normal gradient descent.
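A network of this shape (five inputs, 13 hidden units, eight outputs, 50 iterations of gradient descent) can be sketched as below. The sigmoid activations, the squared-error function, the learning rate, the weight initialization, and the full-batch updates are all assumptions, since the text does not specify them; the function names `train_network` and `predict` are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_network(x, targets, n_hidden=13, n_iter=50, lr=0.5, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent.

    x: (n_samples, n_inputs) array; targets: (n_samples, n_outputs) one-hot array.
    Sigmoid units and squared error are assumed, not taken from the source.
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = x.shape[1], targets.shape[1]
    w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
    b2 = np.zeros(n_out)
    n = len(x)
    errors = []
    for _ in range(n_iter):
        # Forward pass through the hidden and output layers.
        h = sigmoid(x @ w1 + b1)
        y = sigmoid(h @ w2 + b2)
        # Error between actual outputs and target values.
        errors.append(0.5 * np.mean(np.sum((y - targets) ** 2, axis=1)))
        # Backward pass: gradients of the error w.r.t. each layer's inputs.
        delta_out = (y - targets) * y * (1.0 - y)
        delta_hid = (delta_out @ w2.T) * h * (1.0 - h)
        # Gradient descent step on weights and biases.
        w2 -= lr * (h.T @ delta_out) / n
        b2 -= lr * delta_out.mean(axis=0)
        w1 -= lr * (x.T @ delta_hid) / n
        b1 -= lr * delta_hid.mean(axis=0)
    return (w1, b1, w2, b2), errors

def predict(params, x):
    """Return the index of the most active output unit for each input row."""
    w1, b1, w2, b2 = params
    return sigmoid(sigmoid(x @ w1 + b1) @ w2 + b2).argmax(axis=1)

# Example with synthetic data: 40 samples, 5 inputs, 8 one-hot classes.
rng = np.random.default_rng(0)
x = rng.random((40, 5))
t = np.eye(8)[rng.integers(0, 8, size=40)]
params, errors = train_network(x, t)
classes = predict(params, x)
```

The five inputs would correspond to whatever per-pixel features are fed to the classifier, and each output unit to one land cover class; the class assigned to a pixel is the output unit with the largest activation.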
Overall, the neural network classification results showed perhaps the most promise for developing an accurate classifier for the KSC environment. The neural network readily identified water bodies, including those with heavy turbidity and submerged vegetation. It also performed better than the ML classifier at discriminating the taller trees from the scrub vegetation. In addition, the neural network was successful in detecting the coastal dunes and beaches. However, the neural network did misclassify features such as narrower roads and some fireburns that the ML classifier was able to identify.

Last Modified: Wed Apr 14, 1999
CSR/TSGC Team Web