3 First Approach

Figure 1
Traditional pixel-wise HSI classification is based on the fact that different
materials have different spectral reflectance, and it identifies each material
by its spectral curve.
In other words, each pixel is classified by its digital numbers across the different bands.
Given a set of observations (i.e. pixel vectors in a hyperspectral image), the
goal of classification is to assign a unique label to each pixel vector so that it
is well represented by a given class.
The availability of hyperspectral data with high spatial resolution has been
quite important for classification techniques (i.e. such data mostly contains pure
pixels, each represented by a single predominant spectral signature).
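As a rough illustration of this pixel-wise labeling, the sketch below assigns each pixel vector to the class with the nearest mean spectrum (a simple minimum-distance rule; the cube shape and class means are toy assumptions, not data from this work).

```python
# Minimal pixel-wise classification sketch: each pixel spectral vector gets
# the label of the closest class mean (nearest-centroid rule).
import numpy as np

def classify_pixels(cube, class_means):
    """cube: (H, W, B) hyperspectral cube; class_means: (C, B) mean spectra."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b)                      # one spectral vector per pixel
    # Euclidean distance of every pixel to every class mean: (H*W, C)
    dists = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    labels = dists.argmin(axis=1)                     # unique label per pixel
    return labels.reshape(h, w)

# Toy example: a 4x4 cube with 10 bands and 3 hypothetical classes
rng = np.random.default_rng(0)
cube = rng.random((4, 4, 10))
means = rng.random((3, 10))
print(classify_pixels(cube, means))
```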
Figure 2
Classification Process
The discrimination of materials based on their spectral profile can be
considered a classification task, where groups of pixels are labeled as
belonging to a particular class based on their reflectance properties,
exploiting training examples to model each class.
The classification process has two main stages:
1 The number and nature of the categories are determined
2 Every unknown or unseen element is assigned to one of the categories
according to its level of resemblance or similarity to the basic patterns
Figure 3: Classification Process
Feature Extraction/Selection

Feature Extraction
Feature Selection
In feature selection, the idea is to select a set of spectral bands from the initial
pool of bands available prior to classification.
A particular characteristic of feature selection methods is that they tend to
retain the spectral meaning, while reducing the number of bands.
In unsupervised feature selection, the goal is to automatically find statistically
important features. The advantage of unsupervised methods is that they do
not need training data.
In contrast, supervised feature selection is based on general/expert
knowledge and requires labeled training samples.
Techniques in the latter category comprise methods based on class-separability
measures using standard distance metrics (e.g. Euclidean, mutual
information, Bhattacharyya, Mahalanobis).
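As a hedged sketch of such separability-based selection, the snippet below ranks bands by a per-band Bhattacharyya distance between two classes, assuming each band is Gaussian within a class; the top-k selection rule is illustrative, not a method prescribed here.

```python
# Supervised band selection sketch: score each band by the Bhattacharyya
# distance between two classes (univariate Gaussian assumption per band),
# then keep the k most separable bands.
import numpy as np

def bhattacharyya_per_band(x1, x2):
    """x1, x2: (n_samples, n_bands) labeled spectra for two classes."""
    m1, m2 = x1.mean(0), x2.mean(0)
    v1, v2 = x1.var(0) + 1e-12, x2.var(0) + 1e-12   # small floor for stability
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * np.log((v1 + v2) / (2 * np.sqrt(v1 * v2))))

def select_bands(x1, x2, k=10):
    scores = bhattacharyya_per_band(x1, x2)
    return np.argsort(scores)[::-1][:k]             # indices of the k best bands
```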
Introduction
Figure 4
Training Data
Figure 5
K-Means Clustering

Figure 6
The major advantage of this process is that the method is robust, efficient and
easy to understand.
When the number of variables is large, K-means is usually computationally
faster than other clustering methods, provided k is kept small.
A drawback of the K-means algorithm is that the number of clusters k is an
input parameter.
An inappropriate choice of k may yield poor results.
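A minimal sketch with scikit-learn, where k must indeed be supplied up front (the data and the choice k = 5 are toy assumptions):

```python
# K-means sketch: k is a required input parameter, which is exactly the
# drawback noted above.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
pixels = rng.random((1000, 50))             # 1000 pixel spectra, 50 bands (toy data)

k = 5                                       # must be chosen in advance
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
labels = km.labels_                         # cluster index per pixel
centers = km.cluster_centers_               # mean spectrum of each cluster
```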
ISODATA Clustering
Figure 7
Artificial Neural Networks
Figure 8
The most widely used model is the multi-layered feed-forward ANN. Its design
consists of one input layer, at least one hidden layer and one output layer.
This algorithm is a promising technique for a number of situations such as
non-normality, complex feature spaces and multivariate data types, where
traditional methods fail to give accurate results.
One of the most notable features of a neural network that motivates its
adoption in hyperspectral imaging classification is its robustness when
presented with partially incomplete or incorrect input patterns, together with
its ability to generalize from the input.
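A minimal sketch of such a multi-layered feed-forward network follows; the layer sizes are illustrative assumptions (e.g. one input per spectral band, nine output classes), not values from this work.

```python
# Feed-forward ANN sketch: one input layer, one hidden layer, one output layer.
import torch.nn as nn

mlp = nn.Sequential(
    nn.Linear(200, 64),   # input layer -> hidden layer (200 bands assumed)
    nn.Sigmoid(),         # classic sigmoidal activation
    nn.Linear(64, 9),     # hidden layer -> output layer (9 classes assumed)
)
```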
CNNs
Figure 9
CNNs
Pooling is used to make the features invariant from the location, and it
summarizes the output of multiple neurons in convolutional layers through a
pooling function.
A typical pooling function is the maximum.
A max pooling function simply returns the maximum value from its input.
Max pooling partitions the input data into a set of non-overlapping windows
and outputs the maximum value for each subregion. This reduces the
computational complexity for upper layers and provides a form of translation
invariance.
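A small NumPy illustration of this windowed maximum (the window size of 2 is an arbitrary choice):

```python
# Max pooling sketch: partition the input into non-overlapping windows and
# keep the maximum of each window.
import numpy as np

def max_pool_2d(x, win=2):
    h, w = x.shape
    h2, w2 = h // win * win, w // win * win          # crop so windows tile exactly
    blocks = x[:h2, :w2].reshape(h2 // win, win, w2 // win, win)
    return blocks.max(axis=(1, 3))                   # max over each win x win block

x = np.arange(16).reshape(4, 4)
print(max_pool_2d(x))   # [[ 5  7]
                        #  [13 15]]
```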
The computation chain of a CNN ends in a fully connected network that
integrates information across all locations in all feature maps of the layer
below.
Figure 10
We can see that the curve of each class has its own visual shape, different
from the other classes, although it is relatively difficult to distinguish
some classes with the human eye.
Figure 11
In the above CNN classifier, the input represents a pixel spectral vector,
followed in turn by a convolution layer and a max pooling layer that compute
20 feature maps, which are classified with a fully connected network.
Layers C1 and M2 can be viewed as a trainable feature extractor for the input
HSI data, and layer F3 as a trainable classifier on top of the extracted features.
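A sketch of this C1-M2-F3 architecture in PyTorch; only the 20 feature maps come from the text, while the band count, kernel sizes, and class count are illustrative assumptions.

```python
# 1-D CNN over a pixel spectral vector: C1 (conv), M2 (max pool), F3 (fully
# connected classifier).
import torch
import torch.nn as nn

n_bands, n_classes = 200, 9                  # assumed, not from the source

model = nn.Sequential(
    nn.Conv1d(1, 20, kernel_size=11),        # C1: 20 feature maps over the spectrum
    nn.Tanh(),
    nn.MaxPool1d(kernel_size=3),             # M2: translation-invariant downsampling
    nn.Flatten(),
    nn.Linear(20 * ((n_bands - 11 + 1) // 3), n_classes),  # F3: trainable classifier
)

x = torch.randn(8, 1, n_bands)               # batch of 8 pixel spectral vectors
logits = model(x)                            # (8, n_classes)
```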
The training process of the CNN classifier contains two steps: forward
propagation and backward propagation.
The forward propagation aims to compute the actual classification result of
the input data with current parameters.
The backward propagation is employed to update the trainable parameters in
order to make the discrepancy between the actual classification output and
the desired classification output as small as possible.
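Continuing the sketch above (reusing `model`, `x` and `n_classes`), a minimal version of these two steps; the loss function, optimizer, and learning rate are assumptions, not choices stated in the text.

```python
# Training sketch: forward propagation computes the actual output, backward
# propagation updates the parameters to shrink the discrepancy with the
# desired output.
import torch

targets = torch.randint(0, n_classes, (8,))        # toy desired labels
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(10):
    optimizer.zero_grad()
    logits = model(x)                  # forward propagation: actual output
    loss = criterion(logits, targets)  # discrepancy with the desired output
    loss.backward()                    # backward propagation: gradients
    optimizer.step()                   # update the trainable parameters
```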
Confusion Matrix
Kappa Statistics
3 First Approach
First, we have to acquire the spectral cube. Then we’ll apply the K-means
clustering algorithm on that cube.
The K-means algorithm calculates initial class means evenly distributed in the
data space, then iteratively clusters the pixels into the nearest class using a
minimum-distance technique.
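A hedged sketch of these two steps, flattening a toy spectral cube into a pixel-by-band matrix before clustering (the cube shape and cluster count are illustrative; in practice the cube comes from the acquired HSI data):

```python
# First approach sketch: acquire (here, simulate) the spectral cube, then
# run k-means on the per-pixel spectral vectors.
import numpy as np
from sklearn.cluster import KMeans

cube = np.random.rand(100, 100, 120)           # (rows, cols, bands) toy cube
pixels = cube.reshape(-1, cube.shape[-1])      # one spectral vector per pixel

km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(pixels)
cluster_map = km.labels_.reshape(cube.shape[:2])   # per-pixel cluster image
```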
Choosing the optimal number of clusters:
Elbow method
Silhouette analysis
Figure 12
Silhouette analysis (SA) is another way to measure how close each point in a
cluster is to the points in its neighboring clusters.
A key advantage of using the SA score for finding the optimal number of
clusters is that it can be applied to an unlabelled data set.
The silhouette ranges from -1 to 1. Silhouette coefficients near +1 indicate
that the sample is far away from the neighboring clusters.
A value of 0 indicates that the sample is on or very close to the decision
boundary between two neighboring clusters.
Negative values indicate that those samples might have been assigned to the
wrong cluster.
The silhouette can be calculated with any distance metric, such as Euclidean
distance.
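A sketch of silhouette-based model selection, scanning a range of k values on toy data and keeping the k with the highest mean silhouette coefficient (the scan range 2-10 is an arbitrary choice):

```python
# Choose k by silhouette analysis: fit k-means for each candidate k and keep
# the k with the highest mean silhouette coefficient (Euclidean distance).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

pixels = np.random.rand(2000, 120)            # toy pixel spectra

best_k, best_score = None, -1.0
for k in range(2, 11):                        # silhouette needs at least 2 clusters
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pixels)
    score = silhouette_score(pixels, labels, metric="euclidean")
    if score > best_score:
        best_k, best_score = k, score
print(best_k, best_score)
```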
Figure 13