
Machine Learning and Hyperspectral Imaging

I. Gkouzionis

Technical University of Crete


ECE Department
Electronics Laboratory
Optoelectronics & Imaging Diagnostics Research Group

January 22, 2018



Outline

1 Hyperspectral Imaging: What is so special!

2 Hyperspectral Image Classification


Introduction - Challenges
Feature Mining
Classification
Supervised Classification
Unsupervised Classification
Deep Learning - CNNs
Accuracy Assessment

3 First Approach


Hyperspectral Imaging: What is so special!

Figure 1


Thanks to advances in optical sensing technology, hyperspectral imaging (HSI) can record rich spectral and spatial information about the observed scene.
The wealth of spatial and spectral information in HSI provides superior identifiability for classification, which is a crucial component of many applications.
As a result, the last decade has seen increasing interest in research on HSI classification.


Hyperspectral Image Classification

Traditional pixel-wise HSI classification is based on the fact that different materials have different spectral reflectance, and identifies each material by its spectral curve.
In other words, each pixel is classified by its digital numbers across the different bands.
Given a set of observations (i.e. pixel vectors in a hyperspectral image), the goal of classification is to assign a unique label to each pixel vector so that it is well-defined by a given class.
The availability of hyperspectral data with high spatial resolution has been quite important for classification techniques (i.e. such data mostly contain pure pixels that are represented by a single predominant spectral signature).
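To make the pixel-vector view concrete, a minimal numpy sketch follows; the cube dimensions are illustrative assumptions.

```python
import numpy as np

# Hypothetical spectral cube: 145 x 145 pixels, 200 spectral bands
cube = np.random.rand(145, 145, 200)

# Pixel-wise classification treats each pixel as one spectral vector
pixels = cube.reshape(-1, cube.shape[-1])   # shape (145*145, 200)
print(pixels.shape)                         # each row is one pixel's spectrum
```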


Challenges in Hyperspectral Image Classification


Imbalance between dimensionality and training samples; presence of mixed pixels

Figure 2

Challenges in Hyperspectral Image Classification

The special characteristics of hyperspectral data pose several processing problems:
1 The high-dimensional nature of hyperspectral data introduces important limitations for supervised classifiers, such as the limited availability of training samples and the inherently complex structure of the data
2 The presence of mixed pixels, resulting from insufficient spatial resolution and other phenomena, must be addressed in order to properly model the hyperspectral data (e.g. underneath a pixel there may be a mix of several classes)
3 Computationally efficient algorithms are needed, able to provide a response in reasonable time and thus address the computational requirements of time-critical applications


Classification Process
The discrimination of materials based on their spectral profile can be considered a classification task, in which groups of pixels are labeled as a particular class based on their reflectance properties, exploiting training examples to model each class.
The classification process has two main stages:
1 The number and nature of the categories are determined
2 Every unknown or unseen element is assigned to one of the categories according to its resemblance to the basic class patterns

Figure 3


Classification Process

Hyperspectral image classification involves several steps:
Feature extraction/selection: the term feature refers to a single element or pattern
Training: the term arose from the fact that many pattern recognition systems are trainable, i.e. they learn the discriminant functions in the feature space by adjusting their parameters when applied to a training pattern (pixel vector) whose true class is known
Labeling: the process of allocating individual pixels to their most likely class


Feature Extraction/Selection

Hyperspectral imaging is characterized by its high spectral resolution, which allows capturing fine details of the spectral characteristics of materials in a wide range of applications.
However, some challenges must be faced:
Adjacent bands are highly correlated, and the number of original spectral features may be too high for classification purposes
The original spectral features may not be the most effective ones for separating the objects of interest from others
These observations have fostered the use of feature mining techniques, so that an effective set of features can be identified prior to classification.


Feature Extraction/Selection

Feature extraction performs two functions:
1 Separation of useful information from noise or non-information
2 Reduction of the dimensionality of the data, in order to simplify the calculations performed by the classifier and to increase the efficiency of statistical estimators in a statistical classifier
These aims can be achieved by applying a spatial or spectral transform to the image, such as selecting a subset of bands or applying a principal component transformation to reduce the data dimensionality.


Feature Extraction

Several strategies have been used in the hyperspectral imaging literature to perform feature extraction prior to classification.
A distinguishing characteristic of feature extraction methods is that they exploit all available spectral measurements in order to extract relevant features.
A widely used approach has been the generation of features in a new space, like those obtained from Principal Component Analysis (PCA).
In this technique, the hyperspectral data are projected onto a new space in which the first few components account for most of the total information in the data.


Feature Extraction

The goals of PCA are to:
extract the most important information from the data table
compress the size of the data set by keeping only the important information
simplify the description of the data set
analyze the structure of the observations and the variables
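A minimal scikit-learn sketch of PCA-based reduction of a spectral cube; the cube shape and the number of retained components are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

cube = np.random.rand(145, 145, 200)        # hypothetical cube: rows x cols x bands
X = cube.reshape(-1, cube.shape[-1])        # one spectral vector per pixel

pca = PCA(n_components=10)                  # keep the first 10 principal components
X_pca = pca.fit_transform(X)                # project pixels onto the new space
print(pca.explained_variance_ratio_.sum())  # fraction of total variance retained
```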


Feature Extraction

Another spectral-based approach to generating new features has been the Discrete Wavelet Transform (DWT), which separates the high- and low-frequency components of the signal.
This allows a form of derivative analysis, which has been used to generate features prior to hyperspectral image classification.
Another popular strategy has been Canonical Analysis, which focuses on the extraction of features that maximize the ratio between the variance among classes and the average variance within the classes.
However, this approach requires good estimates of the class covariance matrices, and therefore a generally large number of training samples is often required.
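A minimal sketch of DWT-based feature generation for one pixel spectrum, assuming the PyWavelets package; the wavelet and decomposition level are illustrative choices.

```python
import numpy as np
import pywt

spectrum = np.random.rand(200)                   # hypothetical pixel spectrum (200 bands)

# A multilevel DWT separates the low-frequency (approximation) and
# high-frequency (detail) components of the spectral curve
coeffs = pywt.wavedec(spectrum, 'db4', level=3)  # [cA3, cD3, cD2, cD1]
features = np.concatenate(coeffs)                # stacked wavelet features for a classifier
print([c.shape for c in coeffs])
```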


Feature Selection

In feature selection, the idea is to select a set of spectral bands from the initial pool of available bands prior to classification.
A particular characteristic of feature selection methods is that they tend to retain the spectral meaning of the data while reducing the number of bands.
In unsupervised feature selection, the goal is to automatically find statistically important features. The advantage of unsupervised methods is that they do not need training data.
In contrast, supervised feature selection is based on general/expert knowledge and requires labeled training samples.
Techniques in the latter category comprise methods based on class separability measures using standard distance metrics (e.g. Euclidean distance, mutual information, Bhattacharyya distance, Mahalanobis distance).
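For reference, the Bhattacharyya distance between two classes modeled as Gaussians with means $\mu_1, \mu_2$ and covariances $\Sigma_1, \Sigma_2$ takes the standard form:

```latex
D_B = \frac{1}{8}(\mu_1 - \mu_2)^T \Sigma^{-1} (\mu_1 - \mu_2)
      + \frac{1}{2}\ln\!\left(\frac{\det \Sigma}{\sqrt{\det \Sigma_1 \det \Sigma_2}}\right),
\qquad \Sigma = \frac{\Sigma_1 + \Sigma_2}{2}
```

Larger values indicate better class separability, so candidate band subsets can be ranked by the distance they induce.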


Classification: Introduction

Generally, image classification techniques can be divided into supervised and unsupervised methods, based on the involvement of the user during the classification process.
Supervised learning is the more useful technique when the data samples have known outcomes that the user wants to predict.
On the other hand, unsupervised learning is more appropriate when the user does not know the subdivisions into which the data samples should be divided.


Supervised vs. Unsupervised Classification

Supervised classification techniques require training areas to be defined by the analyst in order to determine the characteristics of each category.
Each pixel in the image is thus assigned to one of the categories using the extracted discriminating information
Unsupervised classification searches for natural groups of pixels, called clusters, present within the data, by assessing the relative locations of the pixels in the feature space
The main difference between supervised and unsupervised classification approaches is that supervised classification requires training data.


Supervised vs. Unsupervised Classification

Figure 4


Supervised Classification: Introduction

Supervised classification is performed in two stages. The first stage is the training of the classifier, and the second is testing the performance of the trained classifier on unknown pixels.
In the training stage, the analyst defines the regions that will be used to extract training data, from which statistical estimates of the data properties are computed.
At the classification stage, every unknown pixel in the test image is labeled according to its spectral similarity to the specified features.
If a pixel is not spectrally similar to any of the classes, it can be allocated to an unknown class.
As a result, an output image, or thematic map, is produced, showing every pixel with a class label.


Training Data

The characteristics of the training data selected by the analyst have a considerable effect on the reliability and performance of a supervised classification process.
The training data must be defined by the analyst in such a way that they accurately represent the characteristics of each individual feature and class used in the analysis.
Two features of the training data are of key importance:
1 The data must represent the range of variability within each class
2 The size of the training data set should be sufficient


Maximum Likelihood (ML) Classifier

Maximum Likelihood (ML) is a very popular supervised classifier, widely used in pattern recognition and image classification.
It usually achieves higher classification accuracy than other traditional classification approaches.
It is based on the assumption that the probability density function of each class is normal (Gaussian).
The ML classifier assumes that the statistics of each class in each band are normally distributed, and calculates the probability that a given pixel belongs to a specific class.
Unless a probability threshold is selected, all pixels are classified: each pixel is assigned to the class with the highest probability. If the highest probability is smaller than the threshold, the pixel remains unclassified.
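Under the Gaussian assumption, the decision can be written with the standard log-likelihood discriminant (stated here for reference; the notation follows the usual formulation rather than the slides): a pixel $x$ is assigned to the class $\omega_i$ that maximizes

```latex
g_i(x) = \ln P(\omega_i) - \frac{1}{2}\ln\lvert\Sigma_i\rvert
         - \frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)
```

where $\mu_i$ and $\Sigma_i$ are the class mean vector and covariance matrix estimated from the training samples, and $P(\omega_i)$ is the class prior probability.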


Minimum Distance Classifier

The Minimum Distance classifier is a simple non-parametric classification method that uses the minimum distance between a pixel and the centroid, or most representative spectrum, of each training class.
This classification method uses different kinds of distance metrics in the multidimensional feature space to measure the degree of dissimilarity between pixels and the class centroids computed from the training data.
Each pixel is assigned to the least dissimilar class centroid.
Limitations of the classifier:
It is sensitive to different degrees of variance in the spectral response data
It is not widely used in applications where spectral classes are close to one another in measurement space and have high variance
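A minimal numpy sketch of the rule; the spectra, labels, and number of classes are illustrative assumptions.

```python
import numpy as np

X_train = np.random.rand(100, 200)       # hypothetical training spectra (200 bands)
y_train = np.random.randint(0, 4, 100)   # hypothetical labels for 4 classes
X_test = np.random.rand(10, 200)

# Centroid of each class = mean spectrum of its training pixels
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in range(4)])

# Assign each test pixel to the class with the nearest (Euclidean) centroid
d = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
labels = d.argmin(axis=1)
print(labels)
```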


Spectral Angle Mapper (SAM)

The Spectral Angle Mapper (SAM) is a physically-based spectral classification method that uses an n-dimensional angle to match pixels to reference spectra.
The algorithm determines the spectral similarity between two spectra by calculating the angle between them, treating them as vectors in a space whose dimensionality equals the number of bands.
The SAM algorithm computes the spectral angle between each pixel spectrum and the training pixels' spectra.
SAM is thus a common distance metric: it compares an unknown pixel spectrum to each of K reference spectra and assigns the pixel to the material with the smallest angle.
Pixels further away than a specified maximum angle threshold (in radians) are not classified.
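A minimal numpy sketch of the SAM rule; the reference spectra and the angle threshold are illustrative assumptions.

```python
import numpy as np

def spectral_angle(x, r):
    # Angle between two spectra viewed as vectors: arccos(x.r / (|x||r|))
    cos = np.dot(x, r) / (np.linalg.norm(x) * np.linalg.norm(r))
    return np.arccos(np.clip(cos, -1.0, 1.0))

refs = np.random.rand(4, 200)        # hypothetical K=4 reference spectra
pixel = np.random.rand(200)          # one unknown pixel spectrum

angles = np.array([spectral_angle(pixel, r) for r in refs])
label = angles.argmin() if angles.min() < 0.1 else -1   # 0.1 rad: illustrative threshold
print(angles, label)                 # label -1 means "unclassified"
```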


Support Vector Machine (SVM)

The Support Vector Machine (SVM) is a classification method grounded in statistical learning theory.
SVM is a non-parametric classifier that locates the optimal hyperplane separating two classes in a new high-dimensional feature space, taking into account only the training samples that lie on the edge of the class distributions, known as support vectors.
Moreover, it does not require the assumption of normality and is insensitive to the curse of dimensionality.
SVMs have often been found to provide higher classification accuracies than other widely used techniques, such as ML.
Furthermore, SVMs appear to be especially advantageous in the presence of heterogeneous classes for which only few training samples are available.
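A minimal scikit-learn sketch of an SVM pixel classifier; the data, kernel, and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X = np.random.rand(500, 200)         # hypothetical pixel spectra
y = np.random.randint(0, 4, 500)     # hypothetical class labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# An RBF kernel implicitly maps spectra into a high-dimensional feature space
clf = SVC(kernel='rbf', C=10.0, gamma='scale')
clf.fit(X_tr, y_tr)
print(clf.score(X_te, y_te))         # overall accuracy on the held-out pixels
```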


Support Vector Machine (SVM)

Figure 5


K-Nearest Neighbour (KNN)

The K-Nearest Neighbour (KNN) algorithm is among the simplest of all machine learning classification techniques.
Specifically, KNN classifies among the different categories based on the closest training examples in the feature space.
The training process for this algorithm consists only of storing the feature vectors and labels of the training images.
In the classification process, an unlabelled test sample is assigned the label of its K nearest neighbours by majority vote.
The most common distance metric for KNN is the Euclidean distance.
A key advantage of the KNN algorithm is that it performs well in multi-label classification problems, since the final prediction is based on a small neighbourhood of similar classes.


K-Nearest Neighbour (KNN)

The steps of the KNN algorithm are the following (see the sketch after the list):
1 Choose the number K of neighbours
2 Take the K nearest neighbours of the new data point, according to the Euclidean distance
3 Among these K neighbours, count the number of data points in each category
4 Assign the new data point to the category with the most neighbours
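A minimal scikit-learn sketch of these steps; the data and the value of K are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.random.rand(500, 200)      # hypothetical training spectra
y_train = np.random.randint(0, 4, 500)  # hypothetical class labels
X_test = np.random.rand(10, 200)

# K=5 neighbours, Euclidean distance, majority vote
knn = KNeighborsClassifier(n_neighbors=5, metric='euclidean')
knn.fit(X_train, y_train)               # "training" just stores the data
print(knn.predict(X_test))
```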


Unsupervised Classification: Introduction

An unsupervised classification method is used to determine the number of spectrally-separable groups, or clusters, in an image for which insufficient reference information is available.
When applying an unsupervised method, the analyst generally specifies only the number of classes (or lower and upper bounds on the number of classes) and some statistical measure, depending on the type of clustering algorithm used.
The clusters are determined by estimating distances, or by comparing the variance within and between the clusters.
After the specified number of groups has been determined, they are labelled by allocating pixels to features present in the scene.


K-Means Clustering

Clustering is the process of dividing a group of data into a number of smaller groups known as clusters.
K-means clustering is one of the most popular and simplest unsupervised learning algorithms that solve the clustering problem.
The K-means algorithm clusters n objects into k partitions (k < n) based on their attributes, and assumes that the object attributes form a vector space.
The goal of K-means is to reduce the variability within each cluster.


K-Means Clustering

K-means unsupervised classification calculates initial class means evenly distributed in the data space, then iteratively clusters the pixels into the nearest class using a minimum-distance technique.
Each iteration recalculates the class means and reclassifies the pixels with respect to the new means.
All pixels are classified to the nearest class unless a standard deviation or distance threshold is specified.


K-Means Clustering

The algorithm is composed of the following steps (see the sketch after the list):
Choose the number K of clusters
Select K points at random as the initial centroids
Assign each data point to the closest centroid; this forms K clusters
Compute and place the new centroid of each cluster
Reassign each data point to the new closest centroid; if any reassignment took place, go back to the previous step, otherwise finish
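A minimal scikit-learn sketch of K-means on a spectral cube; the cube shape and K are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

cube = np.random.rand(145, 145, 200)    # hypothetical spectral cube
X = cube.reshape(-1, cube.shape[-1])    # one spectral vector per pixel

kmeans = KMeans(n_clusters=6, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)          # runs the assign/update loop above

# Reshape cluster labels back into image form: a thematic (pseudocolor) map
thematic_map = labels.reshape(cube.shape[:2])
print(thematic_map.shape)
```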


K-Means Clustering

Figure 6


K-Means Clustering

A major advantage of this process is that the method is robust, efficient and easy to understand.
When the number of variables is large, K-means is usually computationally faster than other clustering methods, provided k is kept small.
A drawback of the K-means algorithm is that the number of clusters k is an input parameter.
An inappropriate choice of k may yield poor results.


ISODATA Clustering

ISODATA stands for Iterative Self-Organizing Data Analysis Technique.
ISODATA clustering is a well-known algorithm that allows the number of clusters to be adjusted automatically during the iterations, by merging similar clusters and splitting clusters with large standard deviations.
The algorithm starts from a random initial partition of the pixel vectors into candidate clusters, and then reassigns these vectors to clusters in such a way that the squared error is reduced at each iteration, until a convergence criterion is achieved.
The algorithm permits splitting, merging, and deleting clusters at each iteration, in order to produce more accurate results and to mitigate the dependence of the results on the initialization.
The ISODATA algorithm is more flexible than the K-means method, but the user has to choose many more parameters empirically.
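ISODATA comes in many variants; below is a heavily simplified sketch of the assign/delete/split/merge loop, where all parameter names, thresholds, and heuristics are illustrative assumptions rather than the standard specification.

```python
import numpy as np

def isodata(X, k_init=8, max_iter=10, min_size=20, split_std=0.25, merge_dist=0.5):
    """Heavily simplified ISODATA sketch; all parameters are illustrative."""
    rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), k_init, replace=False)]
    for _ in range(max_iter):
        # Assign each pixel vector to its nearest cluster centre
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Delete clusters with too few members, then recompute the means
        keep = [i for i in range(len(centers)) if np.sum(labels == i) >= min_size]
        if keep:
            centers = np.array([X[labels == i].mean(axis=0) for i in keep])
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
        # Split clusters whose largest per-band standard deviation is too big
        new_centers = []
        for i, c in enumerate(centers):
            pts = X[labels == i]
            if len(pts) == 0:
                continue
            s = pts.std(axis=0)
            if s.max() > split_std and len(pts) > 2 * min_size:
                shift = np.zeros_like(c)
                shift[s.argmax()] = s.max()
                new_centers += [c + shift, c - shift]   # split along the widest band
            else:
                new_centers.append(c)
        centers = np.array(new_centers)
        # Merge the closest pair of centres if they are too similar
        if len(centers) > 1:
            dc = np.linalg.norm(centers[:, None] - centers[None, :], axis=2)
            dc[np.diag_indices_from(dc)] = np.inf
            i, j = np.unravel_index(dc.argmin(), dc.shape)
            if dc[i, j] < merge_dist:
                merged = (centers[i] + centers[j]) / 2
                centers = np.vstack([np.delete(centers, [i, j], axis=0), merged])
    # Final assignment with the last set of centres
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, d.argmin(axis=1)

pixels = np.random.rand(2000, 20)       # hypothetical pixel vectors
centers, labels = isodata(pixels)
print(len(centers))                     # number of clusters after merging/splitting
```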


ISODATA Clustering

Figure 7


Deep Learning: Introduction

Deep learning-based methods achieve promising performance in many fields.
In deep learning, Convolutional Neural Networks (CNNs) play a dominant role in processing visual-related problems.
CNNs are biologically-inspired, multilayer classes of deep learning models that use a single neural network trained end to end, from raw image pixel values to classifier outputs.
With large-scale sources of training data and efficient GPU implementations, CNNs have recently outperformed other conventional methods on many vision-related tasks, including image classification.


Artificial Neural Networks (ANN)

ANNs belong to the artificial intelligence techniques that are widely used as computing tools in image analysis.
An ANN is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use.
A typical ANN comprises a large number of simple processing units, called nodes, linked by weighted connections according to a specified architecture.
The basic ANN model consists of an input layer, a hidden layer and an output layer (Fig. 8).
Learning occurs by adjusting the weights in the nodes to minimize the difference between the output node activation and the desired output.
One can select the number of hidden layers to be used, and can choose between a logistic and a hyperbolic tangent activation function.


Artificial Neural Networks (ANN)

Figure 8


Artificial Neural Networks (ANN)

Implementing an ANN requires setting a number of parameters (loosely mapped to code in the sketch after the list):
1 Training rate: determines the magnitude of the adjustment of the weights. A higher rate speeds up the training, but also increases the risk of oscillation or non-convergence on the training set
2 Training threshold contribution: determines the size of the contribution of the internal weight with respect to the activation level of the node. It is used to adjust the changes to a node's internal weight
3 Training momentum: used to define the step of the training rate. Its effect is to encourage weight changes along the current direction
4 Training RMS exit criterion: defines the RMS error value at which the training should stop
5 Number of hidden layers: defines whether the different input regions will be linearly separable with a single hyperplane or not
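As a rough illustration, several of these parameters have loose counterparts in scikit-learn's MLPClassifier; the mapping and all values are illustrative assumptions, not an exact correspondence.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.random.rand(500, 200)          # hypothetical pixel spectra
y = np.random.randint(0, 4, 500)      # hypothetical class labels

ann = MLPClassifier(
    hidden_layer_sizes=(30,),         # one hidden layer of 30 nodes
    activation='logistic',            # logistic activation function
    solver='sgd',
    learning_rate_init=0.01,          # ~ training rate
    momentum=0.9,                     # ~ training momentum
    tol=1e-4,                         # ~ exit criterion on loss improvement
    max_iter=500,
)
ann.fit(X, y)
print(ann.score(X, y))
```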


Artificial Neural Networks (ANN)

The most widely used model is the multi-layered feed-forward ANN. Its design consists of one input layer, at least one hidden layer, and one output layer.
This algorithm is a promising technique for a number of situations, such as non-normality, complex feature spaces and multivariate data types, where traditional methods fail to give accurate results.
One of the most notable features of a neural network that motivates its adoption in hyperspectral image classification is its robustness when presented with partially incomplete or incorrect input patterns, together with its ability to generalize.


CNNs

CNNs are feed-forward neural networks consisting of various combinations of convolutional layers, max pooling layers, and fully connected layers.
A CNN consists of one or more pairs of convolution and max pooling layers and finally ends with a fully connected layer (Fig. 9).
CNNs differ from ordinary neural networks in that the neurons in a convolutional layer are only sparsely connected to the neurons in the next layer, based on their relative location.
In CNNs, each hidden activation h_i is computed by multiplying a small local input V against the weights W.
The weights W are then shared across the entire input space.
Neurons that belong to the same layer share the same weights.
Weight sharing is a critical principle in CNNs, since it helps reduce the total number of trainable parameters and leads to more efficient training and a more effective model.


CNNs

Figure 9


CNNs

Pooling is used to make the features invariant to location; it summarizes the output of multiple neurons in convolutional layers through a pooling function.
A typical pooling function is the maximum.
A max pooling function simply returns the maximum value from its input.
Max pooling partitions the input data into a set of non-overlapping windows and outputs the maximum value for each subregion; this reduces the computational complexity for upper layers and provides a form of translation invariance.
The computation chain of a CNN ends in a fully connected network that integrates information across all locations in all feature maps of the layer below.


CNN-Based HSI Classification


Hyperspectral data with hundreds of spectral channels can be illustrated as 2D curves (Fig. 10).

Figure 10: (a) Trees, (b) Bare Soil

We can see that the curve of each class has its own visual shape, different from those of the other classes, although it is relatively difficult to distinguish some classes with the human eye.

CNN-Based HSI Classification


The CNN varies in how the convolutional and max pooling layers are realized
and how the nets are trained.

Figure 11


CNN-Based HSI Classification

In the above CNN classifier, the input is a pixel spectral vector, followed in turn by a convolution layer and a max pooling layer that compute 20 feature maps, which are classified with a fully connected network.
Layers C1 and M2 can be viewed as a trainable feature extractor for the input HSI data, and layer F3 as a trainable classifier on top of the extracted features.
The training process of the CNN classifier contains two steps: forward propagation and backward propagation.
The forward propagation computes the actual classification result of the input data with the current parameters.
The backward propagation updates the trainable parameters in order to make the discrepancy between the actual and the desired classification output as small as possible.
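A minimal Keras sketch of such a spectral CNN; the layer sizes, kernel width, and activations are illustrative assumptions, not the exact architecture of Fig. 11.

```python
from tensorflow.keras import Input, layers, models

n_bands, n_classes = 200, 16              # hypothetical scene dimensions

model = models.Sequential([
    Input(shape=(n_bands, 1)),            # input: one pixel spectral vector
    layers.Conv1D(20, kernel_size=11, activation='tanh'),  # C1: 20 feature maps
    layers.MaxPooling1D(pool_size=3),     # M2: max pooling
    layers.Flatten(),
    layers.Dense(100, activation='tanh'), # F3: fully connected classifier
    layers.Dense(n_classes, activation='softmax'),
])
model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(X_train[..., None], y_train, epochs=50)  # forward + backward propagation
model.summary()
```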


Accuracy Assessment: Introduction

The results of any classification process applied to hyperspectral data must be quantitatively assessed in order to determine their accuracy.
The purpose of quantitative accuracy assessment is the identification and measurement of map errors.
Quantitative accuracy assessment involves comparing an area on a map against reference information for the same area, assuming the reference data to be correct.


Confusion Matrix

The accuracy of a classification has traditionally been measured by the overall accuracy: a confusion matrix is generated, and the accuracy is obtained by dividing the total number of correctly classified pixels by the total number of reference pixels.
However, as a single measure of accuracy, the overall accuracy gives no insight into how well the classifier performs for each of the different classes.
In particular, a classifier might perform well for a single class that accounts for a large proportion of the test data, and this will bias the overall accuracy despite low accuracies for the other classes.
To avoid such bias when assessing the accuracy of a classifier, it is important to consider the individual class accuracies.
An individual class accuracy is obtained by dividing the total number of correctly classified pixels in a category by the total number of pixels of that category.
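A minimal scikit-learn sketch of these measures; the label arrays are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

y_true = np.random.randint(0, 4, 1000)    # hypothetical reference labels
y_pred = np.random.randint(0, 4, 1000)    # hypothetical classifier output

cm = confusion_matrix(y_true, y_pred)
overall = np.trace(cm) / cm.sum()         # correctly classified / reference pixels
per_class = np.diag(cm) / cm.sum(axis=1)  # individual class accuracies
kappa = cohen_kappa_score(y_true, y_pred) # see the kappa statistic below
print(overall, per_class, kappa)
```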


Kappa Statistics

The kappa statistic measures how far the classification result departs from a purely random result.
It measures the difference between the actual agreement in the confusion matrix and the chance agreement indicated by the row and column totals (the sum, over classes, of the products of row and column totals).
It provides a better measure of the accuracy of a classifier than the overall accuracy, since it takes into account the whole confusion matrix rather than the diagonal elements alone.
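In its standard form (stated here for reference), kappa compares the observed agreement $p_o$ with the chance agreement $p_e$ derived from the row and column totals:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_o = \frac{\sum_i n_{ii}}{N},
\qquad
p_e = \frac{\sum_i n_{i+}\, n_{+i}}{N^2}
```

where $n_{ii}$ are the diagonal entries of the confusion matrix, $n_{i+}$ and $n_{+i}$ the row and column totals, and $N$ the total number of reference pixels.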

First Approach

Using K-Means Clustering

First, we have to acquire the spectral cube. Then we apply the K-means clustering algorithm to that cube.
The K-means algorithm calculates initial class means evenly distributed in the data space, then iteratively clusters the pixels into the nearest class using a minimum-distance technique.
Choosing the optimal number of clusters (sketches follow below):
Elbow method
Silhouette analysis
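A minimal sketch of the elbow method: run K-means over a range of K and look for the "elbow" where the curve flattens; the cube and the K range are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

cube = np.random.rand(145, 145, 200)    # hypothetical spectral cube
X = cube.reshape(-1, cube.shape[-1])

# Inertia = within-cluster sum of squared distances; it always decreases
# with K, and the K at the 'elbow' of the curve is a good candidate
for k in range(2, 11):
    inertia = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    print(k, inertia)
```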


Using K-Means Clustering

Figure 12


Using K-Means Clustering

Silhouette analysis (SA) is another way to measure how close each point in a cluster is to the points in its neighbouring clusters.
A key advantage of using the silhouette score to find the optimal number of clusters is that it can be applied to an unlabelled data set.
The silhouette ranges from -1 to 1. Silhouette coefficients near +1 indicate that the sample is far away from the neighbouring clusters.
A value of 0 indicates that the sample is on, or very close to, the decision boundary between two neighbouring clusters.
Negative values indicate that those samples might have been assigned to the wrong cluster.
The silhouette can be calculated with any distance metric, such as the Euclidean distance.
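For a sample i with mean intra-cluster distance a(i) and mean distance b(i) to the nearest neighbouring cluster, the silhouette is s(i) = (b(i) - a(i)) / max(a(i), b(i)). scikit-learn provides the mean score directly; the data and K range below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.random.rand(2000, 200)           # hypothetical pixel spectra

# The K with the highest mean silhouette is a good candidate
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, silhouette_score(X, labels))   # mean s(i) over all samples
```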


Using K-Means Clustering

Figure 13


Using K-Means Clustering

The K-means algorithm steps are as follows:
Choose the number K of clusters
Select K points at random as the initial centroids
Assign each data point to the closest centroid
Compute and place the new centroid of each cluster
Reassign each data point to the new closest centroid


Using K-Means Clustering

After finishing the clustering process, we have a final thematic/classification map (a pseudocolor map).
Each color of this map represents pixels with similar spectral characteristics.
The last step of the process is the classification accuracy assessment. The purpose of quantitative accuracy assessment is the identification and measurement of map errors.
Two main methods exist:
Confusion matrix
Kappa statistics

1 Overall accuracy: divide the total number of correctly classified pixels by the total number of reference pixels
2 Individual class accuracy: divide the total number of correctly classified pixels in a class by the total number of pixels of that class
