
Total Variation and Related Methods

Martin Burger
Institut für Numerische und Angewandte Mathematik

CeNoS

European Institute for Molecular Imaging

Mathematical Imaging@WWU:
Christoph Brune, Martin Benning, Alex Sawatzky, Frank Wübbeling, Thomas Kösters, Bärbel Schlake, Marzena Franek, Christina Stöcker, Mary Wolfram, Thomas Grosser, Jahn Müller

Cetraro, September 2008


Imaging Basics: What and Why?


- Denoising: Given a noisy version of an image, find a smooth approximation (more appealing to the human eye, or better suited for some task, e.g. counting cells in a microscope)
- Decomposition: Given an image, decompose it into different parts such as smooth structure, texture, edges, noise
- Deblurring: Given a blurred (and possibly noisy) version of an image, find an approximation of the original image
- Inpainting: Given an image with holes, try to fill the holes as a reasonable continuation


Imaging Basics: What and Why?


- Segmentation: Find the edges / different objects in an image
- Reconstruction: Given indirect information about an image, e.g. from tomography, try to find an approximation of the image

Many of these tasks can be categorized as inverse problems: the reconstruction of the cause of an observed effect (via a mathematical model relating them). Diagnosis in medicine is a prototypical example.

"The grand thing is to be able to reason backwards." Arthur Conan Doyle (A Study in Scarlet)


Noisy Images
Noise arises from measurement devices or transmission loss.


Damaged Images
Corrupted pixels, dust, scratches


Medical Imaging: CT
Classical image reconstruction example: computerized tomography (CT).
Mathematical problem: reconstruction of a density function from its line integrals, i.e. inversion of the Radon transform.
cf. Natterer 86, Natterer-Wübbeling 02
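Since the formula is not reproduced on the slide, for reference the Radon transform of a density $u$ on $\mathbb{R}^2$ can be written as
\[ (Ru)(s,\theta) = \int_{\mathbb{R}} u(s\theta + t\theta^{\perp}) \, dt, \qquad s \in \mathbb{R}, \ \theta \in S^1, \]
the integral of $u$ along the line with normal direction $\theta$ at signed distance $s$ from the origin. CT reconstruction amounts to inverting $R$ from (noisy) samples of $Ru$.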




Medical Imaging: CT
+ Low noise level
+ High spatial resolution
+ Exact reconstruction
+ Reasonable costs

- Scan restricted to a few seconds (radiation exposure, ~20 mSv)
- No functional information
- Few mathematical challenges left

[CT figures: Soret, Bacharach, Buvat 07; Schäfers et al 07]


Medical Imaging: MR
+ Low noise level
+ High spatial resolution
+ Reconstruction by Fourier inversion
+ No radiation exposure
+ Good contrast in soft matter

- Low tracer sensitivity
- Limited functional information
- Expensive
- Few mathematical challenges left
[Figure courtesy of Carsten Wolters, University Hospital Münster]

Medical Imaging: Ultrasound


+ Fast and cheap
+ Varying spatial resolution
+ Usually no reconstruction needed
+ No radiation exposure

- High noise level
- Bad contrast / problems at bones


Imaging Examples: PET (Human / Small animal)

Positron emission tomography (PET). Data: detected decay events of a radioactive tracer. The decay events are random, but their rate is proportional to the tracer uptake (a Radon transform with random directions).

Imaging of molecular properties.


Medical Imaging: PET


+ High sensitivity
+ Long measurement time (minutes up to ~1 hour; radiation exposure 8-12 mSv)
+ Functional information
+ Many open mathematical questions

- Little anatomical information
- High noise level and disturbing effects (attenuation, scattering, ...)
- Low spatial resolution

[PET figures: Soret, Bacharach, Buvat 07; Schäfers et al 07]

Small Animal PET: Burning down the Mouse


Image reconstruction in PET


Stochastic models are needed: the measurements are typically drawn from a Poisson model with expectation $Ku$,
\[ f \sim \mathrm{Poisson}(Ku). \]
The image $u$ equals the density function (uptake) of the tracer, and the linear operator $K$ equals the Radon transform. Possibly there is additional (Gaussian) measurement noise $b$, i.e. $f \sim \mathrm{Poisson}(Ku) + b$.


Cameras and Microscopes


The same model with a different $K$ can be used for imaging with photons (microscopy, CCD cameras, ...). Typically the Poisson statistics are good (many photon counts) and the measurement noise dominates. In some cases the opposite is true!
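As a small illustration of this data model, the following sketch simulates photon-count data from a blurred image (a minimal example; the Gaussian blur standing in for $K$, the count level, and all parameters are illustrative assumptions, not from the slides):

import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# "True" image u: a bright square on a dark background
u = np.zeros((64, 64))
u[20:44, 20:44] = 1.0

# Forward operator K: a Gaussian blur as a stand-in for
# e.g. a microscope point spread function
Ku = gaussian_filter(u, sigma=2.0)

# Scale to an expected photon count; low counts = bad statistics
expected_counts = 50.0 * Ku

# Poisson data, plus optional additive Gaussian read-out noise b
f = rng.poisson(expected_counts).astype(float)
f += rng.normal(0.0, 1.0, size=f.shape)  # measurement noise

print(f"simulated data: mean {f.mean():.1f}, max {f.max():.1f}")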


Low SNR
Bad statistics arise due to low radioactive activity or fast-decaying isotopes (e.g. H$_2^{15}$O).

This regime is desirable for patients (low dose) and for certain quantitative investigations (H$_2^{15}$O is a useful tracer for blood flow).

[Figures: reconstructions from ~10,000 events vs. ~600 events]

Basic Paradigms
Typical imaging tasks are solved by a compromise between the following two goals:
- Stay close to the data
- Among those images close to the data, choose the one that corresponds best to a-priori ideas / knowledge

The measure of how close one wants to stay to the data is the SNR, respectively the noise level. For zero noise / infinite SNR one would reproduce the data exactly. The higher the noise level / the lower the SNR, the farther the solution can be from the data.

Imaging models
We distinguish a continuum and a discrete model.

Image: in the continuum, a function $u: \Omega \subset \mathbb{R}^d \to \mathbb{R}$; in the discrete setting, a pixel array $u \in \mathbb{R}^{N \times M}$.

Data: related to the image by an operator (sometimes a nonlinear operator),
\[ f = A(u). \]


Imaging models
We usually use an abstract treatment with an image space X and a data space Y.

The digital (discrete) model is nowadays the realistic one; however, there are several reasons to interpret it as a discretization of an underlying continuum model:
- Images come with different resolutions and should be comparable
- Rich mathematical models exist in the continuum (PDEs)
- Robustness of numerical methods

Relation between Image and Data


Denoising: $f = u + n$, with noise $n$ (the operator is the identity).

Decomposition: $f = u + v + n$, e.g. a smooth part $u$, texture $v$, and noise $n$.


Relation between Image and Data


Deblurring: $f = Au + n$, with a blurring (convolution) operator $A$.

Inpainting: $f = u|_{\Omega \setminus D}$, with the inpainting region $D \subset \Omega$.


Relation between Image and Data


Segmentation: the data $f$ is the image itself; the unknown is a decomposition of $\Omega$ into the different objects (respectively their edges).

Reconstruction: $f = Au + n$, with an indirect measurement operator $A$, e.g. the Radon transform in tomography.


Bayes Paradigm
The two goals are translated into probabilities:
- Conditional data probability $p(f \mid u)$: how likely are the observed data $f$ if the true image is $u$
- A-priori probability $p(u)$ of an image, in the absence of data


Bayes Paradigm and MAP


Together they determine the a-posteriori probability of an image via Bayes' formula,
\[ p(u \mid f) = \frac{p(f \mid u)\, p(u)}{p(f)}. \]
The a-priori probability $p(f)$ of the data is a scaling factor and can be ignored. A natural estimator is the one maximizing this probability, the maximum a-posteriori probability (MAP) estimator.


MAP Estimator
The MAP estimator can be computed by minimizing the negative log-likelihood:
\[ \hat u \in \arg\min_u \left\{ -\log p(f \mid u) - \log p(u) \right\}. \]
The a-priori probability can be related to a regularization term, e.g. $p(u) \sim e^{-\alpha R(u)}$, so that $-\log p(u)$ contributes $\alpha R(u)$ (up to an additive constant).


The log-likelihood
The probability to observe data $f$ if the exact image is $u$ can be related to the distribution of the noise. Example: additive Gaussian noise (pointwise),
\[ f_i = u_i + n_i, \qquad n_i \sim \mathcal{N}(0, \sigma^2) \ \text{i.i.d.}, \qquad p(f \mid u) \sim \prod_i \exp\left( -\frac{(f_i - u_i)^2}{2\sigma^2} \right). \]


The log-likelihood
The negative log-likelihood becomes a sum, which (after appropriate scaling with the pixel size) converges to an integral in the continuum limit:
\[ \frac{1}{2\sigma^2} \sum_i (f_i - u_i)^2 \ \longrightarrow\ \frac{1}{2\sigma^2} \int_\Omega (f - u)^2 \, dx. \]


Variational model
The above reasoning directly yields a standard variational model: the MAP estimator is determined by minimizing the functional
\[ E(u) = \frac{1}{2} \int_\Omega (u - f)^2 \, dx + \alpha R(u). \]


Variational model II
One can show that the above minimization is equivalent to the constrained problem
\[ \min_u R(u) \qquad \text{subject to} \qquad \int_\Omega (u - f)^2 \, dx \le \sigma^2 |\Omega|. \]


Discrepancy principle
The second formulation is a (generalized) discrepancy principle for Gaussian noise: minimize the regularization (maximize the a-priori probability) among all images that give a data discrepancy of the order of the variance.

Alternatively, this can be interpreted as a rule for choosing the regularization parameter: choose $\alpha$ such that
\[ \int_\Omega (u_\alpha - f)^2 \, dx = \sigma^2 |\Omega|. \]

Image Space and Regularization


The image space and the a-priori probability are directly related: X consists of all images with positive probability or, equivalently, with finite regularization functional.

What is the right choice of R?



Image Space and Regularization


How can we get reasonable regularization terms? They depend on the goals and the expectations on the solution. A typical expectation is smoothness, in particular few oscillations (high oscillations = noise, to be eliminated). Few oscillations means a small gradient variance, i.e.
\[ R(u) = \frac{1}{2} \int_\Omega |\nabla u|^2 \, dx. \]


Denoising Example
Consider the MAP estimate for Gaussian noise with the above regularization:
\[ \min_u \ \frac{1}{2} \int_\Omega (u - f)^2 \, dx + \frac{\alpha}{2} \int_\Omega |\nabla u|^2 \, dx. \]
This is an unconstrained optimization problem with a simple optimality condition.


Reminder: Gateaux-Derivative
The Gateaux derivative of a functional is the collection of all directional derivatives:
\[ E'(u; v) = \lim_{t \to 0} \frac{E(u + t v) - E(u)}{t}. \]


Optimality Condition
Compute the Gateaux derivative:
\[ E'(u; v) = \int_\Omega (u - f)\, v \, dx + \alpha \int_\Omega \nabla u \cdot \nabla v \, dx. \]

Optimality: $E'(u; v) = 0$ for all directions $v$.

This is the weak form of
\[ -\alpha \Delta u + u = f \ \text{in } \Omega, \qquad \partial_n u = 0 \ \text{on } \partial\Omega. \]
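A minimal numerical sketch of this elliptic equation in 1D, discretizing $-\alpha u'' + u = f$ with finite differences and homogeneous Neumann boundary conditions (the grid, test signal, and parameters are illustrative assumptions, not from the slides):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, alpha = 200, 1e-3
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# Noisy 1D signal f: a step function plus Gaussian noise
f = (x > 0.5).astype(float) + 0.1 * np.random.default_rng(1).normal(size=n)

# 1D Laplacian with homogeneous Neumann boundary conditions
main = -2.0 * np.ones(n)
main[0] = main[-1] = -1.0  # Neumann: mirror at the boundary
off = np.ones(n - 1)
L = sp.diags([off, main, off], [-1, 0, 1]) / h**2

# Solve (I - alpha * Laplacian) u = f
A = sp.eye(n) - alpha * L
u = spla.spsolve(A.tocsc(), f)

print(f"residual norm: {np.linalg.norm(A @ u - f):.2e}")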


Elliptic Regularity
We were looking for a function in $H^1(\Omega)$, the natural space for the variational problem.


Elliptic Regularity
Regularity theory for the Poisson equation implies $u \in H^2(\Omega)$ (for sufficiently smooth domains).

Hence u even has second derivatives and may be oversmoothed. Note: the derivatives blow up as $\alpha \to 0$.


Scale Space and Inverse Scale Space


The square root of the Lagrange parameter defines a scale: in the optimality condition $\alpha \Delta u = u - f$, the parameter $\alpha$ has the dimension of a squared length.

Hence u varies at a scale of order $\sqrt{\alpha}$, and smaller scales in f are suppressed.


Scale Space and Inverse Scale Space


Multiple scales can be treated by iterating the variational method for small $\alpha$:

- Scale space methods (diffusion filters): start with the noisy image (finest scales) and gradually coarsen the scales until a certain minimal scale is reached
- Inverse scale space methods (Bregman iterations): start with the roughest information about the image (largest scale = whole image, i.e. start with the mean value) and gradually refine the scales until a certain minimal scale is reached

Variational Methods
The variational method can be interpreted both as a
- scale space method: for $\alpha \to 0$ the minimizer tends to the data $f$, and increasing $\alpha$ gradually coarsens the scale, and as an
- inverse scale space method: for $\alpha \to \infty$ the minimizer tends to the mean value of $f$, and decreasing $\alpha$ gradually refines the scale.


Scale Space Methods: Diffusion filters


An alternative construction of a scale space method: reinterpret the optimality condition
\[ \frac{u - f}{\alpha} = \Delta u \]
as one implicit time step of size $\alpha$ for the heat equation, with initial value $f$.


Scale Space Methods


Iterate the variational method using the previous result as data:
\[ \frac{u^{k+1} - u^k}{\alpha} = \Delta u^{k+1}, \qquad u^0 = f. \]


Scale Space Method


Start with $u(0) = f$.

Evolve u by the heat equation
\[ \partial_t u = \Delta u. \]

Denoised result: $u(T)$, where the stopping time T depends on the noise level.
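A minimal sketch of such a diffusion filter in 2D, using an explicit finite-difference scheme (step size chosen for stability; the test image and all parameters are illustrative assumptions):

import numpy as np

def heat_diffusion(f: np.ndarray, T: float, dt: float = 0.2) -> np.ndarray:
    """Evolve image f by the heat equation du/dt = Laplace(u) up to time T
    (explicit scheme, grid spacing 1, Neumann boundary via edge replication)."""
    u = f.astype(float).copy()
    for _ in range(int(T / dt)):
        # 5-point Laplacian with replicated boundary values
        p = np.pad(u, 1, mode="edge")
        lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * u
        u += dt * lap
    return u

# Example: denoise a noisy step image; larger T removes finer scales
rng = np.random.default_rng(0)
f = (np.arange(64)[:, None] > 32).astype(float) + 0.2 * rng.normal(size=(64, 64))
u = heat_diffusion(f, T=2.0)
print(f"std before: {f.std():.2f}, after: {u.std():.2f}")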


Inverse Scale Space Method


An alternative is obtained by starting with the coarsest scale, i.e. with the mean value of $f$, and refining.


Inverse Scale Space Method


The opposite limit (oversmoothing, $\alpha \to \infty$) yields the inverse scale space flow
\[ \partial_t (-\Delta u) = f - u, \qquad u(0) = \text{mean value of } f. \]

Denoised result: $u(T)$, where T depends on the noise level.
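To make the inverse scale space behavior explicit, one can solve this flow in a Fourier basis (a short computation assuming periodic boundary conditions, added here for illustration): for each nonzero frequency $k$,
\[ |k|^2 \, \partial_t \hat u_k = \hat f_k - \hat u_k, \quad \hat u_k(0) = 0 \qquad \Longrightarrow \qquad \hat u_k(t) = \big( 1 - e^{-t/|k|^2} \big)\, \hat f_k. \]
Coarse scales (small $|k|$) are recovered almost immediately, while fine scales (large $|k|$, including the noise) enter only for large $t$; stopping at a finite $T$ therefore refines the image from coarse to fine scales.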



Variational Methods
Alternative smoothing is obtained by penalizing coefficients in orthonormal bases: expand $u = \sum_k c_k \varphi_k$ and penalize a weighted sum of the squared coefficients.


Fourier Series
Rewrite the functional in terms of Fourier coefficients: by Parseval's identity (up to normalization), with coefficients $c_k$ of $u$ and $f_k$ of $f$, we obtain the equivalent minimization
\[ \min_{(c_k)} \ \frac{1}{2} \sum_k (c_k - f_k)^2 + \frac{\alpha}{2} \sum_k |k|^2 \, c_k^2. \]


Fourier Series
The minimization decouples and has the explicit solution
\[ c_k = \frac{f_k}{1 + \alpha |k|^2}, \]
i.e. high frequencies are damped, but never set exactly to zero.
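A minimal sketch of this explicit solution using the FFT (assuming periodic boundary conditions; the test image and alpha are illustrative):

import numpy as np

def h1_denoise_fft(f: np.ndarray, alpha: float) -> np.ndarray:
    """Quadratic (H^1) denoising: divide each Fourier coefficient
    by 1 + alpha * |k|^2, cf. c_k = f_k / (1 + alpha |k|^2)."""
    kx = 2 * np.pi * np.fft.fftfreq(f.shape[0])
    ky = 2 * np.pi * np.fft.fftfreq(f.shape[1])
    k2 = kx[:, None] ** 2 + ky[None, :] ** 2
    return np.real(np.fft.ifft2(np.fft.fft2(f) / (1 + alpha * k2)))

rng = np.random.default_rng(0)
clean = (np.add.outer(np.arange(64), np.arange(64)) > 64).astype(float)
u = h1_denoise_fft(clean + 0.2 * rng.normal(size=clean.shape), alpha=10.0)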


Other Orthonormal Bases / Wavelets


An analogous approach works for other orthonormal bases (e.g. wavelets): minimize over the coefficients,
\[ \min_{(c_k)} \ \frac{1}{2} \sum_k (c_k - f_k)^2 + \frac{\alpha}{2} \sum_k \lambda_k c_k^2, \qquad c_k = \frac{f_k}{1 + \alpha \lambda_k}, \]
with weights $\lambda_k$ determined by the smoothness penalty.


Problems with Quadratic Regularization


Quadratic regularization yields simple linear equations to solve, but has several disadvantages:
- Oversmoothing (see above)
- Edges are destroyed
- Bias from the operator A (see later)

Alternative: other functions of the gradient, $R(u) = \int_\Omega G(\nabla u) \, dx$.


Nonquadratic Regularization
Optimality condition:
\[ u - f = \alpha\, \nabla \cdot \big( \nabla G(\nabla u) \big). \]

Linearization for smooth and strictly convex G yields an elliptic operator with positive definite coefficients $G''(\nabla u)$, hence again full elliptic regularity and oversmoothing.


Total Variation
The only way to penalize oscillations without full elliptic regularity is to choose G not smooth / not strictly convex. Canonical choice:
\[ G(p) = |p|, \qquad R(u) = \int_\Omega |\nabla u| \, dx. \]


Minimization Problem
The minimization problem
\[ \min_u \ \frac{1}{2} \int_\Omega (u - f)^2 \, dx + \alpha \int_\Omega |\nabla u| \, dx \]
has no solution in general (more later). The problem needs to be defined on a larger space.


Total Variation
Rigorous definition:
\[ |u|_{TV} = \sup \left\{ \int_\Omega u \, \nabla \cdot g \, dx \ : \ g \in C_0^\infty(\Omega; \mathbb{R}^d), \ \|g\|_\infty \le 1 \right\}, \]
and the space of functions of bounded variation is $BV(\Omega) = \{ u \in L^1(\Omega) : |u|_{TV} < \infty \}$.
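For intuition, a small sketch of the discrete (anisotropic) total variation via forward differences (one common discretization, assuming grid spacing 1; the slides do not fix a particular one):

import numpy as np

def tv_anisotropic(u: np.ndarray) -> float:
    """Discrete anisotropic TV: sum of absolute forward differences."""
    dx = np.abs(np.diff(u, axis=0)).sum()
    dy = np.abs(np.diff(u, axis=1)).sum()
    return float(dx + dy)

# A sharp step and its smoothly ramped version have the same TV,
# i.e. TV does not penalize edges more than smooth transitions:
step = (np.arange(64)[None, :] >= 32).astype(float) * np.ones((64, 1))
ramp = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
print(tv_anisotropic(step), tv_anisotropic(ramp))  # 64.0 64.0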


Why TV-Methods?
Cartooning

[Figures: cartooning comparison, linear filter vs. TV method]

Why TV-Methods?
Cartooning

[Figures: ROF model with increasing allowed variance]
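The figures refer to the ROF (Rudin-Osher-Fatemi) model. As an illustration, here is a minimal sketch of ROF-type denoising by gradient descent on a smoothed TV energy (the smoothing eps, step size tau, and all parameters are illustrative assumptions, not the method used for the figures):

import numpy as np

def rof_smoothed(f: np.ndarray, alpha: float = 0.3, eps: float = 0.1,
                 tau: float = 0.05, iters: int = 400) -> np.ndarray:
    """Gradient descent on 1/2 ||u-f||^2 + alpha * sum sqrt(|grad u|^2 + eps^2),
    using periodic forward differences so the adjoint is easy to write."""
    u = f.astype(float).copy()
    for _ in range(iters):
        ux = np.roll(u, -1, axis=0) - u          # forward difference, axis 0
        uy = np.roll(u, -1, axis=1) - u          # forward difference, axis 1
        mag = np.sqrt(ux**2 + uy**2 + eps**2)    # smoothed gradient magnitude
        px, py = ux / mag, uy / mag
        # adjoint of the forward difference applied to (px, py) = -div p
        dtp = (np.roll(px, 1, axis=0) - px) + (np.roll(py, 1, axis=1) - py)
        u -= tau * ((u - f) + alpha * dtp)       # gradient step
    return u

rng = np.random.default_rng(0)
clean = (np.add.outer(np.arange(64), np.arange(64)) > 64).astype(float)
denoised = rof_smoothed(clean + 0.3 * rng.normal(size=clean.shape))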

Sparsity
An analogous approach in an orthonormal basis penalizes with a weighted 1-norm:
\[ \min_{(c_k)} \ \frac{1}{2} \sum_k (c_k - f_k)^2 + \alpha \sum_k \lambda_k |c_k|. \]


Sparsity
Most coefficients will be zero (sparse solution); the shrinkage of the coefficients is data-dependent, with the explicit soft-thresholding solution
\[ c_k = \operatorname{sign}(f_k) \, \max\{ |f_k| - \alpha \lambda_k, \, 0 \}. \]
Total variation leads to sparsity in the gradient, hence the gradient will be zero at most points (at the others there are usually edges).
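A one-line sketch of this shrinkage rule (illustrative, with threshold t = alpha * lambda_k):

import numpy as np

def soft_threshold(f_k: np.ndarray, t: float) -> np.ndarray:
    """Soft-thresholding: sign(f_k) * max(|f_k| - t, 0)."""
    return np.sign(f_k) * np.maximum(np.abs(f_k) - t, 0.0)

print(soft_threshold(np.array([-2.0, -0.3, 0.1, 1.5]), t=0.5))
# small coefficients are set exactly to zero: [-1.5 -0. 0. 1.]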
