
Monday, October 2nd, 2013

Biochemistry 405 Lecture #4

PROTEIN CHARACTERIZATION
Kane Hall 130
10:30 - 11:20 am
Lecturer: Wim Hol
Slide Set #2:
Only slide # 33 with suggested Problems updated

BIG PICTURE ITEMS

Proteins differ tremendously in size and properties

Several properties are useful for protein purification

Specific properties are used for protein characterization

Different proteases have different specificities

Mass spectrometry is a powerful analytical method

Protein sequences reveal evolutionary relationships

The rate of evolution of different proteins differs

The Building Blocks of All Proteins


[Figure: the twenty amino acids, grouped as hydrophobic, hydrophilic, or neutral: Gly, Ala, Val, Ile, Leu, Met, Phe, Ser, Cys, His, Asp, Thr, Lys, Glu, Asn, Arg, Gln, Tyr, Trp, Pro]


The Twenty Amino Acids Ultra-schematic

A Polypeptide Chain
Driven by the hydrophobic effect

Linking amino acids by forming peptide units.


The order of the amino acids is called the Primary Structure of a protein

How many proteins are possible?


The average protein chain is ~400 residues in length.
At each position any of the 20 amino acids could occur, so the number of possibilities is 20^400 = 2.6 x 10^520.
The number of atoms in the universe is estimated as 9 x 10^78.
VVP 3/e Fig on p. 91

So the number of possible proteins of length 400 residues exceeds the number of atoms in the universe by many orders of magnitude.
Protein sizes range from ~30 to ~35,000 amino acids.
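
The count quoted on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and not part of the lecture.

```python
from math import log10

AMINO_ACIDS = 20     # choices per position
CHAIN_LENGTH = 400   # average chain length quoted on the slide

# Python integers have arbitrary precision, so this count is exact.
n_sequences = AMINO_ACIDS ** CHAIN_LENGTH

# Express the count as mantissa x 10^exponent using logarithms.
log_value = CHAIN_LENGTH * log10(AMINO_ACIDS)
exponent = int(log_value)
mantissa = 10 ** (log_value - exponent)

print(f"20^400 has {len(str(n_sequences))} digits")
print(f"20^400 is about {mantissa:.1f} x 10^{exponent}")
```

The result agrees with the 2.6 x 10^520 given above, some 440 orders of magnitude beyond the estimated 9 x 10^78 atoms in the universe.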

Protein chains differ considerably in length

From the third column you see that proteins often contain multiple chains.
The last column gives only the molecular mass of a single chain.
You have to know that the average molecular mass of an amino acid residue is about 110 Da.
(You do not need to know the names and numbers in the Table unless they come back explicitly in later lectures)
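
As a quick illustration of how the 110 Da rule of thumb is used, the sketch below converts between chain length and approximate chain mass. The function names are hypothetical, and the result is only an estimate because real residue masses vary.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # average residue mass quoted on the slide


def estimate_chain_mass_da(n_residues: int) -> int:
    """Approximate mass (in Da) of a single chain with n_residues residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


def estimate_chain_length(mass_da: float) -> int:
    """Approximate number of residues in a chain of the given mass (in Da)."""
    return round(mass_da / AVERAGE_RESIDUE_MASS_DA)


# An average 400-residue chain comes out at about 44,000 Da (44 kD).
print(estimate_chain_mass_da(400))
print(estimate_chain_length(44_000))
```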

Protein Production

Very often large amounts of protein are needed, e.g. for 3D structure determination or drug screening: these usually require multi-milligram quantities of pure protein.
One can:
1. obtain protein from natural sources
2. clone and overexpress a gene from one species in a rapidly
growing cell of another species: heterologous expression.
Very popular these days are expression systems in
- Escherichia coli
- Saccharomyces cerevisiae, Pichia pastoris
- insect cell lines in culture.
- human cell lines in culture

Protein purification procedures


Characteristic         Purification Procedure
Solubility             Salting out (often called ammonium sulphate fractionation)
Size                   Selective dialysis; gel filtration chromatography
Charge                 Ion exchange chromatography
Binding specificity    Affinity chromatography

Protein purification based on solubility differences

VVP 3/e Fig. 5-5

a: mixture of three proteins


b: salt added, and centrifuged: the red protein is precipitated and in the pellet.
c: more salt added, and again centrifuged: the green protein is precipitated and in the pellet.
In this (highly idealized!) manner the mixture can be separated into three pure proteins.
In practice, one is already very happy if a single protein comes out of the mixture in purer form.
The most commonly used salt is ammonium sulphate. Hence this method is often called ammonium sulphate fractionation.

Dialysis can be handy to remove smaller proteins from a mix


Dialysis is normally used for buffer exchange, but the newer membranes are produced with various size cut-off limits, which allow removal of proteins below a certain molecular weight.

Entropy driven: the red one can leave the dialysis pouch, the blue one cannot.

VVP 3/e Fig 2-14

Gel filtration chromatography


Gel beads used have cavities which are permeable to smaller molecules and
impermeable to larger molecules

Larger molecules come off the column first; smaller molecules are retarded and come off later.

Isoelectric Point, aka the pI


The isoelectric point of a molecule is the pH at which the net charge of the molecule is zero.

If the pH is above the pI, the overall charge of the protein is negative.

If the pH is below the pI, the overall charge of the protein is positive.

The pI of a protein molecule obviously depends primarily on its amino acid composition.
However, since the pKs of individual functional groups in a folded protein depend also on
the environment of the group, the pI of a protein depends also on its conformation.
Hence, the precise calculation of the pI of a protein is quite a challenge.
YET: proteins with different overall charge run at different speeds in an electric field, which allows for characterization and purification methods based on charge.
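
To make the pH-versus-pI relationship concrete, here is a minimal sketch that estimates net charge from amino acid composition alone using the Henderson-Hasselbalch equation, and finds the pI by bisection. The pKa values are approximate, generic values chosen for illustration (an assumption, not from the lecture), and the model ignores exactly the environment and conformation effects mentioned above, so it gives a rough estimate at best.

```python
# Approximate, generic side-chain and terminal pKa values (an assumption for
# illustration; real pKs in a folded protein depend on the local environment).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a one-letter-code sequence at a given pH."""
    # Fraction protonated (charged +1) for basic groups.
    charge = 1 / (1 + 10 ** (ph - N_TERM_PKA))
    # Fraction deprotonated (charged -1) for acidic groups.
    charge -= 1 / (1 + 10 ** (C_TERM_PKA - ph))
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1 / (1 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1 / (1 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def estimate_pi(sequence: str, tolerance: float = 0.001) -> float:
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid  # negative or zero: the pI lies at lower pH
    return (low + high) / 2


peptide = "AKKDEH"  # hypothetical peptide
pi = estimate_pi(peptide)
print(f"estimated pI: {pi:.2f}")
print(f"charge one pH unit below the pI: {net_charge(peptide, pi - 1):+.2f}")
print(f"charge one pH unit above the pI: {net_charge(peptide, pi + 1):+.2f}")
```

Bisection works here because the net charge decreases steadily as the pH rises, so there is exactly one crossing through zero.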

Principle of ion exchange chromatography.


Anion exchange:

Chemical groups R⁺ on a resin in a column are equilibrated with anions A⁻ at low ionic strength. R⁺A⁻ ion pairs remain on the column.
Polyanion Pⁿ⁻ (i.e. a protein with an overall charge of -n) is then added to the column.

R⁺A⁻ + Pⁿ⁻ <==> R⁺Pⁿ⁻ + A⁻

Pⁿ⁻ is attached to the column matrix and excess A⁻ flushes out.
The column is then washed with several volumes of Na⁺A⁻ at low concentration to elute weakly bound impurities.
Now the column is washed with Na⁺A⁻ at higher concentration, which elutes the bound Pⁿ⁻:

R⁺Pⁿ⁻ + A⁻ <==> R⁺A⁻ + Pⁿ⁻

The Pⁿ⁻ polyanion (i.e. the protein) is collected in a fraction collector.


Practice (schematic) of ion exchange chromatography


Matrix is either cellulose or agarose.
Popular groups attached to the matrix:
DEAE = diethylaminoethyl : anion exchange
CM = carboxymethyl : cation exchange


Affinity chromatography
A very powerful method but column preparation can be time consuming and/or pricey

Green and purple have no affinity for the orange compound.
Goal: to elute the yellow


Once impurities are washed off, elute the desired protein (orange) with ligand to remove it from the column.

Impurities coming out with the wash


Protein concentration determination and A280

(Note that the vertical scale is logarithmic)

Proteins usually contain several Trp, Tyr, or Phe residues.
The side chains of Trp and Tyr absorb UV light quite strongly at 280 nm.
This makes them ideal for concentration determination by absorption spectroscopy, so this method is often used.
But not all proteins contain Trp, Tyr, or Phe, and then other methods are needed.
And the UV method is not very sensitive: the detection limit is rarely lower than 50 to 100 µg per mL.
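
The A280 measurement is converted into a concentration with the Beer-Lambert law, A = ε · c · l. The sketch below is not from the slides: it assumes the commonly used per-residue molar extinction coefficients at 280 nm (about 5500 M⁻¹cm⁻¹ for Trp and 1490 M⁻¹cm⁻¹ for Tyr) and ignores the small contributions of disulfide bonds and Phe. The function names are illustrative.

```python
# Commonly used per-residue molar extinction coefficients at 280 nm, in
# M^-1 cm^-1 (an assumption for illustration; not given on the slide).
EPSILON_TRP = 5500
EPSILON_TYR = 1490


def molar_extinction_280(sequence: str) -> int:
    """Estimate the 280 nm extinction coefficient from the Trp and Tyr counts."""
    return sequence.count("W") * EPSILON_TRP + sequence.count("Y") * EPSILON_TYR


def concentration_molar(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law solved for concentration: c = A / (epsilon * l)."""
    if epsilon == 0:
        raise ValueError("no Trp or Tyr: A280 cannot be used for this protein")
    return a280 / (epsilon * path_cm)


sequence = "MKWVTFYSLLWAY"  # hypothetical sequence
epsilon = molar_extinction_280(sequence)
print(epsilon, "M^-1 cm^-1")
print(f"{concentration_molar(0.50, epsilon):.2e} M")
```

The `ValueError` branch mirrors the caveat above: a protein without Trp or Tyr gives no usable signal at 280 nm.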

Protein concentration determination

Coomassie brilliant blue binds to proteins.
In acidic solutions, the absorbance maximum of the dye shifts from 465 to 595 nm upon binding to proteins.
Hence the 595 nm absorbance provides a way to measure the total protein concentration.
In the so-called Bradford assay, which uses this absorption shift of Coomassie, protein concentrations as low as 1 µg per mL can be detected.

Concentration determination of a specific protein

It is often important to find a rapid & reliable assay for the specific protein you wish to purify.
A popular method is an assay based on antigenic specificity, such as ELISA (Enzyme-Linked Immunosorbent Assay).
(But you need to generate specific antibodies.)
The principle is shown on the right.


Protein characterization methods


Characteristic         Characterization method
Size                   Gel filtration chromatography (1); SDS-PAGE
Charge                 Ion exchange chromatography (1)
Binding specificity    Enzyme-linked Immunosorbent Assay (ELISA) (1); enzyme specificity (2)
Amino acid sequence    Mass spectrometry

(1) Already discussed on previous slides in this lecture
(2) Will be discussed in this lecture and later lectures

SDS-PAGE
For analyzing the size and purity of your protein sample

SDS = Sodium dodecyl sulfate = [CH3- (CH2)10-CH2-O-SO3-]Na+

SDS denatures proteins & binds to denatured protein at ~one SDS per two amino acids, largely independent of amino acid sequence.
SDS-treated proteins of similar length have similar, rod-like shapes and a charge that is proportional to the length. The larger the protein is, the slower it runs in electrophoresis during SDS-PAGE. (PAGE = Poly-Acrylamide Gel Electrophoresis)
Vv f06.20,25 3rd ed.


By using controls, estimates of the molecular mass (within 10 - 20%) can be obtained.
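
One common way to turn the controls into a mass estimate is to fit a straight line of log10(mass) against relative migration distance for the marker proteins and read the unknown off that line. The sketch below assumes this log-linear calibration, which the slide does not spell out. The marker values and function names are hypothetical.

```python
from math import log10


def fit_calibration(standards):
    """Least-squares line log10(mass) = slope * mobility + intercept.

    standards: list of (relative_mobility, mass_kd) pairs for marker proteins.
    """
    xs = [mobility for mobility, _ in standards]
    ys = [log10(mass) for _, mass in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mass_kd(relative_mobility, slope, intercept):
    """Read the mass of an unknown band off the calibration line."""
    return 10 ** (slope * relative_mobility + intercept)


# Hypothetical marker lane: (relative mobility, mass in kD).
markers = [(0.15, 97.0), (0.35, 66.0), (0.55, 45.0), (0.75, 31.0), (0.90, 21.5)]
slope, intercept = fit_calibration(markers)
print(f"unknown band at 0.45: about {estimate_mass_kd(0.45, slope, intercept):.0f} kD")
```

The fitted slope is negative, matching the statement above that larger proteins run more slowly.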

Amino Acid Sequence Determination Methods

1. Chemical (explained in detail in the book; you don't need to know those details)

2. By Mass Spectrometry (aka Mass Spec)

In general: based on a divide-and-conquer approach


Some proteases and their specificity

Remember enzyme for exam

You have to know the protease names and specificities, not the source.
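
The specificity table itself did not survive in this text, so the sketch below uses one well-known rule as an example: trypsin cleaves on the C-terminal side of Lys (K) and Arg (R), except when the next residue is Pro (P). The function name and test sequence are hypothetical. Other proteases would need their own rules.

```python
def trypsin_digest(sequence: str) -> list[str]:
    """Split a one-letter-code sequence after K or R, unless followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


# The R at position 6 is followed by P, so no cut is made there.
print(trypsin_digest("AGKCDRPEFKLM"))
```

Digesting the same protein with proteases of different specificity gives overlapping peptides, which is what makes the divide-and-conquer approach to sequencing work.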

Mass spectrometry

- Mass determination of purified proteins
  - Electrospray, MALDI and Fast Atom Bombardment techniques.
  - Requires only picomoles (10^-12 mole) of material.
  - Time required is very short.
  - Mass is accurate to 1 Dalton for proteins up to 300 kD.

- Peptide sequencing using tandem mass spectrometry
  - Proteases are employed to fragment the protein into peptides.
  - Sequences are determined by matching the masses observed to expected peptide masses for 1, 2, 3, ... residues.
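
The mass-matching idea in the last bullet can be sketched as follows. This is a strongly idealized illustration, not a real sequencing algorithm: it assumes a complete ladder of fragment masses for 1, 2, 3, ... residues, ignores terminal groups and charge, and uses standard monoisotopic residue masses that are not given on the slide. Leu and Ile have identical masses, so only L is listed.

```python
# Standard monoisotopic residue masses in Da (not from the slide).
# Ile is omitted because it has the same mass as Leu.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def ladder_masses(peptide: str) -> list[float]:
    """Idealized fragment ladder: summed residue masses for 1, 2, 3, ... residues."""
    total = 0.0
    ladder = []
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        ladder.append(total)
    return ladder


def read_sequence(ladder: list[float], tolerance: float = 0.01) -> str:
    """Assign a residue to each mass step; '?' marks steps with no unique match."""
    sequence = []
    previous = 0.0
    for mass in ladder:
        step = mass - previous
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - step) <= tolerance]
        sequence.append(matches[0] if len(matches) == 1 else "?")
        previous = mass
    return "".join(sequence)


observed = ladder_masses("GASPVK")  # stands in for measured fragment masses
print(read_sequence(observed))
```

Note that Gln and Lys differ by only about 0.04 Da, so telling them apart depends on the mass accuracy of the instrument.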


Electrospray Ionization Mass Spectrometry (ESI)

Nifty droplet-making device

Dry N2 or some other gas promotes the evaporation of solvent from charged droplets
containing the protein of interest, leaving gas-phase ions, whose charge is due to the
protonation of Arg and Lys residues, thereby yielding so-called (M + nH)^n+ ions.
The mass spectrometer then determines the mass-to-charge (m/z) ratio of these ions.
The resulting mass spectrum consists of a series of peaks corresponding to ions that
differ by a single ionic charge and the mass of one proton.

Electrospray Ionization Mass Spectrometry (ESI)

The ESI-MS spectrum of horse heart apomyoglobin (myoglobin that lacks its heme group).
The measured m/z ratios and the inferred charges for most of the peaks are indicated.
The data provided by this spectrum permit the mass of the original molecule to be calculated.
(See Sample Calculation on VVP 3rd Ed p. 111 or VVP 4th Ed p. 113 if you are really interested.
It is great fun. But you don't need to know it for the exam.)
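
For the curious, the calculation rests on two adjacent peaks of the (M + nH)^n+ series, which differ by one charge and one proton. The sketch below is not the textbook's worked example: it uses a hypothetical molecular mass to generate two synthetic peaks and then recovers the charge and mass from them. The function names are illustrative.

```python
PROTON_MASS = 1.00728  # Da


def mz(mass: float, charge: int) -> float:
    """m/z of an (M + nH)^n+ ion."""
    return (mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(p_low: float, p_high: float):
    """Recover (mass, charge) from two adjacent peaks.

    p_low carries charge z and p_high carries charge z - 1, so
    p_low - mH = M / z and p_high - mH = M / (z - 1), which gives
    z = (p_high - mH) / (p_high - p_low).
    """
    z = round((p_high - PROTON_MASS) / (p_high - p_low))
    mass = z * (p_low - PROTON_MASS)
    return mass, z


true_mass = 17000.0  # hypothetical protein mass in Da
peak_12 = mz(true_mass, 12)
peak_11 = mz(true_mass, 11)
mass, charge = mass_from_adjacent_peaks(peak_12, peak_11)
print(f"peaks at m/z {peak_12:.2f} and {peak_11:.2f}")
print(f"inferred charge {charge}, mass {mass:.1f} Da")
```

In a real spectrum each adjacent pair of peaks gives its own estimate of M, and averaging over all pairs improves the accuracy.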

The peaks all have shoulders because the polypeptide's component elements contain small admixtures of heavier isotopes (e.g., naturally abundant carbon consists of 98.9% 12C and 1.1% 13C, and naturally abundant sulfur consists of 0.8% 33S, 4.2% 34S, and 95.0% 32S).

The use of a tandem mass spectrometer (MS/MS) in amino acid sequencing of a protein.

Electrospray ionization (ESI), the ion source, generates gas-phase peptide ions, labeled P1, P2,
etc., from a digest of the protein to be sequenced. These peptides are separated by the first
mass spectrometer (MS-1) according to their m/z values, and one of them (here P3) is directed
into the collision cell, where it collides with helium atoms.
This treatment breaks the peptide into fragments (F1, F2, etc), which are directed into the second
mass spectrometer (MS-2) for determination of their m/z values, from which the amino acid
sequence of the fragment can be determined.
(This method also makes it possible to discover and characterize post-translational modifications.)
VVP 3/e Fig 5-17, VVP 4/e Fig 5-18

Protein Sequence and Protein Evolution


Sequence information is extremely useful when placed in the broad context of known sequences. For instance for:
- checking if a protein with a new amino acid sequence might have a fold similar to that of a protein with known structure. E.g. if 25% or more sequence identity is observed, and not too many gaps occur in the alignment, the proteins generally adopt the same overall fold.
- creating a family sequence alignment. Invariant amino acids in the family are likely to be important for function or for a fold characteristic.
- constructing phylogenetic trees. These can give profound insight into evolutionary relationships between species.
- estimating rates of evolution of proteins. Since not all proteins are subject to the same evolutionary pressure, due to their different functions, their rates of change in the course of time are not the same.
- discovering the distribution of domains having the same fold and yet quite different sequence, often with a characteristic sequence motif, in a wide variety of otherwise often unrelated multi-domain proteins.

Sequence alignment
Alignment of human myoglobin and the human hemoglobin -chain.

The chains are 27% identical which is sufficient similarity to conclude they
are homologous; that is, derived from a common ancestor.
They occur in the same organism, so most likely they are the result of:
Gene duplication & divergence of sequence and function.

Note that it is necessary to insert gaps (so-called indels), denoted by _, in order to maximize the identities.
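
A percent identity such as the 27% quoted above is computed from a finished alignment. The sketch below assumes one particular convention, counting identical residues over the columns where neither chain has a gap. Other denominators are also in use and give slightly different numbers. The function name and the two short aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "_") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)


# Two hypothetical aligned fragments; '_' marks an indel.
print(f"{percent_identity('ACDEFKG', 'AC_EFHG'):.1f}% identical")
```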
VV ( NOT VVP) 3/e Fig 7.27.


Family sequence alignment of cytochrome c (ctd.)


Phylogenetic Tree of Cytochrome c

Each branch point represents an organism ancestral to the species connected above it.
The number beside each branch indicates the number of inferred differences per 100
residues between the cytochromes c of the flanking branch points or species.

Rates of evolution of four proteins

Different protein families have remarkably different rates of change in amino acid sequence during evolution.
(Horizontal axis according to the fossil record)

Useful Problems at end of Chapter 5, 3rd Ed VVP:
1, 7, 12, 14, 17, 18
Useful Problems at end of Chapter 5, 4th Ed VVP:
1, 3, 8, 10, 19, 23, 25
Useful Problems at end of Chapter 3, 7th Ed VVP:
13, 17 (was: 10,11,13)
