Jian Gu, Baylor College of Medicine, Houston, Texas, USA
Ram Reddy, Baylor College of Medicine, Houston, Texas, USA
In the cell, there are three major types of RNA directly involved in protein synthesis. In addition, many other cellular RNAs also play important functions.
Introductory article
Article Contents
. Introduction . Overview of mRNA, rRNA and tRNA in Different Species . Small Nuclear RNA (snRNA) . Small Nucleolar RNA (snoRNA) . Small Cytoplasmic RNAs (scRNAs)
Introduction
RNA is a molecule of many facets and subtleties, participating in almost all macromolecular processes. The central dogma states that genetic information flows from DNA to RNA and then to protein: DNA → RNA → protein. There are three major types of cellular RNA directly involved in protein synthesis: messenger RNA (mRNA), ribosomal RNA (rRNA) and transfer RNA (tRNA). In addition, many other RNAs play varied roles in the cell. In this article, we first briefly review the general aspects of the three major cellular RNAs involved in protein synthesis, then discuss some other important RNA species. Organellar RNAs, i.e. eukaryotic mitochondrial and plant chloroplast RNAs, many of which are transcribed from their independent genomes by distinct RNA polymerases, are covered elsewhere.
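The flow of information described above can be illustrated with a minimal sketch. It assumes the input is the coding (sense) DNA strand, so that transcription reduces to replacing T with U, and it uses only a handful of codons from the standard genetic code; the function names and the example sequence are hypothetical and serve illustration only.

```python
# Toy illustration of DNA -> RNA -> protein.
# Assumption: the DNA string is the coding (sense) strand, 5' to 3'.
# Only a small subset of the standard genetic code is included.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the initiation codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
    "UAG": "*",  # stop
    "UGA": "*",  # stop
}


def transcribe(coding_strand_dna):
    """Return the mRNA sequence corresponding to a coding-strand DNA sequence."""
    return coding_strand_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA.

    Codons missing from the toy table are reported as 'X'.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


example_mrna = transcribe("ATGTTTGGCAAATAA")
example_protein = translate(example_mrna)
```

In a cell, the tRNAs and the ribosome perform the decoding step that the lookup table stands in for here.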
. MRP RNA . RNase P RNA . The 7SL RNA Component of the Signal Recognition Particle . 4.5S RNA . Alu transcripts . Telomerase RNAs . RNA Primers for Okazaki Fragments . Non-protein-coding mRNAs . Maternal mRNAs in Development
so newly transcribed mRNAs are used directly as templates for translation. In eukaryotes, however, a much more complex process occurs to produce mature mRNA templates for translation.
5′ Capping
Most eukaryotic mRNAs contain a 7-methylguanosine residue (called the cap) attached to the terminal residue of the initial transcript through a 5′-ppp-5′ linkage. This cap structure is required for translation initiation and contributes to mRNA stability and export. Prokaryotic mRNAs do not contain a 5′ cap structure, but the initiation codon is preceded by a short stretch of purine-rich sequence (the Shine–Dalgarno sequence), which facilitates translation initiation.
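As a rough illustration of the prokaryotic arrangement described above, the following sketch looks for AUG codons that are preceded by a purine-rich stretch. The window length, the search distance and the purine threshold are arbitrary assumptions chosen for the example, not measured properties of Shine–Dalgarno sequences, and the function names are hypothetical.

```python
# Sketch: flag AUG codons preceded by a short purine-rich stretch.
# Assumptions (illustrative only): a 6-nucleotide window, searched within
# the 16 nucleotides upstream of the AUG, counts as "purine-rich" when at
# least 80% of its bases are A or G.
PURINES = frozenset("AG")


def purine_fraction(seq):
    """Fraction of bases in seq that are purines (A or G)."""
    if not seq:
        return 0.0
    return sum(base in PURINES for base in seq) / len(seq)


def aug_with_purine_rich_leader(mrna, window=6, search_span=16, min_fraction=0.8):
    """Return 0-based positions of AUG codons with a purine-rich stretch upstream."""
    hits = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] != "AUG":
            continue
        upstream = mrna[max(0, i - search_span):i]
        # Slide the window across the upstream region.
        if any(
            purine_fraction(upstream[j:j + window]) >= min_fraction
            for j in range(len(upstream) - window + 1)
        ):
            hits.append(i)
    return hits
```

A sequence such as UUUCUAGGAGGUUUACCAUGGCUUAA is flagged because the stretch AGGAGG lies a few nucleotides upstream of the AUG, whereas the same AUG behind a pyrimidine-rich leader is not.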
3′ Polyadenylation
The 3′ ends of both prokaryotic and eukaryotic mRNAs are polyadenylated, but there are fundamental differences between them. Most eukaryotic mRNAs contain 50–200 adenylic acid residues at their 3′ ends, whereas the poly(A) tracts of prokaryotic mRNAs are generally shorter, ranging from 15 to 60 adenylic acid residues, and are associated with only 2–60% of the molecules of a given mRNA species. The eukaryotic polyadenylation machinery recognizes a specific consensus sequence near the 3′ end, whereas the sites of polyadenylation of prokaryotic mRNAs are diverse and the reaction does not require a consensus sequence. The poly(A) tail functions in mRNA turnover and also in mRNA translation.
Splicing of intervening sequences (introns)
Genes in eukaryotes are often interrupted by intervening sequences called introns, which do not code for proteins. The DNA sequences are transcribed with no discrimination between introns and the coding exon regions, so the primary transcript (pre-mRNA) is littered with segments of genetic nonsense. Consequently, most protein-coding transcripts must be processed to remove these introns before protein expression can occur. The process by which introns are removed and the flanking exons are stitched back together is called RNA splicing. Splicing occurs within a large ribonucleoprotein (RNP) complex called the spliceosome. A unique collection of cellular RNAs, called small nuclear RNAs (snRNAs), is critical for pre-mRNA splicing.
Other posttranscriptional events
In addition to the common processing events described above, some other events occur in a few individual mRNAs, including nucleotide modifications and RNA editing.
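The removal of introns and rejoining of exons described in the splicing paragraph above can be sketched as a simple string operation. The sketch assumes that intron coordinates are already known and checks only that each intron begins with GU and ends with AG, the dinucleotides that mark the splice sites in the spliceosome figure of this article; the function name and the example sequence are hypothetical, and the branch site and the snRNPs are not modelled.

```python
# Sketch of splicing as string manipulation.
# Assumptions: introns are given as (start, end) pairs, 0-based with the
# end exclusive, and every intron must start with GU and end with AG.
def splice(pre_mrna, introns):
    """Remove the listed introns and join the flanking exons."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor:
            raise ValueError("introns overlap")
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[cursor:start])  # exon upstream of this intron
        cursor = end
    exons.append(pre_mrna[cursor:])  # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1, one intron, exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACAG", "UUUUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
```

In the cell the intron boundaries are not supplied in advance; recognizing them is the task of the snRNPs.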
Figure 1 A comparison of the prokaryotic and eukaryotic ribosomal RNAs. The large subunit (LSU) is shown in blue and the small subunit (SSU) is shown in yellow. (a) In prokaryotes there are three types of rRNA: 5S and 23S rRNAs in the LSU, 16S rRNA in the SSU. (b) In eukaryotes there are four types of rRNA: 5S, 5.8S and 28S rRNAs in the LSU, 18S rRNA in the SSU. Panel labels: (a) 70S ribosome, with a 50S subunit (2 RNAs, 34 proteins) and a 30S subunit (21 proteins); (b) 80S ribosome, with a 60S subunit (3 RNAs, ~49 proteins) and a 40S subunit (~33 proteins).
regions, which provide a structural scaffold for protein binding. There are many interactions in the ribosome among rRNAs, mRNA and tRNA. Recent advances in ribosome research have provided convincing evidence that rRNAs, rather than ribosomal proteins, play a central role in catalysing the formation of peptide bonds.
In all organisms, rRNAs are transcribed as large precursors containing structural gene products flanked by extra sequences. In prokaryotes such as Escherichia coli, the three types of rRNA are transcribed as one long RNA molecule, which is then processed by nucleolytic cleavage to release full-length, mature rRNAs. During the maturation process, the base and ribose modifications found in mature 16S and 23S rRNA are also generated. The specificity of the initial cleavage sites depends on the ability of the pre-rRNA to form stem structures involving sequences flanking both the 16S and 23S rRNAs.
Eukaryotic rRNAs undergo a similar but more complex processing pathway. The 18S, 5.8S and 28S (25S in yeast) rRNAs are initially transcribed as a single large precursor molecule by RNA polymerase I and subsequently processed through a series of cleavage reactions into the mature species. In addition, the primary rRNA transcript undergoes methylation and pseudouridylation in the 18S, 5.8S and 28S rRNA regions. What is most distinctive in eukaryotes is the participation of small nucleolar RNAs (snoRNAs) as guide molecules in the accurate processing and modification of rRNAs. Furthermore, unlike in prokaryotes, eukaryotic 5S rRNA is transcribed separately by RNA polymerase III, with little posttranscriptional processing and modification.
Figure 2 A simplified view of spliceosome assembly and rearrangement. U1 snRNP binds to the 5′ splice site, U2 subsequently binds to the branch site and then the U4/U5/U6 tri-snRNP joins in. After a dynamic rearrangement, U1 and U4 are destabilized, and the spliceosome is activated for the two cleavage–ligation steps.
nucleolar RNAs (snoRNAs). More than 100 distinct snoRNA sequences have been identified in vertebrates and yeast. In yeast, most of the snoRNAs are transcribed from independent genes using their own promoters. However, the majority of mammalian snoRNAs are processed from introns of pre-mRNAs. These snoRNAs are responsible not only for orchestrating the cleavage events that cut the long pre-rRNA into 18S, 5.8S and 28S rRNAs, but also for determining the specific sites for modification. Vertebrate rRNAs contain approximately 105 methylated sugars, 95 pseudouridines, and 10 methylated bases,
whereas yeast rRNAs have about half as many modifications. In prokaryotic rRNAs there are even fewer modifications, and it is believed that site-specific enzymes are responsible for them. In contrast, rather than developing a specific enzyme for each modified nucleotide, eukaryotes evolved a unique mechanism for site-specific modification that probably uses a very limited pool of modifying enzymes. snoRNAs exhibit extensive or short complementarity to the rRNA sequence flanking the nucleotide to be modified and direct either sugar methylation or pseudouridylation.
All known snoRNAs, except for the mitochondrial RNA processing (MRP) RNA, can be classified into two large families. One family is defined by conserved boxes C and D and the other by a consensus ACA triplet positioned three nucleotides before the 3′ end of the RNA. U3 snoRNA was the first snoRNA to be identified and is the most abundant. Phylogenetic comparison of U3 snoRNAs from various species revealed conserved sequence elements called boxes C (UGAUGA) and D (CUGA), which were later found to be present in many snoRNAs. All of the box C/D snoRNAs bind to an evolutionarily conserved nucleolar protein, fibrillarin, and function in various steps of pre-rRNA maturation. The U3, U8, U14 and U22 snoRNAs have been shown to participate in the processing of rRNAs at various cleavage steps. The vast majority of box C/D snoRNAs have extensive sequence complementarity (ranging from 10 to 21 nucleotides) to highly conserved regions of rRNA and serve as guide molecules for site-specific ribose methylation. A model for the selection of 2′-O-methylated nucleotides in rRNA sequences by interaction with box C/D snoRNAs is shown in Figure 3a. According to this model, the RNA double helix formed by the snoRNA and the rRNA is followed by the D box of the snoRNA. The nucleotide in the rRNA sequence that lies in the snoRNA–rRNA helix opposite the fifth nucleotide upstream from the D box of the snoRNA is selected for ribose methylation.
The box ACA snoRNAs share a phylogenetically conserved secondary structure. They fold into two hairpin structures connected by a single-stranded hinge region and followed by a short 3′ tail. The hinge region carries an extra conserved motif, called box H (consensus, AnAnnA). The box ACA snoRNAs lack extensive sequence complementarity to rRNA, but they function as guide RNAs in the site-specific pseudouridylation of pre-rRNA via an elegant mechanism (Figure 3b). In the 5′ or 3′ hairpin element of the snoRNA, an internal loop structure, called the pseudouridylation pocket, selects the target rRNA sequence by forming two short (3–10 bp) helix structures that are separated by two unpaired ribosomal nucleotides. The first unpaired nucleotide in the selected rRNA sequence (in a 5′ to 3′ orientation) is a uridine residue that is converted into pseudouridine.
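The box C/D selection rule stated above (the rRNA nucleotide paired with the fifth snoRNA nucleotide upstream from the D box is methylated) can be written out as a short sketch. It rests on several simplifying assumptions: the guide region lies immediately upstream of the last CUGA in the snoRNA, it has a fixed length supplied by the caller, and only perfect Watson–Crick pairing is considered. The function names and the sequences used with them are hypothetical.

```python
# Sketch of the box C/D rule for choosing the 2'-O-methylation site.
# Assumptions (illustrative only): the guide sits directly upstream of the
# last CUGA (taken as box D), has a fixed length, and pairs perfectly with
# the rRNA through Watson-Crick base pairs.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
BOX_D = "CUGA"


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def methylation_site(snorna, rrna, guide_length=10):
    """Return the 0-based rRNA position selected for ribose methylation.

    Returns None when no D box or no complementary rRNA region is found.
    """
    d_start = snorna.rfind(BOX_D)
    if d_start < guide_length:
        return None  # no D box, or not enough sequence upstream of it
    guide = snorna[d_start - guide_length:d_start]
    target = reverse_complement(guide)  # rRNA stretch that pairs with the guide
    helix_start = rrna.find(target)
    if helix_start == -1:
        return None
    # The helix is antiparallel: the guide nucleotide next to box D pairs
    # with the first (5'-most) rRNA nucleotide of the helix, so the fifth
    # guide nucleotide upstream of box D pairs with the fifth rRNA
    # nucleotide of the helix.
    return helix_start + 4
```

Because the two strands are antiparallel, counting five nucleotides upstream from the D box on the snoRNA corresponds to counting five nucleotides in from the 5′ end of the paired rRNA stretch.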
Figure 3 Schematic representation of box C/D and box ACA snoRNAs in directing 2′-O-methylation and pseudouridine (Ψ) formation. (a) Box C/D snoRNA directs 2′-O-methylation. (b) Box ACA snoRNA directs pseudouridine formation. Each box ACA snoRNA may contain one or two of the internal loop structures called pseudouridylation pockets. Modified from Tollervey D and Kiss T (1997) Current Opinion in Cell Biology 9: 337–342.
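The sequence motifs that define the two snoRNA families in the text (boxes C and D for one family; box H and the ACA triplet three nucleotides before the 3′ end for the other) lend themselves to a simple motif check. The sketch below is illustrative only: it assumes that box C lies upstream of box D, ignores secondary structure and protein binding, and uses a hypothetical function name. Because these motifs are short, chance matches are to be expected in real sequences.

```python
import re

# Sketch: assign a snoRNA sequence to a family from its box motifs alone.
# Assumptions (illustrative only): box C precedes box D in the 5' to 3'
# direction, and box H may occur anywhere upstream of the ACA triplet.
BOX_C = "UGAUGA"
BOX_D = "CUGA"
BOX_H = re.compile(r"A[ACGU]A[ACGU][ACGU]A")  # consensus AnAnnA


def snorna_families(seq):
    """Return the list of families whose defining motifs occur in seq."""
    families = []

    # Box C/D family: box C followed somewhere downstream by box D.
    c_pos = seq.find(BOX_C)
    if c_pos != -1 and seq.find(BOX_D, c_pos + len(BOX_C)) != -1:
        families.append("C/D")

    # Box ACA family: ACA followed by exactly three nucleotides at the
    # 3' end, with a box H match in the sequence upstream of the triplet.
    has_aca = len(seq) >= 6 and seq[-6:-3] == "ACA"
    if has_aca and BOX_H.search(seq[:-6]):
        families.append("H/ACA")

    return families
```

An empty list means that neither set of motifs was found, which in this sketch is the only way a sequence can fall outside the two families.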
function of Ro RNPs is unknown, their evolutionary conservation and their involvement in human pathologic conditions suggest an important biological role.
MRP RNA
MRP RNP was originally characterized as a site-specific ribonuclease that cleaves an RNA sequence priming leading-strand DNA synthesis in mitochondria. The RNA component of MRP is encoded by a nuclear gene and must be imported into the mitochondria in order to process mitochondrial RNA in vivo. However, cellular fractionation and immunolocalization have shown that the vast majority of RNase MRP is located in the nucleolus. Therefore, MRP RNA represents a unique member of the snoRNAs. MRP RNA has been found in many eukaryotes, including human, yeast and plant cells. In yeast (Saccharomyces cerevisiae), RNase MRP cleaves pre-rRNA in a region upstream of the 5.8S rRNA, and this cleavage can be reproduced in vitro by the highly purified enzyme. At least three protein components have been found in yeast MRP particles. Both the RNA and protein components of MRP are essential for viability in yeast.
RNase P RNA
RNase P is an RNP responsible for generating the mature 5′ end of tRNAs from precursor tRNAs by a single endonucleolytic cleavage. Bacterial RNase P is composed of a catalytic RNA subunit of 350–450 nucleotides and a small protein subunit of about 120 amino acids. Under in vitro reaction conditions of high ionic strength, the RNA itself can cleave precursors of tRNA in the absence of the protein subunit. This was the first true RNA enzyme characterized. However, the protein subunit is essential for activity in vivo. Bacterial RNase P can cleave not only all the different precursors of tRNAs but also other non-tRNA substrates, including precursor 4.5S RNA and pre-rRNA. Eukaryotic RNase P enzymes are more complex than the bacterial ones in that they have a significantly higher protein content, and attempts to show that the RNA alone has catalytic activity have not been successful.
Several lines of evidence suggest that RNase P and MRP are related to each other. Both are endoribonucleases that cleave RNA to generate 5′-phosphate and 3′-hydroxyl termini in a divalent cation-dependent manner. Both have activity in the nucleus and mitochondria. Both RNase P RNA and MRP RNA are synthesized by RNA polymerase III and are not capped. Although their primary sequences are not highly homologous, they do contain distinct conserved regions. In addition, the two RNAs can be folded into strikingly similar secondary structures. RNase P and MRP also share some common protein components. Moreover, both of the in vitro substrates recognized by RNase MRP, the mitochondrial D loop region and the yeast pre-rRNA, can also be cleaved in vitro by RNase P. The structural and functional similarities between RNase MRP and RNase P have led to the suggestion that RNase P RNA and MRP RNA originated from a common ancestor.
4.5S RNA
Transport of proteins across the bacterial cytoplasmic membrane is evolutionarily related to transport of proteins across the ER membrane in eukaryotic cells. In E. coli, the translocation of secretory proteins into the periplasm mostly depends on the so-called general secretory pathway. However, an alternative SRP-dependent targeting pathway has also been identified. E. coli SRP is relatively simple and contains a 4.5S RNA and a single protein that is homologous to mammalian SRP54. 4.5S RNA (114 nucleotides) is a stable, abundant RNA that is essential for viability. It forms an extended stem–loop structure, which is homologous to the most conserved domain of 7SL RNA.
Alu transcripts
The Alu family is an extremely abundant repetitive sequence family representing around 6–13% of human genomic DNA. It was named after the AluI restriction site within the consensus sequence. Consensus Alu sequences are approximately 300 bp in length and consist of two similar but distinct monomers linked by an oligo-d(A) tract. At the designated 3′ end of the Alu sequence there is an oligo-d(A) tract of variable length. Despite the high number of Alu repeats in the human genome, Alu RNA transcripts are very scarce in normal cultured cells. There are two main forms of pol III-transcribed cytoplasmic Alu transcripts. Full-length Alu RNA (Alu) contains the typical dimeric Alu sequence and the 3′ poly(A). A fraction of Alu is processed into a more stable, small cytoplasmic Alu (scAlu) RNA, which corresponds to the left monomer of the dimeric structure. The functional role of Alu transcripts remains a mystery. However, cell stress, viral infection and translational inhibition increase the abundance of human Alu RNAs, suggesting a physiological role for Alu RNAs.
Telomerase RNAs
Telomeres are the specialized structures comprising the termini of eukaryotic chromosomes. In nearly all eukaryotes examined, the telomeric DNA consists of tracts of tandemly repeated sequences extending to the chromosomal ends. These telomeric repeats are generally short, G-rich tandem repeats and are required for chromosomal stability and complete replication. Cellular DNA polymerases can only synthesize in the 5′ to 3′ direction, and require an RNA primer to initiate synthesis. Without some form of terminal replication, chromosomes would progressively recede from their ends, since the initiating RNA primers have to be removed. It is telomerase, a unique cellular reverse transcriptase, that plays a critical role in telomere maintenance. Telomerase is an RNP particle that contains an RNA template as an integral part of the enzyme. The RNA component contains a sequence complementary to the telomeric repeats and serves as the template for their synthesis. The gene encoding telomerase RNA has been sequenced from more than 20 ciliate species, yeast, mouse and human. The primary sequences of telomerase RNAs have diverged considerably; however, the secondary structure of telomerase RNA is highly conserved. Telomerase RNA is transcribed by RNA polymerase III in ciliates, whereas in yeast and mammals it is transcribed by RNA polymerase II. In yeast, a portion of telomerase RNA is polyadenylated, although the functional significance of this is unknown.
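The templating role of telomerase RNA described above can be made concrete with a minimal sketch. The six-nucleotide template CCCUAA is a hypothetical stand-in, chosen only because its reverse transcript is the vertebrate telomeric repeat TTAGGG; actual telomerase RNA templates differ between species and are not reproduced here, and the alignment and translocation steps of the enzyme are ignored. The function names are hypothetical.

```python
# Sketch: telomerase copies a short RNA template onto the chromosome end.
# Assumption (illustrative only): each round of synthesis adds exactly one
# full copy of the reverse-transcribed template to the 3' end.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(template_rna):
    """Return the DNA (5' to 3') synthesized from an RNA template (5' to 3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(template_rna))


def extend_telomere(chromosome_end, template_rna, rounds):
    """Append `rounds` copies of the templated repeat to a DNA 3' end."""
    repeat = reverse_transcribe(template_rna)
    return chromosome_end + repeat * rounds


extended = extend_telomere("ACGT", "CCCUAA", 2)
```

The sketch makes the point of the text explicit: the repeat sequence added to the chromosome end is specified entirely by the RNA component of the enzyme.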
Non-protein-coding mRNAs
Ever since the discovery of the split gene in eukaryotes, introns have been considered junk DNA, while spliced exons have been regarded as the meaningful sequences that code for proteins. The first blow to this traditional view came several years ago, when most mammalian snoRNAs were found to be processed from introns of pre-mRNA. The most stunning challenge to the traditional definition of exon and intron is the elucidation of the structure of UHG (U22 host gene). UHG encodes eight box C/D snoRNAs (U22, U25–U31) within its nine introns. The spliced UHG mRNA is poorly conserved between human and mouse, lacks a long open reading frame, and is rapidly degraded in the cytoplasm. The introns, on the other hand, are highly homologous. It seems that only the introns of UHG encode function, whereas the exons are junk to be discarded. Several similar genes have been identified, including Gas5, which encodes several snoRNAs, and U19H (U19 host gene).
There are several examples of non-protein-coding mRNAs that have been implicated in important regulatory functions. For example, Xist RNA is essential for inactivation of most genes along the X-chromosome in female mammals; Xlsirt RNAs are a crucial part of a genetic pathway necessary for normal pattern formation in Xenopus; and Pgc RNA is required for germline development in Drosophila.
Further Reading
Baserga SJ and Steitz JA (1993) The diverse world of small ribonucleoproteins. In: Gesteland RF and Atkins JF (eds) The RNA World, pp. 359–382. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Grosjean H and Benne R (1998) Modification and Editing of RNA. Washington, DC: ASM Press.
Sharp PA (1994) Split genes and RNA splicing. Cell 77: 805–815.
Simons RW and Grunberg-Manago M (1998) RNA Structure and Function. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Soll D and RajBhandary UL (1995) tRNA: Structure, Biosynthesis, and Function. Washington, DC: ASM Press.
Staley JP and Guthrie C (1998) Mechanical devices of the spliceosome: motors, clocks, springs and things. Cell 92: 315–326.
Tollervey D and Kiss T (1997) Function and synthesis of small nucleolar RNAs. Current Opinion in Cell Biology 9: 337–342.
Walter P and Johnson AE (1994) Signal sequence recognition and protein targeting to the endoplasmic reticulum membrane. Annual Review of Cell Biology 10: 87–119.
Zimmermann RA and Dahlberg AE (1996) Ribosomal RNA Structure, Evolution, Processing, and Function in Protein Biosynthesis. Boca Raton, FL: CRC Press.