2. 319
1
51
i01
151
201
251
301
351
401
451
501
551
601
651
701
751
801
851
901
951
10Ol
1051
Ii01
1151
1201
1251
1301
1351
1401
1451
1501
1551
1601
1651
1701
1751
1801
1851
1901
1951
2001
2051
2101
2151
2201
2251
2301
2351
2401
2451
2501
2551
2601
2651
2701
2751
2801
2851
2901
2951
3001
3051
3101
3151
3201
GATCCGGACT CGAGATCAAA AGACCGGAGG
CCCTCGAAOG GGGGGTCGTA CAGGGGT~C
CCTGGGCACG GAACGGATCG GGGGTCCATG
TGGGTCCTCA CGGGGGCCTT CCCTCCAGGG
~GGTCCCCTG QGTCCATGGG GCCCTTGAAG
~ C C G GTCCTCACGG GGTGGGGCCC
CTCCATCGGG GGGCGGGGAC TGGGGTCCAC
ATCCGGGTCA TCGGACCATG TGGTCATATG
CGGTCCCTCG GGGGTGGGTC ATCGAATGGG
TTGGGGTCCG GGGTCCCCCT CGACCCTCAC
ATGAACGGGA CGGAGGTCCA ~ C C A G
TTGAATCTCT GGGGTCTCGG GCCGGACCCT
CCCAAGCCCT CACGGGATCA TCGTACGGGT
CATACGAGGG ACGGGGGCT~ CCGGGGCCCT
ACGGGGCCTC CCCCCATAGG TCCATGGACC
CCATCATACG GATGGGGGTT C ~ T C G
G C G A ~ G T C ~ T
CCGTGGGCCC ACGGCGTCCC
CCCTGGCCCG ~AGTCCACA
GTCATTCGAG GTGTCCCCGG
TCCTTCAAGA TGAACGCKT-GA
TCACACGGGG GGGCTCATGG
CCCCCGCCCT CCTTCGAC-C4/
GATGGGCCTT ~ C A
ATGGGGGACC ATCITCCCCA
AGGTCCCCCT CGAGTCCTCC
GGGATCCCCG TGGGGTCCCC
CACGGGATTA TAATC-C-C-Cs2~
A ~ T C ATGGCTCCAT
CACGGGGTCC TIW_AAGACGA
TTCATGGGTG GGGGTGTGGT
GGGGTCCGTC CGGTCCTCAC
GGGACAGGAT .C_CAGGGGGCC
CGGGTCATCA GGGGGATCCG
CATGGACGGG GACTCCCCCA
CTCACC-GATC ~ITCAGAGGG
GCCCCCACGG GTCATCGGAC
GTAGAGGTAG TGGGTCCATG
GGGAGGGAAC CCTTCCCGCT
GGCGCCGAAG CGA~CC
ACGGAGGCCA CTGTG~CAC
GGGCC4SCCCG CGTCTCCGTG
CTCATCACCT CTCATCACTT
TTATCGTCTC TTATCATCTC
CATCGCGTCC CCAGCCCCCC
CTCATGGATC ~I~ATGGATC
CCTGACTCCT ~ T C A C
TTCCTCCCTC AGCCCTCCCC
CTACCCCACG TCGC4~AGCCC
TTCACGTCCC CTCATACGGG CCACATGGAC
GGTCATCATA TGGGGCCCGG ~CCTCATATC
~ k ~ CCCCTTCATC GATCGGGTCC
ACACGGCATG GGAt.-J.-J.-~'CAG
GGACGATCGA ACGGGCCCCC GAGGGCACGA
GGGCCCGGGA CGACGCTCAT CATTCCAGAC
TTCTCCTTCA TTCAAGTTTA GGACCGGAGG
CCgCAGAGGG GGACCGAAGG GGCGGCCCGA
CGGCCACTGA CTGGGCGTGT CTGCGTCCCC
TCCCTCGCAT CTCTCGCATC TCCCGCGTCT
ATTATCGTCC CTTATCGTGT CTCGTGTCCC
TTATCGTCCC CCGTCATCIX2 TCATCGCTCT
CACTCCTCTC GTGGATCCCC CTGGCCCCCT
~'TCATGCCCC CTCATGCCCC CCCATGCCCC
CTCGTGGCCC CCCCTCTCAG GCCTCCCCCC
ACGCCTGCAC GCACGGGCCC CCATCCCCTC
TCTCCCACAT CCGTCTGACT CGTCCCCCCA
ACCCCCAGGT
GGGGAATCCC
CCCGTCCCCC
~ C C G
TCCI'IV,-GC4~A
GGGGCCTATG
GGGCCGGGGA
CGGACACCAA
GGCCGAt2-r~-J."
A ~ A C G
CTCCAGATCC TC~TTA
GGACAATAGA GGGTTCGATC
~CCCTT TCGACe-GTGT
~ T C G ~ T G G C
CGGGTCGAGA G G G A ~
AGCGA'hI~x-*-I"~TC'V~
CGGGTCGAGG CCTGGTGC4~C
CTTACACCTA A A A ~
CGGCTACTAT AGGGGAAGGA
AAGTG~
ATATTTACTA ATAAAATCCA
CATCATCTCG TCCCATGACT
TACGTGACTG TGCACGCAGT
AGGAGACTTG CATGGGGTCA
GC-CCACCTAA GGGCCTTATG
GAGGCTCCCG GGACGCTCCT
CA'~'t-~-L-~'*,~AAACTCGACTT
GAAAGTGCAC GAAAGATGGA
GGAAGGCCGG G C C ~
AC-G~4%AC-~_A_~Q~GAGGGGT
CACCTGGGGA CCCTCAATAA GTC-GACGAAA GGTGGAGGCC GAL-FFICCGC
TACTGTAGGG GCCAGGAGGC CGGGGCCCGG ~zGGCCCGAAG GGCGAGGCCG
GCCGA~CCCCCCTC AAATCGGACG ~ CGAGAGGCCC
AAATCGGACG GGGAGGAGGG CTCCAGGGC~CCCCCCCT CAAATCGAAC
GGGGGAGGGG CGAGAGC-CCC AAATCGGACG GGGAGGAGGG CTCCAGGGCC
........................................ C **********
GCCCCCCCTC AAATCGAACG ~ CGAGAGGCCC AAATCGGACG
GGGAGGAGGG CTCC~Q~CCCCCCCTC AAATCGAACG GGCGAGGCGA
GAGGCCCAAA TCGGACGGGA GGAGGC-CTCC AGGC~CCC CCCCTCAAAT
CGAACGGGGG GAGGGGCGAG ~.C.-GCCCCCAAATCGGACGGG ~ C T C
CAGGC~CC CCCCCTCAAA TCGAACGGGG G G ~ GAGGCCCCCA
********** ********** *****Vr*t** ******vr*** **********
AATCGGACGG C-C~GGAGGCC CC~CGGC CCCCTCCCCC CTCC-GTC
********** ******** ............... C .......... C .......
TTCCTCCCTC CCCCCCTCGC CCGGTCCTGA ACGAAGTGCC GTCGCAGGGA
GGGGAGTGAG CTGGGGAAGA AGGACCACAA CCCCCACCTA GAGACCTTCA
ATAAGTGGCC CAAAGATGGA GGCCGA'I-I-FI"CGGCTACTAT AGGGAAC-GCA
CGACGAGGCC GGACCGGGAC GAGAAGACCC ACCCGACGAC GAGTGTGC2_A~
~'GTACGACGA GAGGACGGAA GGGTCCGGCC CACAACGGAC GGATCGACCG
Vl
ATCGACAGAC CATCCGGTTG ATCCTGCCGG AGTACTACGC TACeCCAAGG
A~C, CCA TGCAAGCGGA CACGAGGTAT G A A ~ ACGGCTCGGT
ACAACC~TAC GAGTCTGACC GGGGGTGAAG GCTAGACGC~ TACCGCTGGC
AACCCAGCGC C/%AGACGAGT GCTCAAGAGC GGGGAAGGAA AGCACGCGAT
GGAGCGAATG CCCGGATGAG GTTCCGAGGT ATTACCTAGT CGGTAGAGTA
GTGGTCTACG GAGGGGATGA TGCCTGGCGG AGGATC/kGGG TTTGACTCCG
Fig. 1. The nucleotide sequence of Giardia muris rDNA. The
RNA-like DNA strand of the longest rDNA is given correspond-
ing to the map. (See insert.) The positions of the 5' and 3' ends
of the SSU-rRNA (numbers 1 and 2) and 5.8S rRNA (numbers 3
and 4) genes are indicated with numbered arrowheads. Here,
only the first 23 and last 37 bases of the LSU-rDNA are indicated
with arrowheads (5, 6). The sequence of a shorter spacer-specific
Sau3AI fragment, position 2139-2901, is aligned with and written
under the DNA sequence; the identical base sequences are indi-
cated by dots (.) under the bases, and deletions are indicated by
asterisks (*). The positions of a number of direct repeats are
indicated by underlining the beginning of each repeated sequence
with a short arrow (---~). The various restriction enzyme sites,
3251 GAGAACGGGC CTGAGAGACG GCCCGTACAT CC/%AGGACGG CAGCAGGCGC
3301 C,GAACTTGCC CAATGCGTGA AGGCGTC.AGG CAGCAACGGG GGATCCCATG
3351 AAATGGGAAG ACTGGGGGGT AGATGACCCC AGCACAAGTC GAGGGAAAGG
3401 TCTGGTGCCA GCAGCCGCGG TAATTCC.AC~ TCGGCAGGCG TCGTACGGCG
3451 CTGTTGCAGT TAAAACGTCC GGAGTCGAGA CGTCCAC, CCG GGAGGAAAGA
3501 GGAGCGCTTA AGGCGGGAGT GAGTACGAGA AAGCCCGGGA CGGACATGAA
3551 GGTGAATGGG TAAGGGCATG TGTATTGGTG ~CGGGT GAAATAGGAT
3601 GATCCGACCA AGACAGAeAA AGGCGTAGQC ACTTGCCAAG ACCATATCAG
3651 TCGAACCAGG ACGAKGCCCG GGGGCGAGAK GGCGATTAGA CACCACCGTA
3701 TTCCCGGGCG TAAACGATGC CACCGAGAGA CTGGCCAGGT CGTO_AGGATC
3751 GAAGGGAAAC CGATCAGGGT ACGGGCTCTG GGGGGAGTAT GGCCGCAAGG
3801 CTGGAACTTG I%AGGCATTGA CGGAGGGGTA ~d%CCAGAeG TGGAGTCTGC
3851 GGCTCAATTT ~ G C GAACACCTTA CCAGGCCCAG ACGTACGGAG
3901 GATCGACGGT TGAGAGGACe TTCGTGATCG TACGAGTGGT GGTGCATGGC
3951 CGTTCACAGC e e Q ~ AGCCGTCTGC TTGACTGCGA cAKeGAGCGA
4001 GACCCTAACC TGGATGGGAC CGCe~ATGGT G/%ATTGGAC,G AAGGTGGGGC
4051 GAT/%ACAGGT CTGTGATGCC CTTAGACGCC CTC,GGCTGCA CGCGTACTAC
4101 ACTGTGGGGA TC,AAACCACG TCGAGTTGTG /%AGCTTGATG AGATC/%ACCC
4151 CCACGTGGTT GGGATCGTGG ACTGGAKCGT CCTCGTGAAC CTC,GAATGTC
4201 TAGTAGC,CGT AGGTCATCAA TCTACC,CCGG ATACGTCCCT C,CCCCTTGTA
4251 CACACCGCCC GTCGCTCCTA C C G A ~ CTTCTGGCGA GCTCCTGGGA
4301 GGGATGAACC GAACAGGGAC GAACeGCGAG GCTTGGAGGA AGGAGAAGTC
"172
4351 GTAACAAGGT ATCCGTAGGT GAACCTGCGG ATC,GATCCAT CGAGAGGGA/~
V3
4401 GGTACGAGGA TGAAGC-CAAT GAGATGACGC GACCCGGTGG ATGCCTTGGC
4451 T C ~ C G ATGAAGGACG TGGCTGACGA CGATACTeGA TGTC,GTCCAA
V4
4501 GCACGTGACA TCGATCTTCG AATGGATCAT CGGGGTGGTG GGGTGATCTA
V5
4551 CGAACTGGGT TGTTCGAGGG GTTGAACGAT GGGGAATGAG ATGAGGATCC
4601 CCCCCCACTC CGATGAAGAT <Large subunit rDNA>
V6
7251 CGCCTGAGGT TTGGTTCGGG TTGGCATTCC CCCCGAGCCC CCGAGGGTCG
7301 AAGACTGAGG GGGTGAGGGA AC~GGCCGGAA CAACCCCCCC GAGGGTCGTG
7351 AGACGAAGGG G ~ AGGCCCCCCG CCCCGAGGAA GACCTGAAGG
7401 GGGGTGAGGG CATCCCCCGT CATACCAGC~ GGG GAA GGCTCCCTCC
7451 GGTCGTGGGA C C A ~ T GAGGGCCCCC GGTCGTAAGG GGGCATCCGA
7501 GGGGTGAGAC TTGAGGCGCC GAGGGTCATA C ~ T~%GGC42CCA
7551 ~ T CCCCCGCCCG GATCGACCTG AAGCGAGATC AGACCCCCGT
7601 GGGTCGAAGG ~ T A C G GGGGCCTACA GAAGGATACC AGATTTGGAA
7651 CGGTTACAGT GGA'I-I-I-I~
SI II II I
BX BXB BBB B88 S T B K HBBK PP' O8,, H • B,
I-,..,,., . . . . . i. , , i ..i , i , ,- | ,
1kElP| i
corresponding to the map, are indicated by underlining. The
closed arrows, position 128-973, indicate an open reading frame
for 281 amino acids. Insert: physical and genetic map of the G.
rnuris rDNA. The positions of the rRNA genes are indicated by
boxes over the restriction enzyme map; from left to right the
boxes represent SSU-rDNA, 5.8S rDNA, and LSU-rDNA. The
arrows indicate the direction and extent of DNA sequence anal-
ysis. An oligonucleotide, used to supplement the sequence anal-
ysis, is indicated at the 5' end of the SSU-rDNA as an upright bar
on an arrow and has the sequence 5'GATCCTGCCGGAGY.
Restriction enzyme sites: BamHI (B); HindlII (H); KpnI (K);
PvulI (P); Sau3AI (S); TaqI (T); XhoI (X); the short bars without
letters are the SmaI sites.
3. 320
tion enzyme recognition pattern than G. duodenalis
(van Keulen et al. 1991a). The rDNA of G. muris
also has a much lower G + C content (61.9%) than
that of G. duodenalis (Boothroyd et al. 1987). To
provide a basis for further analysis of the taxonomic
and evolutionary position of the genus Giardia, the
DNA sequence of the entire rDNA operon of Giar-
dia rnuris was determined from available clones.
This report will focus on the spacer and LSU-rRNA
gene and its RNA structure. A comparison between
the SSU-rRNA of G. muris, G. duodenalis, and G.
ardeae and its implication in phylogeny will be pre-
sented in a later publication.
Materials and Methods
Isolation of DNA. Giardia muris trophozoites were obtained
from cysts isolated from CFI mice by standard methods (Robert-
Thomson et al. 1976; Sauch 1984). Trophozoites were harvested
from G. muris cysts which were induced to excyst as previously
described (Campbell et al. 1990). Total genomic DNA was iso-
lated according to a standard procedure described previously
(van Keulen et al. 1985). Plasmid DNA was isolated by the al-
kaline minipreparation procedure of Birnboim (1983) modified by
the use of ammonium acetate for the precipitation of the genomic
DNA complex (Bird and Wu 1989).
Southern Blot Analysis. Total genomic DNA was digested (4
~g per digestion) with various restriction enzymes and fraction-
ated according to size on 1% agarose gels. The gels were stained
with ethidium bromide, photographed, and prepared for South-
ern transfer to nitrocellulose f'dters according to standard proce-
dures (Maniatis et al. 1982).
All cloned DNAs used as probes were digested with the ap-
propriate restriction enzymes and separated from vector DNA
by agarose gel electrophoresis. The DNA fragments were puri-
fied by the freeze-phenol procedure (Benson 1984). Nick-
translations and hybridizations were carried out as described
previously (Campbell et al. 1990).
DNA Sequence Analysis. The appropriate DNA restriction
enzyme fragments were size fractionated, purified, and cloned in
M13mp18 or rap19 vector DNA with JM 101 or JM 109 as host
strains (Messing 1983). For the single-stranded DNA isolations
and the sequencing reactions, the protocols supplied with the
Sequenase and TAQuence kits were used (United States Bio-
chemical Corp., Cleveland, OH). Areas with extreme band com-
pressions were analyzed by using dlTP with Sequenase or
7-deaza-dGTP with TAQuence in the reaction labeling mixture.
When necessary the standard 8% acrylamide/8 M urea sequenc-
ing gels were substituted for 8% acrylamide/8 M urea/20% form-
amide gels. Most of the sequence was determined by analysis of
both strands; in small remaining sections where this could not be
done the sequence analysis was repeated using the two different
enzymes. The oligonucleotide used as internal primer for se-
quence analysis was synthesized in the microchemical facility at
the University of Minnesota and had the sequence 5'GATCCT-
GCCGGAC3'. The sequence data were analyzed with the
DNASTAR sequence programs (DNASTAR Inc., Madison,
WI). The sequence of the LSU-rRNA of G. muris was compared
to those from G. duodenalis (Healey et al. 1990), G. ardeae (van
Keulen et al. 1991b), Escherichia coli (Brosius et al. 1980), and
Halobacteriurn halobiurn Mankin and Kagramanova 1986). Se-
quence similarity was determined by the DNASTAR alignment
program, based on the procedure of Wilbur and Lipman (1983)
with a K-tuple size of 5, range of 20, and gap penalty of 6.
Sequence alignment was performed by aligning the sequences
according to the procedure of Woese et al. (1983).
Secondary Structure. The secondary structure diagrams were
drawn in a format based on that of E. coli 23S rRNA model,
which may be regarded as the standard or prototype structure
(Gutell et al. 1990). The figures were laid out with the assistance
of a new RNA graphics program, XRNA, developed by Bryn
Weiser (unpublished). Identification of some of the variable re-
gions and estimates of their sizes were based on the compendium
of 23S-like rRNAs by Gutell et al. (1990).
Results
Structure of the rDNA Operon and Spacer DNA
The rDNA repeat of G. muris contains two HindlII
and two KpnI sites (Fig. 1). Cloning of the operon
was performed by isolation of the two HindlII and
three KpnI fragments from genomic DNA digests
and insertion in plasmid vectors. The clones consist
of HindlII fragments of 5.2-kilobasepair (kbp)
(pGmr5.2) and 2.55 kbp (pGmr2.5) and of KpnI
fragments of 6.5-kbp (pGmr6.5), 6.2 kbp (pGmr6.2),
and 1.15 kbp (pGmrl.15). Digestion of the clones
pGmr6.5 and pGmr6.2 with BamHI revealed that
the size difference resided in 1.8- and 1.5-kbp
BamHI fragments, respectively, which hybridized
to SSU-rDNA probes (not shown). The size differ-
ence was caused by the presence of differently
sized Sau3AI fragments in the various clones which
are located 5' to the SSU-rRNA gene (Fig. 1). Since
differently sized clones could be artifacts from ex-
tension/deletion of the cloned DNAs in E. coli due
to repeated sequences, the possible presence of var-
iously sized BamHI fragments in genomic DNA
was determined by hybridization of nick-translated
Sau3AI fragment to size-fractionatedrestriction en-
zyme digests of G. muris genomic DNA. Indeed,
four bands hybridized to the Sau3AI fragment, in-
dicating that there were several differently sized
spacer DNAs present. These same bands also hy-
bridized to a SSU-rDNA probe (data not shown).
The restriction enzyme map and entire sequence
Fig. 2. Sequence alignment of G. muris, G. ardeae, G. duode-
nalis LSU-rRNA with E. coli 23SrRNA. The sequence of the
LSU-rRNA of G. muris was aligned with the sequence of G.
ardeae (van Keulen et al. 1991b), with G. duodenalis from po-
sition 856-3745 of the sequence published by Healey et al. (1990)
for G. intestinalis except that a third C was inserted after posi-
tion 3500 (unpublished observation) and with E. coli (Brosius et
al. 1980). The dashes (-) indicate spaces created in the sequence
for optimal alignment.
4. 321
£.~5"
O3.m'6a¢l~
Om'~¢*
O.~,~
O ~
Ez~
Gm'd~
G~E~
O.,,w~
Om~m
0 i,,md,~Eu
O.~,~
E.c~E
GJ,w,~
E.coE
G.,,.M~
G.¢,I*~
G~tL*
£z*l~
G.,,.M*
Ez*t~
G.,,.M~
G..~,u
G~,&,.*
O ~
Ec.a
Gm,L,~
O ~
O ~
O ~ / ~
GJ~a~
O ~
O ~
O~,~
£.~
E~,;J
£.~
G.m~
G.mnr
£.~
IGGVUAAGCG&CUAAGCGUACACGGUGGAUGCCC UGGCAG UCAGAC, GCC.AUGAAGGA it~GUG (CUAAUC) UGCGAUAAGCGU CC~ i< .................... >AAGGU (GAUA~IA) ACCGUUA |01
I...... ugagaugacgcGACC~UGCt'~UGGCUCGGGGGA-CGAUGAAGGAI (~GUG {GCUGAC} GACGAUAC -ucgsugu t<gguccaa- --gcac~g- - -> ..................... $$
I ...... caugcagaccc~ccGGCGGAUGCCUGGGCUCGGGCCU-CC,/tGG~GGAICGCG(GCGC~C)AGCG~AU-gcg~gc I <a{j -cccg- - - 0cacga a- - - >..................... 87
I ........... alC{JCcCCGCCGGCGGAtJGCCUCC,CCCCGGGCGG- CC,^CGJ~AC.AGI C'GC'G(GCGGAG)CGCGAC,AC-{Jcggu{JC I<ggacccgccc{jcccc{jagaa> .....................
UAACCGGCG IauuUCCGAAUGGG (GAAA )CCC [AG'JGOGU (UUCG )ACACACUAUCAUUAACU- GA ........ (-AUC .... CA} ..... UA~GGCGAACCGGGGGAAC~J IGA (AACA) UC 201
--Itcauc{ja I -ucUUCC~U- -- ( .... ) - -- I ....... ( .... ) ..... g gaucauc{jggg - uggtt{jg.... (..... /uccc) -cccccacucc{jaugaagaugA~U [UA (AC,CA} UA 166
--gcacc{jgl -uuUCCC,~.... ( .... ) - -- I ....... (.... ) ..... c {jcucggccuc cc cc{jgga {jacc [{jgccg/-gac) uc cccc gagc g{jccggggac gACGGGCGGGACUI UA(AGCA)~ 174
--gcaccga I -¢¢cucgaacgc- (a{j¢- ) -go I {jc..... (cccg) ..... slcgccgcc{jccu-cg{jc{jc cc -- (..... / -gc{j) cggcccgaggc{jgc{jgg{jgcgACGGG~CUI UA{AGCA)U^ 182
t.t~Gt,~CCCCC~GG;~4+~UC (JO,CC) C~GI~WCC CCCAGU~;C(GGCC,A} ~ ~ ~ I <....................... ~CUG,~t~AG .... UG'UGUGUGUUAG~ 29"/
UCN;UACOCCCAGGAGJU~GAC~cc(AACC) GGGAt3~.'CtPOC.i~ (GG<3C,A) GCC,AUCCA(;GAGUAGt/O~GI <accuc g (aa{j) cg auga gc guug >.... u{jaggacuggg{jggauguccuaaacc¢ 2J3
UCAGI~CGCC~C~CACC (~CA) GGG&UUCCCG--GUAGC (GGCGA)GCC,AOG~GCCCG I <- -c~¢ (gaa ) gggcc- Oc~> ......... gggcguggcc{ji {j¢cgggg{j{ja 2g|
t3C.NCUACGCCCC~CC (~CC) GGGAUUCCCC--17JAGC(GGCGA)GCGACGCGGGN~C,CCCGI <--ccc¢ (gaa ) gg-c{jcgcu{jug> ......... {jg{jcgc aggc gcag gccc gccg 289
C,AAGCGUCU( ~ - )AGG~UACJ~<23 (UG~CJU;C) CCCGUAI CACJ~AAAUGCJ~CAUGOUGUGAGC~U....... IGAGUAGGGCGGG&(CA~UA )UCCUG'UCUG3~UAUGGGGG 4]0
gaagaugg{j(~gaaa)ccca-caccaagGJ'F~Q3(UGCCAGU)CC-GUA|gg{jau--gga¢acaa--ccc ......... ¢aacgacIGAGUACCUC~X;CO(---ugagag)-GUAGAGOGIa--~GGAG 3M
g~&gug~gg~ggg~gc~cuc-r-g~¢gga~G~P`~UC.J~C-'~GC~CCc(~3C~u~cc--ggg¢g¢gg~-gcc ......... aCgcggglC--,N;UT++C-CGCCGCt3(---u.gagag}UGC.lV~,~,CGII41--cC,OC,3~ 3~4
~{Jagggggc(ccgagg~g~¢-cg~{JaC`J¢'.~<23~t+@PJ~AC'C~CC~GUA~gg~---ggc~g~gggccu ......... {Jcgc{JgclGAGUAC,CGCt~Ct3(---ugagcg}UGCJV3~CGa---aG(3CJ.G 392
(G-ACCAU -}CC'OCCAAC,C~UAAAUACUCCUGACUGACCGAUAGUGAAC- CA(~JACC {Gt~,A )GGGAAAGGCGAAAAG~CCCC (~-) ~ ~ C C - ~ C C ~ A ~ < - -- 52~
~GUGGUA~C)CUUCt3g~-~3P.~CU~UA~G~CCCCC`J~A~C`Cm%U~CC3~AGUAGU{GL'~C~A~A<:Gj@+AGGUC~A~-~C~AUGC~(ga~a)G~CA-cg~.~-~~¢~g{Jg~g<~ 505
~CUUC~UACCQ~'CC(~3AC~AcCC`AUAC~K:GC~j~C`~J~`GtJAGU(QL`GA~ACGAA~[;;GL~.~`~J~C~ACGCC(ccc{Ja{J)G~CC~gcu-~C`AcC-~C~ccc{Jg¢uggcg<-c~ 503
(GCGCGGcC~CU11CCAAGGcU~UA~CC~cCC~3~ACCC`AUAGO~C~ACCJ~Jk~A~(~)G~~CC~u~gc-~g~CC~CC~CC~CC<gc~ 509
............................... >ACAA CGC{WAG )C~'~"0GUGAC UG O~ UA CC UUr~JGUAUAAUGGGUCAGCGAC t~AUAUI~C UG UAGCAAGGUU~C C -- (- --GAAU 6|5
cc¢{jaccugucga(-uga)ucgauuggucgg> ................... ( .... ) ........... -GCCCGUC~CACC, GACCGAGGAGt23AGCJ,CCGAUCGCCJ~+GUCauguggc{aaa{Jgga 593
ccgllgggcccagg(gCgll)Cc~4{jgccgtlc{jg> ................... { .... ) ........... -.GCCCGUCUCC,/~CACGGACCCJ+GGAGCCGCCCCCCGGCC.CGAGUCag&gga- (- - c {j{j{j{j 5~
ggcc ......... {¢{jcc] .......... {jg> ................... ( .... ) ............ CCCCGUCUCG~,AACACGGACCG/tGGAGCCACC,CGCCGCGGCC,~GCCc{Jaggg- (- - - agcc ~7~
A) -GGG--GAGCCGAAGG (GARA) CCG< ............................................. >AGUCU (UAACU )GGGCG~G~AG~JC, CAGGGUAUAGA¢CCGAAACCCGGUGAUCUAG 690
a) gucaa{jacuucaa{jg{j ({jc¢c) cuu<a¢ ................. (aa{jg{j) ................. ca> ................... GAGCC~UCG~.ICCUAC,ACCCC,M~4GGUGGUGAUCUAC 663
-) -,JCC-UgUGGCGC~GC|C,ACG)C,CG<Agau¢ccccg....... ga ({jcc¢-) ug .......... ¢ggggcc> ................... GAGCGGC~C,GGGGCG~.,GCCC~AAAGGCGGUGAUCUAC 672
-)-~--g~GCGC~`~`G~(GAGC)GC~¢g¢~ggg~g~g~g~c~c!-tgcgggcg~J{Jc{Jc{Jggc~> ................... GAGCCGCGGCGCGUGGGCCCG~GCGGUC~UCUAU 675
~.,~R]G~GUUC;GG [U~C )A C U ~ C C C , J~CCC~C-UA-AU (G ~ ] AUUAGCGGAUC~CUUGUGGC~U(~ G ) ~ ~ U ~ ~ 812
AC~I~ACCAGGGUGAAGCCAGAC (GAAA] CC~G -GUGCUGA (CG~T~AAA) UCC~UCGGUCGAGUUGAGUGUAGGGGC {GAAAG }ACUCAUCGAACCACC~JC~IA~ 786
GCCCGGCCAC-,GGCGAA~JCCAGGC (GAAA )GCC~GGAGGCCC~CCC~G -GUGCUGA (CG~?~3AA} ~ ~ C ~ C (~ G )A ~ U ~ ~ ~ 79~
GCCCGGCGAGGGC~ AC,GCCGGGC (GAAA )G C C ~ C C G C CG~G-GUGCOC,~ (CGCGCAGA} U~GCUCGUCGGAGC ~ C (~ G )A C ~ U ~ C C ~ ~ 798
t~CCCGAAAGCOAU {UUAG )GUAGCGCCU~G~JGAA~CAU !......... CUCCGG ]GGGt~GAGCAC~I/J-OCGGCAA- -GGGGGUC (AUCCC) GAC~JACCAACCCGAUGCAAACUGCC, AAUA |CC~ 924
CCUCCG~AAUG~JCJ [CCCA )GGAUAGCCGACAUC ...... ~uga~ca{J~u{Jc¢cu~AGG1~AC'GcCAc~AUCGgJGGGA~JAGGGAG~gu~a~UcCJ-UAU~cA~CcUCGAA~CAUGAAcA~g{Jg 901
CCt~CGA,4AUUUCU [CCCA) C,C,~U~CCGCCACC...... I¢ggcagc--u{jcg{j¢ ICGGU^GGGCCGOGACCGC,CGGGAGCGGC,GGA(¢cu{j- ) UCCCU-CGUCCGCCCCU~CUCCG~kcgI g¢cg 908
CCUCCGAAAUGUCU(CCCA) GGACAGCCGCCGCC...... ~c~gcagu-~u{Jcggc~CGUAGAGCGcUGGCCGGC@C~GAGCGG~C`G+~J(c¢ug-)CCCCt~-CGCCCGCCCc~CAAA~UCC~G~g~cg 911
................................. J~UGtT~UCAC~,C,AC-ACACAC(;~CGGGU(GCU,~C) GUCCGUCGUG~C,AGGG(~ ) C~ C ~ C ~ C ~ 1017
~Jgg~g~gg{J{J~sacuu~gug{J~c¢cccuca~gu;pJ~u~G~`UGUCC~A~GGGCCUCUCCUG(G~U`qAG~CA~`~A~GGGC.~J~G^CGG~C`AUG~CGACAGUCG-GGOC~J~G~`1JG1~GGa~UG 1023
¢{Jcc~cgggg~-g¢uuggcu{Jggcgc~gu{Jggg~gug~g~GPGUGG;cG~UG~+J~GGCCUCUCCUG(G~Uj~J+.G~cAGG^~'*GCCJ~GACGG~C~AUGAU~C~CG-~C'CcAC~GUGCGG-ga~CJ~+~G~ 1029
c{Jccgcccc{Jc~-gcuggc~ugg{J~{J{Jggcggg<~g~augc{J~cC~C~CGCG~C~GCCCCU~UG~G~UAAG~cA(;~`A~;G~GAGG~GG~C~AcGAU)~CGCJ+~G~GGC~AC~GGUGCGC-cg~cGC~GG 1032
UUAAGUGC,GPJ~CGA~CCAC, J~4GCC~,C~UGLV.K~ (Ut~N;JU~C,CA} C,C C A U C A ~ {~J,4AUA)GCI~ C ~ C ~ C ~ ~ -U 1141
CUcgau -c as a- C G G U G U C C G U C G ~ ~ C (tTO~C~GUA} ~,M, G C C U C U ~ {GUN,CA)AC I C C A C C A C , , C C C , ~ ~ C - AC~ 1145
c'Gga g¢ -g aaa- CGGCGUCGGUCC,GUCC~CO(:W,J~C, GUGGCC(CO~,GUC ) (IC.,~,UCCUCC~A~ (~ )AC I ~ C ~ ~ ~ C - ~ 1111
~gc--g{Jaa~cGGCGUCc~GCCGGUccCGA~AGCUGGA~.~3GUGGCC~ccAGNkGUc]GGCAUCCU~CAGG;AGUGU~G-GAAcA)A~t~CA~cAGCcc'g~UcGG~GGCCCGGAAAAUc`GAG~-C~C~cG |153
AAACCAUC,C l ACCr.,A%C,CUGCGG| CAGCC-,ACGC|~JAO- ) GCG~UIGUUGGGUAGGGGA~UGUaa- GCCUG-CGAAGGUG~ (G~ ] ~ IA ~ . U ~ ~ 1260
AAGCAUACGI AUCC,Ggt,ACCCGACI ¢ga{ja.... [u{ju{j-) .... '.~cucgC'GUAGGAGGUcguc~`ug~uc--ga{Ju-cgasgc~augg{J(GUGA]cac{Ju{Jgu{Jg~au~{JagucAUGAUUGC1J~AUCUCGG 12.%
C,AGCCU<3C,G IACUCACGCCCGGCI ¢ gcc a.... [¢gCg-] .... ugg~gG~U`~GC~A~Ccgc{J{J¢gcu~--gggu~{Ja~ggc{Jggg{J~C~CC'A~ccc{J~gcu`gg~gg-{J~;CCt3GUGCAC'AUcL+C~G 1261
C,&GCCCCC~IACCCC-CGCCCGGCI cgCc;r .... (¢gcg-| .... cggcgC,(;UAGGAGGCcgcI{J ag{jc ¢ccgggg{jc{ja aggcg{jc{jc (GCAG) gc cc cg¢cgg I accgg- cCUCUGGUGCAGAUCUCGG 1266
CA~P~GU~J~CC~U~G(:GGGU( ~ ) GCCCC,CUICC~CC,C,~- C~CCN~GGGU',JCC'~G~J~CG (U/J~AU}CGI GGGCAGGG~,AGUCC,ACCCCUAAGC,CC,J~GGCC(C.~) <............. 136"/
G%GUN;U~,GCCAINACUCC- - - (au¢ -u } o--GG~G IGCCU~aCgU~TJ~;kOGGGUUCCGU~UCCCOG(UCGAU) CAI ~ ~ g a -ucc~ a......... (.... ) <{jucc{jauggu {gu 131~
t.~GCAGUAGC~COACUCCG-- (ucacc } - ~ IGCCI~gg -uGGAC,J~GGGUUCC~'UGG~CCC (GGC~ )~ | A C ~ gg-C ccual ......... (.... ) <-{JC{JCgU{J{Ju (gU 1361
C3+.C-CAGt~C'CCC'~CUAC`UC~--~g~--CGGAG~GAC13g~gggGG]~G~'11~G~)~~C~{Jg~gccuaa ......... (.... ) <-{jcggcgggu ({jm 1367
................................... >GGCGUAGUCC~U-C.,GGJU~ (UUJI~UAIPJ} CCUGUACUUGGUGUUACU<................. >GC~CGGACJO~,GCUA 14~9
a¢ ] gccguaccaugggg (ucou )ccucllcaggacg> ......... ~tu~g~ugA~CGG~N~CAUc}~CGG-U~guaugau{J{J{Jg<u~u~ag~gu{Ja)cau~u>uu~a~aa~{Jguacuugaa~gaag~ 1465
c{j } accagaggacgua¢ (ccc¢ ) guacgc¢¢gucg> .......... aaafigg{JaA~3G~(;G`~J~`C~CU~ccGGU~G~{Jcg~{Jg~uc<-gc~g(gc{Ja)cg¢~{Ju>ccg~ccgag~a~-gugca~gg{Jcc 1468
ag)accgggaaggg{jcg(ugcc)c{j¢ccguc{jaac> ............. g{JggaGcC~G~CGC~I.`~GA`CU~(-~``~`3CAGgcgcg{J~cc~c<~g~g~gaga)~gc-cc>g~c~¢gg~ga~gc{Jci~g{Jgga 14"/1
UGUUGGCCGGG(C C . ~ ) CCC~CGUGUAGGC I .... UGGUU-UUCC~GG(CAJ~U) CCGC,J ~ ~ G G C G U G A U C ~ (UACG} GUGC~GC~AC3u~UGCC 155"/
~tcaug{Ju{J{J{Ja(gucauccc{J-}u~cu~tc~cc~guc¢cc~---~--cauggccag~-~ugu~(w4~g~gu¢gV`g¢augg¢ca-u{Jaug{Jat4ca ............ (.... ) ......... CaUguc-cuu¢ 15~
a{jgcgau{jg{ja({jccgccgcg-}ucccagccccc{jcu¢¢ .... I cgcgugcccccga{j-a¢ (c~U;lC) gug{jcgccg{jg{jcac{jg{ja{jcg~tcu .......... ..;{ .... ) ......... c{jcc.gc{j{jcc¢ 1~61
cc{jcggc{jg{jc (ggc{jc¢¢c{j- } {jccc{jcgaacgcccc ..... I{jcigccccc-{jgac-gc(cctugc)gcg{ja{ja{jg{j{jggcccgg{j{j{jcggac¢¢ ........ (.... ) ......... c{jcgc-guccc |~4
CI.P.,Cb'UCC-AGC.J~.JUU~,CCt,'C'U;O,C,CJI+IJCAGGI.t~ C.&UC.JU~UCGUACCCCJU~CC(~ )~ I~ ( ~ ) ~ C ~ = I I ~CUCGGG~CtJAI]:OC,'t,A~IA 15"/9
g-gucaucc~u{jaaacccguugu- --guCCCCcccauccU{JaUCGUAC-CACJU~CC(GCk~,CA} G(31A~U {~UCA) G C ~ - ! Ig{jga gAC.,AC¢.~.P~ ~ ~ ~ * • C¢~ ~ ~ 1673
g- 9uca¢cccugaaa -ccucg{j{j- -¢{jgggcgugcu{jcgcugCCGt3Aco - -CC&CC (GCIkGCA)GGIA ~ (GAGC~),)GCCUCUGGGC- I | g g g a g A C J ~ C ~ 16"16
cggccgccccu{jaaaagccgg{jg .... g{jcgcc{jgc ¢gcgcgCCGUAC-- -CGACC (C,CAGCA} GGIACOCCGGGGU(CkC~'.A)G C ~ I I g{jgagcc,,l~A GC 1610
UGGUGCC'GUJU~(UUCG)(~,AG~GGCACI GCU~UJ~U~J~,GUGA~'IX:C(CUCGC) <............ > G G A ~ U C ~ I . U t C ~ U G G C ~ - C ~ (AUUJ~)JU~ 1717
UJ~C,CUCCGU~C [CUCG)GC,AGAJ~GG,'~,'UIGGCUCU{Jau............ ( ..... }< .......... ¢g> .......... I+UCAC,k~ CGG~C~kCCGJ~GG~',~r~gCC~CUGU~(A~I/~) CJUt, 175"]
UJ~GCUCCGU~ (CUC'G)G~tJ~GGJ~'U IGGC~Ugau............ ( ..... ) <c¢(gc¢-] gga-> .......... cu ~CGG,'V=~--~,.T, GGAU~ -'~OLIGO~ (AJ~l~ ) CJ~ 17(6
CGGCUCCGUJU~(CUCG) GGAJU~GGJ~'U I GGCUCUgI¢............ ( ..... ) <gg (¢gcg) ¢cgg> .......... g U ~ C G ~ ~ - CGAOLIGUL~{ACUAG)~A 17"/1
CACA~'..;~.CUGUGC-AJ~ACAC[ - {jama) GUaGACGUAt~a;tGUGUr_,ACGCCUGCCCaGUIC,CCG-CJO,OGUT3~,k~JI~C,J~DI~(~t.U~C{C,C~ ] ~ ~ I CCGGUjUU~.CI~C 1G 1906
CAt,I~GCGUCGUGCcAGtCG%t[caug{j )A~cAcr.,AcGIJG&UUUCUGCCCJ~GU I~Ccgucl¢ .... ¢augs~ .... (gagg] ..... uucau .... ¢c~agC~ I CUG(P,JgUt~ IGGCIG 1162
CACAGcGUCGUG~.~C'~GU~c~-u{J~v`~CGA~CA~:~`~J~GLP`'J~+.UUUCUGCCC~J|GCCJ~C¢guc~-~-~gug~a .... (gig{J) ..... t.~C41~¢.... ¢¢&¢g~ | ~GUA~ |GGC|(; I1~1
CAP-AC,C~J~-~.,CCG (- C~¢¢ ) ¢ ~ ¢ 1 3 ~ 0 ~ * ~ ~ i ~CCCNCa¢ .... ¢gUgla .... (gCgi) ..... UCCg¢.... ¢gaIgCCI~IGC,~IO 1174
C,CC'GU(~CIJ~UA) AC(~/CC'~,AGO (U~CC,,lUh~) CCb'UIG't..CG~ (IJIk~3) U C C C , ~ ~ ~ ~ ~ ~ {U 2021
Gr.,AGU(JO,CUAUG) ~ G G (U~,CC,~AUG) CCUCi GUC~GG( ~ ) OCCr.,AC .~C,AUCCP-.&CU~.~C~CU¢CgUpa- ccuacag,~CCG (G 1913
GGAGU(~ } A~-'~J~,GG (UJV'-,CC.AA,'~UG) CCUCIGUCGGG(UAAt~,I) UCOC~C~U~;,~Jt, UGGACC,kACC~J~-,AUCCC~L~I.t~.~'CCGAGI~C~'GC¢',dC¢ F gl t- ~ , ~¢~ (G 1919
GGAGU~AAC~&~G}AC~q~r~RLqAG~(~GCCAAAL~CCUC~G~CGGG(c~UU}UC~J~C~JGCA~C~AA~C~AC~~~~g-gag-~(~ 1994
C-,2~UG ) P-AGUGOACCC"C-,O~~~CG't]GA IACC'trtX~tlq~C~CO IG A A C A U O C . A G ( : ~ ~ (r.,~'CA}G~COC,C 2155
CAACGGG) C C - , G G ~ C C U U U t ~ /~ C , ~ . ~ C U l GGGGC,AJOGGG~UCAUCC,GUGCAG(~r.~CAGGGGJ~ ................ (g¢cg) ...... 2016
C.,~.CC,GG) Cr.,(~UC,C C ' G G ~ C C L ¢ , ' t ~ G IAP~U.~GACUCC~CC'CGG~COI . C ~ C G ~ ' r ~ ................ (9c¢g) ...... 2092
cJ~c(;~.,~ ) CGAGGGCC C,~C~CCCt-~',.q3G IAC,CUUGACUCCJU~CCGGG~COI ~UGGGGCGG~,CGGC~r~__ _n,~_-_~.~................ (gccg) ...... ~097
N.~,AC,CCC,ACC~,~AUACCAC~G-~UC,~UGU~'(]UUG~¢CG (~U~UC) CG~J,.P..C<............ ~ (0~} CGG'd-Cl ~ 2262
....... ¢~ J ~ . C C CUC~CGIJUGtr,.'C~ CAUCCCA¢ucic ......... ( ..... ) ........ < ¢ g a c u u a c a u ¢ ~ ~ # ~ ; ~ Q ; ~ O ~ (G~;) C~,~GC I~CCU 2110
....... ¢~C.&CCCUC~CGGU~CCACCC, CCC~ac ......... ( ..... ) ........ . , m c g g u u c g g c g s ~ (GGG)CGGO~ IgCCU 2116
....... CC,CCCC't;P+..k~P-ACCCUC,ACGC,CCC,CO~CGCCCCGCUcic......... ( ..... ) ........ <¢¢ggucgcpcgg>GG~C'CC,C ¢ ~ J O ~ Q (GGG)(~,a;ICl Gcco 2191
cc (U;U~'..~J;U~C] UaA<]C.P,GCACC,JU~; UUGC,CI Uk~.U~ (UC,~;,+.C~) ~U,JJ~,,AUOC,~ mJC,(~/¢;cG (UC,~OGG-) C~CJ~.AGG~C (a&~} ~..~ 2312
GC(UAC,AUCACJ~OG) GCN~,CGUCCUA~ IUCAGUGAGG(UCGGJ~A) ~U~-,CUUACUUr4~UA~IACC-(¢cugucc } - GG~J~Clb'0~'C~3 (~) G~0 2299
C.~(UJ~-,*'~CJ~CACC) ~CCAC(=GCC, AGCI ~ (~C,J~A] CCUC~ P . . t CC,CCC~C~CCCC,CC,CCA(¢¢agucc ) CGGC~'~-'gg-~C (C,jO,k) ~.~ 2306
C,C(UACACCUC,ACC) C,CAGG~UCCCACGG~;@C,C I ~ [3~'~,A) CCUCCCGCGGADCAC,AJ~GGCkCM,p . , C C ~ C ~ C (¢¢¢~ ) ~ (~ } ~ 2312
GUCAUAG'LP_~UCCGGU(~-UUCUG( -AAUG- -GA )&GGC,CCAUCGCI.tCJ~CGGA~(X;GGG&~C~C~ (~ )A ~ ~ ~ 2.~04
G ~ C c U l ~ c c ~ c t R , ~ l ~ c c g c u ¢ ( c g g u a l u c c J g a g c g u g g i g - g u g a c i g a - - ~ A C ~ ~ ~ ~ ( ~ U | ~ ~ ~ ~422
GG~LIACCC~UC~-¢gCU¢ (gggca - -u¢ ) gagcgcgg&g- gugaci ga - -IIJ~GINACCACAGGG&UAAC~~ (~ )~ .t~"IN~GAU 2426
GGCcL~CCGAUCCUUCGC-cgc ¢¢ (cggcc - - gC )gggcgcggig- guggCagl - - a+~GIPJACCkCAGGG& CC,CCC,AGO:;{CCCC,C )AGCC,Ik . <~,N3 2432
G U ~ |AUCACIAUC~CUG |AAGU}AGGUCCCAAGGGUIAUGGC(UGUU¢)C,CCAUUUAAAGUGGUACC,CC.~GCIr,GGUUI~GAACGI~C(~) ~ ~ ~ ~ 2625
CFJCGGCUCi UUCCOIACC<~JC-CGCAUG[CJ~OC| G<~GCGG/0kGCGUiCGGAU(UGUU¢)ACCCGUCUa.J~GGA~ .G~CCGU~(GUGA}~ C U s c ¢ g s c ¢ - 2541
GOCGC'CUCItFJ~CU i ACCG~rCCGCGCG(CA~JC) GGCGCGG~.(;CGU IC'~qAU(~ )ACCCGCUCC-A~GUG~-Gt~-~GGGt,FJ'JJ~CCGU~ (GU~Ik} ~ C U a c r . g a c ¢ - 2545
C,UCC,GCUCIUOCCUIAC'O~L~CCC,CC,CG(CkCC)C-P.,CC,C~.,J~.GC~J It'GG~U(UGUUC) ACCOGUUCt. ~ ~ ~ (G~) ~ ~ a ~ g c ¢ . ~$1
CC,CUGGAUJ~CUGCUCC (UAC,OJ~ [C,+~_~)GG~C } GG~UGC~oCJU+~C~ 1UGUCAU(GCCA) AUGC,C A ~ ~ ~ (0 2"/4"/
- auc u¢ caugcgu¢caguACGUCGGGU(ClkGUAC[~GA) GC,J~C } ACCCGUCGCC4k~cUccg IU .cUCCCGG |UUGIX~U(~) ~ ~ _~uc~gg.. ~ ¢ ~ (G 2651
..... ~cgggg~gagagcAC(~cGGGU(cAGUA~|Gk3A)~J~GCC~C~GCC.~Q~¢g~ggu-~v~CGCGG|~(~)~~-~g~¢~gg--~g~¢~(G ~$$
..... ccgggg¢CagIgcACGGCGGGC (CN~JAC (G~A) GG~AC} GCCCGCCGOGGGCcgcr~gc-CCCC~-GG IUUGCCC(GGCC)G G G ~ _ ~ g c c ~ g . . ~ (G
~I~.,CAUC~ ) A ~ C ~ C C ~ C , AUC,J~JUCUCC<.......................................... >CtJ~CCCU (Ut.~} AGG~J~GG |JU~I CG~I3 (AAGACC,A] 2126
3L~'GCCUCUA ) AG~GUCCAUc¢caccUa¢.... ucgcaug;Ig<-uugagggg ........ cuu.............. ¢cccucga> ........ (---) .......... uu¢ I g¢ I ccgg~l (uugaCgll) 2731
~JcUcUA)AGcGCGC~c~cg¢¢u¢g----g~gu~<~g~g~ccuagcc¢~gcg¢&c~-~¢¢s~gg¢~> ........ (.._} ........... ¢¢lflglccgg~;(ullgacga| 2"/49
A~3CCUcUA~AC~'CC~CGCAC~cgc~u~g----¢~c~g~<~{Jg~{Jcg~g~c~cag¢cc~{Ju{J¢~c¢gu~g¢¢gag~ggc¢¢> ........ (_..) ........... ¢¢I g¢I ¢¢g~;g (gllg&¢¢l} 2759
CC,ACG0"OC~GGCCGGGUGUGUAAGCGCA ........ (GCC.~-) ...... UGCGUUGAGCU ....... AACCGGUACUAAtP.~AC c~r~ c U-~AAC Cuu I 2~04
cccu~nau-cuccgaccuac-uguacggcguggagcuuc (uug¢g)ll¢cg¢cuga---ggut.n~.g{j,..,ucggguugg¢llU+UCC¢.¢ccgag.............. I 2~11
cccgc.cg- - - Cagcgcucc - ugua Cgcg;Icggag¢¢c¢ (gcgcg) accgccuga- - -ggaacgccc ......... cugccc-g¢cg¢ ............... I 2117
¢¢¢ggcg---cggcgcucc-uguacggcgcagagcccu (gcg--)iucgccuga---ggga¢ gcg¢¢--ugca{jagcgcg-gggcgg{jgcg ........... I o!~
5. 322
for the SSU-rDNA, 5.8S rDNA, and spacer DNA
and part of the LSU-rDNA of G. muris rDNA is
presented in Fig. 1. The position of the LSU-rDNA
is indicated by the first 23 and last 37 nucleotides.
The entire sequence of the LSU-rDNA is shown
separately in Figs. 2 and 3. The sequence of one of
the shorter Sau3AI fragments in the spacer is indi-
cated in Fig. 1, where it is aligned with the sequence
of the same region corresponding to the largest
Sau3AI fragment found. The observed size differ-
ence is obviously the result of variation in the num-
ber of repeated sequences in the spacer DNA.
These repeats are indicated in the figure. The
boundaries of the mature rRNAs that were deter-
mined previously (van Keulen et al. 1991a) are in-
dicated by arrowheads (Fig. 1). The exact position
of the 3' end of the LSU-rRNA is the least certain.
The entire operon is 7668 nucleotide (nt) long and
contains a SSU-rRNA gene of 1429 nt and a LSU-
rRNA gene of approximately 2698 nt. The overall G
+ C content of G. muris rDNA is 61.9%. The G +
C content of the LSU-rDNA is 57.2%. An open
reading frame of 281 amino acids was found in the
spacer DNA.
Structure of the LSU-rRNA
The sequence alignment of the LSU-rDNA includ-
ing 5.8S rDNA of G. muris with that of G. duode-
nalis, G. ardeae, and E. coli is presented in Fig. 2.
Overall similarity of the LSU-rDNA, not adjusted
for secondary structure, was determined from se-
quence alignment of G. muris LSU-rDNA with
those of G. duodenalis, G. ardeae, E. coli, and H.
halobium and was 65%, 71%, 46%, and 47%, re-
spectively.
Secondary structure models for G. muris,
G. ardeae, and G. duodenalis LSU-rRNA were
constructed. The structure of the G. muris rRNA
is shown in Fig. 3. Domains (roman numerals)
and regions of interest (A-H) are indicated which
are based on the secondary structure models de-
scribed by Gutell et al. (1990). Since the 5.8S do-
main differs among the three species, the ones for
G. ardeae and G. duodenalis are shown separately
in Fig. 4, where the tentative helices A and B are
indicated in the G. duodenalis structure. The size of
the same region in G. muris is smaller, 6 nt com-
pared to 19. The same is true for the G. ardeae
RNA, which has only 5 nt here. The regions where
the largest differences were found among the three
species, namely C, D, F, and H, are shown sepa-
rately in Fig. 5.
Region C consists of approximately 34 nt in G.
muris and 35 in G. ardeae, but only 16 nt in G.
duodenalis, with the helical segment of only 8 nt.
Region F shows less difference among the Giardia
rRNAs, being 20 for G. muris and 26 and 28 nt for
G. ardeae and G. duodenalis, respectively. Region
H is also larger in these latter two species. Of these
three regions, the ones of G. ardeae and G. duode-
nalis are more similar to each other than to that of
G. muris.
Discussion
The entire nucleotide sequence of the rDNA operon
of G. muris has been determined. The organization
of the rDNA of G. muris is different from that found
in two other Giardia species (G. duodenalis and G.
ardeae). The spacer of G. muris rDNA is larger
than the one in G. duodenalis (van Keulen et al.
1991) and it is the only one of the three that is het-
erogenous in size. The difference resides in se-
quences upstream of and close to the SSU-rRNA
gene. Many tandem repeats are localized in this seg-
ment of the rDNA. This phenomenon of tandem
repeats in spacer DNA is common in rRNA genes
and is believed to be involved in transcription reg-
ulation (Jacob 1986; Sollner-Webb and Tower
1986). Why there is heterogeneity in the G. muris
spacer DNA and not in those of G. duodenalis and
G. ardeae is unclear.
An open reading frame (ORF) in the antisense
strand of the G. duodenalis LSU-rDNA has been
described by Upcroft et al. (1991). There was no
evidence for a similar ORF in the G. muris se-
quence. However, an ORF for 281 amino acids was
present in the spacer DNA, on the same strand as
the rRNA genes, which would yield a protein of
about 30 kDa. Whether this ORF codes for a real
protein in G. muris remains to be determined.
When the SSU-rRNA of G. duodenalis was an-
alyzed, the rRNA appeared to resemble in size
prokaryotic 16S rRNA. Phylogenetic analysis
showed G. duodenalis as the earliest-branching eu-
karyote studied to date (Sogin et al. 1989). A similar
observation can be made in the case of G. muris and
G. ardeae (manuscript in preparation). As might be
expected, the LSU-rRNA of Giardia shows similar
features. It is, with respect to its size, similar to
prokaryotic rRNA, but contains in its secondary
structure many typical eukaryotic features. Nota-
bly, these eukaryotic features are among the small-
est described so far. Space limitations prevent the
discussion of all of the eukaryotic signature features
Giardia maintains, but a few will be discussed
below. For an optimal analysis of similarity and dif-
ferences among the Giardia LSU-rRNA, the sec-
ondary structures of these molecules were con-
structed. The structure of the G. duodenalis LSU-
rRNA was taken from the sequence determined by
6. 323
A GUA UAA AOu O
X OeAAGCCU 0--6 G OU U G A A
G II "III I II AA A °~uc UocUGGccAuG UGGAU
AA uGCGGUGGA ACGOAA G C III1"111111111 II1"
U AG_ G UuCAGAUGACCGGUAC CCUG A C
u • ccc COG--G
U--A
G--G G
AGAG-- CO G A
u A cG GU
A -- U GGGU~ -- G
O--C ACU G U--A
C--G A--U
U--A c --O
G o GA u A--U
C--G c c
G--G G~.U U u A--U
Uo_ G G~C U G A--U
G AGA G--C UO G~C
o--c u~ GGAG O--G G A G--G
A A G--COX A G C--O G--C GU A
A C A C--G GU U--A G IJ A G
cA AAU U A--U G GU U U A G
U ~ GG GUA GAUG CAAGG %
U II ..I Illl IIlil t~c
AA - OUUGC c
co.........,_~ %A C
GGC.%U%AAG GCGCA G U ^GuGuGGUAO G CU GA GAcA C-- GUAC CAAAGU
U ~0G G U O U G
A GA AUAOGGG A
O II IIIII1" C
~^ ~"* G°. ,
GA%*GC CuGU UAUGGGU A
U~u_.~ 2~u GGC u
G * U AA * Uo GGG / OCG
G--G
: .....-o°-G O'G,O.,
o * A U--A O--c
G--G G.U --AG_ G UG CAAGGUGA
0G II1" U
G--G Uo u
O--G G--G
U--AGA O--G GA~. G o
u ~ G~....
A--U Cu.A C--G
%G~ Go G-~o AC ~O G--G
au a U--AUu ca* uU TTI 3'half
AA A c GG
OGAOGU U UAG
^G
GGGGC G
AGGAGG'~ *lit IIIIII
OGUGGA A
A lilt uGGG • U U U
AuGGuGGUG %G A
, o .,°,>G/
oG-G ,.o2 ,°G°<'c GG'O::/ "
A~ GU- o A--U
U~G ~*U G
~.%o *Go .. G* G--o Go~.G U--A A
,o^Go~: ,o G . . . . .
A ~A~ cG U--A G
U ~G o A Gu G A--U C
C~A U.UG GA . O--G U
E c..~o ~,o G X e~ A C--G G
c. oG ¢Cc A o-~ u -c u
u G ^ ."G G:c Go A % U G C
Guu- AU~ a.u ,,* ~ o^ ~-,.G - A ~--~ 5'5.8s G o"
G~ O • G O~ ~O UG~%C"~ O--C U--A
AG~ G U--A G A U G o o G--o
A GG G-- C GGCGGGGAAGUUGAGAAOUGAAGGGAA A * U A A U ~A 5.8S 3'
G. G o^°-- GA * O'~ . . . . . . '
G..,CA u .GGU~ AGGG*GAOG <.'G G--G ~:g
G~ U O %% GGGG UGGUGoGA GU O--G U U GO%%GCCGC
°" .... oo . ° e °,A Gu cG ,,'GG GG.. u
G G u %%~ C A G A ~ GA'. u u ~- GOG%'UP s' 2as
A.*G A" A ^ A ~ A * A
u u G c G A G G A uG%%~k
o-,g.%, G,o ~GA"'''............. ~G-Gu~ oG ~G "A,,
U~G A G
A G A -- U G U
GU G A U
GA AG GGGUGGGGGGA
G II IIIIIIIIII G
G UGA GOGGAGGGOGOGAG A AGOG ~OGuGGA AG u
GA UA oGUAG U AG~,,% G C ~ ~ " I UAA
AOGAGGAG~ UG~CG~U AA U A ] CaGe I I O
G U
x;: ~ A %A G~A~Ac
u,~ G U A
o °-c; GAGA
UA ~GA G--G
O GGG~G G--C
A G--G
....
A
AAG A
caG--G
ACCUG~GAUG
A o*G~
UA ~'~ GG
AGG .% CU A
o t'cGU U
AGuU
oliiii III IIIII III GA
G A G c AGCAAG COG ACAGG GGG I A
AA C U A AUG AGo
C--G
u U
o o
Ace
Fig. 3. Secondary structure of the G. muris LSU-rRNA. The secondary structure is divided in six domains, indicated with roman
numerals (I-VI). The positions of some variable regions are indicated with A-H and are based on the compendium of Gutell et al.
(1990). (A) Domains I-III and (B) IV-VI. Continued on page 324.
Healey et al. (1990), referred to as G. intestinalis in
their paper. (Minor corrections were made based on
our own sequence analysis.) The following general
features appeared when the various domains and
helices were analyzed.
Domain I
The first feature is what appears to be a 5.8S rRNA
that is not a part of the LSU-rRNA as is the case in
prokaryotes and Vairimorpha (Vossbrinck and
7. 324
°A QG
A--U
A--U
a °--O
^--U
0 0
o ¢
° ° ¢ ^
G.V U °
O--~ G--O
GO ^ 0 o 0
O0_ou ° 0-°
O--G ~0u
°-- cA G • UA A AU--A
G--° U--A A U~ GU
O--O AG--O G.~A
O--G °--O °,~u
AG'UG U CU
U III I'I
FuO% ' °^ueu°u %° _ o^^°u^~°%~ ouooou^~..
A -- u 0 - G "Guo
°-° °0-o~ %^o u°°^ "• oV--A O--G G G • °
o-o o_,
U--A A -U
0 A CA G--O UuA°--0°
~°^, °_o °_co°^co,^~_~
°GGG GGAGU A
Ucl~ III1" O
0A 00VO~ A U
AuG UOAAACGGOUGAAG, UG
°--0
O--G
A--U
°--0
AOO--G
O~o_g
G--O
~°°--CGuo A
5' hall IV ~Cou
A
0A GGAA
UG
Fig. 3.
GOoG
AGOAuA GA O--Ouu
o 0AGO(I c
o tllll
U OU000 A°
A--U
C--G
U U
A--U
a -- 0A
G--cG--O oGGAA
o-, %_~A--U
G--O G--C
G--O A -O
G--¢
U--A
°--0
~-" °~°~°°°^~°,i,, o^(] .°-°°o
0 oGGOUA0 O _GO
o -- OOACA U --A GoAU
U.o II AA
A--U uOA O--a A OG°~
G--C GA OA G--CAA
O--° A ~; A--U U
A--U
U--A
0--0
U--A GO° Uo--°O
A--U ° G--O G--O
G.UA • G G~ G IJ-- A
C_G° U " _
O_GuU 0~0 O--O
A=U II I IIIIII G
oo OuU°AU°Oo0 v
G 0 Ua_oC
U--A
G'U °'UA A AAAG
A--U g °-c ° A
U --AA O
°--0 o O-co
U--A o-°UV ° °-°
U--A A--U
O--° O--0
O--G 0 0
A G A O--G AUAA OUAAUG
0G A
U °G 0 G U
Vili;iIIUGGUCGG GGG.II Akll ~cuUGuGGcoGcO AAGcGCc
^CCAa¢CAUCC UU • • • • I I I • I I I C
U UA U UAG UUUOGGUG °G A
U ¢ UU ¢A a AU
oo o .ooO. /
--,,, ° o .o oo°^
~ ° o oV o^ ....X 0,.,~ as/*" u° .% °%.°U • 00 A Lr'e A 0--°
Q A¢/• GU O G.% GO G--O
• • 0 U A °A--U
a-%-u u ^" ° o, ~o,,, A uo-°
o o~ ¢ a.u
....... ; ,,ooc0 III k
0 UUUU000 0 0 0 AA~UAocG UCQGOAU° °
0 AG ^UOO%~ /°A U¢ II IIIIII- A
Ao u o ~ooo°o~-o u uoo^oooouo°o ~
°-o ° ° %G°¢~ g:~ °"
°_0 oU. oA o .% ¢_o
~.o o~,o o ~ o o~-~,
° Oo Uv°
oO. u o o uc %~•%
Ao o ^ %,o
o
%. ~o°
VoG
0--o
o--0 A
OO--0U
0 u
A O
°CoO u
Continued from page 323.
Woese 1986). It is possible to isolate 5.8S rRNA
from G. duodenalis (Montanez et al. 1989) and S1
mapping has identified the position of the 5' end of
the LSU-rRNA (Boothroyd et al. 1987). The size of
the Giardia 5.8S rRNA, however, appears to be
smaller than that in other eukaryotes (Edlind and
Chakraborty 1987; van Keulen et al. 1991a). Al-
though the exact 5' and 3' ends of the RNA are not
known, the 5.8S rRNA sequence can be estimated
from the base-pairing scheme. A similar, though not
identical, structure was obtained by Edlind et al.
(1990) for the G. duodenalis 5.8S domain. The
shorter size of the Giardia 5.8S rRNA seems to be
the result of shortening of helices A and B in a
tentative G. duodenalis model. These helices are
absent in G. muris and G. ardeae. An absence of
helix A and B is also found in Pirulla marina; helix
B is absent in many but not all plastid RNAs [see
Gutell et al. (1990) for a survey of these irregular
helices]. The largest variation in size is seen in re-
gion C, also described as domain D2 (Michot and
Bachellerie 1987), which is considered a eukaryotic-
8. A uuUCGuuU
G 0-~ UA*- s's.ss ~
C -- cC GGU -- GAA S.8S 3'
U--C G--U
G--C U
G-- G--U C
°.... s °::°:::°ooooo--u A A
O -- G G U cG
A O
O O CG-~ U A U U o uU~,%GU t
A G G "- eG~.,U C--G cc G oGo~'uA 5'28S
.9° }.:o .... o o.-uA G AGO CG C UU
GG UG A
G A
GU GGUC GuUUAAAAGA UOU U uG
-- UIII
A A U C O
N U
GA GO GGUCOGGGOG G G
II IIIIIIIIII A
CG UG UUGGGUUUGU C
~, e A A~% UuGUUGAAo
GAAUoAGGcGAU GOGGu UA G% G d I[['r UUAAAGO%C OUGc I I G
CG~C
u- o ~ o , 2 ,o .... Ao,O
G % G N G% C G cGAG% U G G~ C A
A CC A%uCU A AG
e AGG~CA
C--G
A o--u
e G__ o G--OG--U
G--UGu A eU,~,A ,~G- o
GO %%CA ~' CG GOA sAG O UG GC
GU G ~' UG A CGCUGUGGGGC GGCG GCC GCUG CC
c G u U II'l'l-III IIII III IIII I G
e UC G A G CGGCGCGUCUGGGCUGU CGG UGGC GA
A e u AUGo --G A
O--G
OU -- Gu
GAA oG
GAA
U--U
CCC--GCG
A
325
5' 5.8S
/ G G e 5.8S 3"
AC ¢¢ AC I
Gu
~° ~ ~ o
% Ac_GA G
G--C GOG--C
G--C U--A O
U--A G--C A
G--cC--GA G--¢ G "~ ,
G--e AUGC--GG GA "oUtAGeS 28S
c - G C C C GG~C C
C -- G G U C~cO
C C A CG U
%. CG GGCG A ,GG
AGGGCCGCUCAAAAGAc U U GG
III C--G G
Oo................ :
AACG-C U C C
II IIIIIIIII C
CG UG CCGAGCCCGCC
GAACcA GGcGAU AUG U G G Illll UAAGG C AGC~c COOGc II
C U--UG A A A G UUAGG_ C A U
U_ ¢ G--C
O--G G A--U
AU AA UG -- C
U U C--GGCU GU A A A G--CG
A AGU%~"G GCGUGU A G G G C
GuGC~, %'CGC A UUUGUGGG UUU CCG GUGAG
i.l.,il ill Ill lili ~GG
GGU GGCA ~'CGU U G A G uGU CGCAUC CCCU GA
U U G
AGU GCG CUGU - UC
U--U
¢--G
C U
G U
AGA
Fig. 4. The structureof the 5.8S rRNA
domain. The 5.8S rRNA domain of G.
duodenalis(A) and G. ardeae(B). Helices
A and B are indicated in the G. duodenalis
structure.
specific secondary structure element (Michot and
BacheUerie 1987). This region in eubacteria is about
30 nt and in archaebacteria about 80 nt. In eukaryotes,
however, the size varies from 213 (Crithidia)-873
(Homo) nt. In contrast to these large sizes, this equiv-
alent region in G. muris is only 34 nt and in G. duo-
denalis only 16 nt. The size of this region in G. ardeae
is almost the same as that in G. muris (35 nt). Region
C in G. duodenalis is the most truncated found so far.
Domain H
Region D is variable in size and secondary struc-
ture. This region is similar in G. duodenalis and G.
ardeae in contrast to what is observed for region C.
No phylogenetically conserved helix can be formed
for D in G. muris. Variable region E is generallcon-
sidered kingdom specific and is a typical eukaryotic
signature sequence.
9. 326
o
o
° o° ~o o
o~o ~o
o~'< <~-oo r~
~oo
IIIIIIIIIIIII II
~ _ IIII
o
I.IIIII
o
Oo ~0o Ooo~ZZ~oo
<o-o2
IIIIIIIII I|1 I! <
°ooo o
III o
O0 ~ 0
o
o
~-~/~%<
oo -%, ??o~
0~0 ~
~<~<oo~<~o~o~ <
tttttttlltttl ~ o<
o II IIIIII I.
° o%~<o=o=ooo=
0 • 0
o IIIlllll I.
o OO~<~DODot~ D
IIIII II
D~DODo0~ D
M.
o
o ~O0
o • o
%
0000~0
0
: V~o° o
o o
°o~o<oO°
o
D D 0 0 D
~Z
10. 327
Domain Ili
A number of variable regions are present in this
domain. In Giardia, these appear to be truncated,
accounting in part for the small size of the rRNA.
Domain IV
The second characteristically eukaryotic variable
region, indicated as F in Fig. 3, or D8 in the nomen-
clature of Michot and Bachellerie (1987), resides in
this domain. It is 45 nt in E. coli, 22-37 nt in ar-
chaebacteria, and ranges from 143 (Tetrahymena)
to 718 (Homo) nt in eukaryotes, with lower eukary-
otes having the smaller size. This region is only 20
nt in G. muris and is 28 nt in G. duodenalis. In G.
ardeae it is again similar to that of G. duodenalis,
namely, 26 nt.
Domain V
Region G is reduced in all three Giardia species,
with a size (22 nt) which is similar to that of
prokaryotes (30 nt in E. coli). This region is smaller
in Giardia than the same region in all other analyzed
eukaryotes, where it ranges from 74 (Tetrahymena)
to 261 (Crithidia) nt. The overall structure of this
domain and the sequence of the unpaired bases in
the central loop are similar to E. coli LSU-rRNA,
with a high degree of sequence conservation.
Domain VI
This domain shows the least sequence homology to
other LSU-rRNA. However, when folded, this do-
main has, despite being quite variable, a secondary
structure that is similar to that same domain in all
LSU-rRNA. The putative 3' end of the LSU-rRNA
can therefore be localized with a reasonable degree
of certainty. Since it is not possible to isolate
enough rRNA from G. muris cysts to identify the
exact position of the 3' end of the LSU-rRNA, se-
quence comparison to G. ardeae and G. duodenalis
was used to estimate the possible 3' end. The loss of
sequence homology between G. ardeae and G. duo-
denalis was used as an indication of the 3' end (van
Keulen et al. 1991a). This suggests that the G. duo-
denalis LSU-rRNA ends much earlier than other
investigators have reported (Boothroyd et al. 1987;
Healey et al. 1990). Based on the putative folding of
domain VI, this conclusion still holds. Additional
evidence comes from a reevaluation of the size of
the LSU-rRNA from G. duodenalis by gel electro-
phoresis of glyoxal-treated RNA which included
more markers than previously used. A size of about
1.7 kb was obtained (unpublished results). Based on
the alignments of Fig. 2, this agrees well with the
observed size of the LSU-rRNA. The major vari-
able region in domain VI is indicated with an H in
Fig. 3. This region is much larger in other eukary-
otes (64 nt [Euglena]-230 nt [Rattus] than in Giar-
dia: 19 nt in G. muris and 38--40 nt in the other two
species of Giardia.
In conclusion, the size of the LSU-rRNA of G.
muris is, together with the similar structures of G.
duodenalis and G. ardeae LSU-rRNA, prokaryotic
rather than eukaryotic. However, a number of phy-
logenetic signatures link Giardia with eukaryotes.
Two variable regions in particular, C and F, but also
G and H, are examples of a considerable shortening
of the rRNA size where these structures are among
the shortest found in eukaryotic LSU-rRNA. All
this supports the suggestion that Giardia appears to
occupy a unique position having one of the most
truncated LSU-rRNAs found in eukaryotes so far.
This is consistent with other observations to the
effect that Giardia belongs to the earliest and deep-
est branching of the eukaryotic evolutionary tree.
However, to give Giardia the status of "missing
link" between pro- and eukaryotes, as others (Kab-
nick and Peattie 1991) have suggested, cannot be
maintained, since Giardia LSU-rRNA shows many
typical eukaryotic features. The rDNA gene, as a
whole, has many distinctly eukaryotic features, be-
ing tandemly repeated, having spacer DNA with
variable sizes, and containing a separate gene for
the 5.8S rRNA. The short size of the entire op-
eron--which is the result of a short spacer, a
smaller 5.8S rRNA, and the shortening in size of
important structural elements in the rRNAs, as
shown here for the LSU-rRNA--and the finding of
open reading frames in the rDNA operon, appear to
make Giardia rRNA genes unique.
Acknowledgments. We gratefully acknowledge Bryn Weiser's
programming expertise, which resulted in the secondary struc-
ture drawings. We would also like to thank the W.M. Keck
Foundation for their generous support of RNA Science on the
Boulder campus. This work was supported by grants from the
Ohio Boards of Regents Academic and Research Challenge Pro-
grams (H.v.K., E.L.J.) and the U.S. Environmental Protection
Agency cooperative agreement CR-816637-01-0 to the University
of Minnesota (S.L.E.) and Cleveland State University (E.L.J.).
The EMBL Data Library accession number is X65063. R.R.G. is
an associate in the Evolutionary Biology Program of the Cana-
dian Institute for Advanced Research.
References
Benson SA (1984) A rapid procedure for isolation of DNA frag-
ments from agarose gels. BioTechniques 2:66-68
Birnboim HC (1983) A rapid alkaline extraction method for the
isolation of plasmid DNA. Meth Enzymol 100:243-255
11. 328
Bird RC, Wu G (1989)An alternative to the dry ice/ethanol bath
for precipitation of nucleic acids. BioTechniques 7:337-338
Boothroyd JC, Wang A, Campbell DA, Wang CC (1987) An
unusually compact ribosomal DNA repeat in the protozoan
Giardia lamblia. Nucl Acids Res 15:4065--4084
Brosius J, Dull TJ, Noller HF (1980) Complete nucleotide se-
quence of a 23S ribosomal RNA gene from Escherichia coli.
Proc Natl Acad Sci USA 77:201-204
Campbell SR, van Keulen H, Erlandsen SL, SenturiaJB, Jarroll
EL (1990) Giardia sp: comparison of electrophoretic karyo-
types. Exp Parasitol 71:470--482
Edlind TD, Chakraborty PR (1987) Unusual ribosomal RNA of
the intestinal parasite Giardia lamblia. Nucl Acids Res 15:
7889-7901
EdlindTD, Sharetzsky C, Cha ME (1990)Ribosomal RNA of the
primitive eukaryote Giardia lamblia: large subunit domain I
and potential processing signals. Gene 96:289-293
Erlandsen SL, Bemrick WJ (1987)SEM evidence for a new spe-
cies, Giardia psittaci. J Parasitol 73:623-629
Erlandsen SL, Bemrick WJ, Wells CL, Feely DE, Knudson L,
Campbell SR, van Keulen H, Jarroll EL (1990) Axenic cul-
ture and characterization of Giardia ardeae from the great
blue heron (Ardea herodias). J Parasitol 76:717-724
Filice FP (1952) Studies on the cytology and life history of a
giardia from the laboratory rat. Univ Calif Publ Zool 57:53-
146
Gerbi SA (1985)Evolutionof ribosomal DNA. In: Maclntyre RJ
(ed) Molecular evolutionarygenetics. Plenum, New York, pp
410-517
GutellRR, Schnare MN, Gray MW (1990)A compilation of large
subunit (23S-like) ribosomal RNA sequences presented in a
secondary structure format. Nucl Acids Res 18(Suppl):2119-
2330
Healey A, Mitchell R, Upcroft JA, Boreham PFL, Upcroft P
(1990) Complete nucleotide sequence of the ribosomal RNA
tandem repeat unit from Giardia intestinalis. Nucl Acids Res
18:4006
Jacob ST (1986) Transcription of eukaryotic ribosomal RNA
gene. Mol Cell Biochem 70:11-20
Kabnick KS, Peattie DA (1991)Giardia: A missing link between
prokaryotes and eukaryotes. Am Scientist 79:34-43
Maniatis T, Fritsch EF, Sambrook J (1982)Molecular cloning, a
laboratory manual. Cold Spring Harbor Laboratory, Cold
Spring Harbor.
Mankin AS, Kagramanova VK (1986) Complete nucleotide se-
quence of the single ribosomal RNA operon of Halobacte-
rium halobiurn: secondary structure of the archaebacterial
23S rRNA. Mol Gen Genet 202:152-161
Messing J (1983)New MI3 vectors for cloning. Methods Enzy-
tool 101:20-78
Michot B, Bachellerie J-P (1987) Comparisons of large subunit
rRNAs reveal some eukaryote-specific elements of second-
ary structure. Biochimie 69:11-23
Montanez C, Cervantes L, Ovando C, Ortega-Pierres G (1989)
Giardia lamblia: isolation of RNA. Exp Parasitol 68:354-356
Roberts-Thomson IC, Stevens DP, Mahmoud AAF, Warren KS
(1976) Giardiasis in the mouse: an animal model. Gastroen-
terology 71:57-61
Sauch JF (1984)Purification of Giardia muris cysts by velocity
sedimentation. Appl Environ Microbiol 48:454-455
Sogin ML, GundersonJH, Elwood HJ, Alonso RA, Peattie DA
(1989) Phylogenetic meaning of the kingdom concept: an un-
usual ribosomal RNA from Giardia lamblia. Science 243:75-
77
Sollner-Webb B, Tower J (1986)Transcriptionof cloned eukary-
otic ribosomal RNA genes. Ann Rev Biochem 55:801-830
Upcroft JA, Healey A, Mitchell R, Boreham PFL, Upcroft P
(1991) Antigen expression from the ribosomal DNA repeat
unit of Giardia intestinalis. Nucl Acids Res 18:7077-7081
van Keulen H, Loverde PT, Bobek LA, Rekosh DM (1985)Or-
ganization of the ribosomal RNA genes in Schistosoma man-
soni. Mol Biochem Parasitol 15:215-230
van Keulen H, Campbell SR, Erlandsen SL, Jarroll EL (1991a)
Cloning and restriction enzyme mapping of ribosomal DNA
of Giardia duodenalis, Giardia ardeae and Giardia muris.
Mol Biochem Parasitol 46:275-284
van Keulen H, Horvat S, Erlandsen SL, Jarroll EL (1991b)Nu-
cleotide sequence of the 5.8S and large subunit rRNA genes
and the internal transcribed spacer and part of the external
spacer from Giardia ardeae. Nucl Acids Res 19:6050
Vossbrinck CR, Woese CR (1986) Eukaryotic ribosomes that
lack 5.8S RNA. Nature 320:287-288
Wilbur WJ, Lipman DJ (1983) Rapid similarity searches of nu-
cleic acid and protein data banks. Proc Natl Acad Sci USA
80:726-730
Woese CR, Gutell R, Gupta R, Noller HF (1983)Detailed anal-
ysis of the higher order structure of 16S-likeribosomal ribo-
nucleic acids. Microbiol Rev 47:621-669
Woese CR (1987)Bacterial evolution. Microbiol Rev 51:221-271
Received August 19, 1991/RevisedApril 22, 1992