20110602labseminar pub
- 1. RNA-Seq
(yag_ays)
http://yag-ays.jp/pdf/20110602labseminar_pub.pdf
- 2. censored
usagi
usamimi
- 3. NGS
(Next Generation Sequencing)
RNA-Seq
(Transcriptome Analysis)
de novo
Transcriptome Assembly
- 7. NGS RNA-Seq
A T G C
NGS
• Illumina / Solexa GA
• ABI / SOLiD
• Roche / 454
• PacBio
• Helicos / HeliScope
• Ion Torrent etc...
mRNA
TTAGCCTTAGCTTCC
GTCGCAACTTCCTTA
TTCACGAGCTTGATG
TTGCGGATCACTTTG
- 8. NGS RNA-Seq
A T G C
NGS
• Illumina / Solexa GA
• ABI / SOLiD
• Roche / 454
• PacBio
• Helicos / HeliScope
• Ion Torrent etc...
mRNA
TTAGCCTTAGCTTCC
GTCGCAACTTCCTTA
TTCACGAGCTTGATG
TTGCGGATCACTTTG
- 10. RNA-Seq
• 'align-then-assemble' approach
• 'assemble-then-align' approach
- 11. RNA-Seq
• 454
• 'align-then-assemble' approach
• 'assemble-then-align' approach
- 15. RNA-Seq
• cDNA
• 'align-then-assemble' approach
• 'assemble-then-align' approach
- 16. Sujai Kumar and Mark L Blaxter: Comparing de novo assemblers for 454 transcriptome data (2010)
Newbler 2.5
- 17. Sujai Kumar and Mark L Blaxter: Comparing de novo assemblers for 454 transcriptome data (2010)
Newbler 2.5
...
- 18. Sujai Kumar and Mark L Blaxter: Comparing de novo assemblers for 454 transcriptome data (2010)
Newbler 2.5
...
Trinity...!!
- 19. 1. Newbler 2.5
• Roche 454
• 454
•
2. Trinity
• Broad Institute
• 454
• Nat Biotechnol. 2011 May *
* Grabherr MG, Haas BJ, Yassour M et al.: Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011 May 15
- 20. 1. Newbler 2.5
• Overlap-Layout-Consensus (OLC)
2. Trinity
I. Inchworm : k-mer graph
II. Chrysalis : Contig pool
III. Butterfly : De Bruijn Graph
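The de Bruijn graph named for Trinity's Butterfly stage can be illustrated with a minimal sketch. This is not Trinity's implementation: the function names, the choice of k = 4, and the two toy reads (cut from the example read TTAGCCTTAGCTTCC on slide 7) are assumptions made only for illustration. Each k-mer contributes one edge from its first k-1 bases to its last k-1 bases.

```python
from collections import defaultdict


def kmers(read, k):
    """Return all overlapping k-mers of a read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes it points to."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            # Edge: prefix (first k-1 bases) -> suffix (last k-1 bases).
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Toy reads (assumption): two overlapping pieces of the slide 7 example read.
reads = ["TTAGCCTTAG", "GCCTTAGCTT"]
graph = de_bruijn_graph(reads, k=4)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Overlapping reads share k-mers, so their paths merge in the graph; an assembler then has to choose which paths through the graph to report as transcripts.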
- 21. Roche 454 pyrosequencing
usamimi 0.3M reads (sff or fastq format)
→ Newbler 2.5 / Trinity (fasta format)
→ GMAP with usagi CDS (gff format)
- 23. Newbler 2.5 Trinity
                      Newbler 2.5    Trinity
Number of contigs     19,753         20,758
Total bases           9,651,390      10,275,166
Max contig length     2,878          2,151
Mean contig length    488.6          495
N50                   581            616
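For reference, the statistics in this table can be computed from a list of contig lengths roughly as follows. This is a sketch using the common definition of N50 (the length of the contig at which the running total, taken from the longest contig down, first reaches half of all bases); the function names and the toy lengths are assumptions, not the seminar's data.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def summarize(lengths):
    """Return the same five statistics that the table reports."""
    return {
        "number_of_contigs": len(lengths),
        "total_bases": sum(lengths),
        "max_contig_length": max(lengths),
        "mean_contig_length": sum(lengths) / len(lengths),
        "n50": n50(lengths),
    }


# Toy contig lengths (assumption), not taken from the slide.
toy_lengths = [100, 200, 300, 400, 500]
print(summarize(toy_lengths))
```

The mean is total bases divided by the number of contigs, which matches the table: 9,651,390 / 19,753 is about 488.6 and 10,275,166 / 20,758 is about 495.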
- 24. Newbler 2.5 N = 19,753
Trinity N = 20,758
http://edwards.sdsu.edu/prinseq_beta/
- 25. usagi CDS
usagi CDS 30,000
                   Newbler 2.5    Trinity
all                15,498         15,524
≥ 80% alignment    14,583         14,697
≥ 90% alignment    8,466          8,665
≥ 95% alignment    1,059          1,191
100% alignment     66             30
(Bar chart of the same counts for Newbler 2.5 and Trinity; y-axis 0 to 16,000)
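A threshold table like the one on this slide can be produced by counting how many aligned sequences reach each cut-off. The sketch below assumes one alignment percentage per sequence is already available (for example, parsed from the GMAP output); how that percentage is defined is not stated on the slide, and the function name and toy values are hypothetical.

```python
def count_by_threshold(alignment_pct, thresholds=(80, 90, 95, 100)):
    """Count sequences whose alignment percentage reaches each threshold."""
    counts = {"all": len(alignment_pct)}
    for t in thresholds:
        counts[f">= {t}%"] = sum(1 for pct in alignment_pct if pct >= t)
    return counts


# Toy alignment percentages (assumption), one value per aligned sequence.
toy_pct = [100.0, 97.0, 92.0, 85.0, 60.0]
print(count_by_threshold(toy_pct))
```

Because every threshold is applied to the same list, the counts are cumulative: each row is a subset of the row above it, as in the table.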
- 26. usagi
Newbler 2.5: 12,417 genes
Trinity: 10,433 genes
Newbler 2.5 only: 2,990 / shared: 9,427 / Trinity only: 1,006
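The three overlap counts on this slide are consistent with the two totals (2,990 + 9,427 = 12,417 and 9,427 + 1,006 = 10,433). A sketch of how such counts could be derived from two sets of gene identifiers is shown here; the function name and the gene IDs are hypothetical.

```python
def venn_counts(genes_a, genes_b):
    """Return sizes of the three regions of a two-set Venn diagram."""
    a, b = set(genes_a), set(genes_b)
    return {
        "only_a": len(a - b),   # hit only by the first assembler
        "shared": len(a & b),   # hit by both assemblers
        "only_b": len(b - a),   # hit only by the second assembler
    }


# Toy gene IDs (assumption), standing in for genes hit by each assembly.
newbler_genes = ["g1", "g2", "g3", "g4"]
trinity_genes = ["g3", "g4", "g5"]
print(venn_counts(newbler_genes, trinity_genes))

# Consistency check of the counts reported on the slide.
assert 2990 + 9427 == 12417
assert 9427 + 1006 == 10433
```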
- 27. ...
S. Kumar et al. (2010)
Poly(A/T)
- 28. Poly(A/T) Trinity > Newbler 2.5
           Newbler 2.5            Trinity
Poly T     257 (1.30%), 20 bp     3,773 (18.18%), 20 bp
Poly A     539 (2.73%), 20 bp     2,349 (11.32%), 20 bp
http://edwards.sdsu.edu/prinseq_beta/
- 29. Poly(A/T) Trinity > Newbler 2.5
           Newbler 2.5            Trinity
Poly T     257 (1.30%), 20 bp     3,773 (18.18%), 20 bp
Poly A     539 (2.73%), 20 bp     2,349 (11.32%), 20 bp
Poly(A/T) Quality Value
→ Newbler Quality trimming ...?
http://edwards.sdsu.edu/prinseq_beta/
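Counts like these can be approximated by checking each contig for a run of A at its 3' end or a run of T at its 5' end. The sketch below is not PRINSEQ's algorithm; it assumes that "20 bp" means a minimum tail length of 20 bases and that only uninterrupted runs count. The function names and toy sequences are hypothetical.

```python
def tail_length(seq, base, side):
    """Length of the uninterrupted run of `base` at the left (5') or right (3') end."""
    seq = seq.upper()
    if side == "right":
        return len(seq) - len(seq.rstrip(base))
    return len(seq) - len(seq.lstrip(base))


def count_poly_tails(seqs, min_len=20):
    """Count sequences with a poly-A tail (3' end) or poly-T tail (5' end)."""
    poly_a = sum(1 for s in seqs if tail_length(s, "A", "right") >= min_len)
    poly_t = sum(1 for s in seqs if tail_length(s, "T", "left") >= min_len)
    total = len(seqs)
    pct = (lambda n: round(100 * n / total, 2)) if total else (lambda n: 0.0)
    return {"poly_a": poly_a, "poly_a_pct": pct(poly_a),
            "poly_t": poly_t, "poly_t_pct": pct(poly_t)}


# Toy contigs (assumption): one poly-A tail, one poly-T tail, two without.
toy_contigs = [
    "ACGT" + "A" * 20,
    "T" * 25 + "GGCC",
    "ACGTACGT",
    "T" * 5 + "ACG" + "A" * 19,
]
print(count_poly_tails(toy_contigs))
```

The percentages on the slide follow the same arithmetic, with the number of contigs per assembler as the denominator (for example, 257 / 19,753 is about 1.30%).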
- 34. Method : Parameters
• Newbler 2.5
  • -notrim
  • -urt
• Trinity (20110519 ver.)
  • --seqType=fq
  • --single
  • --min_contig_length 50
  • --run_butterfly
  • --CPU 4
  • --bfly_opts "--compatible_path_extension --stderr "