Ⓒ 2014 Invitae
Reece Hart, Ph.D.
reece@invitae.com
Human Variome Project Meeting 2014, Paris
The Clinical Significance of Transcript Alignment Discrepancies
… and tools to help you deal with them.
2 / 24
The fidelity of transcript-genome mapping matters.
Variants are identified and computed on in genome coordinates.
Variants are analyzed and communicated using transcript coordinates.
genome to transcript (g. to c.)
transcript to genome (c. to g.)
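The two mapping directions on this slide can be sketched with a toy exon model. This is a minimal illustration only: the exon coordinates are hypothetical, the transcript is assumed to lie on the plus strand with ungapped exon alignments, and the result is a 0-based transcript offset rather than a true c. coordinate (which would also be offset from the CDS start). The function names `g_to_tx` and `tx_to_g` are made up for this sketch.

```python
from typing import List, Optional, Tuple

Exons = List[Tuple[int, int]]

# Hypothetical exon model: 0-based, half-open genomic intervals, plus strand.
EXONS: Exons = [(1000, 1100), (1500, 1650), (2000, 2050)]


def g_to_tx(g_pos: int, exons: Exons) -> Optional[int]:
    """Map a genomic position to a 0-based transcript offset.

    Returns None when the position falls outside every exon.
    """
    offset = 0  # transcript bases contributed by the exons already passed
    for start, end in exons:
        if start <= g_pos < end:
            return offset + (g_pos - start)
        offset += end - start
    return None


def tx_to_g(tx_pos: int, exons: Exons) -> int:
    """Map a 0-based transcript offset back to a genomic position."""
    if tx_pos < 0:
        raise ValueError("transcript offset must be non-negative")
    offset = 0
    for start, end in exons:
        length = end - start
        if tx_pos < offset + length:
            return start + (tx_pos - offset)
        offset += length
    raise ValueError("transcript offset lies beyond the last exon")


if __name__ == "__main__":
    print(g_to_tx(1500, EXONS))  # first base of the second exon
    print(g_to_tx(1200, EXONS))  # position between exons
    print(tx_to_g(260, EXONS))
```

Both functions depend entirely on the exon coordinates they are given, which is why a wrong or disputed alignment changes the answer in either direction.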
3 / 24
Motivation 1: Discordant exon coordinates
NCBI and UCSC report different coordinates for NM_052813.3, exon 12
[Figure: UCSC (BLAT) and NCBI (Splign) alignments; exon 12 displaced 322 nt]
Consequences:
1. An assay that targets the wrong genomic region will generate uninformative sequence data.
2. A genomic variant will be interpreted as exonic when it is intronic, or vice versa.
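The second consequence can be illustrated with a small check. This is a sketch under stated assumptions: the coordinates below are invented (they are not the real NM_052813.3 coordinates), the labels "source_a" and "source_b" stand in for two alignment sources, and "intronic" is used loosely for any position outside the listed exons.

```python
from typing import Dict, List, Tuple

Exons = List[Tuple[int, int]]


def classify(g_pos: int, exon_sets: Dict[str, Exons]) -> Dict[str, str]:
    """Label a genomic position as exonic or intronic under each exon set."""
    return {
        source: "exonic"
        if any(start <= g_pos < end for start, end in exons)
        else "intronic"
        for source, exons in exon_sets.items()
    }


# Hypothetical placements of the same exon by two sources, 322 nt apart.
EXON_SETS: Dict[str, Exons] = {
    "source_a": [(50_000, 50_120)],
    "source_b": [(50_322, 50_442)],
}

if __name__ == "__main__":
    # The same variant position gets two different interpretations.
    print(classify(50_060, EXON_SETS))
```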
4 / 24
Motivation 2: indels confound mapping
NM_006158.3 (NEFL) contains indel in CDS
5 / 24
Challenges and Solutions in Transcript Management
➢ Biological
● Alternative splicing
● Paralogs
● Natural polymorphisms
● Alternative references
➢ Technical / Logistical
● Multiple transcript sources
● Multiple alignment methods
● Multiple references
● Genome-transcript sequence differences
● Historical transcript alignments
➢ Existing resources
● RefSeq, UCSC, Ensembl
● Locus Reference Genomic
● Mutalyzer
➢ See also
● McCarthy DJ, et al. Genome Medicine 6:26 (2014).
● Garla V, et al. Bioinformatics 27(3): 416–8 (2010).
6 / 24
Universal Transcript Archive (UTA)
➢ Single database of:
● Multiple transcripts and versions
● … from multiple sources
● … aligned to multiple references
● … by multiple alignment methods
➢ Freely available!
● Apache licensed
● Public PostgreSQL database instance at uta.invitae.com:5432
● Local installation instructions
● Code at http://bitbucket.org/invitae/uta/
7 / 24
Our Bermuda Triangle
[Figure: triangle linking RefSeq (NM), Ensembl (ENST), and Genome (GRCh37)]
RefAgree: Do transcript and genome sequences agree?
Transcript Equivalence: Which RefSeq and Ensembl transcripts are equivalent?
➊ SNV
➋ Indel
➌
➍ Historical Transcripts
8 / 24
Universal Transcript Archive (UTA)
Multiple sources, multiple versions, multiple alignment methods in one database
transcript    reference      method
NM_01234.4    NM_01234.4     self
NM_01234.4    NC_000012.3    splign
NM_01234.5    NM_01234.5     self
NM_01234.5    NC_000012.3    splign
NM_01234.5    AC_45678.9     splign
NM_01234.5    NC_000012.3    blat
ENST012345    ENST012345     self
ENST012345    NC_000012.3    genebuild
[Figure labels: exons, exon set]
9 / 24
Universal Transcript Archive (UTA)
Multiple sources, multiple versions, multiple alignment methods in one database
exon alignments ➊➋
NM_01234.4    NC_000012.3    0    50=
NM_01234.4    NC_000012.3    1    100=1X49=
NM_01234.4    NC_000012.3    2    5=1I44=
Alignments use coordinates from source databases.
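The exon alignment strings above follow the extended CIGAR convention, in which `=` is a run of matching bases, `X` a mismatch, and `I`/`D` an insertion or deletion. A minimal sketch of how such strings could be summarized follows; the function names are illustrative and are not part of UTA.

```python
import re
from typing import Dict

_CIGAR_OP = re.compile(r"(\d+)([=XID])")


def summarize_cigar(cigar: str) -> Dict[str, int]:
    """Total the bases covered by each extended-CIGAR operation."""
    if not re.fullmatch(r"(?:\d+[=XID])+", cigar):
        raise ValueError(f"unsupported CIGAR string: {cigar!r}")
    totals = {"=": 0, "X": 0, "I": 0, "D": 0}
    for length, op in _CIGAR_OP.findall(cigar):
        totals[op] += int(length)
    return totals


def agrees_with_reference(cigar: str) -> bool:
    """True when the alignment has no mismatches, insertions, or deletions."""
    totals = summarize_cigar(cigar)
    return totals["X"] == 0 and totals["I"] == 0 and totals["D"] == 0


if __name__ == "__main__":
    # The three example alignments from the slide.
    for exon, cigar in enumerate(["50=", "100=1X49=", "5=1I44="]):
        print(exon, summarize_cigar(cigar), agrees_with_reference(cigar))
```

In this reading, exon 1 carries a single substitution (issue ➊) and exon 2 a one-base insertion (issue ➋), while exon 0 agrees with the reference over its whole length.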
10 / 24
Universal Transcript Archive (UTA)
Multiple sources, multiple versions, multiple alignment methods in one database
➌
11 / 24
Universal Transcript Archive (UTA)
Multiple sources, multiple versions, multiple alignment methods in one database
➍
12 / 24
“RefAgree” Statistics by Protein Coding Transcript
Sequence concordance between RefSeq and GRCh37 primary assembly
c.f. Garla V, et al. Bioinformatics 27(3): 416–8 (2010).
➊➋
34531    NM transcripts (Jan 2014)
760      0.2%    with length discrepancies
3481     10%     with substitutions
321      0.9%    with deletions
255      0.7%    with insertions
13 / 24
NCBI (Splign) v. UCSC (BLAT) Alignment Statistics
Splign and BLAT provide significantly different exon structures for 886 transcripts
➌
32358 transcripts w/exon structures
Are Splign and BLAT similar?
Y: 31472 (97.3%) transcripts
N: 886 (2.7%) transcripts
“similar” means either
1) identical exon coordinates, or
2) coordinates that differ only by short 3' terminal artifacts
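The definition of "similar" can be written as a small predicate. This is only a sketch: the slide does not say how short a 3' terminal artifact must be, so the `max_3p_diff` threshold of 10 nt is an assumption, as is the simplification that exons are listed 5' to 3' on the plus strand so that the 3' end is the end of the last exon.

```python
from typing import List, Tuple

Exons = List[Tuple[int, int]]


def similar(a: Exons, b: Exons, max_3p_diff: int = 10) -> bool:
    """Compare two exon structures for the same transcript.

    True when the structures are identical, or when they differ only in
    the end coordinate of the last exon by at most max_3p_diff bases.
    """
    if a == b:
        return True
    if not a or len(a) != len(b):
        return False
    if a[:-1] != b[:-1]:
        return False  # an internal or 5' exon differs
    (a_start, a_end), (b_start, b_end) = a[-1], b[-1]
    return a_start == b_start and abs(a_end - b_end) <= max_3p_diff


if __name__ == "__main__":
    base = [(100, 200), (300, 400), (500, 650)]
    trimmed = [(100, 200), (300, 400), (500, 646)]   # short 3' difference
    shifted = [(100, 200), (622, 722), (900, 1050)]  # displaced exon
    print(similar(base, trimmed), similar(base, shifted))
```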
14 / 24
Characterization of transcript discrepancies
Whether alignments provided by NCBI and UCSC agree with GRCh37 primary sequence.
886 transcripts with significant discrepancies
Splign / BLAT
     T      F
T    14     18
F    545    311
15 / 24
Characterization of transcript discrepancies
Reference agreement (blue) and alignment “simplicity” (green)
886 transcripts with significant discrepancies
Splign / BLAT
     T      F
T    14     18
F    545    311
Splign / BLAT
     T           F
T    200 (0)     4 (97)
F    90 (82)     16 (84)
Splign / BLAT
     T           F
T    6 (41)      12 (180)
F
Splign / BLAT
     T    F
T    434 (7)
F    110 (652)
Splign / BLAT
     T    F
T    14 (11)
F
16 / 24
Summary of Splign-BLAT gene-wise coordinate deltas.
delta      # genes    # ACMG must report
=0         15206      44
>=1        183        8
>=10       116        0
>=25       6          0
>=50       5          0
>=250      13         0
>=1000     94         2
ND                    3
delta ≝ minimum per gene of maximum per transcript of difference of exon coordinates between NCBI and UCSC.
MYH7, TNNI3
(all trivial diffs)
LDLR, MYL2, PRKAG2, SDHB, SDHC, TGFBR1, TGFBR2, WT1
APOB, MYBPC3, NTRK
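The delta definition on this slide (minimum per gene of the maximum per transcript of exon coordinate differences) can be sketched as follows. Assumptions in this sketch: exons from the two sources are paired in order, the per-exon difference is the larger of the start and end differences, and a transcript whose exon counts differ between sources yields no delta. The slide does not define "ND", so treating such cases as undefined is a guess, and all accessions and coordinates below are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Exons = List[Tuple[int, int]]


def transcript_delta(ncbi: Exons, ucsc: Exons) -> Optional[int]:
    """Largest exon coordinate difference between two alignments of one transcript."""
    if not ncbi or len(ncbi) != len(ucsc):
        return None  # exons cannot be paired one-to-one in this sketch
    return max(
        max(abs(n_start - u_start), abs(n_end - u_end))
        for (n_start, n_end), (u_start, u_end) in zip(ncbi, ucsc)
    )


def gene_delta(transcripts: Dict[str, Tuple[Exons, Exons]]) -> Optional[int]:
    """Smallest transcript delta among a gene's transcripts, or None if undefined."""
    deltas = [
        d
        for d in (transcript_delta(n, u) for n, u in transcripts.values())
        if d is not None
    ]
    return min(deltas) if deltas else None


if __name__ == "__main__":
    gene = {
        # accession: (NCBI exons, UCSC exons); values are made up
        "NM_01234.4": ([(100, 200), (300, 400)], [(100, 200), (622, 722)]),
        "NM_01234.5": ([(100, 200), (300, 400)], [(100, 200), (300, 400)]),
    }
    print(gene_delta(gene))
```

Taking the minimum per gene means a gene scores 0 as soon as one of its transcripts is placed identically by both sources.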
17 / 24
HGVS Python Package
http://bitbucket.org/invitae/hgvs/
➢ Parser
● HGVS → Python object
● Based on a Parsing Expression Grammar
➢ Formatter
● Python object → HGVS
➢ Validator
● intrinsic & extrinsic validation
➢ Mapping tools (indel-aware!)
● g. ↔ c. → p. (m, n, r also supported)
● transcript-to-transcript liftover
● uses UTA data
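The parser and formatter roles (HGVS string to Python object and back) can be illustrated with a deliberately tiny stand-in. This is not the hgvs package's API or grammar: the package uses a Parsing Expression Grammar covering the full nomenclature, whereas the hypothetical `SimpleSubstitution` class and regular expression below handle only single-base substitutions with a positive position and an optional intronic offset.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Illustrative pattern only: accession, coordinate type, position,
# optional intronic offset, and a single-base substitution.
_SUB_RE = re.compile(
    r"^(?P<ac>[A-Z]{2}_\d+\.\d+):(?P<kind>[cgmn])\."
    r"(?P<pos>\d+)(?P<offset>[+-]\d+)?(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


@dataclass(frozen=True)
class SimpleSubstitution:
    ac: str
    kind: str
    pos: int
    offset: Optional[int]
    ref: str
    alt: str

    def __str__(self) -> str:
        """Format the object back into an HGVS-style string."""
        offset = "" if self.offset is None else f"{self.offset:+d}"
        return f"{self.ac}:{self.kind}.{self.pos}{offset}{self.ref}>{self.alt}"


def parse_simple_substitution(text: str) -> SimpleSubstitution:
    """Parse a simple substitution; raise ValueError for anything else."""
    m = _SUB_RE.match(text)
    if m is None:
        raise ValueError(f"not a simple substitution: {text!r}")
    return SimpleSubstitution(
        ac=m["ac"],
        kind=m["kind"],
        pos=int(m["pos"]),
        offset=int(m["offset"]) if m["offset"] else None,
        ref=m["ref"],
        alt=m["alt"],
    )


if __name__ == "__main__":
    var = parse_simple_substitution("NM_182763.2:c.688+403C>T")
    print(var.pos, var.offset, str(var))
```

Round-tripping a string through the object and back is a cheap consistency check on both directions.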
18 / 24
Example: Variant liftover between transcripts
Map
from ➀ NM_182763.2:c.688+403C>T
to ➁ NC_000001.10:g.150550916G>A
to ➂ NM_001197320.1:c.281C>T
with Splign alignments
[Figure: ➀ NM_182763.2 (NP_877495.1) and ➂ NM_001197320.1 (NP_001184249.1) aligned to ➁ NC_000001.10]
19 / 24
Developer Info
Testing
➢ 91% code coverage
➢ 25665 test variants
● ~200 hand curated, rest from dbSNP
● 23436 sub, 1254 del, 908 ins, 45 delins, 22 dup
● 44 distinct transcripts, many selected for difficulty
Upcoming issues
(all issues are publicly readable)
➢ multi-variant alleles
➢ release LRG
➢ GRCh38
➢ API changes
20 / 24
Acknowledgements
➢ Vince Fusaro
➢ John Garcia
➢ Emily Hare
➢ Kevin Jacobs
➢ Geoff Nilsen
➢ Rudy Rico
➢ Jody Westbrook
http://bitbucket.org/invitae/
➢ Code (Python)
➢ Documentation & Examples
➢ Issues
➢ BED files
➢ Code testing is public
Or just:
pip install hgvs
21 / 24
22 / 24
UTA solves four issues with transcript management.
Transcript ≠ Genome Reference
➊ SNV
➋ InDel
➌ Exon coordinate differences between sources for same accession
➍ Historical transcript alignments no longer available
[Figure labels: RefSeq NM_01234.4; RefSeq NM_01234.5; UCSC NM_01234.5; T, A]
24 / 24
ENSTs equivalent with NMs
=> select N.hgnc,N.es_fingerprint,N.tx_ac,E.tx_ac
from uta_20140210.tx_exon_set_summary_mv N
join uta_20140210.tx_exon_set_summary_mv E
  on N.es_fingerprint=E.es_fingerprint
  and N.tx_ac ~ '^NM_' and E.tx_ac ~ '^ENST'
  and N.alt_aln_method='transcript'
  and E.alt_aln_method='transcript';
┌─────────┬──────────────────────────────────┬────────────────┬─────────────────┐
│ hgnc    │ es_fingerprint                   │ tx_ac          │ tx_ac           │
├─────────┼──────────────────────────────────┼────────────────┼─────────────────┤
│ AFF2    │ db0e20be1a2bb687c33227d2e6bf9d53 │ NM_002025.3    │ ENST00000370460 │
│ UBE3A   │ d1eace7da295c45378fa5f898f2f03f6 │ NM_130838.1    │ ENST00000438097 │
│ ANXA8L1 │ 1f6fd4f3fe9854aa468489ec7f507512 │ NM_001098845.1 │ ENST00000359178 │
│ APOL5   │ 939a9e9e4a46ef9aef862cf9b369afe6 │ NM_030642.1    │ ENST00000249044 │
│ ARID4B  │ 524fc954d10b08a4014e86aee81d0358 │ NM_016374.5    │ ENST00000264183 │
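The query above pairs NM and ENST accessions whose exon sets share a fingerprint. The idea can be sketched in Python under loud assumptions: the slide does not say how `es_fingerprint` is computed, so hashing a canonical string of exon coordinates with MD5 is only a guess at one plausible scheme, and the accessions and coordinates below are placeholders.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

Exons = List[Tuple[int, int]]


def exon_set_fingerprint(exons: Exons) -> str:
    """Hash an exon set so that identical exon structures compare equal.

    Assumption: MD5 over 'start,end;start,end;...' of the sorted exons.
    """
    canonical = ";".join(f"{start},{end}" for start, end in sorted(exons))
    return hashlib.md5(canonical.encode("ascii")).hexdigest()


def equivalent_pairs(exon_sets: Dict[str, Exons]) -> List[Tuple[str, str]]:
    """Return (NM, ENST) accession pairs whose exon sets share a fingerprint."""
    by_fingerprint: Dict[str, List[str]] = defaultdict(list)
    for tx_ac, exons in exon_sets.items():
        by_fingerprint[exon_set_fingerprint(exons)].append(tx_ac)
    pairs = []
    for group in by_fingerprint.values():
        nms = sorted(ac for ac in group if ac.startswith("NM_"))
        ensts = sorted(ac for ac in group if ac.startswith("ENST"))
        pairs.extend((nm, enst) for nm in nms for enst in ensts)
    return sorted(pairs)


if __name__ == "__main__":
    # Placeholder exon sets in transcript coordinates.
    exon_sets = {
        "NM_01234.5": [(0, 50), (50, 200), (200, 250)],
        "ENST012345": [(0, 50), (50, 200), (200, 250)],
        "NM_01234.4": [(0, 50), (50, 190), (190, 240)],
    }
    print(equivalent_pairs(exon_sets))
```

Grouping by a fingerprint turns a pairwise comparison of exon structures into a single equality join, which is what the SQL expresses with `N.es_fingerprint=E.es_fingerprint`.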

 
HVP Country Node: Malaysia - Zilfalil bin Alwi
HVP Country Node: Malaysia - Zilfalil bin AlwiHVP Country Node: Malaysia - Zilfalil bin Alwi
HVP Country Node: Malaysia - Zilfalil bin Alwi
 
GENETIC HETEROGENEITY OF MITOCHONDRIAL DISORDERS - Agnès Rötig
GENETIC HETEROGENEITY OF MITOCHONDRIAL DISORDERS - Agnès RötigGENETIC HETEROGENEITY OF MITOCHONDRIAL DISORDERS - Agnès Rötig
GENETIC HETEROGENEITY OF MITOCHONDRIAL DISORDERS - Agnès Rötig
 
The BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
The BRCA Challenge & Exchange: Progress and Plans - Gunnar RätschThe BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
The BRCA Challenge & Exchange: Progress and Plans - Gunnar Rätsch
 
Richard GH Cotton: He may have been a bit before his time - Michael Watson
Richard GH Cotton: He may have been a bit before his time - Michael WatsonRichard GH Cotton: He may have been a bit before his time - Michael Watson
Richard GH Cotton: He may have been a bit before his time - Michael Watson
 
Professor Richard Cotton - Finlay Macrae
Professor Richard Cotton - Finlay MacraeProfessor Richard Cotton - Finlay Macrae
Professor Richard Cotton - Finlay Macrae
 
HVP Country Node: Canada - Matthew Lebo
HVP Country Node: Canada - Matthew LeboHVP Country Node: Canada - Matthew Lebo
HVP Country Node: Canada - Matthew Lebo
 
Use of open, curated variant databases: ethics? Liability? - Bartha Knoppers
Use of open, curated variant databases: ethics? Liability? - Bartha KnoppersUse of open, curated variant databases: ethics? Liability? - Bartha Knoppers
Use of open, curated variant databases: ethics? Liability? - Bartha Knoppers
 
HVP6: Final Thoughts - John Burn & Raj Ramesar
HVP6: Final Thoughts - John Burn & Raj RamesarHVP6: Final Thoughts - John Burn & Raj Ramesar
HVP6: Final Thoughts - John Burn & Raj Ramesar
 
Report from the International Scientific Advisory Committee - John Burn
Report from the International Scientific Advisory Committee - John BurnReport from the International Scientific Advisory Committee - John Burn
Report from the International Scientific Advisory Committee - John Burn
 
HVP Country Node: Italy - Domenico Coviello
HVP Country Node: Italy - Domenico CovielloHVP Country Node: Italy - Domenico Coviello
HVP Country Node: Italy - Domenico Coviello
 
Rare and common variants contribute to the complex inheritance of Hirschsprun...
Rare and common variants contribute to the complex inheritance of Hirschsprun...Rare and common variants contribute to the complex inheritance of Hirschsprun...
Rare and common variants contribute to the complex inheritance of Hirschsprun...
 

Recently uploaded

Continuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsContinuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsSérgio Sacani
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxKyawThanTint
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...kevin8smith
 
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)kushbuR
 
Heads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfHeads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfbyp19971001
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...yogeshlabana357357
 
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...Sérgio Sacani
 
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...TALAPATI ARUNA CHENNA VYDYANAD
 
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptxPlasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptxmuralinath2
 
Heat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysHeat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysBrahmesh Reddy B R
 
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...mikehavy0
 
Fun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfFun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfhoangquan21999
 
Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxGOWTHAMIM22
 
Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloChristian Robert
 
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfMODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfRevenJadePalma
 
RACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxRACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxArunLakshmiMeenakshi
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfStart Project
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationAreesha Ahmad
 
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptxPOST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptxArpitaMishra69
 
Tuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notesTuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notesjyothisaisri
 

Recently uploaded (20)

Continuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsContinuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discs
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
 
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
 
Heads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfHeads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdf
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
 
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
Emergent ribozyme behaviors in oxychlorine brines indicate a unique niche for...
 
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
 
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptxPlasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
 
Heat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree daysHeat Units in plant physiology and the importance of Growing Degree days
Heat Units in plant physiology and the importance of Growing Degree days
 
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
 
Fun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdfFun for mover student's book- English book for teaching.pdf
Fun for mover student's book- English book for teaching.pdf
 
Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptx
 
Adaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte CarloAdaptive Restore algorithm & importance Monte Carlo
Adaptive Restore algorithm & importance Monte Carlo
 
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfMODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
 
RACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxRACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptx
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolation
 
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptxPOST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
POST TRANSCRIPTIONAL GENE SILENCING-AN INTRODUCTION.pptx
 
Tuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notesTuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notes
 

The Clinical Significance of Transcript Alignment Discrepancies … and tools to help you deal with them - Reece Hart

  • 1. Ⓒ 2014 Invitae. Reece Hart, Ph.D., reece@invitae.com. Human Variome Project Meeting 2014, Paris. The Clinical Significance of Transcript Alignment Discrepancies … and tools to help you deal with them.
  • 2. The fidelity of transcript-genome mapping matters. Variants are identified and computed on in genome coordinates. Variants are analyzed and communicated using transcript coordinates. Mappings: genome to transcript (g. to c.) and transcript to genome (c. to g.).
  • 3. Motivation 1: Discordant exon coordinates. NCBI and UCSC report different coordinates for NM_052813.3, exon 12. [Figure: UCSC (BLAT) and NCBI (Splign) alignments; exon 12 displaced 322 nt.] Consequences: 1. An assay that targets the wrong genomic region will generate uninformative sequence data. 2. A genomic variant will be interpreted as exonic when it is intronic, or vice versa.
  • 4. Motivation 2: Indels confound mapping. NM_006158.3 (NEFL) contains an indel in the CDS.
  • 5. Challenges and Solutions in Transcript Management ➢ Biological ● Alternative splicing ● Paralogs ● Natural polymorphisms ● Alternative references ➢ Technical / Logistical ● Multiple transcript sources ● Multiple alignment methods ● Multiple references ● Genome-transcript sequence differences ● Historical transcript alignments ➢ Existing resources ● RefSeq, UCSC, Ensembl ● Locus Reference Genomic ● Mutalyzer ➢ See also ● McCarthy DJ, et al. Genome Medicine 6:26 (2014). ● Garla V, et al. Bioinformatics 27(3): 416–8 (2010).
  • 6. Universal Transcript Archive (UTA) ➢ Single database of: ● Multiple transcripts and versions ● … from multiple sources ● … aligned to multiple references ● … by multiple alignment methods ➢ Freely available! ● Apache licensed ● Public PostgreSQL database instance at uta.invitae.com:5432 ● Local installation instructions ● Code at http://bitbucket.org/invitae/uta/
  • 7. Our Bermuda Triangle. [Figure: triangle linking RefSeq (NM), Ensembl (ENST), and Genome (GRCh37), with markers ➊ SNV, ➋ Indel, ➌, and ➍ Historical Transcripts.] RefAgree: Do transcript and genome sequences agree? Transcript Equivalence: Which RefSeq and Ensembl transcripts are equivalent?
  • 8. Universal Transcript Archive (UTA): Multiple sources, multiple versions, multiple alignment methods in one database. Exon set table:
        transcript    reference     method
        NM_01234.4    NM_01234.4    self
        NM_01234.4    NC_000012.3   splign
        NM_01234.5    NM_01234.5    self
        NM_01234.5    NC_000012.3   splign
        NM_01234.5    AC_45678.9    splign
        NM_01234.5    NC_000012.3   blat
        ENST012345    ENST012345    self
        ENST012345    NC_000012.3   genebuild
  • 9. Universal Transcript Archive (UTA): Multiple sources, multiple versions, multiple alignment methods in one database. Exon set table as on slide 8, extended with exon alignments (➊ ➋):
        transcript    reference     exon   alignment
        NM_01234.4    NC_000012.3   0      50=
        NM_01234.4    NC_000012.3   1      100=1X49=
        NM_01234.4    NC_000012.3   2      5=1I44=
      Alignments use coordinates from source databases.
  • 10. Universal Transcript Archive (UTA): Multiple sources, multiple versions, multiple alignment methods in one database. Exon set table as on slide 8, annotated with ➌.
  • 11. Universal Transcript Archive (UTA): Multiple sources, multiple versions, multiple alignment methods in one database. Exon set table as on slide 8, annotated with ➍.
  • 12. “RefAgree” Statistics by Protein Coding Transcript. Sequence concordance between RefSeq and GRCh37 primary assembly (➊ ➋). c.f. Garla V, et al. Bioinformatics 27(3): 416–8 (2010). Of 34531 NM transcripts (Jan 2014):
        760    0.2%   with length discrepancies
        3481   10%    with substitutions
        321    0.9%   with deletions
        255    0.7%   with insertions
  • 13. NCBI (Splign) v. UCSC (BLAT) Alignment Statistics (➌). Splign and BLAT provide significantly different exon structures for 886 transcripts. Of 32358 transcripts w/exon structures, are Splign and BLAT similar? Y: 31472 (97.3%) transcripts. N: 886 (2.7%) transcripts. “Similar” means either 1) identical exon coordinates, or 2) coordinates that differ only by short 3' terminal artifacts.
  • 14. Characterization of transcript discrepancies: whether alignments provided by NCBI and UCSC agree with GRCh37 primary sequence, for the 886 transcripts with significant discrepancies. 2×2 table (Splign, BLAT; columns T, F): row T: 14, 18; row F: 545, 311.
  • 15. Characterization of transcript discrepancies: reference agreement (blue) and alignment “simplicity” (green), for the 886 transcripts with significant discrepancies. 2×2 tables (Splign, BLAT; columns T, F), as extracted: [row T: 14, 18; row F: 545, 311]; [row T: 200 (0), 4 (97); row F: 90 (82), 16 (84)]; [row T: 6 (41), 12 (180)]; [row T: 434 (7); row F: 110 (652)]; [row T: 14 (11)].
  • 16. Summary of Splign-BLAT gene-wise coordinate deltas.
        delta     # genes   # ACMG must report
        =0        15206     44
        >=1       183       8
        >=10      116       0
        >=25      6         0
        >=50      5         0
        >=250     13        0
        >=1000    94        2
        ND        3
      delta ≝ minimum per gene of maximum per transcript of difference of exon coordinates between NCBI and UCSC. Gene callouts: MYH7, TNNI3 (all trivial diffs); LDLR, MYL2, PRKAG2, SDHB, SDHC, TGFBR1, TGFBR2, WT1; APOV, MYHBPC3, NTRK.
  • 17. HGVS Python Package http://bitbucket.org/invitae/hgvs/ ➢ Parser ● HGVS → Python object ● Based on a Parsing Expression Grammar ➢ Formatter ● Python object → HGVS ➢ Validator ● intrinsic & extrinsic validation ➢ Mapping tools (indel-aware!) ● g. ↔ c. → p. (m, n, r also supported) ● transcript-to-transcript liftover ● uses UTA data
  • 18. Example: Variant liftover between transcripts. Map from ➀ NM_182763.2:c.688+403C>T to ➁ NC_000001.10:g.150550916G>A to ➂ NM_001197320.1:c.281C>T with Splign alignments. [Figure: NM_182763.2 (NP_877495.1) and NM_001197320.1 (NP_001184249.1) aligned to NC_000001.10.]
  • 19. Developer Info. Testing ➢ 91% code coverage ➢ 25665 test variants ● ~200 hand curated, rest from dbSNP ● 23436 sub, 1254 del, 908 ins, 45 delins, 22 dup ● 44 distinct transcripts, many selected for difficulty. Upcoming issues (all issues are publicly readable) ➢ multi-variant alleles ➢ release LRG ➢ GRCh38 ➢ API changes
  • 20. Acknowledgements ➢ Vince Fusaro ➢ John Garcia ➢ Emily Hare ➢ Kevin Jacobs ➢ Geoff Nilsen ➢ Rudy Rico ➢ Jody Westbrook. http://bitbucket.com/invitae/ ➢ Code (Python) ➢ Documentation & Examples ➢ Issues ➢ BED files ➢ Code testing is public. Or just: pip install hgvs
  • 22. UTA solves four issues with transcript management. ➊ SNV ➋ InDel (transcript ≠ genome reference) ➌ Exon coordinate differences between sources for same accession ➍ Historical transcript alignments no longer available. [Figure labels: RefSeq NM_01234.4, RefSeq NM_01234.5, UCSC NM_01234.5.]
  • 24. ENSTs equivalent with NMs
      => select N.hgnc, N.es_fingerprint, N.tx_ac, E.tx_ac
         from uta_20140210.tx_exon_set_summary_mv N
         join uta_20140210.tx_exon_set_summary_mv E
           on N.es_fingerprint=E.es_fingerprint
           and N.tx_ac ~ '^NM_' and E.tx_ac ~ '^ENST'
           and N.alt_aln_method='transcript'
           and E.alt_aln_method='transcript';
      ┌─────────┬──────────────────────────────────┬────────────────┬─────────────────┐
      │  hgnc   │          es_fingerprint          │     tx_ac      │      tx_ac      │
      ├─────────┼──────────────────────────────────┼────────────────┼─────────────────┤
      │ AFF2    │ db0e20be1a2bb687c33227d2e6bf9d53 │ NM_002025.3    │ ENST00000370460 │
      │ UBE3A   │ d1eace7da295c45378fa5f898f2f03f6 │ NM_130838.1    │ ENST00000438097 │
      │ ANXA8L1 │ 1f6fd4f3fe9854aa468489ec7f507512 │ NM_001098845.1 │ ENST00000359178 │
      │ APOL5   │ 939a9e9e4a46ef9aef862cf9b369afe6 │ NM_030642.1    │ ENST00000249044 │
      │ ARID4B  │ 524fc954d10b08a4014e86aee81d0358 │ NM_016374.5    │ ENST00000264183 │
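
The query on slide 24 pairs an NM transcript with an ENST transcript whenever their exon sets share a fingerprint. The same self-join can be tried without access to UTA. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite table named tx_exon_set_summary as a hypothetical stand-in for uta_20140210.tx_exon_set_summary_mv, keeps only the four columns the query touches, and loads two pairs taken from the result above plus one made-up row that has no partner. SQLite has no `~` regex operator, so `LIKE` prefix patterns replace the regular expressions.

```python
import sqlite3

# Toy rows: (hgnc, es_fingerprint, tx_ac, alt_aln_method).
# The first four rows come from the slide's result table; the schema is a guess.
ROWS = [
    ("AFF2", "db0e20be1a2bb687c33227d2e6bf9d53", "NM_002025.3", "transcript"),
    ("AFF2", "db0e20be1a2bb687c33227d2e6bf9d53", "ENST00000370460", "transcript"),
    ("UBE3A", "d1eace7da295c45378fa5f898f2f03f6", "NM_130838.1", "transcript"),
    ("UBE3A", "d1eace7da295c45378fa5f898f2f03f6", "ENST00000438097", "transcript"),
    # Made-up row: an NM transcript whose fingerprint matches no ENST.
    ("GENE0", "00000000000000000000000000000000", "NM_000000.0", "transcript"),
]

QUERY = """
    select N.hgnc, N.es_fingerprint, N.tx_ac, E.tx_ac
    from tx_exon_set_summary N
    join tx_exon_set_summary E
      on N.es_fingerprint = E.es_fingerprint
     and N.tx_ac like 'NM!_%' escape '!'   -- '!' escapes the literal underscore
     and E.tx_ac like 'ENST%'
     and N.alt_aln_method = 'transcript'
     and E.alt_aln_method = 'transcript'
    order by N.hgnc
"""


def equivalent_pairs(rows):
    """Return (hgnc, fingerprint, NM accession, ENST accession) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table tx_exon_set_summary "
            "(hgnc text, es_fingerprint text, tx_ac text, alt_aln_method text)"
        )
        conn.executemany(
            "insert into tx_exon_set_summary values (?, ?, ?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


pairs = equivalent_pairs(ROWS)
for hgnc, fingerprint, nm_ac, enst_ac in pairs:
    print(hgnc, fingerprint, nm_ac, enst_ac)
```

Joining on the fingerprint rather than comparing exon coordinates row by row keeps the equivalence test to a single equality, which is presumably why the slide's query is so short.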