Expression STRs
Yaniv Erlich (@erlichya)
7/15/15
We know quite a lot about genetic variations…
Intro Genotyping STRs eSTRs eSTR or eSNPs? eSTRs in diseases
Expression STRs (eSTRs)
What about Short Tandem Repeats (STRs)?
CTCAATACAAGTCTAACAGCAGCAGCAGCAGCAGCAGCAGCAGTTGATGAAC
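The repeat in the sequence above can be located programmatically. The following is a minimal sketch, assuming a perfect (uninterrupted) repeat with a 2-6 nt motif; the function name `find_strs` and the `MIN_COPIES` threshold are illustrative choices, not part of lobSTR or HipSTR.

```python
import re

# Sequence shown on the slide.
SLIDE_SEQ = "CTCAATACAAGTCTAACAGCAGCAGCAGCAGCAGCAGCAGCAGTTGATGAAC"

# Assumption: an STR is a 2-6 nt motif repeated back-to-back at least
# MIN_COPIES times. The threshold is arbitrary and only for illustration.
MIN_COPIES = 4
STR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_COPIES - 1))


def find_strs(seq):
    """Return (start, motif, copies) for each perfect tandem repeat in seq."""
    hits = []
    for m in STR_PATTERN.finditer(seq):
        motif = m.group(1)
        copies = len(m.group(0)) // len(motif)
        hits.append((m.start(), motif, copies))
    return hits


for start, motif, copies in find_strs(SLIDE_SEQ):
    print(start, motif, copies)
```

Real STR references are built with dedicated repeat finders and tolerate interruptions; this regex only catches exact repeats and would also report homopolymer runs as 2 nt motifs.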
Short Tandem Repeats
•  1% of the human genome
•  Fast mutation rates
•  Multiple Mendelian diseases
•  Evolvability
Diseases shown: Huntington, Fragile X, OPMD, Synpolydactyly, Ataxia (10 types), HFG syndrome, Holoprosencephaly, Pseudoachondroplasia, Myotonic dystrophy, Cleidocranial dysplasia, ALS-FTD
lobSTR – Whole genome solution for STR genotyping
lobSTR: An STR profiler for WGS
Method
lobSTR: A short tandem repeat profiler for personal genomes
Melissa Gymrek (1,2), David Golan (2,3), Saharon Rosset (3), and Yaniv Erlich (2,4)
(1) Harvard–MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; (2) Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02142, USA; (3) Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv 69978, Israel
Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR’s accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR’s implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format.
[Supplemental material is available for this article.]
Short tandem repeats (STRs), also known as microsatellites, are a class of genetic variations with repetitive elements of 2–6 nucleotides (nt) that consist of approximately a quarter million loci in the human genome (Benson 1999). The repetitive structure of those loci creates unusual secondary DNA conformations that are ...
Despite the plurality of applications, STR variations are not routinely analyzed in whole-genome sequencing studies, mainly due to a lack of adequate tools (Treangen and Salzberg 2011). STRs pose a remarkable challenge to mainstream HTS analysis pipelines. First, not all reads that align to an STR locus are informative ...
Melissa Gymrek, David Golan, Saharon Rosset, et al. lobSTR: A short tandem repeat profiler for personal genomes. Genome Res., published online April 20, 2012, in advance of the print journal. doi: 10.1101/gr.135780.111. Copyright © 2012 by Cold Spring Harbor Laboratory Press.
New: HipSTR
•  HipSTR: Haplotype-based imputation, phasing and genotyping of STRs
•  Major improvements:
– Learns locus-specific stutter models
– Physical phasing
– Imputes missing STRs or gives priors based on SNPs
– Reports not only the length of an STR but also its sequence
HipSTR: solving homoplasy
•  Can now correctly detect STRs with identical lengths but different sequences (homoplasy)
•  Real example:
–  Length-based genotype: -4/-4
–  HipSTR genotype: (AGAT)8(ACAT)9 / (AGAT)10(ACAT)7
•  HipSTR available at https://github.com/tfwillems/HipSTR
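To make the homoplasy example concrete, the short sketch below expands the two alleles from the slide and compares them. The helper `expand` and its compact-notation parser are hypothetical and written only for this illustration; they are not HipSTR's API or output format.

```python
import re


def expand(allele):
    """Expand a compact repeat string such as '(AGAT)8(ACAT)9' into bases."""
    parts = re.findall(r"\(([ACGT]+)\)(\d+)", allele)
    return "".join(motif * int(count) for motif, count in parts)


allele_1 = expand("(AGAT)8(ACAT)9")
allele_2 = expand("(AGAT)10(ACAT)7")

# Both alleles span 17 repeat units of 4 nt, so a length-only genotyper
# cannot tell them apart, while a sequence-aware one can.
print(len(allele_1), len(allele_2), allele_1 == allele_2)
```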
Capillary-based validation
•  Simons Genome Diversity Project sequenced 280 individuals to 30x
•  For 105 of these samples, ~300 Marshfield STRs were genotyped using capillary electrophoresis
•  Compare the length of the STR genotypes to the capillary PCR products to assess accuracy
R² = 0.987
Analyzing 1000 Genomes STRs
Good allele frequency spectrum for 90% of the STRs in the genome.
A catalog of STR variations
strcat.teamerlich.org
Summary of the 1000 Genomes analysis
About 100,000 of the STRs in your genome are different from those of the person next to you…
But do normal STR variations have phenotypic consequences?
Expression STRs (eSTRs): single gene studies in human
[Figure: expression vs. # of repeats (15, 14, 13, 12) in single-gene studies: PIG3 (Contente et al., 2002), NOS2A (Warpeha et al., 1999), EGFR (Gebhardt et al., 1999), MMP9 (Shimajiri et al., 1999)]
But we want a genome wide analysis!
Analysis pipeline
STR → Expression
X: STR calls
Y: expression (RNA-seq)
384 samples
Regression tests +/-100kb from transcripts
H0: effect = 0
H1: effect ≠ 0
~190,000 tests for [genes x STR] + negative controls
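A single [gene x STR] test from the pipeline above could look roughly like the sketch below. It is a simplified illustration under stated assumptions: `x` is one STR dosage value per sample (for example the mean allele length), `y` is the normalized expression of one gene, no covariates are included, and the p-value uses a normal approximation to the t distribution. The slides do not specify these details, and the function name `str_expression_test` is made up for this example.

```python
import math


def str_expression_test(x, y):
    """Regress expression y on STR dosage x; return (slope, t, approx_p)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    t = slope / se  # assumes the fit is not perfect (se > 0)
    # Two-sided p-value for H0: effect = 0, via a normal approximation
    # that is reasonable when n is in the hundreds.
    p = math.erfc(abs(t) / math.sqrt(2))
    return slope, t, p
```

Repeating this for every STR within the window around each transcript gives the set of tests that feeds the genome-wide survey.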
Genome-wide survey of eSTRs in human
[QQ plot: observed p-value (-log10) vs. expected p-value under the null (-log10)]
Signal: 2060 eSTRs
Negative controls follow the null
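The axes of a QQ plot like this one can be computed as in the sketch below. It assumes the common convention that the i-th smallest of n p-values is compared with i/(n+1); the slides do not say which convention was used, and `qq_points` is an illustrative name.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) -log10 p-value pairs for a QQ plot."""
    n = len(pvalues)
    observed = sorted(pvalues)
    # Under the null, sorted p-values are roughly evenly spread on (0, 1).
    # p-values must be > 0, otherwise log10 is undefined.
    return [
        (-math.log10((rank + 1) / (n + 1)), -math.log10(p))
        for rank, p in enumerate(observed)
    ]
```

Points that follow the diagonal behave like the null (as the negative controls do here); points that rise above it are the signal.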
Most of the eSTRs are replicable
Replication: orthogonal populations, orthogonal expression assay (array)+
83% of eSTRs showed the same direction of effect (N=822; p<10^-93)
Also the effects were highly correlated (R=0.73; p<10^-140)
[Figure: effect (RNA-seq) vs. effect (array)]
+Data from Stranger et al., PLoS Genetics, 2012
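One way to test direction-of-effect concordance is a sign test. The sketch below is an assumption about the kind of test involved, not the one used in the talk: it computes the one-sided binomial tail for k concordant eSTRs out of n under a 50/50 null, on the log10 scale so that very small p-values do not underflow. The name `sign_test_log10_p` is hypothetical, and the result is not expected to reproduce the p-value quoted on the slide.

```python
import math


def sign_test_log10_p(k, n):
    """log10 of P(X >= k) for X ~ Binomial(n, 1/2), computed exactly."""
    # Python integers are arbitrary precision, so the tail count is exact.
    tail = sum(math.comb(n, j) for j in range(k, n + 1))
    return math.log10(tail) - n * math.log10(2)


# Roughly 83% of 822 is 682 concordant effects (rounding is an assumption).
print(sign_test_log10_p(682, 822))
```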
SNPs or STRs?
[Diagram: gene, STR, TF, SNP]
Causality
Tagging: biologically, not very interesting.
Decomposing variations
Y ~ X_STR + X_B
Simulations of negative controls (no eSTR contribution):
[Figure: h²_STR and h²_b for simulated SNP-eQTLs]
Take home message: LMM is calibrated.
LMM results of real data
Linear mixed model (LMM) for variance decomposition for all genes: eSTR vs. all common variants on the haplotype
eSTRs contribute 10%-15% of the cis-heritability of gene expression explained by common variants.
Regressing conditioned on best SNP
Null hypothesis: random slopes
[Figure: expression vs. mean STR allele, one panel per genotype of the best SNP (AA, AB, BB)]
[Diagram: gene, STR, TF; Causality vs. Tagging]
Regressing conditioned on best SNP
Alternative hypothesis: slopes in the same direction as the original association
[Figure: expression vs. mean STR allele, one panel per genotype of the best SNP (AA, AB, BB)]
[Diagram: gene, STR, TF; Causality? vs. Tagging]
Regressing conditioned on best SNP
75% of conditioned effects were in the same direction (p<10^-108)
[Figure: unconditioned effect vs. conditioned effect]
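The conditioning idea can be sketched as a stratified regression: estimate the STR slope separately within each genotype class of the best SNP (AA, AB, BB) and check whether the slopes keep the sign of the original association. This is a minimal illustration under that assumption; the talk's exact conditioning procedure is not given on the slides, and both function names are made up for the example.

```python
from collections import defaultdict


def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_within_snp_genotype(str_dosage, expression, snp_genotype):
    """Slope of expression on STR dosage inside each best-SNP genotype class."""
    groups = defaultdict(lambda: ([], []))
    for x, y, g in zip(str_dosage, expression, snp_genotype):
        groups[g][0].append(x)
        groups[g][1].append(y)
    # Skip classes where the STR does not vary; no slope can be estimated.
    return {g: slope(xs, ys) for g, (xs, ys) in groups.items() if len(set(xs)) > 1}
```

If the STR were only tagging the SNP, slopes within a fixed SNP genotype would be expected to scatter around zero with random signs; consistent signs point toward an effect of the STR itself.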
Evidence for function of eSTRs
Conservation
[Figure: PhyloP (×10^-3; axis 0 to 2) by window size (bp): ±1000, ±500, ±250, ±100, ±50; p-value labels in slide order: p<7%, p<3%, p<0.1%, p<0.1%, p<1%]
eSTRs are significantly enriched in more conserved regions
Co-localization with functional elements
Peak shift: eSTR co-localization with histone signatures (ENCODE LCL): p<0.01
But maybe these signatures are created by nearby causal variants?
Null (peak shifting): Trynka, bioRxiv, 2015
A potential role of eSTRs in human diseases
Associating the 2060 eSTRs x 31 phenotypes of ~1300 individuals in the UK10K
FDR<10%
Diastolic blood pressure: CLCC1, DIP2B
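With 2060 eSTRs tested against 31 phenotypes, a false discovery rate threshold is needed to pick the hits. The sketch below uses the Benjamini-Hochberg step-up procedure as an assumed way of applying FDR<10%; the slides do not name the FDR method, and `benjamini_hochberg` is an illustrative function.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return indices of tests rejected by the Benjamini-Hochberg rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold.
        if pvalues[i] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Toy p-values, not data from the talk.
print(benjamini_hochberg([0.001, 0.5, 0.02, 0.9, 0.03]))
```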
A potential role of eSTRs in human diseases
Name          Symbol    P value    Phenotype        Class
4:9955416     SLC2A9    3.49E-08   Uric_Acid        Metabolic function
10:27124545   Abi1      4.61E-07   Phosphate        Metabolic function
17:44048491   KIAA1267  6.86E-06   FEV1.FVC_Ratio   Pulmonary function
16:473880     DECR2     2.51E-05   ApoA1            Metabolic function
1:109393265   CLCC1     2.89E-05   Diastolic_BP     Blood Pressure
6:20195837    MBOAT1    3.26E-05   Albumin          Metabolic function
1:110516300   FAM40A    5.07E-05   Urea             Metabolic function
12:51036810   DIP2B     1.02E-04   Diastolic_BP     Blood Pressure
Summary
The first genome-wide expression STR analysis.
1.  Over 2,000 eSTRs in the discovery set.
2.  Replication in independent platforms/populations.
3.  eSTRs account for 10-15% of the cis-heritability explained by common variants.
4.  Functional evidence.
5.  eSTRs are associated with human phenotypes.
How much heritability is missing from GWAS because repetitive elements are not analyzed?
Team eSTR:
Melissa Gymrek
Thomas Willems
Dina Zielinski
Stoyan Georgiev
Barak Marcus
Alkes Price
Mark Daly
Jonathan Pritchard
Acknowledgements
Funding
Burroughs Wellcome Career Award
National Institute of Justice
Outline (from "Towards a population scale map of STR variations", Yaniv Erlich, 7/12/12)
lobSTR: Profiling STR variations from WGS data
STR variations across 2,500 datasets: Preliminary results
[Figure: heterozygosity (axis 0.0 to 1.0) by population: All, CEU, GBR, FIN, IBS, YRI, LWK, ACB, ASW, CHB, CDX, CHS, JPT, KHV]
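For reference, per-locus heterozygosity can be computed from allele calls as sketched below. This assumes the usual expected-heterozygosity definition, 1 minus the sum of squared allele frequencies; the slide does not state which definition the figure uses, and `expected_heterozygosity` is an illustrative name.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """1 - sum(p_i^2) over allele frequencies p_i at one STR locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Toy example: repeat counts observed on four chromosomes at one locus.
print(expected_heterozygosity([10, 10, 11, 12]))
```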
The End

SMBE 2015: Expression STRs

  • 2. Yaniv Erlich7/15/15 @erlichya We know quite a lot about genetic variations… Intro Genotyping STRs eSTRs eSTR or eSNPs? eSTRs in diseases Expression STRs (eSTRs)
  • 3. What about Short Tandem Repeats (STRs)? CTCAATACAAGTCTAACAGCAGCAGCAGCAGCAGCAGCAGCAGTTGATGAAC
  • 4. Short Tandem Repeats: 1% of the human genome; fast mutation rates; multiple Mendelian diseases; evolvability. Diseases shown: Huntington, Fragile X, OPMD, Synpolydactyly, Ataxia (10 types), HFG syndrome, Holoprosencephaly, Pseudoachondroplasia, Myotonic dystrophy, Cleidocranial Dysplasia, ALS-FTD.
  • 5. lobSTR – Whole genome solution for STR genotyping
  • 6. lobSTR: An STR profiler for WGS. Method: "lobSTR: A short tandem repeat profiler for personal genomes", Melissa Gymrek (1,2), David Golan (2,3), Saharon Rosset (3), and Yaniv Erlich (2,4). Genome Res., published online April 20, 2012; doi: 10.1101/gr.135780.111. Copyright © 2012 by Cold Spring Harbor Laboratory Press.
    Affiliations: (1) Harvard–MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; (2) Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02142, USA; (3) Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv 69978, Israel.
    Abstract: Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR’s accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR’s implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format. [Supplemental material is available for this article.]
    Introduction excerpts (cut off on the slide): Short tandem repeats (STRs), also known as microsatellites, are a class of genetic variations with repetitive elements of 2–6 nucleotides (nt) that consist of approximately a quarter million loci in the human genome (Benson 1999). The repetitive structure of those loci creates unusual secondary DNA conformations that are ... Despite the plurality of applications, STR variations are not routinely analyzed in whole-genome sequencing studies, mainly due to a lack of adequate tools (Treangen and Salzberg 2011). STRs pose a remarkable challenge to mainstream HTS analysis pipelines. First, not all reads that align to an STR locus are informative ...
  • 7. New: HipSTR. HipSTR: Haplotype-based imputation, phasing and genotyping of STRs. Major improvements: learns locus-specific stutter models; physical phasing; imputes missing STRs or gives priors based on SNPs; reports not only the length of an STR but also its sequence.
  • 8. HipSTR: solving homoplasy. Can now correctly detect STRs with identical lengths but different sequences (homoplasy). Real example: length-based genotype: -4/-4; HipSTR genotype: (AGAT)8(ACAT)9 / (AGAT)10(ACAT)7. HipSTR available at https://github.com/tfwillems/HipSTR
  • 9. Capillary-based validation. The Simons Genome Diversity Project sequenced 280 individuals to 30x. For 105 of these samples, ~300 Marshfield STRs were genotyped using capillary electrophoresis. The lengths of the STR genotypes are compared to the capillary PCR products to assess accuracy: R² = 0.987.
  • 10. Analyzing 1000 Genomes STRs. Good allele frequency spectrum for 90% of the STRs in the genome.
  • 11. A catalog of STR variations: strcat.teamerlich.org. Citation shown on the slide: Melissa Gymrek, David Golan, Saharon Rosset, et al. "lobSTR: A short tandem repeat profiler for personal genomes". Genome Res., published online April 20, 2012; doi: 10.1101/gr.135780.111.
  • 12. Summary of the 1000 Genomes analysis. About 100,000 of the STRs in your genome are different from those of the person next to you…
  • 13. Summary of the 1000 Genomes analysis. About 100,000 of the STRs in your genome are different from those of the person next to you… But do normal STR variations have phenotypic consequences? (Footer on this slide: Part 1: lobSTR; Challenges; Algorithm; Benchmarking; Validation; Summary.)
  • 14. Paper: Expression STRs (eSTRs)
  • 15. Expression STRs (eSTRs): single-gene studies in human. Figure panels (expression vs. number of repeats): PIG3 (Contente et al., 2002), NOS2A (Warpeha et al., 1999), EGFR (Gebhardt et al., 1999), MMP9 (Shimajiri et al., 1999). But we want a genome-wide analysis.
  • 16. Analysis pipeline. X: STR calls; Y: expression (RNA-seq); 384 samples. Regression tests for STRs within ±100 kb of transcripts, with H0: effect = 0 and H1: effect ≠ 0. ~190,000 tests for [genes x STR] + negative controls.
  • 17. Genome-wide survey of eSTRs in human. Figure: observed p-value [-log10] vs. expected p-value under the null [-log10]. Signal: 2060 eSTRs. Negative controls follow the null.
  • 18. Most of the eSTRs are replicable. Replication in orthogonal populations with an orthogonal expression assay (array)+. 83% of eSTRs showed the same direction of effect (N=822; p<10^-93). The effects were also highly correlated (R=0.73; p<10^-140). Figure axes: effect (RNA-seq) vs. effect (array). +Data from Stranger et al., PLoS Genetics, 2012.
  • 19. SNPs or STRs? Diagram labels: gene, STR, TF, SNP. Two scenarios: causality, or tagging (biologically, not very interesting).
  • 20. Decomposing variations into h²_b and h²_STR. Model as recoverable from the slide: Y ~ X_STR + X_B. Simulations of negative controls (no eSTR contribution; simulated SNP-eQTL). Take-home message: the LMM is calibrated.
  • 21. LMM results of real data. Linear mixed model (LMM) for variance decomposition for all genes: eSTR vs. all common variants on the haplotype. eSTRs contribute 10%-15% of the cis heritability of gene expression.
  • 22. Regressing conditioned on best SNP. Figure: expression vs. mean STR allele, shown separately for best-SNP genotypes AA, AB, BB. Null hypothesis: random slopes. Diagram labels: gene, STR, TF; causality vs. tagging.
  • 23. Regressing conditioned on best SNP. Figure: expression vs. mean STR allele, shown separately for best-SNP genotypes AA, AB, BB. Alternative hypothesis: slopes in the same direction as the original association. Diagram labels: gene, STR, TF; causality? vs. tagging.
  • 24. Regressing conditioned on best SNP. 75% of conditioned effects were in the same direction (p<10^-108). Figure axes: unconditioned effect vs. conditioned effect.
  • 25. Evidence for function of eSTRs: conservation. Figure: PhyloP (×10^-3) by window size (bp): ±1000, ±500, ±250, ±100, ±50; p-value labels in slide order: p<7%, p<3%, p<0.1%, p<0.1%, p<1%. eSTRs are significantly enriched in more conserved regions.
  • 26. Co-localization with functional elements. eSTRs co-localize with histone signatures (ENCODE LCL): p<0.01. But maybe these signatures are created by nearby causal variants? Null (peak shifting): Trynka, bioRxiv, 2015.
  • 27. A potential role of eSTRs in human diseases. Associating the 2060 eSTRs x 31 phenotypes of ~1300 individuals in the UK10K. FDR<10%. Diastolic blood pressure: CLCC1, DIP2B.
  • 28. A potential role of eSTRs in human diseases
      Name          Symbol     P value    Phenotype        Class
      4:9955416     SLC2A9     3.49E-08   Uric_Acid        Metabolic function
      10:27124545   Abi1       4.61E-07   Phosphate        Metabolic function
      17:44048491   KIAA1267   6.86E-06   FEV1.FVC_Ratio   Pulmonary function
      16:473880     DECR2      2.51E-05   ApoA1            Metabolic function
      1:109393265   CLCC1      2.89E-05   Diastolic_BP     Blood Pressure
      6:20195837    MBOAT1     3.26E-05   Albumin          Metabolic function
      1:110516300   FAM40A     5.07E-05   Urea             Metabolic function
      12:51036810   DIP2B      1.02E-04   Diastolic_BP     Blood Pressure
  • 29. Summary. The first genome-wide expression STR analysis. 1. Over 2,000 eSTRs in the discovery set. 2. Replication in independent platforms/populations. 3. eSTRs account for 10-15% of cis-heritability by common variants. 4. Functional evidence. 5. eSTRs are associated with human phenotypes. How much heritability is missing in GWAS studies because repetitive elements are not analyzed?
  • 30. Acknowledgements. Team eSTR: Melissa Gymrek, Thomas Willems, Dina Zielinski, Stoyan Georgiev, Barak Marcus, Alkes Price, Mark Daly, Jonathan Pritchard. Funding: Burroughs Wellcome Career Award; National Institute of Justice.
  • 31. Outline (Yaniv Erlich, 7/12/12): Towards a population-scale map of STR variations. lobSTR: profiling STR variations from WGS data. STR variations across 2,500 datasets: preliminary results. Figure: heterozygosity by population (All, CEU, GBR, FIN, IBS, YRI, LWK, ACB, ASW, CHB, CDX, CHS, JPT, KHV). The End
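
The homoplasy example on slide 8 can be checked with a few lines of Python. This is only an illustrative sketch: the helper `expand` is a hypothetical name, not part of HipSTR, and it assumes the compact "(motif)count" notation used on the slide.

```python
import re

def expand(allele: str) -> str:
    """Expand compact repeat notation such as '(AGAT)8(ACAT)9' into a sequence."""
    blocks = re.findall(r"\(([ACGT]+)\)(\d+)", allele)
    return "".join(motif * int(count) for motif, count in blocks)

# The two alleles from the slide's real example.
allele_a = expand("(AGAT)8(ACAT)9")
allele_b = expand("(AGAT)10(ACAT)7")

# A length-only caller cannot tell these apart; a sequence-aware caller can.
same_length = len(allele_a) == len(allele_b)
same_sequence = allele_a == allele_b
print(len(allele_a), len(allele_b), same_length, same_sequence)
# prints: 68 68 True False
```

Both alleles contain 17 four-base repeats, so a length-based genotype reports them as identical even though their sequences differ.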
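
The per-pair regression test on slide 16 (H0: effect = 0 vs. H1: effect ≠ 0) might look roughly like the sketch below. Everything here is an assumption for illustration: the data are simulated, the function name `slope_test` is hypothetical, X is taken to be each sample's mean STR allele length, and the p-value uses a normal approximation rather than whatever the actual pipeline used. Only the sample count (384) comes from the slide.

```python
import math
import random

def slope_test(x, y):
    """Fit y = a + b*x by least squares; return (b, standard error, two-sided p).

    The p-value uses a normal approximation to the t statistic, which is
    reasonable for a few hundred samples but is a simplification.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    p = math.erfc(abs(slope / se) / math.sqrt(2))
    return slope, se, p

rng = random.Random(0)
n_samples = 384  # sample count taken from the slide
# Simulated mean STR allele length (in repeat units) per sample.
str_len = [rng.choice([10, 11, 12, 13, 14, 15]) for _ in range(n_samples)]
# Simulated expression: one gene with a true STR effect, one control without.
expr_estr = [0.3 * s + rng.gauss(0, 1) for s in str_len]
expr_null = [rng.gauss(0, 1) for _ in range(n_samples)]

slope_estr, se_estr, p_estr = slope_test(str_len, expr_estr)
slope_null, se_null, p_null = slope_test(str_len, expr_null)
print(f"eSTR-like gene: slope={slope_estr:.3f}, p={p_estr:.2e}")
print(f"control gene:   slope={slope_null:.3f}, p={p_null:.2e}")
```

In a genome-wide scan this test would be repeated for every STR and gene pair within the ±100 kb window, which is why the slide also carries negative controls.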
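
The conditioning idea on slides 22 to 24 can be sketched by fitting the STR slope separately within each best-SNP genotype (AA, AB, BB) and checking whether the slopes keep the sign of the unconditioned association. This is one reading of the slides' figure, not a confirmed description of the authors' procedure; the data are simulated and the function names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def slopes_by_snp_genotype(str_len, snp_gt, expr):
    """Fit the STR slope separately within each SNP genotype group."""
    groups = {}
    for s, g, e in zip(str_len, snp_gt, expr):
        xs, ys = groups.setdefault(g, ([], []))
        xs.append(s)
        ys.append(e)
    # Skip groups in which the STR length does not vary (slope undefined).
    return {g: ols_slope(xs, ys) for g, (xs, ys) in groups.items()
            if len(set(xs)) > 1}

rng = random.Random(1)
snp_shift = {"AA": 0.0, "AB": 1.0, "BB": 2.0}  # simulated SNP effect
str_len, snp_gt, expr = [], [], []
for genotype, shift in snp_shift.items():
    for _ in range(60):
        s = rng.choice([10, 11, 12, 13, 14, 15])
        str_len.append(s)
        snp_gt.append(genotype)
        # Simulated causal STR: expression depends on STR length within each group.
        expr.append(0.5 * s + shift + rng.gauss(0, 0.2))

unconditioned = ols_slope(str_len, expr)
conditioned = slopes_by_snp_genotype(str_len, snp_gt, expr)
same_direction = [g for g, b in conditioned.items() if b * unconditioned > 0]
print(round(unconditioned, 2), {g: round(b, 2) for g, b in conditioned.items()})
```

If the STR merely tagged the SNP, the within-genotype slopes would scatter around zero (the "random slopes" null); consistently signed slopes are what the alternative hypothesis predicts.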
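
Slide 27 reports associations at FDR<10% but does not say how the false discovery rate was controlled. As a hedged illustration only, the sketch below assumes the Benjamini-Hochberg procedure and applies it to made-up p-values; neither the method nor the numbers are taken from the talk.

```python
def benjamini_hochberg(pvalues, q=0.10):
    """Return the indices of tests rejected at false discovery rate q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank k whose p-value is at most k * q / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every test ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])

# Hypothetical p-values for ten eSTR-by-phenotype tests.
toy_pvalues = [0.001, 0.8, 0.012, 0.035, 0.5, 0.021, 0.9, 0.2, 0.6, 0.07]
hits = benjamini_hochberg(toy_pvalues, q=0.10)
print(hits)
# prints: [0, 2, 3, 5]
```

With 2060 eSTRs and 31 phenotypes the real analysis involves tens of thousands of tests, so some form of multiple-testing control like this is needed before reading the table on slide 28.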