Andres Metspalu: The Estonian Genome Project

Estonian Genome Project:
From Biobanking to
Personalised Medicine
Andres Metspalu
Estonian Biobank, Institute of Genomics,
University of Tartu, Estonia
Genomics to Healthcare
THL, Helsinki, September 13, 2019

Estonian Biobank (started in 1999!)
• 1. Prospective, longitudinal, volunteer-based
• 2. Health records, diet, physical activity, etc.
DNA, plasma, 3000 WGS, 2500 WES, all GSA
array
• 3. 52,000 participants from 2002-2011,
additional 150 000 2018-2019
• 5. Open for research: Clear access rules, broad
informed consent, HGRAct, EstGSA array data
• 6. 200 000 individuals = 15% of the population of
Estonia

European genetic map
[Figure: principal component plot of European populations, PC1 (8.65%) against PC2 (4.68%). Populations sampled: Austria, Bulgaria, Czech Republic, Estonia, Finland (Helsinki), Finland (Kuusamo), France, Northern Germany, Southern Germany, Hungary, Northern Italy, Southern Italy, Latvia, Lithuania, Poland, Russia, Spain, Sweden, Switzerland. Labelled clusters: Baltic countries, Russia and Poland; Sweden; Western and Central Europe; Italy; Finland.]
Nelis M et al., PLoS ONE 4(5):e5472

Estonian biobank: omics profiling
Method                              Sample size
Whole genome sequencing (30X)       3,000
Whole exome sequencing              2,500
Genome-wide genotyping arrays       130,000
Genome-wide methylation arrays      700
Genome-wide expression arrays       1,100
mRNA sequencing                     600
Total RNA sequencing                50
Metabolomics (NMR)                  11,000
Metabolomics (MS/MS)                1,100
Telomere length                     5,200
Clinical biochemistry               2,700
IgG glycosylation                   1,000

Secure exchange of health data –
cornerstone of Estonian digital health architecture
[Diagram: components connected through X-Road, status December 2017]
• Pharmacies and individual doctors
• X-Road, ID-card, State IS Service Register
• State Agency of Medicines: Coding Centre, Handlers of medicines
• Population Register
• Business Register
• Anonymized data warehouse for analytics (2013)
• National Health Information System / Archive (December 2008)
• Doctor portal / workstation (2013)
• X-Road gateway service (2009)
• Patient portal / workstation (2009 v1, 2013 v2)
• Social Insurance Board portal / workstation (2012)
• Health Insurance Fund register
• National support applications for ambulance services (2014)
• Statistics portal (2013)
• Ambulance service mobile workstations
• National medical images archive PACS (2005 | national 2014)
• VPN
• Unemployment Insurance Fund

Medical data on 1.1 million persons
National Patient Portal

Figure 3. National registries and databases for enrichment of phenotype data in the Estonian Biobank. The
schematic diagram illustrates the different layers of information available in the database of the Estonian
Biobank, which is continually being updated by queries to the Estonian Causes of Death Registry, the
Estonian Cancer Registry and the Digital Prescription Database of the Estonian Health Insurance Fund, as
well as electronic medical records (EMRs) from the databases of the two major hospitals in Estonia. Data
generated through research projects must be returned to the Biobank within 5 years of the original data
release from the Biobank.

Male, born 1944
Disease trajectories + treatment info for people in the biobank

Personalized medicine
• - rare disease – precision diagnostics
(WES, WGS)
• - cancer – precision treatment (e.g.
mutation-specific mAb)
• - common diseases – “personal
prevention” (PRS)
• Pharmacogenomics (drug response)

Genomics of the common disease
(CAD, T2D, etc.) “Estonian approach”
• 1. Sequence ca 1-2% of the population and
capture maximum amount of the genomic
heterogeneity and use it for imputations
• 2. Use arrays for the major part of the
population and impute the arrays
• 3. Use the data for PRS and
pharmacogenetics
• 4. It costs 50€ to recruit and genotype one
individual = population scale Per Med
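
The scale implied by these numbers can be checked with simple arithmetic. The sketch below assumes the 50€ per-person figure applies uniformly and takes the cohort size (200 000) and population share (15%) from the biobank overview slide; the variable names are illustrative only.

```python
# Figures quoted on the slides; everything derived from them is a rough estimate.
COST_PER_PERSON_EUR = 50        # recruit + genotype one individual
PARTICIPANTS = 200_000          # biobank cohort size
POPULATION_SHARE = 0.15         # cohort as a share of the population

total_cost_eur = COST_PER_PERSON_EUR * PARTICIPANTS
implied_population = PARTICIPANTS / POPULATION_SHARE

# "Sequence ca 1-2% of the population" expressed as head counts.
sequenced_low = 0.01 * implied_population
sequenced_high = 0.02 * implied_population

print(f"Total cost for the cohort: {total_cost_eur:,} EUR")
print(f"Implied population: {implied_population:,.0f}")
print(f"1-2% of the population: {sequenced_low:,.0f} to {sequenced_high:,.0f} people")
```

At 50€ per person, the full 200 000 cohort comes to 10 million €.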

Population-specific reference
panel for genotype imputation

Human Genes Research Act
Passed 13.12.2000
RT I 2000, 104, 685
• § 3. Chief processor of Gene Bank
• (1) The chief processor of the Gene Bank is the
University of Tartu whose objective as the chief
processor of the Gene Bank is to:
1) promote the development of genetic research;
2) collect information on the health of the Estonian
population and genetic information concerning the
Estonian population;
3) use the results of genetic research to improve
public health.

Return the genetic data back to
people from the Estonian Biobank
• Estonian biobank is returning the
research data to the people who
want and agree to get it.
• We are inviting back approx. 3000
people; around 2000 have so far
received their polygenic risk scores (PRS)
and 30 min of counseling.

EGCUT broad feedback initiative
Topics returned:
• Common disorders (PRS)
• High-risk actionable variants
• Carrier status
• Pharmacogenetics
• Other topics of interest
• Risk factors with moderate effect
Conditions and items listed:
• Alpha-1 antitrypsin insufficiency
• Thrombophilia
• Glaucoma (exfoliative)
• Hypolactasia
• Cystic fibrosis
• Wilson’s disease
• Early menopause
• Ancestry (planned)
• HBOC
• Lynch syndrome, polyposes
• FH
• Arrhythmogenic right ventricular cardiomyopathy
• T2D
• BrCa
• CAD, myocardial infarction
• 11 genes
• 30 compounds

4 examples:
• 1. Familial hypercholesterolemia
(FH) - Alver et al. (2018), GIM
“Genetics first” approach
• 2. T2D – PRS - Läll et al (2016), GIM
• 3. Breast cancer – PRS, Läll et al
(2019), submitted
• 4. Pharmacogenetics - Reisberg et al
(2018) GIM

Abul-Husn et al. Science 2016
Khera et al. J Am Coll Cardiol. 2016
FH-linked variant (LDLR, APOB, PCSK9 gene) carriers display 50 mg/dl
(1.3 mmol/L) and greater and a wide spectrum of LDL-C level
Diagnostic LDL-C level cut-off
for FH cases >4.9 mmol/L
Alver et al. (2018) Genetics in
Medicine
Familial hypercholesterolemia - FH

FH Summary
• Under-diagnosis and under-treatment
– reclassified 51% from having non-specific
hypercholesterolemia to having FH, half of them were on
statins, but none had LDL-C below treatment goals
– identified 32% who had gone unrecognized by the medical
system
– Reliable identification of new FH cases and people with high
GRS which has direct impact on family members
• Insensitivity of current criteria used in FH diagnosis
– wide spectrum of LDL-C levels
• 34% had LDL-C levels ≤4.9 mmol/L
– visible accumulations of lipid deposits detected in 5% only
– heterogeneity in clinical expression
– Cascade

Polygenic risk scores
• Most of the associated loci identified in GWAS have
very small effects
• Polygenic risk score can be constructed by
combining the effects of all associated loci
– unweighted: sum of all
risk alleles
– weighted: sum of all
risk alleles weighted by
their effect size
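
A minimal sketch of the two variants, assuming each locus is coded as a risk-allele count (0, 1 or 2) and that the weights are per-allele effect sizes taken from a GWAS. The function names and the numbers are illustrative, not taken from any cited study.

```python
def unweighted_prs(dosages):
    """Unweighted score: the plain count of risk alleles carried."""
    return sum(dosages)


def weighted_prs(dosages, weights):
    """Weighted score: each risk-allele count multiplied by its effect size."""
    if len(dosages) != len(weights):
        raise ValueError("need exactly one weight per locus")
    return sum(w * x for w, x in zip(weights, dosages))


# Hypothetical individual genotyped at four associated loci.
dosages = [0, 1, 2, 1]               # risk-allele counts per locus
weights = [0.10, 0.05, 0.20, 0.02]   # made-up per-allele effect sizes

print(unweighted_prs(dosages))           # 4 risk alleles in total
print(weighted_prs(dosages, weights))    # approximately 0.47
```

The unweighted score treats every risk allele alike, while the weighted score lets loci with larger effects dominate.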

• PRS – this is what we are born with!
• Biomarkers (elevated LDL-C, systolic
blood pressure, glucose tolerance test
etc.) will change when disease process
is already ongoing

Polygenic Risk Scores
• There are several ways to calculate the
PRS, but they vary mostly in 3 aspects:
• 1. Number of the genetic variants used
(from few to millions)
• 2. Statistical model used
• 3. Ability of the PRS to generalize to the
whole population
• Sugrue & Desikan 2019, JAMA, April 8

Polygenic risk scores (PRS) weighted: sum of all risk
alleles weighted by their effect size
Calculated as S = w_1 X_1 + w_2 X_2 + … + w_k X_k,
X_1, …, X_k – allele dosages for k independent markers (SNPs),
w_1, w_2, …, w_k – weights
Individuals at
high genetic risk
Methodological
questions:
A) How to select the
SNPs – how many
and what are the
selection criteria?
B) How to select the
optimal weights?
K. Läll …. & K. Fischer,
GM, 2016
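
One common way to approach questions A and B is p-value thresholding: keep only SNPs whose GWAS p-value falls below a cut-off and use their reported effect sizes as the weights in S = w_1 X_1 + … + w_k X_k. The sketch below illustrates that general idea only; it is not presented as the method of the cited paper, and the summary statistics, SNP names and thresholds are all hypothetical.

```python
# Hypothetical GWAS summary statistics: (snp_id, effect_size, p_value).
summary_stats = [
    ("snp1", 0.12, 1e-9),
    ("snp2", 0.05, 3e-4),
    ("snp3", 0.08, 2e-8),
    ("snp4", 0.02, 0.20),
]


def select_weights(stats, p_threshold):
    """Question A and B together: keep SNPs with p < threshold, weight = effect size."""
    return {snp: beta for snp, beta, p in stats if p < p_threshold}


def prs(weights, dosages):
    """S = sum of w_i * X_i over the selected SNPs (missing dosages count as 0)."""
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


# Hypothetical allele dosages (0, 1 or 2) for one individual.
dosages = {"snp1": 1, "snp2": 2, "snp3": 0, "snp4": 2}

strict = select_weights(summary_stats, 5e-8)    # genome-wide significant SNPs only
lenient = select_weights(summary_stats, 1e-3)   # a more permissive cut-off

print(sorted(strict), prs(strict, dosages))
print(sorted(lenient), prs(lenient, dosages))
```

The same individual gets a different score under each threshold, which is why the selection criteria have to be fixed before scores are compared across people.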

Genetic risk score distributions in different populations
[Figure: distributions of the T2D genetic risk score (GRS T2D) across populations]
Reisberg et al. 2017. PLoS ONE
How much does a risk model depend on
the population where it is developed?

Regional PRS are rather similar

PRS of Breast Cancer
• No BRCA1 & BRCA2, but ca 900
SNP variants
Läll et al (2019) BMC
Cancer 19, 557

Cumulative incidence by the
age of 75 in GRS top 10%
category was 12.6%. In middle
category 6.2% and in the
lowest 10% GRS category,
3.5%.
Median follow up 8.6 years,
total number of cases 361.
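
The reported cumulative incidences can be put side by side as simple ratios. This is plain arithmetic on the three percentages quoted above; the ratios are quotients of cumulative incidences, not hazard ratios from the paper, and the dictionary keys are illustrative labels.

```python
# Cumulative breast cancer incidence by age 75 per GRS category, in percent.
incidence_pct = {"top 10%": 12.6, "middle": 6.2, "bottom 10%": 3.5}

top_vs_bottom = incidence_pct["top 10%"] / incidence_pct["bottom 10%"]
top_vs_middle = incidence_pct["top 10%"] / incidence_pct["middle"]

print(f"top 10% vs bottom 10%: {top_vs_bottom:.1f}x")   # 3.6x
print(f"top 10% vs middle:     {top_vs_middle:.1f}x")   # 2.0x
```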

Population vs top 5% GRS based on NIHD
data

Polygenic Risk Scores for Prediction of
Breast Cancer and Breast Cancer Subtypes
• The American Journal of Human Genetics 104, 1–14, January
3, 2019
• Nasim Mavaddat, Kyriaki Michailidou, Joe Dennis, Michael
Lush, Laura Fachal, Andrew Lee, ….. Paul D.P. Pharoah,
Antonis C. Antoniou, Nilanjan Chatterjee, Peter Kraft,
Montserrat García-Closas, Jacques Simard and Douglas F.
Easton.

Intervention?
Perhaps for the high risk group start
mammography/MRI 10-15 years earlier?
Clinical study to test it is underway in TU hospital

Importance of pharmacogenomics
98% of Europeans carry ≥ 1 mutation of
pharmacogenetic relevance
Pharmacogenetic study
ADR
diagnoses
Drug
prescriptions
WGS +
genome-wide
genotyping

On average 5.5% of individuals in
the population use at least one of
the 32 drugs associated with the
studied genes on a daily basis.

CYP2D6 Loss of function mutation and adverse
drug reactions
[Figure: three prescription timelines with adverse drug reaction diagnoses marked; date axes run from 2009 to 2016]
Timeline 1: 1-3. Metoprololum, 4. Tamoxifenum; diagnosis L27.0
Timeline 2: 1. Sertralinum, 2. Venlafaxinum; diagnosis M60.8
Timeline 3: 1. Escitalopramum; diagnoses T88.7, Y57.5
L27.0 = Generalized skin eruption due to drugs and medicaments taken internally
M60.8 = Other myositis
T88.7 = Unspecified adverse effect of drug or medicament
Y57.5 = Drugs, medicaments and biological substances causing adverse effects in therapeutic use

PharmGKB
CPIC guidelines
We will determine genotypes of
Estonian Biobank participants for
variants in 8 genes that have
previously been shown to be
relevant to the response to drugs with
certain active substances. They
will then receive a
pharmacogenetics report and
counselling based on peer-
reviewed and internationally
approved clinical practice
guidelines. More specific
prescribing decision support will
be provided for medical
professionals.

Pharmacogenetic feedback
• 33-y female with depression
• CYP2C19 slow metabolizer,
dose reduction to 50%
recommended
• Sertraline and escitalopram
formerly prescribed
• Both withdrawn, due to ADR –
agitation, aggressiveness,
pharyngitis, etc.
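
A case like this one suggests how a prescribing aid might encode a genotype-based rule. The sketch below is purely illustrative: the rule table, function name and the 10 mg standard dose are hypothetical, and the single rule mirrors only the recommendation quoted on this slide. A real tool would draw on curated guideline content rather than a hard-coded table.

```python
# Hypothetical rule table: (gene, metabolizer phenotype) -> fraction of standard dose.
# The one entry mirrors the case on the slide (CYP2C19 slow metabolizer, reduce to 50%).
DOSE_RULES = {
    ("CYP2C19", "slow metabolizer"): 0.5,
}


def recommended_dose(gene, phenotype, standard_dose_mg):
    """Return the adjusted dose in mg, or None when no rule is on file."""
    factor = DOSE_RULES.get((gene, phenotype))
    if factor is None:
        return None  # no rule: leave the decision entirely to the prescriber
    return standard_dose_mg * factor


# Hypothetical standard dose of 10 mg.
print(recommended_dose("CYP2C19", "slow metabolizer", 10.0))  # 5.0
print(recommended_dose("CYP2D6", "slow metabolizer", 10.0))   # None
```

Returning None for unknown combinations keeps the tool from implying a recommendation it does not have.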

Decision support tools (DST)
• Population scale genomics based on
implementing the PRS is the “instrument” for
disease prediction and prevention and this
should happen on the primary care level
• GPs need support in order to implement the
new genomics-based information
• The DST should be easy to use, but the PRS
must be based on up-to-date information
in the relevant database

Challenges and issues
• Awareness among executives, doctors and patients
• New technologies and data empower patients
with more possibilities to manage their own health
• Ethical issues
– Right to know and right not to know
– Treatable and non-treatable conditions
– Big data, cloud
• Knowledge about associations between DNA
variants and diseases is improving, need for
better databases
• Large workload to keep the database of known
risk variants updated

Conclusion
Large prospective biobank cohorts
make it possible to move towards
personalized genetic risk prediction
and to use it in general medical
practice – however, there are still
many challenges on this road

Tarkvara TAK (STACC)
Tõnu Esko, Krista Fischer, Reedik Mägi, Maris Alver, Kristi Läll, Kristi Krebs,
Tõnis Tasa, Mart Kals, Tom Haller, Neeme Tõnisson, Tiit Nikopensius,
Anu Reigo, Liis Leitsalu, Kristjan Metsalu, Kairit Mikkel, Mari-Liis Tammesoo …
Prof. Jaak Vilo, Hedi Peterson

andres.metspalu@ut.ee
Thank you!
www.biobank.ee
