1) The Estonian Biobank project has collected genetic and health data from over 200,000 Estonian volunteers, representing 15% of the country's population. Samples include whole genome sequences, exome sequences, and various omics data.
2) The biobank aims to facilitate personalized medicine by identifying individuals at high genetic risk and returning risk scores and other findings to participants. Studies on conditions like familial hypercholesterolemia and breast cancer have found many underdiagnosed cases using biobank data.
3) Secure electronic health records and national registries allow enrichment of biobank data with disease trajectories, treatments, and outcomes. The biobank also aims to use polygenic risk
1. Estonian Gemome Project:
From Biobanking to
Personalised Medicine
Andres Metspalu
Estonian Biobank, Institute of Genomics,
University of Tartu, Estonia
Genomics to Healthcare
THL, Helsinki, September 13, 2019
2. Estonian Biobank (started in 1999!)
• 1. Prospective, longitudinal, volunteer-based
• 2. Health records, diet, physical activity, etc.
DNA, plasma, 3000 WGS, 2500 WES, all GSA
array
• 3. 52,000 participants from 2002-2011,
additional 150 000 2018-2019
• 5. Open for research: Clear access rules, broad
informed consent, HGRAct, EstGSA array data
• 6. 200 000 individuals = 15% of the population of
Estonia
3. -0,05
-0,03
-0,01
0,01
0,03
0,05
0,07
-0,05 -0,03 -0,01 0,01 0,03 0,05
P21(4.68%)
PC1 (8.65%)
Austria
Bulgaria
Czech Republic
Estonia
Finland (Helsinki)
Finland (Kuusamo)
France
Northern Germany
Southern Germany
Hungary
Northern Italy
Southern Italy
Latvia
Lithuania
Poland
Russia
Spain
Sweden
Switzerland
Baltic countries,
Russia,
Poland
Sweden
Western and
Central Europe
Italy
Finland
European genetic map
Nelis M et al., PLoS ONE 4(5):e5472
5. Secure exchange of health data –
cornerstone of Estonian digital health architecture
PHARMACIES
AND
INDIVIDUAL
DOCTORS
X-Road, ID-card, State IS Service Register
STATEAGENCYOFMEDICINES
-CodingCentre
-Handlersofmedicines
POPULATIONREGISTER
BUSINESSREGISTER
ANONYMIZED
DATA WAREHOUSE
for ANALYTICS
(2013)
NATIONAL HEALTH INFORMATION
SYSTEM/ARCHIVE
(2008 december)
DOCTOR
PORTAL /
WORK-
STATION
(2013)
X-ROAD
GATEWAY
SERVICE
(2009)
PATIENT
PORTAL /
WORK-
STATION
(2009 v1
2013 v2)
SOCIAL
INSURAN
CE
BOARD
PORTAL
/ WORK-
STATION
(2012)
HEALTHINSURANCE
FUNDREGISTER
NATIONAL
SUPPORT
APPLICATION
S for
AMBULANCE
SERVICES
(2014)
STATISTIC
S PORTAL
(2013)
AMBULANCE
SERVICE
MOBILE
WORKSTATIO
NS
NATIONAL MEDICAL IMAGES
ARCHIVE PACS
(2005 | national 2014)
VPN
UNEMPLYMENTINSURANCE
FUND
Status 2017 Dec
7. Figure 3. National registries and databases for enrichment of phenotype data in the Estonian Biobank. The
schematic diagram illustrates the different layers of information available in the database of the Estonian
Biobank, which is continually being updated by queries to the Estonian Causes of Death Registry, the
Estonian Cancer Registry and the Digital Prescription Database of the Estonian Health Insurance Fund, as
well as electronic medical records (EMRs) from the databases of the two major hospitals in Estonia. Data
generated through research projects must be returned to the Biobank within 5 years of the original data
release from the Biobank.
9. Personalized medicine
• - rare disease – precision diagnostics
(WES, WGS)
• - cancer – precision treatment (e.g. mut
specific MaB)
• - common diseases – “personal
prevention” (PRS)
• Pharmacogenomics (drug response)
10. Genomics of the common disease
(CAD, T2D, etc.) “Estonian approach”
• 1. Sequence ca 1-2% of the population and
capture maximum amount of the genomic
heterogeneity and use it for imputations
• 2. Use arrays for the major part of the
population and impute the arrays
• 3. Use the data for PRS and
pharmacogenetics
• 4. It costs 50€ to recruit and genotype one
individual = population scale Per Med
12. Human Genes Research Act
Passed 13.12.2000
RT I 2000, 104, 685
• § 3. Chief processor of Gene Bank
• (1) The chief processor of the Gene Bank is the
University of Tartu whose objective as the chief
processor of the Gene Bank is to:
1) promote the development of genetic research;
2) collect information on the health of the Estonian
population and genetic information concerning the
Estonian population;
3) use the results of genetic research to improve
public health.
13. Return the genetic data back to
people from the Estonian Biobank
• Estonian biobank is returning the
research data back to the people who
want and agree to get it.
• We are inviting back approx. 3000
people, around 2000 have received by
now the polygenetic risk scores (PRS)
and 30 min counseling.
15. 4 examples:
• 1. Familial hypercholesterolemia
(FH) - Alver et al. (2018), GIM
Genetics first” approach
• 2. T2D – PRS - Läll et al (2016), GIM
• 2. Breast cancer – PRS, Läll et al
(2019), submitted
• 3. Pharmacogenetics - Reisberg et al
(2018) GIM
16. Abul-Husn et al. Science 2016
Khera et al. J Am Coll Cardiol. 2016
FH-linked variant (LDLR, APOB, PCSK9 gene) carriers display 50 mg/dl
(1.3 mmol/L) and greater and a wide spectrum of LDL-C level
Diagnostic LDL-C level cut-off
for FH cases >4.9 mmol/L
Alver et al. (2018) Genetics in
Medicine
Familiar hypercholesterolemia - FH
17.
18.
19. FH Summary
• Under-diagnosis and under-treatment
– reclassified 51% from having non-specific
hypercholesterolemia to having FH, half of them were on
statins, but none had LDL-C below treatment goals
– identified 32% who had gone unrecognized by the medical
system
– Reliable identification of new FH cases and people with high
GRS which has direct impact on family members
• Insensitivity of current criteria used in FH diagnosis
– wide spectrum of LDL-C levels
• 34% had LDL-C levels ≤4.9 mmol/L
– visible accumulations of lipid deposits detected in 5% only
– heterogeneity in clinical expression
– Cascade
20. Polygenic risk scores
• Most of the associated loci identified in GWAS have
very small effects
• Polygenic risk score can be constructed by
combining the effects of all associated loci
– unweighted: sum of all
risk alleles
– weighted: sum of all
risk alleles weighted by
their effect size
21. • PRS – this is what we are born with!
• Biomarkers (elevated LDL-C, systolic
blood pressure, glycose tolerance test
etc.) will change when disease process
is already ongoing
22. Polygenic Risk Scores
• There are several ways to calculate the
PRS, but they vary mostly in 3 aspects:
• 1. Number of the genetic variants used
(from few to millions)
• 2. Statistical model used
• 3. Ability of the PRS to generalize the
whole population
• Sugrue & Desikan 2019, JAMA, April 8
23. Polygenic risk scores (PRS) weighted: sum of all risk
alleles weighted by their effect size
Calculated as S = w1X1 + w2X2 + … + wkXk,
X1,…, Xk - allele dosages for k independent markers (SNP-s),
w1 , w2 , … , wk – weights
Individuals at
high genetic risk
Methodological
questions:
A)How to select the
SNPs – how many
and what are the
selection criteria?
B)How to select the
optimal weights?
K. Läll …. & K. Fischer,
GM, 2016
24.
25.
26.
27. Genetic risk score distributions in different populations
GRST2D
Reisberg et al. 2017. PlosONE
How much does a risk model depend on
the population where it is developed?
29. PRS of Breast Cancer
• No BRACA1 & BRCA2, but ca 900
SNP variants
Läll et al (2019) BMC
Cancer 19, 557
30. Cumulative incidence by the
age of 75 in GRS top 10%
category was 12.6%. In middle
category 6.2% and in the
lowest 10% GRS category,
3.5%.
Median follow up 8.6 years,
total number of cases 361.
32. Polygenic Risk Scores for Prediction of
Breast Cancer and Breast Cancer Subtypes
• The American Journal of Human Genetics 104, 1–14, January
3, 2019
• Nasim Mavaddat*, Kyriaki Michailidou, Joe Dennis, Michael
Lush, Laura Fachal, Andrew Lee, ….. Paul D.P. Pharoah,
Antonis C. Antoniou,1Nilanjan Chatterjee, Peter Kraft,
Montserrat Garcı´a-Closas,6 Jacques Simard and Douglas F.
Easton.
33. Intervention?
Perhaps for the high risk group start
mammography/MRI 10-15 years earlier?
Clinical study to test it is underway in TU hospital
34. Importance of pharmacogenomics
98% of Europeans carry ≥ 1 mutation of
pharmacogenetic relevance
Pharmacogenetic study
ADR
diagnoses
Drug
prescriptions
WGS +
genome-wide
genotyping
35. On average 5.5% of individuals in
the population use at least one of
the 32 drugs associated with the
studied genes on a daily basis.
36. CYP2D6 Loss of function mutation and adverse
drug reactions
1
2 2 22 2
4
22
4
2 2 2 2 22
3
22
L27.0
7.6.09 1.22.10 8.10.10 2.26.11 9.14.11 4.1.12 10.18.12 5.6.13 11.22.13 6.10.14 12.27.14 7.15.15
1-3. Metoprololum
4. Tamoxifenum
2 2
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1M60.8
9.14.11 4.1.12 10.18.12 5.6.13 11.22.13 6.10.14 12.27.14 7.15.15 1.31.16
1.Sertralinum
2.Venlafaxinum M60.8= Other myositis
1 1 1 1 11 1
T88.7 Y57.5
10.14.09 1.22.10 5.2.10 8.10.10 11.18.10 2.26.11 6.6.11 9.14.11 12.23.11 4.1.12 7.10.12
1. Escitalopramum T88.7 = Unspecified adverse effect of drug or
medicament
Y57.5 Drugs, medicaments and biological substances
causing adverse effects in therapeutic use
L27.0 = Generalized skin eruption due to drugs
and medicaments taken internally
37. PharmGKB
CPIC guidelines
We will determine genotypes of
Estonian Biobank participants for
variants in 8 genes that have
previously been shown to be
relevant in response of drugs with
certain active substances. They
will then receive a
pharmacogenetics report and
counselling based on peer-
reviewed and internationally
approved clinical practice
guidelines. More specific
prescribing decision support will
be provided for medical
professionals.
38. Pharmacogenetic feedback
• 33-y female with depression
• CYP2C19 slow metabolizer,
dose reduction to 50%
recommended
• Sertralin and escitalopram
formerly prescribed
• Both withdrawn, due to ADR –
agitation, aggressiveness,
pharyngitis, etc.
38
39.
40. Decision support tools (DST)
• Population scale genomics based on
implementing the PRS is the “instrument” for
disease prediction and prevention and this
should happen on the primary care level
• GP need support in order to implement the
new genomics based information
• The DST should be easy to use, but PRS
must base on the inform updated information
in the relevant database
41. Challenges and issues
• Awareness executives, doctors and patients
• New technologies and data empower patient
with more possibilities to manage own health
• Ethical issues
– Right to know and right not to know
– Treatable and non-treatable conditions
– Big data, cloud
• Knowledge about associations between DNA
variants and diseases is improving, need for
better databases
• Large work-load to keep database of known
risk variance updated
42. Conclusion
Large prospective biobank cohorts
make it possible to move towards
personalized genetic risk prediction
and to use it in general medical
practice – however, there are still
many challenges on this road
43. Tarkvara TAK (STACC)
Tõnu Esko, Krista Fischer, Reedik Mägi, Maris Alver, Kristi Läll, Kristi Krebs,
Tõnis Tasa, Mart Kals, Tom Haller, Neeme Tõnisson, Tiit Nikopensius,
Anu Reigo, Liis Leitsalu, Kristjan Metsalu, Kairit Mikkel, Mari-Liis Tammesoo …
Prof. Jaak Vilo, Hedi Peterson