11. SIFT
• Broadly used, relatively old (2001)
• Based solely on protein sequence (amino acid) conservation
1. Start from query protein sequence
2. Identify similar protein sequences (PSI-BLAST)
3. Multiple alignment of protein sequences (orthologs and paralogs)
4. Build an amino acid × residue position probability matrix (PSSM)
5. For every residue position, reweight the amino acid probabilities by the amino acid diversity at that position (sum of frequency rank * frequency)
→ Score: probability of observing the amino acid, normalized by residue conservation
Cut-off: 0.05 (based on case studies)
Predicting deleterious amino acid substitutions.
Ng PC, Henikoff S. Genome Res. 2001 May;11(5):863-74.
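The scoring idea can be illustrated for a single alignment column. This is a much-simplified sketch, not the SIFT implementation: it assumes raw observed frequencies (no pseudocounts, no sequence weighting, no diversity reweighting), and the function name and toy column are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CUTOFF = 0.05  # cut-off quoted on the slide

def column_scores(column):
    """Score each amino acid as its frequency divided by the frequency
    of the most common residue in this alignment column."""
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    top = max(counts.values())
    return {aa: counts.get(aa, 0) / top for aa in AMINO_ACIDS}

# Hypothetical column from a multiple alignment: 30 Leu, 9 Ile, 1 Val
column = "L" * 30 + "I" * 9 + "V"
scores = column_scores(column)

# Substitutions scoring below the cut-off are predicted deleterious
predicted_deleterious = sorted(aa for aa, s in scores.items() if s < CUTOFF)
```

Without pseudocounts, every amino acid never observed in the column scores 0 here; the real method is less severe because it smooths the observed frequencies.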
12. PolyPhen2
• Integrates multiple features
• 8 sequence-based, 3 structure-based (nucleotide and amino acid level)
(e.g. side chain volume change, overlap with PFAM domain, multiple alignment metrics)
• Supervised machine learning method (Naïve Bayes) → requires a training set
• Set 1: HumDiv
• Positive: damaging alleles for known Mendelian disorders (Uniprot)
• Negative: nondamaging differences between human proteins and related mammalian homologs
• Performance (5-fold cross-validation): (TP ~ 80%, FP ~ 10%), (TP ~ 90%, FP ~ 20%)
• Set 2: HumVar
• Positive: all human disease causing mutations (Uniprot)
• Negative: non-synonymous SNPs without disease association
→ Richer model than SIFT
→ More biased towards training set(s) than SIFT
A method and server for predicting damaging missense mutations.
Adzhubei IA, Schmidt S, Peshkin L, […], Bork P, Kondrashov AS, Sunyaev SR. Nat Methods. 2010 Apr;7(4):248-9.
13. CADD
• Intended as a measure of “deleteriousness” for coding and non-coding sequence,
not biased to known disease variation
• However, not particularly effective for non-coding regulatory sequence (see lecture)
• Supervised machine learning model (Linear SVM)
• Negative training set: nearly fixed human alleles that are variant when compared to the inferred human-chimp ancestral genome
• Positive training set: simulated variants based on mutation model aware of sequence context
and primate substitution rates
• Predictive features (63): VEP (Variant Effect Predictor) output, UCSC tracks, ENCODE tracks → includes missense predictions and nucleotide-level conservation
• Performance assessment: on pathogenic variants from ClinVar, performs a bit better than PhyloP for all sites and than PolyPhen/SIFT for missense coding variants
A general framework for estimating the relative pathogenicity of human genetic variants.
Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J. Nat Genet. 2014 Mar;46(3):310-5.
16. MutSigCV
• Goal: identify significantly mutated genes
→ Important to model the mutational background
• Tumour-specific global mutation rate
• Trinucleotide context and substitution
• Expression level (impacting transcription-coupled repair)
• Replication timing (later-replicating regions have higher mutation rates)
• Residual local genomic region mutation rate
Lawrence MS, ..., Getz G. Mutational heterogeneity in cancer and the
search for new cancer-associated genes. Nature 2013. PMID: 23770567
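A minimal sketch of why the background model matters, assuming a simple Poisson test of observed versus expected mutation counts. This is not the MutSigCV algorithm itself; the context classes, rates, site counts and the `gene_factor` multiplier (standing in for expression and replication-timing effects) are hypothetical.

```python
from math import exp

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    term = exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)

def expected_mutations(n_samples, sites_by_context, rate_by_context, gene_factor=1.0):
    """Expected mutation count in one gene across the cohort."""
    per_sample = sum(sites_by_context[c] * rate_by_context[c] for c in sites_by_context)
    return n_samples * per_sample * gene_factor

# Hypothetical gene: number of sites per sequence context and per-site rates
sites = {"CpG": 100, "other": 1400}
rates = {"CpG": 1e-5, "other": 1e-6}
observed = 5

# Same gene, tested against a flat background and against a background
# raised two-fold (e.g. low expression, late replication)
p_flat = poisson_sf(observed, expected_mutations(200, sites, rates))
p_adjusted = poisson_sf(observed, expected_mutations(200, sites, rates, gene_factor=2.0))
```

The same observed count is less surprising under the higher gene-specific background, so `p_adjusted` is larger than `p_flat`; ignoring such covariates inflates significance for genes with intrinsically high mutation rates.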
25. Fisher’s Exact Test (FET)
Fisher’s Exact Test: 2 x 2 Contingency Table
                Exp_positive=yes   Exp_positive=no
Gene-Set=yes           a                  b
Gene-Set=no            c                  d
Probability of one table occurring by random sampling (hypergeometric distribution formula):
p = [(a+b)! (c+d)! (a+c)! (b+d)!] / [a! b! c! d! n!], where n = a + b + c + d
Test p-value: sum of the random sampling probabilities of all tables
as extreme as, or more extreme than, the observed table
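The test can be sketched in a few lines using the a, b, c, d cell names of the contingency table. The one-sided (over-representation) definition of "more extreme" used here is an assumption, as are the function names and example counts.

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2 x 2 table with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_exact_greater(a, b, c, d):
    """One-sided p-value: probability of a table with at least `a`
    experimentally positive genes inside the gene-set."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p_value = 0.0
    for k in range(a, min(row1, col1) + 1):
        # Fill the other three cells so that all margins stay fixed
        p_value += table_probability(k, row1 - k, col1 - k, n - row1 - col1 + k)
    return p_value

# Example: 3 of 4 gene-set genes are positive, 1 of 4 other genes is positive
p = fisher_exact_greater(3, 1, 1, 3)  # (16 + 1) / 70
```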
26. The Background is Important!
• Inappropriate modeling of the background will lead to
incorrectly biased results
– What genes are detectable by the experiment? E.g.: in a kinase
phosphorylation assay, only kinases can be detected
– The Fisher’s Exact Test, GSEA and other tests assume all genes have
the same “prior” probability of being experimentally positive →
they can be used only in the absence of systematic selection biases
(example of bias: if you select genes with at least one mutation,
then longer genes are systematically more likely to be selected)
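The gene-length bias can be made concrete with a small calculation, assuming mutations arrive at a uniform per-base rate so that the count per gene is Poisson distributed. The rate and the two gene lengths are hypothetical.

```python
from math import exp

def prob_selected(coding_length_bp, rate_per_bp):
    """Probability that a gene carries at least one mutation,
    i.e. that it passes a 'one or more mutations' selection."""
    return 1.0 - exp(-rate_per_bp * coding_length_bp)

RATE = 1e-4  # hypothetical cohort-wide mutations per base pair

short_gene = prob_selected(1_000, RATE)   # about 0.095
long_gene = prob_selected(10_000, RATE)   # about 0.632
```

Under this assumption the longer gene is several times more likely to be selected regardless of its function, so gene-sets rich in long genes would look enriched under a test that treats all genes as equally likely to be positive.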
30. Gene-set Types
• Functions (e.g. Gene Ontology)
• Pathways (e.g. KEGG, Reactome)
• Genotype-phenotype/disease association (e.g. HPO)
• Protein Families / Domains (e.g. PFAM)
• Genomic position (e.g. cytobands)
• Gene expression signatures (e.g. MSigDB Cancer Hallmarks)
• Up/down after treatment or in relation to disease
• Targets of regulators
• Transcription factor targets
• miRNA targets
• Network-derived modules, e.g. protein-protein interactions
• Drug targets
31. Gene Ontology (GO) / 1
• Effort to standardize functional description of eukaryotic gene products
• Launched in 1998
• Many organism species supported
• Normal function (e.g. cell cycle), not disorder / disease (e.g. metastasis formation)
• Ontology defined by core team of curators who receive input from domain experts
• Corpus of gene annotations based on expert curation of the literature (> 140,000 published
papers in 2018), review of high-throughput data, or annotations in existing databases;
performed by curators at specific organism genome databases (human: UniProtKB)
38. Resources to Download Gene-sets
BaderLab (University of Toronto)
http://baderlab.org/GeneSets
• Gene Ontology; Reactome, Panther, NetPath, NCI, MSigDB C2 (Biocarta, ...), HumanCyc pathways; MSigDB cancer
hallmarks; MSigDB C3 (miRNA and TF targets)
• updated on a monthly basis
MSigDB (Broad Institute)
https://software.broadinstitute.org/gsea/msigdb/
• Gene Ontology; KEGG, Reactome, Biocarta, other pathways; cancer hallmarks; expression signatures; miRNA and TF
targets; interaction modules; Cytobands (positional)
• last update Oct 2017, several gene-set collections are derived from old research works (2004-2005)
Bioconductor org.Hs.eg.db
http://bioconductor.org/packages/release/data/annotation/html/org.Hs.eg.db.html
• Gene Ontology; KEGG pathways; PFAM (protein domains); Cytobands (positional)
• updated every 4 months
Notes:
• KEGG stopped being freely available in 2011, so freely available resources have largely outdated KEGG gene-sets
• Carefully check how GO annotations are exported (e.g. all evidence codes, or excluding IEA)
41. Visualization: Cytoscape Enrichment Map
• Visualization framework for gene-set
analysis results
• Cytoscape network: nodes correspond to
gene-sets, edges correspond to gene-set
overlaps (i.e. share a fraction of their genes)
• Intuitive clustering of gene-sets that
converge on the same functional themes
• Determined by automatic network layout
algorithm, based on edge weights
• Overlaps < threshold are pruned, otherwise
network layout would work poorly
• Important: don’t confuse with gene
networks
• Nodes do not represent genes, they represent
gene-sets/pathways
• Edges do not represent physical interactions, they
represent overlaps between gene-sets
[Figure: gene-sets A and B drawn as nodes; the edge between them represents their gene-set overlap]
Merico D, Isserlin R, Stueker O, Emili A, Bader GD.
Enrichment map: a network-based method for gene-set
enrichment visualization and interpretation.
PLoS One 2010. PMID: 21085593
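How edges arise from gene-set overlaps can be sketched as follows. The overlap coefficient and Jaccard index are common similarity choices; the threshold value, function names and toy gene-set memberships below are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Shared genes divided by the union of the two gene-sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_coefficient(a, b):
    """Shared genes divided by the size of the smaller gene-set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0

def build_edges(gene_sets, threshold=0.5, similarity=overlap_coefficient):
    """Return (set_a, set_b, weight) for every pair at or above the threshold."""
    edges = []
    for name_a, name_b in combinations(sorted(gene_sets), 2):
        weight = similarity(gene_sets[name_a], gene_sets[name_b])
        if weight >= threshold:  # weaker overlaps are pruned
            edges.append((name_a, name_b, weight))
    return edges

# Toy memberships, for illustration only
gene_sets = {
    "CELL CYCLE": {"CDK1", "CCNB1", "MCM3", "MCM7", "E2F1"},
    "DNA REPLICATION": {"MCM3", "MCM7", "POLA1"},
    "ECM-RECEPTOR": {"LAMC1", "ITGAV", "COL4A2"},
}
edges = build_edges(gene_sets)
```

Here only the two gene-sets sharing MCM3 and MCM7 are connected; a layout algorithm would then pull them together and leave the third gene-set as a separate node.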
51. GSEA: Gene-Set Enrichment Analysis
• Popular threshold-free gene-set test
• Identifies gene-sets enriched in top- or bottom-ranking genes
• Suggested use: typically as a competitive test (see permutation settings), taking
a ranked gene list as input
• Statistical test: empirical test based on permutations; includes permutation-based FDR
• The NES (normalized enrichment score) is a particularly valuable measure of
enrichment effect size for visualization
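A minimal sketch of the enrichment score underlying GSEA: a weighted running sum that increases at genes in the gene-set and decreases otherwise, with the score taken as the largest deviation from zero. Permutation-based significance and the normalization that yields the NES are not shown; the function name and toy ranking are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene-set.

    ranked_genes: genes ordered from top- to bottom-ranking
    scores: ranking metric of each gene, in the same order
    p: weight exponent applied to the ranking metric at hits
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** p / hit_total  # step up at a gene-set member
        else:
            running -= 1.0 / n_miss             # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best

genes = ["A", "B", "C", "D"]
metric = [4.0, 3.0, 2.0, 1.0]
es_top = enrichment_score(genes, metric, {"A", "B"})     # close to +1
es_bottom = enrichment_score(genes, metric, {"C", "D"})  # close to -1
```

A positive score indicates enrichment among top-ranking genes and a negative score enrichment among bottom-ranking genes.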
53. GSEA Permutation Settings
• The permutation setting completely changes the nature of the GSEA test
• Gene-set permutations (aka pre-ranked)
• Takes a ranked gene list as input and permutes the genes in the gene-sets
• → competitive
• Recommended in presence of differential gene expression data for small or medium-scale
experiments (2-4 biological replicates per condition) with modest expression heterogeneity
• Phenotype permutation
• Permute the phenotype labels (e.g. treated, untreated), then repeat gene scoring; gene
scoring is performed within GSEA
• → competitive / self-contained hybrid
• Recommended for larger scale gene expression data (> 10 biological replicates per condition)
with high expression heterogeneity
• As an alternative, consider a pure self-contained test, or a self-contained test with a different
competitive correction
55. OICR PanCuRx: Dataset Summary
• 200 primary tumours and 41 metastases (pancreatic cancer)
• Whole genome sequencing → detection of SNVs, indels, SVs, copy number gains and losses
• Mutation load outlier removal criterion: median + 2 IQR
→ Samples retained: 190/200 primaries and 41/41 metastases
[Figure, three panels: distributions of log10(SNV count), log10(indel count) and log10(SV count) in metastases (Met) vs primaries (Pri)]
Unpublished data
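The outlier criterion (median + 2 IQR) can be sketched as below. Whether it was applied to raw or log-transformed counts, and which quartile definition was used, is not stated; this sketch assumes an upper bound only and the default quartile method of Python's `statistics` module. The counts are made up.

```python
from statistics import median, quantiles

def remove_high_outliers(values, k=2.0):
    """Keep values at or below median + k * IQR; return (kept, cutoff)."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    cutoff = median(values) + k * (q3 - q1)
    kept = [v for v in values if v <= cutoff]
    return kept, cutoff

# Hypothetical mutation loads for seven samples
loads = [10, 12, 11, 13, 12, 11, 100]
kept, cutoff = remove_high_outliers(loads)
```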
56. OICR PanCuRx: Gene-set Analysis Strategy
1. Perform gene-set burden test, primaries vs metastases
• Logistic regression (metastases vs. primary), separating each variant type:
M0 = y ~ ns_tot + ms_tot + ss_tot + sv_tot + cL_tot + cG_tot
M1 = y ~ ns_tot + ms_tot + ss_tot + sv_tot + cL_tot + cG_tot +
ns_gs + ms_gs + ss_gs + sv_gs + cL_gs + cG_gs
• Multiple test correction by BH-FDR (significant when BH-FDR < 27.5%)
2. For significant gene-sets, categorize driver variant type(s) and extract genes
more often mutated in metastases for such variant types (“leading edge” genes)
3. Cluster pathways based on leading-edge gene overlaps, visualize using the Cytoscape
Enrichment Map plugin
4. Overlay key genes (even more stringent filter: mutation rate met/pri > 4.5)
5. Formulate hypotheses → correlation with other tumour properties
• RNA-seq based proliferation index (CCP) and missense mutations in cell cycle genes
Unpublished results;
Gallinger, PanCuRx TRI, Toronto
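The multiple test correction step can be sketched with a plain implementation of the Benjamini-Hochberg procedure. The p-values are made up; only the 27.5% threshold comes from the strategy above.

```python
def bh_fdr(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

gene_set_pvalues = [0.01, 0.04, 0.03, 0.005]  # hypothetical, one per gene-set
fdr = bh_fdr(gene_set_pvalues)
significant = [q < 0.275 for q in fdr]
```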
57. [Figure: enrichment map of the significant gene-sets, with a legend of driver variant types]
Gene-sets shown:
• REACT:TELOMERE MAINTENANCE
• REACT:ION CHANNEL TRANSPORT
• KEGG:BASE EXCISION REPAIR
• REACT:RESOLUTION OF ABASIC SITES (AP SITES)
• KEGG:MINERAL ABSORPTION
• REACT:CHROMOSOME MAINTENANCE
• REACT:BASE EXCISION REPAIR
• REACT:TRANSMEMBRANE TRANSPORT OF SMALL MOLECULES
• REACT:NUCLEOSOME ASSEMBLY
• REACT:HDACS DEACETYLATE HISTONES
• REACT:DEPOSITION OF NEW CENPA-CONTAINING NUCLEOSOMES AT THE CENTROMERE
• REACT:DNA REPLICATION PRE-INITIATION
• REACT:FORMATION OF THE BETA-CATENIN:TCF TRANSACTIVATING COMPLEX
• REACT:G2/M CHECKPOINTS
• KEGG:ECM-RECEPTOR INTERACTION
• REACT:CELL CYCLE, MITOTIC
• REACT:M/G1 TRANSITION
• REACT:G1/S TRANSITION
• REACT:MITOTIC METAPHASE AND ANAPHASE
• REACT:TRANSCRIPTION-COUPLED NUCLEOTIDE EXCISION REPAIR (TC-NER)
• REACT:GAP-FILLING DNA REPAIR SYNTHESIS AND LIGATION IN TC-NER
• KEGG:SEROTONERGIC SYNAPSE
• KEGG:GNRH SIGNALING PATHWAY
• KEGG:CIRCADIAN ENTRAINMENT
Driver variants (legend):
• Missense (gain and loss of function?)
• Nonsense + missense (loss of function?)
• Nonsense
• Nonsense + copy number loss
• Copy number gain
• Missense + SV (loss and gain of function?)
• Other combination
For all clusters, only variants driving the corresponding gene-sets
and with counts met >= pri are reported; given the number of met and pri
samples retained (41 vs 190), this corresponds to an enrichment ratio > 4.5
Unpublished results;
Gallinger, PanCuRx TRI, Toronto
58. [Figure: same enrichment map and driver variant legend as slide 57, annotated with the genes driving each cluster]
Cell cycle (cell cycle progression and checkpoints), DNA replication (polymerase, replication initiation,
replication fork complexes), chromosome maintenance and segregation (centromere components,
centrosome components, spindle checkpoint) – missense, sometimes also SV [labelled]
CDT1 (4,0): prevents initiation of replication when DNA replication is ongoing
POLA1 (1,0) : DNA polymerases [POLD1, POLD3 and other DNA polymerases listed only in repair cluster]
MCM8 (2,0), MCM3 (1,0), MCM10 (1,1), MCM7 (1,1): replication fork complex – [MCM10 in CCP]
CENPA (1,0), CENPL (1,0), CENPJ (1,1): centromere (chromosome segregation) – [CENPM, CENPF in CCP]
NCAPD3 (1,0), NIPBL (1,1): chromosome condensation and/or segregation
CEP57 (2,0), CEP152 (2,1), CNTRL (1,1): microtubule centrosome (chromosome segregation) – [CEP55 in CCP]
ERCC6L (2,1): spindle checkpoint; CKAP5 (2,2): spindle formation; CASC5/KNL1 (sv 1,0): kinetochore
E2F1 (1,0), E2F4 (1,0), TFDP1 (1,0; sv 1,1): TFs regulating cell cycle progression
ANAPC11 (1,0), ANAPC2 (1,0): anaphase promoting complex (cell cycle progression); FBXO5 (1,0; sv 1,0):
anaphase promoting complex inhibitor
ATM (sv 1,1), TP53BP1 (1,1): TP53 pathway and DNA damage response; HMG20B (1,0): DNA damage response
[histone and histone (de)acetylation listed for the separate subcluster]
Other: AHCTF1 (2,2; sv 2,1), B9D2 (1,0), BARD1 (1,0), GORASP1 (1,0), LEMD2 (1,0), NEDD1 (1,0), NUP205 (1,0),
NUP88 (1,1), NUP133 (1,1), PPP1R12A (1,0), PSMA3 (1,1), PSMD1 (1,1), SDCCAG8 (1,1), SGOL2 (1,1), TUBGCP5
(1,0), UBB (1,0), YWHAH (1,0), XPO1 (1,0), WRAP53 (sv 1,0), ZW10 (1,0)
DNA base excision repair – missense, sv
PARP1 (sv 1,0), PARP2 (ms 1,0), PARP4 (ms 1,0), POLD3 (sv 1,0), MPG (ms 1,0), RPA1 (sv 2,0),
RPA2 (1,0), TDG (ms 1,0)
Transcription-coupled nucleotide excision repair – only missense
COPS2 (ms 1,0), EP300 (ms 2,0), ERCC3 (ms 2,0), POLK (ms 1,0), UBB (ms 1,0)
Both – missense, sv
LIG3 (ms 1,0), POLD1 (ms 1,1; sv 1,0), XRCC1 (ms 1,0; sv 1,0)
Beta catenin pathway – only missense
CTNNB1 (2,2): beta catenin
TCF7L2 (2,0): TF that partners with CTNNB1 and
activates target genes
Extracellular matrix–receptor interactions
– only missense
LAMB4 (1,1), LAMC1 (1,1), LAMC2 (1,0)
COL4A2 (1,1), COL6A3 (2,2), COL9A2 (1,1), COL6A5
(4,2), HSPG2 (2,0)
COMP (1,0), TNR (1,1)
ITGA1 (2,0), ITGB4 (2,0), ITGB3 (2,0), ITGA2B (1,0),
ITGA11 (1,1), ITGAV (1,1)
CD47 (1,0), CD36 (1,0)
Histones and histone (de)acetylation
– only missense
HIST1H2BB (2,1), HIST1H2BD (1,0), HIST1H2BL (1,0),
HIST1H2BO (sv 1,0): transcriptional activation,
response to DNA damage and other processes
H2AFB1 (sv 2,1)
CHD4 (1,0): nucleosome remodeling and histone
deacetylase complex
EP300 (2,0): histone acetyltransferase recognizing
enhancers, involved in cell cycle, DNA damage
response, …
KAT5 (1,0): histone acetyltransferase
ARID4B (1,1): histone deacetylase
WHSC1 (1,1; sv 1,0): histone methyltransferase
NCOR1 (1,1), TBL1XR1 (1,1): nuclear receptor
corepressor (N-CoR) and histone deacetylase 3
(HDAC 3) complexes
Misc. signalling
– only cnGain
ITPR2 (2,0)
ALOX12 (1,0)
GNAS (1,0)
MAP3K3 (1,0)
PRKCG (1,0)
Misc. signalling
– only nonsense
ADCY2 (1,0)
ADCY10 (1,1)
GUCY1A3 (1,0)
RYR3 (2,1)
For all clusters, only variants driving the corresponding gene-sets
and with counts met >= pri are reported; given the number of met and pri
samples retained (41 vs 190), this corresponds to an enrichment ratio > 4.5
Unpublished results;
Gallinger, PanCuRx TRI, Toronto
59. All samples
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.42557 0.08620 4.937 2.39e-05 ***
# gsCC_ms_bin_stdz 0.14522 0.08739 1.662 0.1063
# gCDKN2ALOF_bin_stdz 0.16066 0.09449 1.700 0.0988 .
# vc_ms_tot_stdz 0.13934 0.08962 1.555 0.1298
Samples with <= 60 missense
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.31231 0.10020 3.117 0.00455 **
# gsCC_ms_bin_stdz 0.14051 0.09719 1.446 0.16068
# gCDKN2ALOF_bin_stdz 0.25673 0.11289 2.274 0.03181 *
# vc_ms_tot_stdz -0.09684 0.11187 -0.866 0.39489
Cell cycle missense x CDKN2A LOF (ns, sv, cL)
[Figure, two panels: RNA-based proliferation index (ccpRNAindex, axis from -3 to 2) for each combination of Met/Pri x CDKN2A LOF y/n x cell cycle missense y/n. Groups: Met_CDKN2Ay_CCMSy, Met_CDKN2Ay_CCMSn, Met_CDKN2An_CCMSy, Met_CDKN2An_CCMSn, Pri_CDKN2Ay_CCMSy, Pri_CDKN2Ay_CCMSn, Pri_CDKN2An_CCMSy, Pri_CDKN2An_CCMSn]
Unpublished results;
Gallinger, PanCuRx TRI, Toronto
63. General Tips for Gene-set Analysis / 2
• Choose a competitive or self-contained test
Competitive:
• requires meaningful gene selection or ranking → typically suitable for differential gene
expression or genes with significant mutation burden
• if analyzing other -omics, model the background distribution carefully; do not simply assume
Fisher’s Exact Test or GSEA will be suitable (e.g. use GREAT for ChIP-seq, etc.)
Self-contained:
• typically suitable for sparser mutations, when differences are significant at gene-set level only
• ensure that different sample groups are comparable, correct for confounders
• Proper visualization is important to interpret results and to identify issues
• Use a visualization solution such as Enrichment Map
• Visualize the full gene-set results, do not cherry-pick based on prior expectation
• Unexpected results can suggest issues (e.g. contamination, statistical bias)
65. [Figure: pathway enrichment analysis workflow]
1. Extract gene list from an 'omics experiment: either a gene list (e.g. TP53, PIK3CA) or a ranked list
2. Perform pathway enrichment analysis: outputs a table of pathways with P-value and Q-value (e.g. REGULATION OF INTERFERON-GAMMA-MEDIATED SIGNALING PATHWAY%GOBP%GO:0060334)
3. Visualize and interpret: Cytoscape EnrichmentMap, clusterMaker, WordCloud, AutoAnnotate
• Published on bioRxiv Jan 2017, provisionally accepted by Nature Protocols
• General concepts and resources
• Step-by-step instructions for gene-set analysis of gene expression data
69. Network Visualization: Automatic Layout
Before layout After layout
• Yeast proteins annotated to GO cellular component “chromosome”
• Colored based on sub-component (nucleosome, kinetochore, replication fork)
• The layout (force directed) meaningfully arranges nodes (genes/proteins) and edges (interactions)
Merico D, Gfeller D, Bader GD. How to visually interpret biological data
using networks. Nature Biotechnology 2009. PMID: 19816451
72. Networks vs Pathways
Pathways
• Hand-curated → more accurate
• Represent biochemical
reactions, or molecular events,
or regulatory relations among
proteins, protein complexes,
metabolites and other bio-
entities
Networks
• Derived from experimental high-throughput
methods or text
mining → noisier
• Represent simple relations
among genes (e.g. binds, is
similar to, is co-expressed with,
regulates)
• Cover a larger number of genes
73. Gene Network Resources
iRefWeb/iRefIndex wodaklab.org/iRefWeb
• Resource integrating different databases
• Mainly protein interactions
• Useful to explore specific interactions, or bulk download
GeneMANIA www.genemania.org
• Multiple networks available (including iRefIndex protein interactions)
• Useful to construct, visualize, and evaluate networks from “seed” genes (network propagation
algorithm)
STRING string-db.org
• Integrated network, based on algorithm for function prediction
• Protein interactions, pathway interactions, co-expression, etc.
74. Network Analysis Overview
Most common analysis types:
• Subnetwork construction from seed genes → GeneMANIA
• Network clustering / module finding → ClusterMaker2 (MCODE, MCL, …)
• Enriched sub-network identification → Reactome FI, HyperModules, HotNet
Other types of analysis:
• Network inference from expression data → ARACNE
• Pathway/network activity inference → SPIA, PARADIGM
• Overall analysis of network topology
• Motif identification, motif content analysis
75. Gene-set vs Network Analysis
• Gene-set pros
• Better coverage of genes and known biological processes / components
• Simple algorithms, a few well-established analysis options
• Gene-set cons
• Simple and flat structure, does not represent mechanistic details
• Pre-constructed based on “general biology”
• Network pros
• More structured, more insight on mechanistic details
• Can reveal new gene-gene associations
• Network cons
• More limited coverage of genes and known biological processes / components
• More complex algorithms, more analysis options
79. Reactome FIViz
Components:
• Functional Interaction (FI) Network
• Uses experimental protein interactions in human, protein interactions in model organisms,
and gene expression to predict “functional interactions”
• Positive set: pathway-based interactions from Reactome
• Subnetwork construction algorithm
• Classical: only direct connections, or additionally linkers
• HotNet: heat kernel
• Clustering Algorithm
• Edge-betweenness used to find “local interaction communities” in the sub-network
81. GeneMANIA or Reactome FIViz?
• GeneMANIA: start from experimental genes, construct a larger network of
related genes (without further using the same experimental data). Typically works
well when the initial genes form one cluster; when the genes are too diverse, it tends to
connect them through less specific hubs
• Reactome FIViz: start from experimental genes, inter-connect them using
functional interactions and potentially including some linker genes, cluster them
into modules