1. Identification of overlapping biclusters using Probabilistic Relational Models Tim Van den Bulcke, Hui Zhao , Kristof Engelen, Bart De Moor, Kathleen Marchal
2.
3.
4.
5.
6.
7. Probabilistic Relational Models (PRMs) Patient Treatment Virus strain Contact Image: free interpretation from Segal et al. Rich probabilistic models
8.
9.
10.
11.
12.
13. Pro Bic biclustering model PRM model Database instance ground Bayesian network Gene Expression Condition ? (0 or 1) ? (0 or 1) g2 ? (0 or 1) ? (0 or 1) g1 B2 B1 ID ? (0 or 1) ? (0 or 1) c2 ? (0 or 1) ? (0 or 1) c1 B2 B1 ID 1.6 c1 g2 (missing value) c2 g1 0.5 c2 g2 -2.4 c1 g1 level c.ID g.ID g1.B 1 g1.B 2 level 1,1 c1.B 1 c1.B 2 g2.B 1 g2.B 2 level 2,2 c2.B 1 c2.B 2 level 2,1 c1.ID c2.ID Gene Expression Condition B 1 B 2 B 1 B 2 P(e.level | g.B 1 ,g.B 2 ,c.B 1 ,c.B 2 , c.ID) = Normal( μ g.B,c.B, c.ID , σ g.B,c.B,c.ID ) ID level
21. Results: noise sensitivity Precision (genes) Recall (genes) Precision (conditions) Recall (conditions) A B A B σ σ σ σ Precision = TP / (TP+FP) Recall = TP / (TP+FN) A B A B A … B …
Animate! Prm model: classes like gene Every class has attributes. E.g. gene can have entrez id, sequence, promoter sequence, part of bicluster y/n, … There are relations between classes (~ foreign keys in databases)
Capitals = over all elements of a set (capital G = over all genes g)
Precision = p.p.v. = TP / (TP+FP) Recall = sensitivity = TP / (TP+FN) For bicluster noise levels up to 140% of the background noise: genes: perfect identification conditions: good identification (conditions similar to the background are not identified at high noise levels).
Capitals = over all elements of a set (capital G = over all genes g)
PRM = BN with extra constraints on probability distributions, given by relational structure
PRM = BN with extra constraints on probability distributions, given by relational structure