- The document discusses using protein features such as ELMs (eukaryotic linear motifs) and domains to help predict protein function, since the function of 40% of human proteins is unknown.
- It proposes integrating ELMs with other data, such as protein interaction networks, gene expression data, and literature co-mentions, to extract likely ELM-mediated interactions and to transfer functional associations between species.
- Challenges mentioned include reducing the high false positive rate and developing better ELM models and filters to improve the predictive power for protein function.
9. And now for something completely different: Protein association networks. Genomic Neighborhood; Species Co-occurrence; Gene Fusions; Database Imports; Experimental Interaction Data; Co-expression; Literature co-occurrence
11. Mining microarray expression databases: Re-normalize arrays by modern method to remove biases; Build expression matrix; Combine similar arrays by PCA; Construct predictor by Gaussian kernel density estimation; Calibrate against KEGG maps; Transfer associations across species
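The predictor step on this slide can be illustrated with a minimal sketch. It assumes that the score for a gene pair is the Pearson correlation of their expression profiles, and that calibration means estimating, with Gaussian kernel densities, how likely a pair with a given correlation is to share a KEGG map. The slide does not confirm either assumption. The function names, the bandwidth, the prior, and the toy scores are all hypothetical, and the normalization and PCA steps are left out.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / math.sqrt(vx * vy)

def gaussian_kde(samples, bandwidth):
    """Return a density function estimated with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)
    return density

def make_predictor(pos_scores, neg_scores, prior, bandwidth=0.1):
    """Map a correlation to P(same KEGG map | correlation) via Bayes' rule.

    pos_scores: correlations of gene pairs sharing a KEGG map (assumed labels)
    neg_scores: correlations of gene pairs on different maps (assumed labels)
    prior:      assumed fraction of pairs that share a map
    """
    f_pos = gaussian_kde(pos_scores, bandwidth)
    f_neg = gaussian_kde(neg_scores, bandwidth)
    def predict(score):
        p = prior * f_pos(score)
        q = (1 - prior) * f_neg(score)
        return p / (p + q) if p + q > 0 else prior
    return predict

# Toy calibration data, made up for illustration only.
pos = [0.70, 0.80, 0.90, 0.85]
neg = [-0.20, -0.10, 0.00, 0.10, 0.20]
predictor = make_predictor(pos, neg, prior=0.1)
```

Calling `predictor(pearson(profile_a, profile_b))` then gives a calibrated confidence for a new gene pair. With real data the bandwidth would need tuning, since it controls how smooth the calibration curve is.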
12. Co-mentioning in the scientific literature: Associate abstracts with species; Identify gene names in title/abstract; Count (co-)occurrences of genes; Test significance of associations; Calibrate against KEGG maps; Transfer associations across species
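The counting and significance steps on this slide can be sketched as follows. The slide does not say which statistical test is used, so the one-sided hypergeometric test here is an assumption, as are the function names and the toy abstracts. Gene name recognition is taken as already done: each abstract is represented as a set of gene names.

```python
import math
from collections import Counter
from itertools import combinations

def count_mentions(abstracts):
    """Count single mentions and co-mentions over abstracts.

    abstracts: iterable of sets of gene names found in each title/abstract.
    Returns (number of abstracts, per-gene counts, per-pair counts).
    """
    single, pair = Counter(), Counter()
    n_total = 0
    for genes in abstracts:
        n_total += 1
        genes = set(genes)
        single.update(genes)
        pair.update(frozenset(p) for p in combinations(sorted(genes), 2))
    return n_total, single, pair

def comention_pvalue(n_total, n_a, n_b, n_ab):
    """One-sided hypergeometric p-value (an assumed choice of test).

    Probability of seeing at least n_ab abstracts mentioning both genes if
    gene A (n_a abstracts) and gene B (n_b abstracts) were mentioned
    independently among n_total abstracts.
    """
    upper = min(n_a, n_b)
    tail = sum(math.comb(n_a, k) * math.comb(n_total - n_a, n_b - k)
               for k in range(n_ab, upper + 1))
    return tail / math.comb(n_total, n_b)

# Toy abstracts with hypothetical gene names, made up for illustration only.
abstracts = [
    {"geneA", "geneB"}, {"geneA", "geneB"}, {"geneA", "geneB", "geneC"},
    {"geneC"}, {"geneD"}, {"geneD", "geneC"}, {"geneE"}, {"geneE", "geneD"},
]
n_total, single, pair = count_mentions(abstracts)
p_ab = comention_pvalue(n_total, single["geneA"], single["geneB"],
                        pair[frozenset({"geneA", "geneB"})])
```

In a real pipeline the p-values would need correction for the very large number of gene pairs tested before being calibrated against KEGG maps.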