2. CaImAn: Calcium Imaging Analysis
1. A Giovannucci, J Friedrich, P Gunn, J Kalfon, et al., “CaImAn: An open source tool for scalable Calcium Imaging data Analysis”, eLife
3. CaImAn: Calcium Imaging Analysis
4. PyCUB: Hidden Patterns of the Codon Usage Bias
1. J Kalfon, “PyCUB: A machine exploration of the Codon Usage Bias”, University of Kent.
2. Y Deng, J Kalfon, et al., “Hidden patterns of the Codon Usage Bias”, Nature Communications, in review
● GC content
● tRNA pool
● replication speed
● environment temperature
● nitrogen availability
● biased random mutations
5. PyCUB: Hidden Patterns of the Codon Usage Bias
● frequency measures
● deviation-to-reference measures
● entropy measures
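To make the frequency and entropy families of measures concrete, here is a minimal sketch. It is not PyCUB's implementation: the slides do not define the entropy measure, so this assumes a normalised Shannon entropy over the synonymous codons of each amino acid. The names `SYNONYMOUS`, `codon_counts`, `codon_frequencies`, `normalised_entropy` and `entropy_profile` are hypothetical, and the codon table is deliberately partial.

```python
import math
from collections import Counter

# Partial synonymous-codon table, for illustration only (standard genetic code).
SYNONYMOUS = {
    "Lys": ["AAA", "AAG"],
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def codon_counts(seq):
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


def codon_frequencies(counts, codons):
    """Frequency measure: relative usage of each synonymous codon, or None if unused."""
    total = sum(counts[c] for c in codons)
    if total == 0:
        return None
    return [counts[c] / total for c in codons]


def normalised_entropy(freqs):
    """Assumed entropy measure: Shannon entropy scaled to [0, 1].

    1 means all synonymous codons are used equally (no bias),
    0 means a single codon is used exclusively (maximal bias).
    """
    h = -sum(p * math.log2(p) for p in freqs if p > 0)
    return h / math.log2(len(freqs))


def entropy_profile(seq):
    """One entropy value per amino acid; None where the amino acid is absent."""
    counts = codon_counts(seq)
    profile = {}
    for amino_acid, codons in SYNONYMOUS.items():
        freqs = codon_frequencies(counts, codons)
        profile[amino_acid] = None if freqs is None else normalised_entropy(freqs)
    return profile


profile = entropy_profile("AAAAAGAAAAAGTTTTTTTTT")
```

In this toy sequence lysine uses both of its codons equally, while phenylalanine uses only TTT, so the two amino acids sit at opposite ends of the scale. A vector of such values per gene or per species is the kind of representation that can then be compared and clustered.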
6. PyCUB: Methods
● 500 species from Ensembl, Python pipeline, scikit-learn...
● Vector comparison
● Preprocessing (wide range of measures, ~20)
7. PyCUB: Methods
● Entropy → Force driving the CUB
● DBSCAN to cluster with outliers
● t-SNE & PCA to represent the data
● Modelling of the process
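As an illustration of what clustering "with outliers" means, the following is a minimal pure-Python sketch of the DBSCAN idea. It is not the PyCUB pipeline, which presumably relied on scikit-learn's `DBSCAN` class; the function name `dbscan`, the toy points and the parameter values are all assumptions made for the example.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for outliers."""
    n = len(points)
    # Precompute each point's eps-neighbourhood (it includes the point itself).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster and grow it
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is also core: keep expanding
    return labels


# Toy 2-D vectors standing in for per-gene CUB values (hypothetical data).
points = [
    (0.10, 0.10), (0.12, 0.11), (0.09, 0.13), (0.11, 0.08),  # dense group A
    (0.90, 0.90), (0.92, 0.88), (0.89, 0.91), (0.91, 0.93),  # dense group B
    (0.50, 3.00),                                            # isolated point
]
labels = dbscan(points, eps=0.1, min_samples=3)
# labels -> [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The appeal of this family of methods for noisy genomic data is that the number of clusters is not fixed in advance and isolated points are flagged rather than forced into a group.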
8. Results
❖ Specific distribution by species groups
❖ Importance of the sequence’s age
❖ Correlation to the sequence’s position
❖ Multiplicity of latent factors
❖ Most species have specific CUBs
9. Results
❖ Consistent results
❖ A Python package to analyse the CUB across species
❖ A new measure of the CUB with a fast computation time
10. Conclusion
❖ There is no single determinant of the CUB
❖ The entropy measure is a suitable one
❖ There is a specific distribution across genes (SLS)
Future research and ideas:
Using more big-data-specific approaches to analyse other, richer kingdoms.
Remarks:
➔ The data displayed many improbable sequences, homologies, etc.
➔ t-SNE made the driving mechanisms clearly visible
11. The things I loved
● Machine Learning / Data Science
● Genomics / multi-omics & visual data
● Understanding and modelling how cells work
● Translational applications in biomedicine
● Working with teams, freedom to explore and create
12. Computer Science + Biology = <3
1. see: statement of research objectives (jkobject.com)
2. VCF2ancestry, github/jkobject
Thank you!
goals > topics
🎉 reproducible research