A short concluding speech for the AI & Topology session at the 2020 Applied Machine Learning Days (28 January 2020).
We remark on the strength of the case for using topological methods in various domains of machine learning. We then share our views on integrating topology with the practice of machine learning at a fundamental level. We give a (non-exhaustive) overview of the open-source landscape for topological machine learning and data analysis, including our contribution, the giotto-tda Python package. Finally, we mention some promising future directions in the field.
AI & Topology concluding remarks - "The open-source landscape for topology in machine learning and data analysis"
1. Closing remarks: The open-source landscape for topology in machine learning and data analysis
Umberto Lupo
Research Scientist, L2F SA
AI & Topology @ AMLD 2020
28 January 2020
“Where do I start?”
2. Why topological machine learning?
Topological methods focus on connectivity properties and are uniquely able to reveal structure.
Applying them to multiple domains provides new insights and state-of-the-art performance.
... And several other applications seen today! Potential for an algorithmic spring in machine learning!
Graph classification [1]
Feature space analysis / dimensionality reduction [2]
Time series analysis [3]
[1] M. Carrière et al., arXiv:1904.09378
[2] M. Moor et al., arXiv:1906.00722
[3] A. Myers et al., arXiv:1904.07403
3. Objective: Place topological learning algorithms firmly alongside established machine learning techniques.
ML ethos: Select the best combinations of techniques in a data-driven way. The best ones may well include a number of topological steps as part of a larger ML pipeline.
Featurization: Turn persistent homology (PH) information into features which are amenable to processing by ML algorithms. Possibilities: explicit vectorisations, learned representations (cf. Frédéric’s talk).
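To make the featurization step concrete, here is a minimal sketch using only the Python standard library. It is not how any of the libraries mentioned in this talk implement persistent homology: it handles only dimension 0 of a Vietoris-Rips filtration (via union-find), assumes the convention that an edge appears at a scale equal to its length, and the function names `h0_persistence` and `persistence_entropy` are illustrative choices.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Finite (birth, death) pairs of 0-dimensional persistent homology.

    Assumed convention: every point is born at scale 0, and an edge enters
    the Vietoris-Rips filtration at a scale equal to its length. A connected
    component dies when an edge merges it into another one (Kruskal-style).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    diagram = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            diagram.append((0.0, length))
    return diagram  # the single infinite bar is omitted


def persistence_entropy(diagram):
    """Shannon entropy of the normalised bar lengths: one scalar feature."""
    lifetimes = [death - birth for birth, death in diagram if death > birth]
    total = sum(lifetimes)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lifetimes)


# Toy point cloud: two well-separated pairs of points on a line.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
diagram = h0_persistence(cloud)

# An explicit vectorisation: a fixed-length feature vector per point cloud.
features = [persistence_entropy(diagram), max(death for _, death in diagram)]
```

The resulting `features` list has a fixed length regardless of the number of points, which is what makes it usable as input to a standard ML algorithm.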
Hyperparameters: Typically, several are involved in each choice of featurization technique. Even the choice of featurization (model choice) can be regarded as a hyperparameter in its own right.
Large-scale cross-validation routines: Must involve all hyperparameters and model choices at once, topological or not.
Topology and the ML workflow Example: Persistent Homology (PH)
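The idea that the featurization choice is itself a hyperparameter can be sketched as a tiny grid search. Everything below is an illustrative assumption: the toy persistence diagrams, the three featurizers, the `min_lifetime` noise threshold, the nearest-centroid classifier, and the leave-one-out scoring are stand-ins chosen to keep the example dependency-free. In practice one would more likely rely on a tool such as scikit-learn's `GridSearchCV`.

```python
from itertools import product

# Candidate featurizations of a list of bar lifetimes (hypothetical choices).
FEATURIZERS = {
    "total": lambda lifetimes: sum(lifetimes),
    "max": lambda lifetimes: max(lifetimes, default=0.0),
    "count": lambda lifetimes: float(len(lifetimes)),
}


def featurize(diagram, method, min_lifetime):
    """Drop bars shorter than min_lifetime, then summarise the rest."""
    lifetimes = [d - b for b, d in diagram if d - b >= min_lifetime]
    return FEATURIZERS[method](lifetimes)


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose mean training feature is closest to x."""
    centroids = {}
    for label in set(train_y):
        values = [v for v, l in zip(train_x, train_y) if l == label]
        centroids[label] = sum(values) / len(values)
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def loo_accuracy(diagrams, labels, method, min_lifetime):
    """Leave-one-out accuracy for one (method, min_lifetime) combination."""
    xs = [featurize(d, method, min_lifetime) for d in diagrams]
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == labels[i]
    return hits / len(xs)


# Toy data: class 0 diagrams contain one long bar, class 1 only short bars.
diagrams = [
    [(0.0, 5.0), (0.0, 0.2), (0.0, 0.3)],
    [(0.0, 6.0), (0.0, 0.1)],
    [(0.0, 5.5), (0.0, 0.2), (0.0, 0.1), (0.0, 0.3)],
    [(0.0, 0.3), (0.0, 0.2), (0.0, 0.4)],
    [(0.0, 0.5), (0.0, 0.1)],
    [(0.0, 0.2), (0.0, 0.3), (0.0, 0.2), (0.0, 0.1)],
]
labels = [0, 0, 0, 1, 1, 1]

# The grid mixes a model choice (featurizer) with a numeric hyperparameter.
grid = list(product(FEATURIZERS, [0.0, 0.25, 1.0]))
scores = {params: loo_accuracy(diagrams, labels, *params) for params in grid}
best_params = max(scores, key=scores.get)
```

The point of the design is that `method` and `min_lifetime` sit in the same grid and are scored by the same cross-validation loop, so topological and non-topological choices are compared on equal footing.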
4. “Backend” libraries for persistent homology:
• GUDHI (C++ & Python bindings/components)
• Ripser (C++)
• PHAT (C++)
• JavaPlex (Java)
• Dionysus 2 (C++)
• Aleph (C++)
• RIVET (C++ & Python bindings)
Toolkits in high-level programming languages:
• TDA (R)
• scikit-tda (Python)
• gda-public (Python)
• Eirene (Julia)
Visualization-oriented:
• TTK (VTK/C++, Python, ParaView plugins)
Mapper algorithm:
• TDAMapper (R)
• Tmap (Python)
Persistent homology and deep learning:
• TopologyLayer (Python)
• PersLay (Python)
Other topological algorithms:
• UMAP (Python)
• hdbscan clustering (Python)
• TdaToolbox (Python)
... And lots of “smaller” software projects linked to specific research papers, many by today’s speakers and collaborators!
The open-source landscape circa 2019
5. giotto-tda: Pillars
Seamless integration with widely used ML frameworks: inherit their strengths and allow for creation of heterogeneous ML pipelines. Python + scikit-learn
Code modularity: “Lego blocks” approach. Algorithms as transformers
User-friendliness and familiarity to the broad data science community. Strict adherence to scikit-learn API and developer guidelines, “fit-transform” paradigm
Standardisation: Allow for integration of most available techniques into a generic framework. Consistency of API across different modules
Performance within the language constraints. Vectorized code, parallelism (likely in future: just-in-time compilation and more)
Data structures: Support for time series, graphs, images.
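The “Lego blocks” and “fit-transform” ideas can be illustrated with a dependency-free sketch. The classes below (`SlidingWindow`, `WindowRange`, `Standardize`, `Chain`) are hypothetical stand-ins written for this example. They are not giotto-tda or scikit-learn classes, and the real packages offer far richer functionality. The sketch only shows the shared `fit`/`transform` contract that lets blocks be composed.

```python
from statistics import mean, pstdev


class SlidingWindow:
    """Stateless block: cut a time series into overlapping windows."""

    def __init__(self, width=3):
        self.width = width

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        return [X[i:i + self.width] for i in range(len(X) - self.width + 1)]


class WindowRange:
    """Stateless block: summarise each window by max - min."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [max(window) - min(window) for window in X]


class Standardize:
    """Stateful block: learns mean and standard deviation during fit."""

    def fit(self, X, y=None):
        self.mean_ = mean(X)
        self.std_ = pstdev(X) or 1.0  # fall back to 1.0 for constant input
        return self

    def transform(self, X):
        return [(x - self.mean_) / self.std_ for x in X]


class Chain:
    """Apply blocks in order, mimicking the idea of a pipeline."""

    def __init__(self, *steps):
        self.steps = steps

    def fit(self, X, y=None):
        for step in self.steps:
            X = step.fit(X, y).transform(X)
        return self

    def transform(self, X):
        for step in self.steps:
            X = step.transform(X)
        return X


series = [0.0, 1.0, 0.0, 1.0, 4.0, 0.0]
chain = Chain(SlidingWindow(width=3), WindowRange(), Standardize()).fit(series)
features = chain.transform(series)
```

Because every block exposes the same two methods, a topological step could be swapped in or out of the chain without touching the other blocks, which is the property the pillars above aim for.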
7. Open-source TML: The future?
• Further growth of performance-oriented projects: fast approximate calculations, HPC solutions, algorithmic breakthroughs.
• Strong community: Inclusive and ever closer to the broad ML community.
  ◦ Join Bastian’s awesome Slack community, tda-in-ml.slack.com!
• Standardized integration with deep learning.
  ◦ In progress for persistent homology: “backpropagation through topology”
  ◦ Good promise for: interpretability, generalization power, robustness, …