AI & Topology concluding remarks - "The open-source landscape for topology in machine learning and data analysis"


A short concluding speech for the AI & Topology session at the 2020 Applied Machine Learning Days (28 January 2020). We remark on the strength of the case for using topological methods in various domains of machine learning. We then give our views on integrating topology with the practice of machine learning at a fundamental level. We give a (non-exhaustive) overview of the open-source landscape for topological machine learning and data analysis, including our contribution, the giotto-tda Python package. Finally, we mention some promising future directions in the field.


Closing remarks: The open-source landscape for topology in machine learning and data analysis
Umberto Lupo
Research Scientist, L2F SA
AI & Topology @ AMLD 2020
28 January 2020
“Where do I start?”

Why topological machine learning?
Topological methods focus on connectivity properties and are uniquely able to reveal structure.
Applying them to multiple domains provides new insights and state-of-the-art performance:
• Graph classification [1]
• Feature space analysis / dimensionality reduction [2]
• Time series analysis [3]
... and several other applications seen today! Potential for an algorithmic spring in machine learning!
[1] M. Carrière et al., arXiv:1904.09378
[2] M. Moor et al., arXiv:1906.00722
[3] A. Myers et al., arXiv:1904.07403

Topology and the ML workflow
Example: Persistent Homology (PH)
Objective: Place topological learning algorithms firmly alongside established machine learning techniques.
• ML ethos: Select the best combinations of techniques in a data-driven way. The best ones may well include a number of topological steps as part of a larger ML pipeline.
• Featurization: Turn PH information into features that are amenable to processing by ML algorithms. Possibilities: explicit vectorisations, learned representations (cf. Frédéric’s talk).
• Hyperparameters: Each featurization technique typically involves several. Even the choice of featurization (model choice) can be regarded as a hyperparameter in its own right.
• Large-scale cross-validation routines: Must involve all hyperparameters and model choices at once, topological or not.
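To make the featurization step concrete, here is a minimal sketch in plain Python, not taken from the talk or from any of the libraries listed. It assumes the simplest possible setting: degree-0 persistent homology (connected components) of a Vietoris-Rips filtration on a point cloud, with the convention that an edge enters the filtration at a scale equal to its length. The function names h0_persistence, persistence_entropy and featurize are hypothetical, and the three features chosen (persistence entropy, total persistence, longest lifetime) are only examples of explicit vectorisations.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Finite (birth, death) pairs of degree-0 persistent homology.

    Assumption: Vietoris-Rips filtration in which an edge appears at a
    scale equal to its Euclidean length. Every component is born at 0
    and dies when it merges into another one (Kruskal-style union-find).
    The single component that never dies is omitted.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    diagram = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j          # two components merge ...
            diagram.append((0.0, length))    # ... so one of them dies here
    return diagram


def persistence_entropy(diagram):
    """Shannon entropy of the normalised lifetimes of a diagram."""
    lifetimes = [death - birth for birth, death in diagram if death > birth]
    total = sum(lifetimes)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lifetimes)


def featurize(points):
    """Map a point cloud to a fixed-length feature vector."""
    diagram = h0_persistence(points)
    lifetimes = [death - birth for birth, death in diagram]
    return [
        persistence_entropy(diagram),   # how evenly lifetimes are spread
        sum(lifetimes),                 # total persistence
        max(lifetimes, default=0.0),    # longest finite lifetime
    ]


# Two well-separated clusters: one long-lived component stands out.
cloud = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0)]
print(featurize(cloud))
```

The resulting vectors have the same length for every input cloud, so they can be fed to any standard classifier or regressor. Which features to compute, and with which parameters, is exactly the kind of choice the slide suggests treating as a hyperparameter in cross-validation.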

The open-source landscape circa 2019
“Backend” libraries for persistent homology:
• GUDHI (C++ & Python bindings/components)
• Ripser (C++)
• PHAT (C++)
• JavaPlex (Java)
• Dionysus 2 (C++)
• Aleph (C++)
• RIVET (C++ & Python bindings)
Toolkits in high-level programming languages:
• TDA (R)
• scikit-tda (Python)
• gda-public (Python)
• Eirene (Julia)
Visualization-oriented:
• TTK (VTK/C++, Python, ParaView plugins)
Mapper algorithm:
• TDAMapper (R)
• Tmap (Python)
Persistent homology and deep learning:
• TopologyLayer (Python)
• PersLay (Python)
Other topological algorithms:
• UMAP (Python)
• hdbscan clustering (Python)
• TdaToolbox (Python)
... and lots of “smaller” software projects linked to specific research papers, many by today’s speakers and collaborators!

giotto-tda: Pillars
• Seamless integration with widely used ML frameworks: inherit their strengths and allow for creation of heterogeneous ML pipelines. Python + scikit-learn.
• Code modularity: “Lego blocks” approach. Algorithms as transformers.
• User-friendliness and familiarity to the broad data science community. Strict adherence to scikit-learn API and developer guidelines, “fit-transform” paradigm.
• Standardisation: Allow for integration of most available techniques into a generic framework. Consistency of API across different modules.
• Performance within the language constraints. Vectorized code, parallelism (likely in future: just-in-time compilation and more).
• Data structures: Support for time series, graphs, images.
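The “Lego blocks” idea can be illustrated with a minimal sketch in plain Python. This is not giotto-tda code: the classes TimeDelayEmbedding, ComponentCount and Chain are hypothetical stand-ins that only imitate the fit/transform convention by duck typing. With scikit-learn itself one would instead subclass BaseEstimator and TransformerMixin and chain the steps with Pipeline. The sketch assumes that each input sample is a one-dimensional time series, and that the topological feature of interest is the number of connected components of the embedded point cloud at a fixed scale.

```python
import math
from itertools import combinations


class TimeDelayEmbedding:
    """Hypothetical block: time series -> point cloud of delay vectors."""

    def __init__(self, dimension=2, time_delay=1):
        self.dimension = dimension
        self.time_delay = time_delay

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        span = (self.dimension - 1) * self.time_delay
        return [
            [
                tuple(series[t + k * self.time_delay]
                      for k in range(self.dimension))
                for t in range(len(series) - span)
            ]
            for series in X
        ]


class ComponentCount:
    """Hypothetical block: point cloud -> number of connected components
    of the graph joining points at distance <= scale."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[self._count(cloud)] for cloud in X]

    def _count(self, cloud):
        parent = list(range(len(cloud)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        components = len(cloud)
        for i, j in combinations(range(len(cloud)), 2):
            if math.dist(cloud[i], cloud[j]) <= self.scale:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j
                    components -= 1
        return components


class Chain:
    """Hypothetical stand-in for a pipeline: apply the blocks in order."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y=None):
        for step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1].fit(X, y)
        return self

    def transform(self, X):
        for step in self.steps:
            X = step.transform(X)
        return X

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


series = [[0, 0, 0, 10, 10, 10]]
chain = Chain([TimeDelayEmbedding(dimension=2, time_delay=1),
               ComponentCount(scale=1.0)])
print(chain.fit_transform(series))
```

Because every block exposes the same two methods, blocks can be swapped or reordered without touching the rest of the chain, and their constructor arguments (dimension, time_delay, scale here) are natural candidates for a joint hyperparameter search.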

The giotto-tda team
… and many others!
github.com/giotto-ai/giotto-tda

Open-source TML: The future?
• Further growth of performance-oriented projects: fast approximate calculations, HPC solutions, algorithmic breakthroughs.
• Strong community: Inclusive and ever closer to the broad ML community.
  • Join Bastian’s awesome Slack community, tda-in-ml.slack.com!
• Standardized integration with deep learning.
  • In progress for persistent homology: “backpropagation through topology”
  • Good promise for: interpretability, generalization power, robustness, …

Thank you for attending AI & Topology!
