MLCommons aims to accelerate machine learning to benefit everyone.
MLCommons will build a common set of tools for ML practitioners, including:
Benchmarks to measure progress: MLCommons will leverage MLPerf (built on DAWNBench) to measure speed, but will also expand benchmarking to other aspects of ML such as accuracy and algorithmic efficiency. ML models continue to increase in size and, consequently, cost. Sustaining growth in capability will require learning how to do more (accuracy) with less (efficiency).
Public datasets to fuel research: MLCommons' new People's Speech project seeks to develop a public dataset that, in addition to being larger than any other public speech dataset by more than an order of magnitude (86K hours of labeled speech), better reflects diverse languages and accents. Public datasets drive machine learning like nothing else; consider ImageNet's impact on the field of computer vision.
Best practices to accelerate development: MLCommons will make it easier to develop and deploy machine learning solutions by fostering consistent best practices. For instance, MLCommons' MLCube project provides a common container interface for machine learning models to make them easier to share, experiment with (including benchmarking), develop, and ultimately deploy.
MLCommons™ in 6 questions
1. What is MLCommons?
2. Why benchmarks?
3. Why datasets?
4. Why best practices?
5. What’s next?
6. How can I get involved?
Machine learning (ML) could benefit everyone:
● Information access
● Business productivity
● Health
● Safety
MLCommons is a new open engineering organization to create better ML for everyone
[Diagram: MLCommons sits at the intersection of open engineering organizations and AI/ML organizations]
MLCommons is supported by industry and academics
Academics from educational institutions including:
Harvard University
Indiana University
Polytechnique Montreal
Stanford University
University of California, Berkeley
University of Toronto
University of Tübingen
University of York, United Kingdom
Yonsei University
MLCommons is the work of many people... and many others contributing ideas and code...
MLCommons creates better ML through three pillars:
● Benchmarks
● Datasets
● Best practices
Together with research, these pillars produce better ML.
Benchmarks drive progress and transparency
"What gets measured, gets improved." — Peter Drucker
Benchmarking aligns research with development, engineering with marketing, and competitors across the industry in pursuit of the same clear objective.
MLCommons will host MLPerf™
Industry standard; drives progress and transparency
[Selected press coverage of MLPerf results]
MLPerf progress
Increasing breadth: benchmark suites added from 2018 to 2021:
● Training - HPC
● Training
● Inference - Datacenter
● Inference - Edge
● Inference - Mobile
● Inference - Tiny (IoT)
Improving technical approach:
New training/inference benchmarks:
● Recommendation: DLRM + 1TB dataset
● Medical imaging: U-Net
● Speech-to-text: RNN-T
Standardized methodology for Training:
● Optimizer definitions
● Hyperparameter definitions
● Convergence expectations (WIP)
Adding power measurement to Inference
Launched Mobile App (early alpha release)
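MLPerf Training scores time-to-train to a fixed quality target rather than raw throughput, which is why optimizer definitions, hyperparameter definitions, and convergence expectations must be standardized for results to be comparable. A toy sketch of that measurement style (all names and numbers are illustrative, not the real MLPerf harness):

```python
import time

def time_to_target(train_step, evaluate, target_quality, max_epochs=100):
    """Train until quality reaches the target; return (epochs, seconds).

    Mimics MLPerf Training's metric (time to a fixed quality target) in
    miniature; the real benchmark additionally pins optimizers,
    hyperparameters, and convergence rules.
    """
    start = time.perf_counter()
    for epoch in range(1, max_epochs + 1):
        train_step(epoch)
        if evaluate() >= target_quality:
            return epoch, time.perf_counter() - start
    raise RuntimeError("did not converge within max_epochs")

# Tiny stand-in "model": quality improves by a fixed step each epoch.
state = {"quality": 0.0}

def train_step(epoch):
    state["quality"] += 0.2  # one epoch of "training"

def evaluate():
    return state["quality"]

epochs, seconds = time_to_target(train_step, evaluate, target_quality=0.75)
print(epochs)  # reaches the 0.75 target on the 4th epoch
```

Because the metric is quality-gated, a faster system only scores better if it actually converges, which keeps speed claims honest.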
ML needs an ImageNet++ for everything
ImageNet: $300K → modern ML
~80% of research papers by leading ML companies cite public datasets
And ML innovation needs datasets that are:
● Large
● CC license or similar
● Redistributable
● Diverse
● Continually improving
But most public datasets are:
● Small
● Legally restricted
● Not redistributable
● Not diverse
● Static
MLCommons is starting with speech-to-text
Voice interfaces will reach most of Earth's 8 billion people by 2025. We need bigger datasets that support more diverse languages and accents.
[Chart: Earth's population grouped by native language; source: https://commons.wikimedia.org/wiki/File:List_of_languages_by_number_of_native_speakers.png]
People's Speech: 10 years of speech, CC-BY
[Diagram: dataset scope spans read text to conversation + noise, and English to 60+ other languages, with diverse languages/accents as future work]
● ~10 years of labeled speech (>10TB)
● CC-BY license (likely), redistributable
● Undergoing evaluation by MLCommons members
● Aiming for public release 1H2021
● Living dataset
ML has too much friction
Example: found an ML model you want to use?
● Interface (how do you even run it)?
● Software dependencies?
● Dataset?
● Platform compatibility?
All solved after a couple of days of hard work! And then it converges to 81.6% of claimed accuracy?
MLCube™ is a shipping container for ML models
Like cargo ships: complex infrastructure and complex contents, but a simple interface = low friction
MLCube makes it easier to share models
Basically, a Docker container with a consistent command line and metadata (really, an abstract interface for any container)
Simple runners for:
● Local machine
● Multiple clouds
● Kubernetes
Or incorporate into your own infrastructure
Learn more at:
https://github.com/mlcommons/mlcube
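As a rough sketch of what the MLCube interface looks like, a task definition might resemble the following (the field names, values, and image name here are illustrative assumptions, not copied from the project docs; see the repository above for the real schema):

```yaml
# mlcube.yaml -- hypothetical task definition for an MLCube
# (field names and values are illustrative, not the official schema)
name: mnist
description: Example MLCube wrapping a training script
tasks:
  train:
    parameters:
      inputs:
        data_dir: {type: directory, default: data/}
      outputs:
        model_dir: {type: directory, default: model/}
docker:
  image: example/mnist-mlcube:0.0.1  # hypothetical image name
```

A runner would then execute a task with a command along the lines of `mlcube run --task=train --platform=docker` (exact flags may differ by runner and version). The point is the consistency: every model exposes the same declared tasks, inputs, and outputs, so any runner can drive it.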
MLCommons Research
Algorithmic Research Working Group
● Benchmarks for algorithms to improve efficiency: better accuracy/compute
Medical Research Working Group
● Federated evaluation across distributed data: research ~= clinical practice
Scientific Research Working Group
● Better datasets and software for science
(Your idea here)
We welcome people who want to make ML better.
● Join our mailing list
● Attend community events
● Become a member (free for academics)
● Participate in working groups
● Submit benchmark results
Join us at mlcommons.org!