Marvin is an ambitious open-source project focused on helping teams deliver machine learning solutions in an agile way. The platform offers a standardized, language-agnostic architecture with high scalability and low latency, while simplifying the exploration and modeling process of AI projects.
4. B2W Digital: e-commerce leader in LatAm
Source: 2016 Results from ri.b2w.digital
Total GMV: R$ 12,458 MM
Market share: 26.2%
The Digital Platform that connects People, Businesses, Products and Services.
5. Outline
• Context
  • Data-driven culture
  • Artificial Intelligence
  • Domains of knowledge
• Problem Statement
• Marvin
  • Main components
  • Architecture
  • DASFE pattern
  • General features
• Case
• Roadmap
7. Context: data-driven culture
Single source of truth
Data dictionary
Broad data access
Data literacy
Decision making
Why is it important to be data-driven?
13. Problem statement
How can we abstract the complexity in the creation of an AI application?
Building AI projects is not a simple task. One is required to have advanced knowledge in different domains.
15. Marvin Artificial Intelligence Platform
Empowers data science teams to deliver AI applications, simplifying the process of exploration and modeling.
16. Marvin: main components
ENGINE EXECUTOR
ENGINE
Data acquisitor
Prediction preparator
Predictor
Training preparator
Feedback
Trainer
TOOLBOX
Evaluator
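The relationship between the engine executor and the engine components listed above can be sketched as a chain of steps that pass a shared state along. This is only an illustration: the function names mirror the slide labels, but the `EngineExecutor` class, the `run` method, the state dictionary and the toy dataset are assumptions made for this example and are not Marvin's actual API.

```python
# Hypothetical sketch: names and signatures are illustrative, not Marvin's API.
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]
Step = Callable[[State], State]


def data_acquisitor(state: State) -> State:
    # Toy dataset of (feature, label) pairs; one row has a missing feature.
    return {**state, "raw": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (None, 1.0)]}


def training_preparator(state: State) -> State:
    # Drop rows that cannot be used for training.
    clean = [(x, y) for x, y in state["raw"] if x is not None]
    return {**state, "dataset": clean}


def trainer(state: State) -> State:
    # Least-squares slope of a line through the origin: sum(x*y) / sum(x*x).
    num = sum(x * y for x, y in state["dataset"])
    den = sum(x * x for x, _ in state["dataset"])
    return {**state, "model": {"slope": num / den}}


def evaluator(state: State) -> State:
    # Mean absolute error of the fitted model on the training data.
    slope = state["model"]["slope"]
    errors = [abs(y - slope * x) for x, y in state["dataset"]]
    return {**state, "metrics": {"mae": sum(errors) / len(errors)}}


class EngineExecutor:
    """Runs engine steps in order, handing each step the previous state."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    def run(self, state: Optional[State] = None) -> State:
        current: State = dict(state or {})
        for step in self.steps:
            current = step(current)
        return current


batch_state = EngineExecutor(
    [data_acquisitor, training_preparator, trainer, evaluator]
).run()
print(batch_state["model"], batch_state["metrics"])
```

Keeping every component behind the same "state in, state out" shape is what lets an executor run them without knowing what each one does internally.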
19. Marvin: DASFE pattern
• Batch Data Acquisition & Cleaning
• Training Preparation
• Model Training
• Model Evaluation
20. Marvin: DASFE pattern
• Online Prediction Preparation
• Model Prediction
• Batch Data Acquisition & Cleaning
• Training Preparation
• Model Training
• Model Evaluation
21. Marvin: DASFE pattern
• Online Prediction Feedback
• Online Prediction Preparation
• Model Prediction
• Batch Data Acquisition & Cleaning
• Training Preparation
• Model Training
• Model Evaluation
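The way the online steps relate to the batch steps in the pattern above can be sketched with a toy engine: predictions are served from the last trained model, feedback is recorded online, and the next batch training run picks it up. The `ToyEngine` class, its method names and the mean-value "model" are assumptions made for illustration only; they are not taken from Marvin.

```python
# Hypothetical sketch of the batch / online / feedback loop; not Marvin's API.
from typing import Dict, List, Optional, Tuple


class ToyEngine:
    def __init__(self) -> None:
        self.model_mean: Optional[float] = None
        self.feedback_log: List[Tuple[str, float]] = []

    def train(self, labels: List[float]) -> float:
        # Batch path: train on acquired labels plus any recorded feedback.
        combined = labels + [observed for _, observed in self.feedback_log]
        self.model_mean = sum(combined) / len(combined)
        return self.model_mean

    def prediction_preparator(self, message: Dict[str, str]) -> Dict[str, str]:
        # Online path, step 1: validate and normalise the incoming message.
        if "order_id" not in message:
            raise ValueError("message must contain 'order_id'")
        return {"order_id": message["order_id"].strip()}

    def predictor(self, prepared: Dict[str, str]) -> float:
        # Online path, step 2: answer from the most recently trained model.
        if self.model_mean is None:
            raise RuntimeError("train() must run before predictions are served")
        return self.model_mean

    def predict(self, message: Dict[str, str]) -> float:
        return self.predictor(self.prediction_preparator(message))

    def feedback(self, message: Dict[str, str], observed: float) -> None:
        # Online path, step 3: keep the observed outcome for the next training.
        prepared = self.prediction_preparator(message)
        self.feedback_log.append((prepared["order_id"], observed))


engine = ToyEngine()
engine.train([10.0, 20.0])
first_prediction = engine.predict({"order_id": " a-1 "})
engine.feedback({"order_id": "a-1"}, observed=30.0)
retrained_mean = engine.train([10.0, 20.0])
print(first_prediction, retrained_mean)
```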
23. Marvin: general features
• Training pipeline REST interface
• Experiment and artifacts versioning
• Engine project scaffold generator
• Data sampling and import CLI
• Engine test framework (unit, functional, dryrun)
• Toolbox: Python support
• Artifacts persistence layer: HDFS support
• Remote provisioning and deployment
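To make "experiment and artifacts versioning" on top of a persistence layer more concrete, here is a minimal sketch that stores each artifact under a version derived from its content. Everything in it is an assumption for illustration: the `save_artifact` and `load_artifact` helpers, the directory layout, the JSON serialisation and the use of a local temporary directory as a stand-in for HDFS are not taken from Marvin.

```python
# Hypothetical sketch of content-addressed artifact versioning; not Marvin's API.
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Any


def save_artifact(root: Path, name: str, obj: Any) -> str:
    # Serialise deterministically so equal artifacts get the same version id.
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    version = hashlib.sha256(payload).hexdigest()[:12]
    target = root / name / version
    target.mkdir(parents=True, exist_ok=True)
    (target / "artifact.json").write_bytes(payload)
    return version


def load_artifact(root: Path, name: str, version: str) -> Any:
    # Read back exactly the artifact that was stored under this version.
    return json.loads((root / name / version / "artifact.json").read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)  # local stand-in for a remote persistence layer
    v1 = save_artifact(store, "model", {"slope": 2.0})
    v2 = save_artifact(store, "model", {"slope": 3.0})
    restored = load_artifact(store, "model", v1)
    print(v1 != v2, restored)
```

Deriving the version from the content means that re-saving an unchanged artifact does not create a new version, while any change to the model parameters does.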
25. Case: risk analysis model
• XGBoost in Python
• Dataset: 1.0 M orders
• Training pipeline: 15 min
• REST HTTP predictions: 15 ms
• Load test: 100 requests/s with 15 ms mean response time
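The figures in this case can be put in perspective with a little arithmetic. The sketch below assumes that "mrt" stands for mean response time and that the load test numbers are steady-state averages, so that Little's law (requests in flight = throughput × mean response time) applies. The variable names are made up for this example.

```python
# Back-of-the-envelope arithmetic on the case figures; variable names are illustrative.
requests_per_second = 100        # load test throughput from the slide
mean_response_time_ms = 15       # assumed meaning of "15 ms mrt"

# Little's law: average number of requests being served at any instant.
requests_in_flight = requests_per_second * mean_response_time_ms / 1000

orders_in_dataset = 1_000_000    # "1.0 M orders"
training_seconds = 15 * 60       # "Training pipeline: 15 min"

# Average number of orders processed per second by the training pipeline.
orders_per_second = orders_in_dataset / training_seconds

print(requests_in_flight)        # 1.5
print(round(orders_per_second))  # 1111
```

Under these assumptions, the prediction service handles on average fewer than two requests at once at the tested load.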
26. Case: how did Marvin help the team?
“... Jupyter notebook integration with Spark through Marvin’s toolbox lib was very helpful during prototyping phase...”
“... the data importation utility speeds up data collection and sampling...”
“... it was easy to do feature engineering, feature selection and model choice using the DASFE model...”
“... we automated the training and deployment phase without having a dev/ops in our team...”
28. Marvin: roadmap
• Admin module
• Toolbox: Java and Scala support
• Feedback server
• Artifacts persistence layer: S3 and local FS support
• Remote provisioning and deployment: Azure, AWS and GCP
• Customized notebook kernel
• Automated feature engineering
• Hyperparameter support
• ML for non-data scientists
• …
29. Artificial Intelligence Platform
Fork me on GitHub.com/marvin-ai
and feel free to contribute!
Thank you!
GitHub.com/marvin-ai
twitter.com/_marvin_ai
marvin-ai@googlegroups.com