SlideShare a Scribd company logo
1 of 10
Download to read offline
BigML Education
Datasets
June 2017
BigML Education Program 2Datasets
In This Video
• Introduction
• Typical workflow: 1-click creation
• Purpose of datasets in BigML
• Exploration
• Pre-flight check
• Basic Features
• Other ways to create datasets
• Train/Test split
• More Exploration
• Advanced Features
• Filtering
• Feature engineering with Flatline
BigML Education Program 3Datasets
Sources
Introduction
BigML Education Program 4Datasets
What is a Dataset?
• Datasets are the fundamental building blocks
• Models, Clusters, etc. derive from datasets
• Sources can only become datasets
• Data exploration / Pre-flight check
• Missing/Errors
• Summary statistics
• Non-preferred fields
• Default objective for 1-click actions
BigML Education Program 5Datasets
Datasets
Basic Features
BigML Education Program 6Datasets
Dataset Features
• Immutable - “dataset/5943226f01440401bf0003bd”
• Creating Datasets
• From a source
• From a dataset: sampling, training/test
• From a batch output
• Dynamic scatterplot
BigML Education Program 7Datasets
Datasets
Advanced Features
BigML Education Program 8Datasets
Advanced Configuration
• Dataset Filtering
• Feature Engineering
BigML Education Program 9Datasets
Loan Status
Charged Off
Current
Default
Fully Paid
In Grace
Late (16-30)
Late (31-120)
Filter
Current
In Grace
Late (16-30)
Late (31-120)
Open
Charged Off
Default
Fully Paid
Closed
Engineer
Good
Bad
Quality
BigML Education Program 10Datasets
Summary
• Dataset Purpose
• Fundamental building block
• Pre-flight check: counts, histograms, scatterplot
• Creating dataset
• From source: 1-click and sampling
• Training / Test split
• From batch output
• From dataset: sampling, filtering, new features

More Related Content

Similar to BigML Education - Datasets

BigML Education - Introduction
BigML Education - IntroductionBigML Education - Introduction
BigML Education - IntroductionBigML, Inc
 
Citrination tutorial
Citrination tutorialCitrination tutorial
Citrination tutorialJoshua Tappan
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learnYoss Cohen
 
Supervised learning
Supervised learningSupervised learning
Supervised learningankit_ppt
 
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...Damian R. Mingle, MBA
 
Starting data science with kaggle.com
Starting data science with kaggle.comStarting data science with kaggle.com
Starting data science with kaggle.comNathaniel Shimoni
 
Data driven omaha_12112014
Data driven omaha_12112014Data driven omaha_12112014
Data driven omaha_12112014Daniel Murray
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
BigML Education - Evaluations
BigML Education - EvaluationsBigML Education - Evaluations
BigML Education - EvaluationsBigML, Inc
 
Designing High Quality Data Driven Solutions 110520
Designing High Quality Data Driven Solutions 110520Designing High Quality Data Driven Solutions 110520
Designing High Quality Data Driven Solutions 110520MariaHalstead1
 
Dt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-os
Dt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-osDt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-os
Dt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-osCERTyou Formation
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning Gopal Sakarkar
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big dataPoo Kuan Hoong
 
Doing Analytics Right - Building the Analytics Environment
Doing Analytics Right - Building the Analytics EnvironmentDoing Analytics Right - Building the Analytics Environment
Doing Analytics Right - Building the Analytics EnvironmentTasktop
 
BigML Education - Deepnets
BigML Education - DeepnetsBigML Education - Deepnets
BigML Education - DeepnetsBigML, Inc
 
Feature engineering
Feature engineeringFeature engineering
Feature engineeringSaurabhWani6
 
Curiosity Software and RCG Global Services Present - Solving Test Data: the g...
Curiosity Software and RCG Global Services Present - Solving Test Data: the g...Curiosity Software and RCG Global Services Present - Solving Test Data: the g...
Curiosity Software and RCG Global Services Present - Solving Test Data: the g...Curiosity Software Ireland
 
Cognitive Assistant for Data Scientists (CADS)
Cognitive Assistant for Data Scientists (CADS)Cognitive Assistant for Data Scientists (CADS)
Cognitive Assistant for Data Scientists (CADS)Steven Miller
 

Similar to BigML Education - Datasets (20)

Business Analytics Forum #BAF3
Business Analytics Forum #BAF3Business Analytics Forum #BAF3
Business Analytics Forum #BAF3
 
AI.pptx
AI.pptxAI.pptx
AI.pptx
 
BigML Education - Introduction
BigML Education - IntroductionBigML Education - Introduction
BigML Education - Introduction
 
Citrination tutorial
Citrination tutorialCitrination tutorial
Citrination tutorial
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
 
Supervised learning
Supervised learningSupervised learning
Supervised learning
 
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
Practical Data Science the WPC Healthcare Strategy for Delivering Meaningful ...
 
Starting data science with kaggle.com
Starting data science with kaggle.comStarting data science with kaggle.com
Starting data science with kaggle.com
 
Data driven omaha_12112014
Data driven omaha_12112014Data driven omaha_12112014
Data driven omaha_12112014
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
BigML Education - Evaluations
BigML Education - EvaluationsBigML Education - Evaluations
BigML Education - Evaluations
 
Designing High Quality Data Driven Solutions 110520
Designing High Quality Data Driven Solutions 110520Designing High Quality Data Driven Solutions 110520
Designing High Quality Data Driven Solutions 110520
 
Dt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-os
Dt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-osDt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-os
Dt812 g formation-infosphere-optim-test-data-management-and-data-masking-on-z-os
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
Doing Analytics Right - Building the Analytics Environment
Doing Analytics Right - Building the Analytics EnvironmentDoing Analytics Right - Building the Analytics Environment
Doing Analytics Right - Building the Analytics Environment
 
BigML Education - Deepnets
BigML Education - DeepnetsBigML Education - Deepnets
BigML Education - Deepnets
 
Feature engineering
Feature engineeringFeature engineering
Feature engineering
 
Curiosity Software and RCG Global Services Present - Solving Test Data: the g...
Curiosity Software and RCG Global Services Present - Solving Test Data: the g...Curiosity Software and RCG Global Services Present - Solving Test Data: the g...
Curiosity Software and RCG Global Services Present - Solving Test Data: the g...
 
Cognitive Assistant for Data Scientists (CADS)
Cognitive Assistant for Data Scientists (CADS)Cognitive Assistant for Data Scientists (CADS)
Cognitive Assistant for Data Scientists (CADS)
 

More from BigML, Inc

Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingBigML, Inc
 
DutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationDutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationBigML, Inc
 
DutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceDutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceBigML, Inc
 
DutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesDutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesBigML, Inc
 
DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector BigML, Inc
 
DutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionDutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionBigML, Inc
 
DutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLDutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLBigML, Inc
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLBigML, Inc
 
DutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyDutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyBigML, Inc
 
DutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorDutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorBigML, Inc
 
DutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsDutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsBigML, Inc
 
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsDutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsBigML, Inc
 
DutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleDutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleBigML, Inc
 
DutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIDutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIBigML, Inc
 
Democratizing Object Detection
Democratizing Object DetectionDemocratizing Object Detection
Democratizing Object DetectionBigML, Inc
 
BigML Release: Image Processing
BigML Release: Image ProcessingBigML Release: Image Processing
BigML Release: Image ProcessingBigML, Inc
 
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureMachine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureBigML, Inc
 
Machine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorMachine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorBigML, Inc
 
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotBigML, Inc
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...BigML, Inc
 

More from BigML, Inc (20)

Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in Manufacturing
 
DutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationDutchMLSchool 2022 - Automation
DutchMLSchool 2022 - Automation
 
DutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceDutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML Compliance
 
DutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesDutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective Anomalies
 
DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector
 
DutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionDutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly Detection
 
DutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLDutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in ML
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End ML
 
DutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyDutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven Company
 
DutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorDutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal Sector
 
DutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsDutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe Stadiums
 
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsDutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
 
DutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleDutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at Scale
 
DutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIDutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AI
 
Democratizing Object Detection
Democratizing Object DetectionDemocratizing Object Detection
Democratizing Object Detection
 
BigML Release: Image Processing
BigML Release: Image ProcessingBigML Release: Image Processing
BigML Release: Image Processing
 
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureMachine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
 
Machine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorMachine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail Sector
 
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
 

Recently uploaded

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 

Recently uploaded (20)

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

BigML Education - Datasets

  • 2. BigML Education Program 2Datasets In This Video • Introduction • Typical workflow: 1-click creation • Purpose of datasets in BigML • Exploration • Pre-flight check • Basic Features • Other ways to create datasets • Train/Test split • More Exploration • Advanced Features • Filtering • Feature engineering with Flatline
  • 3. BigML Education Program 3Datasets Sources Introduction
  • 4. BigML Education Program 4Datasets What is a Dataset? • Datasets are the fundamental building blocks • Models, Clusters, etc. derive from datasets • Sources can only become datasets • Data exploration / Pre-flight check • Missing/Errors • Summary statistics • Non-preferred fields • Default objective for 1-click actions
  • 5. BigML Education Program 5Datasets Datasets Basic Features
  • 6. BigML Education Program 6Datasets Dataset Features • Immutable - “dataset/5943226f01440401bf0003bd” • Creating Datasets • From a source • From a dataset: sampling, training/test • From a batch output • Dynamic scatterplot
  • 7. BigML Education Program 7Datasets Datasets Advanced Features
  • 8. BigML Education Program 8Datasets Advanced Configuration • Dataset Filtering • Feature Engineering
  • 9. BigML Education Program 9Datasets Loan Status Charged Off Current Default Fully Paid In Grace Late (16-30) Late (31-120) Filter Current In Grace Late (16-30) Late (31-120) Open Charged Off Default Fully Paid Closed Engineer Good Bad Quality
  • 10. BigML Education Program 10Datasets Summary • Dataset Purpose • Fundamental building block • Pre-flight check: counts, histograms, scatterplot • Creating dataset • From source: 1-click and sampling • Training / Test split • From batch output • From dataset: sampling, filtering, new features