SlideShare una empresa de Scribd logo
1 de 12
CVC TechParty


A practical Introduction to 
Machine Learning in Python


                         Piero Casale



           
CVC TechParty




www.ailab.si/orange/

                        
CVC TechParty



   Load-In Data and Basic Data Exploration
- Loading Data:
      iris = orange.ExampleTable('iris.tab')


- Exploring Features and Examples
       iris.domain.attributes
       iris.domain.classVar.name


- Basic Dataset Characteristics
       GetDatasetStatistics()


- Dataset Formats in Orange:
       csv, txt, xls


- Dataset as Python Lists:
      indexing, append, extend, native




                                 
CVC TechParty


                   Dataset Visualization
- Multi Dimensional Scaling:
     MultiDimensional Scaling Functions in orngMDS




                              
CVC TechParty



My First Classifier in Orange : Bayes




              
CVC TechParty



        My First Classifier in Orange : Bayes
- Loading Data:
        iris = orange.ExampleTable('iris.tab')


-   Declare the Learning Function:
        bayes = orange.BayesLearner()


- Train the Bayes Classifier on Data:
       BayesClassifier = bayes(iris)


- Classify new data:
       Prediction = bayesClassifier(newExample)


_ Example on Iris Dataset:
       exCodes.showBayes()



                                
CVC TechParty


        My (Second) Classifier in Orange :
                Decision Trees
- As before:
      import orngTree
      treeLearner = orngTree.TreeLearner()
      treeClassifier = treeLearner(iris)
      prediction = treeClassifier(newExample)


_ Measures for splitting : infoGain, gainRatio, gini
       treeLearner = orngTree.TreeLearner(measure='gini')


- Print the Tree:
  - on screen : orngTree.printTree(treeClassifier)
  - save as an image :
      orngTree.printDot(treeClassifier, fileName='tree.dot')
      dot -Tpng tree.dot -otree.png




                                
CVC TechParty



        Testing and Evaluating a Classifier
- Testing Functions in orngTest
      import orngTest
      learners = [bayesLearner, treeLearner]


- Make a 10 folds Cross Validation
      xv = orngTest.crossValidation(learners, data, folds=10)


- Scores Functions in orngStat
      import orngStat
      accuracy = orngStat.CA(xv)
      confusionMatrix = orngStat.cm(xv)


- Example on Iris Dataset using Bayes, DecisionTree and Knn.
      exCodes.crossValidate()




                               
CVC TechParty


                    Ensemble Methods
- Basic Ensemble Methods in orngEnsemble
      Bagging, Boosting and Random Forest
      import orngEnsemble


- Bagging of Decision Trees
      treeLearner = orngTree.TreeLearner()
      baggedTrees = orngEnsemble.BaggedLearner(treeLearner, t=10)


- Boosting of Decision Trees
      treeLearner = orngTree.TreeLearner()
      boostedTrees = orngEnsemble.BoostedLearner(treeLearner, t=10)


- Random Forest
      forest = orngEnsemble.RandomForestLearner(trees = 10)


- Example on Iris Dataset:
      exCodes.crossValidateEnsembles()


                             
CVC TechParty



                      Features Selection
- Functions for Features Selectoin in orngFSS
    import orngFSS
    vehicle = orange.ExampleTable('vehicle.tab')


-   Measuring Import of features with Information Gain
    measures = orngFSS.attMeasure(vehicle)
    TenBests = orngFSS.bestNAtts(measures,n=10)


-   Measuring Import of features with Gain Ratio
    gainRatio = orange.MeasureAttribute_gainRatio()
    measures = orngFSS.attMeasure(vehicle,gainRatio)
    fiveBests = orngFSS.bestNAtts(measures,n=5)


- Example on Vehicle Dataset:
       exCodes.measureAttributes()


                               
CVC TechParty

                             More.....
- Supervised Learning Algorithms:
           orngSVM,orngLR,orngC45
- Unsupervised Learning Algorithm :
           orngClustering
- Reinforcement Learning :
           orngReinforcement
- Outlier Detection :
           orngOutlier
- Discretization Functions :
           orngDisc




                          
CVC TechParty




      Enjoy.....
    More at www.ailab.si/orange




                                  Piero Casale



            

Más contenido relacionado

Similar a A practical Introduction to Machine Learning in Python

Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}Takashi J OZAKI
 
Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...
Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...
Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...Edureka!
 
Eclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectEclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectMatthew Gerring
 
Computational decision making
Computational decision makingComputational decision making
Computational decision makingBoris Adryan
 
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.Lviv Startup Club
 
Stat Design3 18 09
Stat Design3 18 09Stat Design3 18 09
Stat Design3 18 09stat
 
Scaling Deep Learning Algorithms on Extreme Scale Architectures
Scaling Deep Learning Algorithms on Extreme Scale ArchitecturesScaling Deep Learning Algorithms on Extreme Scale Architectures
Scaling Deep Learning Algorithms on Extreme Scale Architecturesinside-BigData.com
 
Using R on Netezza
Using R on NetezzaUsing R on Netezza
Using R on NetezzaAjay Ohri
 
2012 8 29 TAR Webinar Part 2 Sigler
2012 8 29 TAR Webinar Part 2 Sigler2012 8 29 TAR Webinar Part 2 Sigler
2012 8 29 TAR Webinar Part 2 SiglerSonya Sigler
 
DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...
DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...
DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...Dataconomy Media
 
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Red Hat Developers
 
Introduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at GalvanizeIntroduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at GalvanizeIntel Nervana
 
Training Large-scale Ad Ranking Models in Spark
Training Large-scale Ad Ranking Models in SparkTraining Large-scale Ad Ranking Models in Spark
Training Large-scale Ad Ranking Models in SparkPatrick Pletscher
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsGabriel Moreira
 
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTKStatistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTKOlivier Grisel
 
Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)Julien SIMON
 
Machine Learning and Go. Go!
Machine Learning and Go. Go!Machine Learning and Go. Go!
Machine Learning and Go. Go!Diana Ortega
 
Apache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyApache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyDatabricks
 

Similar a A practical Introduction to Machine Learning in Python (20)

Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}
 
Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...
Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...
Introduction To TensorFlow | Deep Learning with TensorFlow | TensorFlow For B...
 
Eclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectEclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science Project
 
Computational decision making
Computational decision makingComputational decision making
Computational decision making
 
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
 
Stat Design3 18 09
Stat Design3 18 09Stat Design3 18 09
Stat Design3 18 09
 
Scaling Deep Learning Algorithms on Extreme Scale Architectures
Scaling Deep Learning Algorithms on Extreme Scale ArchitecturesScaling Deep Learning Algorithms on Extreme Scale Architectures
Scaling Deep Learning Algorithms on Extreme Scale Architectures
 
Using R on Netezza
Using R on NetezzaUsing R on Netezza
Using R on Netezza
 
2012 8 29 TAR Webinar Part 2 Sigler
2012 8 29 TAR Webinar Part 2 Sigler2012 8 29 TAR Webinar Part 2 Sigler
2012 8 29 TAR Webinar Part 2 Sigler
 
DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...
DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...
DN 2017 | Multi-Paradigm Data Science - On the many dimensions of Knowledge D...
 
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
 
Introduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at GalvanizeIntroduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at Galvanize
 
C3 w2
C3 w2C3 w2
C3 w2
 
Training Large-scale Ad Ranking Models in Spark
Training Large-scale Ad Ranking Models in SparkTraining Large-scale Ad Ranking Models in Spark
Training Large-scale Ad Ranking Models in Spark
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive models
 
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTKStatistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
 
Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)
 
Machine Learning and Go. Go!
Machine Learning and Go. Go!Machine Learning and Go. Go!
Machine Learning and Go. Go!
 
Apache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyApache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise Company
 
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
 

Último

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 

Último (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

A practical Introduction to Machine Learning in Python

  • 3. CVC TechParty Load-In Data and Basic Data Exploration - Loading Data: iris = orange.ExampleTable('iris.tab') - Exploring Features and Examples iris.domain.attributes iris.domain.classVar.name - Basic Dataset Characteristics GetDatasetStatistics() - Dataset Formats in Orange: csv, txt, xls - Dataset as Python Lists: indexing, append, extend, native    
  • 4. CVC TechParty Dataset Visualization - Multi Dimensional Scaling: MultiDimensional Scaling Functions in orngMDS    
  • 5. CVC TechParty My First Classifier in Orange : Bayes    
  • 6. CVC TechParty My First Classifier in Orange : Bayes - Loading Data: iris = orange.ExampleTable('iris.tab') - Declare the Learning Function: bayes = orange.BayesLearner() - Train the Bayes Classifier on Data: BayesClassifier = bayes(iris) - Classify new data: Prediction = bayesClassifier(newExample) _ Example on Iris Dataset: exCodes.showBayes()    
  • 7. CVC TechParty My (Second) Classifier in Orange : Decision Trees - As before: import orngTree treeLearner = orngTree.TreeLearner() treeClassifier = treeLearner(iris) prediction = treeClassifier(newExample) _ Measures for splitting : infoGain, gainRatio, gini treeLearner = orngTree.TreeLearner(measure='gini') - Print the Tree: - on screen : orngTree.printTree(treeClassifier) - save as an image : orngTree.printDot(treeClassifier, fileName='tree.dot') dot -Tpng tree.dot -otree.png    
  • 8. CVC TechParty Testing and Evaluating a Classifier - Testing Functions in orngTest import orngTest learners = [bayesLearner, treeLearner] - Make a 10 folds Cross Validation xv = orngTest.crossValidation(learners, data, folds=10) - Scores Functions in orngStat import orngStat accuracy = orngStat.CA(xv) confusionMatrix = orngStat.cm(xv) - Example on Iris Dataset using Bayes, DecisionTree and Knn. exCodes.crossValidate()    
  • 9. CVC TechParty Ensemble Methods - Basic Ensemble Methods in orngEnsemble Bagging, Boosting and Random Forest import orngEnsemble - Bagging of Decision Trees treeLearner = orngTree.TreeLearner() baggedTrees = orngEnsemble.BaggedLearner(treeLearner, t=10) - Boosting of Decision Trees treeLearner = orngTree.TreeLearner() boostedTrees = orngEnsemble.BoostedLearner(treeLearner, t=10) - Random Forest forest = orngEnsemble.RandomForestLearner(trees = 10) - Example on Iris Dataset: exCodes.crossValidateEnsembles()    
  • 10. CVC TechParty Features Selection - Functions for Features Selectoin in orngFSS import orngFSS vehicle = orange.ExampleTable('vehicle.tab') - Measuring Import of features with Information Gain measures = orngFSS.attMeasure(vehicle) TenBests = orngFSS.bestNAtts(measures,n=10) - Measuring Import of features with Gain Ratio gainRatio = orange.MeasureAttribute_gainRatio() measures = orngFSS.attMeasure(vehicle,gainRatio) fiveBests = orngFSS.bestNAtts(measures,n=5) - Example on Vehicle Dataset: exCodes.measureAttributes()    
  • 11. CVC TechParty More..... - Supervised Learning Algorithms: orngSVM,orngLR,orngC45 - Unsupervised Learning Algorithm : orngClustering - Reinforcement Learning : orngReinforcement - Outlier Detection : orngOutlier - Discretization Functions : orngDisc    
  • 12. CVC TechParty Enjoy..... More at www.ailab.si/orange Piero Casale