Machine Learning
An Introduction
Automated Insights
Spam
You might like ...
The World
People you should follow...
People you may know...
People you may know...
Classifying
Clustering
Recommending
Classifying
Clustering
Recommending
Items / Users (graph diagram linking users to items, repeated across five slides)
Modeling
Similarity
Movies
Collaborative
How to represent our data?
Data
         User A   User B   User C
Item 1    1.0      3.0      5.0
Similarity?
         User A   User B   User C
Item 1    1.0      3.0      5.0
Item 2    2.0      5.0      2.0
Item 3    1.0      3.0      1.0
Euclidean Distance
Euclidean Distance
    q   1.0   2.0   1.0
    p   2.0   5.0   3.0
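The formula on this slide was an image and is not in the text. As a sketch, assuming the standard definition of Euclidean distance (which matches the four steps in the notes for this slide), the distance between the two vectors q and p shown above works out as:

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}

% Worked for q = (1.0, 2.0, 1.0) and p = (2.0, 5.0, 3.0):
d(p, q) = \sqrt{(1 - 2)^2 + (2 - 5)^2 + (1 - 3)^2}
        = \sqrt{1 + 9 + 4}
        = \sqrt{14} \approx 3.74
```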
Euclidean Distance
         User A User B User C    d
Item 1    1.0    3.0    5.0      4
Item 2    2.0    5.0    2.0     2.45
Item 3    1.0    3.0    1.0
Euclidean Distance

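(defn euclidean-distance
 "Euclidean distance from vector v to each row of matrix m."
 [v m]
 (let [num-of-rows (first (dim m))
       ;; repeat v once per row so it can be subtracted from m element-wise
       difference (minus (matrix (repeat num-of-rows v)) m)]
   ;; one distance per row: square root of that row's sum of squared differences
   (sqrt (map sum-of-squares difference))))
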
           Clojure #ftw
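The slide's function relies on matrix helpers (dim, minus, matrix, sum-of-squares) that are not defined in the deck. As a minimal sketch of the same calculation using only clojure.core, the version below reproduces the d column from the earlier table. The names distances-to-rows and ratings are hypothetical, introduced here for illustration.

```clojure
(defn euclidean-distance
  "Distance between two equal-length numeric vectors p and q."
  [p q]
  (Math/sqrt (reduce + (map (fn [a b] (let [d (- a b)] (* d d))) p q))))

(defn distances-to-rows
  "Distance from vector v to every row in rows (a vector of vectors)."
  [v rows]
  (mapv #(euclidean-distance v %) rows))

;; Ratings from the slides: one row per item, one column per user.
(def ratings
  {:item-1 [1.0 3.0 5.0]
   :item-2 [2.0 5.0 2.0]
   :item-3 [1.0 3.0 1.0]})

;; Distances from item 3 to items 1 and 2.
(distances-to-rows (:item-3 ratings)
                   [(:item-1 ratings) (:item-2 ratings)])
;; item 1 is 4.0 away; item 2 is sqrt(6), roughly 2.45 away
```

Plain vectors keep the sketch dependency-free; a matrix library would be the better choice once the data grows.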
Content Based
Distance
         User A   User B   User C
Item 1    1.0      3.0      5.0
Item 2    2.0      5.0      2.0
Item 3    1.0      3.0      1.0
Distance
         Feature A Feature B Feature C
Item 1      1.0       3.0       5.0
Item 2      2.0       5.0       2.0
Item 3      1.0       3.0       1.0
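One way feature columns like these could be produced for text items is to count words, as the notes suggest for documents. The sketch below is an illustration only: the function word-counts, the vocabulary, and the sample text are all hypothetical and do not come from the deck.

```clojure
(require '[clojure.string :as str])

(defn word-counts
  "Counts how often each vocabulary word occurs in text, in vocabulary order."
  [vocabulary text]
  (let [freqs (frequencies (str/split (str/lower-case text) #"\W+"))]
    (mapv #(get freqs % 0) vocabulary)))

;; Each vocabulary word becomes one feature column.
(word-counts ["cheap" "pills" "meeting"] "Cheap pills, cheap prices")
;; => [2 1 0]
```

The resulting vectors can be compared with the same distance calculation used for the user ratings.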
Classification Algorithm
k-nearest neighbours
Our Data
         A     B     C      d
Item 1   1.0   3.0   5.0    4
Item 2   2.0   5.0   2.0   2.45
Item 3   1.0   3.0   1.0
Our Model
                   A     B     C      d     Label
Trained   Item 1   1.0   3.0   5.0    4     Spam
Trained   Item 2   2.0   5.0   2.0   2.45   Ham
          Item 3   1.0   3.0   1.0
Our Model
                   Label    d
Trained   Item 1   Spam     4
Trained   Item 2   Ham     2.45
          Item 3
k-nn Classifier
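(defn knn-classify
 "Classifies vector xs by the most common label among its k nearest rows in m."
 [xs k m labels]
 ;; sorted-indexes and mode are not defined on this slide
 (let [sorted-labels (take k (map (partial nth labels)
                        (sorted-indexes (euclidean-distance xs m))))
       category (mode sorted-labels)]
   ;; if mode yields several tied labels, take the first
   (if (seq? category)
     (first category)
     category)))
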
              Clojure #ftw
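Because sorted-indexes and mode are not shown in the deck, here is a self-contained sketch of the same idea using only clojure.core. It is an assumption about how those helpers behave, not the author's implementation: rows is a plain vector of feature vectors rather than a matrix, the input is assumed to be non-empty, and ties between labels are broken arbitrarily.

```clojure
(defn euclidean-distance
  "Distance between two equal-length numeric vectors p and q."
  [p q]
  (Math/sqrt (reduce + (map (fn [a b] (let [d (- a b)] (* d d))) p q))))

(defn knn-classify
  "Returns the most common label among the k rows nearest to xs.
  rows is a vector of feature vectors; labels holds one label per row."
  [xs k rows labels]
  (let [by-distance (sort-by first
                             (map (fn [row label]
                                    [(euclidean-distance xs row) label])
                                  rows labels))
        nearest     (map second (take k by-distance))]
    ;; pick the label with the highest count among the k nearest
    (key (apply max-key val (frequencies nearest)))))

;; Item 3 from the slides, classified against items 1 and 2 with k = 1.
(knn-classify [1.0 3.0 1.0] 1
              [[1.0 3.0 5.0] [2.0 5.0 2.0]]
              ["Spam" "Ham"])
;; => "Ham"
```

With only two labelled items, k = 1 is the only setting that avoids a tie; with more data a larger k smooths out noisy neighbours.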
Evaluation
Our Model
                   Label    d
Trained   Item 1   Spam     4
Trained   Item 2   Ham     2.45
          Item 3
Our Model
                   Observed Label   Calculated Label
Trained   Item 1   Spam
Trained   Item 2   Ham
Test      Item 3   Ham              Ham
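Comparing observed and calculated labels over a test set can be summarised as a single accuracy figure. The sketch below is one simple way to do that; the function name accuracy and the four-item example are hypothetical, and only the single "Ham"/"Ham" pair comes from the slide.

```clojure
(defn accuracy
  "Fraction of test items whose calculated label equals the observed label."
  [observed calculated]
  (/ (count (filter true? (map = observed calculated)))
     (double (count observed))))

;; The single test item from the slide: observed Ham, calculated Ham.
(accuracy ["Ham"] ["Ham"])
;; => 1.0

;; A made-up larger test set with one wrong prediction out of four.
(accuracy ["Ham" "Spam" "Ham" "Spam"]
          ["Ham" "Spam" "Spam" "Spam"])
;; => 0.75
```

For numeric predictions such as movie ratings, the same comparison would use an error measure between estimated and actual scores instead of exact matches.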
kʼthx

Editor's notes

  2. Hope to show that it’s not too complicated, that it’s very interesting and potentially valuable, and that the various parts are quite similar. What kinds of things does machine learning cover?
  3. Increasing piles of data. Machine learning is complementary to data mining: evolve behaviours from empirical data.
  4. Classic classification.
  5. Product suggestions. List from my Kindle suggestions. Over 850,000 Kindle titles alone. Recommendations based on my purchases and content?
  6. Google employs all kinds of machine learning: query result ranking, news story clustering.
  7. 2 searches, one immediately after the other: one in Chrome, one in Safari. There’s a difference!?
  8. Social sites make use of recommendations. Instead of products it’s users to other users. This time it’s pretty good.
  9. Social sites make use of recommendations. Instead of products it’s users to other users.
  11. Going to cover a high-level description of these 3 topics, and then explore some of the details through a classification example.
  12. How much something is or isn’t part of a group. Assign class labels using a classifier built from predictor values.
  13. 16 things. We know there are 4 categories or labels. We want to automate the way we find a category for each thing.
  15. Clustering: group a large number of things into groups of similar things.
  16. 24 blobs, not sure what the categories are. Just want groups of similar things.
  17. We’ve got 4 categories.
  19. Let’s take an example of recommending items to users.
  20. 3 items and 2 users.
  21. We can see recommendations for items from those users.
  22. For example, the red user shares 2 items...
  23. with the blue user... We can use the blue user’s preferences to identify things that the red user would be interested in...
  24. And for things like Twitter and Facebook, these graphs would be users to users.
  25. This brings up an interesting point: how do we model the problem? The first thing we need to look at...
  26. I mentioned it quite a lot, but what does that mean?
  27. Interesting example. 2 films: how similar? Both star Jim Carrey.
  28. Collaborative filtering: based on the behaviour of multiple people (for example).
  31. How to measure similarity? We can calculate distance...
  32. One way is Euclidean distance. Similar to the Pythagorean formula for calculating the sides of triangles. What are q and p? ...
  33. p and q are our vectors. 1) First we calculate the difference between them, element by element. 2) Then we square those differences (so every term is non-negative). 3) We sum the squares. 4) We take the square root of the sum. So, let’s look at the results for our data.
  34. We can see that item 3 is closer to item 2 than to item 1. This shows in the ratings: items 2 and 3 have a similar shape across all users. How does this look in code?
  36. How about content-based calculations? Well, we break down the content into feature vectors.
  37. This is our previous matrix of user and item ratings. What do we swap users for?
  38. We swap them for features. For example, if items were documents, features may be the words in those documents. Movies might break down into running length, actors etc. Importantly, we measure similarity in the same way: with distance calculations. Let’s put this into practice.
  39. We’ve looked at how to represent data, and how to measure similarity. How do we turn that into an algorithm that can classify things?
  40. One really simple one is k-nearest neighbours: find the most common category among the k nearest items to our item.
  41. Our matrix from before shows the calculated distance of items 1 and 2 from item 3. But if we’re classifying, we need to know what the categories are!
  42. We’ve added the labels so we can see that item 1 was spam and item 2 was ham. Items 1 and 2 represent our trained model: data and their labels. Let’s drop the stuff we don’t need any more.
  43. We have just labels and distances from all other items to our new item. Back to our algorithm, k-nn. Method: find the most common label among the k nearest items to our item (in this case, item 3). So, given the above information, we’d classify it into the “Ham” category. If we had more data we’d just compare more neighbours. Time for some code ...
  44. xs is the vector we’re trying to classify; k is the number of nearest neighbours whose labels we take; m is our trained matrix of data; labels are the labels for the items in the matrix.
  45. All very well, but how do we know our model is accurately categorising things?
  46. Similar matrix to before. How can we use the empirical data to measure the effectiveness of the algorithm? We can take our data and consider part of it to be testing data...
  47. Item 3 now becomes our test data: we have a calculated label and an observed label. We can then measure how well they match. This is the same for rating movies (for example) as well: how close is our estimated score to the actual measured score? Anyway, that brings us to the end of a whistle-stop tour.