Machine learning


A brief overview of making recommendations using the K nearest neighbour algorithm and the Euclidean distance. Given at a Forward First Tuesday evening.


Data

        User A  User B  User C
Item 1  1.0     3.0     5.0

Similarity?
        User A  User B  User C
Item 1  1.0     3.0     5.0
Item 2  2.0     5.0     2.0
Item 3  1.0     3.0     1.0

Euclidean Distance

q 1.0 2.0 1.0
p 2.0 5.0 3.0

Euclidean Distance
        User A  User B  User C  d (distance to Item 3)
Item 1  1.0     3.0     5.0     4
Item 2  2.0     5.0     2.0     2.45
Item 3  1.0     3.0     1.0

Euclidean Distance

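;; Assumes Incanter is loaded, e.g. (use '(incanter core stats)); the slide omits this.
(defn euclidean-distance
  "Distance from vector v to each row of matrix m."
  [v m]
  (let [num-of-rows (first (dim m))
        ;; repeat v once per row of m, then subtract m row by row
        difference  (minus (matrix (repeat num-of-rows v)) m)]
    (sqrt (map sum-of-squares difference))))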

Clojure #ftw

Distance
        User A  User B  User C
Item 1  1.0     3.0     5.0
Item 2  2.0     5.0     2.0
Item 3  1.0     3.0     1.0

Distance
        Feature A  Feature B  Feature C
Item 1  1.0        3.0        5.0
Item 2  2.0        5.0        2.0
Item 3  1.0        3.0        1.0

Our Data
        A    B    C    d (distance to Item 3)
Item 1  1.0  3.0  5.0  4
Item 2  2.0  5.0  2.0  2.45
Item 3  1.0  3.0  1.0

Our Model
                 A    B    C    d     Label
Trained  Item 1  1.0  3.0  5.0  4     Spam
Trained  Item 2  2.0  5.0  2.0  2.45  Ham
         Item 3  1.0  3.0  1.0

Our Model
                 Label  d
Trained  Item 1  Spam   4
Trained  Item 2  Ham    2.45
         Item 3

k-nn Classifier
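;; sorted-indexes is not shown on the slides; presumably it returns row
;; indexes ordered by ascending distance.
(defn knn-classify
  [xs k m labels]
  (let [sorted-labels (take k (map (partial nth labels)
                                   (sorted-indexes (euclidean-distance xs m))))
        category      (mode sorted-labels)]
    ;; if mode returns a sequence (presumably on a tie), take its first element
    (if (seq? category)
      (first category)
      category)))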

Clojure #ftw

Our Model
                 Observed Label  Calculated Label
Trained  Item 1  Spam
Trained  Item 2  Ham
Test     Item 3  Ham             Ham


Machine learning

1. Machine Learning

2. Machine Learning: An Introduction

3. Automated Insights

4. Spam

5. You might like ...

6. The World

9. People you should follow...

10. People you may know...

11. People you may know...

12. Classifying Clustering Recommending

13. Classifying

14.

15.

16. Clustering

17.

18.

19.

20. Recommending

26. Modeling

27. Similarity

28. Movies

29. Collaborative

30. How to represent our data?

31. Data

32. Similarity?

33. Euclidean Distance

34. Euclidean Distance

35. Euclidean Distance

36. Euclidean Distance

37. Content Based

38. Distance

39. Distance

40. Classification Algorithm

41. k-nearest neighbours

42. Our Data

43. Our Model

44. Our Model

45. k-nn Classifier

46. Evaluation

47. Our Model

48. Our Model

49. kʼthx

Editor's notes

Hope to show that it's not too complicated, very interesting, potentially valuable, and that the various parts are quite similar. What kinds of things does machine learning cover?
Increasing piles of data. Machine learning is complementary to data mining: evolve behaviours from empirical data.
Classic classification.
Product suggestions. List from my Kindle suggestions. Over 850,000 Kindle titles alone. Recommendations based on my purchases and content?
Google employs all kinds of machine learning: query result ranking, news story clustering.
2 searches, one immediately after the other: one in Chrome, one in Safari. There's a difference!?
Social sites make use of recommendations. Instead of products, it's users to other users. This time it's pretty good.
Social sites make use of recommendations. Instead of products, it's users to other users.
Going to cover a high-level description of these 3 topics, and then explore some of the details through a classification example.
How much something is or isn't part of a group. Assign class labels using a classifier built from predictor values.
16 things. We know there are 4 categories or labels. We want to automate the way we find a category for each thing.
Clustering: group a large number of things into groups of similar things.
24 blobs, not sure what the categories are. We just want groups of similar things.
We've got 4 categories.
Let's take an example of recommending items to users.
3 items, and 2 users.
We can see recommendations for items from those users.
For example, the red user shares 2 items...
with the blue user... We can use the blue user's preferences to identify things that the red user would be interested in...
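The idea in this note can be sketched in a few lines of Clojure. This is only an illustration, not code from the talk: the `likes` data, the item names and the functions `shared-items` and `recommend-from` are all made up for the example. It suggests to one user the items that a similar user has liked and the first user has not.

```clojure
(require '[clojure.set :as set])

;; Hypothetical data: the items each user has liked.
(def likes
  {:red  #{:item-1 :item-2}
   :blue #{:item-1 :item-2 :item-3}})

(defn shared-items
  "Items that both users have liked."
  [likes a b]
  (set/intersection (likes a) (likes b)))

(defn recommend-from
  "Items that `neighbour` has liked and `user` has not."
  [likes user neighbour]
  (set/difference (likes neighbour) (likes user)))

(shared-items likes :red :blue)   ; the two items red and blue share
(recommend-from likes :red :blue) ; => #{:item-3}
```

A real recommender would first pick the neighbour by some similarity measure rather than naming it by hand.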
And for things like Twitter and Facebook, these graphs would be users to users.
This brings up an interesting point: how do we model the problem? The first thing we need to look at...
I mentioned it quite a lot, but what does that mean?
Interesting example. 2 films: how similar? Both star Jim Carrey.
Collaborative filtering: based on the behaviour of multiple people (for example).
How to measure similarity? We can calculate distance...
One way is Euclidean distance. It is similar to the Pythagorean formula for calculating the sides of triangles. What are q and p? ...
p and q are our vectors:
1) we first calculate the difference
2) then square those (ensuring all the numbers have the same sign)
3) we sum the squares
4) we take the square root of the sum
So, let's look at the results for our data.
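The four steps can be written out directly in plain Clojure. This is a minimal sketch for two vectors, separate from the matrix-based function on the slide; the name `euclidean` is chosen for this example only.

```clojure
(defn euclidean
  "Euclidean distance between two equal-length vectors."
  [p q]
  (let [differences (map - p q)                ; 1) difference per position
        squares     (map #(* % %) differences) ; 2) square each difference
        total       (reduce + squares)]        ; 3) sum the squares
    (Math/sqrt total)))                        ; 4) square root of the sum

;; p and q from the slide: differences (1 3 2), squares (1 9 4), sum 14
(euclidean [2.0 5.0 3.0] [1.0 2.0 1.0]) ; sqrt(14), about 3.74
```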
We can see that item 3 is closer to item 2 than to item 1. This can be seen from the ratings for items 2 and 3: across all users they have a similar shape. How does this look in code?
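As a check on those figures, the distances can be recomputed from the ratings table in plain Clojure. This is a sketch under assumptions: the `ratings` map layout and the helper `distances-to` are invented for the example and are not how the talk's Incanter code stores the data.

```clojure
(defn euclidean [p q]
  (Math/sqrt (reduce + (map (fn [a b] (let [d (- a b)] (* d d))) p q))))

;; Ratings from the slides, one row per item (User A, User B, User C).
(def ratings
  {"Item 1" [1.0 3.0 5.0]
   "Item 2" [2.0 5.0 2.0]
   "Item 3" [1.0 3.0 1.0]})

(defn distances-to
  "Map of every other item name to its distance from `target`."
  [ratings target]
  (into {}
        (for [[item row] ratings
              :when (not= item target)]
          [item (euclidean row (ratings target))])))

(distances-to ratings "Item 3")
;; Item 1 is 4.0 away; Item 2 is sqrt(6), about 2.45, matching the slide
```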
How about content-based calculations? Well, we break down the content into feature vectors.
This is our previous matrix of user and item ratings. What do we swap users for?
We swap them for features. For example, if items were documents, features may be the words in those documents. Movies might break films down into running length, actors etc. Importantly, we measure similarity in the same way: with distance calculations. Let's put this into practice.
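For the document case, one simple way to get feature vectors is to count words. The sketch below assumes a plain word-count representation, which the talk does not specify; the sample documents and the functions `words`, `vocabulary` and `feature-vector` are hypothetical.

```clojure
(require '[clojure.string :as str])

(defn words
  "Lower-cased alphabetic words in a document."
  [doc]
  (re-seq #"[a-z]+" (str/lower-case doc)))

(defn vocabulary
  "Sorted vector of every distinct word across the documents."
  [docs]
  (vec (sort (distinct (mapcat words docs)))))

(defn feature-vector
  "Count of each vocabulary word in doc, in vocabulary order."
  [vocab doc]
  (let [counts (frequencies (words doc))]
    (mapv #(get counts % 0) vocab)))

;; Made-up documents; each becomes one row of features.
(def docs ["cheap pills cheap" "lunch at noon" "cheap lunch"])
(def vocab (vocabulary docs)) ; ["at" "cheap" "lunch" "noon" "pills"]

(mapv #(feature-vector vocab %) docs)
;; => [[0 2 0 0 1] [1 0 1 1 0] [0 1 1 0 0]]
```

Each row can then be compared with the same distance calculation used for the ratings.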
We've looked at how to represent data, and how to measure similarity. How do we turn that into an algorithm that can classify things?
One really simple one is k-nearest neighbours: find the most common category for our item among the k nearest items.
Our matrix from before shows the calculated distance of items 1 and 2 from item 3. But if we're classifying, we need to know what the categories are!
We've added the labels so we can see that item 1 was spam and item 2 was ham. Items 1 and 2 represent our trained model: data and their labels. Let's drop the stuff we don't need any more.
We have just labels and distances from all other items to our new item. Back to our algorithm, k-NN. Method: find the most common label among the k nearest items to our item (in this case item 3). So, given the above information, we'd classify it into the "Ham" category. If we had more data we'd just compare more neighbours. Time for some code ...
xs is the vector we're trying to classify
k is the number of nearest neighbours whose labels we'll consider
m is our trained matrix of data
labels are the labels for the items in the matrix
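The same four parameters can be used in a version that runs without any library. This is a sketch in plain Clojure, not the Incanter code from the slide: `knn-classify*` is a hypothetical name, rows are ordinary vectors rather than a matrix, and a tie between labels is broken arbitrarily.

```clojure
(defn euclidean [p q]
  (Math/sqrt (reduce + (map (fn [a b] (let [d (- a b)] (* d d))) p q))))

(defn knn-classify*
  "Most common label among the k rows of m nearest to xs.
  Ties are broken arbitrarily."
  [xs k m labels]
  (let [nearest (->> (map vector m labels)      ; pair each row with its label
                     (sort-by (fn [[row _]] (euclidean xs row)))
                     (take k)
                     (map second))]
    (key (apply max-key val (frequencies nearest)))))

(def m      [[1.0 3.0 5.0] [2.0 5.0 2.0]]) ; Items 1 and 2 from the slides
(def labels ["Spam" "Ham"])

(knn-classify* [1.0 3.0 1.0] 1 m labels) ; => "Ham"
```

With k set to 1, Item 3 takes the label of its single nearest neighbour, Item 2, which agrees with the slide.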
All very well, but how do we know our model is accurately categorising things?
Similar matrix to before. How can we use the empirical data to measure the effectiveness of the algorithm? We can take our data and consider part of it to be testing data...
Item 3 now becomes our test data: we have a calculated label and an observed label. We can then measure how well they match. This is the same for rating movies (for example) as well: how close is our estimated score to the actual measured score? Anyway, that brings us to the end of a whistle-stop tour.
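The talk does not name a metric for "how well we match", so the sketch below assumes two common choices: accuracy for labels and root mean squared error for numeric scores. The function names and the sample ratings are hypothetical.

```clojure
(defn accuracy
  "Fraction of positions where the calculated label equals the observed one."
  [observed calculated]
  (/ (count (filter true? (map = observed calculated)))
     (double (count observed))))

(defn rmse
  "Root mean squared error between observed and estimated scores."
  [observed estimated]
  (Math/sqrt (/ (reduce + (map (fn [o e] (let [d (- o e)] (* d d)))
                               observed estimated))
                (double (count observed)))))

(accuracy ["Ham"] ["Ham"])              ; => 1.0, the single test item on the slide
(accuracy ["Ham" "Spam"] ["Ham" "Ham"]) ; => 0.5
(rmse [4.0 2.0] [3.0 2.0])              ; made-up ratings; sqrt(0.5), about 0.71
```

One test item says very little on its own; the measure becomes useful once a larger share of the data is held back for testing.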
