K-means and Hierarchical Clustering
K-means and Hierarchical Clustering

Andrew W. Moore
Professor, School of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~awm · awm@cs.cmu.edu · 412-268-7599

Note to other teachers and users of these slides: Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.

Copyright © 2001, 2004, Andrew W. Moore. Nov 16th, 2001.

Some Data

This could easily be modeled by a Gaussian Mixture (with 5 components). But let's look at a satisfying, friendly and infinitely popular alternative…
Lossy Compression

Suppose you transmit the coordinates of points drawn randomly from this dataset. You can install decoding software at the receiver. You're only allowed to send two bits per point, so it'll have to be a "lossy transmission". Loss = Sum Squared Error between decoded coords and original coords. What encoder/decoder will lose the least information?

Idea One

Break into a grid; decode each bit-pair (00, 01, 10, 11) as the middle of its grid-cell. Any better ideas?
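Idea One can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical unit-square dataset standing in for the slide's scatter plot; the 2-bit code simply indexes a 2×2 grid.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((200, 2))  # hypothetical stand-in for the slide's dataset

def encode(p):
    # Two bits: one per axis (left/right half, bottom/top half of the unit square).
    return (int(p[0] >= 0.5) << 1) | int(p[1] >= 0.5)

def decode(code):
    # Idea One: decode a bit-pair as the middle of its grid-cell.
    return np.array([0.75 if (code >> 1) else 0.25, 0.75 if (code & 1) else 0.25])

# Loss = Sum Squared Error between decoded coords and original coords.
loss = sum(np.sum((p - decode(encode(p))) ** 2) for p in points)
```

Since every point sits within 0.25 of its cell middle along each axis, the per-point loss is at most 0.125 here; the question on the slide is whether a smarter decoder can do better.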
Idea Two

Break into a grid; decode each bit-pair as the centroid of all data in that grid-cell. Any further ideas?

K-means

1. Ask user how many clusters they'd like. (e.g. k=5)
2. Randomly guess k cluster Center locations.
3. Each datapoint finds out which Center it's closest to. (Thus each Center "owns" a set of datapoints.)
4. Each Center finds the centroid of the points it owns…
5. …and jumps there.
6. …Repeat until terminated!
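The six steps above can be sketched directly in NumPy. This is a minimal illustration on hypothetical two-blob data, not an optimized implementation:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # step 2: random guess
    owner = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Step 3: each datapoint finds out which Center it's closest to.
        dists = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        owner = dists.argmin(axis=1)
        # Steps 4-5: each Center finds the centroid of its points and jumps there.
        new_centers = np.array([X[owner == j].mean(axis=0) if np.any(owner == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # step 6: terminated
            break
        centers = new_centers
    return centers, owner

# Two well-separated blobs (hypothetical data): k-means should place one
# center near each blob's centroid.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.],
              [10., 10.], [10., 11.], [11., 10.], [11., 11.]])
centers, owner = kmeans(X, 2)
```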
K-means Start

Advance apologies: in black and white this example will deteriorate. Example generated by Dan Pelleg's super-duper fast K-means system: Dan Pelleg and Andrew Moore, "Accelerating Exact k-means Algorithms with Geometric Reasoning", Proc. Conference on Knowledge Discovery in Databases 1999 (KDD99), available on www.autonlab.org/pap.html.
K-means continues… (Slides 12–19: a sequence of figures showing successive iterations.)

K-means terminates (Slide 20).
K-means Questions

• What is it trying to optimize?
• Are we sure it will terminate?
• Are we sure it will find an optimal clustering?
• How should we start it?
• How could we automatically choose the number of centers?

…we'll deal with these questions over the next few slides.

Distortion

Given…
• an encoder function: ENCODE : ℝ^m → [1..k]
• a decoder function: DECODE : [1..k] → ℝ^m

Define…

Distortion = Σ_{i=1}^{R} (x_i − DECODE[ENCODE(x_i)])²
We may as well write DECODE[j] = c_j, so

Distortion = Σ_{i=1}^{R} (x_i − c_{ENCODE(x_i)})²

The Minimal Distortion

What properties must centers c_1, c_2, …, c_k have when distortion is minimized?
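The encoder/decoder pair and the distortion sum translate directly into code. A sketch, using 0-based indices to stand in for [1..k]:

```python
import numpy as np

def encode(x, centers):
    # ENCODE : R^m -> [1..k] (0-based here): index of the nearest center.
    return int(np.argmin(np.sum((centers - x) ** 2, axis=1)))

def decode(j, centers):
    # DECODE : [1..k] -> R^m, with DECODE[j] = c_j.
    return centers[j]

def distortion(X, centers):
    # Distortion = sum over all R records of (x_i - c_ENCODE(x_i))^2.
    return sum(float(np.sum((x - decode(encode(x, centers), centers)) ** 2))
               for x in X)
```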
The Minimal Distortion (1)

(1) x_i must be encoded by its nearest center. Why? Otherwise distortion could be reduced by replacing ENCODE[x_i] with the nearest center. So, at the minimal distortion:

c_{ENCODE(x_i)} = argmin_{c_j ∈ {c_1, c_2, …, c_k}} (x_i − c_j)²
The Minimal Distortion (2)

(2) The partial derivative of Distortion with respect to each center location must be zero.

Distortion = Σ_{i=1}^{R} (x_i − c_{ENCODE(x_i)})² = Σ_{j=1}^{k} Σ_{i ∈ OwnedBy(c_j)} (x_i − c_j)²

where OwnedBy(c_j) is the set of records owned by Center c_j.

∂Distortion/∂c_j = ∂/∂c_j Σ_{i ∈ OwnedBy(c_j)} (x_i − c_j)² = −2 Σ_{i ∈ OwnedBy(c_j)} (x_i − c_j) = 0 (for a minimum)
Thus, at a minimum:

c_j = (1 / |OwnedBy(c_j)|) Σ_{i ∈ OwnedBy(c_j)} x_i

At the minimum distortion, then:
(1) x_i must be encoded by its nearest center, and
(2) each Center must be at the centroid of the points it owns.
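The centroid condition is easy to check numerically on toy data: moving a center off the centroid of the points it owns raises that cluster's sum of squares by exactly n·‖δ‖², so the centroid is the unique minimizer. A sketch:

```python
import numpy as np

pts = np.random.default_rng(1).normal(size=(50, 2))  # points owned by one center
centroid = pts.mean(axis=0)

def sse(c):
    # This cluster's contribution to Distortion for a center at c.
    return float(np.sum((pts - c) ** 2))

# Perturbing the center by delta increases the SSE by n * |delta|^2.
delta = np.array([0.3, -0.2])
increase = sse(centroid + delta) - sse(centroid)
```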
Improving a suboptimal configuration…

When distortion is not minimized, what can be changed for centers c_1, c_2, …, c_k?
(1) Change the encoding so that x_i is encoded by its nearest center.
(2) Set each Center to the centroid of the points it owns.

There's no point applying either operation twice in succession, but it can be profitable to alternate. …And that's K-means!

It's easy to prove this procedure will terminate in a state at which neither (1) nor (2) changes the configuration. Why? There are only a finite number of ways of partitioning R records into k groups, so there are only a finite number of possible configurations in which all Centers are the centroids of the points they own. If the configuration changes on an iteration, it must have improved the distortion. So each time the configuration changes, it must go to a configuration it's never been to before. So if it tried to go on forever, it would eventually run out of configurations.
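The monotonicity at the heart of this argument can be watched directly: record the distortion after each full pass of operations (1) and (2) and it never increases. A sketch on hypothetical data:

```python
import numpy as np

def kmeans_trace(X, k, iters=50, seed=0):
    # Record the distortion after every full (assign, recenter) pass.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    trace = []
    for _ in range(iters):
        dists = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        owner = dists.argmin(axis=1)                           # operation (1)
        centers = np.array([X[owner == j].mean(axis=0) if np.any(owner == j)
                            else centers[j] for j in range(k)])  # operation (2)
        trace.append(float(np.sum((X - centers[owner]) ** 2)))
        if len(trace) > 1 and trace[-1] == trace[-2]:  # nothing improved: done
            break
    return trace

trace = kmeans_trace(np.random.default_rng(2).normal(size=(60, 2)), 3)
```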
Will we find the optimal configuration?

• Not necessarily.
• Can you invent a configuration that has converged, but does not have the minimum distortion? (Hint: try a fiendish k=3 configuration here…)
Trying to find good optima

• Idea 1: Be careful about where you start.
• Idea 2: Do many runs of k-means, each from a different random start configuration.
• Many other ideas floating around.
Neat trick for Idea 1: place the first center on top of a randomly chosen datapoint; place the second center on the datapoint that's as far away as possible from the first center; place the j'th center on the datapoint that's as far away as possible from the closest of Centers 1 through j−1.

Choosing the number of Centers

• A difficult problem.
• The most common approach is to try to find the solution that minimizes the Schwarz Criterion (also related to the BIC):

Distortion + λ (# parameters) log R = Distortion + λ m k log R

where m = #dimensions, R = #Records, k = #Centers.
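The farthest-point placement trick is compact in code. A sketch (squared distances suffice for comparisons; the four-point dataset is a made-up example):

```python
import numpy as np

def farthest_first_centers(X, k, seed=0):
    # Place the first center on a randomly chosen datapoint; place each later
    # center on the datapoint farthest from the closest already-chosen center.
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(X)))]
    for _ in range(1, k):
        nearest = np.min(np.stack([np.sum((X - X[c]) ** 2, axis=1)
                                   for c in chosen]), axis=0)
        chosen.append(int(np.argmax(nearest)))
    return X[chosen]

X = np.array([[0., 0.], [0., 1.], [10., 0.], [20., 0.]])  # toy 2-D points
C = farthest_first_centers(X, 3)
```

These centers can then seed the k-means loop (Idea 1), giving a well-spread starting configuration.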
Common uses of K-means

• Often used as an exploratory data analysis tool.
• In one dimension, a good way to quantize real-valued variables into k non-uniform buckets.
• Used on acoustic data in speech understanding to convert waveforms into one of k categories (known as Vector Quantization).
• Also used for choosing color palettes on old-fashioned graphical display devices!

Single Linkage Hierarchical Clustering

1. Say "Every point is its own cluster."
2. Find the "most similar" pair of clusters.
3. Merge it into a parent cluster.
4. Repeat…
…until you've merged the whole dataset into one cluster. You're left with a nice dendrogram, or taxonomy, or hierarchy of datapoints (not shown here).

How do we define similarity between clusters?
• Minimum distance between points in the clusters (in which case we're simply doing Euclidean Minimum Spanning Trees).
• Maximum distance between points in the clusters.
• Average distance between points in the clusters.

Also known in the trade as Hierarchical Agglomerative Clustering (note the acronym).

Single Linkage Comments

• It's nice that you get a hierarchy instead of an amorphous collection of groups.
• If you want k groups, just cut the (k−1) longest links.
• There's no real statistical or information-theoretic foundation to this. Makes your lecturer feel a bit queasy.
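The merge loop above, with the minimum-distance ("single linkage") similarity, can be sketched naively as follows. The tiny dataset is hypothetical; real use would call an optimized library routine (e.g. SciPy's hierarchical clustering), since this version is O(n³).

```python
import numpy as np

def single_linkage(X, k):
    # Start with every point as its own cluster; repeatedly merge the pair of
    # clusters with the smallest minimum inter-point distance ("most similar"),
    # stopping when k clusters remain -- i.e. cutting the k-1 longest links.
    d = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2))
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > k:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                link = min(d[i, j] for i in clusters[a] for j in clusters[b])
                if link < best:
                    best, pair = link, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)
    return clusters

# Two tight pairs far apart: cutting the longest link recovers them.
X = np.array([[0., 0.], [0., 1.], [5., 0.], [5., 1.]])
groups = single_linkage(X, 2)
```

Running the loop all the way down to one cluster, and recording each merge, is what yields the dendrogram.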
What you should know

• All the details of K-means.
• The theory behind K-means as an optimization algorithm.
• How K-means can get stuck.
• The outline of Hierarchical clustering.
• Be able to contrast which problems would be relatively well or poorly suited to K-means vs Gaussian Mixtures vs Hierarchical clustering.