K-Means Clustering Problem
            Ahmad Sabiq
          Febri Maspiyanti
       Indah Kuntum Khairina
          Wiwin Farhania
              Yonatan
What is k-means?
• To partition n objects into k clusters, based on
  attributes.
  – Objects in the same cluster are close together: their
    attributes are similar.
  – Objects in different clusters are far apart: their
    attributes are very dissimilar.
Algorithm
• Input: n objects, k (integer k ≤ n)
• Output: k clusters
• Steps:
   1. Select k initial centroids.
   2. Calculate the distance between each object and
      each centroid.
   3. Assign each object to the cluster with the nearest
      centroid.
   4. Recalculate each centroid.
   5. If the centroids don’t change, stop (convergence).
      Otherwise, go back to step 2.
• Complexity: O(k · n · d · t), where d is the number of
  attributes per object and t is the total number of iterations.
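The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the `max_iter` safeguard, and the choice to keep the old centroid when a cluster becomes empty are all assumptions that the slides do not specify.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Run k-means from the given initial centroids.

    points and centroids are lists of equal-length tuples of floats.
    Returns (centroids, labels, iterations).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for iteration in range(1, max_iter + 1):
        # Steps 2-3: measure distances and assign each object
        # to the cluster with the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: recalculate each centroid as the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(old)
        # Step 5: stop when no centroid changed.
        if new_centroids == centroids:
            return centroids, labels, iteration
        centroids = new_centroids
    return centroids, labels, max_iter
```

Each pass compares all n objects with all k centroids over d attributes, which is where the k · n · d cost per iteration comes from.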
Initialization
• Why is it important? What does it affect?
  – The clustering result: k-means converges to a local
    optimum that depends on the initial centroids.
  – The total number of iterations, and so the running time.
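A small constructed example makes the local-optimum point concrete. The four points, both starting positions, and the helper names `lloyd` and `sse` are invented for illustration; they are not taken from the slides. Both runs converge, but to different results.

```python
import math


def lloyd(points, centroids, max_iter=100):
    """Minimal k-means loop; returns (final centroids, iterations used)."""
    for iteration in range(1, max_iter + 1):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda j: math.dist(p, centroids[j])
            )
            clusters[nearest].append(p)
        # Mean of each cluster; an empty cluster keeps its old centroid.
        updated = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            return centroids, iteration
        centroids = updated
    return centroids, max_iter


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)


# Corners of a 4 x 1 rectangle, k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
good, _ = lloyd(points, [(0.0, 0.5), (4.0, 0.5)])  # splits left / right
bad, _ = lloyd(points, [(2.0, 0.0), (2.0, 1.0)])   # splits bottom / top
print(sse(points, good), sse(points, bad))
```

The second start is already stable, so k-means stops there even though its squared error (16.0) is far worse than that of the left/right split (1.0).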
Good Initialization
3 clusters with 2 iterations…
Bad Initialization
3 clusters with 4 iterations…
Initialization Methods
1. Random
2. Forgy
3. MacQueen
4. Kaufman
Random
• Algorithm:
  1. Assign each object to a random cluster.
  2. Compute the centroid of each cluster and use it as the
     initial centroid.
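A minimal sketch of the Random method. The function name `random_partition_init` and the `seed` parameter are hypothetical. One detail is an assumption not stated on the slide: to avoid an empty cluster (which has no centroid), the sketch guarantees that every cluster receives at least one object.

```python
import random


def random_partition_init(points, k, seed=None):
    """Return (initial centroids, random labels) for the Random method."""
    if not 1 <= k <= len(points):
        raise ValueError("k must satisfy 1 <= k <= n")
    rng = random.Random(seed)
    # Step 1: assign each object to a random cluster.
    # Assumption: the labels 0..k-1 each appear at least once,
    # so that no cluster is empty.
    labels = list(range(k)) + [rng.randrange(k) for _ in range(len(points) - k)]
    rng.shuffle(labels)
    # Step 2: compute the centroid (mean) of each cluster.
    centroids = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
    return centroids, labels
```

Because each centroid averages a random subset of the data, the initial centroids tend to lie near the overall mean of the data set.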
[Figure: Random initialization example. Horizontal axis 0–35, vertical axis 0–9.]
Forgy
• Algorithm:
  1. Choose k objects at random and use them as the initial
     centroids.
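The Forgy method is short enough to sketch directly. The function name `forgy_init` and the `seed` parameter are hypothetical; sampling without replacement is an assumption, chosen so that the k centroids come from k different objects.

```python
import random


def forgy_init(points, k, seed=None):
    """Pick k distinct objects at random as the initial centroids."""
    rng = random.Random(seed)
    # random.Random.sample draws without replacement.
    return [tuple(p) for p in rng.sample(points, k)]
```

Unlike the Random method, every initial centroid here coincides with an actual object from the data set.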
[Figure: Forgy initialization example. Horizontal axis 0–35, vertical axis 0–9.]
MacQueen
• Algorithm:
  1. Choose k objects at random and use them as the initial
     centroids.
  2. Assign each object to the cluster with the nearest
     centroid.
  3. After each assignment, recalculate the centroid of the
     cluster that received the object.
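One way to sketch the MacQueen steps is with a running mean. The function name `macqueen_init` and the `seed` parameter are hypothetical. Two details are assumptions, since the slide does not fix them: the objects are visited in their stored order, and the k seed objects are counted as the first member of their own cluster rather than being assigned again.

```python
import math
import random


def macqueen_init(points, k, seed=None):
    """Seed with k random objects, then update centroids one object at a time."""
    rng = random.Random(seed)
    # Step 1: choose k objects at random as the initial centroids.
    seed_idx = rng.sample(range(len(points)), k)
    centroids = [list(points[i]) for i in seed_idx]
    counts = [1] * k  # each seed is the first member of its cluster
    chosen = set(seed_idx)
    for i, p in enumerate(points):
        if i in chosen:
            continue
        # Step 2: assign the object to the cluster with the nearest centroid.
        j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Step 3: recalculate that centroid right away (incremental mean).
        counts[j] += 1
        centroids[j] = [c + (x - c) / counts[j] for c, x in zip(centroids[j], p)]
    return [tuple(c) for c in centroids]
```

Because centroids move after every single assignment, the result depends on the order in which the objects are visited.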
[Figure sequence: MacQueen initialization example, shown step by step. Horizontal axis 0–35, vertical axis 0–9.]
Kaufman
[Figure sequence: worked example of Kaufman initialization. Values shown for one pair of objects: d = 24.33, D = 15.52, C = 0. Several other objects are also marked C = 0.]
[Figure: accumulated sums per object: ∑C1 = 2.74, ∑C2 = 12.21, ∑C3 = 12.36, ∑C3 = 8.38, ∑C5 = 52.55, ∑C6 = 55.88, ∑C7 = 53.77, ∑C8 = 51.16, ∑C9 = 42.69.]
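The Kaufman slides show only the worked figures (d, D, C and ∑C) and do not spell out the steps. The sketch below follows the procedure as it is commonly described for Kaufman and Rousseeuw's method, so treat it as an assumption rather than the presenters' exact procedure. The first seed is the most centrally located object. After that, each unselected object i is scored by the sum over other unselected objects j of C = max(D − d, 0), where d is the distance between i and j and D is the distance from j to its nearest already-selected seed, and the object with the largest sum becomes the next seed. The function name `kaufman_init` is hypothetical, and excluding j = i from the sum is one reading of the method.

```python
import math


def kaufman_init(points, k):
    """Greedy seed selection in the style of Kaufman and Rousseeuw."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]
    # First seed: the object with the smallest total distance to all others.
    selected = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(selected) < k:
        best, best_gain = None, -1.0
        for i in range(n):
            if i in selected:
                continue
            gain = 0.0  # this is the sum of C values for candidate i
            for j in range(n):
                if j in selected or j == i:
                    continue
                # D: distance from j to its nearest selected seed.
                nearest_seed = min(dist[j][s] for s in selected)
                # C = max(D - d, 0): how much closer j would be to i.
                gain += max(nearest_seed - dist[j][i], 0.0)
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
    return [tuple(points[i]) for i in selected]
```

Under this reading, a pair with d = 24.33 and D = 15.52 contributes C = max(15.52 − 24.33, 0) = 0, which matches the values shown in the figure. The method is deterministic: the same data always yields the same seeds.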
References
1. J.M. Peña, J.A. Lozano, and P. Larrañaga. An Empirical
   Comparison of Four Initialization Methods for the K-Means
   Algorithm. Pattern Recognition Letters, vol. 20,
   pp. 1027–1040. 1999.
2. J.R. Cano, O. Cordón, F. Herrera, and L. Sánchez. A
   Greedy Randomized Adaptive Search Procedure
   Applied to the Clustering Problem as an Initialization
   Process Using K-Means as a Local Search Procedure.
   Journal of Intelligent and Fuzzy Systems, vol. 12,
   pp. 235–242. 2002.
3. L. Kaufman and P.J. Rousseeuw. Finding Groups in
   Data: An Introduction to Cluster Analysis. Wiley. 1990.
Questions
1. Why is initialization important in k-means?
2. Which initialization method has the greedy-choice
   property?
3. Explain the O(nkd) complexity of the Random
   method.
