What’s in a Label?
Business value of “soft” vs “hard” cluster ensembles
solutions-2
Nicole Huyghe & Anita Prinzie
Answers the who and the why
[Diagram: Theme 1, Theme 2, Theme 3, ..., Theme 9, Theme 10; Cluster Ensemble]
HARD OR SOFT
CLUSTER ENSEMBLE
Stability   Integrity   Accuracy   Size
Stability




Similarity Index (Lange et al., 2004): the percentage of pairs of observations that belong to the same cluster in both clustering C and clustering C’.
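
A minimal sketch of the pair-counting idea above, in Python. The function name and the example labels are hypothetical, and dividing by the total number of pairs is an assumption, since the slide does not state the denominator.

```python
from itertools import combinations


def similarity_index(labels_c, labels_c_prime):
    """Share of observation pairs that both clusterings put in the same cluster.

    Assumption: the share is taken over all possible pairs of observations.
    """
    if len(labels_c) != len(labels_c_prime):
        raise ValueError("both clusterings must label the same observations")
    pairs = list(combinations(range(len(labels_c)), 2))
    if not pairs:
        raise ValueError("need at least two observations")
    together_in_both = sum(
        1
        for i, j in pairs
        if labels_c[i] == labels_c[j] and labels_c_prime[i] == labels_c_prime[j]
    )
    return together_in_both / len(pairs)


# Hypothetical clusterings of four observations; label names do not matter,
# only which observations end up together.
clustering_c = [0, 0, 1, 1]
clustering_c_prime = ["b", "b", "a", "a"]
print(similarity_index(clustering_c, clustering_c_prime))
```

Because only co-membership is compared, the index is unaffected when cluster labels are renamed between the two runs.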
Cluster Integrity – Heterogeneity




Total separation of clusters: based on the distance between cluster centers
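
The slide only says the measure is based on distances between cluster centers. The sketch below uses the total-separation term of the SD validity index from Halkidi et al. (2000) as one possible formulation; that this exact variant was used is an assumption, and the function name and example centers are hypothetical.

```python
from itertools import combinations
from math import dist  # available from Python 3.8


def total_separation(centers):
    """(D_max / D_min) * sum over centers of 1 / (sum of distances to all centers).

    Under this formula, smaller values mean the centers lie further apart.
    """
    if len(centers) < 2:
        raise ValueError("need at least two cluster centers")
    pairwise = [dist(a, b) for a, b in combinations(centers, 2)]
    d_max, d_min = max(pairwise), min(pairwise)
    if d_min == 0:
        raise ValueError("cluster centers must be distinct")
    inverse_sums = sum(
        1.0 / sum(dist(center, other) for other in centers)
        for center in centers
    )
    return (d_max / d_min) * inverse_sums


# Hypothetical two-dimensional cluster centers.
near = [(0.0, 0.0), (3.0, 4.0)]
far = [(0.0, 0.0), (6.0, 8.0)]
print(total_separation(near), total_separation(far))
```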
Cluster Integrity – Homogeneity




Scatter (compactness): average ratio of the cluster variance to the variance of the dataset.
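
A minimal sketch of the scatter ratio described above. It assumes the "variance" of a cluster is summarized as the Euclidean norm of its per-dimension (population) variance vector, as in the SD validity index of Halkidi et al. (2000); the function names and the data points are hypothetical.

```python
from math import sqrt


def variance_vector(points):
    """Per-dimension population variance of a list of equal-length points."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    return [sum((p[d] - means[d]) ** 2 for p in points) / n for d in range(dims)]


def norm(vector):
    """Euclidean length of a vector."""
    return sqrt(sum(v * v for v in vector))


def scatter(points, labels):
    """Average over clusters of ||cluster variance|| / ||dataset variance||."""
    total = norm(variance_vector(points))
    if total == 0:
        raise ValueError("dataset has zero variance")
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    ratios = [norm(variance_vector(members)) / total for members in clusters.values()]
    return sum(ratios) / len(ratios)


# Hypothetical data: two tight groups far apart on the first dimension.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(scatter(points, [0, 0, 1, 1]))  # compact clusters: low scatter
print(scatter(points, [0, 1, 0, 1]))  # clusters spanning both groups: high scatter
```

Lower values indicate more compact (homogeneous) clusters relative to the spread of the whole dataset.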
Accuracy
[Figure: two panels, Reality and Prediction, each plotting observations 1 to 9.]




Adjusted Rand Index (Hubert and Arabie, 1985): level of agreement between the predicted segments and the real segments, corrected for the level of agreement expected by chance.
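
A minimal sketch of the Adjusted Rand Index computed from pair counts, following the Hubert and Arabie (1985) formula. The function name and the example segment labels are hypothetical, not the data behind the figure.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(real, predicted):
    """Pair-counting agreement between two partitions, corrected for chance."""
    if len(real) != len(predicted):
        raise ValueError("both partitions must label the same observations")
    n = len(real)
    if n < 2:
        raise ValueError("need at least two observations")
    # Pairs that share a segment in both partitions, in 'real' only, in 'predicted' only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(real, predicted)).values())
    sum_real = sum(comb(c, 2) for c in Counter(real).values())
    sum_pred = sum(comb(c, 2) for c in Counter(predicted).values())
    expected = sum_real * sum_pred / comb(n, 2)  # agreement expected by chance
    maximum = (sum_real + sum_pred) / 2
    if maximum == expected:
        # Degenerate case: both partitions are one big segment or all singletons.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical real and predicted segments for four respondents.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same grouping, renamed labels
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # grouping that disagrees
```

A value of 1 means the predicted segments reproduce the real ones exactly; values near 0 mean agreement no better than chance, and negative values mean worse than chance.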
Size




Uniformity deviation: average deviation of each segment's size from the uniform segment size (1/number of segments).
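
A minimal sketch of the uniformity deviation. It assumes segment size is expressed as a share of all observations and that "deviation" means absolute deviation; the slide does not specify either. The function name, the optional `n_segments` parameter and the example labels are hypothetical.

```python
from collections import Counter


def uniformity_deviation(labels, n_segments=None):
    """Mean absolute gap between each segment's share and the uniform share 1/k."""
    if not labels:
        raise ValueError("need at least one observation")
    counts = Counter(labels)
    k = len(counts) if n_segments is None else n_segments
    if k < len(counts):
        raise ValueError("n_segments is smaller than the number of observed segments")
    uniform_share = 1 / k
    n = len(labels)
    deviations = [abs(count / n - uniform_share) for count in counts.values()]
    # Segments that received no observations deviate by the full uniform share.
    deviations += [uniform_share] * (k - len(counts))
    return sum(deviations) / k


# Hypothetical segment assignments.
print(uniformity_deviation([0, 0, 1, 1]))  # equal-sized segments
print(uniformity_deviation([0, 0, 0, 1]))  # one dominant segment
```

A value of 0 means all segments are the same size; larger values mean the solution is dominated by a few large segments.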
Rheumatism


Software journey


Osteoporosis
[Figure: four panels (Stability, Heterogeneity, Accuracy, Homogeneity) annotated with H>S and S>H labels.]
LC gives smaller segments
[Figure: three panels (Rheumatism, Software journey, Osteoporosis), each comparing Soft LC, Soft CCEA, Hard LC and Hard CCEA.]
MIXED EVIDENCE
Fixed Factors
[Figure: labels "x 10" and "100, 100, 100, 100".]
Stability: SOFT is better
[Figure: axes are similarity (Strong, Weak) and confidence (High, Low); legend: Sim. Index soft > hard, Sim. Index hard > soft.]
Homogeneity: SOFT is better
[Figure: axes are similarity (Strong, Weak) and confidence (High, Low); legend: Scatter hard > soft.]
Heterogeneity: Hard is better
[Figure: axes are similarity (Strong, Weak) and confidence (High, Low); legend: Tot. Sep. soft > hard.]
Size: Hard is better
[Figure: axes are similarity (Strong, Weak) and confidence (High, Low); legend: Uni. dev. soft > hard.]
HARD ENSEMBLES GIVE BETTER BUSINESS SEGMENTS
Anita Prinzie, Nicole Huyghe
anita@solutions2.be
www.solutions2.be




References

•   Fred, A.L.N. and Jain, A.K. (2005), Combining Multiple Clusterings Using Evidence Accumulation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 835-850.
•   Lange, T., Roth, V., Braun, M.L. and Buhmann, J.M. (2004), Stability-based Validation of Clustering Solutions, Neural Computation, 16, 1299-1323.
•   Halkidi, M., Vazirgiannis, M. and Batistakis, Y. (2000), Quality Scheme Assessment in the Clustering Process, Proc. of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, 265-276.
•   Hubert, L. and Arabie, P. (1985), Comparing Partitions, Journal of Classification, 2(1), 193-218.
•   Nieweglowski, L. (2007), clv package, R software.
•   Martin, A., Quinn, K.M. and Park, J.H. (2003-2012), Markov Chain Monte Carlo Package (MCMCpack), R software.
