Association Analysis
Association Analysis: Definition. Association analysis is the task of uncovering relationships among data. An association rule is a model that identifies how data items are associated with each other. Example: in retail sales, it is used to identify items that are frequently purchased together.
What is a rule? If (condition) then (result). Example: if a customer purchases Coke, then the customer also purchases orange juice. The first part (the condition) is the rule body and the second part (the result) is the rule head.
Strength of a rule: How certain is the rule? Confidence measures the certainty of a rule. It is the percentage of transactions containing all items stated in the condition that also contain the items in the result. Confidence(A, B) = P(B | A). Example: the rule "If Coke then orange juice" has a confidence of 100%.
Strength of a rule: How often does the rule occur? Support measures the usefulness of a rule. It is the percentage of transactions that contain all items in the rule. Support(A, B) = P(A, B). Example: for the rule "If Coke then orange juice", 2 of the 5 transactions contain both Coke and orange juice, so the support of the rule is 40%.
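The two measures above can be sketched in a few lines of Python. The original transaction table is not reproduced in the slides, so the five baskets below are hypothetical, invented only so that the numbers match the stated 40% support and 100% confidence. The function names `support` and `confidence` are likewise illustrative.

```python
# Hypothetical baskets (an assumption, not the deck's data), chosen so that
# 2 of 5 transactions contain both Coke and orange juice.
transactions = [
    {"coke", "orange juice", "chips"},
    {"coke", "orange juice"},
    {"milk", "orange juice"},
    {"milk", "bread"},
    {"bread", "chips"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(body, head, transactions):
    """P(head | body): support of body and head together over support of body.

    Raises ZeroDivisionError if the body never occurs in the data.
    """
    both = set(body) | set(head)
    return support(both, transactions) / support(body, transactions)


print(support({"coke", "orange juice"}, transactions))       # 0.4
print(confidence({"coke"}, {"orange juice"}, transactions))  # 1.0
```

Note that confidence is not symmetric: with the same baskets, the reverse rule "If orange juice then Coke" has a lower confidence, because orange juice also appears without Coke.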
Association Rule Mining: a two-step process. 1. Find all frequent k-itemsets, k = 1, 2, 3, … The set of all items in a rule is referred to as an itemset. A rule that contains k items forms a k-itemset. The occurrence frequency of a k-itemset is the number of transactions that contain all k items in the itemset. An itemset that satisfies a minimum support (or minimum occurrence frequency) is called a frequent itemset.
Association Rule Mining: 2. Generate strong association rules from the frequent k-itemsets. Rules that satisfy both a minimum support threshold and a minimum confidence threshold are called strong rules.
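As a rough illustration of occurrence frequency and the frequent-itemset test, the sketch below counts every k-itemset by brute force. This is not the Apriori method, only the definition spelled out. The baskets, the function names and the threshold `min_count` are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, used only for illustration.
transactions = [
    {"coke", "orange juice", "chips"},
    {"coke", "orange juice"},
    {"milk", "orange juice"},
    {"milk", "bread"},
    {"bread", "chips"},
]


def occurrence_counts(transactions, k):
    """Occurrence frequency of every k-itemset that appears in the data."""
    counts = Counter()
    for t in transactions:
        for combo in combinations(sorted(t), k):
            counts[frozenset(combo)] += 1
    return counts


def frequent_itemsets(transactions, k, min_count):
    """Keep only the k-itemsets whose occurrence frequency meets min_count."""
    counts = occurrence_counts(transactions, k)
    return {s: c for s, c in counts.items() if c >= min_count}


print(frequent_itemsets(transactions, 2, min_count=2))
```

With these baskets and a minimum occurrence frequency of 2, the only frequent 2-itemset is {coke, orange juice}.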
Apriori Algorithm: Find all frequent k-itemsets. Apriori principle: if an itemset is frequent, then all of its subsets must also be frequent. Equivalently, if an itemset is infrequent, none of its supersets can be frequent.
Illustrating Apriori Principle
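The slide's figure is not available in this transcript. As a stand-in, the sketch below checks the principle on a small hypothetical dataset: every non-empty proper subset of an itemset occurs at least as often as the itemset itself. The data and helper names are assumptions.

```python
from itertools import combinations

# Hypothetical baskets, used only for illustration.
transactions = [
    {"coke", "orange juice", "chips"},
    {"coke", "orange juice"},
    {"milk", "orange juice"},
    {"milk", "bread"},
    {"bread", "chips"},
]


def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions)


def proper_subsets(itemset):
    """Yield every non-empty proper subset of `itemset`."""
    items = sorted(itemset)
    for r in range(1, len(items)):
        for combo in combinations(items, r):
            yield frozenset(combo)


def subsets_at_least_as_frequent(itemset, transactions):
    """True if no subset occurs less often than the itemset (always holds)."""
    total = count(itemset, transactions)
    return all(count(s, transactions) >= total for s in proper_subsets(itemset))


print(subsets_at_least_as_frequent({"coke", "orange juice", "chips"}, transactions))  # True
```

The check always succeeds, because any transaction that contains the whole itemset also contains each of its subsets. This is what lets Apriori discard a candidate as soon as one of its subsets is known to be infrequent.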
Apriori Algorithm: Method. Let k = 1. Generate frequent itemsets of length 1. Repeat until no new frequent itemsets are identified: generate length (k+1) candidate itemsets from the length-k frequent itemsets;
(continued) prune candidate itemsets containing subsets of length k that are infrequent; count the support of each candidate by scanning the database; eliminate candidates that are infrequent, leaving only those that are frequent; then increase k by 1.
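The method above can be sketched as follows. This is a minimal, unoptimized version under some assumptions: support is expressed as a minimum occurrence count (`min_count`), candidates are generated by taking unions of pairs of frequent k-itemsets, and the baskets are hypothetical.

```python
from itertools import combinations


def count(itemset, transactions):
    """Number of transactions containing every item in `itemset` (one DB scan)."""
    return sum(itemset <= t for t in transactions)


def apriori(transactions, min_count):
    """Return a dict mapping each frequent itemset (frozenset) to its count."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}

    # k = 1: frequent itemsets of length 1
    k = 1
    current = {frozenset([i]) for i in items
               if count(frozenset([i]), transactions) >= min_count}
    frequent = {}

    while current:
        for itemset in current:
            frequent[itemset] = count(itemset, transactions)

        # Generate length (k+1) candidates from length-k frequent itemsets.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune candidates that contain an infrequent subset of length k.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k))}
        # Count support and eliminate the infrequent candidates.
        current = {c for c in candidates
                   if count(c, transactions) >= min_count}
        k += 1

    return frequent


# Hypothetical baskets, used only for illustration.
transactions = [
    {"coke", "orange juice", "chips"},
    {"coke", "orange juice"},
    {"milk", "orange juice"},
    {"milk", "bread"},
    {"bread", "chips"},
]
frequent = apriori(transactions, min_count=2)
print(frequent[frozenset({"coke", "orange juice"})])  # 2
```

On this data every single item is frequent, {coke, orange juice} is the only frequent 2-itemset, and no 3-itemset candidate can be formed, so the loop stops after the second pass.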
Generate strong association rules from the frequent k-itemsets: For each frequent k-itemset, generate all non-empty subsets. For every non-empty subset, generate the rule and the associated confidence. Output the rule if the minimum confidence threshold is satisfied.
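A possible sketch of this step is shown below. It assumes the frequent itemsets are already available as a dictionary from itemset to occurrence count (the shape is an assumption, as are the counts in the example). Each non-empty proper subset is used as the rule body and the remaining items as the rule head, so that the head is never empty.

```python
from itertools import combinations


def generate_rules(frequent, min_confidence):
    """Return (body, head, confidence) for every rule meeting min_confidence.

    `frequent` maps frozenset itemsets to occurrence counts and must contain
    every subset of each itemset, which the Apriori principle guarantees.
    """
    rules = []
    for itemset, itemset_count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty body and a non-empty head
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                body = frozenset(combo)
                head = itemset - body
                conf = itemset_count / frequent[body]
                if conf >= min_confidence:
                    rules.append((body, head, conf))
    return rules


# Hypothetical counts, used only for illustration.
frequent = {
    frozenset({"coke"}): 2,
    frozenset({"orange juice"}): 3,
    frozenset({"coke", "orange juice"}): 2,
}
for body, head, conf in generate_rules(frequent, min_confidence=0.8):
    print(sorted(body), "->", sorted(head), conf)
```

With a minimum confidence of 0.8, only "If Coke then orange juice" is output (confidence 2/2). The reverse rule has confidence 2/3 and is dropped.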
Multilevel association rules: It is difficult to find strong associations at very low or primitive levels of data. Few people may buy "IBM desktop computer" and "Sony b/w printer" together, but many people may purchase "computer" and "printer" together.
A concept hierarchy defines a sequence of mappings from a set of low-level concepts to higher-level concepts. Example: low-level concepts (IBM, Microsoft, HP, ………) map to higher-level concepts (computer, software, printer, accessory).
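One way to picture a concept hierarchy in code is a mapping from each low-level item to its parent concept. The mapping and baskets below are hypothetical: two product names come from the slides, while "Dell laptop computer" and "HP colour printer" are made up for the example. The sketch shows why a pair that is rare at the product level can become frequent once items are generalized.

```python
# Hypothetical hierarchy: low-level item -> higher-level concept.
parent = {
    "IBM desktop computer": "computer",
    "Dell laptop computer": "computer",
    "Sony b/w printer": "printer",
    "HP colour printer": "printer",
}

# Hypothetical baskets at the product level.
transactions = [
    {"IBM desktop computer", "Sony b/w printer"},
    {"IBM desktop computer", "HP colour printer"},
    {"Dell laptop computer", "Sony b/w printer"},
    {"Dell laptop computer", "HP colour printer"},
]


def generalize(transaction, parent):
    """Replace each item by its parent concept (items without one are kept)."""
    return {parent.get(item, item) for item in transaction}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


high_level = [generalize(t, parent) for t in transactions]
print(support({"IBM desktop computer", "Sony b/w printer"}, transactions))  # 0.25
print(support({"computer", "printer"}, high_level))                         # 1.0
```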
Steps to be followed: Use a top-down, progressive deepening approach. First mine high-level frequent items, then mine their lower-level frequent items, and so on. At each level, the Apriori algorithm is used. Either use a uniform minimum support for all levels, or use a reduced minimum support at lower levels.
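The progressive deepening idea can be sketched for a two-level hierarchy. To stay short, this simplified version mines only single items at each level rather than running the full Apriori algorithm, and it uses a reduced minimum support at the lower level. The hierarchy, baskets, thresholds and function names are all assumptions.

```python
def frequent_items(transactions, min_support):
    """Map each item whose support meets min_support to that support."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    return {i: c / n for i, c in counts.items() if c / n >= min_support}


def mine_two_levels(transactions, parent, high_min_support, low_min_support):
    """Mine the high level first, then only the children of frequent parents.

    Every item must have an entry in `parent`.
    """
    high = [{parent[i] for i in t} for t in transactions]
    frequent_high = frequent_items(high, high_min_support)

    # Descend only into categories that were frequent at the higher level.
    filtered = [{i for i in t if parent[i] in frequent_high} for t in transactions]
    frequent_low = frequent_items(filtered, low_min_support)
    return frequent_high, frequent_low


# Hypothetical hierarchy and baskets, used only for illustration.
parent = {
    "IBM desktop computer": "computer",
    "Dell laptop computer": "computer",
    "Sony b/w printer": "printer",
    "HP colour printer": "printer",
    "wrist pad": "accessory",
}
transactions = [
    {"IBM desktop computer", "Sony b/w printer"},
    {"IBM desktop computer", "HP colour printer"},
    {"Dell laptop computer", "Sony b/w printer"},
    {"IBM desktop computer", "wrist pad"},
]
high, low = mine_two_levels(transactions, parent,
                            high_min_support=0.5, low_min_support=0.25)
print(high)
print(sorted(low))
```

Here "wrist pad" would pass the reduced lower-level threshold on its own, but it is never examined because its parent category "accessory" is not frequent at the higher level.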
Sequential Association Rule: Concerns sequences of events. Example: new homeowners purchase shower curtains before purchasing furniture. Example: when a customer goes into a bank branch and asks for an account reconciliation, there is a good chance that he or she will close all of his or her accounts.
Sequential Association Rule: Transactions must have two additional features: (1) a time stamp or sequencing information to determine when transactions occurred relative to each other; (2) identifying information, such as an account number or ID number.
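To show why both features are needed, the sketch below turns flat records into one time-ordered sequence per customer: the identifier groups the events and the time stamp orders them. The record layout `(customer_id, timestamp, item)` and the sample records are assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (customer_id, timestamp, item), in arbitrary order.
records = [
    ("c2", date(2024, 3, 1), "furniture"),
    ("c1", date(2024, 1, 5), "shower curtain"),
    ("c1", date(2024, 2, 20), "furniture"),
    ("c2", date(2024, 1, 15), "shower curtain"),
]


def build_sequences(records):
    """Group events by customer and order each group by time stamp."""
    by_customer = defaultdict(list)
    for customer_id, timestamp, item in records:
        by_customer[customer_id].append((timestamp, item))
    return {cid: [item for _, item in sorted(events)]
            for cid, events in by_customer.items()}


print(build_sequences(records))
```

Both hypothetical customers end up with the sequence shower curtain, then furniture, which is the kind of ordered pattern a sequential rule describes.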
Some important parameters: Duration: the duration may be the entire available sequence in the database, or a user-selected subsequence, such as the year 1999. Event folding window: a set of events occurring within a specified period of time, such as within the same day, can be viewed as occurring together.
Some important parameters: Interval: the interval between events in the discovered pattern. An interval of 0 means finding strictly consecutive sequences. min_int <= interval <= max_int means finding patterns whose events are separated by at least min_int and at most max_int. interval = c means finding patterns carrying an exact interval.
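Both parameters can be illustrated on a list of time-stamped events. In the sketch below the duration is a half-open time range and the folding window is fixed to one calendar day. Both choices, along with the event layout and sample data, are assumptions.

```python
from datetime import datetime
from itertools import groupby


def within_duration(events, start, end):
    """Keep events whose time stamp lies in the range [start, end)."""
    return [e for e in events if start <= e[0] < end]


def fold_by_day(events):
    """Fold events on the same calendar day into one set, in time order."""
    ordered = sorted(events, key=lambda e: e[0])
    return [(day, {item for _, item in group})
            for day, group in groupby(ordered, key=lambda e: e[0].date())]


# Hypothetical events: (timestamp, item).
events = [
    (datetime(1999, 5, 3, 9, 0), "a"),
    (datetime(1999, 5, 3, 17, 30), "b"),
    (datetime(1999, 5, 4, 10, 0), "c"),
    (datetime(2000, 1, 2, 10, 0), "d"),
]

in_1999 = within_duration(events, datetime(1999, 1, 1), datetime(2000, 1, 1))
for day, items in fold_by_day(in_1999):
    print(day, sorted(items))
```

Events "a" and "b" happen at different times on the same day, so after folding they are treated as occurring together. Event "d" falls outside the selected duration and is ignored.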
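The interval constraint can be sketched as a check on whether a pattern occurs in a sequence. The slides do not say how the interval is measured, so this example assumes it is the number of other events lying between two consecutive pattern events, which makes an interval of 0 mean "strictly consecutive". The function name and sample sequence are hypothetical.

```python
def occurs_with_interval(sequence, pattern, min_int, max_int):
    """True if `pattern` occurs in `sequence` with every gap in [min_int, max_int].

    A gap is the number of events between two consecutive pattern events
    (an assumption; a time-based interval would work the same way).
    """
    if not pattern:
        return True

    def extend(prev_pos, pat_index):
        if pat_index == len(pattern):
            return True
        first = prev_pos + 1 + min_int
        last = min(prev_pos + 1 + max_int, len(sequence) - 1)
        return any(sequence[pos] == pattern[pat_index] and extend(pos, pat_index + 1)
                   for pos in range(first, last + 1))

    return any(sequence[start] == pattern[0] and extend(start, 1)
               for start in range(len(sequence)))


sequence = ["a", "x", "b", "c"]
print(occurs_with_interval(sequence, ["a", "b"], 0, 0))  # False: "x" lies between
print(occurs_with_interval(sequence, ["a", "b"], 0, 1))  # True
print(occurs_with_interval(sequence, ["b", "c"], 0, 0))  # True: strictly consecutive
print(occurs_with_interval(sequence, ["a", "b"], 1, 1))  # True: exact interval c = 1
```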
Some Practical Issues: time window of transactions; level of aggregation; level of support and confidence.
Time window of transactions: Select a time window for the transactions that covers at least 2 product cycles. For example, if customers purchase a product on a cycle of six months or less, select a 12-month window of customer transaction data. For frequently purchased products, a short time window is sufficient. For low-frequency items, a longer time window is necessary.
Level of aggregation: If product codes in the data are too specific (for example, based on product details such as size and flavour), few associations will be discovered. Group products into categories according to the product hierarchy, or create a new level manually.
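The rule of thumb above amounts to simple date arithmetic. The sketch below picks a window of two purchase cycles ending on a chosen date and keeps only the records inside it. The cycle length in days, the record layout `(date, item)` and the sample data are assumptions.

```python
from datetime import date, timedelta


def window_start(end, cycle_days, cycles=2):
    """Start of a window that covers `cycles` product cycles ending at `end`."""
    return end - timedelta(days=cycles * cycle_days)


def select_window(records, end, cycle_days, cycles=2):
    """Keep records dated within the window [start, end]."""
    start = window_start(end, cycle_days, cycles)
    return [r for r in records if start <= r[0] <= end]


# Hypothetical records: (purchase date, item), with a cycle of about six months.
records = [
    (date(2023, 6, 1), "printer ink"),
    (date(2024, 6, 1), "printer ink"),
    (date(2024, 11, 20), "printer ink"),
]
selected = select_window(records, end=date(2024, 12, 31), cycle_days=180)
print(selected)
```

With a 180-day cycle the window spans roughly the last 12 months, so the 2023 purchase is left out.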
Level of support and confidence: Start with a high support and gradually reduce it. Set confidence to around 50% to reduce the number of permutations.
Conclusion: Association rules such as multilevel and sequential association rules are studied. The Apriori algorithm is described in detail. Various practical issues in association rules are analyzed.
Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net
