Sequential pattern mining
Published in: Education


  1. GUIDE: MS. ANAGHA CHAUDHARI
  2. A sequence: <(ef)(ab)(df)cb>

     A sequence database:

     SID   sequence
     10    <a(abc)(ac)d(cf)>
     20    <(ad)c(bc)(ae)>
     30    <(ef)(ab)(df)cb>
     40    <eg(af)cbc>

     An element may contain a set of items. Items within an element are unordered, and we list them alphabetically.
     <a(bc)df> is a subsequence of <a(abc)(ac)d(cf)>.
     Given support threshold min_sup = 2, <(ab)c> is a sequential pattern.
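The subsequence and support definitions on this slide can be checked with a short Python sketch. The helper names (`parse`, `is_subsequence`, `support`) are hypothetical and not from the slides; each element is modelled as a `frozenset` of single-character items, and the database is the one from the slide.

```python
from typing import Dict, FrozenSet, List

Sequence = List[FrozenSet[str]]


def parse(elements: List[str]) -> Sequence:
    """Hypothetical helper: turn ["a", "abc"] into [{a}, {a, b, c}]."""
    return [frozenset(e) for e in elements]


def is_subsequence(sub: Sequence, seq: Sequence) -> bool:
    """True if the elements of `sub` are subsets of distinct elements of `seq`, in order."""
    pos = 0
    for element in sub:
        # Greedily skip ahead to the first element of `seq` that contains `element`.
        while pos < len(seq) and not element <= seq[pos]:
            pos += 1
        if pos == len(seq):
            return False
        pos += 1  # the next element of `sub` must match strictly later
    return True


def support(pattern: Sequence, database: Dict[int, Sequence]) -> int:
    """Number of sequences in the database that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database.values())


# The sequence database from the slide, keyed by SID.
database = {
    10: parse(["a", "abc", "ac", "d", "cf"]),
    20: parse(["ad", "c", "bc", "ae"]),
    30: parse(["ef", "ab", "df", "c", "b"]),
    40: parse(["e", "g", "af", "c", "b", "c"]),
}
```

With this data, `support(parse(["ab", "c"]), database)` counts sequences 10 and 30, which meets min_sup = 2.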
  3. CHALLENGES ON SEQUENTIAL PATTERN MINING
     A huge number of possible sequential patterns are hidden in databases.
     A mining algorithm should
     • find the complete set of patterns, when possible, satisfying the minimum support (frequency) threshold
     • be highly efficient and scalable, involving only a small number of database scans
     • be able to incorporate various kinds of user-specific constraints
  4. The Apriori Algorithm: An Example (Supmin = 2)

     Database TDB:
     Tid   Items
     10    A, C, D
     20    B, C, E
     30    A, B, C, E
     40    B, E

     1st scan, C1:
     Itemset   sup
     {A}       2
     {B}       3
     {C}       3
     {D}       1
     {E}       3

     L1:
     Itemset   sup
     {A}       2
     {B}       3
     {C}       3
     {E}       3

     C2 (candidates generated from L1): {A, B}, {A, C}, {A, E}, {B, C}, {B, E}, {C, E}

     2nd scan, C2:
     Itemset   sup
     {A, B}    1
     {A, C}    2
     {A, E}    1
     {B, C}    2
     {B, E}    3
     {C, E}    2

     L2:
     Itemset   sup
     {A, C}    2
     {B, C}    2
     {B, E}    3
     {C, E}    2

     C3 (candidates generated from L2): {B, C, E}

     3rd scan, L3:
     Itemset     sup
     {B, C, E}   2
  5. The Apriori Algorithm [Pseudo-Code]

     Ck: candidate itemset of size k
     Lk: frequent itemset of size k

     L1 = {frequent items};
     for (k = 1; Lk != ∅; k++) do begin
         Ck+1 = candidates generated from Lk;
         for each transaction t in database do
             increment the count of all candidates in Ck+1 that are contained in t
         Lk+1 = candidates in Ck+1 with min_support
     end
     return ∪k Lk;
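The pseudo-code above could be turned into runnable Python roughly as follows. This is a minimal sketch, not the presenters' implementation: the function name `apriori`, the join-then-prune way of generating candidates, and the use of absolute support counts are assumptions. The data is the TDB example from the previous slide.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""

    def count(candidates):
        # One database scan: count the candidates contained in each transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    current = count({frozenset([i]) for i in items})  # L1
    frequent = dict(current)
    k = 1
    while current:  # loop until Lk is empty
        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k))
        }
        current = count(candidates)  # L(k+1)
        frequent.update(current)
        k += 1
    return frequent


# Database TDB from the Apriori example slide, with Supmin = 2.
tdb = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
result = apriori(tdb, 2)
```

On this data the prune step discards {A, B, C} and {A, C, E} before the third scan, so {B, C, E} is the only size-3 candidate, matching C3 on the example slide.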
  6. APRIORI ADV/DISADV
     Advantages:
     • Uses large itemset property.
     • Easily parallelized.
     • Easy to implement.
     Disadvantages:
     • Assumes transaction database is memory resident.
     • Requires up to m database scans.
  7. FP-Growth (J. Han, J. Pei, and Y. Yin, 2000)
     • Depth-first search
     • Avoids explicit candidate generation
     • Adopts a divide-and-conquer strategy
     • Two-step approach
       Step 1: Build a compact data structure called the FP-tree.
       Step 2: Extract frequent itemsets from the FP-tree.
  8. Step 1: FP-Tree Construction
     The FP-Tree is constructed using 2 passes over the data-set.
     Pass 1:
     • Scan data and find support for each item.
     • Discard infrequent items.
     • Sort frequent items in decreasing order based on their support.
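Pass 1 can be sketched in a few lines of Python. The function name `first_pass` and the alphabetical tie-break between items of equal support are assumptions made for this illustration; the slides do not specify a tie-break. The data reuses the TDB example from the Apriori slide because the presentation's own FP-Growth tables are not reproduced in this transcript.

```python
from collections import Counter


def first_pass(transactions, min_support):
    """Count item supports, drop infrequent items, and reorder each transaction."""
    # Scan the data once and find the support of each item.
    counts = Counter(item for t in transactions for item in set(t))
    # Discard infrequent items.
    frequent = {item: n for item, n in counts.items() if n >= min_support}
    # Sort by decreasing support (ties broken alphabetically: an assumption).
    order = sorted(frequent, key=lambda item: (-frequent[item], item))
    rank = {item: r for r, item in enumerate(order)}
    # Rewrite every transaction in that fixed order, keeping frequent items only.
    ordered = [sorted((i for i in t if i in rank), key=rank.get) for t in transactions]
    return order, ordered


tdb = [["A", "C", "D"], ["B", "C", "E"], ["A", "B", "C", "E"], ["B", "E"]]
order, ordered = first_pass(tdb, 2)
```

Here D is dropped (support 1) and the remaining items are ordered B, C, E, A, so the first transaction becomes `["C", "A"]`.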
  9. Pass 2:
     Nodes correspond to items and have a counter.
     1. FP-Growth reads 1 transaction at a time and maps it to a path.
     2. A fixed order is used, so paths can overlap when transactions share items (when they have the same prefix). In this case, counters are incremented.
     3. Pointers are maintained between nodes containing the same item, creating singly linked lists (dotted lines). The more paths that overlap, the higher the compression. The FP-tree may fit in memory.
     4. Frequent itemsets are then extracted from the FP-Tree.
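A minimal Python sketch of Pass 2, assuming the transactions have already been filtered and sorted by Pass 1. The names `FPNode`, `build_fp_tree`, and `item_support` are hypothetical, and the input is the TDB example from the Apriori slide after ordering its items as B, C, E, A.

```python
class FPNode:
    """One tree node: an item, a counter, and a link to the next node with the same item."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}
        self.next = None  # node-link (the dotted lines in the usual figure)


def build_fp_tree(ordered_transactions):
    root = FPNode(None, None)
    header = {}  # item -> first node of its singly linked list
    tails = {}   # item -> last node of that list, for constant-time appends
    for transaction in ordered_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                # No shared prefix at this point: start a new branch.
                child = FPNode(item, node)
                node.children[item] = child
                if item in tails:
                    tails[item].next = child
                else:
                    header[item] = child
                tails[item] = child
            child.count += 1  # overlapping paths only increment the counter
            node = child
    return root, header


def item_support(header, item):
    """Sum the counters along an item's node-links."""
    total, node = 0, header.get(item)
    while node is not None:
        total += node.count
        node = node.next
    return total


ordered = [["C", "A"], ["B", "C", "E"], ["B", "C", "E", "A"], ["B", "E"]]
root, header = build_fp_tree(ordered)
```

Three of the four transactions share the prefix B, so they are stored under a single B node with counter 3 instead of three separate paths.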
  10. Step 2: extracting frequent itemsets
     • Start from each frequent length-1 pattern (as an initial suffix pattern) and construct its conditional pattern base (a "subdatabase," which consists of the set of prefix paths in the FP-tree co-occurring with the suffix pattern).
     • Construct its (conditional) FP-tree, and perform mining recursively on such a tree.
     • The pattern growth is achieved by the concatenation of the suffix pattern with the frequent patterns generated from a conditional FP-tree.
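The recursion described above can be illustrated with a simplified Python sketch. As a deliberate simplification, it represents each conditional pattern base as a plain list of (prefix path, count) pairs rather than building a compressed conditional FP-tree, so it shows the pattern-growth logic but not the memory savings. The name `pattern_growth` is hypothetical, and the input is the ordered TDB example from the Apriori slide.

```python
from collections import Counter


def pattern_growth(paths, min_support, suffix=()):
    """Yield (itemset, support) pairs.

    `paths` is a list of (ordered item list, count) pairs; every path must
    follow the same global item order and contain each item at most once.
    """
    counts = Counter()
    for items, n in paths:
        for item in items:
            counts[item] += n
    for item, support in counts.items():
        if support < min_support:
            continue
        # Grow the pattern by prepending the item to the current suffix.
        pattern = (item,) + suffix
        yield frozenset(pattern), support
        # Conditional pattern base: the prefix that precedes `item` on each path.
        base = [(items[:items.index(item)], n) for items, n in paths if item in items]
        yield from pattern_growth(base, min_support, pattern)


# Each ordered transaction is a path with count 1.
paths = [(["C", "A"], 1), (["B", "C", "E"], 1), (["B", "C", "E", "A"], 1), (["B", "E"], 1)]
result = dict(pattern_growth(paths, 2))
```

For the suffix E, the conditional pattern base is {B C: 2, B: 1}, from which the sketch grows {B, E}, {C, E}, and then {B, C, E}. The overall result contains the same nine frequent itemsets that Apriori finds on this data.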
  11. Table: Table after first scan of database. Table: Transactional data.
  12. Fig.: FP-Tree Construction
  13. EXAMPLE CONT. Table: Mining the FP-Tree by creating conditional (sub-)pattern bases
  14. EXAMPLE CONT. Fig.: The conditional FP-tree associated with the conditional node I3
  15. FP-GROWTH ADV/DISADV
     Advantages of FP-Growth
     • only 2 passes over data-set
     • "compresses" data-set
     • no candidate generation
     • much faster than Apriori
     Disadvantages of FP-Growth
     • FP-Tree may not fit in memory!!
     • FP-Tree is expensive to build
  16. APPLICATIONS
     • Customer shopping sequences: first buy computer, then CD-ROM, and then digital camera, within 3 months.
     • Medical treatments, natural disasters (e.g., earthquakes), science & eng. processes, stocks and markets, etc.
     • Telephone calling patterns, Weblog click streams
     • DNA sequences and gene structures
  17. THANK YOU
