Sequence Mining Automata: a New Technique for Mining Frequent Sequences Under Regular Expressions
Roberto Trasarti, Francesco Bonchi, Bart Goethals
Problem Definition (1): Given a database of sequences D, the support of a sequence S ∈ Σ* is the number of sequences in D that are supersequences of S: sup(S) = |{T ∈ D | S ⊑ T}|. Given a regular expression R, a sequence S is valid if it can be generated by R. A sequence is frequent if its support is at least the minimum support.
[Figure: example database with minimum support 3 and RE: A*BC*, showing one subsequence with support 3 and another with support 2.]
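The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the small database are hypothetical (the database is not the one from the slide's figure), and sequences are assumed to be strings with one character per item.

```python
import re


def is_subsequence(s, t):
    """Return True if s is a (not necessarily contiguous) subsequence of t."""
    remaining = iter(t)
    # `symbol in remaining` advances the iterator, so order is respected.
    return all(symbol in remaining for symbol in s)


def support(s, database):
    """sup(S): number of sequences in the database that are supersequences of S."""
    return sum(1 for t in database if is_subsequence(s, t))


def is_valid(s, regex):
    """A sequence is valid if the whole sequence is generated by the regex."""
    return re.fullmatch(regex, s) is not None


# Hypothetical toy database, chosen only for illustration.
toy_db = ["AABC", "ABCC", "CBA"]
```

With this toy database, `support("BC", toy_db)` counts the first two sequences only, since in "CBA" no C follows the B, and `is_valid("AABCC", "A*BC*")` holds while `is_valid("BA", "A*BC*")` does not.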
Previous approaches and our contribution: Previous approaches [1,2,3] tackle the problem by focusing on its search space, exploiting in different ways the pruning power of the regular expression R over unpromising patterns. Our solution instead focuses on the input dataset and the given regular expression: while reading the input database, we produce, for each sequence in the database, all and only the valid patterns contained in that sequence.
[1] H. Albert-Lorincz and J.-F. Boulicaut. Mining frequent sequential patterns under regular expressions: A highly adaptive strategy for pushing constraints. In Proc. of SDM’03.
[2] M. N. Garofalakis, R. Rastogi, and K. Shim. Spirit: Sequential pattern mining with regular expression constraints. In Proc. of VLDB’99.
[3] J. Pei, J. Han, and W. Wang. Mining sequential patterns with constraints in large databases. In Proc. of CIKM’02.
Sequence Mining Automata (1): Our Sequence Mining Automaton (SMA) is a specialized kind of Petri net. It can be constructed from a DFA by transforming each edge of the DFA into a transition with two arcs: one from its input place and one to its output place. It also has the following peculiarities:
• Transitions do not consume tokens
• Parallel execution
• External signal
The initial marking consists of only the token representing the empty sequence ε in the starting place.
[Figure: example SMA for RE: A*B(B|C)D*E, driven by the external signal.]
Sequence Mining Automata (2): Each transition applies a process that is activated only if the external signal equals the label of the corresponding edge. This process produces a new set of tokens in the destination place.
[Figure: example SMA for RE: A*B(B|C)D*E, driven by the external signal.]
Sequence Mining Automata (3, Example): Given R ≡ A*B(B|C)D*E and S ≡ ACDBFAEBCFDE.
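The example can be traced with a small simulation. The sketch below rests on several assumptions that the slides do not spell out: the DFA for A*B(B|C)D*E is hand-built (the state names q0 to q3 are hypothetical), tokens are plain strings, and the process attached to a transition is assumed to append the signal to every token of its input place. Tokens are never removed, and all transitions matching a signal fire on the marking as it was before that signal.

```python
from collections import defaultdict

# Hand-built DFA for A*B(B|C)D*E (an assumption; state names are hypothetical).
TRANSITIONS = [
    ("q0", "A", "q0"),  # A*
    ("q0", "B", "q1"),  # B
    ("q1", "B", "q2"),  # (B|C)
    ("q1", "C", "q2"),
    ("q2", "D", "q2"),  # D*
    ("q2", "E", "q3"),  # E
]
START, FINAL = "q0", {"q3"}


def run_sma(sequence, transitions=TRANSITIONS, start=START, final=FINAL):
    """Feed the sequence to the SMA one signal at a time and return the
    tokens that end up in the final places."""
    marking = defaultdict(set)
    marking[start].add("")  # initial marking: the empty sequence
    for signal in sequence:
        produced = defaultdict(set)
        # Parallel execution: every matching transition reads the old marking.
        for src, label, dst in transitions:
            if label == signal:
                for token in marking[src]:
                    produced[dst].add(token + signal)
        # Tokens are not consumed: new tokens are added to the existing ones.
        for place, tokens in produced.items():
            marking[place] |= tokens
    return set().union(*(marking[place] for place in final))


valid_patterns = run_sma("ACDBFAEBCFDE")
```

Under these assumptions, `valid_patterns` contains exactly the subsequences of S that are generated by R, for example "BCE" and "AABCDE".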
One-Pass Solution (SMA-1P) and Full-Cut (SMA-FC):
SMA-1P: Run the SMA on each transaction and, at the end, compute the support of each extracted sequence, keeping only those that meet the support threshold. The support threshold is not used during generation: we compute all the sequences in the dataset that are valid w.r.t. the RE.
SMA-FC: Given an SMA, a valid set of cuts is a partition p1, . . . , pn of the places of the SMA such that there is no path from a place in pj to a place in pi if j > i. For each cut we apply SMA-1P to the whole database. At the end of the i-th scan we obtain intermediate information about frequent patterns, which can be used in subsequent scans to remove infrequent tokens.
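A minimal sketch of the one-pass idea and of the validity condition on cuts follows. It is not the authors' implementation: the DFA for A*B(B|C)D*E is hand-built, the names `run_sma`, `sma_one_pass` and `is_valid_cut_set` are hypothetical, each transition is assumed to append the signal to the tokens of its input place, and the small database is made up for illustration. The full multi-scan SMA-FC procedure is not reproduced here.

```python
from collections import Counter, defaultdict

# Hand-built DFA for A*B(B|C)D*E (an assumption; state names are hypothetical).
TRANSITIONS = [
    ("q0", "A", "q0"), ("q0", "B", "q1"),
    ("q1", "B", "q2"), ("q1", "C", "q2"),
    ("q2", "D", "q2"), ("q2", "E", "q3"),
]
START, FINAL = "q0", {"q3"}


def run_sma(sequence):
    """Return the valid patterns contained in one sequence."""
    marking = defaultdict(set)
    marking[START].add("")
    for signal in sequence:
        produced = defaultdict(set)
        for src, label, dst in TRANSITIONS:
            if label == signal:
                for token in marking[src]:
                    produced[dst].add(token + signal)
        for place, tokens in produced.items():
            marking[place] |= tokens
    return set().union(*(marking[place] for place in FINAL))


def sma_one_pass(database, min_support):
    """SMA-1P sketch: generate per sequence, apply the threshold only at the end."""
    counts = Counter()
    for sequence in database:
        counts.update(run_sma(sequence))  # each pattern counts once per sequence
    return {p: c for p, c in counts.items() if c >= min_support}


def is_valid_cut_set(partition):
    """Check that the blocks partition the places and that no edge goes from a
    later block to an earlier one (which also rules out any such path)."""
    places = {p for src, _, dst in TRANSITIONS for p in (src, dst)}
    block_of = {}
    for index, block in enumerate(partition):
        for place in block:
            if place in block_of:
                return False  # blocks overlap
            block_of[place] = index
    if set(block_of) != places:
        return False  # some place is missing or unknown
    return all(block_of[src] <= block_of[dst] for src, _, dst in TRANSITIONS)


# Hypothetical toy database, chosen only for illustration.
frequent = sma_one_pass(["ABCE", "ABBE", "BCE", "AABCDE"], min_support=2)
```

Checking edges is enough for the cut condition, because any path from a later block back to an earlier one would have to contain at least one edge that goes backwards. For the DFA above, `[{"q0"}, {"q1"}, {"q2", "q3"}]` passes the check, while swapping the first two blocks does not.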
Experiments (Synthetic Data): (D=dataset size, N=number of items, C=average length)
Experiments (Mobility data): From San Jose to San Francisco and back – via CA-101 (west-bound of the bay), i.e., passing through San Mateo (cell H9 of our map); or via I-880 (east-bound of the bay), i.e., passing through Hayward (cell J8 of our map).
Conclusions:  We have introduced “Sequence Mining Automata”, a new mechanism for mining frequent sequences under regular expressions.   Around this basic mechanism we built a family of algorithms embedding different techniques.   The efficiency of our proposal has been thoroughly proven empirically.   The SMA is a very simple and fundamental mechanism opening the door to many possible extensions.
