CSCE 3110
  Data Structures
  & Algorithm Analysis

Rada Mihalcea
http://www.cs.unt.edu/~rada/CSCE3110
Trees Applications
Trees: A Review (again? )
General trees
  one parent, N children
Binary tree
  ISA General tree
  + max 2 children
Binary search tree
  ISA Binary tree
  + left subtree < parent < right subtree
AVL tree
  ISA Binary search tree
  + | height left subtree – height right subtree | ≤ 1
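The "ISA" chain above can be read as a stack of checks: an AVL tree is a binary search tree that also satisfies the height condition at every node. A minimal sketch, assuming a hypothetical `Node` class with integer keys and the convention that an empty subtree has height -1 (neither is from the slides):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(node):
    # Assumed convention: empty tree has height -1, a single leaf has height 0.
    return -1 if node is None else 1 + max(height(node.left), height(node.right))


def is_bst(node, lo=None, hi=None):
    # Every key must lie strictly between the bounds inherited from ancestors:
    # left subtree < parent < right subtree.
    if node is None:
        return True
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key >= hi:
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)


def is_balanced(node):
    # |height(left) - height(right)| <= 1 must hold at every node.
    if node is None:
        return True
    if abs(height(node.left) - height(node.right)) > 1:
        return False
    return is_balanced(node.left) and is_balanced(node.right)


def is_avl(node):
    # AVL tree ISA binary search tree + the height condition.
    return is_bst(node) and is_balanced(node)


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))  # a BST, but too tall on the right
```

Here `chain` passes `is_bst` but fails `is_avl`. Recomputing `height` at every node keeps the sketch short but is not efficient; a real AVL implementation would store heights in the nodes.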
Trees: A Review (cont’d)
Multi-way search tree
  ISA General tree
  + Each node has K keys and K+1 children
  + All keys in child K < key K < all keys in child K+1
2-4 Tree
  ISA Multi-way search tree
  + All nodes have at most 3 keys / 4 children
  + All leaves are at the same level
B-Tree
  ISA Multi-way search tree
  + All nodes except the root have at least T keys, at most 2T(+1) keys
  + All leaves are at the same level
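The key ordering rule (all keys in child K < key K < all keys in child K+1) is what makes search work in any multi-way search tree, including 2-4 trees and B-trees. A minimal sketch, assuming a hypothetical `MNode` class with sorted keys and zero-based child indices (names not from the slides):

```python
from bisect import bisect_left


class MNode:
    """Hypothetical multi-way node: K sorted keys, and K+1 children unless it is a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list marks a leaf


def search(node, key):
    while node is not None:
        # Index of the first key >= the one we are looking for.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False
        # Child i holds exactly the keys that fall below keys[i]
        # (and above keys[i-1], if there is one).
        node = node.children[i]
    return False


# A small 2-4 tree: at most 3 keys per node, all leaves on the same level.
root = MNode([10, 20], [MNode([2, 5]), MNode([12, 15, 18]), MNode([25])])
```

For example, `search(root, 15)` descends into the middle child and finds the key, while `search(root, 13)` descends into the same child and stops at the leaf.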
Tree Applications
Data Compression
  Huffman tree


Automatic Learning
  Decision trees
Huffman code
Very often used for text compression
Do you know how gzip or WinZip works?
  → Compression methods

ASCII code uses codes of equal length for all
letters → how many codes?
Today’s alternative to ASCII?

Idea behind Huffman code: use shorter length
codes for letters that are more frequent
Huffman Code
  Build a list of letters and frequencies
“have a great day today”

  Build a Huffman Tree bottom up, by grouping
  letters with smaller occurrence frequencies
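The two steps above can be sketched as follows. This is one possible implementation, assuming a binary heap as the priority queue and an arbitrary tie-breaking counter; the exact bit patterns depend on how ties are broken, so a hand-built tree may assign different (equally good) codes.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    # Step 1: list of letters and frequencies.
    freq = Counter(text)
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:
        return {heap[0][2]: "0"}  # a single distinct letter still needs one bit

    # Step 2: build the tree bottom up by repeatedly grouping
    # the two subtrees with the smallest frequencies.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    # Read the codes off the tree: left edge = "0", right edge = "1".
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a single character
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


text = "have a great day today"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
```

For this string the encoding takes 72 bits, compared with 88 bits for a fixed-length 4-bit code over its 11 distinct characters. The most frequent letter, "a" (5 occurrences), never gets a longer code than any other character.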
Huffman Codes
Write the Huffman codes for the strings
  “abracadabra”

  “Veni Vidi Vici”
Huffman Code
Running time?
Suppose N letters in input string, with L unique
letters

What is the most important factor for obtaining
highest compression?
Compare: [assume a text with a total of 1000
characters]
   I. Three different characters, each occurring the same
   number of times
   II. 20 different characters, 19 of them occurring only once,
   and the 20th occurring the rest of the time
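One way to explore both questions is to compute the size of the Huffman-encoded text directly from the frequencies. With a binary heap, counting the letters takes O(N) and building the tree takes O(L log L). The sketch below uses the fact that the total encoded length equals the sum of the weights of all merged (internal) nodes. The split 334/333/333 for case I is an assumption, since 1000 is not divisible by 3.

```python
import heapq


def huffman_bits(freqs):
    """Total bits of the Huffman-encoded text, given only the letter frequencies."""
    heap = list(freqs)
    heapq.heapify(heap)
    if len(heap) == 1:
        return heap[0]  # one distinct letter: one bit per occurrence
    total = 0
    while len(heap) > 1:
        # Each merge pushes every letter below it one level deeper,
        # which costs one extra bit per occurrence of those letters.
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


case_1 = huffman_bits([334, 333, 333])    # three characters, (almost) equal counts
case_2 = huffman_bits([1] * 19 + [981])   # 19 rare characters, one dominant one
```

Case I needs 1666 bits (about 1.67 bits per character, against 2 bits for a fixed-length code), while case II needs 1082 bits (about 1.08 bits per character, against 5 bits for a fixed-length code over 20 characters). The skew of the frequency distribution matters far more than the number of distinct characters.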
One More Application
Heuristic Search
  Decision Trees


Given a set of examples, with an associated
decision (e.g. good/bad, +/-, pass/fail,
caseI/caseII/caseIII, etc.)
Attempt to take (automatically) a decision
when a new example is presented
  Predict the behavior in new cases!
Data Records
Name             A  B  C  D  E  F  G  Class
1. Jeffrey B.    1  0  1  0  1  0  1  -
2. Paul S.       0  1  1  0  0  0  1  -
3. Daniel C.     0  0  1  0  0  0  0  -
4. Gregory P.    1  0  1  0  1  0  0  -
5. Michael N.    0  0  1  1  0  0  0  -

6. Corinne N.    1  1  1  0  1  0  1  +
7. Mariyam M.    0  1  0  1  0  0  1  +
8. Stephany D.   1  1  1  1  1  1  1  +
9. Mary D.       1  1  1  1  1  1  1  +
10. Jamie F.     1  1  1  0  0  1  1  +
Fields in the Record
A: First name ends in a vowel?
B: Neat handwriting?
C: Middle name listed?
D: Senior?
E: Got extra-extra credit?
F: Google brings up home page?
G: Google brings up reference?
Build a Classification Tree
Internal nodes: features
Leaves: classification

                     F
                0 /     \ 1
                 A       D
               /   \    /  \
          2,3,7 1,4,5,6 10  A
                            |
                           8,9

Error: 30%
Different Search Problem
Given a set of data records with their
  classifications, pick a decision tree: search
  problem!
Challenges:
  Scoring function?
  Large space of trees.

What’s a good tree?
 Low error on given set of records
 Small
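To make the scoring idea concrete, the sketch below searches the smallest possible space of trees: one-node trees ("stumps") that test a single feature and predict the majority class on each branch. The score is the number of misclassified training records. The record encoding (a 7-character bit string for fields A to G plus a class label) and the function names are assumptions made for this illustration.

```python
FEATURES = "ABCDEFG"

# The ten data records: feature bits for A..G, then the class.
RECORDS = [
    ("1010101", "-"),  # 1. Jeffrey B.
    ("0110001", "-"),  # 2. Paul S.
    ("0010000", "-"),  # 3. Daniel C.
    ("1010100", "-"),  # 4. Gregory P.
    ("0011000", "-"),  # 5. Michael N.
    ("1110101", "+"),  # 6. Corinne N.
    ("0101001", "+"),  # 7. Mariyam M.
    ("1111111", "+"),  # 8. Stephany D.
    ("1111111", "+"),  # 9. Mary D.
    ("1110011", "+"),  # 10. Jamie F.
]


def stump_errors(feature):
    """Training errors of a one-node tree that splits on `feature`."""
    i = FEATURES.index(feature)
    errors = 0
    for value in "01":
        labels = [label for bits, label in RECORDS if bits[i] == value]
        # Predicting the majority class on this branch misclassifies the minority.
        errors += min(labels.count("+"), labels.count("-"))
    return errors


scores = {f: stump_errors(f) for f in FEATURES}
best = min(scores, key=scores.get)
```

On these records the best single feature is B (neat handwriting), with 1 error out of 10. Scoring larger trees the same way quickly becomes expensive, which is the "large space of trees" challenge.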
“Perfect” Decision Tree


   C (middle name?)
     0: +
     1: E (EEC?)
          0: F (Google?)
               0: -
               1: +
          1: B (Neat?)
               0: -
               1: +


Training set Error: 0%
(can always do this?)
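One way to approach the question: a tree can only separate records that differ in at least one feature, so 0% training error is out of reach whenever two records have identical features but different classes. A minimal sketch of that check, assuming records are encoded as a bit string for fields A to G plus a class label (the encoding and the function name are illustrative):

```python
from collections import defaultdict

RECORDS = [
    ("1010101", "-"), ("0110001", "-"), ("0010000", "-"),
    ("1010100", "-"), ("0011000", "-"),
    ("1110101", "+"), ("0101001", "+"), ("1111111", "+"),
    ("1111111", "+"), ("1110011", "+"),
]


def zero_error_possible(records):
    """True if no two records share all feature values but differ in class."""
    labels_by_features = defaultdict(set)
    for bits, label in records:
        labels_by_features[bits].add(label)
    return all(len(labels) == 1 for labels in labels_by_features.values())


ok = zero_error_possible(RECORDS)
# A hypothetical extra record that contradicts records 8 and 9:
conflict = zero_error_possible(RECORDS + [("1111111", "-")])
```

Records 8 and 9 have identical features but also the same class, so `ok` is true for the given data. Adding the contradictory record makes a perfect tree impossible.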
Search For a Classification
  Classify new records

New1. Mike M.            1 0 1 1 0 0 1 ?
New2. Jerry K.           0 1 0 1 0 0 0 ?
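As a sketch, the "perfect" tree can be written as nested tuples and applied to the two new records. Two assumptions are made here: the tuple encoding `(feature, subtree_if_0, subtree_if_1)` is just for illustration, and the leaf for C = 0 is taken to be "+" because record 7 is the only training record without a middle name. The outputs are this tree's predictions, not known answers.

```python
FEATURES = "ABCDEFG"

# A leaf is a class label; an internal node is (feature, subtree_if_0, subtree_if_1).
TREE = ("C", "+",                       # assumed leaf for C = 0 (see above)
        ("E", ("F", "-", "+"),          # E = 0: ask F (Google brings up home page?)
              ("B", "-", "+")))         # E = 1: ask B (neat handwriting?)

TRAINING = [
    ("1010101", "-"), ("0110001", "-"), ("0010000", "-"),
    ("1010100", "-"), ("0011000", "-"),
    ("1110101", "+"), ("0101001", "+"), ("1111111", "+"),
    ("1111111", "+"), ("1110011", "+"),
]


def classify(tree, bits):
    # Walk from the root to a leaf, following the branch for each tested feature.
    while not isinstance(tree, str):
        feature, if_zero, if_one = tree
        tree = if_one if bits[FEATURES.index(feature)] == "1" else if_zero
    return tree


training_errors = sum(classify(TREE, bits) != label for bits, label in TRAINING)
new1 = classify(TREE, "1011001")  # New1. Mike M.
new2 = classify(TREE, "0101000")  # New2. Jerry K.
```

Under these assumptions the tree makes no errors on the ten training records, predicts "-" for New1 (middle name listed, no extra-extra credit, no home page) and "+" for New2 (no middle name listed).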
The very last tree for
this class
