Data Mining Tool: Orange
Neeraj Goswami
Contents
• Data mining
• Data warehouse
• Orange Software
• Orange Widgets
• Demo
What is Data Mining?
• The process of analyzing data from different perspectives
• Summarizing it into useful information
• Information that can be used to increase revenue, cut costs, or both
Analysis(cont…)
Data mining helps analysts recognize significant
• facts
• relationships
• trends
• patterns
• exceptions
• anomalies
that might otherwise go unnoticed.
Industries Using Data Mining
• retail
• finance
• health care
• manufacturing
• transportation
• aerospace
Major Data Mining Tasks
1) Classification: predicting an item class
2) Clustering: descriptive, finding groups of items
3) Deviation detection: predictive, finding changes
4) Forecasting: predicting a parameter value
5) Description: describing a group
6) Link analysis: finding relationships and associations
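As a rough illustration of one task from the list above, deviation detection, the sketch below flags values that lie far from the mean of a series. The readings, the function name `find_deviations`, and the threshold of 2 standard deviations are made up for illustration. They do not come from the slides or from Orange.

```python
from statistics import mean, pstdev


def find_deviations(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    The z-score is the distance from the mean measured in population
    standard deviations. The default threshold is an assumed value.
    """
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # All values are identical, so nothing deviates.
        return []
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - center) / spread > threshold
    ]


# Hypothetical daily readings with one unusual value at the end.
readings = [10, 11, 9, 10, 12, 10, 30]
print(find_deviations(readings))
```

A fixed z-score cutoff is only one simple choice. It is sensitive to the outliers it is trying to find, so real tools often use more robust measures.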
Data Warehouse
A single, complete, and consistent store of data
obtained from a variety of different sources and made
available to end users in a form they can understand
and use in a business context.
Data Warehouse-Layers
Decision Tree (classification algorithm)
Age   Smoke   Risk
20    No      Low
25    Yes     High
44    Yes     High
18    No      Low
55    No      High
35    No      Low
[Tree diagram: splits on Smoke (Yes / No) and on Age (0-35 / 36-100); the leaves give the Insurance Risk (High, High, Low)]
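The age, smoke, and risk values on this slide can be checked against a small hand-written tree. The sketch below assumes the tree splits on Smoke first and then on Age at 35. The slide text does not state the order of the splits, so this is one reading that agrees with all six rows, not necessarily the exact tree in the diagram. The function names are hypothetical.

```python
# Rows from the slide: (age, smoke, risk)
ROWS = [
    (20, "No", "Low"),
    (25, "Yes", "High"),
    (44, "Yes", "High"),
    (18, "No", "Low"),
    (55, "No", "High"),
    (35, "No", "Low"),
]


def predict_risk(age, smoke):
    """Classify insurance risk with two nested tests.

    Assumed split order: Smoke at the root, then Age (0-35 vs 36-100).
    """
    if smoke == "Yes":
        return "High"
    if age <= 35:
        return "Low"
    return "High"


def accuracy(rows):
    """Fraction of rows where the tree's prediction matches the label."""
    correct = sum(1 for age, smoke, risk in rows if predict_risk(age, smoke) == risk)
    return correct / len(rows)


for age, smoke, risk in ROWS:
    print(age, smoke, "predicted:", predict_risk(age, smoke), "actual:", risk)
print("accuracy on the slide's rows:", accuracy(ROWS))
```

Each path from the root to a leaf reads as a plain if-then rule, which is why a tree of this kind is easy to interpret.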
Decision tree advantages
• The model is simple to understand and interpret
• Requires little data preparation
• Possible to validate a model using statistical tests
• Robust
ORANGE SOFTWARE
• Open source
• Component based
• Data visualization
• Analysis for novices and experts
• Data mining through visual programming or Python scripting
• Add-ons for bioinformatics and text mining
• Packed with features for data analytics
Orange Developments
• 1997: developed at the Bioinformatics Laboratory of the Faculty of Computer and Information Science, Slovenia
• 2005: extended to data analysis in bioinformatics
• 2008: installation packages were developed
• 2009: over 100 widgets had been created and were being maintained
Widgets?
• Orange widgets provide a graphical user interface to Orange’s data mining and machine learning methods. They include widgets for:
• data entry and preprocessing
• data visualization
• classification
Data Widget
Classify Widget
Examples
• Most schemas should start with the File widget. In the schema below, the File widget reads the data, which is then sent to both the Data Table widget and a widget that displays attribute statistics.
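The same flow (read a file, look at the rows, summarize each attribute) can be sketched in plain Python. This uses only the standard library and is not Orange's scripting API. The in-memory CSV text reuses the age, smoke, and risk values from the decision tree slide, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Stand-in for a data file; values taken from the decision tree slide.
CSV_TEXT = """Age,Smoke,Risk
20,No,Low
25,Yes,High
44,Yes,High
18,No,Low
55,No,High
35,No,Low
"""


def read_table(handle):
    """Read CSV rows into a list of dicts (the 'file' step)."""
    return list(csv.DictReader(handle))


def attribute_statistics(rows):
    """Summarize each column: min/max/mean if numeric, value counts otherwise."""
    stats = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        try:
            numbers = [float(value) for value in values]
        except ValueError:
            # Non-numeric column: count how often each value occurs.
            stats[name] = {"counts": dict(Counter(values))}
        else:
            stats[name] = {
                "min": min(numbers),
                "max": max(numbers),
                "mean": mean(numbers),
            }
    return stats


rows = read_table(io.StringIO(CSV_TEXT))
for row in rows:  # the 'data table' view
    print(row)

stats = attribute_statistics(rows)
for name, summary in stats.items():  # the 'attribute statistics' view
    print(name, summary)
```

Sending the same rows to two consumers mirrors how one widget's output can feed several downstream widgets in a schema.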
Scatter Plot (a widget)
Scripting
Visualization
DEMO
References
– http://orange.biolab.si/docs/latest/
– http://en.wikipedia.org/wiki/Data_mining
– http://www.oracle.com/technetwork/database/options/advanced-analytics/odm/index.html
– http://orange.biolab.si/features/
– http://en.wikipedia.org/wiki/Orange_(software)
– http://eprints.fri.uni-lj.si/1150/1/DataMining-Kyoto.pdf
THANK YOU
Questions
????
