Data Mining Tools
Kowshik
Madhumati
Mayur
Mohamed Sharique
Vidyashankar
Orange
Product Details
• Open source
• Data visualization and analysis
• Suited to both novices and experts
• Scriptable through Python
• Available for all popular platforms, including Windows, Mac OS X and variants of Linux.
Company Details
• Founded in 1996
• Orange is distributed free under the GPL.
• Maintained and developed at the Bioinformatics Laboratory of the Faculty of Computer and Information Science, University of Ljubljana, Slovenia.
Python is a widely used general-purpose, high-level programming language.
The GNU General Public License is one of the most widely used free software licenses.
Features
• Visual Programming
• Visualization
• Interaction and Data Analytics
• Large Toolbox
• Scripting Interface
• Extendable
• Documentation
• Open Source
• Platform Independence
Success Stories
• AstraZeneca, a pharmaceutical giant, uses Orange in drug development and sponsors the development of several related parts of Orange.
• At the Jožef Stefan Institute, the visual programming interface has been upgraded in Orange4WS to support service-oriented architectures.
Screenshot
R
Product Details
• Latest R-language engine for statistical computing
• Open source, with paid R Enterprise and R Cloud versions
• Data visualization and analysis up to 16 TB
• Extended capabilities with reproducible R toolkits
• Runs on Windows, Mac OS and variants of Linux.
Company Details
• Created in 1993 in New Zealand
• Ross Ihaka and Robert Gentleman pioneered the development of the R language.
• R is licensed under the GNU General Public License.
• Many large multinational companies use R.
Useful Functions
• Graphics Visualization
• Spatial Data Analysis
• Clustering
• Text Mining
• Social Network Analysis and Graph mining
• Statistics
• Graphics
• Data Manipulation
Success Stories
• Bank of America
• Bing
• Facebook
• Ford
• Google
Screenshot
Weka
Product Details
• Open source
• A collection of machine learning algorithms
• Data visualization and analysis
• Java-based platform
• Widely used by researchers and practitioners
Company Details
• Founded in 1997
• Developed at the University of Waikato
The GNU General Public License is one of the most widely used free software licenses.
Features
• Distributed under the General Public License
• GUI for interacting with data files
• Explorer is the main user interface of WEKA
• Supports primitive tasks including data pre-processing, classification, regression, clustering, association rules and visualization
• Works with data files in multiple formats
• One exceptional feature of WEKA is the database connection using JDBC with any RDBMS package
Success Stories
• The Weka mailing list has over 1100 subscribers in 50 countries, including subscribers from many major companies such as Rechtsportal
Screenshot
RapidMiner
Product Details
• Open source.
• Data visualization and analysis
• Machine Learning
• Data Mining, Text Mining.
• Business Intelligence.
• Runs on the Java runtime.
• Available on all major operating systems and platforms
Company Details
• Started as YALE in 2001 by Ralf Klinkenberg, Ingo Mierswa, and Simon Fischer
• From 2006 it was developed by Rapid-I, a company founded by Ralf Klinkenberg and Ingo Mierswa, and was renamed RapidMiner.
• Licensed under the AGPL.
Features
• A visual, code-free environment, so no programming is needed
• Design of analysis processes
• Predictive analytics (with pre-made templates)
• Data loading
• Data transformation
• Data Modelling
• Data visualization (with lots of visualizations)
• Allows you to work with different types and sizes of data sources
• Platform Independence.
• Acts as a powerful scripting language engine along with a graphical user interface
• Modular operator concept.
• Multi-layered data view.
Success Stories
• CISCO
• PAYPAL
• EBAY
• MIELE
• VOLKSWAGEN
Screenshot
Overall Comparison
Procedure: Partitioning of dataset into training and testing sets
• R: Pass (but limited partitioning methods)
• RapidMiner: Pass (but limited partitioning methods)
• Weka: Pass (but limited partitioning methods)
• Orange: Pass (but limited partitioning methods)
Procedure: Descriptor scaling
• R: Pass
• RapidMiner: Pass
• Weka: Fail (cannot save parameters for scaling to apply to future datasets)
• Orange: Fail (no scaling methods)
Procedure: Descriptor selection
• R: Fail (no wrapper methods)
• RapidMiner: Pass
• Weka: Pass (but is not part of KnowledgeFlow)
• Orange: Fail (no wrapper methods)
Procedure: Parameter optimization of machine learning/statistical methods
• R: Fail (not automatic)
• RapidMiner: Pass
• Weka: Fail (not automatic)
• Orange: Fail (not automatic)
Procedure: Model validation using cross-validation and/or independent validation set
• R: Pass (but limited error measurement methods)
• RapidMiner: Pass
• Weka: Pass (but cannot save model so have to rebuild model for every future dataset)
• Orange: Pass (but cannot save model so have to rebuild model for every future dataset)
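Three of the procedures compared above (partitioning, descriptor scaling with parameters that can be saved and reused, and cross-validation) can be sketched in plain Python. This is only an illustrative sketch, not how R, RapidMiner, Weka or Orange implement them; the function names (`train_test_split`, `fit_scaler`, `apply_scaler`, `k_fold_indices`) and the toy data are assumptions made for this example.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(rows):
    """Learn per-column (min, max) from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_scaler(rows, params):
    """Min-max scale rows using previously saved parameters."""
    return [
        [(v - lo) / (hi - lo) if hi != lo else 0.0
         for v, (lo, hi) in zip(row, params)]
        for row in rows
    ]


def k_fold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


# Hypothetical toy dataset: 8 rows, 2 descriptors.
data = [[float(i), float(i * 10)] for i in range(8)]
train, test = train_test_split(data, test_fraction=0.25, seed=0)

# The scaling parameters are fitted once and reused on the test set,
# which is the capability the table marks as missing in Weka.
params = fit_scaler(train)
train_scaled = apply_scaler(train, params)
test_scaled = apply_scaler(test, params)

folds = list(k_fold_indices(len(train), k=3))
```

Keeping `params` as a separate value is the design point: the same parameters can be stored and applied to any future dataset, so test values are scaled on the training set's range rather than their own.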

Editor's Notes

  1. Weka contains a GUI for interacting with data files and producing visual results.
  2. Explorer has several panels providing access to the main components of the workbench:
     • The Preprocess panel has facilities for importing data from a database, a CSV file, etc., and for preprocessing this data using a filtering algorithm. Such filters can transform the data and delete instances and attributes according to specific criteria.
     • The Classify panel applies classification and regression algorithms to the dataset, estimates the accuracy of the resulting predictive model, and visualises erroneous predictions, ROC curves or the model itself.
     • The Associate panel provides access to association rule learning, which identifies the interrelationships between attributes in the data.
     • The Cluster panel provides access to clustering techniques, including the simple k-means algorithm and many others.
     • The Select attributes panel provides access to algorithms for identifying the most predictive attributes in a dataset.
     • The Visualize panel shows a scatter plot matrix in which individual scatter plots can be selected, enlarged and analysed using various selection operators.
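
The simple k-means algorithm mentioned for the Cluster panel can be sketched in plain Python as below. This is a minimal sketch of the textbook procedure, not Weka's implementation; the function name `k_means`, the fixed starting centroids and the toy points are assumptions made for this example (real tools usually pick the starting centroids at random).

```python
import math


def k_means(points, centroids, max_iter=100):
    """Alternate between assigning points and updating centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
```

The loop stops as soon as the centroids stop moving, which for well-separated data like this happens after a couple of iterations.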