Introduction to XLMiner™ Data Utilities
XLMiner and Microsoft Office are registered trademarks of their respective owners.
Brief description of the features of XLMiner - Data Utilities: XLMiner puts a set of data utilities at the user's disposal. They are: Sample from Worksheet/Database, Stratified Sampling, Missing Data Handling, Bin Continuous Data, and Transform Categorical Data. http://dataminingtools.net
Sample data from Worksheet: When huge amounts of data are involved, statisticians prefer to work with a sample that represents the entire database. However, such a representative sample is difficult to obtain. The entire dataset we want information about is called the population. A sample is the part of the population that we actually examine in order to draw conclusions. A good sample should be a true representation of the data: as far as possible, the cases chosen for the sample should be like the cases that are not chosen. A poor sample design can produce misleading conclusions. Various methods and techniques have been developed to obtain a representative sample, and XLMiner provides sampling facilities based on them.
Sample data from Worksheet: In XLMiner, sampling can be done in two ways.
Simple random sampling: A random sample of x records is chosen from the data such that every record in the data has an equal chance of being chosen.
Stratified sampling: The data is divided into strata of similar items. Each stratum is then sampled using the simple random approach, and the results are combined to give the final sample.
Sample data from Worksheet - Simple Random Sampling: Select the variables to be present in the sample. Here "Simple random sampling" is selected. We can specify the seed value (the value used to initialize the random selection), or the wizard will supply one by default. Set the size of the sampled set. If sampling with replacement is selected, duplicate copies of records may appear in the sample.
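The dialog options above (sample size, seed, and sampling with replacement) can be illustrated outside XLMiner. The following is a minimal sketch using only Python's standard library; the function name `simple_random_sample` and its parameters are hypothetical and are not part of XLMiner.

```python
import random


def simple_random_sample(records, size, seed=None, with_replacement=False):
    """Draw `size` records at random from the list `records`.

    Hypothetical helper for illustration; it does not reproduce
    XLMiner's own random number generator.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    if with_replacement:
        # Each draw is independent, so the same record can appear twice.
        return rng.choices(records, k=size)
    # Without replacement, each record appears at most once.
    return rng.sample(records, size)


rows = list(range(1, 101))  # stand-in for 100 worksheet rows
print(simple_random_sample(rows, 10, seed=12345))
print(simple_random_sample(rows, 10, seed=12345, with_replacement=True))
```

Running the function twice with the same seed returns the same sample, which is the practical reason for exposing a seed value in a sampling dialog.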
Sample data from Worksheet - Simple Random Sampling output
Sample data from Worksheet - Simple Random Sampling output with replacement: Duplicate copies of records exist in the sample.
Sample data from Worksheet - Stratified Sample (proportionate)
Sample data from Worksheet - Stratified Sample (proportionate, output): As selected, the percentage of records in each stratum of the sample is the same as in the input set.
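Proportionate stratified sampling can be sketched as follows. This is an illustration under stated assumptions, not XLMiner's implementation: the function name, the `stratum_of` callback, and the rounding rule (Python's built-in `round`) are all choices made for this example.

```python
import random
from collections import defaultdict


def proportionate_stratified_sample(records, stratum_of, fraction, seed=None):
    """Sample the same fraction of records from every stratum.

    `stratum_of` maps a record to its stratum label (hypothetical callback).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    sample = []
    for key in sorted(strata):  # fixed order keeps seeded runs repeatable
        members = strata[key]
        k = round(len(members) * fraction)  # assumed rounding rule
        sample.extend(rng.sample(members, k))  # simple random draw per stratum
    return sample


# Stand-in data: 60 records in stratum A, 30 in B, 10 in C.
data = ([("A", i) for i in range(60)]
        + [("B", i) for i in range(30)]
        + [("C", i) for i in range(10)])
picked = proportionate_stratified_sample(data, lambda rec: rec[0], 0.2, seed=7)
print(len(picked))
```

With a fraction of 0.2, the strata contribute 12, 6 and 2 records, so the 60/30/10 split of the input is preserved in the sample.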
Sample data from Worksheet - Stratified Sample (specify number)
Sample data from Worksheet - Stratified Sample (specify number, output): All strata have equal sizes, as specified by the user (here, 10 records each).
Sample data from Worksheet - Stratified Sample (size of smallest stratum)
Sample data from Worksheet - Stratified Sample (size of smallest stratum, output): All strata have a size equal to that of the smallest stratum.
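The two equal-size options (a user-specified number per stratum, or the size of the smallest stratum) differ only in how the per-stratum count is chosen. A minimal sketch, assuming sampling without replacement inside each stratum; the function name and the error raised when the requested number exceeds the smallest stratum are assumptions of this example, not documented XLMiner behaviour.

```python
import random
from collections import defaultdict


def equal_size_stratified_sample(records, stratum_of, per_stratum=None, seed=None):
    """Sample the same number of records from every stratum.

    If `per_stratum` is None, the size of the smallest stratum is used.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    smallest = min(len(members) for members in strata.values())
    if per_stratum is None:
        per_stratum = smallest  # "size of smallest stratum" option
    elif per_stratum > smallest:
        # Assumption of this sketch: refuse rather than oversample.
        raise ValueError("per_stratum exceeds the smallest stratum size")

    sample = []
    for key in sorted(strata):  # fixed order keeps seeded runs repeatable
        sample.extend(rng.sample(strata[key], per_stratum))
    return sample


data = ([("A", i) for i in range(60)]
        + [("B", i) for i in range(30)]
        + [("C", i) for i in range(12)])
print(len(equal_size_stratified_sample(data, lambda rec: rec[0], per_stratum=10, seed=3)))
print(len(equal_size_stratified_sample(data, lambda rec: rec[0], seed=3)))
```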
Missing Data Handling: This utility lets the user process the data before any mining method is applied to it. It detects missing values in the data and handles them the way the user chooses. XLMiner considers a cell to be missing data if it is empty or contains an invalid formula. XLMiner can also be told to treat a cell as missing data if it contains a particular value specified by the user. The user specifies how XLMiner should correct the missing values, and a treatment can be assigned to every variable. Records with missing data can either be deleted entirely, or the missing values can be replaced. XLMiner offers several replacement options, e.g. the mean, the median, the mode, or a value specified by the user. The available options depend on the type of the variable.
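The treatments described above can be sketched for a single variable as follows. This is a minimal illustration in plain Python, assuming records are dictionaries and an empty cell is represented by `None`; the function name `handle_missing` and the treatment labels are hypothetical, not XLMiner identifiers.

```python
import statistics


def handle_missing(rows, column, treatment, missing_code=None, fill_value=None):
    """Apply one missing-data treatment to `column` and return new rows.

    A cell counts as missing if it is None or equals `missing_code`
    (a user-specified placeholder such as -999).
    """
    def is_missing(value):
        return value is None or (missing_code is not None and value == missing_code)

    if treatment == "delete":
        # Drop every record whose value in this column is missing.
        return [dict(r) for r in rows if not is_missing(r[column])]

    present = [r[column] for r in rows if not is_missing(r[column])]
    if treatment == "mean":
        fill = statistics.mean(present)
    elif treatment == "median":
        fill = statistics.median(present)
    elif treatment == "mode":
        fill = statistics.mode(present)
    elif treatment == "value":
        fill = fill_value
    else:
        raise ValueError(f"unknown treatment: {treatment}")

    # Replace missing cells; leave all other cells unchanged.
    return [{**r, column: fill} if is_missing(r[column]) else dict(r) for r in rows]


rows = [{"age": 20}, {"age": None}, {"age": 40}, {"age": -999}]
print(handle_missing(rows, "age", "mean", missing_code=-999))
print(handle_missing(rows, "age", "delete", missing_code=-999))
```

The mean and median only make sense for numeric variables, while the mode also works for categorical ones, which is consistent with the remark that the available options depend on the type of the variable.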
Missing Data Handling
Missing Data Handling - Data Set: Select the action to handle the missing data in individual columns and click on "Apply this option to selected variable".
Missing Data Handling - Output: Changed records are highlighted.
Transform Categorical Data: Sometimes a data set contains variables that take non-numeric values, which makes it difficult to apply standard procedures. XLMiner therefore provides a tool that transforms non-numeric data into numeric data. There are two ways to transform categorical data.
Creating dummies: Suppose the variable has 4 distinct values, A, B, C and D. Then 3 new columns, VAL1, VAL2 and VAL3, are created, each holding either 1 or 0. If a record contains the value A, VAL1 is 1 and the rest are 0. If all three are 0, the record has the value D.
Creating category scores: If the non-numeric variable holds 4 distinct values as above, the values (ordered alphabetically) are numbered from 1 to 4, and a new column is created that contains, for each record, the number its value corresponds to.
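Both transformations can be sketched in a few lines of Python. The sketch follows the description above (one indicator fewer than the number of categories, with the alphabetically last category encoded as all zeros, and scores assigned in alphabetical order); the function names are hypothetical and the code is an illustration, not XLMiner's implementation.

```python
def create_dummies(values):
    """Return one [VAL1, VAL2, ...] list of 0/1 indicators per record.

    The alphabetically last category gets no indicator of its own;
    it is represented by all zeros.
    """
    categories = sorted(set(values))
    indicator_categories = categories[:-1]
    return [[1 if v == c else 0 for c in indicator_categories] for v in values]


def category_scores(values):
    """Number the distinct values 1..k in alphabetical order."""
    score = {c: i for i, c in enumerate(sorted(set(values)), start=1)}
    return [score[v] for v in values]


values = ["B", "A", "D", "C", "A"]
print(create_dummies(values))   # one row of three indicators per record
print(category_scores(values))  # one score per record
```

Category scores impose an ordering (A < B < C < D) that the original categories may not have, so dummies are usually the safer choice when the categories are unordered.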
Transform Categorical Data - Dummies: Select the variable that contains non-numeric data and needs to be transformed.
Transform Categorical Data - Category Scores
Transform Categorical Data - Category Scores (output)
Thank you. For more, visit: http://dataminingtools.net

XL-MINER: Data Utilities

  • 1. Introduction to XLMiner™ Data Utilities. XLMiner and Microsoft Office are registered trademarks of their respective owners.
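  • 2. Brief description of the features of XLMiner: Data Utilities. XLMiner puts a host of data utilities at the user's disposal: Sample from Worksheet/Database.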
  • 3. Stratified Sampling. Missing Data Handling. Bin Continuous Data. Transform Categorical Data. http://dataminingtools.net
  • 4. Sample data from Worksheet: When huge amounts of data are involved, statisticians prefer to work with a sample that represents the entire database. However, such a representative sample is difficult to obtain. The entire dataset we want information about is called the population. A sample is the part of the population that we actually examine to draw conclusions. A good sample should be a true representation of the data: as far as possible, the cases chosen for the sample should be like the cases that are not chosen. A poor sample design can produce misleading conclusions. Various methods and techniques have been developed to ensure a representative sample, and XLMiner provides sampling facilities.
  • 5. Sample data from Worksheet: In XLMiner, sampling can be done in two ways. Simple Random Sampling: a random sample of x records is chosen from the data such that every record in the data has an equal chance of being chosen. Stratified Sampling: the data is divided into strata of similar items, each stratum is sampled using the simple random approach, and the results are combined to give the final sample.
  • 6. Sample data from Worksheet - Simple Random Sampling: Select the variables to be present in the sample. Here “Simple Random Sampling” is selected. We can specify the seed value (the value used for random selection), or the wizard will supply one by default. Set the size of the sampled set. If the with-replacement option is selected, duplicate copies of records may appear in the sample.
  • 7. Sample data from Worksheet - Simple Random Sampling output
  • 8. Sample data from Worksheet - Simple Random Sampling output with replacement. Duplicate copies of records exist in the sample.
  • 9. Sample data from Worksheet - Stratified Sample (proportionate)
  • 10. Sample data from Worksheet - Stratified Sample (proportionate, output): As selected by us, the percentage of records in each stratum of the sample set is the same as in the input set.
  • 11. Sample data from Worksheet - Stratified Sample (specify number)
  • 12. Sample data from Worksheet - Stratified Sample (specify number, output): All strata have the equal size specified by the user (here 10 records each).
  • 13. Sample data from Worksheet - Stratified Sample (size of smallest stratum)
  • 14. Sample data from Worksheet - Stratified Sample (size of smallest stratum, output): All strata have a size equal to that of the smallest stratum.
  • 15. Missing Data Handling: This utility lets the user process the data before any mining method is applied to it, detecting missing values and handling them as the user chooses. XLMiner™ considers a cell to be missing data if it is empty or contains an invalid formula. XLMiner™ can also be told to treat a cell as missing if it contains a particular value specified by the user. The user can specify how XLMiner™ should correct these missing values, and a treatment can be assigned to every variable. Records with missing data can either be deleted entirely or have their missing values replaced. XLMiner™ provides options for the replacement value, e.g. the mean, median, mode or a value specified by the user. The available options depend on the type of variable.
  • 16. Missing Data Handling
  • 17. Missing Data Handling - Data set: Select the action to handle the missing data in individual columns and click on “Apply this option to selected variable”.
  • 18. Missing Data Handling - Output: Changed records are highlighted.
  • 19. Transform Categorical Data: Sometimes our data sets contain variables that take non-numeric values, which makes it difficult to apply standard procedures. XLMiner therefore provides a tool that transforms non-numeric data into numeric data. There are two ways to transform categorical data. Creating dummies: suppose the variable has 4 distinct values, A, B, C and D. Then 3 new columns, VAL1, VAL2 and VAL3, are created, each holding either 1 or 0. If a row contains the value A, VAL1 is 1 and the rest are 0, and so on for B and C. If all three are 0, the row has the value D. Create category scores: if the non-numeric variable holds 4 distinct values as above, the values (ordered alphabetically) are numbered from 1 to 4, and a new column is created that holds the number corresponding to each record's value.
  • 20. Transform Categorical Data - Dummies: Select the variable that contains non-numeric data and needs to be transformed.
  • 21. Transform Categorical Data - Category Scores
  • 22. Transform Categorical Data - Category Scores (output)
  • 23. Thank you. For more, visit: http://dataminingtools.net
  • 24. Visit more self-help tutorials: Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free and self-guiding, and does not include any additional support. Visit us at www.dataminingtools.net
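
The sampling, missing-data and categorical-transform steps described in the slides above can be sketched in plain Python. This is only an illustrative sketch, not XLMiner's implementation or API: the function names, the default seed value and the use of None to mark a missing cell are all assumptions made for the example. Only proportionate stratification and mean replacement are shown.

```python
import random
from collections import defaultdict
from statistics import mean


def simple_random_sample(records, size, seed=12345, with_replacement=False):
    """Draw `size` records so that every record has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    if with_replacement:
        # Each draw is independent, so duplicate records may appear.
        return [rng.choice(records) for _ in range(size)]
    return rng.sample(records, size)


def stratified_sample(records, key, fraction, seed=12345):
    """Proportionate stratified sample: take the same fraction of every stratum."""
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)  # group records by the stratum variable
    sample = []
    for label in sorted(strata):
        group = strata[label]
        n = round(len(group) * fraction)
        # Each stratum is sampled with the simple random approach, then combined.
        sample.extend(simple_random_sample(group, n, seed))
    return sample


def fill_missing_with_mean(values):
    """Replace missing entries (None, by assumption) with the mean of the rest."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def dummies(values):
    """k distinct categories give k-1 indicator columns; the last category is all zeros."""
    cats = sorted(set(values))
    return [[1 if v == c else 0 for c in cats[:-1]] for v in values]


def category_scores(values):
    """Number the distinct categories 1..k in alphabetical order."""
    score = {c: i for i, c in enumerate(sorted(set(values)), start=1)}
    return [score[v] for v in values]
```

For instance, dummies(["A", "B", "C", "D"]) gives [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]], which matches the VAL1 to VAL3 description on slide 19 where D is the all-zero row, and category_scores(["B", "A", "D", "C"]) gives [2, 1, 4, 3].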