Introduction to XLMiner™: PARTITION DATA
XLMiner and Microsoft Office are registered trademarks of their respective owners.
Introduction to Partition Data
Data sets used in mining are generally very large, and one way to make them easier to mine is to divide (partition) the data. Partitioning means dividing the data set into multiple partitions that are mutually exclusive, i.e. they do not overlap and have no data records in common. Partitioning generally results in 3 sets of data:
Training data set: This partition is used to create/build the mining model.
Validation data set: It is used to check whether the model developed using the training set is accurate. The validation set consists of data whose result (the value of the variable to be determined) is already known, so that the results obtained by applying the model can be compared with the actual results.
Test data set: It is used to determine how the model would perform when it encounters real-world data.
http://dataminingtools.net
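The idea of mutually exclusive partitions can be sketched in a few lines of Python. This is only an illustration of the concept, not XLMiner's own procedure; the function name `partition_indices`, the 50/30/20 default and the fixed seed are assumptions made for the example.

```python
import random

def partition_indices(n_rows, percents=(50, 30, 20), seed=0):
    """Split row indices 0..n_rows-1 into training, validation and test lists."""
    indices = list(range(n_rows))
    # Shuffling gives every row the same chance of landing in any partition.
    random.Random(seed).shuffle(indices)
    n_train = n_rows * percents[0] // 100
    n_valid = n_rows * percents[1] // 100
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]   # whatever is left goes to the test set
    return train, valid, test

train, valid, test = partition_indices(100)
# No row appears in more than one partition, and no row is lost.
overlap = (set(train) & set(valid)) | (set(train) & set(test)) | (set(valid) & set(test))
print(len(train), len(valid), len(test), len(overlap))   # 50 30 20 0
```

Because each index is placed in exactly one slice of the shuffled list, the three partitions cannot share a record.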
Types of Partitions
XLMiner allows us to create 2 kinds of partitions:
Standard Partition: Creates 3 partitions based on the partition ratios provided. Data records are randomly selected, and every record has an equal chance of falling into any of the partitions. The ratios can be set in the following ways:
Automatic: XLMiner sets the partition ratios itself.
Specify percentages: Unlike Automatic, this option lets the user specify the ratio of the partitions as percentages.
Equal partitions: Selecting this option sets a partitioning ratio of 33.3 (training) : 33.3 (validation) : 33.3 (test).
Partition with oversampling: This method of partitioning is used when the percentage of successes in the output variable is very low in the data set, but we want to train the model on data with a particular percentage of successes.
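To make the oversampling idea concrete, here is a minimal Python sketch that draws a training set with a chosen share of successes from data in which successes are rare. It is not XLMiner's algorithm; the function name, the parameters and the toy labels are all assumptions for illustration.

```python
import random

def oversampled_training_indices(labels, success, n_train, success_share, seed=0):
    """Pick n_train row indices so that success_share of them are successes."""
    rng = random.Random(seed)
    successes = [i for i, y in enumerate(labels) if y == success]
    failures = [i for i, y in enumerate(labels) if y != success]
    n_success = round(n_train * success_share)
    n_failure = n_train - n_success
    if n_success > len(successes) or n_failure > len(failures):
        raise ValueError("not enough rows of one class for the requested mix")
    # Sample each class separately (without replacement), then combine.
    return rng.sample(successes, n_success) + rng.sample(failures, n_failure)

labels = [1] * 10 + [0] * 90     # hypothetical data: only 10% successes
train = oversampled_training_indices(labels, success=1, n_train=16, success_share=0.5)
print(sum(labels[i] for i in train) / len(train))   # 0.5
```

The full data set contains 10% successes, while the sampled training set contains 50%, which is the kind of adjustment the oversampling option is meant to provide.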
Data Set used for Partition
Standard Partition (Automatic) - Step 1
Standard Partition (Automatic) - Output: Testing Set, Validation Set
Standard Partition (Specify) - Step 1: Selecting “Specify percentages” allows us to set the partitioning ratios as needed. Here we have set a ratio of 50 (training) : 30 (validation) : 20 (test).
Standard Partition (Equal) - Step 1: Selecting “Equal” sets the partitioning ratio at 33.3% for each partition, creating 3 equal-sized partitions.
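Percentages such as 50:30:20 or 33.3:33.3:33.3 have to be turned into whole numbers of rows. The sketch below shows one common way to do that (round down, then hand out the leftover rows); how XLMiner itself rounds is not stated in these slides, so the helper `partition_sizes` and the row counts used are assumptions.

```python
def partition_sizes(n_rows, percentages):
    """Turn partition percentages into whole row counts that sum to n_rows."""
    total = sum(percentages)
    exact = [n_rows * p / total for p in percentages]
    sizes = [int(x) for x in exact]              # round every share down first
    leftover = n_rows - sum(sizes)
    # Give the leftover rows to the partitions with the largest fractional parts.
    order = sorted(range(len(exact)), key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes

print(partition_sizes(200, [50, 30, 20]))        # [100, 60, 40]
equal = partition_sizes(100, [33.3, 33.3, 33.3])
print(sum(equal), max(equal) - min(equal))       # 100 1
```

With 100 rows, three "equal" partitions cannot all be the same size, so one partition ends up one row larger than the others.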
Oversampled Partition - Data Set: In order to oversample a data set, it must contain at least one data item (variable) that takes exactly 2 distinct values. Only such a variable can be used to define the success class (the class that is oversampled).
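The two-value requirement is easy to check before partitioning. The following Python sketch lists the columns that take exactly two distinct values; the column names and rows are made up for illustration and do not come from the data set used in the slides.

```python
def binary_columns(rows):
    """Return the names of columns that take exactly two distinct values."""
    names = rows[0].keys()
    return [name for name in names if len({row[name] for row in rows}) == 2]

# Hypothetical records; only "bought" has exactly two distinct values.
rows = [
    {"age": 34, "income": 52, "bought": "yes"},
    {"age": 51, "income": 48, "bought": "no"},
    {"age": 29, "income": 61, "bought": "no"},
]
print(binary_columns(rows))   # ['bought']
```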
Oversampled Partition - Step 1
Oversampled Partition - Output: The records in the training data set
Oversampled Partition - Output: Rows in validation set = 27, rows in test set = 12. The 30% appears to be taken from the validation data before the test rows are removed (27 + 12 = 39, and 30% of 39 ≈ 12).
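One reading that makes these row counts add up is that the test set is carved out of the validation data: 30% of the validation rows are moved to the test set and the rest stay. The Python sketch below works through the arithmetic under that assumption; the starting figure of 39 rows is inferred as 27 + 12 and is not stated in the slides.

```python
def split_validation(n_validation_before, test_percent):
    """Move test_percent of the validation rows into a test set."""
    n_test = round(n_validation_before * test_percent / 100)
    n_validation_after = n_validation_before - n_test
    return n_validation_after, n_test

# 30% of 39 is 11.7, which rounds to 12 test rows and leaves 27 for validation.
print(split_validation(39, 30))   # (27, 12)
```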
