DATA PREPROCESSING 2
Assistant Professor จิรัฎฐา ภูบุญอบ (jiratta . [email_address] . ac . th, 08-9275-9797)
1. WHY DO WE NEED TO PREPROCESS THE DATA?
2. DATA CLEANING

Cust ID  Zip     Gender   Income    Age  Marital Status  Transaction Amount
1001     10048   M        75000     C    M               5000
1002     J2S7K7  F        -40000    40   W               4000
1003     90210   (blank)  10000000  45   S               7000
1004     6269    M        50000     0    S               1000
1005     55101   F        99999     30   D               3000
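As an illustration of the kind of check data cleaning involves, the sketch below transcribes the customer table above and applies three simple rules. The rules (blank gender, non-numeric age, negative income) and the names `rows` and `find_problems` are assumptions made for this example, not taken from the slides. Other entries, such as an age of 0 or an income of 99999, may also deserve review, but judging them needs domain knowledge that these checks do not encode.

```python
# Rows transcribed from the customer table above.
# None marks the blank Gender cell; ages are kept as text because one is not numeric.
rows = [
    {"cust_id": 1001, "zip": "10048", "gender": "M", "income": 75000,
     "age": "C", "marital": "M", "amount": 5000},
    {"cust_id": 1002, "zip": "J2S7K7", "gender": "F", "income": -40000,
     "age": "40", "marital": "W", "amount": 4000},
    {"cust_id": 1003, "zip": "90210", "gender": None, "income": 10000000,
     "age": "45", "marital": "S", "amount": 7000},
    {"cust_id": 1004, "zip": "6269", "gender": "M", "income": 50000,
     "age": "0", "marital": "S", "amount": 1000},
    {"cust_id": 1005, "zip": "55101", "gender": "F", "income": 99999,
     "age": "30", "marital": "D", "amount": 3000},
]


def find_problems(row):
    """Return a list of rule violations for one record (illustrative rules only)."""
    problems = []
    if row["gender"] is None:
        problems.append("gender is missing")
    if not row["age"].isdigit():
        problems.append("age is not a number")
    if row["income"] < 0:
        problems.append("income is negative")
    return problems


# Map each customer ID to the problems found for that record.
report = {row["cust_id"]: find_problems(row) for row in rows}

for cust_id, problems in report.items():
    print(cust_id, problems or "nothing flagged by these checks")
```

A record that passes these checks is not necessarily clean; it has only passed the three rules coded here.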
3. THE THREE MAIN TYPES OF PROBLEM DATA
4. HANDLING MISSING DATA
4. HANDLING MISSING DATA (cont'd)
5. GRAPHICAL METHODS FOR IDENTIFYING OUTLIERS
6. DATA TRANSFORMATION
6. DATA TRANSFORMATION (cont'd)
Min–max normalization: values will range from zero to one.
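Min–max normalization is commonly written as X* = (X - min(X)) / (max(X) - min(X)). The sketch below applies that formula to the Transaction Amount values from the customer table, to show why the results fall between zero and one. The function name `min_max_normalize` and the choice of column are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Rescale values so the smallest maps to 0 and the largest maps to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by the range, so a constant column cannot be rescaled.
        raise ValueError("all values are identical; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


# Transaction Amount column for customers 1001 to 1005.
amounts = [5000, 4000, 7000, 1000, 3000]
scaled = min_max_normalize(amounts)

for original, value in zip(amounts, scaled):
    print(original, round(value, 4))
```

The minimum (1000) maps to 0 and the maximum (7000) maps to 1, so every other value lands in between.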
6. DATA TRANSFORMATION (cont'd)
Z-score standardization: values will usually range between -4 and 4.
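Z-score standardization is commonly written as X* = (X - mean(X)) / SD(X). The sketch below applies it to the same Transaction Amount values. It assumes the sample standard deviation (`statistics.stdev`); the slides do not say whether the sample or population form is intended, and the function name `z_score_standardize` is hypothetical.

```python
import statistics


def z_score_standardize(values):
    """Center values on their mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    if sd == 0:
        raise ValueError("standard deviation is zero; values are all identical")
    return [(v - mean) / sd for v in values]


# Transaction Amount column for customers 1001 to 1005.
amounts = [5000, 4000, 7000, 1000, 3000]
z_scores = z_score_standardize(amounts)

for original, z in zip(amounts, z_scores):
    print(original, round(z, 3))
```

Unlike min–max normalization, the result has no fixed bounds: a value equal to the mean maps to 0, and the rest are expressed as a number of standard deviations above or below it.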
