Air Quality in Taiwan 2013 
By Tony Cheng 
typhoon.tony2002@gmail.com 
NYC Data Science Academy 
Student Demo day 11-19-2014 
R-002 Taiwan Open Data and Data Science
• Explore
  ◦ Use historical data to find patterns of air pollution in different cities
• Data sources
  ◦ EPA Taiwan historical air quality data, 1987-2013
    ▪ Hourly data from 79 monitoring stations in Taiwan
    ▪ Variables cover 11 types of air pollutant (e.g. SO2, NO2, ozone, PM10…) and weather monitoring data (e.g. temperature, wind, rainfall…)
• Parameters
  ◦ Data from 2013
  ◦ Selected stations
    ▪ Zhongshan (Taipei City), Xitun (Taichung City), Xiaogang (Kaohsiung City) and Hualien (Hualien County)
• Process
  ◦ Clean the data and build a data frame
  ◦ Find the characteristics of each pollutant in different cities and time periods
  ◦ Data visualization
• Packages
  ◦ reshape2
  ◦ ggplot2
  ◦ lattice
  ◦ plyr
  ◦ scales
  ◦ shiny
• Convert the original data from xls files to csv files manually before using R.
• Read the files automatically using “dir()” and a “for” loop.
[Figure: Before / After]
library(reshape2)

## Get the list of csv files in the working directory.
filelist = dir(pattern = "\\.csv$")
file_i = length(filelist)
data_l = data.frame()

## Read each csv file and change the data format.
for (i in seq_len(file_i)) {
  data = read.csv(filelist[i], header = TRUE)
  # Columns: date, station, pollutant type, then one column per hour (0-23).
  names(data) = c("Date", "STN", "Type", as.character(0:23))
  data$Date = as.Date(data$Date)  # Change date from string to Date format.

  # Reshape the data frame from wide to long (one row per hour).
  tmp = melt(data = data, id.vars = c("STN", "Date", "Type"))

  # Combine the data from all stations.
  data_l = rbind(data_l, tmp)
}

names(data_l) = c("STN", "Date", "Type", "Time", "value")
# Change value into numeric data; non-numeric entries become NA.
data_l$value = as.numeric(as.character(data_l$value))

# One column per pollutant type; duplicate readings are averaged.
data_w = dcast(data_l, STN + Date + Time ~ Type,
               value.var = "value", fun.aggregate = mean)

# Get the year, month, day and weekday.
data_w$year = format(data_w$Date, "%y")      # two-digit year, e.g. "13"
data_w$month = format(data_w$Date, "%m")
data_w$days = format(data_w$Date, "%d")
data_w$weekdays = weekdays(data_w$Date)      # names depend on the system locale
* STN = Station name in Chinese
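The reshaping step above can be tried on a toy table. This is only a sketch: the station name "A", the values, and the three hour columns are made up for illustration, while the real files have 24 hour columns and Chinese station names.

```r
library(reshape2)

# Hypothetical wide table: one row per station, date and pollutant,
# one column per hour (only hours 0-2 are shown here).
wide <- data.frame(
  Date = as.Date(c("2013-01-01", "2013-01-01")),
  STN  = c("A", "A"),
  Type = c("NO2", "SO2"),
  `0`  = c(10, 2),
  `1`  = c(12, 3),
  `2`  = c(14, 4),
  check.names = FALSE
)

# Wide to long: every hour column becomes its own row.
long <- melt(wide, id.vars = c("STN", "Date", "Type"),
             variable.name = "Time", value.name = "value")

# Long to wide by pollutant: one row per station, date and hour.
by_type <- dcast(long, STN + Date + Time ~ Type, value.var = "value")
print(by_type)
```

After the second step each pollutant has its own column, which is the layout the plotting code on the later slides relies on.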
[Figure: density distributions of PM10, ozone, NO2 and SO2 at the four stations]
library(ggplot2)
library(reshape2)

# Set the color table (ten colors, one per concentration bin).
colortable = c("#99FFFF", "#00FFFF", "#00FF00", "#CCFF33", "#FFFF00",
               "#FFCC00", "#FF6600", "#FF3333", "#FF33CC", "#660033")

# Cut the NO2 data into ten bins with roughly equal numbers of observations.
drawSTN = "中山"
Time_of_Day = data_w$Time[data_w$STN == drawSTN]
mag = cut_number(data_w$NO2[data_w$STN == drawSTN], n = 10)
rosedata = data.frame(dir = Time_of_Day, mag = mag)

# Plot the rose chart: hour of day around the circle, bins stacked by color.
p <- ggplot(rosedata, aes(x = dir, fill = mag)) + geom_bar() + coord_polar() +
  ggtitle("Air Pollutant during a Day") + scale_fill_manual(values = colortable)
print(p)
There is a strong relationship between nitrogen dioxide (NO2)
and daily human activity: air quality is good
around midnight and poor during rush hours.
Zhongshan (中山) 
Xitun (西屯)
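One way to check the daily pattern numerically is to average the concentration by hour of day. The sketch below uses made-up readings; the column names Time and NO2 follow the data frame built earlier, but the values and the choice of base R's aggregate() are assumptions for illustration, not the author's method.

```r
# Hypothetical hourly NO2 readings for one station over two days.
toy <- data.frame(
  Time = rep(c(0, 8, 18), times = 2),
  NO2  = c(10, 30, 36, 14, 34, 30)
)

# Mean concentration for each hour of the day.
hourly_mean <- aggregate(NO2 ~ Time, data = toy, FUN = mean)
print(hourly_mean)

# Hour with the highest mean concentration.
peak_hour <- hourly_mean$Time[which.max(hourly_mean$NO2)]
print(peak_hour)
```

With the real data, the same call on the rows of one station would give 24 means that can be compared directly against the rose chart.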
# Flag weekends as holidays.
# weekdays() returns locale-dependent names; "星期六" (Saturday) and
# "星期日" (Sunday) assume a Chinese locale.
data_w$holiday = (data_w$weekdays == "星期六" | data_w$weekdays == "星期日")
# List of national holidays in 2013.
Holidaylist = c("2013-01-01","2013-02-11","2013-02-12","2013-02-13",
                "2013-02-14","2013-02-15","2013-02-28","2013-04-04","2013-04-05",
                "2013-06-12","2013-09-19","2013-09-20","2013-10-10")
Holidaylist = as.Date(Holidaylist, "%Y-%m-%d")
# Flag national holidays as well.
data_w$holiday[data_w$Date %in% Holidaylist] = TRUE
National holidays in 2013
01/01, 02/11~15, 02/28, 04/04~05, 06/12,
09/19~20, 10/10
The concentration of nitrogen
dioxide is much higher on weekdays
than on holidays, except around
midnight on holidays.
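The weekend test above depends on the system locale. A locale-independent variant can be sketched with the numeric weekday from as.POSIXlt(), where 0 is Sunday and 6 is Saturday. The dates and NO2 values below are hypothetical and only illustrate the comparison; 2013-02-11 is taken from the holiday list.

```r
# Hypothetical daily values around the February 2013 holiday.
dates <- as.Date(c("2013-02-08", "2013-02-09", "2013-02-10",
                   "2013-02-11", "2013-02-18"))
no2   <- c(40, 25, 20, 22, 42)

# Numeric weekday: 0 = Sunday, ..., 6 = Saturday, regardless of locale.
wday <- as.POSIXlt(dates)$wday
is_weekend <- wday %in% c(0, 6)

# A day counts as a holiday if it is a weekend or a listed national holiday.
national <- as.Date("2013-02-11")
is_holiday <- is_weekend | dates %in% national

# Mean concentration on working days (FALSE) versus holidays (TRUE).
means <- tapply(no2, is_holiday, mean)
print(means)
```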
Zhongshan (中山) Hualien (花蓮) 
Xitun (西屯) Xiaogang (小港) 
NO2 
Nitrogen dioxide (NO2) is worst at Zhongshan,
especially during weekday rush hours.
Avoid exercising outdoors between 7-9 a.m. and
5-10 p.m. in most cities.
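library(scales)  # provides rescale()
# HPtmp is a long-format data frame built in a step not shown on the slides.
# Rescale the NO2 values to the range [0, 1].
HPtmp_r = HPtmp[HPtmp$Type == "NO2", c(1, 3, 4)]
HPtmp_r$rescale = rescale(HPtmp$value[HPtmp$Type == "NO2"], to = c(0, 1))
# Heat map for one station: hour of day on the x axis, weekday on the y axis.
p = ggplot(HPtmp_r[HPtmp_r$STN == "中山", ], aes(variable, weekdays)) +
  geom_tile(aes(fill = rescale), colour = "white") +
  scale_fill_gradient(low = "cyan", high = "firebrick4", limits = c(0, 1)) +
  xlab("Time") + ylab("Weekday") +
  theme(axis.text = element_text(size = 16)) +
  theme(axis.title = element_text(size = 20))
print(p)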
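The rescaling step maps the smallest value to 0 and the largest to 1, so stations share one color scale. A minimal base R sketch of that min-max formula follows; rescale01 is a hypothetical helper written for illustration, and it is assumed (not verified here) to match what rescale() with to = c(0, 1) returns for finite input.

```r
# Min-max rescaling: (x - min) / (max - min), ignoring missing values
# when finding the range.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

x <- c(5, 10, 20, 25)
scaled <- rescale01(x)
print(scaled)
```

Because the range is taken over all stations together, a tile's color reflects the concentration relative to the whole data set rather than to its own station.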
• Use the other stations and data from previous years for a more
detailed analysis.
• Is there a relationship between air quality and health data, sales of
air purifiers, or barbecues during the Moon Festival?


Editor's notes

  1. Today I will talk about air quality. In recent years air pollution has become worse and worse, and more and more news reports cover the haze pollution in China, so I wanted to know what Taiwan's air quality is like. I use the historical air quality data from Taiwan's Environmental Protection Administration. It contains 11 types of pollutant plus weather data from 79 monitoring stations. I selected 4 stations for this project: 中山 (Zhongshan) is in downtown Taipei City, 西屯 (Xitun) is in Taichung City, 小港 (Xiaogang) is near the industrial zone of Kaohsiung City, and 花蓮 (Hualien) is a small city in eastern Taiwan.
  2. Here is the process for my project and the packages we used. We cleaned the data and built a data frame first. Then we used that data frame to find the characteristics of each air pollutant across different cities and time periods. After finding the pollutant patterns, we used ggplot2, lattice and shiny for data visualization.
  3. The dataset from the EPA website comes as xls files, so we used other software to convert the data to csv files before using R. To read the files automatically and build the data frame, we use “dir” to get the file names and a “for” loop to read all the csv files.
  4. These 4 figures show the density distributions of 4 major air pollutants at the different stations. We find that the concentrations of PM10, O3 and NO2 in 花蓮 are lower than at the other stations. 中山 has more NO2 than the others. Air quality in 小港 is the worst: SO2 is much higher than at the other stations.