My talk at Sarasehan Geologi Populer, 16 March 2015, at Badan Geologi. This talk covers various open-source tools for geological and hydrogeological analysis, with a focus on the Cikapundung River case. It includes some examples of R code to extract hidden patterns in the data set in order to explain natural phenomena.
[Hydro]geological analysis using open source app: case Cikapundung River
1. Geological analysis with open-source software:
case Cikapundung River
Event: Sarasehan Geologi Populer, Badan Geologi Indonesia
Dasapta Erwin Irawan
16th March 2015
5. A bit about me
Personal data
Name: Dasapta Erwin Irawan
Job: Lecturer/researcher at Groundwater Engineering Program,
ITB
Education (Geology, ITB): 1994-1998 (undergrad), 1999-2001
(Master), 2005-2009 (PhD)
6. A bit about me
Visiting program
Nov-Dec 2009
Center for Environmental Remote Sensing (CeRES), Chiba
University (2009)
Supervisor: Prof. Josaphat T.S. Sumantyo
Research: remote sensing for hydrological purposes
Feb 2014-Feb 2015
Faculty of Agriculture and Environment, University of Sydney
Supervisor: Dr. Willem Vervoort
Research: hydrological modeling in R
7. A bit about me
Research
hydrogeology
hydrochemistry
multivariate [statistical] analysis
8. A bit about me
Social media
Website: - R and Linux - Writing - SlideShare
Twitter: @dasaptaerwin
Email: d_erwin_irawan[at]yahoo[dot]com
11. Software skills (paid versions)
office: Microsoft Office (Word, Excel, PowerPoint) -> annual
subscription (from USD 10 per month)
citation and referencing: EndNote -> from USD 250
statistical: Minitab, SPSS, Statistica, Stata -> basic version
from USD 700 (2012)
spatial / GIS: ArcGIS, MapInfo, etc. -> annual subscription
(basic version from USD 100 per year)
Sources:
Openwetware
ESRI
12. Software skills (free equivalent)
office: OpenOffice or LibreOffice
citation and referencing: Zotero, Mendeley, etc.
statistical: R and RStudio, Orange Data Mining, PSPP, etc.
GIS: QGIS, GRASS GIS, R
13. Why open source?
free as breathing
mostly cross-platform (Linux, Mac, Win)
strong community, hence rapid development
supporting reproducibility
14. What is reproducibility in science?
Every step can be:
redone
re-analysed and re-evaluated
re-developed
15. What is reproducibility in science?
Those principles are applied to:
data (items and locations)
software used in the analyses:
each software package has distinct features and algorithms
what would happen if not everyone could purchase the software?
20. Background
Cikapundung has important roles:
one of the major water sources for the Bandung Basin: WTP
Dago Pakar = 40 L/sec
electricity generation (since 1923):
PLTA Bengkok = 3 MW
PLTA Dago = 0.7 MW
city drainage
24. What do we know so far?
Three types of groundwater and river water interaction (Lubis
and Puradimaja 2006)
isolated stream in the Maribaya area (upstream)
effluent stream (or gaining stream) in the Maribaya to Viaduct
segment (central Bandung)
influent stream (or losing stream) from Viaduct to Dayeuhkolot
some evidence of springs and seepages in the isolated segment
(Tanuwijaya 2014).
32. Data format
variables or measurements in columns
cases or samples in rows
no merged columns or rows
see also: "Data is the new soil"
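The layout rules above can be sketched in R as follows. This is a minimal illustration, not the real data set: the column names (type, elv, ec, ph) follow the slides, but the three rows are illustrative values only.

```r
# Hypothetical mini data set: one sample per row, one variable per column
samples <- data.frame(
  type = c("groundwater", "groundwater", "river water"),
  elv  = c(1338, 1336, 1336),   # elevation
  ec   = c(71.9, 71.9, 77.0),   # electrical conductivity
  ph   = c(6.89, 6.89, 6.39)
)

# Each column holds exactly one kind of measurement
str(samples)

# A rectangular table has one value per cell, so the number of cells
# equals rows times columns (no merged rows or columns)
n_cells <- length(unlist(samples))
n_cells == nrow(samples) * ncol(samples)
```

A table in this shape can be passed directly to functions such as pairs() or cor() without any manual clean-up.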
33. Why pairs analysis
equivalent to correlation matrix
the fastest way to see correlations between variables
please bear in mind that correlation does not always imply
causation
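The link between a pairs plot and a correlation matrix can be sketched with a small synthetic example. The variable names (ec, tds, ph) follow the slides, but the values are randomly generated and tds is deliberately constructed to track ec, so this is an assumption-laden illustration rather than a result from the Cikapundung data.

```r
set.seed(1)
# Synthetic stand-in for three water-quality variables (not real data)
n   <- 50
ec  <- rnorm(n, mean = 100, sd = 20)
tds <- 0.65 * ec + rnorm(n, sd = 3)   # constructed to track ec closely
ph  <- rnorm(n, mean = 7, sd = 0.3)   # generated independently of the others
wq  <- data.frame(ec, tds, ph)

# Numeric view: correlation matrix
cor_mat <- cor(wq)
round(cor_mat, 2)

# Graphical view of the same relationships: scatterplot matrix
pairs(wq)
```

The strong ec-tds correlation here exists only because of how the data were generated, which is a reminder that a high coefficient says nothing by itself about the underlying process.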
36. Load
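# load data (read.csv already returns a data frame)
# stringsAsFactors = TRUE keeps text columns as factors in R >= 4.0
data <- read.csv("BandungData.csv", header = TRUE,
                 stringsAsFactors = TRUE)
# attach() lets columns be used by name, but masks existing objects
attach(data)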
## The following object is masked from package:datasets:
##
## CO2
37. Data structure
# data structure
str(data)
## 'data.frame': 295 obs. of 33 variables:
## $ no : int 16 22 263 17 12 18 13 19 14 20 ...
## $ code : int 116 122 8 117 112 118 113 119 114 120 .
## $ year : int 1997 1997 1997 1997 1997 1997 1997 1997
## $ type : Factor w/ 2 levels "groundwater",..: 1 1 2 1
## $ x : num 785175 785168 799275 785175 785181 ...
## $ y : num 10752836 10752843 10753680 10752840 107
## $ distx : num 6897 6904 0 6897 6891 ...
## $ elv : int 1338 1336 1336 1320 1300 1247 1240 1230
## $ aq : Factor w/ 3 levels "breccias","clay",..: 3 3
## $ zone : Factor w/ 2 levels "eff","inf": 1 1 1 1 1 1
## $ ec : num 71.9 71.9 77 71.9 71.9 71.9 71.9 71.9 7
## $ ph : num 6.89 6.89 6.39 6.89 6.89 ...
## $ hard : num 11 11 26.4 11 11 11 11 11 11 11 ...
## $ tds : num 58.7 58.7 50 58.7 58.7 ...
38. pairs plot 1
pairs(data)
[Figure: scatterplot matrix of all 33 variables: no, code, year, type, x, y, distx, elv, aq, zone, ec, ph, hard, tds, temp, eh, Q, Ca, Mg, Fe, Mn, K, Na, CO3, HCO3, CO2, Cl, SO4, NO2, NO3, SiO2, cumrain, lag1]
39. pairs plot 1
ugly, too small
no legend or axis labels
we need to tweak it: group the variables and change the plot code
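One way to carry out the tweak described above is sketched below. The data are synthetic, and the two groups (physical, cations) are a hypothetical split chosen for illustration; the grouping actually used in the talk may differ.

```r
set.seed(2)
# Synthetic stand-in for the data set (column names follow the slides,
# values are random)
n <- 40
df <- data.frame(
  type = factor(rep(c("groundwater", "river water"), each = n / 2)),
  ec   = rnorm(n, 100, 20),
  ph   = rnorm(n, 7, 0.3),
  tds  = rnorm(n, 60, 10),
  Ca   = rnorm(n, 10, 2),
  Mg   = rnorm(n, 5, 1)
)

# Hypothetical grouping: a few variables per plot instead of all 33
groups <- list(
  physical = c("ec", "ph", "tds"),
  cations  = c("Ca", "Mg")
)

# One colour per water type (factor levels: groundwater, river water)
cols <- c("blue", "red")[as.integer(df$type)]

# One readable scatterplot matrix per group, with the colour key in the title
for (g in names(groups)) {
  pairs(df[, groups[[g]]], col = cols, pch = 19,
        main = paste0(g, " (blue = groundwater, red = river water)"))
}
```

Splitting the variables keeps each panel large enough to read, and colouring by water type adds the information a legend would otherwise carry.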
46. Why PCA (Principal Component Analysis)?
natural processes are multivariate
has been widely used and developed since the 1960s
simple, straightforward, based on nearest-neighbour (cluster) principles
offers nice visualisation
47. [Simple] codes
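# install libraries
# pcaMethods is distributed through Bioconductor, not CRAN
install.packages("BiocManager")
BiocManager::install("pcaMethods") # for PCA
install.packages("gridExtra") # for plot layout
# load libraries
library(pcaMethods) # for PCA
library(gridExtra) # for plot layout
# run PCA (group1 is assumed to be the data frame of Group 1 variables)
pca1 <- pca(group1,
            method = "svdImpute", # SVD-based PCA that imputes missing values
            scale = "uv",         # scale each variable to unit variance
            center = TRUE,
            nPcs = 3,             # keep three components
            evalPcs = 1:3)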
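For readers without pcaMethods installed, the same kind of analysis can be sketched with prcomp() from base R. This is a swapped-in technique, not the code used in the talk: prcomp() does not impute missing values the way the svdImpute method does, and the data below are synthetic with column names borrowed from the slides.

```r
set.seed(3)
# Synthetic stand-in for a group of variables (not the real group1)
n <- 60
base <- rnorm(n)
grp <- data.frame(
  ec   = 100 + 20 * base + rnorm(n, sd = 5),
  tds  = 60 + 12 * base + rnorm(n, sd = 3),
  hard = 15 + 4 * base + rnorm(n, sd = 2),
  ph   = rnorm(n, 7, 0.3)
)

# Centre and scale to unit variance, comparable to center = TRUE, scale = "uv"
pca_base <- prcomp(grp, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca_base$sdev^2 / sum(pca_base$sdev^2)
round(var_explained, 2)

head(pca_base$x[, 1:2])     # case scores on PC1 and PC2
pca_base$rotation[, 1:2]    # variable loadings on PC1 and PC2
```

The scores and loadings extracted at the end are the two ingredients of the loadings and scores plots shown in the results slides.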
52. Results: Extract Eigenvalues PCA1
[Figure: three scree plots, each showing variance (y-axis) against principal component 1 to 3 (x-axis)]
53. Results: plot PCA Group 1
[Figure: pairwise score plots of PC 1 (R^2 = 0.26), PC 2 (R^2 = 0.17) and PC 3 (R^2 = 0.13); points marked by water type: groundwater, river water]
54. Results: plot PCA Group 2
[Figure: pairwise score plots of PC 1 (R^2 = 0.29), PC 2 (R^2 = 0.14) and PC 3 (R^2 = 0.11); points marked by water type: groundwater, river water]
55. Results: plot PCA Group 3
[Figure: pairwise score plots of PC 1 (R^2 = 0.2), PC 2 (R^2 = 0.13) and PC 3 (R^2 = 0.1); points marked by water type: groundwater, river water]
56. Results: loadings and scores Group1
[Figure, left panel: variable loadings for Group1 on PC1 and PC2; variables: distx, ec, elv, ph, hard, tds, temp, eh, cumrain, lag1]
[Figure, right panel: case scores for Group1 on PC1 and PC2; water type: groundwater, river water]
57. Results: loadings and scores Group2
[Figure, left panel: variable loadings for Group2 on PC1 and PC2; variables: distx, ec, elv, Ca, Mg, Fe, Mn, K, Na, cumrain, lag1]
[Figure, right panel: case scores for Group2 on PC1 and PC2; water type: groundwater, river water]
58. Results: loadings and scores Group3
[Figure, left panel: variable loadings for Group3 on PC1 and PC2; variables: distx, ec, elv, CO3, HCO3, CO2, Cl, SO4, NO2, NO3, SiO2, cumrain, lag1]
[Figure, right panel: case scores for Group3 on PC1 and PC2; water type: groundwater, river water]
60. why bubble plot?
shows spatial variation as well as the distribution of values
simple and straightforward visualisation
61. [simple] codes
# load libraries (assuming all libraries are installed)
library(gstat)
library(sp)
library(rgdal) # retired from CRAN in 2023; not needed for the lines below
library(latticeExtra)
# open and load data
df <- read.csv("BandungData.csv", header = TRUE)
# convert the x and y columns to coordinates
# (df becomes a SpatialPointsDataFrame)
coordinates(df) <- ~ x + y
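The idea of a bubble plot can be sketched without any extra packages using symbols() from base graphics. This is a stand-in for the sp-based workflow above (sp provides its own bubble() function for objects with coordinates), and the coordinates and ec values below are randomly generated, not taken from BandungData.csv.

```r
set.seed(4)
# Synthetic stand-in: coordinates and one measured value per sample
n <- 30
pts <- data.frame(
  x  = runif(n, 785000, 800000),
  y  = runif(n, 10740000, 10755000),
  ec = runif(n, 50, 300)
)

# Scale so that circle area, not radius, is proportional to the value
radius <- sqrt(pts$ec / max(pts$ec))

# inches = 0.2 sets the size of the largest circle on the page
symbols(pts$x, pts$y, circles = radius, inches = 0.2,
        bg = "lightblue", fg = "grey30",
        xlab = "x", ylab = "y", main = "ec (synthetic values)")
```

Taking the square root before plotting avoids exaggerating large values, since the eye tends to compare circle areas rather than radii.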
72. Remarks
higher mineral concentration in river water than in groundwater
should have occurred in effluent flow.
higher mineral concentration in groundwater than in river water
should have occurred in influent flow.
neither natural indication is detected, except for NO2.
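A comparison of this kind can be sketched as a table of mean concentrations by zone and water type. The labels (eff, inf, groundwater, river water) follow the data structure shown earlier, but the tds values are random, so the output says nothing about the real river; it only shows the shape of the check.

```r
set.seed(5)
# Synthetic stand-in: zone and type labels follow the slides, values are random
wq <- expand.grid(
  rep  = 1:10,
  type = c("groundwater", "river water"),
  zone = c("eff", "inf")
)
wq$tds <- rnorm(nrow(wq), mean = 60, sd = 10)

# Mean concentration for each water type (columns) within each zone (rows)
means <- tapply(wq$tds, list(wq$zone, wq$type), mean)
round(means, 1)

# River water minus groundwater, per zone:
# a positive value matches the expected effluent signature,
# a negative value matches the expected influent signature
diff_by_zone <- means[, "river water"] - means[, "groundwater"]
round(diff_by_zone, 1)
```

With real data, the sign of the difference in each zone could be compared against the expectations listed above, one variable at a time.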
73. Remarks
the anomaly is due to the dilution effect.
dilution overrides the enrichment effect.
the opposite would happen if sampling were conducted in the dry
season.
groundwater and river water possibly have different
catchments.
74. Closing
Future research opportunities:
adding more data at different locations along the river bank, taken
in both rainy and dry seasons.
more exploratory statistical analysis, e.g. multiple regression
trees to extract patterns in the data.
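As a rough idea of what a tree-based exploration might look like, the sketch below fits a single-response regression tree with rpart. The slides do not name a package or a tree variant, so rpart is an assumption here, and the data are synthetic: tds is built to jump at an elevation threshold so that the tree has a pattern to find.

```r
library(rpart)  # ships with standard R installations as a recommended package

set.seed(6)
# Synthetic stand-in: tds is constructed to change at elv = 1000
n <- 200
d <- data.frame(
  elv   = runif(n, 600, 1400),
  distx = runif(n, 0, 7000)
)
d$tds <- ifelse(d$elv > 1000, 50, 120) + rnorm(n, sd = 5)

# Fit a regression tree (method = "anova" for a numeric response)
tree <- rpart(tds ~ elv + distx, data = d, method = "anova")
print(tree)

# Variable used at the root split
root_var <- as.character(tree$frame$var[1])
root_var
```

The printed tree lists each split as a threshold on one variable, which makes the extracted pattern easy to read back in physical terms.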
75. Main references
Lubis, RF and Puradimaja, DJ, 2006, Hydrodynamic
relationships between groundwater and river water:
Cikapundung river stream, West Java, Indonesia.
Darul, A, Irawan, DE, and Trilaksono, NJ, 2014a, Groundwater
and river water interaction on Cikapundung River: Revisited,
International Conference on Math and Natural Sciences, ITB.
Darul, A, 2014b, Model konseptual interaksi air tanah dan air
sungai di bantaran S. Cikapundung, Bandung, Jawa Barat,
Master's thesis, supervisors: Dr. Dasapta Erwin Irawan and
Dr. Nurjanna Joko Trilaksono.
Tanuwijaya, ZAJ, 2014, Identifikasi interaksi air sungai dan air
tanah di DAS Cikapundung, Dissertation, Geology, Universitas
Padjadjaran.
76. These slides were made using open-source tools
Ubuntu Linux (14.04)
R
Dia flowcharter
Gimp image editor