Data Science Environment with R on openSUSE Leap 15.1
1. Data Science Environment with R on openSUSE Leap 15.1
Sabar Suwarsono
2019-10-05
CC BY-SA 4.0
soewarsono@klim.or.id
@soewarsono
2. Who am I?
● Data scientist and R enthusiast
● Member of Komunitas GNU/Linux Malang (KLiM)
● Member of openSUSE Indonesia
● Member of Komunitas R Indonesia
● Sysadmin at the University of Islam Malang
4. Data Science is the intersection of Hacking Skills, Math & Statistics Knowledge and Substantive Expertise
Those are the pillars of data science: computing, statistics, mathematics and quantitative disciplines combined to analyze data for better decision making.
5. Hacking Skills
Ability to build things and find clever solutions to problems.
● Programming/Coding: Python and R (and others)
● Databases: MySQL, PostgreSQL, Cassandra, MongoDB and CouchDB.
● Visualization: D3, Tableau, QlikView and Markdown.
● Big Data: Hadoop, MapReduce and Spark.
8. Math & Statistics
Being able to understand the right solution to each problem.
● Linear algebra: Matrix manipulation
● Machine Learning: Random Forests, SVM, Boosting
● Descriptive statistics: Describe, Cluster
● Statistical inference: Generate new knowledge.
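Two of the items above, matrix manipulation and descriptive statistics, can be illustrated with a minimal sketch. It uses only the Python standard library; the helper names `describe` and `matmul` and the sample numbers are hypothetical and chosen purely for illustration.

```python
import statistics


def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


if __name__ == "__main__":
    sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data only
    print(describe(sample))
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

In practice a data scientist would more likely reach for a numerical library; the point here is only to show what "describe" and "matrix manipulation" mean in code.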
10. Substantive Expertise
The ability to ask good questions requires domain understanding, which is why a data scientist can’t create data-based solutions without good industry knowledge.
● Is this A or B or C? (classification)
● Is this weird? (anomaly detection)
● How much/how many? (regression)
● How is it organized? (clustering)
● What should I do next? (reinforcement learning)
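Two of the questions above can be made concrete with a small sketch: "Is this weird?" as a z-score check and "How much?" as a least-squares line fit. This is one simple way to approach each question, not the only one; the function names, the threshold of 2.0 and the sample data are assumptions made for illustration.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Anomaly detection: return values far from the mean in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / sd > threshold]


def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


if __name__ == "__main__":
    readings = [10, 11, 9, 10, 12, 10, 50]  # illustrative data only
    print(zscore_outliers(readings))
    print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
```

Deciding what counts as "weird" or which variable to predict is where the substantive expertise comes in; the code only answers the question once it has been asked well.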
11. Why Data Science?
● Data is the oil of today's world. With the right tools, technologies and algorithms, we can use data and convert it into a distinctive business advantage.
● Data Science can help you detect fraud using advanced machine learning algorithms.
● It helps you prevent significant monetary losses.
● It allows you to build intelligence into machines.
● You can perform sentiment analysis to gauge customer brand loyalty.
● It enables you to make better and faster decisions.
● It helps you recommend the right product to the right customer to enhance your business.
16. Know your neighbour!
● Who are you?
● What do you do with data?
● How would you describe your experience with R?
17. Compiling R from source and why you shouldn't do it
● I want a faster R.
● I can compile against other linear algebra libraries and get a speedup that way.
● I need special customization.
● I just want to.
18. Dependencies for installing R from source on openSUSE
● pattern devel_basis
● gcc-fortran
● readline-devel
● xorg-x11-devel
● xz-devel
● libcurl-devel
● texlive
● lapack-devel
● java-11-openjdk
● java-11-openjdk-accessibility
● java-11-openjdk-devel
● java-11-openjdk-headless
● java-11-openjdk-jmods
● libpng-devel
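The dependency list above could be turned into install commands along these lines. The package and pattern names come from the slide; the exact `zypper` invocations (`install -t pattern` for the pattern, plain `install` for the packages) and the use of `sudo` are assumptions about how one would install them, and the helper `build_commands` is hypothetical. The sketch only prints the commands and does not run them.

```python
import shlex

# Names taken from the slide.
PATTERNS = ["devel_basis"]
PACKAGES = [
    "gcc-fortran", "readline-devel", "xorg-x11-devel", "xz-devel",
    "libcurl-devel", "texlive", "lapack-devel", "libpng-devel",
    "java-11-openjdk", "java-11-openjdk-accessibility",
    "java-11-openjdk-devel", "java-11-openjdk-headless",
    "java-11-openjdk-jmods",
]


def build_commands(patterns, packages):
    """Return zypper command lines (as argument lists) for the given names."""
    commands = []
    if patterns:
        # Assumption: patterns are installed with the "-t pattern" type flag.
        commands.append(["sudo", "zypper", "install", "-t", "pattern", *patterns])
    if packages:
        commands.append(["sudo", "zypper", "install", *packages])
    return commands


if __name__ == "__main__":
    for command in build_commands(PATTERNS, PACKAGES):
        print(shlex.join(command))  # print only; review before running
```

Printing rather than executing keeps the sketch safe to run on any machine; the output can be pasted into a shell on openSUSE after review.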