2. Introduction to ML
Different subtopics of AI
Types of Machine Learning
Different Languages Used in ML
Advantages of R over other Languages
R Programming
Project on HEART DISEASE ANALYSIS WITH R
Content:
3. In computer science, machine learning refers to a type of data analysis
that uses algorithms that learn from data. It is a type of artificial
intelligence (AI) that gives systems the ability to learn without
being explicitly programmed.
What is ML?
4. Different subtopics of AI
• Neural Network
• Planning
• Robotics
• ML
• Natural Language processing
• Perception
• Knowledge
• Cognitive system
7. R programming
Python
Java-Family / C-Family
Matlab
Mathematica
Stata, etc.
Different Languages Used in ML
8. Why Do We Use R in ML?
9. • Reduced code length
• Availability (4000+ functions are available)
• Simple and easy to learn
• Free and open source
ADVANTAGES OF R OVER OTHER LANGUAGES
10. R Programming
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
Basic Syntax
Data Types
• Logical: v <- TRUE; print(class(v))
• Numeric: v <- 23.5; print(class(v))
• Integer: v <- 2L; print(class(v))
• Complex: v <- 2+5i; print(class(v))
• Character: v <- "TRUE"; print(class(v))
• Raw: v <- charToRaw("Hello"); print(class(v))
OPERATORS
• Add, Sub, Mul, Div etc.
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v + t)
print(v - t)
print(v * t)
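The slide lists division among the operators but only shows addition, subtraction and multiplication. A minimal sketch of the remaining element-wise arithmetic operators, reusing the same two vectors (the result variable names are only illustrative):

```r
# Same vectors as on the slide.
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)

# Arithmetic operators work element by element on vectors of equal length.
div_result <- v / t    # division
mod_result <- v %% t   # remainder after division
int_result <- v %/% t  # integer (floor) division
pow_result <- v ^ t    # exponentiation

print(div_result)
print(mod_result)
print(int_result)
print(pow_result)
```

Naming a vector t hides nothing permanently: R still finds the built-in t() function when it is called as a function.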
11. Accessing Vector Elements
R Programming
# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[c(2,3,6)];print(u)
OUTPUT: [1] "Mon" "Tue" "Fri"
# Accessing vector elements using logical indexing.
v <- t[c(TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,FALSE)]
print(v)
[1] "Sun" "Fri"
# Accessing vector elements using negative indexing.
x <- t[c(-2,-5)]
print(x)
[1] "Sun" "Tue" "Wed" "Fri" "Sat"
12. R Programming
# Create the predictor and response variable.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)
# Give the chart file a name.
png(file = "linearregression.png")
# Plot the chart.
plot(y,x,col = "blue",main = "Height & Weight Regression",
cex = 1.3,pch = 16,xlab = "Weight in Kg",ylab = "Height in cm")
# Add the fitted regression line to the chart.
abline(lm(x~y))
# Save the file.
dev.off()
Regression
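Once a model is fitted, it can be used to predict the response for new inputs. A minimal sketch, assuming (as the axis labels on the slide suggest) that x holds heights in cm and y holds weights in kg; the 170 cm input and the names new_person and predicted_weight are hypothetical:

```r
# Heights in cm (predictor) and weights in kg (response), as on the slide.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit weight as a linear function of height.
relation <- lm(y ~ x)

# Inspect the fitted intercept and slope.
print(coef(relation))

# Predict the weight of a person who is 170 cm tall (hypothetical input).
# The column name must match the predictor name used in the formula.
new_person <- data.frame(x = 170)
predicted_weight <- predict(relation, new_person)
print(predicted_weight)
```

summary(relation) gives more detail on the fit, such as residuals and R-squared.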
13. R Programming
# Load the party package. It will automatically load other
# dependent packages.
library(party)
# Create the input data frame.
input.dat <- readingSkills[c(1:105),]
# Give the chart file a name.
png(file = "decision_tree.png")
# Create the tree.
output.tree <- ctree( nativeSpeaker ~ age + shoeSize + score, data = input.dat)
# Plot the tree.
plot(output.tree)
# Save the file.
dev.off()
DECISION TREE
14. R Programming
OUTPUT
null device 1
Loading required package: methods
Loading required package: grid
Loading required package: mvtnorm
Loading required package: modeltools
Loading required package: stats4
Loading required package: strucchange
Loading required package: zoo
Attaching package: ‘zoo’
The following objects are masked from ‘package:base’:
as.Date, as.Date.numeric
Loading required package: sandwich
15. R Programming
Graphs and Plots
# Create data for the graph.
x <- c(21, 62, 10, 53)
labels <- c("London", "New York", "Singapore", "Mumbai")
# Give the chart file a name.
png(file = "city.png")
# Plot the chart.
pie(x,labels)
# Save the file.
dev.off()
PIE CHART
16. R Programming
Graphs and Plots
# Give the chart file a name.
png(file = "boxplot.png")
# Plot the chart.
boxplot(mpg ~ cyl, data = mtcars, xlab = "Number of Cylinders",
ylab = "Miles Per Gallon", main = "Mileage Data")
# Save the file.
dev.off()
BOX PLOT
17. R Programming
Graphs and Plots
# Create the data for the chart
H <- c(7,12,28,3,41)
# Give the chart file a name
png(file = "barchart.png")
# Plot the bar chart
barplot(H)
# Save the file
dev.off()
BAR PLOTS
18. R Programming
Graphs and Plots
# Create the data for the chart.
v <- c(7,12,28,3,41)
# Give the chart file a name.
png(file = "line_chart.png")
# Plot the line chart.
plot(v,type = "o")
# Save the file.
dev.off()
LINE GRAPHS
19. Image Recognition
Speech Recognition
Medical Diagnosis
Statistical Arbitrage
Prediction
APPLICATIONS OF ML
20. FEW STEPS TO HANDLE THE DATA
• Data Selection
Consider what data is available, what data is missing and what data can be
removed.
• Data Preprocessing
Organize your selected data by formatting, cleaning and sampling from it.
• Data Transformation
Make the preprocessed data ready for machine learning by engineering
features using scaling, attribute decomposition and attribute aggregation.
21. Identify and handle missing values
DATA FORMATTING
DATA NORMALIZATION (CENTERING/SCALING)
DATA BINNING
PRE-PROCESSING DATA IN R
22. A missing value occurs when no data value is stored for a variable (feature) in an
observation.
It may be represented as “?”, “N/A”, 0 or just a blank cell.
Drop the missing value:
• Drop the variable
• Drop the data entry
Replace the missing value:
• Replace it with the average of similar data points
• Replace it with the most frequent value
• Replace it based on another function
Identify and handle missing values
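The options above can be sketched in a few lines of base R. The data frame below is hypothetical, and the choice of "?" and empty strings as missing-value markers is an assumption made for the example:

```r
# Hypothetical data: "?" and empty strings stand for missing values.
df <- data.frame(
  age        = c("43", "39", "?", "42", "49"),
  chest_pain = c("asympt", "", "asympt", "non_anginal", "asympt"),
  stringsAsFactors = FALSE
)

# Identify: convert the placeholder markers to NA.
df[df == "?" | df == ""] <- NA
df$age <- as.numeric(df$age)
print(colSums(is.na(df)))  # number of missing values per column

# Drop: remove every observation (row) that has a missing value.
df_dropped <- na.omit(df)

# Replace with the average: fill numeric gaps with the column mean.
df_filled <- df
df_filled$age[is.na(df_filled$age)] <- mean(df$age, na.rm = TRUE)

# Replace by frequency: fill categorical gaps with the most common value.
most_common <- names(which.max(table(df$chest_pain)))
df_filled$chest_pain[is.na(df_filled$chest_pain)] <- most_common

print(df_filled)
```

When reading a file, the same conversion can be done up front with the na.strings argument of read.csv().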
23. Applying a calculation to an entire column,
for example converting values from meters to kilometers
DATA FORMATTING
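Because R arithmetic is vectorised, a whole column can be converted in one statement. A minimal sketch with a hypothetical data frame (trips, distance_m and distance_km are made-up names):

```r
# Hypothetical data: distances recorded in meters.
trips <- data.frame(distance_m = c(1500, 320, 12000))

# Apply the calculation to the whole column at once.
trips$distance_km <- trips$distance_m / 1000

print(trips)
```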
24. By making the range consistent between variables, normalization
enables a fairer comparison between features.
A few methods:
1. Simple feature scaling: x(new) = x(old) / x(max)
2. Min-max: x(new) = (x(old) - x(min)) / (x(max) - x(min))
DATA NORMALIZATION (CENTERING/SCALING)
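Both formulas translate directly into vectorised R. The sketch below uses made-up blood pressure readings. It also adds a z-score, which is not listed among the methods above but is one common reading of the "centering/scaling" in the slide title:

```r
# Hypothetical resting blood pressure readings.
x <- c(140, 120, 160, 160, 140, 140)

# 1. Simple feature scaling: divide by the maximum.
simple_scaled <- x / max(x)

# 2. Min-max: shift to start at 0, then divide by the range.
min_max <- (x - min(x)) / (max(x) - min(x))

# Centering and scaling (z-score): subtract the mean, divide by the
# standard deviation. Added here as an assumption, see the note above.
z_score <- (x - mean(x)) / sd(x)

print(round(simple_scaled, 3))
print(round(min_max, 3))
print(round(z_score, 3))
```

The built-in scale(x) performs the same centering and scaling and returns the result as a one-column matrix.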
25. Binning: grouping a set of numerical values into subsets called “bins”
Converts numeric variables into categorical variables
DATA BINNING
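In base R, cut() turns a numeric vector into a factor of bins. A minimal sketch; the ages, the cut points and the bin labels are all hypothetical:

```r
# Hypothetical ages to be grouped.
age <- c(39, 43, 49, 50, 58, 64, 71)

# Three bins with hypothetical cut points: [0,45), [45,60), [60,Inf).
age_group <- cut(
  age,
  breaks = c(0, 45, 60, Inf),
  labels = c("young", "middle", "senior"),
  right  = FALSE  # intervals are closed on the left, open on the right
)

# Count how many values fall into each bin.
print(table(age_group))
```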
26. Data preparation / transformation is a large subject that can involve a lot of
iterations, exploration and analysis. Getting good at data preparation will
make you noticeably more effective at machine learning.
DATA TRANSFORMATION
27. PROJECT ON HEART DISEASE
• First we need a .csv file; CSV stands for comma-separated values
> getwd()
> setwd("C:/Users/Vinit/Desktop/MachineLearning")
> table1 <- read.table("heart_disease_male.csv", header=TRUE, sep=",")
• We can also use the read.csv() function in place of read.table(), as it is a CSV file
28. Output:
> head(table1)
  age  chest_pain rest_bpress blood_sugar rest_electro max_heart_rate exercice_angina disease
1  43      asympt         140           f       normal            135             yes       1
2  39 atyp_angina         120           f       normal            160             yes       0
3  39 non_anginal         160           t       normal            160              no       0
4  42 non_anginal         160           f       normal            146              no       0
5  49      asympt         140           f       normal            130              no       0
6  50      asympt         140           f       normal            135              no       0
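A first look at a table like this usually starts with its structure and a few simple summaries. The sketch below retypes only the six rows shown above so that it runs without the CSV file. The variable names heart, pain_by_disease and hr_by_disease are hypothetical, and six rows are far too few to draw any conclusion:

```r
# The six rows shown above, typed in by hand so the sketch is self-contained.
heart <- data.frame(
  age             = c(43, 39, 39, 42, 49, 50),
  chest_pain      = c("asympt", "atyp_angina", "non_anginal",
                      "non_anginal", "asympt", "asympt"),
  rest_bpress     = c(140, 120, 160, 160, 140, 140),
  blood_sugar     = c("f", "f", "t", "f", "f", "f"),
  rest_electro    = rep("normal", 6),
  max_heart_rate  = c(135, 160, 160, 146, 130, 135),
  exercice_angina = c("yes", "yes", "no", "no", "no", "no"),
  disease         = c(1, 0, 0, 0, 0, 0)
)

# Column types and a preview of the values.
str(heart)

# Cross-tabulate chest pain type against the disease flag.
pain_by_disease <- table(heart$chest_pain, heart$disease)
print(pain_by_disease)

# Mean maximum heart rate within each disease group.
hr_by_disease <- tapply(heart$max_heart_rate, heart$disease, mean)
print(hr_by_disease)
```

With the full file loaded, the same calls can be applied to table1 directly.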
37. In conclusion, machine learning is an incredible breakthrough in the field
of artificial intelligence. While it does have some frightening implications,
the applications shown here are just a few of the many ways this
technology can improve our lives.
Conclusion