2. Table of contents
Introduction
About the Training
Certificate
What I have learned
Conclusion
3. Introduction
Training is the process of teaching, informing, or educating people so that
they become as well qualified as possible to do their job and are prepared
to perform in positions of greater difficulty and responsibility.
Training is an organized and planned effort by a company to facilitate
employees' learning of job-related competencies.
4. About the training
Industrial training at Internshala from 24th November 2020 to 5th
January 2021.
I completed my online industrial training with “Internshala”, located in
Gurgaon, over a period of 42 days.
I completed the training under the guidance of Mr. Kunal Jain
and Mr. Sunil Roy.
6. What I have learned
Introduction to machine learning.
Classification of machine learning.
Data preprocessing.
Regression / Classification
Linear regression.
Logistic regression.
Decision tree.
K-Means clustering.
7. Introduction to machine learning
Machine learning enables a machine to learn automatically from data, improve
its performance with experience, and make predictions without being explicitly
programmed.
In the real world, we are surrounded by humans, who can learn from their
experiences, and by computers or machines, which work on our instructions.
But can a machine also learn from experience or past data as a human does?
This is where machine learning comes in.
8. Classification of machine learning
Broadly, machine learning can be divided into two categories:
Supervised learning: Supervised learning is a type of learning in which we are
given a data set and already know what the correct output should look like,
on the assumption that there is a relationship between the input and the output.
In other words, it is the task of learning a function that maps an input to an
output based on example input-output pairs.
Unsupervised learning: Unsupervised learning is a type of learning that allows
us to approach problems with little or no idea of what our results should look
like. We can derive structure by clustering the data based on relationships
among the variables in the data. With unsupervised learning there is no
feedback based on the prediction results.
9. Data preprocessing
Data preprocessing is the process of preparing raw data and making it suitable
for a machine learning model. It is the first step, and a crucial one, in creating a
machine learning model.
Steps involved in data preprocessing:
Data cleaning
Data integration
Data reduction
Data transformation
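As a minimal sketch of two of the steps above (data cleaning and data transformation), the following uses only the Python standard library. The records, the column names `age` and `salary`, and the helper functions are hypothetical and chosen purely for illustration; they are not taken from the training material.

```python
# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 25, "salary": 30000.0},
    {"age": None, "salary": 45000.0},
    {"age": 35, "salary": 60000.0},
]

def fill_missing_with_mean(rows, key):
    """Data cleaning: replace missing values of `key` with the column mean."""
    present = [r[key] for r in rows if r[key] is not None]
    mean = sum(present) / len(present)
    return [{**r, key: mean if r[key] is None else r[key]} for r in rows]

def min_max_scale(rows, key):
    """Data transformation: rescale `key` linearly to the range [0, 1]."""
    values = [r[key] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # If every value is identical, map the whole column to 0.0.
    return [{**r, key: (r[key] - lo) / span if span else 0.0} for r in rows]

cleaned = fill_missing_with_mean(rows, "age")
scaled = min_max_scale(cleaned, "salary")
print(scaled)
```

Filling with the mean is only one possible cleaning strategy; dropping incomplete rows or using the median are common alternatives, depending on the data.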
10. Regression / Classification
Regression and classification algorithms are both supervised learning algorithms. Both
are used for prediction in machine learning and work with labeled datasets. They
differ in the kind of machine learning problem they are used for.
The main difference is that regression algorithms are used to predict continuous
values such as price, salary, or age, while classification algorithms are used to
predict (classify) discrete values such as Male or Female, True or False, or
Spam or Not Spam.
11. Linear regression
Linear regression may be defined as a statistical model that analyzes the linear
relationship between a dependent variable and a given set of independent
variables. A linear relationship between variables means that when the value of
one or more independent variables changes (increases or decreases), the value
of the dependent variable changes proportionally (increases or decreases).
For a single independent variable, the relationship can be represented
mathematically by the following equation:
Y = mX + c
where Y is the dependent variable, X is the independent variable, m is the slope
of the line, and c is the intercept.
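To make the equation concrete, here is a minimal sketch that estimates m and c from data using the ordinary least-squares formulas for a single independent variable. The function name `fit_line` and the sample points are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept c of Y = mX + c by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    c = mean_y - m * mean_x
    return m, c

# Hypothetical points that lie exactly on the line Y = 2X + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
print(m, c)
print(m * 5.0 + c)  # prediction for a new input X = 5
```

With noisy real-world data the points will not lie exactly on a line, and the same formulas return the line that minimizes the sum of squared errors.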
12. Logistic regression
Logistic regression is a classification algorithm. It is used to predict a binary
outcome based on a set of independent variables.
Logistic regression is an appropriate type of analysis when you’re working with
binary data. You know you’re dealing with binary data when the output or
dependent variable is categorical and fits into one of two
categories (such as “yes” or “no”, “pass” or “fail”, and so on).
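The idea can be sketched as follows, assuming a single independent variable and plain gradient descent on the log-loss. The data (hours studied versus pass/fail), the learning rate, and the number of epochs are hypothetical values picked for illustration, and library implementations typically use more refined optimizers.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and bias b by gradient descent on the average log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error term (predicted probability minus true label) per sample.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
passed = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(hours, passed)

def predict(x):
    """Return 1 ("pass") if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

print(predict(1.0), predict(6.0))
```

The model outputs a probability, and the 0.5 threshold turns that probability into one of the two categories.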
13. Decision tree
A decision tree is a supervised learning technique that can be used
for both classification and regression problems, but it is mostly preferred
for solving classification problems. It is a tree-structured classifier in which
internal nodes represent the features of a dataset, branches represent
the decision rules, and each leaf node represents an outcome.
The decisions or tests are performed on the basis of the features of
the given dataset.
“It is a graphical representation for getting all the possible solutions
to a problem/decision based on given conditions.”
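The structure described above (internal nodes as features, branches as decision rules, leaves as outcomes) can be sketched with a small hand-built tree. The features `outlook` and `windy` and the outcomes are hypothetical, and this sketch only shows how a finished tree makes a prediction, not how a tree is learned from data.

```python
# Internal nodes are dicts naming the feature to test; leaves are plain strings.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {
            "feature": "windy",
            "branches": {True: "stay in", False: "play"},
        },
        "rainy": "stay in",
        "overcast": "play",
    },
}

def predict(node, sample):
    """Follow the branch matching each tested feature until a leaf is reached."""
    while isinstance(node, dict):
        value = sample[node["feature"]]
        node = node["branches"][value]
    return node

print(predict(tree, {"outlook": "sunny", "windy": False}))
print(predict(tree, {"outlook": "rainy", "windy": True}))
```

A learning algorithm would choose which feature to test at each node from the training data; here the tests are fixed by hand to keep the example short.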
14. K-Means clustering
K-Means clustering is an unsupervised learning algorithm that groups an unlabeled
dataset into different clusters. Here K is the number of clusters, chosen in advance, that
are created in the process: if K=2 there will be two clusters, if K=3 there will be
three clusters, and so on.
“It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a
way that each data point belongs to only one group that has similar properties.”
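The iteration can be sketched as follows for two-dimensional points with K=2. The points and the starting centroids are hypothetical, and the starting centroids are fixed rather than chosen at random so that the result is repeatable.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from the point to every centroid.
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data forming two visibly separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(clusters)
```

Each point ends up in exactly one cluster, which matches the quoted definition. In practice the result can depend on the starting centroids, so implementations often run the algorithm several times with different starts.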
15. Project
Breast cancer detection
We have extracted features from the cells of breast cancer patients and
of healthy people, and built an ML model to
classify tumors as malignant or benign. To complete this ML
project we used a supervised machine learning classifier
algorithm.