Introduction To Machine Learning: About Data, Numerical Data or Quantitative Data, Categorical Data or Qualitative Data, Ordinal Data, what is Machine Learning, examples of Machine Learning, Examples of Deep Learning, Examples of Artificial Intelligence, Supervised Learning Vs Unsupervised Learning, how to execute ML Programs.
Unit-I
2. INTRODUCTION
Since 2012, the amount of data has been increasing day by day.
Examples:
When we book a flight ticket, data is created.
When we go for a health check-up, data is created.
Our water bills, electricity bills, newspapers and news magazines contain data.
We also create and share data by way of messages, mails, YouTube, Twitter, Facebook and
Instagram.
Companies such as Google, Amazon and Microsoft have conducted research on how this data can be used for the welfare
of society.
Example: An ice cream manufacturing company
Generally, business organizations that base their decisions on past data make at least
18% more profit than companies that take decisions based on intuition.
3. INTRODUCTION Contd..
Analysing data and drawing useful conclusions from it is a very important task for
any organization. This is known as Data Science, and a person who does this task
efficiently is called a Data Scientist.
Data Science is an umbrella term for various technologies like Machine Learning
(ML), Deep Learning (DL), Artificial Intelligence (AI) and statistical methods.
A Data Scientist's duty is to analyze data and draw insights that are useful for
business growth. Since the Data Scientist's role is very important in any business
organization, it is often among the most respected and best-paid roles in the organization.
5. Contd..
A Data Scientist should have good knowledge in the fields of:
1. Mathematics and Statistics
Linear Algebra: vectors, matrices, dot products
Statistics: mean, median, mode, standard deviation, and distributions like the Normal,
Binomial, Bernoulli and Geometric distributions
2. Computer Programming (R and Python), including implementing Machine Learning models like Regression
techniques, Classification techniques, Decision Trees, Artificial Neural Networks (ANNs), etc.
3. Domain knowledge
A Data Scientist should have domain knowledge also. Domain knowledge is
knowledge about the business or field in which you are working.
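The statistical measures and the dot product listed above can be tried out with Python's standard library alone. The following is only a small sketch; the salary figures and the two vectors are made-up values used for illustration.

```python
import statistics

# Hypothetical sample: monthly salaries (in thousands) of seven employees.
salaries = [30, 35, 35, 40, 45, 50, 80]

mean_salary = statistics.mean(salaries)      # arithmetic average
median_salary = statistics.median(salaries)  # middle value after sorting
mode_salary = statistics.mode(salaries)      # most frequent value
std_salary = statistics.pstdev(salaries)     # population standard deviation

# Dot product of two vectors: multiply matching elements and add the results.
vector_a = [1, 2, 3]
vector_b = [0.5, 0.25, 2.0]
dot_product = sum(a * b for a, b in zip(vector_a, vector_b))

print("mean:", mean_salary)
print("median:", median_salary)
print("mode:", mode_salary)
print("standard deviation:", round(std_salary, 2))
print("dot product:", dot_product)
```

Note how the single large salary pulls the mean above the median, which is one reason both measures are usually looked at together.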
6. Contd..
Data Science is the science of data. Diving into the ocean of data and coming up with precious gems is
the main aim of Data Science. Data Science contains concepts like Machine Learning,
Deep Learning and Artificial Intelligence.
7. About Data
Data: "factual information used as a basis for reasoning, discussion, or calculation"
(Merriam-Webster dictionary).
Human beings produce data and then collect it so that calculations can be done on it
and conclusions can be drawn.
Data: "information in digital form that can be transmitted or processed".
That means data can be stored in digital form in computers and can be
processed to get useful insights.
Data Science considers 4 types of data:
1. Numerical data or quantitative data
2. Categorical data or qualitative data
3. Ordinal data
4. Images, Audio and Video data
9. 1. Numerical Data or Quantitative Data
Numerical data is represented in the form of numbers. There are two types.
a) Discrete Data: It takes separate, countable values and is represented as an integer or floating-point number.
Discrete variables are finite, numeric and countable; they are often
non-negative integers (5, 10, 15, and so on).
Discrete data can also be categorical, containing a finite number of data values,
such as the gender of a person.
Example:
The number of customers in a bank. (10000 customers is an integer number.)
The salary of an employee. (R 45000.55 is a floating-point number.)
10. 1. Numerical Data or Quantitative Data Cont..
b) Continuous Data: It varies continuously and can take any value within a range, so it can be measured
to ever finer precision. Length and height related measurements come under this category.
Example:
The time between customer arrivals at a bank ATM. (This duration may be measured
as 5 minutes 30 seconds 20 milliseconds 10 microseconds 5 nanoseconds, and it goes on
continuously. Since we cannot measure this continuous data fully due to the limitations of our clocks,
we cut it after seconds and say it is 5 minutes 30 seconds.)
How much rain fell on a given day? (This may be 15 centimetres 8 millimetres 5
micrometres, and it continues like this. Since we cannot measure this continuous data fully due
to the limitations of our tapes, we cut it after millimetres and say it is 15 centimetres 8
millimetres of rain.)
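The difference between a discrete count and a continuous measurement that is cut off at some precision can be shown in a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the numbers simply restate the bank and ATM examples above.

```python
# Discrete data: a count of customers is a whole number.
customers = 10000

# Continuous data: a hypothetical gap between two customer arrivals,
# written in seconds (5 min 30 s 20 ms 10 microseconds 5 ns).
gap_seconds = 330.020010005

# A clock that only shows whole seconds cuts off everything finer.
whole_seconds = int(gap_seconds)
minutes, seconds = divmod(whole_seconds, 60)

print("customers:", customers)
print("measured gap:", minutes, "minutes", seconds, "seconds")
```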
13. 2. Categorical Data or Qualitative Data (Non-numeric)
Categorical data has no inherent mathematical meaning. It contains only a limited
number of possibilities.
Examples:
Gender. (This has only 2 possibilities like Male, Female)
Marital Status. (This has only 4 possibilities like Single, Married, Widowed,
Divorced)
Religion of a student in a class. (This has 6 possibilities like Hindu, Christian, Sikh,
Jain, Buddhist)
Categorical data has a limited (finite) number of possible values, called categories or
classes. We can assign numbers to categories in order to represent them more compactly,
but the numbers do not have mathematical meaning.
Example: we can assign 1 for Male and 0 for Female, but these numbers do not
have any mathematical meaning.
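Assigning numbers to categories can be sketched as follows. The list of marital statuses is a made-up sample. The second part shows one-hot encoding, a common alternative that is not described in these slides and is included only as an illustration of how to avoid implying an order between categories.

```python
# Hypothetical sample of a categorical column.
marital_status = ["Single", "Married", "Widowed", "Divorced", "Married", "Single"]

# Give every distinct category an integer code. The codes are only labels:
# "Widowed" getting a larger code than "Married" means nothing mathematically.
categories = sorted(set(marital_status))
code_of = {category: code for code, category in enumerate(categories)}
encoded = [code_of[status] for status in marital_status]

# One-hot encoding: one 0/1 column per category, so no order is implied.
one_hot = [
    [1 if code_of[status] == code else 0 for code in range(len(categories))]
    for status in marital_status
]

print(categories)
print(encoded)
print(one_hot[0])
```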
15. 3. Ordinal Data
Ordinal data is a mixture of numerical and categorical data: it has a limited number of
categories, and the categories have a meaningful order.
Example:
Movie ratings on a 1 to 5 scale. (Ratings can be 1, 2, 3, 4 or 5 only, and these values have
an order: 1 means a poor movie, 2 a better movie, and 5 an excellent
movie.)
Sizes of a shirt. (This can be S/M/L/XL. There are only 4 values, but they are ordered: S
means Small, M means Medium, L means Large and XL means Extra-large.)
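Because ordinal categories have an order, they can be sorted and compared once each category is given a rank. A minimal sketch using the shirt sizes from the example follows; the list of shirts is made up.

```python
# The order of the categories is what makes the data ordinal.
SIZE_ORDER = ["S", "M", "L", "XL"]
rank = {size: position for position, size in enumerate(SIZE_ORDER)}

# Hypothetical shirts in a shop.
shirts = ["L", "S", "XL", "M", "S"]

# Sorting and comparing use the rank, not the alphabetical order of the labels.
sorted_shirts = sorted(shirts, key=rank.__getitem__)
largest = max(shirts, key=rank.__getitem__)

print(sorted_shirts)
print("largest size:", largest)
```

The ranks only say which size comes before which; they do not say that XL is three times S.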
16. 4. Images, Audio and Video Data
An image is stored in the form of several pixels.
A pixel (short form for picture element) is a minute dot.
When we show a character like 'A' on the screen, it is composed of several minute dots
that are called pixels.
In a black-and-white image, each pixel can be represented by either 1 or 0. A bright pixel is represented by a 1
and a dark pixel, where there is no light, is represented by a 0.
Such an image is converted into a matrix of 1s and 0s. These 1s and 0s are called binary
digits or bits.
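The idea of an image as a matrix of 1s and 0s can be sketched with a small hand-made bitmap. The 5x5 grid below is a hypothetical drawing of the letter 'A', not data taken from a real image file.

```python
# A hypothetical 5x5 bitmap of the letter 'A': 1 = bright pixel, 0 = dark pixel.
letter_a = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]

# Show the matrix as a picture: '#' for a bright pixel, '.' for a dark one.
for row in letter_a:
    print("".join("#" if pixel else "." for pixel in row))

# Count how many pixels are bright.
bright_pixels = sum(sum(row) for row in letter_a)
print("bright pixels:", bright_pixels)
```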
17. 4. Images, Audio and Video Data
If it is a colour image, we get 3 colours of pixels: Red (R),
Green (G) and Blue (B), which are called primary colours.
All other colours can be produced by combining these 3 colours. Each pixel in a colour image is
represented with R, G and B values.
When the Red, Green and Blue values are separated, 3 channels are
obtained. Each channel is a matrix of pixels.
All these are represented by binary digits 1s and 0s. Thus, there will be 3 matrices (of 0s and 1s).
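Separating a colour image into its three channels can be sketched as follows. The 2x2 image is made up, and each colour value is kept as 0 or 1 to match the simplified description here; real image files commonly store each channel value in 8 bits (0 to 255).

```python
# Hypothetical 2x2 colour image; each pixel is an (R, G, B) triple.
image = [
    [(1, 0, 0), (0, 1, 0)],   # a red pixel, a green pixel
    [(0, 0, 1), (1, 1, 0)],   # a blue pixel, a yellow pixel (red + green)
]

# Each channel is a matrix holding one colour component of every pixel.
red_channel = [[pixel[0] for pixel in row] for row in image]
green_channel = [[pixel[1] for pixel in row] for row in image]
blue_channel = [[pixel[2] for pixel in row] for row in image]

print("R:", red_channel)
print("G:", green_channel)
print("B:", blue_channel)
```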
An audio file records variations in the intensity of sound. These variations are represented by
values from 0 to 1.
A video file is a group of images (called frames) that are displayed so quickly that
our eyes cannot identify them separately, which
gives the illusion of movement.
18. 4. Images, Audio and Video Data Contd..
This is the concept behind animation or movement that we observe in videos. Thus,
a video contains several hundreds of images, all of which can be converted into 1s and
0s.
So, we consider images, audio and video files as 1s and 0s only in our Data
Science and Machine Learning projects.
20. Machine Learning(ML)
ML is the field of computer science that gives "computers the ability to learn without
being explicitly programmed".
Machine Learning contains computer algorithms that observe and analyze
data on their own.
These algorithms are used to represent the relationships between various elements in the
data.
The final representation of the data produced by an algorithm is called a Machine Learning model. This
model works like a brain for the computer in understanding the data.
Data is generally divided into two parts:
Train data
Test data
The train data is used to train the model.
The test data is kept aside and is not shown to the model during training. It is used to check how well
the trained model predicts on new data that it has not seen before.
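Dividing data into train and test parts can be sketched with the standard library. The dataset below is made up, and the 80/20 split ratio and the fixed random seed are assumptions chosen for illustration; they are not prescribed by these slides.

```python
import random

# Hypothetical dataset: (years of experience, salary in thousands).
data = [
    (1, 30), (2, 35), (3, 40), (4, 45), (5, 50),
    (6, 55), (7, 60), (8, 65), (9, 70), (10, 75),
]

# Shuffle a copy so that the split does not depend on the original row order.
random.seed(42)  # fixed seed only to make the sketch repeatable
shuffled = data[:]
random.shuffle(shuffled)

# Keep 80% of the rows for training and the remaining 20% for testing.
split_index = int(0.8 * len(shuffled))
train_data = shuffled[:split_index]
test_data = shuffled[split_index:]

print("train rows:", len(train_data))
print("test rows:", len(test_data))
```

The model is trained only on `train_data`; `test_data` is used afterwards to check its predictions on rows it has never seen.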
22. Machine Learning Contd..
So, the computer is able to learn how to analyze new data and provide useful
insights/predictions.
Figure: How a Machine Learning model works
26. Machine Learning Contd..
Examples:
1. Predicting traffic conditions
Traffic is shown in Green, Orange and Red colours
Based on the speed of vehicles
2. Product recommendations
For example, after purchasing a camera on Amazon or watching a horror movie on Netflix
Useful to improve the business of E-Commerce companies
3. Voice recognition system
Voice to text conversion
Siri, Google Assistant, etc.
4. Stock market predictions
Share market trend
5. Medical diagnosis
Analyze past data of the patient and prescribe the correct medicine.
Forecast an oncoming disease and prescribe appropriate medicine.
6. Spam email filtering
Analyzing the words of an email to detect whether it is a spam mail or not
27. Deep Learning(DL)
Deep Learning is a subset of Machine Learning where machines learn on their own using
Artificial Neural Networks (ANNs).
An Artificial Neural Network contains several layers and, in each layer, there are
several nodes.
Each node takes data, processes it and sends the result to the next layer, where it is processed further.
The nodes are similar to the biological neurons that exist in the human brain.
Neurons in the human brain accept signals, for example signals that arise from light entering the eyes, and process them to
understand the meaning of the signals.
This is how we recognize objects through our eyes.
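How data flows through the layers and nodes of an Artificial Neural Network can be sketched in plain Python. This is only an illustration: the weights and biases are made-up numbers (a real network learns them from train data), and the sigmoid activation function is an assumed choice that these slides do not name.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer.

    Each node takes a weighted sum of all inputs, adds its bias and
    applies the sigmoid function.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Hypothetical network: 2 inputs -> hidden layer of 3 nodes -> 1 output node.
hidden_weights = [[0.5, -0.2], [0.3, 0.8], [-0.6, 0.1]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.7, -0.4, 0.2]]
output_biases = [0.05]

inputs = [1.0, 2.0]
hidden_outputs = layer_forward(inputs, hidden_weights, hidden_biases)
final_output = layer_forward(hidden_outputs, output_weights, output_biases)

print("hidden layer outputs:", hidden_outputs)
print("network output:", final_output)
```

The output of the hidden layer becomes the input of the next layer, which is the layer-to-layer flow described above.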
29. Deep Learning(DL) Contd..
Examples:
Image recognition
To identify persons, animals, places and objects
To detect criminals
To suggest names for photos
Sentiment analysis
Sentiment represents the opinion expressed by a person regarding whether a product or
service is good or bad.
Example: rating a movie based on its reviews
Language translation: Translating from one language to another can be done with the help of
Neural Networks that are already trained on the words of the languages.
Music composition: Neural Networks can make computers recognize the patterns in
music (sound).
30. Artificial Intelligence(AI)
Artificial Intelligence is a wide-ranging branch of computer science concerned with
building smart machines capable of performing tasks that typically require human
intelligence (ability to think and act).
Each part of such a machine is built to imitate the working of a human body
part. For example, a robotic eye uses sensors (some robotic eyes do not even
use sensors) to detect objects, and an Artificial Neural Network analyses each object to
understand what it is.
31. Artificial Intelligence(AI) Contd..
Examples:
1. Self-driving cars:
Developed by companies such as Tesla and Waymo (formerly the Google self-driving car project).
They drive by following the signals, people and other objects on the road.
They use sensors to identify other vehicles and objects while driving.
2. AI Robots: Robots can do complex tasks. If a robot can take its own decisions, it is called an
AI Robot. AI Robots are used for tasks that are difficult or dangerous for human beings, for example
repairing hot furnaces, cleaning nuclear waste, conducting complex surgeries, etc.
Figures: Sophia (AI Robot); Tesla (self-driving car)
33. Artificial Intelligence(AI) Contd..
When an AI Robot looks like a human being, it is called a humanoid. (Strictly speaking, a cyborg is a being
with both biological and mechanical parts.)
A well-known humanoid, named Sophia, was developed by the Hong Kong based company Hanson
Robotics in the year 2016.
It is capable of imitating human gestures and facial expressions. It can also engage
in simple conversations.
In October 2017, Sophia was granted citizenship by Saudi Arabia. Thus, it is
the first robot to receive citizenship of any country.
34. Supervised Learning vs Unsupervised Learning
Supervised Learning models take data with column names (labels) and a
target column to be predicted.
The data is generally divided into train and test data.
The models learn from the train data and check their learning
accuracy using the test data.
Example: Predict the salary of a new employee based on their experience.
Unsupervised Learning models work on data without labels and target
columns.
They try to understand the given data by extracting features and
patterns on their own.
They do not predict a known target, but they provide some
information about the given data.
Example: Group the items of a supermarket into classes such as vegetables, oils or soaps,
depending on their similarity.
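The salary example for Supervised Learning can be sketched with a straight-line fit (simple linear regression by least squares). The data is made up, and linear regression is only one possible supervised model, chosen here because it is short enough to write by hand.

```python
# Hypothetical labelled data: the target (salary) is known for every row.
experience = [1, 2, 3, 4, 5]      # years of experience (input column)
salary = [30, 35, 40, 45, 50]     # salary in thousands (target column)

# Least-squares fit of the line: salary = intercept + slope * experience.
n = len(experience)
mean_x = sum(experience) / n
mean_y = sum(salary) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(experience, salary))
    / sum((x - mean_x) ** 2 for x in experience)
)
intercept = mean_y - slope * mean_x


def predict_salary(years):
    """Predict the salary (in thousands) for the given years of experience."""
    return intercept + slope * years


# Predict for a new employee whose salary is not in the data.
print("predicted salary for 6 years:", predict_salary(6))
```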
35. Supervised Learning vs Unsupervised Learning Contd..
Several ML algorithms (models) have been developed by Data Scientists and
Mathematicians for both Supervised and Unsupervised Learning, and they are classified
under these two headings.
Figure: Classification of Supervised and Unsupervised Learning algorithms
40. How to Execute Machine Learning Programs
All the Machine Learning models are available in the form of classes and
methods in the Python language.
They are provided as part of libraries, which we have to import and use in our programs.
Since we have to work with data, and sometimes the data may be huge, we need an
environment that is suitable for handling huge data efficiently.
For this purpose, Anaconda is highly recommended by Data Scientists and Machine
Learning experts. Anaconda is a distribution (or platform) that contains several
IDEs and tools to develop Data Science and Machine Learning related programs.
Anaconda provides important tools like the Spyder IDE and Jupyter Notebook. It also
comes with its own copy of Python and supports the R programming language.
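Importing a model class from a library and using its methods can look like the sketch below. It assumes the scikit-learn library is installed; the slides do not name a specific library, so this choice and the tiny dataset are assumptions. The import is guarded so that the program still runs where scikit-learn is missing.

```python
# scikit-learn is an assumption of this sketch, so the import is guarded.
try:
    from sklearn.linear_model import LinearRegression
except ImportError:
    LinearRegression = None

prediction = None
if LinearRegression is None:
    print("scikit-learn is not installed, so the example is skipped.")
else:
    # Hypothetical train data: years of experience -> salary in thousands.
    x_train = [[1], [2], [3], [4]]
    y_train = [30, 35, 40, 45]

    model = LinearRegression()     # create the model object from its class
    model.fit(x_train, y_train)    # train the model on the train data
    prediction = model.predict([[5]])[0]   # predict for new data
    print("predicted salary for 5 years:", prediction)
```

The same three steps (create the object, call `fit`, call `predict`) apply to many other model classes in that library.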
42. Supervised vs. Unsupervised Learning
• Supervised Learning:
• All the predictors, Xi, and the response, Yi, are observed.
• Many regression and classification methods
• Unsupervised Learning:
• Here, only the Xi’s are observed (not Yi’s).
• We need to use the Xi’s to guess what Y would have been, and then build a
model from there.
• Clustering and principal components analysis
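Clustering, where only the Xi's are observed, can be sketched with a very small version of the k-means algorithm on one-dimensional data. The item prices, the number of clusters and the starting centres are all made-up values for illustration.

```python
def k_means_1d(values, centres, iterations=10):
    """Group numbers into clusters around the given starting centres."""
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # Step 1: put every value into the cluster of its nearest centre.
        clusters = [[] for _ in centres]
        for value in values:
            nearest = min(range(len(centres)), key=lambda i: abs(value - centres[i]))
            clusters[nearest].append(value)
        # Step 2: move every centre to the mean of its cluster
        # (an empty cluster keeps its old centre).
        centres = [
            sum(cluster) / len(cluster) if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]
    return centres, clusters


# Hypothetical unlabelled data: prices of six supermarket items.
prices = [1.0, 1.2, 0.8, 9.5, 10.0, 10.5]
centres, clusters = k_means_1d(prices, centres=[0.0, 5.0])

print("cluster centres:", centres)
print("clusters:", clusters)
```

No labels are given to the algorithm; the two groups emerge only from how close the values are to each other.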