Se ha denunciado esta presentación.
Se está descargando tu SlideShare. ×

Unsupervised learning seminar report

Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Page | 1
UNSUPERVISED LEARNING (UL)
SEMINAR REPORT
Page | 2
CHAPTER 1
INTRODUCTION
MACHINE LEARNING
FIG 1
Machine Learning tutorial provides basic and advanced concepts of
m...
Page | 3
What Is Machine Learning
In the real world, we are surrounded by humans who can learn
everything from their exper...
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Cargando en…3
×

Eche un vistazo a continuación

1 de 26 Anuncio

Más Contenido Relacionado

Presentaciones para usted (20)

Similares a Unsupervised learning seminar report (20)

Anuncio

Más reciente (20)

Anuncio

Unsupervised learning seminar report

  1. 1. Page | 1 UNSUPERVISED LEARNING (UL) SEMINAR REPORT
  2. 2. Page | 2 CHAPTER 1 INTRODUCTION MACHINE LEARNING FIG 1 Machine Learning tutorial provides basic and advanced concepts of machine learning. Our machine learning tutorial is designed for students and working professionals. Machine learning is a growing technology which enables computers to learn automatically from past data. Machine learning uses various algorithms for building mathematical models and making predictions using historical data or information. Currently, it is being used for various tasks such as image recognition, speech recognition, email filtering, Facebook auto-tagging, recommender system, and many more. This machine learning tutorial gives you an introduction to machine learning along with the wide range of machine learning techniques such as Supervised, Unsupervised, and Reinforcement learning. You will learn about regression and classification models, clustering methods, hidden Markov models, and various sequential models.
  3. 3. Page | 3 What Is Machine Learning In the real world, we are surrounded by humans who can learn everything from their experiences with their learning capability, and we have computers or machines which work on our instructions. But can a machine also learn from experiences or past data like a human does? So here comes the role of Machine Learning. FIG 2 Machine Learning is said as a subset of artificial intelligence that is mainly concerned with the development of algorithms which allow a computer to learn from the data and past experiences on their own. The term machine learning was first introduced by Arthur Samuel in 1959. We can define it in a summarized way as: Machine learning enables a machine to automatically learn from data, improve performance from experiences, and predict things without being explicitly programmed.
  4. 4. Page | 4 FIG 3 With the help of sample historical data, which is known as training data, machine learning algorithms build a mathematical model that helps in making predictions or decisions without being explicitly programmed. Machine learning brings computer science and statistics together for creating predictive models. Machine learning constructs or uses the algorithms that learn from historical data. The more we will provide the information, the higher will be the performance. Classification of Machine Learning At a broad level, machine learning can be classified into three types: 1) Supervised learning 2) Unsupervised learning 3) Reinforcement learning
  5. 5. Page | 5 FIG 4 FIG 5
  6. 6. Page | 6 FIG 6 FIG 7
  7. 7. Page | 7 CHAPTER 2 UNSUPERVISED LEARNING What is unsupervised learning? As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using training dataset. Instead, models itself find the hidden patterns and insights from the given data. It can be compared to learning which takes place in the human brain while learning new things. It can be defined as: Unsupervised learning is a type of machine learning in which models are trained using unlabeled dataset and are allowed to act on that data without any supervision. FIG 8
  8. 8. Page | 8 Unsupervised learning cannot be directly applied to a regression or classification problem because unlike supervised learning, we have the input data but no corresponding output data. The goal of unsupervised learning is to find the underlying structure of dataset, group that data according to similarities, and represent that dataset in a compressed format. Example: Suppose the unsupervised learning algorithm is given an input dataset containing images of different types of cats and dogs. The algorithm is never trained upon the given dataset, which means it does not have any idea about the features of the dataset. The task of the unsupervised learning algorithm is to identify the image features on their own. Unsupervised learning algorithm will perform this task by clustering the image dataset into the groups according to similarities between images. FIG 9
  9. 9. Page | 9 CHAPTER 3 USES AND WORKING OF UNSUPERVISED LEARNING Why use Unsupervised Learning? Below are some main reasons which describe the importance of Unsupervised Learning: Unsupervised learning is helpful for finding useful insights from the data. Unsupervised learning is much similar as a human learns to think by their own experiences, which makes it closer to the real AI. Unsupervised learning works on unlabeled and uncategorized data which make unsupervised learning more important. In real-world, we do not always have input data with the corresponding output so to solve such cases, we need unsupervised learning. Working of Unsupervised Learning ? Working of unsupervised learning can be understood by the below diagram:
  10. 10. Page | 10 FIG 10 FIG 11 FIG 12
  11. 11. Page | 11 Here, we have taken an unlabeled input data, which means it is not categorized and corresponding outputs are also not given. Now, this unlabeled input data is fed to the machine learning model in order to train it. Firstly, it will interpret the raw data to find the hidden patterns from the data and then will apply suitable algorithms such as k-means clustering, Decision tree, etc. In unsupervised learning, an AI system is presented with unlabeled, uncategorized data and the system’s algorithms act on the data without prior training. The output is dependent upon the coded algorithms. Subjecting a system to unsupervised learning is an established way of testing the capabilities of that system. Unsupervised learning algorithms can perform more complex processing tasks than supervised learning systems. However, unsupervised learning can be more unpredictable than the alternate model. A system trained using the unsupervised model, might, for example, figure out on its own how to differentiate cats and dogs, it might also add unexpected and undesired categories to deal with unusual breeds, which might end up cluttering things instead of keeping them in order. For unsupervised learning algorithms, the AI system is presented with an unlabeled and uncategorized data set. The thing to keep in mind is that this system has not undergone any prior training. In essence, unsupervised learning can be thought of as learning without a teacher. In case of supervised learning, the system has both the inputs and the outputs. So depending on the difference between the desired output and the observed output, the system is set to learn and improve. However, in the case of unsupervised learning, the system only has inputs and no outputs.
  12. 12. Page | 12 CHAPTER 4 TYPES OF UNSUPERVISED LEARNING ALGORITHM Once it applies the suitable algorithm, the algorithm divides the data objects into groups according to the similarities and difference between the objects. Types of Unsupervised Learning Algorithm: The unsupervised learning algorithm can be further categorized into two types of problems: FIG 13
  13. 13. Page | 13 FIG 14 Clustering: Clustering is a method of grouping the objects into clusters such that objects with most similarities remains into a group and has less or no similarities with the objects of another group. Cluster analysis finds the commonalities between the data objects and categorizes them as per the presence and absence of those commonalities. Association: An association rule is an unsupervised learning method which is used for finding the relationships between variables in the large database. It determines the set of items that occurs together in the dataset. Association rule makes marketing strategy more effective. Such as people who buy X item (suppose a bread) are also tend to purchase Y (Butter/Jam) item. A typical example of Association rule is Market Basket Analysis.
  14. 14. Page | 14 UNSUPERVISED LEARNING ALGORITHMS: Below is the list of some popular unsupervised learning algorithms: 1) K-means clustering 2) KNN (k-nearest neighbors) 3) Hierarchal clustering 4) Anomaly detection 5) Neural Networks 6) Principle Component Analysis 7) Independent Component Analysis 8) Apriori algorithm 9) Singular value decomposition Hierarchical Clustering: Hierarchical clustering is an algorithm which builds a hierarchy of clusters. It begins with all the data which is assigned to a cluster of their own. Here, two close cluster are going to be in the same cluster. This algorithm ends when there is only one cluster left. K-means Clustering K means it is an iterative clustering algorithm which helps you to find the highest value for every iteration. Initially, the desired number of clusters are selected. In this clustering method, you need to cluster the data points into k groups. A larger k means smaller groups with more granularity in the same way. A lower k means larger groups with less granularity. The output of the algorithm is a group of "labels." It assigns data point to one of the k groups. In k-means clustering, each group is defined by creating a centroid for each group. The centroids are like the heart of the cluster, which captures the points closest to them and adds them to the cluster.
  15. 15. Page | 15 K-mean clustering further defines two subgroups: Agglomerative clustering Dendrogram Agglomerative clustering: This type of K-means clustering starts with a fixed number of clusters. It allocates all data into the exact number of clusters. This clustering method does not require the number of clusters K as an input. Agglomeration process starts by forming each data as a single cluster. This method uses some distance measure, reduces the number of clusters (one in each iteration) by merging process. Lastly, we have one big cluster that contains all the objects. Dendrogram: In the Dendrogram clustering method, each level will represent a possible cluster. The height of dendrogram shows the level of similarity between two join clusters. The closer to the bottom of the process they are more similar cluster which is finding of the group from dendrogram which is not natural and mostly subjective. K- Nearest neighbors K- nearest neighbour is the simplest of all machine learning classifiers. It differs from other machine learning techniques, in that it doesn't produce a model. It is a simple algorithm which stores all available cases and classifies new instances based on a similarity measure. It works very well when there is a distance between examples. The learning speed is slow when the training set is large, and the distance calculation is nontrivial.
  16. 16. Page | 16 Principal Components Analysis: In case you want a higher-dimensional space. You need to select a basis for that space and only the 200 most important scores of that basis. This base is known as a principal component. The subset you select constitute is a new space which is small in size compared to original space. It maintains as much of the complexity of data as possible.
  17. 17. Page | 17 CHAPTER 5 DIFFERENCE BETWEEN SUPERVISED AND UNSUPERVISED LEARNING Supervised and Unsupervised learning are the two techniques of machine learning. But both the techniques are used in different scenarios and with different datasets. Below the explanation of both learning methods along with their difference table is given. FIG 15 Supervised Machine Learning: Supervised learning is a machine learning method in which models are trained using labeled data. In supervised learning, models need to find the mapping function to map the input variable (X) with the output variable (Y). Supervised learning needs supervision to train the model, which is similar to as a student learns things in the presence of a teacher.
  18. 18. Page | 18 Supervised learning can be used for two types of problems: Classification and Regression. Example: Suppose we have an image of different types of fruits. The task of our supervised learning model is to identify the fruits and classify them accordingly. So to identify the image in supervised learning, we will give the input data as well as output for that, which means we will train the model by the shape, size, color, and taste of each fruit. Once the training is completed, we will test the model by giving the new set of fruit. The model will identify the fruit and predict the output using a suitable algorithm. Unsupervised Machine Learning: Unsupervised learning is another machine learning method in which patterns inferred from the unlabeled input data. The goal of unsupervised learning is to find the structure and patterns from the input data. Unsupervised learning does not need any supervision. Instead, it finds patterns from the data by its own. Unsupervised learning can be used for two types of problems: Clustering and Association. Example: To understand the unsupervised learning, we will use the example given above. So unlike supervised learning, here we will not provide any supervision to the model. We will just provide the input dataset to the model and allow the model to find the patterns from the data. With the help of a suitable algorithm, the model will train itself and divide the fruits into different groups according to the most similar features between them.
  19. 19. Page | 19 The main differences between Supervised and Unsupervised learning are given below: Supervised Learning Unsupervised Learning Supervised learning algorithms are trained using labeled data. Unsupervised learning algorithms are trained using unlabeled data. Supervised learning model takes direct feedback to check if it is predicting correct output or not. Unsupervised learning model does not take any feedback. Supervised learning model predicts the output. Unsupervised learning model finds the hidden patterns in data. In supervised learning, input data is provided to the model along with the output. In unsupervised learning, only input data is provided to the model. The goal of supervised learning is to train the model so that it can predict the output when it is given new data. The goal of unsupervised learning is to find the hidden patterns and useful insights from the unknown dataset. Supervised learning needs supervision to train the model. Unsupervised learning does not need any supervision to train the model. Supervised learning can be categorized in Classification and Regression pro blems. Unsupervised Learning can be classified in Clustering and Associations pro blems. Supervised learning can be used for those cases where we know the input as well as corresponding outputs. Unsupervised learning can be used for those cases where we have only input data and no corresponding output data. Supervised learning model produces an accurate result. Unsupervised learning model may give less accurate result as compared to supervised learning. Supervised learning is not close to true Artificial intelligence as in this, we first train the model for each Unsupervised learning is more close to the true Artificial Intelligence as it learns similarly
  20. 20. Page | 20 data, and then only it can predict the correct output. as a child learns daily routine things by his experiences. It includes various algorithms such as Linear Regression, Logistic Regression, Support Vector Machine, Multi-class Classification, Decision tree, Bayesian Logic, etc. It includes various algorithms such as Clustering, KNN, and Apriori algorithm. FIG 16
  21. 21. Page | 21 FIG 17
  22. 22. Page | 22 TABLE 2
  23. 23. Page | 23 CHAPTER 6 APPLICATIONS AND PRONS AND CONS APPPLICATIONS OF UNSUPERVISED LEARNING • Some applications of unsupervised machine learning techniques are: • Clustering automatically split the dataset into groups base on their similarities • Anomaly detection can discover unusual data points in your dataset. It is useful for finding fraudulent transactions • Association mining identifies sets of items which often occur together in your dataset • Latent variable models are widely used for data preprocessing. Like reducing the number of features in a dataset or decomposing the dataset into multiple components
  24. 24. Page | 24 PRONS AND CONS Advantages of Unsupervised Learning • Unsupervised learning is used for more complex tasks as compared to supervised learning because, in unsupervised learning, we don't have labeled input data. • Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to labeled data. Disadvantages of Unsupervised Learning • Unsupervised learning is intrinsically more difficult than supervised learning as it does not have corresponding output. • The result of the unsupervised learning algorithm might be less accurate as input data is not labeled, and algorithms do not know the exact output in advance. FIG 18
  25. 25. Page | 25 CHAPTER 7 SUMMARY Unsupervised learning is a machine learning technique, where you do not need to supervise the model. Unsupervised machine learning helps you to finds all kind of unknown patterns in data. Clustering and Association are two types of Unsupervised learning. Four types of clustering methods are 1) Exclusive 2) Agglomerative 3) Overlapping 4) Probabilistic. Important clustering types are: 1) Hierarchical clustering 2) K-means clustering 3) K-NN 4) Principal Component Analysis 5) Singular Value Decomposition 6) Independent Component Analysis. Association rules allow you to establish associations amongst data objects inside large databases. In Supervised learning, Algorithms are trained using labelled data while in Unsupervised learning Algorithms are used against data which is not labelled. Anomaly detection can discover important data points in your dataset which is useful for finding fraudulent transactions.
  26. 26. Page | 26 The biggest drawback of Unsupervised learning is that you cannot get precise information regarding data sorting. CONCLUSION Machine learning algorithms learn from data to find hidden relations, to make predictions, to interact with the world, … A machine learning algorithm is as good as its input data Good model + Bad data = Bad Results Deep learning is making significant breakthroughs in: speech recognition, language processing, computer vision, control systems, … If you are not using or considering using Deep Learning to understand or solve vision problems, you almost certainly should be ... REFERENCES : https://www.javatpoint.com/unsupervised-machine-learning https://en.wikipedia.org/wiki/Unsupervised_learning https://www.guru99.com/unsupervised-machine-learning.html

×