Machine Can Think

  2. 2. Agenda  Introduction  Basics  Types of Machine Learning  Machine Learning Technologies  Applications  Vision for the next few years
  3. 3. Quick Questionnaire  How many people have heard about Machine Learning?  How many people know about Machine Learning?  How many people are using Machine Learning?
  4. 4. What is Machine Learning?  Subfield of Artificial Intelligence.  Arthur Samuel introduced the concept of Machine Learning in 1959.  "Field of study that gives computers the ability to learn without being explicitly programmed".  A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P) if its performance at tasks in T, as measured by P, improves with E (Tom Mitchell).
  5. 5. What is Machine Learning?  Explores the study and construction of algorithms that can learn from and make predictions on data.  Algorithms operate by building a model from example inputs.  The model is used to make data-driven predictions or decisions.  This is unlike conventional programs, which follow strictly static instructions.
  6. 6. Artificial Intelligence  Machine Learning is a branch of Artificial Intelligence.  It aims to give machines learning capabilities like those of humans.  Even the fastest supercomputer is 32 times slower than the human brain.  Some predictions say that by 2060 we will be able to build a human-like digital brain.  NLP (Natural Language Processing) is also based on Machine Learning: the more data the machine has, the better its predictions become.  The Titanic disaster could have been prevented with Machine Learning.
  7. 7. Use of Machine Learning  Google Search, Google News and page ranking are driven by Machine Learning.  When you upload images, your friends' faces are detected automatically.  Spam filters separate our mail from tons of spam.  Matching the right product to the right customers.
  8. 8. More applications  Speech and handwriting recognition  Autonomous robot control  Data mining and bioinformatics: motifs, alignment, …  Playing games  Fault detection  Clinical diagnosis  Credit scoring, fraud detection  Web mining: search engines  Market basket analysis
  9. 9. Why Machine Learning  Human expertise does not exist (navigating on Mars), e.g. TARS in Interstellar.  Humans are unable to explain their expertise (speech recognition).  The solution changes over time (routing on a computer network).  The solution needs to be adapted to particular cases (user biometrics).
  10. 10. Terminology / Basic Terms  Features – The distinct traits that can be used to describe each item in a quantitative manner.  Samples – A sample is an item to process. It can be a document, picture, sound, video or any other file that contains data.  Feature Vector – An n-dimensional vector of features that represents some object.  Training Set – A set of data used to discover potentially predictive relationships.
  11. 11. Terminology with Example  Features: Color – Red, Type – Logo, Shape  Features: Color – Light Blue, Type – Logo, Shape  Here the samples are both apples, Feature Vector = [Color, Type, Shape], Training Set – all samples taken together
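As a rough illustration of the terminology above, the sketch below encodes the two samples as numeric feature vectors. The vocabularies, the `to_feature_vector` helper and the shape value `"apple"` are assumptions made for this example; the slide does not give a shape value or an encoding.

```python
# Hypothetical vocabularies used to turn categorical traits into numbers.
COLORS = ["red", "light blue"]
TYPES = ["photo", "logo"]
SHAPES = ["apple", "square"]

def to_feature_vector(sample):
    """Encode a sample as [color index, type index, shape index]."""
    return [
        COLORS.index(sample["color"]),
        TYPES.index(sample["type"]),
        SHAPES.index(sample["shape"]),
    ]

# Two samples that loosely mirror the slide (shape value is assumed).
samples = [
    {"color": "red", "type": "logo", "shape": "apple"},
    {"color": "light blue", "type": "logo", "shape": "apple"},
]

# The training set is the collection of all feature vectors taken together.
training_set = [to_feature_vector(s) for s in samples]
print(training_set)
```

Each sample becomes a 3-dimensional feature vector, so n = 3 here.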
  12. 12. Categories
  13. 13. Types of Problems and Tasks  Categorized by the nature of the learning "signal" or "feedback" available to a learning system:  Supervised Learning  Unsupervised Learning  Reinforcement Learning
  14. 14. Example of Supervised Learning
  15. 15. Supervised Learning  Learning from labelled data, i.e. a set of training examples.  Both the input and the desired output are given.  The goal is to learn a general rule that maps inputs to outputs.  In other words, find the relationship between input and output that holds across all the training examples.  The input is called the feature vector and the desired output the supervisory signal.  Presence of an expert or teacher.  E.g. Neural Networks, Decision Trees, Bayesian Classification.
  16. 16. To solve Supervised Learning problem  Determine the type of training examples.  Decide what kind of data is to be used as a training set.  Gather a training set.  A set of input objects and corresponding outputs is gathered.  Determine the input feature representation of the learned function.  The input object is transformed into a feature vector, which contains a number of features that are descriptive of the object.
  17. 17. To solve Supervised Learning problem  Determine the structure of the learned function and the corresponding learning algorithm.  Find the function or algorithm that maps the training inputs to their outputs.  It acts like a bridge connecting input to output.  Complete the design.  Add control parameters and adjust them by optimizing performance.  Evaluate the accuracy of the learned function.  Check whether it works properly; if not, redesign.
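The steps on the two slides above can be sketched end to end with a very simple learner. This is a minimal illustration, not the method the slides prescribe: it swaps in a 1-nearest-neighbour classifier, and the fruit measurements are made-up data.

```python
import math

# Gather a training set: (feature vector, label) pairs.
# Made-up features: [weight in grams, diameter in cm].
training_set = [
    ([150.0, 7.0], "apple"),
    ([170.0, 7.5], "apple"),
    ([8.0, 2.0], "cherry"),
    ([10.0, 2.2], "cherry"),
]

# Held-out examples used only to evaluate the learned function.
test_set = [
    ([160.0, 7.2], "apple"),
    ([9.0, 2.1], "cherry"),
]

def predict(features, examples):
    """Learned function: return the label of the closest training example."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], features))
    return nearest[1]

def accuracy(test_examples, examples):
    """Fraction of test examples whose predicted label matches the true one."""
    correct = sum(1 for x, y in test_examples if predict(x, examples) == y)
    return correct / len(test_examples)

print(accuracy(test_set, training_set))
```

If the measured accuracy were too low, the "redesign" step would mean changing the features, the learner, or its control parameters and evaluating again.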
  18. 18. Supervised Learning Flow Chart  (figure; stage labels: Raw Data, Algorithm, Sample Data, Trained Product, Verification, Production)
  19. 19. Application  Bioinformatics  Database marketing  Handwriting recognition  Spam detection  Pattern Recognition  Speech Recognition
  20. 20. Unsupervised Learning  No labels are given to the learning algorithm.  It finds structure in its input, for example with the help of clustering.  It discovers hidden patterns in data.  As the input is unlabelled, there is no error or reward signal to evaluate a potential solution. This makes it different from the others.  Self-guided learning algorithm.  Plays an important role in data mining methods used to preprocess data.  Approaches to Unsupervised Learning – k-means, hierarchical clustering, mixture models.
  21. 21. Unsupervised Learning
  22. 22. K-means / Hierarchical  K-means is a method of vector quantization.  It partitions n observations into k clusters, where each observation belongs to the cluster with the nearest mean.  Popular for cluster analysis in data mining.  Finding the optimal partition is an NP-hard problem.  Hierarchical clustering builds a hierarchy of clusters.  Agglomerative (bottom-up approach)  Divisive (top-down approach)
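Because the exact problem is NP-hard, k-means is usually run as an iterative heuristic that converges to a local optimum. The sketch below shows that heuristic (often called Lloyd's algorithm) in plain Python; the points and the initial centroids are made-up values chosen for illustration.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate an assignment step and a mean-update step."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

The result depends on the initial centroids, which is why practical implementations typically restart from several initializations.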
  23. 23. Applications
  24. 24. Difference Supervised Vs Unsupervised
  25. 25. Reinforcement Learning  The program interacts with a dynamic environment.  No explicit instructions.  It decides on its own whether it is getting closer to the goal.  Also known as "Approximate Dynamic Programming".  Unlike supervised learning, correct input/output pairs are never presented.  There is no supervised-style optimization step that tells us the goal has been reached.  There is a focus on on-line performance.  It finds a balance between exploration (of uncharted territory) and exploitation (of current knowledge).
  26. 26. Basic Reinforcement Learning Model  Set of environment states S.  Set of actions A.  Rules of transition between states.  Rules that determine the scalar immediate reward of a transition.  Rules that describe what the agent observes.
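One common way to balance exploration and exploitation is an epsilon-greedy rule, sketched below on a multi-armed bandit. The slide does not name a strategy, so this choice, the reward means and the parameter values are all assumptions for illustration.

```python
import random

def epsilon_greedy(estimates, epsilon, rng):
    """With probability epsilon explore a random action, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # exploration
    return max(range(len(estimates)), key=lambda a: estimates[a])  # exploitation

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Play a bandit whose arms pay Gaussian rewards around `true_means`."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # current knowledge per action
    counts = [0] * len(true_means)       # how often each action was tried
    for _ in range(steps):
        action = epsilon_greedy(estimates, epsilon, rng)
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Made-up environment: the second arm is clearly the best one.
estimates, counts = run_bandit([0.0, 5.0, 1.0])
print(counts)
```

No correct answer is ever shown to the agent; it only sees the scalar reward of the action it tried.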
  27. 27. Algorithms used for Reinforcement Learning  Criterion of optimality  The problem studied is episodic, an episode ending when some terminal state is reached.  Brute force (2 steps)  For each possible policy, sample returns while following it.  Choose the policy with the largest expected return.  Alternatives: 1. Value function estimation 2. Direct policy search  Value function approaches  They try to find a policy that maximizes the return by maintaining a set of estimates of expected returns.  Based on MDPs (Markov Decision Processes)
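To make the value function approach concrete, here is a minimal sketch using tabular Q-learning on a tiny episodic problem. Q-learning is one value-function method chosen for this example, not one named on the slide; the five-state corridor, its rewards and the hyperparameters are assumptions.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the terminal state
ACTIONS = (-1, +1)  # move left or move right

def step(state, action):
    """Transition rule plus the scalar immediate reward of the transition."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[state][action index] estimates the expected discounted return.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:  # an episode ends at the terminal state
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = q_learning()
# Greedy policy read off the learned value estimates.
policy = [ACTIONS[max(range(2), key=lambda i: q[s][i])] for s in range(N_STATES - 1)]
print(policy)
```

The policy is never searched directly; it falls out of the maintained estimates, which is the contrast with direct policy search.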
  28. 28. Applications of Reinforcement Learning  Game theory  Control theory  Operations research  Information theory  Simulation-based optimization  Multi-agent systems  Swarm intelligence  Statistics  Genetic algorithms
  29. 29. Semi-Supervised Learning  Semi-supervised learning is a class of supervised learning tasks.  It uses a large amount of unlabelled data together with the labelled data.  It falls between unsupervised learning and supervised learning.  Assumptions used in semi-supervised learning:  Smoothness assumption - Points which are close to each other are more likely to share a label.  Cluster assumption - The data tend to form discrete clusters, and points in the same cluster are more likely to share a label.  Manifold assumption - The data lie approximately on a manifold of much lower dimension than the input space.
  30. 30. How ML used in Hospitals
  31. 31. Machine Learning Methods based on output of a machine-learned system
  32. 32. Another Categorization  Based on the "desired output" of a machine-learned system:  Classification  Regression  Clustering
  33. 33. Classification  Predict a class from observations.  Inputs are divided into two or more classes.  The model assigns unseen inputs to one of these classes, or to more than one in multi-label classification.  Spam filtering is an example of classification, where the inputs are email (or other) messages and the classes are "spam" and "not spam".
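The spam example can be sketched with a tiny naive Bayes classifier. This is only an illustration under stated assumptions: the slide does not say which classifier a spam filter uses, and the four training messages are made up.

```python
import math
from collections import Counter

def train(messages):
    """Count words per class and messages per class from (text, label) pairs."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Return the class with the highest log-probability score."""
    vocab = set(word_counts["spam"]) | set(word_counts["not spam"])
    scores = {}
    for label in word_counts:
        total = sum(word_counts[label].values())
        # Log prior of the class.
        score = math.log(class_counts[label] / sum(class_counts.values()))
        # Log likelihood of each word, with add-one smoothing.
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

messages = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]
word_counts, class_counts = train(messages)
print(classify("win cheap money", word_counts, class_counts))
print(classify("meeting tomorrow at noon", word_counts, class_counts))
```

The output is always one of a fixed set of discrete classes, which is what distinguishes classification from regression.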
  34. 34. Regression  Estimates the relation between the mean value of one variable and the corresponding values of other variables.  Statistical method to find the relation between different variables.  Predicts the output from the training data and observations.  Popular method – linear regression (logistic or binary regression, despite its name, is typically used for classification).  The outputs are continuous rather than discrete.
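As a small worked example of predicting a continuous output, the sketch below fits a straight line by ordinary least squares, using slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄. The house sizes and prices are made-up numbers.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: house size (square metres) and price (thousands).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90.0 + intercept  # price estimate for an unseen 90 m² house
print(slope, intercept, predicted)
```

The prediction can take any real value, not just one of a few labels.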
  35. 35. Clustering  Grouping a set of objects in such a way that objects in the same group are similar to each other.  The groups are not predefined.  Objects are put into meaningful groups.  Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.  Example – Men's shoes, women's shoes, men's t-shirts, women's t-shirts.  These can be grouped in two ways: "men & women" or "t-shirts & shoes".
  36. 36. Popular Frameworks / Tools  Weka  Carrot2  GATE  OpenNLP  LingPipe  Mallet – Topic Modelling  Gensim – Topic Modelling (Python)  Apache Mahout  MLlib – Apache Spark  scikit-learn – Python
  37. 37. Difference  Classification  Classification means grouping the output into classes.  E.g. classification to predict the type of a tumor, i.e. harmful or not, using the training data sets.  If the output is a discrete / categorical variable, it is a classification problem.  Regression  Regression means predicting the output value using training data.  E.g. regression to predict the price of a house from training data sets.  If the output is a real / continuous variable, it is a regression problem.
  38. 38. Approaches
  39. 39. Decision Tree Learning  Predictive model.  Maps observations about an item to conclusions about the item's target value.  Used in statistics and data mining.  Tree models where the target variable can take a finite set of values are called classification trees.  Leaves represent class labels and branches represent conjunctions of features that lead to those labels.  Tree models where the target variable can take continuous values are called regression trees.  In data mining, a decision tree describes data but not decisions.  Example – Wikipedia
  40. 40. Artificial Neural Networks  Inspired by biological neural networks (the central nervous system of animals).  Used to estimate functions that depend on a large number of inputs and are generally unknown.  ANNs are generally presented as systems of interconnected "neurons" which exchange messages with each other.  Used to solve computer vision, speech recognition and handwriting recognition.  E.g. in handwriting recognition  1. Input neurons are activated by the pixels of an input image.  2. Weighted and transformed by a function, the activations of these neurons are then passed on to other neurons.  3. This process is repeated until finally an output neuron is activated. This determines which character was read.
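The simplest classification tree has a single split, often called a decision stump. The sketch below learns one by trying each candidate threshold and keeping the split with the fewest training errors. It is a simplification for illustration: full tree learners usually rank splits with an impurity measure such as Gini or entropy and recurse, and the tumor sizes are made-up data.

```python
def learn_stump(examples):
    """Learn a one-split tree from (value, label) pairs with labels 0 or 1.

    Returns (threshold, left_label, right_label).
    """
    values = sorted({v for v, _ in examples})
    best = None
    # Candidate thresholds lie halfway between neighbouring distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        for left_label in (0, 1):
            right_label = 1 - left_label
            errors = sum(
                1
                for v, y in examples
                if (left_label if v <= threshold else right_label) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label, right_label)
    return best[1:]

def predict(stump, value):
    """Follow the single branch and return the class label at the leaf."""
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label

# Made-up data: tumor size in cm, label 1 = harmful, 0 = not harmful.
examples = [(1.0, 0), (1.5, 0), (2.0, 0), (3.5, 1), (4.0, 1), (5.0, 1)]
stump = learn_stump(examples)
print(stump, predict(stump, 4.2))
```

The two leaves hold the class labels, and the branch condition is the (here single-term) conjunction of feature tests leading to each leaf.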
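The three handwriting-recognition steps above amount to a forward pass through the network. The sketch below shows one under simplifying assumptions: a four-pixel "image", a sigmoid activation, and hand-picked weights that are purely hypothetical (a real network would learn them from training data).

```python
import math

def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each neuron takes a weighted sum of its inputs and transforms it."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(pixels, network):
    """Pass activations from layer to layer until the output layer."""
    activations = pixels
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical weights: 4 input pixels -> 2 hidden neurons -> 2 output neurons.
network = [
    ([[1.0, 1.0, -1.0, -1.0], [-1.0, -1.0, 1.0, 1.0]], [0.0, 0.0]),
    ([[2.0, -2.0], [-2.0, 2.0]], [0.0, 0.0]),
]

outputs = forward([1.0, 1.0, 0.0, 0.0], network)
# The most strongly activated output neuron determines the predicted character.
predicted = max(range(len(outputs)), key=lambda i: outputs[i])
print(outputs, predicted)
```

With these weights, an image whose first two pixels are lit activates output neuron 0 most strongly.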
  41. 41. Artificial Neural Networks Structure
  42. 42. Any Queries?  For more information: