Classification

  1. Classification
     • Classification is one of the most familiar and popular data mining techniques.
     • Classification is a form of data analysis that extracts models describing important data classes.
     • Classification is a data mining function that assigns items in a collection to target categories or classes.
     • The goal of classification is to build a concise model that can be used to predict the class of records whose class label is not known.
     • Such models, called classifiers, predict categorical (discrete, unordered) class labels.
     • Such analysis can help provide us with a better understanding of the data at large.
     • Applications include fraud detection, target marketing, performance prediction, manufacturing, and medical diagnosis.
  2. Classification
     • A bank loan officer needs analysis of her data to learn which loan applicants are “safe” and which are “risky” for the bank.
     • A marketing manager at AllElectronics needs data analysis to help guess whether a customer with a given profile will buy a new computer.
     • A medical researcher wants to analyze breast cancer data to predict which one of three specific treatments a patient should receive.
     • In each of these examples, the data analysis task is classification: a model or classifier is constructed to predict class (categorical) labels, such as “safe” or “risky” for the loan application data; “yes” or “no” for the marketing data; or “treatment A,” “treatment B,” or “treatment C” for the medical data.
  3. Classification
     • Suppose that the marketing manager wants to predict how much a given customer will spend during a sale at AllElectronics.
     • This data analysis task is an example of numeric prediction, where the model constructed predicts a continuous-valued function, or ordered value, as opposed to a class label.
     • Such a model is called a predictor.
     • Regression analysis is the statistical methodology most often used for numeric prediction; hence the two terms tend to be used synonymously, although other methods for numeric prediction exist.
     • Classification and numeric prediction are the two major types of prediction problems.
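The contrast between numeric prediction and classification can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer data, the function names (`fit_line`, `predict_spend`, `classify_buyer`), and the choice of a least-squares line and a nearest-neighbor rule are all hypothetical and do not come from the slides.

```python
from statistics import mean

# Hypothetical toy data (not from the slides): one customer per position.
incomes = [20, 40, 60, 80]              # annual income, in $1000s
spend = [100.0, 200.0, 300.0, 400.0]    # amount spent at the sale, in $
bought = ["no", "no", "yes", "yes"]     # did the customer buy a computer?

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

SLOPE, INTERCEPT = fit_line(incomes, spend)

def predict_spend(income):
    """Numeric prediction: the output is a continuous value."""
    return SLOPE * income + INTERCEPT

def classify_buyer(income):
    """Classification: the output is a categorical label.

    Uses the label of the nearest training income (1-nearest neighbor).
    """
    nearest = min(range(len(incomes)), key=lambda i: abs(incomes[i] - income))
    return bought[nearest]

print(predict_spend(50))    # a number
print(classify_buyer(70))   # a class label
```

Both functions take the same kind of input; what separates the two tasks is the type of the output, a number in one case and a label from a fixed set in the other.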
  4. Classification
     Data classification is a two-step process:
     1. Learning step (or training phase), where a classification algorithm builds the classifier by analyzing, or “learning from,” a training set made up of database tuples and their associated class labels.
     2. Classification step, where the model is used to predict class labels for given data.
  5. Classification
     • In the model build (training) process, a classification algorithm finds relationships between the values of the predictors and the values of the target.
     • Different classification algorithms use different techniques for finding relationships.
     • These relationships are summarized in a model, which can then be applied to a different data set in which the class assignments are unknown.
     • Classification models are tested by comparing the predicted values to known target values in a set of test data.
     • The historical data for a classification project is typically divided into two data sets: one for building the model and the other for testing the model.
     • The class label attribute is discrete-valued and unordered.
     • The individual tuples making up the training set are referred to as training tuples and are randomly sampled from the database under analysis.
     • In the context of classification, data tuples can be referred to as samples, examples, instances, data points, or objects.
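The two-step process and the build/test split described above can be sketched as follows. This is a minimal illustration, not the method the slides have in mind: the loan tuples are invented, the helper names (`split`, `learn`, `classify`, `accuracy`) are hypothetical, and a nearest-centroid rule stands in for whatever classification algorithm is actually used.

```python
import random

# Hypothetical labeled tuples: ((income, debt) in $1000s, class label).
data = [
    ((85, 5), "safe"), ((90, 10), "safe"), ((75, 8), "safe"),
    ((95, 4), "safe"), ((80, 12), "safe"),
    ((30, 40), "risky"), ((25, 35), "risky"), ((40, 50), "risky"),
    ((35, 45), "risky"), ((20, 30), "risky"),
]

def split(tuples, test_fraction=0.3, seed=0):
    """Randomly divide the historical data into a training and a test set."""
    shuffled = tuples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def learn(training_set):
    """Learning step: summarize each class by the mean of its tuples."""
    sums, counts = {}, {}
    for features, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def classify(model, features):
    """Classification step: pick the class whose centroid is closest."""
    def sq_dist(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, test_set):
    """Compare predicted labels with the known labels of the test tuples."""
    hits = sum(classify(model, f) == label for f, label in test_set)
    return hits / len(test_set)

train_set, test_set = split(data)
model = learn(train_set)
print(accuracy(model, test_set))
print(classify(model, (88, 6)))  # a new applicant whose class is unknown
```

Keeping the test tuples out of `learn` is the point of the split: accuracy measured on tuples the model has already seen would overstate how well it handles new data.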
  6. Classification
  7. Supervised Learning
     • Supervised learning is when we teach or train the machine using data that is well labeled, meaning that some data is already tagged with the correct answer.
     • After that, the machine is provided with a new set of examples (data), and the supervised learning algorithm, having analyzed the training data (the set of training examples), produces an outcome for them.
     • For instance, suppose you are given a basket filled with different kinds of fruits. The first step is to train the machine on all the different fruits, one by one, using labeled data.
  8. Supervised Learning
     • If the shape of the object is rounded with a depression at the top, and its color is red, then it will be labeled as Apple.
     • If the shape of the object is a long curving cylinder with a green-yellow color, then it will be labeled as Banana.
     • Now suppose that, after training on this data, you are given a new, separate fruit from the basket, say a banana, and asked to identify it.
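The fruit example can be sketched as a tiny supervised learner. The two labeled examples follow the slide's descriptions; everything else (the function names `train` and `identify`, and the vote-counting scheme) is an assumption made for illustration, not something the slides specify.

```python
from collections import Counter, defaultdict

# Labeled training examples, following the descriptions on the slide.
training = [
    ({"shape": "rounded with a depression at the top", "color": "red"},
     "Apple"),
    ({"shape": "long curving cylinder", "color": "green-yellow"},
     "Banana"),
]

def train(examples):
    """Record, for each (attribute, value) pair, which labels it was seen with."""
    votes = defaultdict(Counter)
    for features, label in examples:
        for item in features.items():
            votes[item][label] += 1
    return votes

def identify(model, features):
    """Label a new fruit by the votes of its known attribute values.

    Returns None when none of the attribute values were seen in training.
    """
    tally = Counter()
    for item in features.items():
        tally.update(model.get(item, {}))
    return tally.most_common(1)[0][0] if tally else None

model = train(training)
new_fruit = {"shape": "long curving cylinder", "color": "green-yellow"}
print(identify(model, new_fruit))
```

Because the new fruit's shape and color match the attributes seen with the Banana label during training, the learner assigns it that label; a fruit with attributes never seen in training gets no label at all.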
