SlideShare una empresa de Scribd logo
1 de 4
Descargar para leer sin conexión
Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research
                        and Applications (IJERA)       ISSN: 2248-9622 b www.ijera.com
                                 Vol. 2, Issue 4, July-August 2012, pp.164-167
        Customer Card Classification Based on C5.0 & CART Algorithms
                *Prof. Nilima Patil, **Prof. Rekha Lathi, ***Prof. Vidya Chitre
             *Computer Department K.C.college of Engineering ,Thane,(E) Mumbai Thane,(E) , Mumbai, India
        **Computer Department, Pillai Institute of Technology, New Panvel,Navi Mumbai, India
            Computer Department, Bharti Vidyapith college of Engg. Technology, Kharghar, Navi Mumbai, India



Abstract—Data mining is the useful tool to discovering             technology with great potential to help companies focus on
the knowledge from large data. For this we require                 the most important information in their data warehouses.
proper      classification  methods     &      algorithms.         Data mining tools predict future trends and behaviors,
Classification is most common method used for finding              allowing businesses to make proactive, knowledge-driven
the mine rule from the large database. Decision tree               decisions. The automated, prospective analyses offered by
method generally used for the Classification, because it is        data mining move beyond the analyses of past events
the simple hierarchical structure for the user                     provided by retrospective tools typical of decision support
understanding & decision making. Different data mining             systems. Data mining tools can answer business questions
algorithms available for classification but decision tree          that traditionally were too time consuming to resolve. They
mining is simple way. The objective of this paper is to            scour databases for hidden patterns, finding predictive
provide the way for Decision Making process of                     information that experts may miss because it lies outside
Customer for recommended the membership card. Here I               their expectations.
am using C5.0 & CART algorithms on customer database
for classification. Both algorithms first applied on               III. CLASSIFICATION ALGORITHMS
training dataset & created the decision tree, some rules                     Decision Tree [7] is an important model to realize
are derived from decision tree. Same rules then applied            the classification. It was a learning system—CLS builded by
on evaluation data set. Comparing the results of both              Hunt etc, when they researched on human concept modeling
algorithms recommended the card to the new customer                early in the 1960s. To the late 70s, J. Ross .Quinlan put
those having similar characteristics.                              forward the ID3 algorithm [3]-[5]. In 1993, Quinlan
Keywords- Data mining, classification algorithm, decision          developed C4.5/C5.0 algorithm on basis of the ID3
tree, Regression tree, membership card                             algorithm. Below is the detailed explanation of the C5.0
                                                                   algorithm and the CART algorithm [1] which are used in
 I. INTRODUCTION                                                   this paper.
           In commercial operation, using the membership
 card service is the most superior method to help the              A.C5.0 Algorithm:
 businessmen to accumulate the customers’ information. On
 the one hand they may obtain customers’ basic information          C5.0 algorithm is an extension of C4.5 algorithm. C5.0 is
 to maintain long-term contact with them. On the special day,      the classification algorithm which applies in big data set.
 like the holiday or the customer’s birthday, they can by          C5.0 is better than C4.5 on the efficiency and the memory.
 delivering a warm blessing, promote customers’                    C5.0 model works by splitting the sample based on the field
 satisfaction. On the other hand, they may through the             that provides the maximum information gain. The C5.0
 customer’s transaction information, like the purchase             model can split samples on basis of the biggest information
 volume, the purchase frequency, analyze what is the value         gain field..The sample subset that is get from the former
 of the customer to the enterprise and analyze the                 split will be split afterward. The process will continue until
 characteristics of each kind card customers to help the           the sample subset cannot be split and is usually according to
 enterprise with a clear goal to recommend the card to a new       another field. Finally, examine the lowest level split, those
 customer.                                                         sample subsets that don’t have remarkable contribution to
 The classification analysis [6] is by analyzing the data in the   the model will be rejected.
 demonstration database, to make the accurate description or
 establish the accurate model or mine the classifying rule for     Information Gain:

 each category, generated classifying rule used to classify        Gain [8] is computed to estimate the gain produced by a
 records in other databases.                                       split over an attribute

 II. DATA MINING TECHNOLOGY                                             Let S be the sample:
          Data mining [2] is the extraction of hidden
                                                                           Ci is Class I; i = 1,2,…,m
 predictive information from large databases. It is a powerful
 new                                                                        I(s1,s2,..,sm)= - Σ pi log2 (pi)

                                                                           Si is the no. of samples in class i

                                                                                                                    164 | P a g e
Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research
                           and Applications (IJERA)       ISSN: 2248-9622 b www.ijera.com
                                    Vol. 2, Issue 4, July-August 2012, pp.164-167
             Pi = Si /S

             log2 is the binary logarithm

            Let Attribute A have v distinct values.

            Entropy = E(A) is                                       CART divides the property which leads a minimum value
                                                                     after the division.
        Σ{(S1j+S2j+..+Smj)/S}*I(s1j,..smj)             j=1
                                                                      IV. METHODOLOGY USED
            Where Sij is samples in Class i and subset j of                  In commercial operation, using the membership
             Attribute A.                                            card service is the most superior method to help the
                                                                     businessmen to accumulate the customers’ information. The
        I(S1j,S2j,..Smj)= - Σ pij log2 (pij)                         Membership card system management which is helpful not
                                                                     only to accumulate the customer’s information but also to
            Gain(A)=I(s1,s2,..,sm) - E(A)                           offer corresponding service for different card-rank users.
                                                                     From this way we can enhance customers' loyalty to the
    Gain ratio then chooses, from among the tests with at least      store. Therefore, so as to recommend corresponding card to
    average gain,                                                    the appropriate customer, we want to obtain different card-
                                                                     rank customers’ characteristics and which is the most
    The Gain Ratio= P(A)                                             important factor that affects the customers to choose this
                                                                     kind of card not that kind.
                                                                     There are various data mining classification algorithm. For
                                                                     this application we can make use these data mining
                                                                     techniques in order to find some rules from large database.
    Gain Ratio(A)= Gain(A)/P(A)                                      Firstly on training data set, I am using two well known
                                                                     decision tree classification algorithms are C5.0 and CART.
    B. CART Algorithm                                                Figure 1 shows Block diagram below.

   Classification and Regression Trees (CART) is a flexible
    method to describe how the variable Y distributes after
    assigning the forecast vector X. This model uses the binary
    tree to divide the forecast space into certain subsets on
    which Y distribution is continuously even. Tree's leaf nodes
    correspond to different division areas which are determined
    by Splitting Rules relating to each internal node. By moving
    from the tree root to the leaf node, a forecast sample will be
    given an only leaf node, and Y distribution on this node also
    be determined.

    Splitting criteria:

    CART uses GINI Index to determine in which attribute the
    branch should be generated. The strategy is to choose the
    attribute whose GINI Index is minimum after splitting.

    GINI index:

    Assuming training set T includes n samples, the target
    property has m values, among them, the ith value show in T
    with a probability Pi, so T’s GINI Index can be described as
    below:



                                                                                      Figure 1: Block Diagram

                                                                     Before implementing the algorithms, big dataset of
                                                                     customer information is stored in the database. Through the
    Assuming the A be divided to q subsets, {T1, T2, · · · Tq},
                                                                     analysis of the customer basic information table's attributes,
    among them, Ti’s sample number is ni, so the GINI Index
                                                                     we need to use the target column as a membership card. In
    divided according to property A can be described below:
                                                                     the data understanding stage, by means of analyzing the
                                                                     characteristics of the primary data, we can make further
                                                                                                                      165 | P a g e
Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research
                       and Applications (IJERA)       ISSN: 2248-9622 b www.ijera.com
                                Vol. 2, Issue 4, July-August 2012, pp.164-167
understanding of the data distribution and the affect of               3)    The same with algorithm C4.5, CART also builds
various factors to membership card type. Through the                        the tree before pruning cross-validated using cost-
preliminary analysis, we try to find out the statistics of                  complexity model.
customer records in the database, means the percentage of
golden card customer’s account, the silver card customers,         VI. RESULTS & RULE SUMMERY
the bronze card customers & the normal card customer.                       For algorithm processing used customer store
                                                                   database of 7000 records. Total database are divided into
   Methodology used in this project:-                              70% as a training dataset & 30%         evaluation dataset.
      First consider the Data source as a training data of        Algorithms applied on training dataset & Decision trees are
          customer for algorithms processing. 5000 records         generated for both algorithms. Evaluation dataset used for
          used in training dataset & 2000 records used in          result verification. Both C5.0 & CART Decision trees are
          Evaluation set.                                          given below.
      Select the membership card as a output, & other             C5.0 Decision tree:
          attributes used as a input.
      Apply the two algorithms on the database for
          finding the root & leaf of the tree by using splitting
          criteria.
      Perform the same operations to each attribute up to
          last split & create decision tree
      Pruning method is used on both the decision trees
          for better accuracy.
I am using C5.0 and CART algorithm separately to carry on
the classification for decision tree & rule set formation.

V. COMPARATIVE STUDY
         In customer membership card model two
algorithms will be used and comparative study of both of
them will be done.                                                                 Figure 2: C5.0 Decision tree

                                                                   CART decision tree:
C5.0 algorithm:

In customer membership card model C5.o algorithm is used
to split up data set & find out the result in the form of
decision tree or rule set.

    1)   Splitting criteria used in c5.0 algorithm is
         information gain. The C5.0 model can split
         samples on basis of the biggest information gain.

    2)   Test criteria is decision tree have any number of
         branches available not fixed branches like CART.

    3)   Pruning method performed after creating decision
         tree i.e. post pruning single pass based on binomial
         confidence limits.                                                       Figure 3: CART decision tree
    4)   Speed of c5.0 algorithm is significantly faster than      Rule sets are formed from decision tree of both algorithms it
         c4.5 & more accurate.                                     shows that which factors are mainly affected for customer
                                                                   card classification & take those attributes as the main
CART algorithm:
                                                                   standards to recommend card to a future
CART, short for Classification And Regression Tree, When           customers.
the value of the target attribute is ordered, it is called          Rule set result of both the algorithms are given below.
regression tree; when the value is discrete, it is called
classification tree.                                               Classification result of C5.0 Algorithm.

    1) Selection criterion used in CART algorithm is Gini          Rule summary: Customer whose yearly income is from
       index (diversity index).                                    10,000 to 30,000 is the normal card customer; Customer
                                                                   whose yearly income is from 30,000 to 150,000 and having
    2) The decision-tree generated by CART algorithm is            less than 3 children is the bronze card customer, having
       a simple structured binary tree.                            more than 3 children is the golden card customer; Customer
                                                                   whose yearly income is more than 150,000, and unmarried

                                                                                                                   166 | P a g e
Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research
                       and Applications (IJERA)       ISSN: 2248-9622 b www.ijera.com
                                Vol. 2, Issue 4, July-August 2012, pp.164-167
is the silver card customer, married is the golden card         REFERENCES
customer.
                                                                [1]   Lin Zhang, Yan Chen, Yan Liang “Application of
Classification result of CART algorithm:                              data mining classification algorithm for Customer
                                                                      membership Card Model” College of Transportation
Rule summary: Customer whose yearly income is from                    and Management China 2008.
10,000 to 30,000 is the normal card customer; Customer          [2]   Jiawei Han, Michelin Kamber,“Data Mining
whose yearly income is from 30,000 to 150,000, and having             Concepts and Techniques” [M], Morgan Kaufmann
less than 3 children is the bronze card customer; Customer            publishers, USA, 2001, 70-181.
whose yearly income is more than 30,000, and having more        [3]   Zhixian Niu, Lili Zong, Qingwei Yan, Zhenxing
than 3 children is the golden card customer; Customer                 Zhao College of Computer and Software, “ Auto-
whose yearly income is more than 150,000, and marital                 Recognizing DBMS Workload Based on C5.0
status single is the silver card customer.                            Algorithm” Taiyuan University of Technology, 2009.
Comparing the two classification algorithm we can reach         [4]   Quinlan J R, “Induction of Decision Trees”, [J],
such a conclusion: The main factors that have a big                   Machine Learning, 1986.
influence on customer ranks are the income and child            [5]   Quinlan J R, “C4.5: Programs for Machine
number. Normal card customers, bronze card customer and               Learning”, [M], San Mateo, CA: Morgan Kaufman
golden card customers that are obtained from the two                  Publishers, 1993.
methods are basically the same.                                 [6]   Yue He† , Jingsi Liu , Lirui Lin “Research on
The following graph shows the accuracy of C5.0 is greater             application of data mining.” School of Business and
than CART algorithm.. In the graph, on X axis algorithms              administration, Sichuan University May 2010
are consider and Y axis shows percentage of accuracy.           [7]   S.B. Kotsiantis, Supervised Machine Learning: “A
                                                                      Review of Classification Techniques”, Informatica
                                                                      31,2007,pp. 249-26.
                                                                [8]    Zhu Xiaoliang, Wang Jian, Wu Shangzhuo, Yan
                                                                      Hongcan “Research and Application of the improved
                                                                      Algorithm C4.5 on Decision Tree” Hebei Polytechnic
                                                                      University, International Conference on Test and
                                                                      Measurement 2009.




            Figure 4: Comparison of algorithms

VII. CONCLUSION
         Applying data mining classification algorithm in
the customer membership card classification model can help
to understand Children number & income level factors are
affecting on the card ranks. Therefore, we are able to these
two attributes as the main standards to recommend card to a
new customer. Moreover, we can refer other attributes to
help judge which card to recommend to future customers.
On the basis of resulting factors the enterprise with a clear
goal to recommend the corresponding membership card to
the customer in order to provide the special service for each
kind of card users, and is helpful to enterprise's
development.




                                                                                                            167 | P a g e

Más contenido relacionado

La actualidad más candente

PCA and DCT Based Approach for Face Recognition
PCA and DCT Based Approach for Face RecognitionPCA and DCT Based Approach for Face Recognition
PCA and DCT Based Approach for Face Recognitionijtsrd
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET Journal
 
Figure 1
Figure 1Figure 1
Figure 1butest
 
An apriori based algorithm to mine association rules with inter itemset distance
An apriori based algorithm to mine association rules with inter itemset distanceAn apriori based algorithm to mine association rules with inter itemset distance
An apriori based algorithm to mine association rules with inter itemset distanceIJDKP
 
Meta Classification Technique for Improving Credit Card Fraud Detection
Meta Classification Technique for Improving Credit Card Fraud Detection Meta Classification Technique for Improving Credit Card Fraud Detection
Meta Classification Technique for Improving Credit Card Fraud Detection IJSTA
 
A Novel Approach to Detect Mischief Activities (Fraud) In On-Line Transaction
A Novel Approach to Detect Mischief Activities (Fraud) In On-Line TransactionA Novel Approach to Detect Mischief Activities (Fraud) In On-Line Transaction
A Novel Approach to Detect Mischief Activities (Fraud) In On-Line TransactionIJMER
 
Unsupervised Learning Techniques to Diversifying and Pruning Random Forest
Unsupervised Learning Techniques to Diversifying and Pruning Random ForestUnsupervised Learning Techniques to Diversifying and Pruning Random Forest
Unsupervised Learning Techniques to Diversifying and Pruning Random ForestMohamed Medhat Gaber
 

La actualidad más candente (7)

PCA and DCT Based Approach for Face Recognition
PCA and DCT Based Approach for Face RecognitionPCA and DCT Based Approach for Face Recognition
PCA and DCT Based Approach for Face Recognition
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection Analysis
 
Figure 1
Figure 1Figure 1
Figure 1
 
An apriori based algorithm to mine association rules with inter itemset distance
An apriori based algorithm to mine association rules with inter itemset distanceAn apriori based algorithm to mine association rules with inter itemset distance
An apriori based algorithm to mine association rules with inter itemset distance
 
Meta Classification Technique for Improving Credit Card Fraud Detection
Meta Classification Technique for Improving Credit Card Fraud Detection Meta Classification Technique for Improving Credit Card Fraud Detection
Meta Classification Technique for Improving Credit Card Fraud Detection
 
A Novel Approach to Detect Mischief Activities (Fraud) In On-Line Transaction
A Novel Approach to Detect Mischief Activities (Fraud) In On-Line TransactionA Novel Approach to Detect Mischief Activities (Fraud) In On-Line Transaction
A Novel Approach to Detect Mischief Activities (Fraud) In On-Line Transaction
 
Unsupervised Learning Techniques to Diversifying and Pruning Random Forest
Unsupervised Learning Techniques to Diversifying and Pruning Random ForestUnsupervised Learning Techniques to Diversifying and Pruning Random Forest
Unsupervised Learning Techniques to Diversifying and Pruning Random Forest
 

Similar a X24164167

CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYEditor IJMTER
 
Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.
Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.
Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.IJERD Editor
 
IRJET- Comparative Study of Classification Algorithms for Sentiment Analy...
IRJET-  	  Comparative Study of Classification Algorithms for Sentiment Analy...IRJET-  	  Comparative Study of Classification Algorithms for Sentiment Analy...
IRJET- Comparative Study of Classification Algorithms for Sentiment Analy...IRJET Journal
 
Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...
Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...
Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...IRJET Journal
 
Performance Comparison of Dimensionality Reduction Methods using MCDR
Performance Comparison of Dimensionality Reduction Methods using MCDRPerformance Comparison of Dimensionality Reduction Methods using MCDR
Performance Comparison of Dimensionality Reduction Methods using MCDRAM Publications
 
Data Severance Using Machine Learning for Marketing Strategies
Data Severance Using Machine Learning for Marketing StrategiesData Severance Using Machine Learning for Marketing Strategies
Data Severance Using Machine Learning for Marketing StrategiesIRJET Journal
 
Smart Health Guide App
Smart Health Guide AppSmart Health Guide App
Smart Health Guide AppIRJET Journal
 
Agile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity managementAgile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity managementAgnirudra Sikdar
 
IRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET Journal
 
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEA NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEaciijournal
 
Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)aciijournal
 
A Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationA Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationKate Subramanian
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceIRJET Journal
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceIRJET Journal
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics DomainDrjabez
 
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...Eswar Publications
 
Bitcoin Price Prediction and Recommendation System using Deep learning techni...
Bitcoin Price Prediction and Recommendation System using Deep learning techni...Bitcoin Price Prediction and Recommendation System using Deep learning techni...
Bitcoin Price Prediction and Recommendation System using Deep learning techni...IRJET Journal
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionIRJET Journal
 
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351IRJET Journal
 

Similar a X24164167 (20)

CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
 
Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.
Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.
Implementation of Improved ID3 Algorithm to Obtain more Optimal Decision Tree.
 
IRJET- Comparative Study of Classification Algorithms for Sentiment Analy...
IRJET-  	  Comparative Study of Classification Algorithms for Sentiment Analy...IRJET-  	  Comparative Study of Classification Algorithms for Sentiment Analy...
IRJET- Comparative Study of Classification Algorithms for Sentiment Analy...
 
Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...
Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...
Unfolding the Credit Card Fraud Detection Technique by Implementing SVM Algor...
 
Performance Comparison of Dimensionality Reduction Methods using MCDR
Performance Comparison of Dimensionality Reduction Methods using MCDRPerformance Comparison of Dimensionality Reduction Methods using MCDR
Performance Comparison of Dimensionality Reduction Methods using MCDR
 
Data Severance Using Machine Learning for Marketing Strategies
Data Severance Using Machine Learning for Marketing StrategiesData Severance Using Machine Learning for Marketing Strategies
Data Severance Using Machine Learning for Marketing Strategies
 
Smart Health Guide App
Smart Health Guide AppSmart Health Guide App
Smart Health Guide App
 
Agile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity managementAgile analytics : An exploratory study of technical complexity management
Agile analytics : An exploratory study of technical complexity management
 
IRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data MiningIRJET- A Detailed Study on Classification Techniques for Data Mining
IRJET- A Detailed Study on Classification Techniques for Data Mining
 
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEA NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
 
Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)Advanced Computational Intelligence: An International Journal (ACII)
Advanced Computational Intelligence: An International Journal (ACII)
 
A Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationA Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence Application
 
NETWORJS3.pdf
NETWORJS3.pdfNETWORJS3.pdf
NETWORJS3.pdf
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data Science
 
Credit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data ScienceCredit Card Fraud Detection Using Machine Learning & Data Science
Credit Card Fraud Detection Using Machine Learning & Data Science
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics Domain
 
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...
 
Bitcoin Price Prediction and Recommendation System using Deep learning techni...
Bitcoin Price Prediction and Recommendation System using Deep learning techni...Bitcoin Price Prediction and Recommendation System using Deep learning techni...
Bitcoin Price Prediction and Recommendation System using Deep learning techni...
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim Prediction
 
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
 

Último

Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 

Último (20)

Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 

X24164167

  • 1. Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 b www.ijera.com Vol. 2, Issue 4, July-August 2012, pp.164-167 Customer Card Classification Based on C5.0 & CART Algorithms *Prof. Nilima Patil, **Prof. Rekha Lathi, ***Prof. Vidya Chitre *Computer Department K.C.college of Engineering ,Thane,(E) Mumbai Thane,(E) , Mumbai, India **Computer Department, Pillai Institute of Technology, New Panvel,Navi Mumbai, India Computer Department, Bharti Vidyapith college of Engg. Technology, Kharghar, Navi Mumbai, India Abstract—Data mining is the useful tool to discovering technology with great potential to help companies focus on the knowledge from large data. For this we require the most important information in their data warehouses. proper classification methods & algorithms. Data mining tools predict future trends and behaviors, Classification is most common method used for finding allowing businesses to make proactive, knowledge-driven the mine rule from the large database. Decision tree decisions. The automated, prospective analyses offered by method generally used for the Classification, because it is data mining move beyond the analyses of past events the simple hierarchical structure for the user provided by retrospective tools typical of decision support understanding & decision making. Different data mining systems. Data mining tools can answer business questions algorithms available for classification but decision tree that traditionally were too time consuming to resolve. They mining is simple way. The objective of this paper is to scour databases for hidden patterns, finding predictive provide the way for Decision Making process of information that experts may miss because it lies outside Customer for recommended the membership card. Here I their expectations. am using C5.0 & CART algorithms on customer database for classification. Both algorithms first applied on III. CLASSIFICATION ALGORITHMS training dataset & created the decision tree, some rules Decision Tree [7] is an important model to realize are derived from decision tree. Same rules then applied the classification. It was a learning system—CLS builded by on evaluation data set. Comparing the results of both Hunt etc, when they researched on human concept modeling algorithms recommended the card to the new customer early in the 1960s. To the late 70s, J. Ross .Quinlan put those having similar characteristics. forward the ID3 algorithm [3]-[5]. In 1993, Quinlan Keywords- Data mining, classification algorithm, decision developed C4.5/C5.0 algorithm on basis of the ID3 tree, Regression tree, membership card algorithm. Below is the detailed explanation of the C5.0 algorithm and the CART algorithm [1] which are used in I. INTRODUCTION this paper. In commercial operation, using the membership card service is the most superior method to help the A.C5.0 Algorithm: businessmen to accumulate the customers’ information. On the one hand they may obtain customers’ basic information C5.0 algorithm is an extension of C4.5 algorithm. C5.0 is to maintain long-term contact with them. On the special day, the classification algorithm which applies in big data set. like the holiday or the customer’s birthday, they can by C5.0 is better than C4.5 on the efficiency and the memory. delivering a warm blessing, promote customers’ C5.0 model works by splitting the sample based on the field satisfaction. On the other hand, they may through the that provides the maximum information gain. The C5.0 customer’s transaction information, like the purchase model can split samples on basis of the biggest information volume, the purchase frequency, analyze what is the value gain field..The sample subset that is get from the former of the customer to the enterprise and analyze the split will be split afterward. The process will continue until characteristics of each kind card customers to help the the sample subset cannot be split and is usually according to enterprise with a clear goal to recommend the card to a new another field. Finally, examine the lowest level split, those customer. sample subsets that don’t have remarkable contribution to The classification analysis [6] is by analyzing the data in the the model will be rejected. demonstration database, to make the accurate description or establish the accurate model or mine the classifying rule for Information Gain: each category, generated classifying rule used to classify Gain [8] is computed to estimate the gain produced by a records in other databases. split over an attribute II. DATA MINING TECHNOLOGY Let S be the sample: Data mining [2] is the extraction of hidden  Ci is Class I; i = 1,2,…,m predictive information from large databases. It is a powerful new I(s1,s2,..,sm)= - Σ pi log2 (pi)  Si is the no. of samples in class i 164 | P a g e
  • 2. Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 b www.ijera.com Vol. 2, Issue 4, July-August 2012, pp.164-167 Pi = Si /S log2 is the binary logarithm  Let Attribute A have v distinct values.  Entropy = E(A) is CART divides the property which leads a minimum value after the division. Σ{(S1j+S2j+..+Smj)/S}*I(s1j,..smj) j=1 IV. METHODOLOGY USED  Where Sij is samples in Class i and subset j of In commercial operation, using the membership Attribute A. card service is the most superior method to help the businessmen to accumulate the customers’ information. The I(S1j,S2j,..Smj)= - Σ pij log2 (pij) Membership card system management which is helpful not only to accumulate the customer’s information but also to  Gain(A)=I(s1,s2,..,sm) - E(A) offer corresponding service for different card-rank users. From this way we can enhance customers' loyalty to the Gain ratio then chooses, from among the tests with at least store. Therefore, so as to recommend corresponding card to average gain, the appropriate customer, we want to obtain different card- rank customers’ characteristics and which is the most The Gain Ratio= P(A) important factor that affects the customers to choose this kind of card not that kind. There are various data mining classification algorithm. For this application we can make use these data mining techniques in order to find some rules from large database. Gain Ratio(A)= Gain(A)/P(A) Firstly on training data set, I am using two well known decision tree classification algorithms are C5.0 and CART. B. CART Algorithm Figure 1 shows Block diagram below.  Classification and Regression Trees (CART) is a flexible method to describe how the variable Y distributes after assigning the forecast vector X. This model uses the binary tree to divide the forecast space into certain subsets on which Y distribution is continuously even. Tree's leaf nodes correspond to different division areas which are determined by Splitting Rules relating to each internal node. By moving from the tree root to the leaf node, a forecast sample will be given an only leaf node, and Y distribution on this node also be determined. Splitting criteria: CART uses GINI Index to determine in which attribute the branch should be generated. The strategy is to choose the attribute whose GINI Index is minimum after splitting. GINI index: Assuming training set T includes n samples, the target property has m values, among them, the ith value show in T with a probability Pi, so T’s GINI Index can be described as below: Figure 1: Block Diagram Before implementing the algorithms, big dataset of customer information is stored in the database. Through the Assuming the A be divided to q subsets, {T1, T2, · · · Tq}, analysis of the customer basic information table's attributes, among them, Ti’s sample number is ni, so the GINI Index we need to use the target column as a membership card. In divided according to property A can be described below: the data understanding stage, by means of analyzing the characteristics of the primary data, we can make further 165 | P a g e
  • 3. Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 b www.ijera.com Vol. 2, Issue 4, July-August 2012, pp.164-167 understanding of the data distribution and the affect of 3) The same with algorithm C4.5, CART also builds various factors to membership card type. Through the the tree before pruning cross-validated using cost- preliminary analysis, we try to find out the statistics of complexity model. customer records in the database, means the percentage of golden card customer’s account, the silver card customers, VI. RESULTS & RULE SUMMERY the bronze card customers & the normal card customer. For algorithm processing used customer store database of 7000 records. Total database are divided into Methodology used in this project:- 70% as a training dataset & 30% evaluation dataset.  First consider the Data source as a training data of Algorithms applied on training dataset & Decision trees are customer for algorithms processing. 5000 records generated for both algorithms. Evaluation dataset used for used in training dataset & 2000 records used in result verification. Both C5.0 & CART Decision trees are Evaluation set. given below.  Select the membership card as a output, & other C5.0 Decision tree: attributes used as a input.  Apply the two algorithms on the database for finding the root & leaf of the tree by using splitting criteria.  Perform the same operations to each attribute up to last split & create decision tree  Pruning method is used on both the decision trees for better accuracy. I am using C5.0 and CART algorithm separately to carry on the classification for decision tree & rule set formation. V. COMPARATIVE STUDY In customer membership card model two algorithms will be used and comparative study of both of them will be done. Figure 2: C5.0 Decision tree CART decision tree: C5.0 algorithm: In customer membership card model C5.o algorithm is used to split up data set & find out the result in the form of decision tree or rule set. 1) Splitting criteria used in c5.0 algorithm is information gain. The C5.0 model can split samples on basis of the biggest information gain. 2) Test criteria is decision tree have any number of branches available not fixed branches like CART. 3) Pruning method performed after creating decision tree i.e. post pruning single pass based on binomial confidence limits. Figure 3: CART decision tree 4) Speed of c5.0 algorithm is significantly faster than Rule sets are formed from decision tree of both algorithms it c4.5 & more accurate. shows that which factors are mainly affected for customer card classification & take those attributes as the main CART algorithm: standards to recommend card to a future CART, short for Classification And Regression Tree, When customers. the value of the target attribute is ordered, it is called Rule set result of both the algorithms are given below. regression tree; when the value is discrete, it is called classification tree. Classification result of C5.0 Algorithm. 1) Selection criterion used in CART algorithm is Gini Rule summary: Customer whose yearly income is from index (diversity index). 10,000 to 30,000 is the normal card customer; Customer whose yearly income is from 30,000 to 150,000 and having 2) The decision-tree generated by CART algorithm is less than 3 children is the bronze card customer, having a simple structured binary tree. more than 3 children is the golden card customer; Customer whose yearly income is more than 150,000, and unmarried 166 | P a g e
  • 4. Prof. Nilima Patil, Prof. Rekha Lathi, Prof. Vidya Chitre / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 b www.ijera.com Vol. 2, Issue 4, July-August 2012, pp.164-167 is the silver card customer, married is the golden card REFERENCES customer. [1] Lin Zhang, Yan Chen, Yan Liang “Application of Classification result of CART algorithm: data mining classification algorithm for Customer membership Card Model” College of Transportation Rule summary: Customer whose yearly income is from and Management China 2008. 10,000 to 30,000 is the normal card customer; Customer [2] Jiawei Han, Michelin Kamber,“Data Mining whose yearly income is from 30,000 to 150,000, and having Concepts and Techniques” [M], Morgan Kaufmann less than 3 children is the bronze card customer; Customer publishers, USA, 2001, 70-181. whose yearly income is more than 30,000, and having more [3] Zhixian Niu, Lili Zong, Qingwei Yan, Zhenxing than 3 children is the golden card customer; Customer Zhao College of Computer and Software, “ Auto- whose yearly income is more than 150,000, and marital Recognizing DBMS Workload Based on C5.0 status single is the silver card customer. Algorithm” Taiyuan University of Technology, 2009. Comparing the two classification algorithm we can reach [4] Quinlan J R, “Induction of Decision Trees”, [J], such a conclusion: The main factors that have a big Machine Learning, 1986. influence on customer ranks are the income and child [5] Quinlan J R, “C4.5: Programs for Machine number. Normal card customers, bronze card customer and Learning”, [M], San Mateo, CA: Morgan Kaufman golden card customers that are obtained from the two Publishers, 1993. methods are basically the same. [6] Yue He† , Jingsi Liu , Lirui Lin “Research on The following graph shows the accuracy of C5.0 is greater application of data mining.” School of Business and than CART algorithm.. In the graph, on X axis algorithms administration, Sichuan University May 2010 are consider and Y axis shows percentage of accuracy. [7] S.B. Kotsiantis, Supervised Machine Learning: “A Review of Classification Techniques”, Informatica 31,2007,pp. 249-26. [8] Zhu Xiaoliang, Wang Jian, Wu Shangzhuo, Yan Hongcan “Research and Application of the improved Algorithm C4.5 on Decision Tree” Hebei Polytechnic University, International Conference on Test and Measurement 2009. Figure 4: Comparison of algorithms VII. CONCLUSION Applying data mining classification algorithm in the customer membership card classification model can help to understand Children number & income level factors are affecting on the card ranks. Therefore, we are able to these two attributes as the main standards to recommend card to a new customer. Moreover, we can refer other attributes to help judge which card to recommend to future customers. On the basis of resulting factors the enterprise with a clear goal to recommend the corresponding membership card to the customer in order to provide the special service for each kind of card users, and is helpful to enterprise's development. 167 | P a g e