Feature Selection
Concepts and Methods
Electronic & Computer Department
Isfahan University Of Technology
Reza Ramezani
What are Features?
 Features are the attributes whose values describe an instance.
 With features we can identify instances.
 Features are the determinant values that decide which class an instance belongs to.
Classifying Features
 Relevance: features that have an influence on the output and whose role cannot be assumed by the rest.
 Irrelevance: features that don't have any influence on the output, and whose values are generated at random for each example.
 Redundancy: a redundancy exists whenever a feature can take over the role of another.
What is Feature Selection?
Feature selection is a preprocessing step for machine learning that chooses a subset of the original features according to a certain evaluation criterion and is effective in:
 removing or reducing the effect of irrelevant data
 removing redundant data
 reducing dimensionality (binary model)
 increasing learning accuracy
 and improving result comprehensibility.
Other Definitions
 A process that selects a subset of features defined by one of three approaches:
1) the subset of a specified size that optimizes an evaluation measure
2) the subset of smallest size that satisfies a certain restriction on the evaluation measure
3) the subset with the best compromise between its size and the value of its evaluation measure (the general case).
Feature Selection Algorithm (FSA)
Classifying FSAs
 FSAs can be classified according to the kind of output they yield:
1) Algorithms that give a weighted linear order of the features. (Continuous feature selection problem)
2) Algorithms that give a subset of the original features. (Binary feature selection problem)
Note that both types can be seen in a unified way by noting that in (2) the weighting is binary.
Notation
Relevance of a feature
 The purpose of an FSA is to identify relevant features according to a definition of relevance.
 Unfortunately, the notion of relevance in machine learning has not yet been rigorously defined by common agreement.
 Let us define relevance from several aspects:
Relevance with respect to an objective
Strong relevance with respect to S
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

[Decision tree: Refund = Yes -> NO; Refund = No -> MarSt; MarSt = Married -> NO; MarSt = Single, Divorced -> TaxInc; TaxInc < 80K -> NO; TaxInc > 80K -> YES]
Strong relevance with respect to p
Weak relevance with respect to S
Weak relevance with respect to p
Strongly Relevant Features
 The strongly relevant features are, in theory, important for maintaining the structure of the domain.
 They should be conserved by any feature selection algorithm in order to avoid adding ambiguity to the sample.
Weakly Relevant Features
Weakly relevant features may or may not be important, depending on:
 The other features already selected.
 The evaluation measure that has been chosen (accuracy, simplicity, consistency, etc.).
Relevance as a complexity measure
 Define r(S,c) as:
 the smallest number of features relevant to c
 such that the error in S is the least possible for the inducer.
In other words, it refers to the smallest number of features required by a specific inducer to reach optimum performance in the task of modeling c using S.
Incremental usefulness
(Informally: a feature is incrementally useful with respect to an already-selected subset S if adding it to S improves the evaluation measure; the formal definition on this slide did not survive extraction.)
Example
X1 ... X11 ... X21 ... X30
100000000000000000000000000000  +
111111111100000000000000000000  +
000000000011111111110000000000  +
000000000000000000001111111111  +
000000000000000000000000000000  -
 X1 is strongly relevant; the rest are weakly relevant.
 r(S,c) = 3
 Incremental usefulness: after choosing {X1, X2}, none of X3...X10 would be incrementally useful, but any of X11...X30 would.
General Schemes for Feature Selection
 Relationship between an FSA and the inducer
 Inducer:
• the process chosen to evaluate the usefulness of the features
• the learning process
 Filter Scheme
 Wrapper Scheme
 Embedded Scheme
Filter Scheme
 The feature selection process takes place before the induction step.
 This scheme is independent of the induction algorithm.
• High speed
• Low accuracy
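A minimal sketch of the filter scheme in Python (scikit-learn is an assumption here, not part of the slides): features are scored against the class on their own, before any learner runs.

    # Filter scheme: score features independently of the induction algorithm.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                               random_state=0)
    # Keep the 5 features with the highest mutual information with the class.
    selector = SelectKBest(score_func=mutual_info_classif, k=5)
    X_reduced = selector.fit_transform(X, y)
    print(selector.get_support(indices=True))  # indices of the kept features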
Wrapper Scheme
 Uses the learning algorithm as a subroutine to evaluate feature subsets.
 The inducer must be known.
• Low speed
• High accuracy
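A minimal wrapper-scheme sketch under the same assumptions; the inducer (here a k-NN classifier, an arbitrary choice) scores each candidate subset by cross-validated accuracy.

    # Wrapper scheme: the inducer itself scores every candidate feature subset.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                               random_state=0)

    def subset_score(features):
        """Cross-validated accuracy of the inducer restricted to `features`."""
        return cross_val_score(KNeighborsClassifier(),
                               X[:, list(features)], y, cv=5).mean()

    print(subset_score((0, 1, 2)))          # accuracy using only three features
    print(subset_score(list(range(10))))    # accuracy using the full feature set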
Embedded Scheme
 Similar to the wrapper approach.
 Features are specifically selected for a certain inducer.
 The inducer selects the features in the process of learning (explicitly or implicitly).
Embedded Scheme Example

Refund  Marital Status  Taxable Income  Age  Cheat
Yes     Single          125K            18   No
No      Married         100K            30   No
No      Single          70K             28   No
Yes     Married         120K            19   No
No      Divorced        95K             18   Yes
No      Married         60K             20   No
Yes     Divorced        220K            25   No
No      Single          85K             30   Yes
No      Married         75K             20   No
No      Single          90K             18   Yes

[Decision tree: Refund = Yes -> NO; Refund = No -> MarSt; MarSt = Married -> NO; MarSt = Single, Divorced -> TaxInc; TaxInc < 80K -> NO; TaxInc > 80K -> YES]

The decision tree induction algorithm will automatically remove the 'Age' feature.
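A sketch of the embedded idea with a decision tree (scikit-learn assumed; the data-generating rule below is hypothetical, chosen so that Age is irrelevant by construction): the tree never splits on a useless feature, so selection happens inside learning.

    # Embedded scheme: the inducer discards 'Age' while it learns.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    refund = rng.integers(0, 2, 300)             # 1 = Yes, 0 = No
    income = rng.uniform(40, 250, 300)           # taxable income, in K
    age = rng.integers(18, 60, 300)              # irrelevant by construction
    cheat = ((refund == 0) & (income < 80)).astype(int)   # hypothetical rule

    X = np.column_stack([refund, income, age])
    tree = DecisionTreeClassifier(random_state=0).fit(X, cheat)
    print(dict(zip(["Refund", "Income", "Age"], tree.feature_importances_)))
    # 'Age' gets (near-)zero importance: the tree embedded the selection.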
Characterization of FSAs
Search Organization: the general strategy with which the hypothesis space is explored.
Generation of Successors: the mechanism by which possible successor candidates of the current state are proposed.
Evaluation Measure: the function by which successor candidates are evaluated.
Types of Search Organization
We consider three types of search:
 Exponential
 Sequential
 Random
Exponential Search
(Explores a number of subsets exponential in the number of features, e.g. exhaustive or branch-and-bound search; the slide's details were lost in extraction.)

Sequential Search
(Adds or removes a single feature at each step; polynomial cost, but no guarantee of finding the optimal subset.)
Random Search
 Uses randomness to prevent the algorithm from getting stuck in a local minimum.
 Allows temporarily moving to other states with worse solutions.
 These are anytime algorithms.
 Can yield several optimal subsets as solutions.
Types of Successors Generation
 Forward
 Backward
 Compound
 Weighting
 Random
Forward Successors Generation
Backward Successors Generation
Forward and Backward Method, Stopping Criterion
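The operator definitions on these slides were lost in extraction; as a rough sketch, forward generation with a simple stopping criterion can look like this (the evaluation measure J is left abstract, e.g. the wrapper's subset_score above):

    # Forward generation: grow the subset greedily, one feature at a time,
    # and stop as soon as the measure J no longer improves.
    def forward_selection(n_features, J):
        """J maps a tuple of feature indices to a score to be maximized."""
        selected, best = [], J(())
        while len(selected) < n_features:
            remaining = [f for f in range(n_features) if f not in selected]
            score, f = max((J(tuple(selected + [f])), f) for f in remaining)
            if score <= best:            # stopping criterion: no improvement
                break
            selected.append(f)
            best = score
        return selected

Backward generation is the mirror image: start from the full feature set and greedily drop the feature whose removal degrades J the least, stopping when any removal hurts.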
Compound Successors Generation
Weighting Successors Generation
 In weighting operators (continuous features):
 All of the features are present in the solution to a certain degree.
 A successor state is a state with a different weighting.
 This is typically done by iteratively sampling the available set of instances.
Random Successors Generation
 Includes those operators that can potentially generate any other state in a single step.
 Restricted to some criterion of advance:
• in the number of features
• in improving the measure J at each step.
Evaluation Measures
Evaluation Measures, Probability of Error
Evaluation Measures, Divergence
Divergence, Some Classical Choices
Evaluation Measures, Dependence
Evaluation Measures, Interclass Distance
Evaluation Measures, Consistency
General Algorithm for Feature Selection
 All FSAs can be represented in a space of characteristics according to the criteria of:
 search organization (Org)
 generation of successor states (GS)
 evaluation measure (J)
 This space <Org, GS, J> encompasses the whole spectrum of possibilities for an FSA.
 An FSA is a hybrid when it requires more than one point in the same coordinate to be characterized.
FCBF
Fast Correlation-Based Filter
(Filter Mode)
<Sequential, Compound, Information>
Previous Works and Their Defects
1) Huge time complexity
Binary mode:
 Subset search algorithms search through candidate feature subsets guided by a certain search strategy and an evaluation measure.
 Different search strategies, namely exhaustive, heuristic, and random search, are combined with this evaluation measure to form different algorithms.
Previous Works and Their Defects
 The time complexity is exponential in the data dimensionality for exhaustive search,
 and quadratic for heuristic search.
 The complexity can be linear in the number of iterations for random search, but experiments show that, in order to find the best feature subset, the number of iterations required is usually at least quadratic in the number of features.
Previous Works and Their Defects
2) Inability to recognize redundant features
Relief:
 The key idea of Relief is to estimate the relevance of features according to how well their values distinguish between instances of the same and different classes that are near each other.
 Relief randomly samples a number (m) of instances from the training set and updates the relevance estimate of each feature based on the difference between the selected instance and the two nearest instances of the same and opposite classes.
Previous Works and Their Defects
 The time complexity of Relief for a data set with M instances and N features is O(mMN).
 With m being a constant, the time complexity becomes O(MN), which makes it very scalable to data sets with both a huge number of instances and a very high dimensionality.
 However, Relief does not help with removing redundant features.
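A minimal sketch of the Relief weight update for binary classes, assuming features scaled to [0, 1] and Manhattan distance for the nearest-neighbor search (simplifications over the original algorithm):

    # Relief: for m sampled instances, compare each with its nearest hit
    # (same class) and nearest miss (opposite class).
    import numpy as np

    def relief_weights(X, y, m=50, seed=0):
        rng = np.random.default_rng(seed)
        n_samples, n_features = X.shape
        w = np.zeros(n_features)
        for _ in range(m):
            i = rng.integers(n_samples)
            same = np.flatnonzero((y == y[i]) & (np.arange(n_samples) != i))
            diff = np.flatnonzero(y != y[i])
            hit = same[np.argmin(np.abs(X[same] - X[i]).sum(axis=1))]
            miss = diff[np.argmin(np.abs(X[diff] - X[i]).sum(axis=1))]
            # A feature is relevant if it separates the miss, not the hit.
            w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        return w / m

Note that the weights say nothing about redundancy: two identical copies of a relevant feature receive identical (high) weights, which is exactly the defect FCBF addresses.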
Good Feature
 A feature is good if it is relevant to the class concept but is not redundant to any of the other relevant features.

Correlation as Goodness Measure
 A feature is good if it is highly correlated to the class but not highly correlated to any of the other selected features.
Approaches to Measure the Correlation
 Classical linear correlation (linear correlation coefficient)
 Information theory (entropy or uncertainty)
Linear Correlation Coefficient
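The formula on this slide was lost in extraction; the standard linear (Pearson) correlation coefficient, as a sketch:

    # Linear correlation coefficient between a feature x and a variable y:
    # r = sum((x - mean(x)) * (y - mean(y)))
    #     / sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))
    import numpy as np

    def linear_correlation(x, y):
        xc, yc = x - x.mean(), y - y.mean()
        return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([0.0, 0.0, 1.0, 1.0])
    print(linear_correlation(x, y))     # matches np.corrcoef(x, y)[0, 1]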
Advantages
 It helps to remove features with near-zero linear correlation to the class.
 It helps to reduce redundancy among selected features.
Disadvantages
 It may not be able to capture correlations that are not linear in nature.
 The calculation requires all features to contain numerical values.
Entropy
Entropy, Information Gain
 The amount by which the entropy of X decreases reflects additional information about X provided by Y:
IG(X|Y) = H(X) - H(X|Y)
 Feature Y is regarded as more correlated to feature X than to feature Z if IG(X|Y) > IG(Z|Y).
 Information gain is symmetrical for two random variables X and Y: IG(X|Y) = IG(Y|X).
Entropy, Symmetrical Uncertainty
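The symmetrical uncertainty formulas were lost in extraction; the standard definition used by FCBF is SU(X,Y) = 2 * IG(X|Y) / (H(X) + H(Y)), which normalizes information gain to [0, 1]. A sketch for discrete-valued features:

    # Entropy, information gain, and symmetrical uncertainty (SU).
    import numpy as np
    from collections import Counter

    def entropy(x):
        p = np.array(list(Counter(x).values())) / len(x)
        return float(-(p * np.log2(p)).sum())

    def conditional_entropy(x, y):
        # H(X|Y) = sum over values v of Y of P(Y=v) * H(X | Y=v)
        return sum((np.array(y) == v).mean()
                   * entropy([a for a, b in zip(x, y) if b == v])
                   for v in set(y))

    def symmetrical_uncertainty(x, y):
        ig = entropy(x) - conditional_entropy(x, y)      # IG(X|Y)
        return 2.0 * ig / (entropy(x) + entropy(y))

    print(symmetrical_uncertainty([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0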
Algorithm Steps
 Two aspects of developing a procedure to select good features for classification:
1) How to decide whether a feature is relevant to the class or not (C-correlation).
2) How to decide whether such a relevant feature is redundant or not when considering it together with other relevant features (F-correlation).
 Select features with SU greater than a threshold.
Predominant Correlation
Redundant Feature
Predominant Feature
 A feature is predominant to the class iff:
 its correlation to the class is predominant,
 or it can become predominant after removing its redundant peers.
 Feature selection for classification is a process that identifies all predominant features to the class concept and removes the rest.
Heuristic
FCBF Algorithm
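The algorithm listing itself was lost in extraction; a compact sketch of the two FCBF steps, reusing the symmetrical_uncertainty function above (the threshold value is a hypothetical choice):

    # FCBF sketch: (1) keep features whose SU with the class exceeds a
    # threshold (C-correlation); (2) walking down that ranking, drop any
    # feature made redundant by a better-ranked one, i.e. whose
    # F-correlation with it is at least its own C-correlation.
    def fcbf(features, labels, threshold=0.1):
        """features: dict name -> list of discrete values; labels: list."""
        su_c = {f: symmetrical_uncertainty(x, labels)
                for f, x in features.items()}
        ranked = [f for f in sorted(su_c, key=su_c.get, reverse=True)
                  if su_c[f] > threshold]
        selected = []
        for f in ranked:
            if all(symmetrical_uncertainty(features[f], features[g]) < su_c[f]
                   for g in selected):
                selected.append(f)
        return selected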
GA-SVM
Genetic Algorithm + Support Vector Machine
(Wrapper Mode)
<Sequential, Compound, Classifier>
Support Vector Machine (SVM)
 SVM is one of the best techniques for pattern classification.
 It is widely used in many application areas.
 SVM classifies data by determining a set of support vectors and their distance to the hyperplane.
 SVM provides a generic mechanism that fits the hyperplane surface to the training data.
SVM Main Idea
 Under the hypothesis that the classes are linearly separable, build the hyperplane with maximum margin that separates the classes.
 When the classes are not linearly separable, map them to a high-dimensional space in order to separate them linearly.
[Figure: separating surface between classes A+ and A-]

Support Vector

[Figure: support vectors (SV) lying on the margin between Class +1 and Class -1, plotted on axes X1 and X2]
Kernel

[Figure: points at 1, 2, 4, 5, 6 in 1 dimension (class 1 | class 2 | class 1) become linearly separable after mapping to 2 dimensions]
Kernel
 Map the data into a higher-dimensional space!
 The user may select a kernel function for the SVM during the training process.
 The kernel parameter settings for SVM in a training process impact the classification accuracy.
 The parameters that should be optimized include the penalty parameter C and the kernel function parameters.
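A small sketch of what "optimizing C and the kernel parameters" means in practice (scikit-learn assumed; the candidate values are arbitrary): each (C, gamma) pair changes the cross-validated accuracy, which is exactly the quantity the GA will search over.

    # Accuracy of an RBF-kernel SVM as a function of C and gamma.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    for C, gamma in [(0.1, 0.01), (1.0, 0.1), (10.0, 1.0)]:
        acc = cross_val_score(SVC(C=C, gamma=gamma, kernel="rbf"),
                              X, y, cv=5).mean()
        print(f"C={C}, gamma={gamma}: accuracy={acc:.3f}")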
Linear SVM
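The formulas on these slides were lost in extraction; for reference, the textbook hard-margin primal (a sketch, not necessarily the slides' exact notation):

    \min_{w,\,b} \;\; \tfrac{1}{2}\lVert w \rVert^2
    \qquad \text{s.t.} \quad y_i\,(w \cdot x_i + b) \ge 1 \quad \forall i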
Linear Generalized SVM
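Again filling in the lost formulas with the textbook version: the generalized (soft-margin) form introduces slack variables xi_i together with the penalty parameter C mentioned on the Kernel slide:

    \min_{w,\,b,\,\xi} \;\; \tfrac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
    \qquad \text{s.t.} \quad y_i\,(w \cdot x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0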
NonLinear SVM

NonLinear SVM, Kernels
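The kernel formulas were also lost; two classical choices (standard definitions, not taken from the slides):

    K(x_i, x_j) = (x_i \cdot x_j + 1)^d \quad \text{(polynomial of degree } d\text{)}
    \qquad
    K(x_i, x_j) = \exp\!\left(-\gamma \lVert x_i - x_j \rVert^2\right) \quad \text{(RBF)}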
Genetic Algorithm (GA)
 Genetic algorithms (GA), as an optimization and search methodology, are a promising alternative to conventional heuristic methods.
 A GA works with a set of candidate solutions called a population.
 Based on the Darwinian principle of 'survival of the fittest', the GA obtains the optimal solution after a series of iterative computations.
 The GA generates successive populations of alternative solutions, each represented by a chromosome.
 A fitness function assesses the quality of a solution in the evaluation step.
GA Feature Selection Structure
Evaluation Measure
 Three criteria are used to design the fitness function:
 classification accuracy
 the number of selected features
 the feature cost
 Thus, an individual (chromosome) with:
 high classification accuracy,
 a small number of features,
 and low total feature cost
produces a high fitness value. A sketch of such a fitness function follows.
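A sketch of a GA-SVM individual and its fitness, assuming a binary chromosome (one gene per feature) and hypothetical weights wa, wf, wc for the three criteria; the slides do not give the exact weighting formula.

    # Fitness of a chromosome: accuracy up, feature count down, cost down.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def fitness(chromosome, X, y, costs, wa=0.8, wf=0.1, wc=0.1):
        """costs: array of per-feature costs; chromosome: 0/1 per feature."""
        mask = np.asarray(chromosome, dtype=bool)   # selected features
        if not mask.any():
            return 0.0                              # empty subsets score zero
        acc = cross_val_score(SVC(), X[:, mask], y, cv=3).mean()
        return (wa * acc
                + wf * (1 - mask.mean())                       # fewer features
                + wc * (1 - costs[mask].sum() / costs.sum()))  # lower cost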
Thanks for Your Attention