Data Mining Model Predicts Graduate Employment in Malaysia

A Data Mining Approach to Construct
Graduates Employability Model in Malaysia
Myzatul Akmam Sapaat, Aida Mustapha, Johanna Ahmad, Khadijah Chamili,
Rahamirzam Muhamad
Faculty of Computer Science and Information Technology, Universiti Putra Malaysia,
43400 UPM Serdang, Selangor, Malaysia
{angahmyz@yahoo.com, aida@fsktm.upm.edu.my, anna_lee207@yahoo.co.uk,
khadijah@usim.edu.my, raha_muhd@yahoo.com}
ABSTRACT
This study constructs a Graduates Employability Model using the
classification task in data mining. We use data sourced from the
Tracer Study, a web-based survey system from the Ministry of
Higher Education, Malaysia (MOHE), for the year 2009. The
classification experiment uses various Bayes algorithms to
determine whether a graduate has been employed, remains unemployed
or is in an undetermined situation. The performance of the Bayes
algorithms is also compared against a number of tree-based
algorithms. Information Gain is used to rank the attributes, and
the results show that the top three attributes with a direct impact
on employability are the job sector, job status and reason for not
working. J48Graft, a grafted variant of the J48 decision-tree
algorithm, achieved the highest accuracy, 92.3%, compared with
91.3% for the best Bayes algorithm. This leads to the conclusion
that a tree-based classifier is more suitable for the tracer data
due to the information gain strategy.
KEYWORDS
Classification, Bayes Methods, Decision
Tree, Employability
1 INTRODUCTION
The Tracer Study is a web-based survey system developed by the
Ministry of Higher Education, Malaysia (MOHE). All students
graduating from polytechnics and from public or private
institutions must complete it before their convocation, for any
level of degree awarded. The purpose of the survey is to guide
future planning and to improve various aspects of the local higher
education administrative system. The survey also serves as a tool
to gauge the adequacy of higher education in Malaysia in supplying
manpower needs in all areas across the technical, managerial and
social science fields. Data sourced from the Tracer Study is
invaluable because it relates graduate qualifications and skills
to employment status.
Graduate employability remains a national issue due to the
increasing number of graduates produced by higher education
institutions each year. According to statistics generated from the
Tracer Study, the total number of graduates produced by higher
institutions in 2008 was 139,278. In 2009, the number increased to
155,278 graduates. Of the graduates in 2009, 50%
International Journal on New Computer Architectures and Their Applications (IJNCAA) 1(4): 1086-1098
The Society of Digital Information and Wireless Communications, 2011 (ISSN: 2220-9085)

are bachelor's degree holders from public and private
universities. Only 49.20%, or 38,191, of them were successfully
employed within the first six months after finishing their studies.
Previous research on graduate employability covers a wide range of
domains such as education, engineering, and social science. While
this research is mainly based on surveys or interviews, little has
been done using data mining techniques.
Bayes' theorem is among the earliest statistical methods used to
identify patterns in data. But as datasets have grown in size and
complexity, data mining has emerged as a technology that applies
methods such as neural networks, genetic algorithms, decision
trees, and support vector machines to uncover hidden patterns [1].
Today, data mining technologies deal with huge amounts of data from
various sources, for example relational or transactional databases,
data warehouses, images, flat files or the World Wide Web.
Classification is the task of generalizing from observations in the
training data, each of which is accompanied by a class label. The
objective of this paper is to predict whether a graduate has been
employed, remains unemployed or is in an undetermined situation
within the first six months after graduation. This is achieved
through a classification experiment that classifies a graduate
profile as employed, unemployed or others. The main contribution of
this paper is the comparison of classification accuracy between
various algorithms from the two most commonly used data mining
techniques in the education domain, namely the Bayes methods and
decision trees.
The remainder of this paper is
organized as follows. Section 2 presents
the related works on graduate
employability and reviews recent
techniques employed in data mining.
Section 3 introduces the dataset and the
experimental setting. Section 4 discusses
the findings. Finally, Section 5
concludes the paper with some directions
for future work.
2 RELATED WORK
A number of studies have sought to identify the factors that
influence graduate employability in Malaysia. Such work is an
initial step towards aligning higher education with industry, two
sectors that unquestionably influence each other. Nonetheless, most
of the previous works were carried out outside the data mining
domain. In addition, their data were collected through surveys of
sample populations.
Research in [2] identifies three major requirements of employers
when hiring employees: basic academic skills, higher order thinking
skills, and personal qualities. The work is restricted to the
education domain, specifically analyzing the effectiveness of a
subject, English for Occupational Purposes (EOP), in enhancing
employability skills. Similar to [2], the work in [3] proposes
restructuring the curriculum and methods of instruction to prepare
future graduates for forthcoming challenges, based on the model of
the T-shaped professional and the newly developed field of Service
Science, Management and Engineering (SSME).
More recently, [4] proposes a new Malaysian Engineering
Employability Skills Framework (MEES), constructed from the
requirements of accrediting and professional bodies and from
existing research findings in employability skills, as a guideline
for training packages and qualifications in Malaysia. Nonetheless,
graduate employability is rarely studied within the scope of data
mining, mainly because authentic data sources are limited.
Employability issues have also been considered in other countries.
Research by The Higher Education Academy with the Council for
Industry and Higher Education (CIHE) in the United Kingdom
concluded that there are six competencies that employers look for
in individuals who can transform organizations and add value in
their careers [5]. The six competencies are cognitive skills or
brainpower, generic competencies, personal capabilities, technical
ability, business or organization awareness, and practical
elements. Employability thus covers a set of achievements,
comprising skills, understandings and personal attributes, that
make graduates more likely to gain employment and be successful in
their chosen occupations, which benefits the graduates, the
community and the economy.
Data mining techniques have nevertheless been employed in the
education domain, for instance in the prediction and classification
of student academic performance using Artificial Neural Networks
[6, 7] and a combination of clustering and decision tree
classification techniques [6]. The experiments in [8] classify
students to predict their final grade using six common classifiers
(Quadratic Bayesian classifier, 1-nearest neighbour (1-NN),
k-nearest neighbour (k-NN), Parzen-window, multilayer perceptron
(MLP), and Decision Tree). With regard to student performance, [9]
discovers individual student characteristics that are associated
with success, measured by grade point average (GPA), using the
Microsoft Decision Trees (MDT) classification technique. [10] shows
some applications of data mining in educational institutions that
extract useful information from huge data sets. Data mining through
analytical tools lets users view and use current information in
decision-making processes such as organizing the syllabus,
predicting the registration of students in an educational program,
predicting student performance, detecting cheating in online
examinations, and identifying abnormal or erroneous values.
Among the related work, we found that [11] is the most closely
related to this research: it mines historical data of students'
academic results using different classifiers (Bayes, trees,
function) to rank the factors that contribute to predicting student
academic performance.
3 MATERIALS AND METHODS
The main objective of this paper is to classify a graduate profile
as employed, unemployed or undetermined using data sourced from the
Tracer Study database for the year 2009. The dataset consists of
12,830 instances and 20 attributes related to graduate profiles
from 19 public universities and 138 private universities. Table 1
shows the complete set of attributes for the Tracer Study dataset.
To construct the classifiers, we use the Waikato Environment for
Knowledge Analysis (WEKA), an open-source data mining tool [12]
developed at the University of Waikato, New Zealand. It provides
various learning algorithms that can easily be applied to the
dataset. WEKA's native dataset format is the Attribute-Relation
File Format (ARFF). Therefore, once data preparation was done, we
transformed the dataset into an ARFF file with the extension .arff.
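To make the ARFF step concrete, the following is a minimal sketch of how nominal data could be rendered as ARFF text. It is not the conversion procedure used in this study; the function name to_arff, the relation name and the two sample rows are illustrative assumptions, while the attribute names and values follow Table 1.

```python
def to_arff(relation, attributes, rows):
    """Render nominal data as ARFF text.

    attributes: list of (name, [allowed values]) pairs.
    rows: list of tuples, one value per attribute, in the same order.
    """
    lines = [f"@relation {relation}", ""]
    for name, values in attributes:
        # A nominal attribute lists its allowed values inside braces.
        lines.append(f"@attribute {name} {{{','.join(values)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Hypothetical three-attribute excerpt; the rows are made up.
attributes = [
    ("sex", ["male", "female"]),
    ("univ", ["public_univ", "private_univ"]),
    ("emp_status", ["employed", "unemployed", "others"]),
]
rows = [
    ("female", "public_univ", "employed"),
    ("male", "private_univ", "unemployed"),
]
arff_text = to_arff("tracer_study_sample", attributes, rows)
print(arff_text)
```

Writing arff_text to a file with the .arff extension would give a file that follows the ARFF layout: a header of attribute declarations followed by one comma-separated line per instance.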
Table 1. Attributes from the Tracer Study dataset after the pre-processing is performed.
No. | Attribute | Values | Description
1 | sex | {male, female} | Gender of the graduate
2 | age | {20-25, 25-30, 30-40, 40-50, >50} | Age of the graduate
3 | univ | {public_univ, private_univ} | University/institution of current qualification
4 | level | {certificate, diploma, advanced_diploma, first_degree, postGraduate_diploma, masters_thesis, masters_courseWork&Thesis, masters_courseWork, phd_thesis, phd_courseWork&Thesis, professional} | Level of study for current qualification
5 | field | {technical, ict, education, science, art&soc_science} | Field of study for current qualification
6 | cgpa | {2.00-2.49, 2.50-2.99, 3.00-3.66, 3.67-4.00, failed, 4.01-6.17} | CGPA for current qualification
7 | emp_status | {employed, unemployed, others} | Current employment status
8 | general_IT skills | {satisfied, extremely_satisfied, average, strongly_not_satisfied, not_satisfied, not_applicable} | Level of IT skills, Malay and English language proficiency, general knowledge, interpersonal communication, creative and critical thinking, analytical skills, problem solving, inculcation of positive values, and teamwork acquired from the programme of study (values and description shared by attributes 8 to 17)
9 | Malay_lang | as No. 8 | as No. 8
10 | English_lang | as No. 8 | as No. 8
11 | gen_knowledge | as No. 8 | as No. 8
12 | interpersonal_comm | as No. 8 | as No. 8
13 | cc_thinking | as No. 8 | as No. 8
14 | analytical | as No. 8 | as No. 8
15 | prob_solving | as No. 8 | as No. 8
16 | positive_value | as No. 8 | as No. 8
17 | teamwork | as No. 8 | as No. 8
18 | job_status | {permanent, contract, temp, self_employed, family_business} | Job status of employed graduates
19 | job_sector | {local_private_company, multinational_company, own_company, government, NGO, GLC, statutory_body, others} | Job sector of employed graduates
20 | reason_not_working | {job_hunting, waiting_for_posting, further_study, participating_skills_program, waiting_posting_of_study, unsuitable_job, resting, others, family_responsibilities, medical_issues, not_interested_to_work, not_going_to_work, lack_of_confidence, chambering} | Reason for not working for unemployed graduates
3.1 Data Pre-processing
The raw data retrieved from the Tracer Study database required
pre-processing to prepare the dataset for the classification task.
First, cleaning activities involved eliminating data with missing
values in critical attributes, identifying outliers, correcting
inconsistent data, and removing duplicate data. Of the 89,290
instances in the raw data, the data cleaning process left 12,830
instances ready to be mined. For the remaining missing values
(i.e., in the age attribute), we replaced them with the mean value
of the attribute.
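The mean replacement described above can be sketched as follows. This is an illustration only, assuming missing entries are represented as None; the function name impute_mean and the sample ages are made up.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


ages = [23, None, 27, 25, None]  # made-up ages with two missing entries
filled = impute_mean(ages)
print(filled)
```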
Second, data discretization is required because most of the
attributes from the Tracer Study are continuous or numerically
coded. We discretized the values into intervals, or mapped the
codes to labels, so that the dataset contains only categorical
(nominal) attributes, as follows.
• cgpa, previously a continuous number, is transformed into a
grade range
• sex, previously coded as 1 and 2, is transformed into nominal
• age, previously a continuous number, is transformed into an age
range
• field of study, previously in numerical code 1-4, is transformed
into nominal
• skill information (i.e., language proficiency, general
knowledge, interpersonal communication, etc.), previously in
numerical code 1-9, is transformed into nominal
• employment status, previously in numerical code 1-3, is
transformed into nominal
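The transformations listed above can be sketched for two of the attributes. The CGPA ranges are taken from Table 1; the handling of values that fall between two listed ranges, the restriction to the 2.00-4.00 scale, and the order of the sex codes are assumptions made for illustration, not details reported by the study.

```python
from bisect import bisect_right

# Lower bounds of four CGPA ranges from Table 1. The "failed" and
# "4.01-6.17" categories are left out of this sketch.
CGPA_LOWER_BOUNDS = [2.00, 2.50, 3.00, 3.67]
CGPA_LABELS = ["2.00-2.49", "2.50-2.99", "3.00-3.66", "3.67-4.00"]

# Assumed mapping: the source only says sex was coded as 1 and 2.
SEX_CODES = {1: "male", 2: "female"}


def discretize_cgpa(cgpa):
    """Map a continuous CGPA on the 2.00-4.00 scale to its grade range."""
    if not 2.00 <= cgpa <= 4.00:
        raise ValueError(f"CGPA {cgpa} is outside the ranges in this sketch")
    # bisect_right counts how many lower bounds are <= cgpa.
    return CGPA_LABELS[bisect_right(CGPA_LOWER_BOUNDS, cgpa) - 1]


def decode_sex(code):
    """Map the numeric sex code to a nominal label."""
    return SEX_CODES[code]


print(discretize_cgpa(3.5), decode_sex(2))
```

A lookup over sorted lower bounds keeps the bin edges in one place, so other interval attributes such as age could reuse the same pattern with their own bounds and labels.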
3.2 Classification Task
The classification task at hand is to predict the employment status
(employed, unemployed, others) for graduate profiles in the Tracer
Study. The task is performed in two stages, training and testing.
Once the classifier is constructed, a testing dataset is used to
estimate the predictive accuracy of the classifier.
There are four test options in WEKA: using the training set, a
supplied test set, cross-validation and percentage split. If we use
the training set as the test option, the test data is the same data
used for training, so the resulting estimate of the true error rate
is optimistic and unreliable. A supplied test set lets us provide
test data that has been prepared separately from the training data.
Cross-validation is suitable for limited datasets, and the number
of folds can be set by the user. 10-fold cross-validation is widely
used to get the best estimate of error, a choice supported by
extensive tests on numerous datasets with different learning
techniques [13]. Given the large number of instances, and to avoid
overfitting, we employed the hold-out validation method with a
70-30 percentage split, whereby 70% of the 12,830 instances are
used for training while the remaining instances are used for
testing. Various algorithms from both the Bayes and decision tree
families are used to predict the employment status.
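The hold-out idea can be illustrated with a short sketch. This is not WEKA's implementation; the function name percentage_split, the shuffling step and the seed are assumptions, and a toy list of 100 items stands in for the instances.

```python
import random


def percentage_split(instances, train_fraction=0.7, seed=1):
    """Shuffle the instances and split them into training and test sets."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for the dataset: 100 instance identifiers.
train, test = percentage_split(range(100))
print(len(train), len(test))
```

Because the test portion is never seen during training, accuracy measured on it is a less optimistic estimate than accuracy measured on the training set itself.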
Information Gain. Information Gain is the attribute selection
measure used in ID3. If node N represents the tuples of partition
D, the attribute with the highest information gain is chosen as the
splitting attribute for node N. This choice minimizes the expected
number of tests needed to classify a given tuple and guarantees
that a simple (though not necessarily the simplest) tree is found.
The expected information needed to classify a tuple in D, where pi
is the probability that a tuple in D belongs to class Ci and m is
the number of classes, is given by
\[ \mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i) \]
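As an illustration of the measure, the sketch below computes Info(D) and the information gain of an attribute, taken here as Info(D) minus the weighted Info of the partitions induced by that attribute (the standard ID3 definition, which the text above does not spell out). The four rows are made up, and the job_status value "none" is a hypothetical placeholder for graduates without a job.

```python
from collections import Counter
from math import log2


def info(labels):
    """Expected information (entropy, in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Info(D) minus the weighted Info of the partitions split on attribute."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(
        len(part) / len(rows) * info(part) for part in partitions.values()
    )
    return info([row[target] for row in rows]) - remainder


# Made-up rows for illustration only.
rows = [
    {"sex": "male", "job_status": "permanent", "emp_status": "employed"},
    {"sex": "female", "job_status": "permanent", "emp_status": "employed"},
    {"sex": "male", "job_status": "none", "emp_status": "unemployed"},
    {"sex": "female", "job_status": "none", "emp_status": "unemployed"},
]
gain_job = information_gain(rows, "job_status", "emp_status")
gain_sex = information_gain(rows, "sex", "emp_status")
print(gain_job, gain_sex)
```

In this toy data job_status separates the classes perfectly while sex carries no information about the class, so ranking by gain places job_status first.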
Bayes Methods. In Bayes methods, the classification task consists
of predicting a class variable given a set of attribute variables.
It is a type of statistical inference in which the prior
distribution is estimated from the data before any new data are
observed, hence every parameter is assigned a prior probability
distribution [14]. A Bayesian classifier learns from the samples
over both the class and attribute variables.
The naïve Bayesian classifier works
as follows: Let D be a training set of
tuples and their associated class labels.
As usual, each tuple is represented by an
n-dimensional attribute vector, X = (x1,
x2, …, xn), depicting n measurements
made on the tuple from n attributes,
respectively, A1, A2, … , An.
Suppose that there are m classes, C1,
C2, …, Cm. Given a tuple, X, the
classifier will predict that X belongs to
the class having the highest posterior
probability, conditioned on X. That is,
the naïve Bayesian classifier predicts
that tuple X belongs to the class Ci if and
only if
P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m; j ≠ i
Thus, we maximize P(Ci|X). The class Ci
for which P(Ci|X) is maximized is called
the maximum a posteriori hypothesis.
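The decision rule above can be sketched for nominal attributes as follows. This is a minimal illustration, not WEKA's Naïve Bayes code: P(Ci|X) is scored as the prior P(Ci) times the product of the per-attribute conditional probabilities, the add-one (Laplace) smoothing is an added assumption to avoid zero probabilities, and the five training rows are made up.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (class, attribute) -> value counts
    domains = defaultdict(set)           # attribute -> set of observed values
    for row in rows:
        for attribute, value in row.items():
            if attribute != target:
                value_counts[(row[target], attribute)][value] += 1
                domains[attribute].add(value)
    return class_counts, value_counts, domains


def predict(model, x):
    """Return the class with the highest (unnormalized) posterior score."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(Ci)
        for attribute, value in x.items():
            # Add-one smoothed estimate of P(value | Ci); an assumption here.
            count = value_counts[(c, attribute)][value]
            score *= (count + 1) / (n_c + len(domains[attribute]))
        scores[c] = score
    return max(scores, key=scores.get)


# Made-up training rows; attribute names follow Table 1.
rows = [
    {"field": "ict", "English_lang": "satisfied", "emp_status": "employed"},
    {"field": "ict", "English_lang": "satisfied", "emp_status": "employed"},
    {"field": "ict", "English_lang": "average", "emp_status": "employed"},
    {"field": "science", "English_lang": "average", "emp_status": "unemployed"},
    {"field": "science", "English_lang": "satisfied", "emp_status": "unemployed"},
]
model = train_naive_bayes(rows, "emp_status")
print(predict(model, {"field": "ict", "English_lang": "satisfied"}))
```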
Under the Bayes methods in WEKA, we performed the experiment with
eight algorithms: Averaged One-Dependence Estimators (AODE),
AODEsr, WAODE, Bayes Network, HNB, Naïve Bayesian, Naïve Bayesian
Simple and Naïve Bayesian Updateable. AODE, HNB and Naïve Bayesian
were also used in [11]; the remaining algorithms were chosen to
further compare the results of the Bayes algorithms on the same
dataset.
AODE achieves high accuracy by averaging over all of a small space
of alternative naive-Bayes-like models that have weaker, and hence
less detrimental, independence assumptions than naive Bayes. The
resulting algorithm is computationally efficient while delivering
highly accurate classification on many learning tasks. AODEsr and
WAODE are extended from AODE. AODEsr complements AODE with
Subsumption Resolution, which detects specializations between two
attribute values at classification time and deletes the
generalization attribute value.
Meanwhile, WAODE constructs a model called Weightily Averaged
One-Dependence Estimators by assigning a weight to each
one-dependence estimator. Bayes Network performs Bayes network
learning using various search algorithms and quality measures. HNB
constructs a Hidden Naive Bayes classification model with high
classification accuracy and AUC. In Naive Bayes, numeric estimator
precision values are chosen based on analysis of the training data.
The Naïve Bayes Updateable classifier uses a default precision of
0.1 for numeric attributes when the classifier is built with zero
training instances. Naive Bayes Simple models numeric attributes by
a normal distribution.
Tree Methods. Tree-based methods classify instances by sorting them
down the tree from the root to some leaf node, which provides the
classification of the instance. Each node in the tree specifies a
test of some attribute of the instance, and each branch descending
from that node corresponds to one of the possible values for this
attribute [15]. Figure 1 shows the model produced by decision
trees, which is represented in the form of a tree structure.
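The sorting of an instance down a tree can be sketched in a few lines. The tree below is hand-built and hypothetical; it is not the tree learned in the experiments, and the job_status value "none" is a placeholder assumed for graduates without a job. Inner nodes are dictionaries naming the tested attribute, and leaves are class labels.

```python
def classify(node, instance):
    """Sort an instance down the tree until a leaf (a class label) is reached."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]  # test the attribute at this node
        node = node["branches"][value]       # follow the matching branch
    return node


# Hypothetical tree for illustration only.
tree = {
    "attribute": "job_status",
    "branches": {
        "permanent": "employed",
        "contract": "employed",
        "none": {
            "attribute": "reason_not_working",
            "branches": {
                "job_hunting": "unemployed",
                "further_study": "others",
            },
        },
    },
}
print(classify(tree, {"job_status": "none", "reason_not_working": "further_study"}))
```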
Under the tree methods in WEKA, we performed the classification
experiment with nine algorithms: ID3, J48, REPTree, J48graft,
Random Tree, Decision Stump, LADTree, Random Forest and Simple
Cart. J48 and REPTree were also used in [11], but we did not manage
to use NBTree and BFTree because the experiment worked on a large
amount of data, which exceeded the memory allocation in WEKA. The
FT, User Classifier and LMT algorithms experienced the same problem
as NBTree and BFTree. In addition, we employed ID3, J48graft,
Random Tree, Decision Stump, LADTree, Random Forest and Simple Cart
to experiment with other alternative decision tree algorithms.
Figure 1. In a tree structure, each node denotes a
test on an attribute value, each branch represents
an outcome of the test, and tree leaves represent
classes or class distributions. A leaf node
indicates the class of the examples. The instances
are classified by sorting them down the tree from
the root node to some leaf node.
ID3 is a class for constructing an unpruned decision tree based on
the ID3 algorithm, which only deals with nominal attributes. J48 is
a class for generating a pruned or unpruned C4.5 decision tree,
while J48graft generates a grafted (pruned or unpruned) C4.5
decision tree. REPTree is a fast decision tree learner that builds
a decision/regression tree using information gain/variance and
prunes it using reduced-error pruning (with backfitting). Decision
Stump is usually used in conjunction with a boosting algorithm.
LADTree generates a multi-class alternating decision tree using the
LogitBoost strategy. Random Forest constructs a forest of random
trees, whereas Random Tree constructs a tree that considers K
randomly chosen attributes at each node without pruning. SimpleCart
implements minimal cost-complexity pruning.
4 RESULTS AND DISCUSSIONS
We segregated the experimental results
into three parts. The first is the result
from ranking attributes in the Tracer
Study dataset using the Information
Gain. The second and third parts
present the predictive accuracy results
of various algorithms from the Bayes
and decision tree families,
respectively.
4.1 Information Gain
In this study, we employed Information Gain to rank the attributes
in determining the target values as well as to reduce the number of
attributes considered in prediction. Decision
tree algorithms adopt a mutual-information criterion to choose, at
each branch, the attribute that gains the most information. This is
inherently a simple preference bias that explicitly searches for a
simple hypothesis. Ranking attributes also increases the speed and
accuracy of prediction. Based on attribute selection using
Information Gain, the job sector attribute was found to be the most
important factor in discriminating the graduate profiles to predict
a graduate's employment status. This is shown in Figure 2.
Figure 2. Job sector is ranked the highest by attribute selection based on Information Gain. This is largely
because the attribute has a small set of values, so one instance is easily distinguished from the remaining
instances.
4.2 Bayes Methods
Table 2 shows the classification accuracies for various algorithms
under the Bayes methods. In addition, the table provides
comparative results for the kappa statistic, mean absolute error,
root mean squared error, relative absolute error, and root relative
squared error over the 3,840 testing instances.
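For readers unfamiliar with the kappa statistic reported alongside accuracy, the sketch below computes both from a confusion matrix using the standard Cohen's kappa definition: observed agreement corrected for the agreement expected by chance. The 2x2 matrix is made up for illustration and does not come from the experiments.

```python
def accuracy_and_kappa(confusion):
    """confusion[i][j] = count of instances of actual class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected if predictions were independent of the true class.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up confusion matrix for two classes.
confusion = [
    [45, 5],
    [10, 40],
]
acc, kappa = accuracy_and_kappa(confusion)
print(acc, kappa)
```

The error rate listed in the tables is simply one minus the accuracy.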
The Weightily Averaged One-Dependence Estimators (WAODE) algorithm
achieved the highest accuracy among the Bayes algorithms. Rather
than treating each tree augmented naive Bayes equally, [16] extends
AODE by assigning a different weight to each tree augmented naive
Bayes, because the attributes do not play the same role in
classification.
Table 2. Classification accuracy using various algorithms under Bayes method in WEKA.
Algorithm | Accuracy (%) | Error Rate (%) | Kappa Statistic | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%)
WAODE | 91.3 | 8.7 | 0.834 | 0.073 | 0.203 | 20.8 | 48.4
AODE | 91.1 | 8.9 | 0.827 | 0.069 | 0.208 | 19.5 | 49.6
Naïve Bayesian | 90.9 | 9.1 | 0.825 | 0.072 | 0.214 | 20.5 | 51.3
Naïve Bayes Simple | 90.9 | 9.1 | 0.825 | 0.072 | 0.214 | 20.5 | 51.3
BayesNet | 90.9 | 9.1 | 0.824 | 0.072 | 0.215 | 20.5 | 51.4
AODEsr | 90.9 | 9.1 | 0.824 | 0.071 | 0.210 | 20.1 | 50.2
Naïve Bayes Updateable | 90.9 | 9.1 | 0.825 | 0.072 | 0.214 | 20.5 | 51.3
HNB | 90.3 | 9.7 | 0.816 | 0.091 | 0.214 | 25.7 | 51.1
4.3 Tree Methods
Table 3 shows the classification accuracies for various algorithms
under the tree methods. In addition, the table provides comparative
results for the kappa statistic, mean absolute error, root mean
squared error, relative absolute error, and root relative squared
error over the 3,840 testing instances.
Table 3. Classification accuracy using various algorithms under Tree method in WEKA.
Algorithm | Accuracy (%) | Error Rate (%) | Kappa Statistic | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%)
J48Graft | 92.3 | 7.7 | 0.849 | 0.078 | 0.204 | 22.1 | 48.7
J48 | 92.2 | 7.8 | 0.848 | 0.078 | 0.204 | 22.2 | 48.8
Simple Cart | 92.0 | 8.0 | 0.844 | 0.079 | 0.199 | 22.3 | 47.5
Random Forest | 91.4 | 8.6 | 0.832 | 0.083 | 0.205 | 23.4 | 49.1
LAD Tree | 91.3 | 8.7 | 0.830 | 0.077 | 0.197 | 22.0 | 47.0
REPTree | 91.0 | 9.0 | 0.825 | 0.080 | 0.213 | 22.8 | 50.9
Decision Stump | 91.0 | 9.0 | 0.821 | 0.108 | 0.232 | 30.6 | 55.3
RandomTree | 88.9 | 11.1 | 0.787 | 0.081 | 0.269 | 23.0 | 64.4
ID3 | 86.7 | 13.3 | 0.795 | 0.072 | 0.268 | 21.1 | 65.2
The J48Graft algorithm achieved the highest accuracy among the tree
algorithms. J48Graft generates a grafted C4.5 decision tree,
whether pruned or unpruned. Grafting is an inductive process that
adds nodes to the inferred decision tree. Unlike pruning, which
uses only local information gathered as the tree grows, grafting
uses non-local information to provide better predictive accuracy.
Figure 3 shows the difference in tree structure between a J48 tree
and the grafted J48 tree.
Figure 3. The top figure is the tree structure for
J48 and the bottom figure is the tree structure for
grafted J48. Grafting adds nodes to the decision
trees to increase the predictive accuracy. In the
grafted J48, new branches are added in the place
of a single leaf or graft within leaves.
Comparing the performance of the Bayes and tree-based methods, the
J48Graft algorithm achieved the highest accuracy, 92.3%, on the
Tracer Study dataset. The second highest accuracy also belongs to a
tree method, the J48 algorithm, with an accuracy of 92.2%. The best
Bayes algorithm, WAODE, trails with a prediction accuracy of 91.3%.
Nonetheless, we found the two classification approaches to be
complementary, because the Bayes methods provide a better view of
the associations or dependencies among the attributes, while the
results from the tree methods are easier to interpret.
Figure 4 shows the mapping of root mean squared error values that
resulted from the classification experiment. This knowledge could
be used to gain insights into the employment trends of graduates
from local higher institutions.
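[Figure 4 plot: radial chart with a root mean squared error axis from 0 to 0.3 and two series, Bayes Methods and Tree-based Methods, across five pairings: (1) AODE vs. J48Graft, (2) Naïve Bayesian (paired tree algorithm not recoverable), (3) Naïve Bayes Simple vs. REPTree, (4) BayesNet vs. RandomTree, (5) HNB vs. ID3.]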
Figure 4. A radial display of the root mean squared error across all algorithms under both Bayes and tree-
based methods relative to accuracy. The smaller the mean squared error, the better is the forecast. Based on
this figure, three out of five tree-based algorithms indicate better forecast as compared to the corresponding
algorithms under the Bayes methods.
5 CONCLUSIONS
As the education sector grows every year, graduates face stiff
competition to secure employment in the industry. The purpose of
the Tracer Study system is to aid higher educational institutions
in preparing their graduates with sufficient skills to enter the
job market. This paper focused on identifying the attributes that
influence graduates' employability, based on actual data from the
graduates themselves six months after graduation. Assembling the
dataset was difficult, however, because only 90% of the attributes
made their way to the classification task. Due to confidentiality
and sensitivity issues, the remaining 10% of the attributes were
not released by the data owner.
This paper attempts to predict whether a graduate has been
employed, remains unemployed or is in an undetermined situation
within the first six months after graduation. The prediction was
performed through a series of classification experiments using
various algorithms under the Bayes and decision tree methods to
classify a graduate profile as employed, unemployed or others.
Results showed that J48Graft, a grafted variant of the J48
decision-tree algorithm, yielded the highest accuracy, 92.3%,
compared with 91.3% for the best Bayes algorithm.
As for future work, we hope to expand the dataset from the Tracer
Study with more attributes and to annotate the attributes with
information such as the correlation factor between the current
employer and the previous employer. We are also looking at
integrating datasets from different sources, for instance graduate
profiles from the alumni organizations of the respective
educational institutions. With this in place, we plan to introduce
clustering as part of pre-processing to cluster the attributes
before attribute ranking is performed. Finally, other data mining
techniques such as anomaly detection or classification-based
association may be applied in order to gain more knowledge on
graduate employability in Malaysia.
Acknowledgments. Special thanks to
Prof. Dr. Md Yusof Abu Bakar and Puan
Salwati Badaroddin from Ministry of
Higher Education Malaysia (MOHE) for
their help with data gathering as well as
expert opinion.
REFERENCES
1. Han, J., Kamber, M.: Data Mining: Concepts
and Techniques. Morgan Kaufmann (2006)
2. Shafie, L.A, Nayan, S.: Employability
Awareness among Malaysian
Undergraduates. International Journal of
Business and Management, 5(8):119--123
(2010)
3. Mukhtar, M., Yahya, Y., Abdullah, S.,
Hamdan, A.R., Jailani, N., Abdullah, Z.:
Employability and Service Science: Facing
the Challenges via Curriculum Design and
Restructuring. In: International Conference
on Electrical Engineering and Informatics,
pp. 357--361 (2009)
4. Zaharim, A., Omar, M.Z., Yusoff, Y.M.,
Muhamad, N., Mohamed, A., Mustapha, R.:
Practical Framework of Employability Skills
for Engineering Graduate in Malaysia. In:
IEEE EDUCON Education Engineering
2010: The Future Of Global Learning
Engineering Education, pp. 921--927 (2010)
5. Rees, C., Forbes, P., Kubler, B.: Student
Employability Profiles: A Guide for Higher
Education Practitioners (2006)
6. Wook, M., Yahaya, Y.H., Wahab, N., Isa,
M.R.M.: Predicting NDUM Student’s
Academic Performance using Data Mining
Techniques. In: Second International
Conference on Computer and Electrical
Engineering, pp. 357--361 (2009)
7. Ogor, E.N.: Student Academic Performance
Monitoring and Evaluation Using Data
Mining Techniques. In: Fourth Congress of
Electronics, Robotics and Automotive
Mechanics, pp. 354--359 (2007)
8. Minaei-Bidgoli, B., Kashy, D.A.,
Kortemeyer, G., Punch, W.F.: Predicting
Student Performance: An Application of Data
Mining Methods with an Educational Web-
based System. In: 33rd Frontiers in Education
Conference, pp. 13--18 (2003)
9. Guruler, H., Istanbullu, A., Karahasan, M.: A
New Student Performance Analysing System
using Knowledge Discovery in Higher
Educational Databases. Computers &
Education. 55(1), pp 247--254 (2010)
10. Kumar, V., Chadha, A.: An Empirical Study
of the Applications of Data Mining
Techniques in Higher Education,
International Journal of Advanced Computer
Science and Applications, Vol. 2, No.3,
March 2011, pp 80-84 (2011)
11. Affendey, L.S., Paris, I.H.M., Mustapha, N.,
Sulaiman, M.N., Muda, Z.: Ranking of
Influencing Factors in Predicting Student
Academic Performance. Information
Technology Journal. 9(4):832--837 (2010)
12. Hall, M., Frank, E., Holmes, G., Pfahringer,
B., Reutemann, P., Witten, I.H.: The WEKA
Data Mining Software: An Update. SIGKDD
Explorations, Volume 11, Issue 1 (2009)
13. Witten, I.H., Frank, E.: Data Mining:
Practical Machine Learning Tools and
Techniques. Morgan Kaufmann (2005)
14. Jaynes, E.T.: Probability Theory: The Logic
of Science. Cambridge University Press
(2003)
15. Mitchell, T.: Machine Learning. McGraw
Hill, New York (1997)
16. Jiang, L., Zhang, H.: Weightily Averaged One-
Dependence Estimators. In: Proceedings of
the 9th Biennial Pacific Rim International
Conference on Artificial Intelligence,
PRICAI 2006, pp. 970--974 (2006)
