An efficient educational data mining approach to support
e-learning
Padmaja Appalla1 • Venu Madhav Kuthadi2 • Tshilidzi Marwala1
 Springer Science+Business Media New York 2016
Abstract E-learning is a recent development in the educational system, driven by the growth of information technology. Common challenges in an e-learning platform include the collection and annotation of learning materials, the organization of knowledge in a useful way, the retrieval and discovery of useful learning materials from the knowledge space, and the delivery of adaptive and personalized learning materials. To handle these challenges, the proposed system is developed in five steps: knowledge input, annotation of the learning materials, creation of the knowledge space, indexing of the learning materials using multi-dimensional knowledge and an XML structure to generate a knowledge grid, and retrieval of learning materials by matching the user query with the indexed database and ontology. The process is carried out in two modules, a server module and a client module. The proposed approach is evaluated using precision, recall and F-measure. Results are obtained by varying the keywords, the number of documents and the K-size. The proposed approach achieved an average precision of 0.81, an average recall of 1 and an average F-measure of 0.86 for K = 2.
Keywords E-learning · Data mining · Knowledge organization · XML · Ontology
1 Introduction
E-learning is becoming an increasingly prominent part of distance and continuing education [1]. The new generation of the web, known as the semantic web, has emerged as a promising technology for implementing and enhancing e-learning. It has also attracted the attention of business, industry, academia and researchers because of its utility and applications. It allows data to be organized in a machine-processable form, so that the current web, computers and people can work together. Semantic web technology can be applied in diverse domains, and e-learning is one of the areas likely to benefit from it [2]. In recent years, research on the semantic web has introduced new concepts for structuring web content in a way that is valuable to modern systems [3]. As a result, various techniques have been devised for building and extending the semantic web. A recent development is the Resource Description Framework (RDF) [4] and its extensions such as OWL [5], which are used to define metadata schemas, domain ontologies and resource descriptions.
 Padmaja Appalla
padmaja1074@gmail.com
1
Faculty of Engineering, University of Johannesburg,
Johannesburg, South Africa
2
Department of AIS, University of Johannesburg,
Johannesburg, South Africa
Wireless Netw
DOI 10.1007/s11276-015-1173-z
Since an e-learning environment requires sufficient data and learning materials in diverse forms, the need for semantic-based description of e-learning material, easy organization of the e-learning plan, and personalized delivery of e-learning material cannot be overemphasized [6, 7]. The vision of the semantic web is to express World Wide Web data in a common, recognized language that can be interpreted by intelligent agents, allowing them, on behalf of the human user, to find, share and integrate data in an automated manner. It provides a structure for dynamic, distributed and extensible structured information (ontology) founded on formal logic. An ontology is an explicit specification of a conceptualization using an agreed vocabulary, and it provides a rich set of constructs for building a more meaningful level of knowledge. Ontologies and their associated domains are the cornerstones of the semantic web effort [8]. The complexity associated with the number, variety and uses of computer-based artifacts requires a system design that lets intelligence disappear into the infrastructure of active spaces (such as buildings, shopping malls, theatres and homes) [9]. The challenges of analysing very large volumes of data did not appear suddenly; they have emerged gradually over many years, because generating data is easy while locating useful facts in the data is very hard. Web services technology holds considerable potential for realizing service-oriented architecture and its strategic objectives. In classifier applications, feature selection is the task of choosing a subset of the most relevant features and discarding irrelevant and redundant ones, so as to improve precision and shorten the model training time of the classifier. From the general perspective of data mining functionality, data mining is well suited to discovering interesting information in the large quantities of data stored in databases, data warehouses or other data repositories [10–13].
Education is one of the most significant application areas for multimedia. In the semantic e-learning scenario, multimedia [14] plays a key role comparable to that of a traditional textbook as a source of information. Building context-aware multimedia services in heterogeneous networks is still complex and time-consuming because of the heterogeneity of context-aware media contents and network conditions [15]. However, the ability to manipulate the text itself by means of an electronic device opens new ways for students to interact with the media, leading to techniques analogous to customary note taking. Multimedia technology can engage students in much the same way as a typical textbook. There are many and varied techniques for offering learning material to students in multimedia form. Videos and images are important resources in the educational data mining field, giving students ways of accessing information more intuitively and efficiently than text-based learning materials.
This paper presents an efficient educational data mining approach to support e-learning. The approach consists of knowledge input, annotation of learning materials, creation of a multi-dimensional knowledge representation, and retrieval of learning materials by matching the user query with the indexed database and ontology. The proposed approach consists of two modules, the server module and the client module. In the server module, the documents are read from the database and the corresponding knowledge representation is built. The module consists of several steps: the initial process, selection of unique words, K-means clustering and structure formation. In the client module, information is retrieved based on the user. The input query can be of two types: user details based and user interest based. Each class of method for developing student models has its advantages and disadvantages. In existing works, the time consumption is found to be very high, and information retrieval with the existing ontology method has performed poorly.
The rest of the paper is organized as follows: a brief review of research related to the proposed technique is presented in Sect. 2. The proposed approach is described in Sect. 3, and the detailed experimental results and discussion are given in Sect. 4. The conclusion is given in Sect. 5.
2 Review of related works
Many researchers have developed approaches for the e-learning environment. A handful of significant works are presented in this section. Lau et al. [16] proposed a concept map generation technique characterized by a context-sensitive text mining approach and a fuzzy domain ontology extraction algorithm. The system was able to automatically build concept maps from the messages posted to online chat rooms. By accessing the concept maps, the instructor could quickly review the progress of the students and adjust the pedagogical sequence on the fly.
Francesco and De Santo [17] proposed a method for ontology construction. They first discussed the role of ontologies in the context of e-learning. Their technique then used an ontological foundation for exploring the learning devices in order to customize learning. A test assessment of the method was carried out using real student records. In particular, the technique was incorporated in a tool for the evaluation of students during a learning phase. The evaluation, based on a Bayesian method, facilitated an improved assessment of student knowledge.
Tankeleviciene and Damasevicius [18] proposed a structure characterized by (semi-)autonomous data acquisition, learning and/or analysis, with a view to providing better services to students as a system quality feature. The structure extends the traditional e-learning mechanism with intelligence faculties. They developed an intelligent module of the e-learning system that can augment its local domain ontology with concepts and relationships gathered by querying a remote database.
Ferreira-Satler et al. [19] formulated an algorithm for the automatic generation of the ontology structure. The method was incorporated into a management tool for learning objects, AGORA, in which every user profile is constructed from the learning objects introduced by the user. Users of the tool confirmed that the automatically generated ontologies were broadly in line with their expectations.
In addition, Sankey et al. [20] reported the findings of an investigation into the effect of multiple versions on learning outcomes, along with the quality of student learning and engagement. The study found that multiple versions of data did not produce a significant improvement in learning performance, apart from a marginal effect. However, the students responded favourably to the multimodal learning modules, reporting that these helped their understanding and retention of the learning material.
Brut et al. [21] proposed a solution that extends the IEEE LOM standard with ontology-based semantic annotations so that learning objects can be used effectively outside learning management systems. The data brand corresponding to the method was presented first. An expanded indexing technique for the related brand expansion was then introduced with the aim of obtaining better interpretations of the learning materials. The technique developed and integrated two alternative techniques for the structure-based indexing of textual resources: the mathematical method of latent semantic indexing and linguistic-oriented WordNet-based text processing. This led to a better understanding of the reasons behind the good results produced by the former method, by means of the linguistically managed options suggested by the latter. The outcomes of the investigation are significant for the adoption of semantic web technologies in the e-learning field, and also as a step towards ontology-based indexing of textual materials.
Deng et al. [22] developed a multimedia data technique for e-learning. Their method required adaptable and reusable support for structuring multimedia content models, and enabled interactive transmission of streams of multimedia information such as audio, video, text and interpretations through system services. They assessed current standards and applications for multimedia document models such as HTML, MHEG, SMIL, HyTime, RealPlay and MS Windows Media, and concluded that these do not provide a sufficient basis for sophisticated reuse and adaptation. Therefore, they proposed a technique for structuring reusable and adaptable multimedia data. In addition, they devised a comprehensive approach to advanced multimedia data creation, including support for recording the presentation, retrieving the data, summarizing the presentation, weaving the presentation and adapting the presentation. Their technique improved multimedia presentation authoring in respect of both methodology and commercial features.
Rodicio and Sáncheza [23] designed a study to investigate whether the merits of human education hold up while avoiding the problems of earlier investigations. In one experiment, the participants studied geology from a multimedia model integrating one of three kinds of backup: human education, preserved backup or no backup. After studying the model, they completed retention and transfer tests. The outcomes showed that the participants in the human education condition outscored those in the other two conditions, which did not differ from one another.
Moreover, Lau et al. [24] developed an e-learning-specific multimedia method that gave students more control over their learning program and pace. The multimedia technique also provided students with different versions of the media tailored to their learning behavior, which improved their learning efficacy.
In 2013, Fernando et al. [25] conducted an extensive survey of mobile cloud computing research, highlighting the specific concerns in mobile cloud computing. They presented a taxonomy based on the key issues in this area and discussed the different approaches employed to tackle these issues.
3 Proposed educational data mining approach
to support e-learning
In this paper, an educational data mining technique to support e-learning is presented. E-learning has attracted considerable attention recently. The problems faced in e-learning relate to the acquisition and annotation of learning materials, the organization of the acquired materials and the retrieval of useful learning materials. The approach consists of the following steps: knowledge input (where learning materials such as text documents, videos and images are collected), annotation of the learning materials with metadata, creation of a multi-dimensional knowledge representation (using the tree structure, indexing, XML and the ontology), and retrieval of the learning materials by matching the user query with the indexed database and the ontology. The proposed approach consists of two modules, the server module and the client module. The block diagram of the proposed approach is given in Fig. 1.
3.1 Server module
In this module, the documents are read from the database
and the corresponding knowledge representation is made.
The process includes many steps which are detailed below:
3.1.1 Initial process
Initially, all the documents are collected and stored in the database. The documents include text documents, images and video files. Let the text documents be represented by X = {x1, x2, …, xNx}, images by Y = {y1, y2, …, yNy} and videos by V = {v1, v2, …, vNv}. Here Nx is the number of text documents under consideration, Ny is the number of images and Nv is the number of videos. The complete set of documents (the database) can be represented as D = {X, Y, V}, or in unified form as:
$$D = \{d_1, d_2, \ldots, d_{N_d}\}, \quad \text{where } d_i \in X, Y, V \text{ and } N_d = N_x + N_y + N_v \tag{1}$$
All the documents (di, where 0 < i ≤ Nd) are read and
selected for further processing. The block diagram of the
initial processing is given in Fig. 2.
3.1.2 Selection of unique words
Each of the text documents consists of the words which are
processed. In the case of the image and video documents,
the words in the title are processed. Let the document z
consist of the words represented by:
$$W_z = \{w_{z,1}, w_{z,2}, \ldots, w_{z,N_{wz}}\} \tag{2}$$
Here, Nwz is the total number of words in document z. In
each document, the frequency of all the words is found out.
The frequency of a word represents the number of times the
word appears in the respective document. Subsequently,
top ten words from each document are selected based on
the frequency count. Let the selected top ten words from
each document be represented as:
$$Sw_z = \{sw_{z,1}, sw_{z,2}, \ldots, sw_{z,10}\} \tag{3}$$
where, z represents the document under consideration.
Hence, the selected words are obtained from each docu-
ment. The block diagram for the selection of the unique
words is given in Fig. 3.
After the top ten words of each document have been found, the unique words common to the documents' top-ten lists are identified. Let the common unique words be represented by:
$$CUW = \{cuw_1, cuw_2, \ldots, cuw_{N_{cw}}\}, \quad cuw_i \in \{Sw_1, Sw_2, \ldots, Sw_d\} \tag{4}$$
Ncw represents the number of common unique words present in all the documents under consideration. Subsequently, the frequency of each common unique word is found. Let the frequency of the common unique word cuwz, denoted fre(cuwz), be defined by:
$$fre(cuw_z) = fre\left(Sw_1^{cuw_z}\right) + fre\left(Sw_2^{cuw_z}\right) + \cdots + fre\left(Sw_d^{cuw_z}\right) \tag{5}$$
where $fre\left(Sw_i^{j}\right)$ represents the frequency with which the word "j" appears in the top ten words of the document "i".
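One possible reading of Eq. (5) is sketched below in Python. It assumes that a common unique word is any distinct word occurring in at least one top-ten list, and that a word contributes zero for documents whose top-ten list does not contain it; both are interpretations, not statements from the paper. The function name `common_word_frequencies` is hypothetical.

```python
from collections import Counter

def common_word_frequencies(top_lists):
    """Sum each word's count over the top-ten lists of all documents.

    top_lists: one list of (word, count) pairs per document.
    Assumption: a word absent from a document's list contributes 0.
    """
    totals = Counter()
    for pairs in top_lists:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)

# Illustrative top-word lists for two documents (shortened for readability).
tops = [
    [("data", 3), ("mining", 2)],
    [("data", 1), ("xml", 4)],
]
frequencies = common_word_frequencies(tops)
```

The resulting frequencies are the values that would be passed on to the clustering step.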
3.1.3 K-means clustering and structure formation
K-means is a commonly used clustering algorithm in which the input data are grouped into K clusters. The grouping of data points into clusters depends on the centroid values. In the proposed technique, the frequencies of the common unique words are the input to K-means clustering, which groups all the documents into two clusters based on these frequencies (as K is taken as 2).
Let there be G data points, denoted by DP = {dp1, dp2, …, dpG}. Let the centroids be represented by ceni, where 0 < i ≤ k. The minimization function of the algorithm is given by:
Fig. 1 The block diagram of the proposed approach
Fig. 2 The block diagram of initial processing
$$\frac{1}{G}\sum_{j=1}^{G} \min_{i}\, dis^2\left(dp_j, cen_i\right) \tag{6}$$
where dis(dpj, ceni) is the Euclidean distance between data point dpj and centroid ceni. Hence the objective is to locate k cluster centroids such that the average squared Euclidean distance between a data point and its nearest cluster centroid is minimized. The steps of the K-means algorithm are:
1. Initialize k centroids, one for each cluster
2. Calculate the distance dis(dpj, ceni) between each of the k centroids and every data point dpj in Db
3. Allocate data point dpj to the cluster Cui whose centroid is nearest
4. Update the centroid values based on the membership of the new clusters
5. Repeat Steps 2 to 4 until no data point moves between clusters
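The five steps above can be sketched in plain Python as follows. This is a generic K-means loop, not the authors' Java implementation; seeding the centroids with the first k data points is an assumption, since the paper does not state how the centroids are initialized.

```python
import math

def kmeans(points, k=2, max_iter=100):
    """Cluster points (tuples of floats) into k groups.

    Returns (labels, centroids). Assumption: the first k points seed the centroids.
    """
    centroids = [tuple(p) for p in points[:k]]          # Step 1
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Steps 2 and 3: assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:                         # Step 5: nothing moved
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Illustrative one-dimensional frequencies, not taken from the paper.
sample = [(1.0,), (2.0,), (10.0,), (11.0,)]
labels, centroids = kmeans(sample, k=2)
```

With this sample the two low values end up in one cluster and the two high values in the other, which matches the K = 2 setting used in the proposed technique.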
Hence, after the clustering based on the common unique words, two clusters of documents are obtained (as k is taken as two). In each cluster, the frequency of the words is found, and the five most frequent words of each cluster form the topic of that cluster. If the clusters are represented by Cui, then the topic Topi of cluster Cui is represented as:
$$Top_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,5}\} \tag{7}$$
where xi,j is the jth most frequent word in the ith cluster. After the topics of all the clusters have been found, the selection of unique words and the K-means clustering are repeated for the largest cluster, that is, the cluster containing the most documents at this stage. Hence, after two iterations, three clusters exist in total; all three are compared to find the largest, and the process is carried out on it. The iteration is repeated an arbitrary number of times and finally results in a tree structure. The flow diagram of the process is given in Fig. 4.
The tree structure includes the document clusters, topics and levels. The results of the initial iteration form the top portion of the tree, and subsequently formed clusters and topics form the sub-trees. The number of sub-trees or levels depends on the number of iterations performed. The topic found at a sub-tree forms the sub-topic, which also consists of the top five words of the cluster based on the frequency count. A sample generated tree is shown in Fig. 5.
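The iterative splitting of the largest cluster can be sketched as below. The node layout (dictionaries with `docs`, `level` and `children`) and the `split` callback are hypothetical; in the proposed approach the split would be the word selection followed by K-means with K = 2, whereas the demo uses a trivial halving function so that the sketch stays self-contained. Topic extraction is omitted.

```python
def build_cluster_tree(docs, split, iterations):
    """Repeatedly split the largest leaf cluster into two children.

    split: callable taking a list of documents and returning two lists
           (assumption: stands in for word selection plus 2-means).
    """
    root = {"docs": list(docs), "level": 0, "children": []}
    leaves = [root]
    for _ in range(iterations):
        largest = max(leaves, key=lambda node: len(node["docs"]))
        if len(largest["docs"]) < 2:
            break  # nothing left to split
        left, right = split(largest["docs"])
        if not left or not right:
            break  # the split did not separate the documents
        children = [
            {"docs": part, "level": largest["level"] + 1, "children": []}
            for part in (left, right)
        ]
        largest["children"] = children
        leaves = [node for node in leaves if node is not largest] + children
    return root

def halve(docs):
    """Hypothetical stand-in splitter: cut the list in the middle."""
    mid = len(docs) // 2
    return docs[:mid], docs[mid:]

tree = build_cluster_tree(list("abcdefg"), halve, iterations=2)
```

After two iterations the tree has three leaf clusters, in line with the description above.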
Fig. 3 Block diagram for the
selection of unique words
Fig. 4 The flow diagram of the process
Fig. 5 Sample tree generation
In the above figure, the tree structure formed for a set of sample documents is shown. Here it is assumed that cluster A has more documents than cluster B, and cluster B has more documents than clusters C and D. For each cluster, the respective topic/sub-topic is found. The tree structure is then converted to XML format with the use of indexing. The XML file is generated with attributes such as the topic, sub-topic, level and document name. The structure of the XML file is shown below.
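As a hedged illustration of how such a file could be produced, the sketch below writes one XML element per document with the four attributes named above. The element names `knowledge` and `document` and the attribute spellings are hypothetical; the paper's actual schema is not reproduced in the text.

```python
import xml.etree.ElementTree as ET

def tree_to_xml(entries):
    """Serialize index entries to an XML string.

    entries: dicts with 'topic', 'subtopic', 'level' and 'name' keys.
    Assumption: element and attribute names are illustrative only.
    """
    root = ET.Element("knowledge")
    for entry in entries:
        ET.SubElement(root, "document", {
            "topic": entry["topic"],
            "subtopic": entry["subtopic"],
            "level": str(entry["level"]),
            "name": entry["name"],
        })
    return ET.tostring(root, encoding="unicode")

# Illustrative entry, not taken from the paper's data set.
xml_text = tree_to_xml([
    {"topic": "data mining", "subtopic": "clustering", "level": 2, "name": "doc1.txt"},
])
```

Keeping the level as an attribute lets the client module reconstruct the position of a document in the tree without storing the nesting itself.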
The XML file is processed together with a benchmark university ontology. An ontology characterizes information as a set of concepts within a domain, using shared terms to represent the types, properties and interrelationships of those concepts. Ontologies are structural frameworks for organizing information. The ontology here consists of information and data taken from the users. An ontology is created for each user from the attributes of the user, such as the major subject, course, preferred subject, languages known, computer knowledge and so on.
3.2 Client module
In this module, the information is retrieved based on the user. Initially, the XML data are read and stored for the text, images, videos and the benchmark university ontology based on the user. After the data are obtained, the user is asked for the input query. Let the user be represented by Usr, the corresponding ontology by Or, and the corresponding text, image and video documents by Xr, Yr and Vr. The retrieval of learning materials is done adaptively based on the user query and the ontology, which contains the personalized information of all the users. The input query can be of two types: user details based and user interest based.
3.2.1 User details based
Here, the information about the particular user is collected from the ontology, and the corresponding text, image and video documents are displayed. This is done by first collecting the information about the user and retrieving the topics and sub-topics of the user. These topics and sub-topics are then searched for in the indexed data to find the matching documents (be it text, image or video). That is, from the ontology Or for the user Usr, the topics are found first and the corresponding text, image and video documents (Xr, Yr and Vr) are retrieved.
3.2.2 User interest based
In this case, the user interest is given as input. Based on the user interest, the topics and sub-topics are found out to form the ontology. These topics and sub-topics are searched for in the indexed data to find the matching documents (be it text, image or video), and the corresponding text, image and video documents are displayed based on the ontology. Let the user interest be represented as Uir. That is, based on the user interest Uir, the corresponding text, image and video documents (Xr, Yr and Vr) are retrieved from the ontology Or for the user Usr. The flow diagram of the client module is given in Fig. 6.
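The matching step shared by both query types can be sketched as a lookup of topics against the indexed entries. The entry layout and the function name `retrieve` are hypothetical, and exact case-insensitive matching is an assumption; the paper does not describe the matching rule in detail.

```python
def retrieve(index, topics):
    """Return document names grouped by type for entries matching any topic.

    index: dicts with 'name', 'type' ('text', 'image' or 'video'),
           'topic' and 'subtopic' keys (assumed layout).
    topics: topics and sub-topics taken from the user ontology or interest.
    """
    wanted = {t.lower() for t in topics}
    results = {"text": [], "image": [], "video": []}
    for entry in index:
        keys = {entry["topic"].lower(), entry["subtopic"].lower()}
        if keys & wanted:
            results[entry["type"]].append(entry["name"])
    return results

# Illustrative index, not taken from the paper's data set.
index = [
    {"name": "intro.txt", "type": "text", "topic": "data", "subtopic": "mining"},
    {"name": "er.png", "type": "image", "topic": "database", "subtopic": "schema"},
    {"name": "kmeans.mp4", "type": "video", "topic": "mining", "subtopic": "clustering"},
]
found = retrieve(index, ["Mining"])
```

For the user details based query, `topics` would come from the stored ontology of the user; for the user interest based query, it would come from the interest entered at query time.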
4 Results and discussion
In this section, the results obtained for the proposed approach are given and analysed. In Sect. 4.1, the implementation details and the evaluation metrics employed are described. In Sect. 4.2, the implementation screen shots are presented, and in Sect. 4.3, the evaluation metric values obtained for the proposed approach are given.
Fig. 6 Flow diagram of the client module
4.1 Implementation details and evaluation metric
employed
The proposed technique is implemented in Java on a system with 6 GB RAM and a 2.9 GHz Intel i7 processor. Recall, precision and F-measure are used as the evaluation metrics. Intuitively, recall measures how well the approach locates all the relevant data for a query, and precision measures how well it rejects non-relevant data.
The definition of these parameters assumes that, for a
given function, there are two distinct sets of data such as the
retrieved and non-retrieved data (the latter representing the
rest of the data). This obviously applies to the results of a
Boolean search, but the same definition can also be used with
a ranked search, as explained later. If, in addition, the rele-
vance is assumed to be binary, then the results for a query
can be summarized. In Table 1, $P$ represents the relevant set of data for the query, $\bar{P}$ characterizes the non-relevant set, $Q$ corresponds to the set of the retrieved data, and $\bar{Q}$ relates to the set of non-retrieved data. The operator $\cap$ gives the intersection of the two sets. For example, $P \cap Q$ represents the set of data that are both relevant and retrieved.
The three parameters of particular interest are furnished
below.
$$\text{Recall } (R) = \frac{|P \cap Q|}{|P|}$$
$$\text{Precision } (E) = \frac{|P \cap Q|}{|Q|}$$
where $|\cdot|$ gives the size of the set under consideration. In
other words, the recall represents the proportion of the
relevant data that are retrieved, and the precision charac-
terizes the proportion of the retrieved data that are relevant.
The F-measure parameter is an efficiency parameter based
on the recall and precision which is used for evaluating the
classification performance and also for certain search
applications. It has the advantage of summarizing effec-
tiveness in a single number and is defined as the harmonic
mean of the recall and precision which is represented as
follows.
$$\text{F-measure } (F) = \frac{1}{\frac{1}{2}\left(\frac{1}{R} + \frac{1}{E}\right)} = \frac{2RE}{R + E}$$
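The three definitions can be checked with a short Python sketch that works directly on sets of document identifiers. The sets below are made up for illustration, and returning zero when a denominator is empty is a convention chosen here, not something the paper specifies.

```python
def precision_recall_f(relevant, retrieved):
    """Return (precision, recall, f_measure) for two sets of document ids."""
    hits = len(relevant & retrieved)                  # |P ∩ Q|
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * recall * precision / (recall + precision)
    return precision, recall, f_measure

# Illustrative sets: 11 relevant items, all retrieved, among 21 retrieved items.
relevant = set(range(11))
retrieved = set(range(21))
precision, recall, f_measure = precision_recall_f(relevant, retrieved)
```

Here the recall is 1 and the precision is 11/21, so the harmonic mean gives an F-measure of 22/32 = 0.6875.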
4.2 Implementation screen shots
In this section, the implementation screen shots of the proposed approach are given. The screenshots of the various stages include the first page, tree view, XML view, ontology, the two query types (user interest and user view), training result, authentication, retrieved documents, retrieved images and retrieved videos. The screen shots are given in Figs. 7, 8 and 9.
4.3 Results and analysis
In this section, the evaluation metric values obtained for
the proposed technique are given and discussed. The
Table 1 Relevant and retrieved documents

| | Relevant | Non-relevant |
|---|---|---|
| Retrieved | $P \cap Q$ | $\bar{P} \cap Q$ |
| Not retrieved | $P \cap \bar{Q}$ | $\bar{P} \cap \bar{Q}$ |
Fig. 7 First page of the
implementation
analysis is carried out in three phases of analysis based on
the keywords, number of documents and the K-value.
4.3.1 Analysis based on keywords
Inferences from Tables 2, 3, 4 and 5:
• Tables 2, 3 and 4 give the results values obtained for
the keywords data, database and the mining
respectively.
• The results are taken for the case K = 2.
• The results include the evaluation metric values of the precision, recall and F-measure.
• From the results, we can infer that the proposed
approach has attained good results by achieving high
evaluation metric values.
• Table 5 shows the average values obtained for various
keywords.
• It is seen that among the keywords, the novel approach
has worked best for the keyword data, achieving the
average precision of 0.84, average recall of 1 and the
average F-measure of 0.91.
Fig. 8 Overview of domain
ontology
Fig. 9 Training results
• Among all the values, the highest precision attained is
roughly 0.95, and the highest F-measure achieved is
approximately 0.97.
4.3.2 Analysis based on documents
In this section, analysis is carried out by finding the eval-
uation metrics based on the number of documents given as
input. Various document sizes taken for evaluation are 20,
40, 60, 80 and 100.
Inferences from Figs. 10, 11, 12 and 13:
Table 2 Results obtained for keyword: data

| Document files | Image files | Video files | Documents relevant | Documents retrieved | Images relevant | Images retrieved | Video relevant | Video retrieved | Precision | Recall | F-measure |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 5 | 5 | 15 | 6 | 4 | 3 | 2 | 2 | 0.52381 | 1 | 0.6875 |
| 40 | 10 | 10 | 32 | 31 | 5 | 4 | 5 | 5 | 0.952381 | 1 | 0.9756098 |
| 60 | 15 | 15 | 49 | 44 | 6 | 6 | 8 | 7 | 0.904762 | 1 | 0.95 |
| 80 | 20 | 20 | 73 | 69 | 13 | 10 | 10 | 9 | 0.916667 | 1 | 0.9565217 |
| 100 | 25 | 25 | 89 | 85 | 19 | 16 | 11 | 10 | 0.932773 | 1 | 0.9652174 |
Table 3 Results obtained for keyword: database

| Document files | Image files | Video files | Documents relevant | Documents retrieved | Images relevant | Images retrieved | Video relevant | Video retrieved | Precision | Recall | F-measure |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 5 | 5 | 6 | 3 | 1 | 1 | 1 | 1 | 0.625 | 1 | 0.7692308 |
| 40 | 10 | 10 | 20 | 12 | 4 | 4 | 1 | 1 | 0.68 | 1 | 0.8095238 |
| 60 | 15 | 15 | 37 | 32 | 4 | 4 | 1 | 1 | 0.880952 | 1 | 0.9367089 |
| 80 | 20 | 20 | 54 | 49 | 8 | 7 | 3 | 3 | 0.907692 | 1 | 0.9516129 |
| 100 | 25 | 25 | 75 | 68 | 12 | 10 | 7 | 7 | 0.904255 | 1 | 0.9497207 |
Table 4 Results obtained for keyword: mining
Document  Image  Video  Documents  Documents  Images    Images     Video     Video      Precision  Recall    F-measure
files     files  files  relevant   retrieved  relevant  retrieved  relevant  retrieved
20        5      5      3          2          2         1          1         1          1          0.666667  0.8
40        10     10     3          2          2         1          4         4          1          0.777778  0.875
60        15     15     9          7          2         1          6         6          1          0.823529  0.9032258
80        20     20     15         13         6         5          10        9          1          0.870968  0.9310345
100       25     25     21         18         7         6          13        12         1          0.878049  0.9350649
Table 5 Average values obtained for various keywords
Keyword   Average precision  Average recall  Average F-measure
Data      0.84               1               0.91
Database  0.79               1               0.88
Mining    0.80               1               0.80
• The results are taken for the case K = 2.
• Figures 10, 11 and 12 give the evaluation metric values
obtained for the keywords data, database and mining,
respectively.
• The proposed approach achieves high evaluation metric
values in all cases, irrespective of the number of
documents.
• Figure 13 shows the average values obtained for the
various keywords.
• From the figure, we can infer that the performance of
the proposed approach improves as the number of
documents increases. In our case, the best results are
achieved for 100 documents.
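The last two columns of Tables 2 to 4 can be reproduced from the counts with a short calculation. The sketch below is only an illustration under two assumptions that are not stated in the paper: that the varying score in each row is the total number of retrieved items divided by the total number of relevant items, and that the F-measure is the harmonic mean of that score and the constant score of 1. The names Row, retrieved_ratio and f_measure are hypothetical and are not taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Relevant and retrieved counts per media type for one table row."""
    docs_relevant: int
    docs_retrieved: int
    images_relevant: int
    images_retrieved: int
    videos_relevant: int
    videos_retrieved: int


def retrieved_ratio(row: Row) -> float:
    """Total retrieved items over total relevant items (assumed definition)."""
    relevant = row.docs_relevant + row.images_relevant + row.videos_relevant
    retrieved = row.docs_retrieved + row.images_retrieved + row.videos_retrieved
    return retrieved / relevant


def f_measure(a: float, b: float) -> float:
    """Harmonic mean of two scores; 0.0 when both scores are zero."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


# First two rows of Table 2 (keyword: data), counts copied from the table.
row_20 = Row(15, 6, 4, 3, 2, 2)
row_40 = Row(32, 31, 5, 4, 5, 5)

for row in (row_20, row_40):
    score = retrieved_ratio(row)
    # The other score is the constant 1 reported in every row of the tables.
    print(round(score, 6), round(f_measure(1.0, score), 7))
```

For the 20-document row of Table 2, the totals are 21 relevant and 11 retrieved items, so the ratio is 11/21 and the harmonic mean with 1 is 22/32, which agrees with the tabulated 0.52381 and 0.6875.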
4.3.3 Analysis based on K-value
Inferences from Table 6 and Fig. 14:
• Table 6 and Fig. 14 give the average evaluation metric
values obtained by varying the K-size.
• The K-sizes considered are 2, 3 and 4.
• The results show that the approach works well in all
cases, with the best results obtained for K = 2.
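The comparison across K-sizes can be checked directly against the values in Table 6. The sketch below is a minimal illustration, assuming that the best K-size is the one with the highest average F-measure (the paper does not name the selection criterion); TABLE_6, best_k and is_strictly_decreasing are hypothetical names introduced here.

```python
# Rows of Table 6: (K, average precision, average recall, average F-measure).
TABLE_6 = [
    (2, 0.81, 1.0, 0.86),
    (3, 0.78, 1.0, 0.84),
    (4, 0.74, 1.0, 0.79),
]


def best_k(rows):
    """Return the K whose row has the highest average F-measure."""
    return max(rows, key=lambda row: row[3])[0]


def is_strictly_decreasing(values):
    """True when each value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


print(best_k(TABLE_6))
print(is_strictly_decreasing([row[1] for row in TABLE_6]))  # precision
print(is_strictly_decreasing([row[3] for row in TABLE_6]))  # F-measure
```

With the tabulated values, both the average precision and the average F-measure fall as K grows from 2 to 4, while the average recall stays at 1.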
5 Conclusion
The paper presents an efficient educational data mining
approach to support e-learning. The proposed approach
consists of two modules: the server module and the
client module. In the server module, the documents are
read from the database and the corresponding knowledge
Fig. 10 Evaluation metric chart for varying number of documents for keyword: data (precision, recall and F-measure for 20, 40, 60, 80 and 100 documents)
Fig. 11 Evaluation metric chart for varying number of documents for keyword: database (precision, recall and F-measure for 20, 40, 60, 80 and 100 documents)
Fig. 12 Evaluation metric chart for varying number of documents for keyword: mining (precision, recall and F-measure for 20, 40, 60, 80 and 100 documents)
Fig. 13 Average evaluation metric chart for varying number of documents (precision, recall and F-measure for 20, 40, 60, 80 and 100 documents)
Table 6 Average evaluation metric values obtained for varying K-value
K value  Average precision  Average recall  Average F-measure
2        0.81               1               0.86
3        0.78               1               0.84
4        0.74               1               0.79
Fig. 14 Chart of average evaluation metric values obtained for
varying K-value
representation is made. In the client module, the information
is retrieved based on the user requirements. The proposed
approach is evaluated using precision, recall and F-measure.
Comprehensive results are obtained by varying the keywords,
the number of documents and the K-size. The proposed
approach yields strong results, with an average precision
of 0.81, an average recall of 1 and an average F-measure of
0.86 for K = 2.
References
1. Ghaleb, F. F. M., Daoud, S. S., Hasna, A. M., Jaam, J. M., & El-Sofany, H. F. (2006). A web-based e-learning system using semantic web framework. Journal of Computer Science, 2(8), 619–626.
2. Dutta, B. (2006). Semantic web based e-learning. In DRTC Conference on ICT for Digital Learning Environment, 11th–13th January, 2006.
3. Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 285(5), 34–44.
4. RDF. (2001). W3C. Semantic web activity: Resource description framework.
5. OWL. (2003). Web ontology language.
6. Šimić, G., Gasević, D., & Devedzić, V. (2004). Semantic web and intelligent learning management systems. In Workshop on Applications of Semantic Web Technologies for E-Learning.
7. Thyagharajan, K. K., & Nayak, R. (2007). Adaptive content creation for personalized e-learning using web services. Journal of Applied Sciences Research, 3(9), 828–836.
8. Alesso, H. P., & Smith, C. F. (2006). Thinking on the web: Berners-Lee, Gödel and Turing. London: Wiley.
9. Acampora, G., Gaeta, M., Loia, V., & Vasilakos, A. (2010). Interoperable and adaptive fuzzy services for ambient intelligence applications. Journal of ACM Transactions on Autonomous and Adaptive System (TAAS), 5(2), 8.
10. Tsai, C.-W., Lai, C.-F., Chao, H.-C., & Vasilakos, A. (2015). Big data analytics: A survey. Journal of Big Data, 2(21), 1–32.
11. Sheng, Q., Qiao, X., Vasilakos, A., Szabo, C., Bourne, S., & Xu, X. (2014). Web services composition: A decade’s overview. 280, 218–238.
12. Fong, S., Wong, R., & Vasilakos, A. (2014). Accelerated PSO swarm search feature selection for data stream mining big data. IEEE Transactions on Services Computing. doi:10.1109/TSC.2015.2439695.
13. Chen, F., Deng, P., Wan, J., Zhang, D., Vasilakos, A., & Rong, X. (2015). Data mining for the internet of things: Literature review and challenges. 50, 11–14.
14. Ando, M., & Ueno, M. (2008). Cognitive load reduction on multimedia e-learning materials. In Proceedings of IEEE International Conference on Advanced Learning Technologies, pp. 268–272.
15. Zhou, L., Naixue, X., Lei, S., Vasilakos, A., & Yeo, S.-S. (2010). Context-aware middleware for multimedia. Journal of Services in Heterogeneous Networks, 25(2), 40–47.
16. Lau, R. Y. K., Song, D., Li, Y., Cheung, T. C. H., & Hao, J.-X. (2009). Toward a fuzzy domain ontology extraction method for adaptive e-learning. IEEE Transactions on Knowledge and Data Engineering, 21(6).
17. Francesco, C., & De Santo, M. (2010). Ontology for e-learning: A Bayesian approach. IEEE Transactions on Education, 53(2), 223–233.
18. Tankeleviciene, L., & Damasevicius, R. (2010). Towards the development of genuine intelligent ontology-based e-learning systems. In 5th IEEE International Conference Intelligent Systems (IS), pp. 79–84.
19. Ferreira-Satler, M., Romero, F. P., Menendez, V. H., Zapata, A., & Prieto, M. E. (2010). A fuzzy ontology approach to represent user profiles in e-learning environments. In IEEE International Conference on Fuzzy Systems (FUZZ), pp. 1–8.
20. Sankey, M. D., Birch, D., & Gardiner, M. W. (2011). The impact of multiple representations of content using multimedia on learning outcomes across learning styles and modal preferences. International Journal of Education and Development Using Information and Communication Technology (IJEDICT), 7(3), 18–35.
21. Brut, M. M., Sedes, F., & Dumitrescu, S. D. (2011). A semantic-oriented approach for organizing and developing annotation for e-learning. IEEE Transactions on Learning Technologies, 4(3).
22. Deng, L. Y., Liu, Y.-J., Lee, D.-L., & Chen, Y.-H. (2013). Ontology-based multimedia adaptive learning system for u-learning. In Information Technology Convergence, Lecture Notes in Electrical Engineering, Vol. 253, pp. 669–676.
23. Rodicio, H. G., & Sáncheza, E. (2013). Aids to computer-based multimedia learning: A comparison of human tutoring and computer support. Interactive Learning Environments, 20(5).
24. Lau, R. W. H., Yen, N. Y., Li, F., & Wah, B. (2014). Recent development in multimedia e-learning technologies. World Wide Web, 17(2), 189–198.
25. Fernando, N., Loke, S., & Rahayu, W. (2013). Mobile cloud computing: A survey. Journal of Future Generation Computer Systems, 29, 84–106.
Padmaja Appalla obtained her Master’s degree in Education from the Open University, UK, and her Master’s degree in Business Administration from Andhra University, India. She also completed her Bachelor’s degree in Sciences at St. Joseph’s College in India. She holds various professional certifications, including PGDSM, MCSD, OCA and Java certification, to cite a few. Currently, she is pursuing her D.Phil. in e-learning at the University of Johannesburg. She has over 19 years of experience in the field of Information Technology and Education and is currently working as Deputy Pro Vice Chancellor Education at Botho University, Botswana.
Dr Venu Madhav Kuthadi is currently working at the University of Johannesburg. He obtained his Ph.D. degree in Computer Science from MU, India, and his Master’s degree in Computer Science from JNTU, India. He has 14 years of experience in research and in teaching undergraduate and postgraduate students of Engineering. He holds a B.Tech. in CSE from ANU, India. He has published a good number of articles in international journals and conference proceedings. Dr Kuthadi is an Editor for the international journal IJAEGT.
Professor Tshilidzi Marwala is a Deputy Vice Chancellor at the University of Johannesburg. He was previously the Executive Dean of the Faculty of Engineering and the Built Environment at the University of Johannesburg, the Head of Control and Systems Group and the Carl and Emily Fuchs Professor of Electrical Engineering at the University of the Witwatersrand, Executive Assistant to the Technical Director at the South African Breweries, Chair of the (Telkom) Local Loop Unbundling Committee, Deputy Chair of Limpopo Business Support Agency, director of the State Information Technology Agency Pty (Ltd), member of council of Statistics South Africa and member of council of the National Advisory Council on Innovation. He has been on the boards of City Power Johannesburg Pty (Ltd) and EOH Pty (Ltd). He holds a Bachelor of Science in Mechanical Engineering (magna cum laude) from Case Western Reserve University, a Master of Engineering from the University of Pretoria and a Ph.D. in Computational Intelligence from the University of Cambridge, and was a post-doctoral research associate at the University of London’s Imperial College of Science, Technology and Medicine. He has received over 40 awards, including the Order of Mapungubwe; has published over 150 articles in refereed international journals, conference proceedings and book chapters; and has successfully supervised over 33 master’s and Ph.D. students.
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 

An efficient educational data mining approach to support e-learning

  • 1. An efficient educational data mining approach to support e-learning
Padmaja Appalla1 • Venu Madhav Kuthadi2 • Tshilidzi Marwala1
Springer Science+Business Media New York 2016

Abstract The e-learning is a recent development that has emerged in the educational system due to the growth of the information technology. The common challenges involved in the e-learning platform include the collection and annotation of the learning materials, organization of the knowledge in a useful way, the retrieval and discovery of the useful learning materials from the knowledge space in a more significant way, and the delivery of the adaptive and personalized learning materials. In order to handle these challenges, the proposed system is developed using five different steps of knowledge input such as the annotation of the learning materials, creation of knowledge space, indexing of learning materials using the multi-dimensional knowledge and XML structure to generate a knowledge grid and the retrieval of learning materials performed by matching the user query with the indexed database and ontology. The process is carried out in two modules, the server module and the client module. The proposed approach is evaluated using the precision, recall and F-measure. Comprehensive results are achieved by varying the keywords, number of documents and the K-size. The proposed approach has yielded excellent results, with an average precision of 0.81, average recall of 1 and average F-measure of 0.86 for K = 2.

Keywords E-learning · Data mining · Knowledge organization · XML · Ontology

1 Introduction
A lodestar in the ever-growing horizon of the distance and continuing education, the e-learning is gradually conquering the cosmos with the charisma and consequent dynamism of a victorious king, making its presence felt everywhere [1].
The new generation of the web, known as the semantic web, has emerged as a promising technology for implementing and enhancing the e-learning. It has also drawn the attention of business leaders, industrial entrepreneurs, academics and researchers because of its utility and applications. It can coordinate data in machine-processable form and allows the current web, computers and experts to function hand in hand. The semantic web technology can be applied in diverse domains, and e-learning is one of the areas likely to gain manifold advantages from it [2]. In recent years, investigations on the semantic web have introduced ground-breaking concepts that give shape to a new configuration of web content of significant value to modern systems [3]. As a result, various techniques have been devised for building and expanding the semantic web. A recent innovation is the Resource Description Framework (RDF) [4] and its extensions such as OWL [5], used to define metadata schemas, domain ontologies and resource descriptions.

Padmaja Appalla, padmaja1074@gmail.com
1 Faculty of Engineering, University of Johannesburg, Johannesburg, South Africa
2 Department of AIS, University of Johannesburg, Johannesburg, South Africa
Wireless Netw, DOI 10.1007/s11276-015-1173-z
  • 2. Since the e-learning environment requires the supply of sufficient data and learning materials in diverse forms, the need for semantic-based description of the e-learning material, effortless streamlining of the e-learning plan, and personalized delivery of the e-learning material cannot be overemphasized [6, 7]. The vision of the semantic web is concerned with the capacity to express World Wide Web data in a common, recognized language that can be interpreted by intelligent agents, allowing them, on behalf of the human user, to find, share and combine data in an automated manner. It furnishes a new structure for dynamic, distributed and extensible structured information (ontology) built on formal logic. An ontology is an explicit specification of a conceptualization using an agreed vocabulary, and it furnishes a rich set of constructs for reaching a more meaningful level of knowledge. The ontologies and their associated domains are the cornerstones of the semantic web venture [8]. The complexity associated with the number, varieties and uses of computer-based artifacts requires a system design that lets intelligence disappear into the infrastructure of active spaces (such as buildings, shopping malls, theatres and homes) [9]. The challenges faced in the investigation of massive data have not appeared suddenly but have emerged gradually over many years. This is because generating data is easy, while locating useful facts in the data is a formidable task. In the current age of innovations, the web services technology holds immense potential for proficiently realizing the service-oriented architecture and its strategic objectives.
In classifier applications, feature selection is entrusted with short-listing a subset of the most relevant features by discarding the extraneous and superfluous ones, so as to significantly improve the precision and shorten the model training time of the classifier. From the general perspective of data mining functionality, data mining emerges as an ideal candidate for discovering interesting information in the massive quantity of data held in databases, data warehouses or other data repositories [10–13]. Education is one of the significant applications of multimedia, where it helps to deliver an easy and effective curriculum. In the semantic e-learning scenario, the multimedia method [14] plays a key role similar to that of a long-established textbook as a store of valuable data. Building context-aware multimedia services in heterogeneous networks is still a complex and time-consuming affair due to the heterogeneity in the context-aware media contents and network conditions [15]. However, the ability to manipulate the text itself by means of an electronic appliance opens a new way for students to interact with the media, leading to a technique analogous to customary note taking. The multimedia technology can engage learners much along the same lines as a typical textbook. Many and varied techniques exist for offering learning material to students in a multimedia design.
The videos and images are important resources in the educational data mining field, offering students ways of accessing data more intuitively and efficiently than text-based learning materials.

The paper presents an efficient educational data mining approach to support the e-learning. The approach consists of the knowledge input, annotation of learning materials, creation of a multi-dimensional knowledge representation and the retrieval of learning materials by matching the user query with the indexed database and ontology. The proposed approach consists of two modules, the server module and the client module. In the server module, the documents are read from the database and the corresponding knowledge representation is made. The module consists of several steps: the initial process, selection of unique words, K-means clustering and the structure formation. In the client module, the information is retrieved based on the user. The input query can be of two types: user details based and user interest based. There are several advantages and disadvantages of each class of method for developing the student models. In the existing works, the time consumption is found to be very high, and the existing ontology method has performed poorly in retrieving information. The rest of the paper is organized as follows: a brief review of research related to the proposed technique is presented in Sect. 2. A detailed view of the proposed approach appears in Sect. 3, and the detailed experimental results and discussion are given in Sect. 4. The conclusion is given in Sect. 5.

2 Review of related works
Many researchers have developed approaches for the e-learning environment. Among them, a handful of significant works are presented in this section. Lau et al.
[16] proposed a concept map generation technique characterized by a context-sensitive text mining approach and a fuzzy domain
  • 3. ontology extraction algorithm. The devised system was able to automatically build concept maps from the messages posted to online chat rooms. By accessing the concept maps, the instructor could quickly assess the progress of the students and fine-tune the pedagogical sequence on the fly. Francesco and De Santo [17] conceived a method for the ontology structure. They opened with a preliminary discussion of the role of ontologies in the context of e-learning. In addition, their technique provided an ontological foundation for exploring the learning devices to customize the learning. A test assessment of the method was then carried out using authentic student records. In particular, the technique was incorporated in a tool for evaluating students in the course of a learning phase, where an evaluation based on the Bayesian method facilitated an advanced assessment of student knowledge. Tankeleviciene and Damasevicius [18] proposed a structure distinguished by (semi-)autonomous data acquisition, learning and/or analysis, with a view to supplying superior services to the students as a system quality feature. They proposed extending the traditional e-learning system with intelligent capabilities, and built an intelligent module of the e-learning system that can augment its local domain ontology with concepts and relations gathered by querying a remote data base. Ferreira-Satler et al. [19] formulated an algorithm that facilitated the automatic generation of the ontology structure.
This method was incorporated into a management tool for learning objects, AGORA, where every user profile was constructed from the learning objects introduced by the user. Those who used the tool confirmed that the automatically generated ontologies were broadly in line with their expectations. In addition, Sankey et al. [20] reported the conclusions of an investigation assessing the effect of multiple versions on learning outcomes, along with student learning quality and engagement. The study found that multiple versions of data did not produce a significant improvement in learning performance, apart from a marginal gain. However, the students responded enthusiastically to the multimodal learning modules, claiming that they benefited from deeper understanding and better retention of the learning material. Brut et al. [21] brought forward a solution to extend the IEEE LOM standard with ontology-based semantic annotations for the effective application of learning objects outside learning management systems. The data brand analogous to the corresponding method was offered first. The expanded indexing technique for the related brand expansion was then introduced with the aim of achieving better interpretations of the learning materials. The technique developed and integrated two alternative techniques for the structure-based indexing of textual resources: the mathematical method of latent semantic indexing and the linguistic-oriented WordNet-based text processing.
This led to a better understanding of the underlying causes of the good outcomes produced by the former method, by means of the linguistically managed options suggested by the latter technique. The outcomes of the investigation are significant not only for the adoption of semantic web technologies in the e-learning field, but also as indicators of progress towards the ontology-based indexing of textual materials. Deng et al. [22] developed a multimedia data technique for the e-learning. Their method required adjustable and reusable support for structuring multimedia content models and also enabled the interactive transmission of streams of multimedia information such as audio, video, text and annotations by the use of system services. They assessed the current standards and applications for multimedia document models, such as HTML, MHEG, SMIL, HyTime, RealPlay and MS Windows Media, and found that these did not provide a sufficient basis for sophisticated reuse and modification. Therefore, they introduced a technique for structuring reusable and modifiable multimedia data. In addition, they conceived a comprehensive approach for advanced multimedia data creation, including support for recording the presentation, retrieving the data, summarizing the presentation, weaving the presentation and adapting the presentation. Their technique had an appreciable effect and improved multimedia presentation authoring functions in respect of methodology and commercial features. Rodicio and Sánchez [23] designed a method to investigate whether the merits of the human education really survived while avoiding the issues of the earlier investigation.
In one investigation, the participants studied geology from a multimedia model integrating one of three kinds of backup: the human education, preserved backup
  • 4. or no backup. After studying the model, they took retention and transfer tests. The outcomes showed that the participants in the human education condition outscored those in the other two conditions, which did not differ from one another. Moreover, Lau et al. [24] developed an e-learning specific multimedia method. They gave students more control over their learning program and pace. In addition, the multimedia technique supplied students with diverse versions of the media tailored to their learning behavior, which improved their learning efficacy. In 2013, Fernando et al. [25] conducted an extensive survey of mobile cloud computing research, highlighting the specific concerns in mobile cloud computing. They presented a taxonomy based on the key issues in this area and discussed the different approaches employed to tackle these issues.

3 Proposed educational data mining approach to support e-learning
In this paper, an educational data mining technique to support the e-learning is presented. The e-learning has lately attracted considerable attention. The problems faced in the e-learning relate to the acquisition and annotation of learning materials, organization of the acquired materials and the retrieval of the useful learning materials. The approach consists of the following steps: knowledge input (where learning materials such as text documents, videos and images are collected), annotation of learning materials with metadata, creation of a multi-dimensional knowledge representation (using the tree structure, indexing, XML and the ontology) and the retrieval of the learning materials by matching the user query with the indexed database and the ontology. The proposed approach consists of two modules, the server module and the client module.
The block diagram of the proposed approach is given in Fig. 1.

3.1 Server module
In this module, the documents are read from the database and the corresponding knowledge representation is made. The process includes the steps detailed below.

3.1.1 Initial process
Initially, all the documents are collected and stored in the database. The documents include text documents, images and video files. Let the text documents be represented by $X = \{x_1, x_2, \ldots, x_{N_x}\}$, the images by $Y = \{y_1, y_2, \ldots, y_{N_y}\}$ and the videos by $V = \{v_1, v_2, \ldots, v_{N_v}\}$. Here $N_x$ is the number of text documents under consideration, $N_y$ is the number of images and $N_v$ is the number of videos. The total set of documents (the database) can be represented as $D = \{X, Y, V\}$, or in unified form as:

$$D = \{d_1, d_2, \ldots, d_{N_d}\}, \quad \text{where } d_i \in X, Y, V \text{ and } N_d = N_x + N_y + N_v \tag{1}$$

All the documents ($d_i$, where $0 < i \le N_d$) are read and selected for further processing. The block diagram of the initial processing is given in Fig. 2.

3.1.2 Selection of unique words
Each text document consists of words, which are processed. In the case of the image and video documents, the words in the title are processed. Let the document $z$ consist of the words represented by:

$$W_z = \{w_{z,1}, w_{z,2}, \ldots, w_{z,N_{wz}}\} \tag{2}$$

Here, $N_{wz}$ is the total number of words in document $z$. In each document, the frequency of every word is found. The frequency of a word is the number of times the word appears in the respective document. Subsequently, the top ten words of each document are selected based on the frequency count. Let the selected top ten words of each document be represented as:

$$Sw_z = \{sw_{z,1}, sw_{z,2}, \ldots, sw_{z,10}\} \tag{3}$$

where $z$ represents the document under consideration. Hence, the selected words are obtained from each document. The block diagram for the selection of the unique words is given in Fig. 3.
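The top-ten selection described above can be sketched in a few lines. This is only an illustration in Python (the paper's own implementation is in Java), and the tokenization rule, the absence of stop-word filtering and the function name `top_words` are assumptions, since the text does not specify them.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words of a document as (word, count) pairs.

    Assumption: a word is a run of lowercase letters or digits; the paper
    does not state how documents are tokenized or whether stop words are
    removed.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(words).most_common(n)


# For image and video documents, the same function would be applied to the title.
print(top_words("data mining data base data mining", n=2))
```

Ties are resolved in order of first appearance, which is how `Counter.most_common` behaves in current Python versions.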
After finding the top ten words for each document, the common unique words to all the documents which come in the top ten are found. Let the common unique words be represented by:

$$CUW = \{cuw_1, cuw_2, \ldots, cuw_{N_{cw}}\}, \quad cuw_i \in \{Sw_1, Sw_2, \ldots, Sw_d\} \tag{4}$$

$N_{cw}$ represents the number of common unique words present in all the documents under consideration. Subsequently, the frequency of each detected common unique word is found. Let the frequency of the common unique word $cuw_z$, denoted by $fre(cuw_z)$, be defined by:

$$fre(cuw_z) = fre^{Sw_{cuw_z}}_{1} + fre^{Sw_{cuw_z}}_{2} + \cdots + fre^{Sw_{cuw_z}}_{d} \tag{5}$$

where $fre^{Sw_j}_{i}$ represents the frequency with which the word $j$ appears in the top ten words of the document $i$.
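As a rough sketch of Eq. (5), the per-document top-ten lists can be pooled and the counts of each word summed over the documents. The function name `common_unique_words` is hypothetical, and the sketch assumes that the pool is the union of the top-ten lists, with a word contributing zero for any document whose list does not contain it; the text does not state this explicitly.

```python
from collections import Counter


def common_unique_words(top_lists):
    """Sum each word's frequency over the per-document top-word lists.

    top_lists: one list of (word, count) pairs per document.
    Assumption: the pool is the union of all lists, and a word absent from
    a document's list contributes 0 for that document.
    """
    totals = Counter()
    for pairs in top_lists:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


doc1 = [("data", 3), ("mining", 2)]
doc2 = [("data", 1), ("xml", 4)]
print(common_unique_words([doc1, doc2]))
```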
  • 5. 3.1.3 K-means clustering and structure formation
K-means is a commonly used clustering algorithm in which the input data are grouped into K clusters. The grouping of the data points into clusters depends on the centroid values. In the proposed technique, the frequency of the common unique words is the input to the K-means clustering, which clusters the documents into two clusters based on this frequency (as K is taken as 2). Let there be G data points, denoted by $DP = \{dp_1, dp_2, \ldots, dp_G\}$. Let the centroids be represented by $cen_i$, where $0 < i \le k$.

Fig. 1 The block diagram of the proposed approach
Fig. 2 The block diagram of initial processing

  • 6. The minimization function of the algorithm is given by:

$$\frac{1}{G} \sum_{j=1}^{G} \min_{i} \, dis^2(dp_j, cen_i) \tag{6}$$

where $dis(dp_j, cen_i)$ is the Euclidean distance between data point $dp_j$ and centroid $cen_i$. Hence the objective is to locate k cluster centroids such that the average squared Euclidean distance between a data point and its nearest cluster centroid is minimized. The steps involved in the K-means algorithm are:
1. Initialize k centroids, so as to have one centroid for each cluster.
2. Calculate the distance $dis(dp_j, cen_i)$ of each of the k centroids from the data points $dp_j$ in DP.
3. Allocate data point $dp_j$ to the cluster $Cu_i$ whose distance is least compared to the other clusters.
4. Update the centroid values based on the membership of the new clusters.
5. Repeat Steps 2 to 4 until there is no movement of the data points among the clusters.

Hence, after the clustering based on the common unique words, two clusters of documents are obtained (as k is taken as two). In each cluster, the frequency of the words is found. The top five most frequent words of each cluster form the topic of the respective cluster. If the clusters are represented by $Cu_i$, then the topic $Top_i$ of the cluster is represented as:

$$Top_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,5}\} \tag{7}$$

where $x_{i,j}$ is the jth most frequent word in the ith cluster. After finding the topic for all the clusters, the selection of the unique words and the K-means clustering are repeated for the cluster of maximum size. That is, the cluster having the most documents at this stage is selected and processed through the steps again. Hence, after two iterations of the steps, three clusters are obtained in total; these three clusters are then compared to find the largest, and the process is carried out on the largest cluster. The flow diagram of the process is given in Fig. 4. The iteration is carried out an arbitrary number of times to finally result in a tree structure.
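The five K-means steps listed above can be sketched as follows. This is a minimal plain-Python illustration rather than the authors' Java implementation; the function name `kmeans`, the choice of the first k points as initial centroids and the iteration cap are assumptions, since the text does not say how the centroids are initialized.

```python
def kmeans(points, k=2, max_iter=100):
    """Cluster points with the basic K-means loop; returns (labels, centroids).

    points: list of equal-length numeric tuples.
    Assumption: the centroids start at the first k points.
    """
    # Step 1: one initial centroid per cluster
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Steps 2-3: assign each point to the centroid at the smallest
        # squared Euclidean distance
        new_labels = [
            min(range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            for p in points
        ]
        # Step 5: stop once no point moves between clusters
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Four one-dimensional frequency values split into two clusters (K = 2).
labels, centroids = kmeans([(1.0,), (2.0,), (10.0,), (11.0,)], k=2)
print(labels, centroids)
```

With this initialization the result depends on the order of the input points; a production implementation would normally use several random restarts or a seeding scheme such as k-means++.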
The tree structure includes the document clusters, topics and levels. The results of the initial iteration constitute the top portion of the tree structure, and the subsequently formed clusters and topics form the sub-trees. The number of sub-trees or levels depends on the number of iterations performed. The topic found at a sub-tree forms the sub-topic, which also consists of the top five words of the cluster based on the frequency count. A sample tree generated is shown in Fig. 5.

Fig. 3 Block diagram for the selection of unique words
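The iterative construction of the tree can be illustrated with a small sketch that repeatedly splits the largest leaf cluster and labels every node with its five most frequent words. The names `topic` and `build_tree`, the dictionary-based node layout and the `split` callable (standing in for a K-means step with K = 2) are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def topic(docs, n=5):
    """Return the n most frequent words over all documents of a cluster."""
    counts = Counter(word for words in docs.values() for word in words)
    return [word for word, _ in counts.most_common(n)]


def build_tree(docs, split, iterations):
    """Build a cluster tree by repeatedly splitting the largest leaf in two.

    docs: {document name: list of words}.
    split: hypothetical callable that takes such a dict and returns two
           dicts, for example a K-means step with K = 2.
    """
    root = {"level": 0, "docs": docs, "topic": topic(docs), "children": []}
    leaves = [root]
    for _ in range(iterations):
        node = max(leaves, key=lambda leaf: len(leaf["docs"]))
        if len(node["docs"]) < 2:
            break  # nothing left to split
        leaves = [leaf for leaf in leaves if leaf is not node]
        for part in split(node["docs"]):
            child = {"level": node["level"] + 1, "docs": part,
                     "topic": topic(part), "children": []}
            node["children"].append(child)
            leaves.append(child)
    return root


def halve(docs):
    """Toy splitter used only for the demonstration: halves by sorted name."""
    names = sorted(docs)
    mid = len(names) // 2
    return {n: docs[n] for n in names[:mid]}, {n: docs[n] for n in names[mid:]}


sample = {"a": ["x"], "b": ["x"], "c": ["y"], "d": ["y"]}
tree = build_tree(sample, halve, iterations=2)
print(tree["topic"], [child["topic"] for child in tree["children"]])
```

Two iterations leave three leaf clusters, which matches the count given in the text.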
  • 7. Fig. 4 The flow diagram of the process
Fig. 5 Sample tree generation

In the above figure, the tree structure formed for a set of sample documents is shown. Here it is assumed that cluster A has more documents than cluster B, and cluster B has more documents than clusters C and D. For each of the clusters, the respective topic/sub-topic is found. The tree structure is then processed to XML
  • 8. format with the use of indexing. The XML file is generated with attributes such as the topic, sub-topic, level and document name. The structure of the XML file is shown below.

The XML file is processed with one benchmark university ontology. The ontology characterizes the information as a set of concepts within a domain, using shared terms to represent the types, properties and interrelationships of those concepts. Ontologies are the structural frameworks for organizing the information. The ontology consists of the information and data taken from the users. An ontology is created for each user from the attributes of the user, such as the major subject, course, preferred subjects, languages known, computer knowledge and so on.

3.2 Client module
In this module, the information is retrieved based on the user. Initially, the XML data is read and stored for the text, images, videos and benchmark university ontology based on the user. Subsequently, after obtaining the data, the user is asked for the input query. Let the user be represented by Usr, the corresponding ontology by Or, and the corresponding text, image and video documents by Xr, Yr and Vr. The retrieval of learning materials for the users is done adaptively, based on the user query and the ontology, which contains the personalized information of all the users. The input query can be of two types: user details based and user interest based.

3.2.1 User details based
Here, the information about the particular user is collected from the ontology. Based on the retrieval, the information is displayed, giving the corresponding text, image and video documents. This is done by first collecting the information about the user and retrieving the topics and sub-topics of the user. These collected topics and sub-topics are searched in our work to find the matching documents (be it text, image or video).
That is, from the ontology Or for the user Usr, the topics are found first and the corresponding text, image and video documents (Xr, Yr and Vr) are retrieved.

3.2.2 User interest based
In this case, the user interest is given as input. Based on the user interest, the topics and sub-topics are found to form the ontology. These collected topics and sub-topics are searched in our work to find the matching documents (be it text, image or video). The corresponding text, image and video documents are displayed based on the ontology. Let the user interest be represented as Uir. That is, based on the user interest Uir, the corresponding text, image and video documents (Xr, Yr and Vr) are retrieved from the ontology Or for the user Usr. The flow diagram of the client module is given in Fig. 6.

Fig. 6 Flow diagram of the client module

4 Results and discussion
In this section, the results obtained for the proposed approach are given and analysed. In Sect. 4.1, the implementation details and evaluation metric employed are
  • 9. offered. In Sect. 4.2, the implementation screen shots are presented and in Sect. 4.3, the evaluation metric values obtained for the proposed approach are given.

4.1 Implementation details and evaluation metric employed
The proposed technique is implemented in JAVA on a system having 6 GB RAM and a 2.9 GHz Intel i7 processor. The recall, precision and F-measure are used as the evaluation metrics. Intuitively, the recall measures how well the approach performs at locating all the relevant data for a query, and the precision measures how well it performs at rejecting non-relevant data. The definition of these parameters assumes that, for a given query, there are two distinct sets of data: the retrieved and the non-retrieved data (the latter representing the rest of the data). This obviously applies to the results of a Boolean search, but the same definition can also be used with a ranked search, as explained later. If, in addition, the relevance is assumed to be binary, then the results for a query can be summarized. In Table 1, $P$ represents the relevant set of data for the query, $\bar{P}$ the non-relevant set, $Q$ the set of retrieved data, and $\bar{Q}$ the set of non-retrieved data. The operator $\cap$ gives the intersection of two sets. For example, $P \cap Q$ represents the set of data that are both relevant and retrieved. The three parameters of particular interest are furnished below.

$$\text{Recall } (R) = \frac{|P \cap Q|}{|P|}$$

$$\text{Precision } (E) = \frac{|P \cap Q|}{|Q|}$$

where $|\cdot|$ gives the size of the set under consideration. In other words, the recall represents the proportion of the relevant data that are retrieved, and the precision is the proportion of the retrieved data that are relevant. The F-measure parameter is an efficiency parameter based on the recall and precision, which is used for evaluating the classification performance and also for certain search applications.
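As a small worked illustration of the recall and precision defined above, both can be computed directly from the relevant set P and the retrieved set Q. The function name `evaluate` and the example document IDs are hypothetical; the F-measure is computed here as the harmonic mean of recall and precision, which is its standard definition.

```python
def evaluate(relevant, retrieved):
    """Return (recall, precision, f_measure) for two sets of item IDs."""
    hits = len(relevant & retrieved)  # |P ∩ Q|
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    # Harmonic mean of recall and precision; defined as 0 when nothing matches
    f_measure = 2 * recall * precision / (recall + precision) if hits else 0.0
    return recall, precision, f_measure


# Hypothetical example: four relevant items, three retrieved, two in common.
print(evaluate({1, 2, 3, 4}, {1, 2, 5}))
```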
The F-measure has the advantage of summarizing effectiveness in a single number and is defined as the harmonic mean of the recall and precision, which is represented as follows.

$$\text{F-measure } (F) = \frac{1}{\frac{1}{2}\left(\frac{1}{R} + \frac{1}{E}\right)} = \frac{2RE}{R + E}$$

Table 1 Relevant and retrieved documents

               Relevant            Non-relevant
Retrieved      $P \cap Q$          $\bar{P} \cap Q$
Not retrieved  $P \cap \bar{Q}$    $\bar{P} \cap \bar{Q}$

4.2 Implementation screen shots

In this section, the implementation screenshots of the proposed approach are given. The screenshots of the various stages given here include the first page, tree view, XML view, ontology, the two retrieval types (user interest and user view), training result, authentication, retrieved documents, retrieved images and retrieved videos. The screenshots are given in Figs. 7, 8 and 9.

Fig. 7 First page of the implementation

4.3 Results and analysis

In this section, the evaluation metric values obtained for the proposed technique are given and discussed. The
analysis is carried out in three phases, based on the keywords, the number of documents and the K-value.

Fig. 8 Overview of domain ontology

Fig. 9 Training results

4.3.1 Analysis based on keywords

Inferences from Tables 2, 3, 4 and 5:
• Tables 2, 3 and 4 give the result values obtained for the keywords data, database and mining, respectively.
• The results are taken for the case K = 2.
• The results include the evaluation metric values of the precision, recall and F-measure.
• From the results, we can infer that the proposed approach has attained good results by achieving high evaluation metric values.
• Table 5 shows the average values obtained for various keywords.
• It is seen that, among the keywords, the proposed approach has worked best for the keyword data, achieving an average precision of 0.84, an average recall of 1 and an average F-measure of 0.91.
• Among all the values, the highest precision attained is roughly 0.95, and the highest F-measure achieved is approximately 0.97.

4.3.2 Analysis based on documents

In this section, analysis is carried out by finding the evaluation metrics based on the number of documents given as input.

Inferences from Figs. 10, 11, 12 and 13:
• The document sizes taken for evaluation are 20, 40, 60, 80 and 100.

Table 2 Results obtained for keyword: data

Document  Image  Video  Documents  Documents  Images    Images     Video     Video      Precision  Recall    F-measure
files     files  files  relevant   retrieved  relevant  retrieved  relevant  retrieved
20        5      5      15         6          4         3          2         2          1          0.52381   0.6875
40        10     10     32         31         5         4          5         5          1          0.952381  0.9756098
60        15     15     49         44         6         6          8         7          1          0.904762  0.95
80        20     20     73         69         13        10         10        9          1          0.916667  0.9565217
100       25     25     89         85         19        16         11        10         1          0.932773  0.9652174

Table 3 Results obtained for keyword: database

Document  Image  Video  Documents  Documents  Images    Images     Video     Video      Precision  Recall    F-measure
files     files  files  relevant   retrieved  relevant  retrieved  relevant  retrieved
20        5      5      6          3          1         1          1         1          1          0.625     0.7692308
40        10     10     20         12         4         4          1         1          1          0.68      0.8095238
60        15     15     37         32         4         4          1         1          1          0.880952  0.9367089
80        20     20     54         49         8         7          3         3          1          0.907692  0.9516129
100       25     25     75         68         12        10         7         7          1          0.904255  0.9497207

Table 4 Results obtained for keyword: mining

Document  Image  Video  Documents  Documents  Images    Images     Video     Video      Precision  Recall    F-measure
files     files  files  relevant   retrieved  relevant  retrieved  relevant  retrieved
20        5      5      3          2          2         1          1         1          1          0.666667  0.8
40        10     10     3          2          2         1          4         4          1          0.777778  0.875
60        15     15     9          7          2         1          6         6          1          0.823529  0.9032258
80        20     20     15         13         6         5          10        9          1          0.870968  0.9310345
100       25     25     21         18         7         6          13        12         1          0.878049  0.9350649

Table 5 Average values obtained for various keywords

Keyword   Average precision  Average recall  Average F-measure
Data      0.84               1               0.91
Database  0.79               1               0.88
Mining    0.80               1               0.80
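The sub-unity score and the F-measure in each row of Tables 2, 3 and 4 can be re-derived from the tabulated counts. The sketch below is an illustration rather than the authors' code. It assumes that the counts of the three media types are pooled, that the score is the ratio of pooled retrieved items to pooled relevant items, and that the other metric equals 1 as tabulated. The function names are hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall, 2RE / (R + E)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def row_scores(relevant_counts, retrieved_counts, unit_metric=1.0):
    """Pool per-media counts and return (ratio, F-measure) for one table row.

    relevant_counts and retrieved_counts are (documents, images, videos).
    unit_metric is the metric tabulated as 1 (an assumption of this sketch).
    """
    ratio = sum(retrieved_counts) / sum(relevant_counts)
    return ratio, f_measure(unit_metric, ratio)


# First two rows of Table 2 (keyword: data): relevant and retrieved counts.
rows = [
    ((15, 4, 2), (6, 3, 2)),    # 20 document files
    ((32, 5, 5), (31, 4, 5)),   # 40 document files
]
scores = [row_scores(relevant, retrieved) for relevant, retrieved in rows]
for ratio, f in scores:
    print(f"{ratio:.6f} {f:.7f}")
```

Under these assumptions the two rows give 0.523810 and 0.6875000, then 0.952381 and 0.9756098, which agree with the values tabulated in Table 2. Because the F-measure is symmetric in its two arguments, the check does not depend on which of the two metrics is the one equal to 1.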
• The results are taken for the case K = 2.
• Figures 10, 11 and 12 give the evaluation metric values obtained for the keywords data, database and mining, respectively.
• The proposed approach achieves high evaluation metric values for all the cases, irrespective of the number of documents.
• Figure 13 shows the average values obtained for various keywords.
• From the figure, we can infer that the proposed approach works well as the number of documents increases. The best results have been achieved for the document size of 100 in our case.

Fig. 10 Evaluation metric chart for varying number of documents for keyword: data

Fig. 11 Evaluation metric chart for varying number of documents for keyword: database

Fig. 12 Evaluation metric chart for varying number of documents for keyword: mining

Fig. 13 Average evaluation metric chart for varying number of documents

4.3.3 Analysis based on K-value

Inferences from Table 6 and Fig. 14:
• Table 6 and Fig. 14 give the average evaluation metric values obtained by varying the K-size.
• The various K-sizes taken into consideration are 2, 3 and 4.
• From the results, we can see that the approach has worked well for all the cases and that the best results have been obtained for K = 2.

Table 6 Average evaluation metric values obtained for varying K-value

K value  Average precision  Average recall  Average F-measure
2        0.81               1               0.86
3        0.78               1               0.84
4        0.74               1               0.79

Fig. 14 Chart of average evaluation metric values obtained for varying K-value

5 Conclusion

The paper presents an efficient educational data mining approach to support e-learning. The proposed approach consists of two modules: the server module and the client module. In the server module, the documents are read from the database and the corresponding knowledge
representation is made. In the client module, the information is retrieved based on the user requirements. The proposed approach is evaluated using the precision, recall and F-measure. Comprehensive results are obtained by varying the keywords, the number of documents and the K-size. The proposed approach has yielded high evaluation metric values, as exemplified by the average precision of 0.81, average recall of 1 and average F-measure of 0.86 for K = 2.

Padmaja Appalla obtained her Master's Degree in Education from the Open University, UK, and her Master's Degree in Business Administration from Andhra University, India. She also completed her Bachelor's Degree in Sciences at St. Joseph's College in India. She holds various professional certifications, including PGDSM, MCSD, OCA and Java certification, to cite a few. Currently, she is pursuing her D.Phil. in e-learning at the University of Johannesburg. She has over 19 years of experience in the field of Information Technology and Education and is currently working as Deputy Pro Vice Chancellor, Education, at Botho University, Botswana.
Dr Venu Madhav Kuthadi is currently working with the University of Johannesburg. He obtained his Ph.D. Degree in Computer Science from MU, India, and received his Master's Degree in Computer Science from JNTU, India. He has 14 years of experience in research and in teaching undergraduate and postgraduate students of Engineering. He holds a B.Tech. in CSE from ANU, India. He has published a good number of articles in international journals and conference proceedings. Dr Kuthadi is an Editor for the international journal IJAEGT.

Professor Tshilidzi Marwala is a Deputy Vice Chancellor at the University of Johannesburg. He was previously the Executive Dean of the Faculty of Engineering and the Built Environment at the University of Johannesburg, the Head of Control and Systems Group and the Carl and Emily Fuchs Professor of Electrical Engineering at the University of the Witwatersrand, Executive Assistant to the Technical Director at the South African Breweries, Chair of the (Telkom) Local Loop Unbundling Committee, Deputy Chair of Limpopo Business Support Agency, director of the State Information Technology Agency Pty (Ltd), member of council of Statistics South Africa and member of council of the National Advisory Council on Innovation. He has been on the boards of City Power Johannesburg Pty (Ltd) and EOH Pty (Ltd). He holds a Bachelor of Science in Mechanical Engineering (magna cum laude) from Case Western Reserve University, a Master of Engineering from the University of Pretoria and a Ph.D. in Computational Intelligence from the University of Cambridge, and was a post-doctoral research associate at the University of London's Imperial College of Science, Technology and Medicine. He has received over 40 awards, including the Order of Mapungubwe, has published over 150 articles in refereed international journals, conference proceedings and book chapters, and has successfully supervised over 33 master's and Ph.D. students.