Data Analytics process in
Learning and Academic
Analytics projects
Day 1: Data selection and capture
Alex Rayón Jerez
alex.rayon@deusto.es
DeustoTech Learning – Deusto Institute of Technology – University of Deusto
Avda. Universidades 24, 48007 Bilbao, Spain
www.deusto.es
Objectives
How to tackle an AA/LA project
1. Objectives: what do I want to improve?
2. Data: automated processes for data discovery
and later processing
3. Integration, not substitution
4. Technology
5. KPIs: define and test
Table of contents
● ETL approach
● Data analytics cycle
● Architecture principles
● Requirements
● Components
● Data process
○ Questions
○ Data model
○ Data sources
○ Use cases
ETL approach
Definition and characteristics
● An ETL tool is a tool that
○ Extracts data from various data sources (usually
legacy data)
○ Transforms data
■ from → being optimized for transactions
■ to → being optimized for reporting and analysis
■ synchronizes the data coming from different
databases
■ cleanses data to remove errors
○ Loads data into a data warehouse
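The three steps above can be sketched in a few lines of Python. This is only a minimal illustration of the extract / transform / load idea, not how Kettle or any other ETL tool works internally. The table names (`enrolments`, `fact_enrolment`), the columns and the sample rows are assumptions made up for the example, and SQLite stands in for both the legacy source and the data warehouse.

```python
import sqlite3


def extract(source):
    """Extract: read raw rows from the (hypothetical) transactional source."""
    return source.execute(
        "SELECT student_id, course, enrolled_on FROM enrolments"
    ).fetchall()


def transform(rows):
    """Transform: cleanse the rows and reshape them for reporting."""
    cleaned = []
    for student_id, course, enrolled_on in rows:
        if student_id is None or not course:
            continue  # cleansing: drop rows with missing keys
        # normalise the course code and keep only the year for reporting
        cleaned.append((student_id, course.strip().upper(), int(enrolled_on[:4])))
    return cleaned


def load(warehouse, rows):
    """Load: write the transformed rows into the (hypothetical) warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_enrolment "
        "(student_id INTEGER, course TEXT, year INTEGER)"
    )
    warehouse.executemany("INSERT INTO fact_enrolment VALUES (?, ?, ?)", rows)
    warehouse.commit()


def build_source():
    """Create an in-memory stand-in for a legacy database with sample rows."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE enrolments (student_id INTEGER, course TEXT, enrolled_on TEXT)"
    )
    source.executemany(
        "INSERT INTO enrolments VALUES (?, ?, ?)",
        [
            (1, " math101 ", "2013-09-01"),
            (2, "PHYS201", "2014-02-15"),
            (None, "math101", "2013-09-01"),  # broken row, removed by cleansing
        ],
    )
    return source


def run_etl():
    """Run the three steps end to end and return the warehouse contents."""
    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract(build_source())))
    return warehouse.execute(
        "SELECT student_id, course, year FROM fact_enrolment ORDER BY student_id"
    ).fetchall()
```

Calling `run_etl()` returns the two valid rows with normalised course codes; the row without a student identifier is dropped during cleansing.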
ETL approach
Why do I need it?
● ETL tools save time and money when
developing a data warehouse by removing
the need for hand-coding
● It is very difficult for database administrators
to connect between different brands of
databases without using an external tool
● In the event that databases are altered or new
databases need to be integrated, a lot of hand-
coded work needs to be completely redone
ETL approach
Kettle
Project Kettle
Powerful Extraction, Transformation and
Loading (ETL) capabilities using an
innovative, metadata-driven approach
ETL approach
Kettle (II)
● It uses an innovative metadata-driven approach
● It has a very easy-to-use GUI
● Strong community of 13,500 registered
users
● It uses a stand-alone Java engine that
processes the tasks for moving data between
many different databases and files
ETL approach
Kettle (III)
ETL approach
Kettle (IV)
Source: http://download.101com.com/tdwi/research_report/2003ETLReport.pdf
ETL approach
Kettle (V)
Source: Pentaho Corporation
ETL approach
Kettle (VI)
● Data warehouse and data mart loads
● Data integration
● Data cleansing
● Data migration
● Data export
● etc.
ETL approach
Transformations
● String and Date Manipulation
● Data Validation / Business Rules
● Lookup / Join
● Calculation, Statistics
● Cryptography
● Decisions, Flow control
● Scripting
● etc.
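A few of the transformation types listed above (string and date manipulation, validation, lookup) can be illustrated on a single record. This is a hedged sketch in plain Python rather than a Kettle transformation; the field names, the `DEPARTMENTS` lookup table and the 0 to 10 grading scale are assumptions chosen for the example.

```python
from datetime import datetime

# Hypothetical lookup table: course code -> department
DEPARTMENTS = {"MATH101": "Mathematics", "PHYS201": "Physics"}


def transform_row(row):
    """Apply string/date manipulation, a validation rule and a lookup to one record."""
    # String manipulation: trim whitespace and normalise capitalisation
    name = row["name"].strip().title()
    # Date manipulation: convert dd/mm/yyyy into an ISO date
    enrolled = datetime.strptime(row["enrolled"], "%d/%m/%Y").date()
    # Data validation / business rule: grades are assumed to lie in [0, 10]
    grade = float(row["grade"])
    if not 0.0 <= grade <= 10.0:
        raise ValueError(f"grade out of range: {grade}")
    # Lookup: enrich the record with the department of the course
    department = DEPARTMENTS.get(row["course"], "Unknown")
    return {
        "name": name,
        "enrolled": enrolled.isoformat(),
        "grade": grade,
        "department": department,
    }
```

In an ETL tool each of these lines would typically be a separate step connected in a flow; here they are collapsed into one function for brevity.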
ETL approach
What is it good for?
● Mirroring data from master to slave
● Syncing two data sources
● Processing data retrieved from multiple
sources and pushed to multiple
destinations
● Loading data to RDBMS
● Data mart / Data warehouse
○ Dimension lookup/update step
● Graphical manipulation of data
Data Analytics cycle
Challenges
● Data is everywhere
● Data is inconsistent
○ Records are different in each system
● Performance issues
○ Queries that summarize data over a long
stipulated period tie up the operating
system for the duration of the task
○ They bring the OS to maximum load
● Data is never all in Data Warehouse
○ Excel sheet, acquisition, new application
Data Analytics cycle
Challenges (II)
● Data is incomplete
● Certain types of usage data are not logged
● Data are not aggregated following a
didactical perspective
● Users are afraid that they could draw
unsound inferences from some of the data
[Mazza2012]
Data Analytics cycle
Academic Analytics Model
1) Capture
2) Report
3) Predict
4) Act
5) Refine
Academic Analytics
[CampbellOblinger2007]
Data Analytics cycle
Learning Analytics Model
1) Select
2) Capture
3) Aggregate
4) Process
5) Visualize
On the design of collective applications
[DronAnderson2009]
Data Analytics cycle
Learning Analytics Model (II)
On the design of collective applications
[DronAnderson2009]
1) Select
2) Capture
3) Aggregate
4) Process
5) Visualize
Day 1
Day 2
Day 3
Day 4
Data Analytics cycle
Learning Analytics Model (III)
● As [Clow2012] states, the need
to close the feedback loop
through appropriate
interventions is unmistakable
● The paper also draws on the wider
educational literature,
seeking to place learning
analytics on an established
theoretical base, and
develops a number of
insights for learning
analytics practice
Data Analytics cycle
Learning Analytics Model (IV)
Data Analytics cycle
Learning Analytics Model (V)
Architecture principles
A model for adoption, use and improvement of analytics
A framework of characteristics for Analytics
Adam Cooper, 2012 [Cooper2012]
Architecture principles
Development of common language for data exchange
The IEEE defines interoperability to be:
“The ability of two or more systems or
components to exchange information and
to use the information that has been
exchanged”
Architecture principles
Development of common language for data exchange (II)
Architecture principles
Development of common language for data exchange (III)
● The most difficult challenges in achieving
interoperability are typically found in
establishing common meanings for the data
● Sometimes this is a matter of technical
precision
○ But culture – regional, sector-specific, and
institutional – and habitual practices also affect
meaning
Architecture principles
Development of common language for data exchange (IV)
● Potential benefits
○ Efficiency and timeliness
■ No need for a person to intervene to re-enter,
reformat or transform data
○ Independence
■ Resilience
○ Adaptability
■ Faster, cheaper and less disruptive to change
○ Innovation and market growth
■ Interoperability combined with modularity makes
it easier to build IT systems that are better
matched to local culture without needing to create
Architecture principles
Development of common language for data exchange (V)
● Potential benefits
○ Durability of data
■ Structures and formats change over time
■ The changes are rarely properly documented
○ Aggregation
■ Data joining might be supported by a common set
of definitions around course structure, combined
with a unified identification scheme
○ Sharing
■ Especially when there are multiple parties involved
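The aggregation benefit can be made concrete: once two systems share a unified identification scheme, joining their records is a simple key match. The sketch below assumes a common `student_id` field; the two record sets and their other fields are invented for the illustration.

```python
def join_on_student_id(sis_records, lms_records):
    """Inner-join two record sets on a shared student identifier."""
    lms_by_id = {record["student_id"]: record for record in lms_records}
    joined = []
    for record in sis_records:
        match = lms_by_id.get(record["student_id"])
        if match is not None:
            # merge both views of the same student into one row
            joined.append({**record, **match})
    return joined


# Hypothetical extracts from a student information system and an LMS
SIS = [
    {"student_id": "1001", "programme": "Engineering"},
    {"student_id": "1002", "programme": "Law"},
]
LMS = [
    {"student_id": "1001", "forum_posts": 12},
    {"student_id": "1003", "forum_posts": 3},
]
```

Without the shared identifier the same join would need fuzzy matching on names or e-mail addresses, which is exactly the kind of manual re-entry and transformation that interoperability is meant to avoid.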
Architecture principles
Development of common language for data exchange (VI)
[LACE2013]
Architecture principles
Development of common language for data exchange (VII)
[LACE2013]
In our case?
Architecture principles
Development of common language for data exchange (VIII)
[LACE2013]
In our case?
Requirements
● Usability: prepare an understandable user interface
(UI), appropriate methods for data visualization, and
guide the user through the analytics process.
● Usefulness: provide relevant, meaningful indicators
that help teachers to gain insight in the learning
behavior of their students and support them in
reflecting on their teaching.
● Interoperability: ensure compatibility for any kind
of VLE by allowing for integration of different data
sources.
[Dyckhoff2010]
Requirements (II)
● Extensibility: allow for incremental extension of
analytics functionality after the system has been
deployed without rewriting code.
● Reusability: aim for a building-block approach so
that more complex functions can be implemented by
reusing simpler ones.
● Real-time operation: make sure that the toolkit can
return answers within microseconds to allow for an
exploratory user experience
● Data Privacy: preserve confidential user information
and protect the identities of the users at all times
[Dyckhoff2010]
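One common way to approach the data privacy requirement at capture time is to replace user identifiers with pseudonyms before the data reaches the analytics store. The sketch below uses a keyed hash (HMAC-SHA256) from the Python standard library; it is an illustrative technique, not the method described in [Dyckhoff2010], and the function name and key handling are assumptions.

```python
import hashlib
import hmac


def pseudonymise(user_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for user_id, derived with a secret key."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same identifier always maps to the same pseudonym, so records from different sources can still be joined, while the mapping cannot be recomputed without the key. The key therefore has to be stored separately from the analytics data.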
Components
● Process
○ A systematic process of educational data analysis
● Model
○ The definition of a suitable model to represent the
knowledge domain
● Tool/platform
○ The design and implementation of a monitoring
and presentation tool based on the Process and
Model
[Mazza2012]
Data process
Introduction
“Measurement, collection, analysis and
reporting of data about learners and
their contexts, for purposes of
understanding and optimising learning
and the environments in which it
occurs”
First international conference on Learning Analytics and
Knowledge, Alberta, 2011 [LAK2011]
Data process
Introduction (II)
However, the challenge is to determine
which data are of interest
We are now in an era where gaining
access to data is not the problem;
the challenge lies in determining
which data are significant and why
Data process
Introduction (III)
“The basic question
is not what can we
measure? The basic
question is what
does a good
education look like?
Big questions”
Data process
Introduction (IV)
“More data does not mean more knowledge”
[Jenkins2013]
Searching for the evidence in a mass of data
requires knowing what kind of evidence is
needed
Knowledge of the domain and understanding
and interpretation of the patterns we see
Data process
Introduction (V)
Data process
Introduction (VI)
Source: http://www.learningfrontiers.eu/?q=story/will-analytics-transform-education
Data process
Introduction (VII)
A brief comparison of the two fields (George Siemens and Ryan Baker [SiemensBaker2012])
Data process
Introduction (VIII)
First of all, education is a highly collaborative space and it represents a social good. Keeping a valuable secret
that might help students succeed is antithetical to the nature of education. Second, education is a complex
ecosystem of people, processes, policies, content, etc. I would have strong doubts about anyone who claimed to
have a formula that worked for a wide variety of institutions.
Mike Sharkey, 2014
Data process
Questions
Source: http://www.slideshare.net/sbs/learning-analytics-uts-2013
Data process
Questions (II)
The question depends on who is asking it ;)
Data process
Questions (III)
Horizon Report 2012 [HR2012]
Data process
Questions (IV)
1) Adaptive testing, tracking and reporting
● Progress summary, daily activity report, class
goals report, progress report, student activity
report, student focus report, etc [Khan2012]
● By using various analytics tools, students can
review their learning progress, and teachers
are supported in personalising learning for
students who need more help in
specific areas
Data process
Questions (V)
2) Analytics tools for early alert, intervention
and collaboration
● Integrating data collected from a variety
of information management systems
○ Allows educators to assess risk, initiate early
interventions and support collaborative learning
Data process
Questions (VI)
2) Analytics tools for early alert, intervention
and collaboration
● For example, the Signals project at Purdue
University utilizes the data collected from
student information systems, learning
management systems, and the grade book for
a specific course to track students’
performances and identify at-risk students in
real time
Data process
Questions (VII)
2) Analytics tools for early alert, intervention
and collaboration
● The LOCO-Analyst provides teachers with
charts, graphs, and other data representations
that help them see how their students are
performing and how students interact with
one another in web-based learning
environments to help the teacher determine
how to engage their students online
Data process
Questions (VIII)
2) Analytics tools for early alert, intervention
and collaboration
● Social Networks Adapting Pedagogical
Practice (SNAPP), a network visualization tool
developed by researchers at the University of
Wollongong, can analyse students’
interactions in a forum and display them in a
network diagram, which helps teachers
identify the key connections and disconnected
students and supports collaborative learning in
a web-based learning environment
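The kind of analysis described above can be approximated in a few lines: build a "who replied to whom" network from forum posts and list the students who appear in no connection. This is a simplified sketch and not SNAPP's actual implementation; the tuple layout `(post_id, author, parent_post_id)` and the sample names are assumptions.

```python
from collections import defaultdict


def build_reply_network(posts):
    """Count directed reply edges (replier, original author) from forum posts.

    posts is a list of (post_id, author, parent_post_id or None).
    """
    author_of = {post_id: author for post_id, author, _parent in posts}
    edges = defaultdict(int)
    for _post_id, author, parent in posts:
        # ignore thread starters and self-replies
        if parent is not None and author_of[parent] != author:
            edges[(author, author_of[parent])] += 1
    return dict(edges)


def disconnected_students(students, edges):
    """Return enrolled students who neither replied nor were replied to."""
    connected = {student for edge in edges for student in edge}
    return sorted(set(students) - connected)


# Hypothetical forum data: carla starts a thread nobody answers, dani never posts
POSTS = [
    (1, "ana", None),
    (2, "ben", 1),
    (3, "ana", 2),
    (4, "carla", None),
]
STUDENTS = ["ana", "ben", "carla", "dani"]
```

With this sample, `build_reply_network(POSTS)` contains one edge in each direction between `ana` and `ben`, and `disconnected_students` reports `carla` and `dani`.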
Data process
Questions (IX)
3) Analytics projects for institutional
efficiency and effectiveness
● There are a number of institutional analytics
initiatives which enable institutions to
improve the effectiveness of operations,
including admission management and
drop-out prevention, resource management,
financial planning, etc
○ Student Experience Traffic Lighting (SETL)
○ The Enhancing Student Centred Administration
Placement Experience (ESCAPES)
Data process
Questions (X)
Learning Analytics
are not neutral
Data process
Questions (XI)
“Accounting tools… do not simply
aid the measurement of economic
activity, they shape the reality they
measure”
[GayPryke2002]
Data process
Questions (XII)
Source: http://mfeldstein.com/harvard-mit-learn-university-phoenix-analytics/
Data process
Questions (XIII)
● The Harvard and MIT data ignores student
goals or any information giving a clue on
whether students desired to complete the
course, get a good grade, get a certificate, or
just sample some material
● Without this information, the actual
aggregate behavior is missing context
○ We don’t know if a certain student intended to just
audit a course, sample it, or attempt to complete it.
○ We don’t know if students started the course intending
to complete it but became frustrated
Data process
Questions (XIV)
● The value lies in learner behavior patterns, which
can only be learned by viewing data
over time
● If you want to “share best practices to improve
teaching and learning”, then you need data
organized around the learner
○ With transactions captured over time – not just in
aggregate
○ What we have now is an honest start, but a very
limited data set
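The difference between aggregate data and transactions captured over time can be shown with a toy event log. In this sketch the `(student, week, action)` layout and the sample events are assumptions; the point is only that two learners with identical totals can have very different behavior patterns.

```python
from collections import defaultdict

# Hypothetical activity log: (student, week, action)
EVENTS = [
    ("s1", 1, "view"), ("s1", 2, "view"), ("s1", 3, "quiz"),
    ("s2", 1, "view"), ("s2", 1, "view"), ("s2", 1, "quiz"),
]


def aggregate_counts(events):
    """Aggregate view: total number of events per student."""
    totals = defaultdict(int)
    for student, _week, _action in events:
        totals[student] += 1
    return dict(totals)


def weekly_timeline(events):
    """Learner-centred view: events per student per week, in week order."""
    timeline = defaultdict(lambda: defaultdict(int))
    for student, week, _action in events:
        timeline[student][week] += 1
    return {student: dict(sorted(weeks.items())) for student, weeks in timeline.items()}
```

Both students have three events in aggregate, yet the timeline shows that `s1` was active every week while `s2` stopped after the first week.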
Data process
Data model
So, what is our data
model for answering
our questions?
Data process
Data model (II)
The data model, or the concept map,
describes the concepts and their
relationships used by the organization
in its daily work, expressed in its own
language
It enables the whole organization to
participate in the maintenance of it
Data process
Data model (III)
Source: http://www.economist.com/news/finance-and-economics/21578041-containers-have-been-more-important-globalisation-freer-trade-humble
Source: http://www.economist.com/blogs/economist-explains/2013/05/economist-explains-14
Data process
Data model (IV)
The best approach that we have
found for this task is Reinmann's
theory of eLearning functions
[Reinmann2006]
Data process
Data model (V)
[Reinmann2006]
Data process
Data model (VI)
Example
[Mazza2012]
Data process
Data model (VII)
Example
This model answers the monitoring questions:
● Which way of eLearning makes it possible to
reach the given objectives?
● By which means (functions, tools) does the
LMS enable these ways of learning?
● How is the use of these means traced in the log
files (activity log codes)?
[Mazza2012]
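The three monitoring questions suggest a three-level mapping: way of learning, then LMS tools, then activity log codes. A hedged sketch of such a mapping as a plain dictionary follows; the labels, tool names and log codes are placeholders invented for the example and are not the codes used in [Mazza2012] or in any particular LMS.

```python
# Hypothetical data model: way of learning -> LMS tools -> activity log codes
DATA_MODEL = {
    "distribution": {"tools": ["resource"], "log_codes": ["resource_view"]},
    "interaction": {"tools": ["quiz"], "log_codes": ["quiz_attempt", "quiz_submit"]},
    "collaboration": {"tools": ["forum", "wiki"], "log_codes": ["forum_post", "wiki_edit"]},
}


def tools_for(way):
    """By which tools does the LMS enable this way of learning?"""
    return DATA_MODEL[way]["tools"]


def way_for_log_code(code):
    """Which way of learning does a logged activity code belong to?"""
    for way, entry in DATA_MODEL.items():
        if code in entry["log_codes"]:
            return way
    return None  # code not covered by the model


def usage_by_way(log_codes):
    """Count logged activities per way of learning, ignoring unknown codes."""
    counts = {way: 0 for way in DATA_MODEL}
    for code in log_codes:
        way = way_for_log_code(code)
        if way is not None:
            counts[way] += 1
    return counts
```

Keeping the mapping as data rather than code means teachers and analysts can review and maintain it without touching the capture logic.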
Data process
Data model (VIII)
[Mazza2012]
Data process
Data sources
Today we have so much data
that come in an unstructured
or semi-structured form that
may nonetheless be of value in
understanding more about our
learners
Data process
Data sources (II)
“Learning is a complex social activity”
[Siemens2012]
Lots of data
Lots of tools
Humans to make sense
Data process
Data sources (III)
Traditional data sources:
● Student data: demographics,
qualification aim, modules taken,
results, etc.
● Student feedback data: end of
module survey and others
● Student activity data: delivery data,
completion, pass rates, etc.
Data process
Data sources (IV)
● The world of technology has changed
[Eaton2012]
○ 80% of the world’s information is unstructured
○ Unstructured data are growing at 15 times the rate
of structured information
○ Raw computational power is growing at such an
enormous rate that we almost have a supercomputer
in our hands
○ Access to information is available to all
Data process
Data sources (V)
Source: http://www.bigdata-startups.com/BigData-startup/understanding-sources-big-data-infographic/
Data process
Data sources (VI)
● RDBMS (SQL Server, DB2, Oracle, MySQL,
PostgreSQL, Sybase IQ, etc.)
● NoSQL Data: HBase, Cassandra, MongoDB
● OLAP (Mondrian, Palo, XML/A)
● Web (REST, SOAP, XML, JSON)
● Files (CSV, Fixed, Excel, etc.)
● ERP (SAP, Salesforce, OpenERP)
● Hadoop Data: HDFS, Hive
● Web Data: Twitter, Facebook, Log Files, Web Logs
● Others: LDAP/Active Directory, Google Analytics,
etc.
Data process
Use cases
1) Student data: XML
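As an illustration of capturing student data from XML, the sketch below flattens a small document into one row per student and module using the Python standard library. The element and attribute names (`student`, `module`, `code`, `result`) are assumptions, since the actual export format is not shown in the slide text.

```python
import xml.etree.ElementTree as ET

# Hypothetical student export; the real schema may differ
STUDENT_XML = """
<students>
  <student id="1001">
    <name>Ane</name>
    <module code="MATH101" result="7.5"/>
    <module code="PHYS201" result="6.0"/>
  </student>
  <student id="1002">
    <name>Jon</name>
    <module code="MATH101" result="4.0"/>
  </student>
</students>
"""


def parse_students(xml_text):
    """Flatten the XML into one dictionary per (student, module) pair."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for student in root.findall("student"):
        for module in student.findall("module"):
            rows.append({
                "student_id": student.get("id"),
                "name": student.findtext("name"),
                "module": module.get("code"),
                "result": float(module.get("result")),
            })
    return rows
```

The flat rows are the shape most ETL steps and database tables expect, so this is typically the first transformation applied to a hierarchical source.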
Data process
Use cases (II)
2) Moodle: MySQL database
mdl_forum
- id
- course
- name
mdl_user
- id
- username
- firstname
- lastname
mdl_forum_discussions
- id
- name
- userid
- timemodified
- usermodified
mdl_forum_posts
- id
- userid
- discussion
- message
- modified
- created
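Using only the tables and columns listed above, forum participation can be captured with a single join. The sketch below runs the query against an in-memory SQLite database as a stand-in for the Moodle MySQL database; the sample rows are invented, and a real installation may use a table prefix other than `mdl_`.

```python
import sqlite3

# Posts per user and discussion, joining on the columns shown in the slide
POSTS_PER_USER = """
SELECT u.username, d.name AS discussion, COUNT(p.id) AS posts
FROM mdl_forum_posts p
JOIN mdl_user u ON u.id = p.userid
JOIN mdl_forum_discussions d ON d.id = p.discussion
GROUP BY u.username, d.name
ORDER BY posts DESC, u.username
"""


def demo_database():
    """Build a tiny in-memory copy of the three tables with sample data."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE mdl_user (id INTEGER PRIMARY KEY, username TEXT,
                               firstname TEXT, lastname TEXT);
        CREATE TABLE mdl_forum_discussions (id INTEGER PRIMARY KEY, name TEXT,
                               userid INTEGER, timemodified INTEGER,
                               usermodified INTEGER);
        CREATE TABLE mdl_forum_posts (id INTEGER PRIMARY KEY, userid INTEGER,
                               discussion INTEGER, message TEXT,
                               modified INTEGER, created INTEGER);
        INSERT INTO mdl_user VALUES (1, 'ane', 'Ane', 'Garcia');
        INSERT INTO mdl_user VALUES (2, 'jon', 'Jon', 'Etxea');
        INSERT INTO mdl_forum_discussions VALUES (10, 'Week 1', 1, 0, 1);
        INSERT INTO mdl_forum_posts VALUES (100, 1, 10, 'Hello', 0, 0);
        INSERT INTO mdl_forum_posts VALUES (101, 2, 10, 'Hi', 0, 0);
        INSERT INTO mdl_forum_posts VALUES (102, 1, 10, 'Thanks', 0, 0);
    """)
    return db


def posts_per_user(db):
    """Run the capture query and return (username, discussion, posts) rows."""
    return db.execute(POSTS_PER_USER).fetchall()
```

In an ETL tool the same SELECT statement would typically be placed in a table-input step, with the result loaded into the analytics store.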
Data process
Use cases (III)
3) MediaWiki: MySQL database
user
- user_real_name
- user_editcount
recentchanges
- rc_old_len
- rc_new_len
revision
- rev_timestamp
page
- page_counter
- page_len
Joins:
rev_user = user_id
rev_page = page_id
user_id = rc_user
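The join conditions above translate directly into SQL. The sketch below counts revisions per user, again using an in-memory SQLite database as a stand-in for the MediaWiki MySQL database. Only the tables, columns and joins named in the slide are used, the sample rows are invented, and the schema of a given MediaWiki version may differ from the one shown here.

```python
import sqlite3

# Revisions per user, using the joins rev_user = user_id and rev_page = page_id
REVISIONS_PER_USER = """
SELECT u.user_real_name, COUNT(*) AS revisions
FROM revision r
JOIN user u ON r.rev_user = u.user_id
JOIN page p ON r.rev_page = p.page_id
GROUP BY u.user_id, u.user_real_name
ORDER BY revisions DESC, u.user_real_name
"""


def demo_wiki():
    """Build a tiny in-memory copy of the tables with sample data."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_real_name TEXT,
                           user_editcount INTEGER);
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_counter INTEGER,
                           page_len INTEGER);
        CREATE TABLE revision (rev_user INTEGER, rev_page INTEGER,
                               rev_timestamp TEXT);
        INSERT INTO user VALUES (1, 'Ane Garcia', 2);
        INSERT INTO user VALUES (2, 'Jon Etxea', 1);
        INSERT INTO page VALUES (7, 15, 1200);
        INSERT INTO revision VALUES (1, 7, '20140301120000');
        INSERT INTO revision VALUES (2, 7, '20140302090000');
        INSERT INTO revision VALUES (1, 7, '20140303180000');
    """)
    return db


def revisions_per_user(db):
    """Run the capture query and return (real name, revision count) rows."""
    return db.execute(REVISIONS_PER_USER).fetchall()
```

The third join listed in the slide (`user_id = rc_user`) would be added in the same way to bring in the size changes recorded in `recentchanges`.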
Data process
Use cases (IV)
4) Google Doc: Google API
Data process
Use cases (VI)
5) Twitter: API
References
[CampbellOblinger2007] Campbell, John P., Peter B. DeBlois, and Diana G. Oblinger. "Academic analytics: A new tool for a new era." Educause
Review 42.4 (2007): 40.
[Clow2012] Clow, Doug. "The learning analytics cycle: closing the loop effectively." Proceedings of the 2nd International Conference on Learning
Analytics and Knowledge. ACM, 2012.
[Cooper2012] Cooper, Adam. "What is analytics? Definition and essential characteristics." CETIS Analytics Series 1.5 (2012): 1-10.
[DronAnderson2009] Dron, J., & Anderson, T. (2009). On the design of collective applications. In Proceedings of the 2009 International Conference
on Computational Science and Engineering, 4, 368–374.
[Dyckhoff2010] Dyckhoff, Anna Lea, et al. "Design and Implementation of a Learning Analytics Toolkit for Teachers." Educational Technology &
Society 15.3 (2012): 58-76.
[Eaton2012] Chris Eaton, Dirk Deroos, Tom Deutsch, George Lapis & Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class
Hadoop and Streaming Data”, p.XV. McGraw-Hill, 2012.
[GayPryke2002] Cultural Economy: Cultural Analysis and Commercial Life (Culture, Representation and Identity series) Paul du Gay (Editor),
Michael Pryke. 2002.
[HR2012] NMC Horizon Report 2012 http://www.nmc.org/publications/horizon-report-2012-higher-ed-edition
[Jenkins2013] BBC Radio 4, Start the Week, Big Data and Analytics, first broadcast 11 February 2013 http://www.bbc.co.uk/programmes/b01qhqfv
[Khan2012] http://www.emergingedtech.com/2012/04/exploring-the-khan-academys-use-of-learning-data-and-learning-analytics/
[LACE2013] Learning Analytics Community Exchange http://www.laceproject.eu/
[LAK2011] 1st International Conference on Learning Analytics and Knowledge, 27 February - 1 March 2011, Banff, Alberta, Canada https://tekri.athabascau.ca/analytics/
[Mazza2012] Mazza, Riccardo, et al. "MOCLog–Monitoring Online Courses with log data." Proceedings of the 1st Moodle Research Conference. 2012.
[Reinmann2006] Reinmann, G. (2006). Understanding e-learning: an opportunity for Europe? European Journal of Vocational Training, 38, 27-42.
[SiemensBaker2012] Siemens & Baker (2012). Learning Analytics and Educational Data Mining: Towards Communication and Collaboration.
Learning Analytics and Knowledge 2012. Available in .pdf format at http://users.wpi.edu/~rsbaker/LAKs%20reformatting%20v2.pdf
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Último (20)

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 

Data Analytics.01. Data selection and capture

  • 1. Data Analytics process in Learning and Academic Analytics projects Day 1: Data selection and capture Alex Rayón Jerez alex.rayon@deusto.es DeustoTech Learning – Deusto Institute of Technology – University of Deusto Avda. Universidades 24, 48007 Bilbao, Spain www.deusto.es
  • 2. Objectives How to tackle an AA/LA project 1. Objectives: what do I want to improve? 2. Data: automated processes for data discovery and later processing 3. Integration, not substitution 4. Technology 5. KPIs: define and test
  • 3. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Components ● Data process ○ Questions ○ Data model ○ Data sources ○ Use cases
  • 4. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Components ● Data process
  • 5. ETL approach Definition and characteristics ● An ETL tool is a tool that ○ Extracts data from various data sources (usually legacy data) ○ Transforms data ■ from → being optimized for transaction ■ to → being optimized for reporting and analysis ■ synchronizes the data coming from different databases ■ data cleanses to remove errors ○ Loads data into a data warehouse
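The extract, transform and load steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not how Kettle works internally: the `legacy_rows` data, the `fact_grades` table and the `run_etl` function are hypothetical names, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3


def run_etl(source_rows):
    """Transform extracted rows (cleanse + normalize) and load them into a warehouse table."""
    # Transform: drop rows with a missing grade, normalize names, cast grades to numbers
    cleaned = [
        (student.strip().lower(), float(grade))
        for student, grade in source_rows
        if grade not in (None, "")
    ]
    # Load: an in-memory SQLite database stands in for the data warehouse (assumption)
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE fact_grades (student TEXT, grade REAL)")
    warehouse.executemany("INSERT INTO fact_grades VALUES (?, ?)", cleaned)
    warehouse.commit()
    return warehouse


# Extract: hypothetical rows as they might arrive from a legacy system
legacy_rows = [(" Ana ", "7.5"), ("JON", ""), ("Miren", "9")]
warehouse = run_etl(legacy_rows)
rows = warehouse.execute(
    "SELECT student, grade FROM fact_grades ORDER BY student"
).fetchall()
print(rows)
```

The row with an empty grade is removed during cleansing, so only two rows reach the warehouse table.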
  • 6. ETL approach Why do I need it? ● ETL tools save time and money when developing a data warehouse by removing the need for hand-coding ● It is very difficult for database administrators to connect between different brands of databases without using an external tool ● In the event that databases are altered or new databases need to be integrated, a lot of hand- coded work needs to be completely redone
  • 7. ETL approach Kettle Project Kettle Powerful Extraction, Transformation and Loading (ETL) capabilities using an innovative, metadata-driven approach
  • 8. ETL approach Kettle (II) ● It uses an innovative metadata-driven approach ● It has a very easy-to-use GUI ● Strong community of 13,500 registered users ● It uses a stand-alone Java engine that processes the tasks for moving data between many different databases and files
  • 10. ETL approach Kettle (IV) Source: http://download.101com.com/tdwi/research_report/2003ETLReport.pdf
  • 11. ETL approach Kettle (V) Source: Pentaho Corporation
  • 12. ETL approach Kettle (VI) ● Datawarehouse and datamart loads ● Data integration ● Data cleansing ● Data migration ● Data export ● etc.
  • 13. ETL approach Transformations ● String and Date Manipulation ● Data Validation / Business Rules ● Lookup / Join ● Calculation, Statistics ● Cryptography ● Decisions, Flow control ● Scripting ● etc.
  • 14. ETL approach What is it good for? ● Mirroring data from master to slave ● Syncing two data sources ● Processing data retrieved from multiple sources and pushed to multiple destinations ● Loading data to RDBMS ● Datamart / Datawarehouse ○ Dimension lookup/update step ● Graphical manipulation of data
  • 15. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Components ● Data process
  • 16. Data Analytics cycle Challenges ● Data is everywhere ● Data is inconsistent ○ Records are different in each system ● Performance issues ○ Running queries that summarize data over a long period ties up the operating system ○ This pushes the OS to maximum load ● Data is never all in the Data Warehouse ○ Excel sheets, acquisitions, new applications
  • 17. Data Analytics cycle Challenges (II) ● Data is incomplete ● Certain types of usage data are not logged ● Data are not aggregated following a didactical perspective ● Users are afraid that they could draw unsound inferences from some of the data [Mazza2012]
  • 18. Data Analytics cycle Academic Analytics Model 1) Capture 2) Report 3) Predict 4) Act 5) Refine Academic Analytics [CampbellOblinger2007]
  • 19. Data Analytics cycle Learning Analytics Model 1) Select 2) Capture 3) Aggregate 4) Process 5) Visualize On the design of collective applications [DronAnderson2009]
  • 20. Data Analytics cycle Learning Analytics Model (II) On the design of collective applications [DronAnderson2009] 1) Select 2) Capture 3) Aggregate 4) Process 5) Visualize Day 1 Day 2 Day 3 Day 4
  • 21. Data Analytics cycle Learning Analytics Model (III) ● As [Clow2012] states, it is necessary to close the feedback loop through appropriate, unmistakable interventions ● The paper also draws on the wider educational literature, seeking to place learning analytics on an established theoretical base, and develops a number of insights for learning analytics practice
  • 22. Data Analytics cycle Learning Analytics Model (IV)
  • 23. Data Analytics cycle Learning Analytics Model (III)
  • 24. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Components ● Data process
  • 25. Architecture principles A model for adoption, use and improvement of analytics A framework of characteristics for Analytics Adam Cooper, 2012 [Cooper2012]
  • 26. Architecture principles Development of common language for data exchange The IEEE defines interoperability to be: “The ability of two or more systems or components to exchange information and to use the information that has been exchanged”
  • 27. Architecture principles Development of common language for data exchange (II)
  • 28. Architecture principles Development of common language for data exchange (III) ● The most difficult challenges with achieving interoperability are typically found in establishing common meanings to the data ● Sometimes this is a matter of technical precision ○ But culture – regional, sector-specific, and institutional – and habitual practices also affect meaning
  • 29. Architecture principles Development of common language for data exchange (IV) ● Potential benefits ○ Efficiency and timeliness ■ No need for a person to intervene to re-enter, re-format or transform data ○ Independence ■ Resilience ○ Adaptability ■ Faster, cheaper and less disruptive to change ○ Innovation and market growth ■ Interoperability combined with modularity makes it easier to build IT systems that are better matched to local culture without needing to create
  • 30. Architecture principles Development of common language for data exchange (V) ● Potential benefits ○ Durability of data ■ Structures and formats change over time ■ The changes are rarely properly documented ○ Aggregation ■ Data joining might be supported by a common set of definitions around course structure, combined with a unified identification scheme ○ Sharing ■ Especially when there are multiple parties involved
  • 31. Architecture principles Development of common language for data exchange (VI) [LACE2013]
  • 32. Architecture principles Development of common language for data exchange (VII) [LACE2013] In our case?
  • 33. Architecture principles Development of common language for data exchange (VIII) [LACE2013] In our case?
  • 34. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Components ● Data process
  • 35. Requirements ● Usability: prepare an understandable user interface (UI), appropriate methods for data visualization, and guide the user through the analytics process. ● Usefulness: provide relevant, meaningful indicators that help teachers to gain insight in the learning behavior of their students and support them in reflecting on their teaching. ● Interoperability: ensure compatibility for any kind of VLE by allowing for integration of different data sources. [Dyckhoff2010]
  • 36. Requirements (II) ● Extensibility: allow for incremental extension of analytics functionality after the system has been deployed without rewriting code. ● Reusability: target for a building-block approach to make sure that re-using simpler ones can implement more complex functions. ● Real-time operation: make sure that the toolkit can return answers within microseconds to allow for an exploratory user experience ● Data Privacy: preserve confidential user information and protect the identities of the users at all times [Dyckhoff2010]
  • 37. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Components ● Data process
  • 38. Components ● Process ○ A systematic process of educational data analysis ● Model ○ The definition of a suitable model to represent the knowledge domain ● Tool/platform ○ The design and implementation of a monitoring and presentation tool based on the Process and Model [Mazza2012]
  • 39. Table of contents ● ETL approach ● Data analytics cycle ● Architecture principles ● Requirements ● Data process
  • 40. Data process Introduction “Measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs” First international conference on Learning Analytics and Knowledge, Alberta, 2011 [LAK2011]
  • 41. Data process Introduction (II) However, the challenge is to determine which data are of interest We are now in an era where gaining access to data is not the problem; the challenge lies in determining which data are significant and why
  • 42. Data process Introduction (III) “The basic question is not what can we measure? The basic question is what does a good education look like? Big questions”
  • 43. Data process Introduction (IV) “More data does not mean more knowledge” [Jenkins2013] Searching for the evidence in a mass of data requires knowing what kind of evidence is needed Knowledge of the domain and understanding and interpretation of the patterns we see
  • 45. Data process Introduction (VI) Source: http://www.learningfrontiers.eu/?q=story/will-analytics-transform-education
  • 46. Data process Introduction (VII) A brief comparison of the two fields (George Siemens and Ryan Baker [SiemensBaker2012])
  • 47. Data process Introduction (VIII) First of all, education is a highly collaborative space and it represents a social good. Keeping a valuable secret that might help students succeed is antithetical to the nature of education. Second, education is a complex ecosystem of people, processes, policies, content, etc. I would have strong doubts about anyone who claimed to have a formula that worked for a wide variety of institutions. Mike Sharkey, 2014
  • 49. Data process Questions (II) The question depends on who is asking it ;)
  • 50. Data process Questions (III) Horizon Report 2012 [HR2012]
  • 51. Data process Questions (IV) 1) Adaptive testing, tracking and reporting ● Progress summary, daily activity report, class goals report, progress report, student activity report, student focus report, etc. [Khan2012] ● By using various analytics tools, students can review their learning progress, and teachers are supported in personalising learning for students who need more help in specific areas
  • 52. Data process Questions (V) 2) Analytics tools for early alert, intervention and collaboration ● Integrating their data collected from a variety of information management systems ○ Allowing educators to assess the risk, initiate early interventions and support collaborative learning
  • 53. Data process Questions (VI) 2) Analytics tools for early alert, intervention and collaboration ● For example, the Signals project at Purdue University utilizes the data collected from student information systems, learning management systems, and the grade book for a specific course to track students’ performances and identify at-risk students in real time
  • 54. Data process Questions (VII) 2) Analytics tools for early alert, intervention and collaboration ● The LOCO-Analyst provides teachers with charts, graphs, and other data representations that help them see how their students are performing and how students interact with one another in web-based learning environments to help the teacher determine how to engage their students online
  • 55. Data process Questions (VIII) 2) Analytics tools for early alert, intervention and collaboration ● Social Networks Adapting Pedagogical Practice (SNAPP), a network visualization tool developed by researchers at the University of Wollongong, can analyse students’ interactions in a forum and display them as a diagram, which helps teachers to identify the key connections and the disconnected students, and to support collaborative learning in a web-based learning environment
  • 56. Data process Questions (IX) 3) Analytics projects for institutional efficiency and effectiveness ● There are a number of institutional analytics initiatives which enable institutions to improve the effectiveness of operations, including admission management and drop- out prevention, resource management, financial planning, etc ○ Student Experience Traffic Lighting (SETL) ○ The Enhancing Student Centred Administration Placement Experience (ESCAPES)
  • 57. Data process Questions (X) Learning Analytics are not neutral
  • 58. Data process Questions (XI) “Accounting tools… do not simply aid the measurement of economic activity, they shape the reality they measure” [GayPryke2002]
  • 59. Data process Questions (XII) Source: http://mfeldstein.com/harvard-mit-learn-university-phoenix-analytics/
  • 60. Data process Questions (XIII) ● The Harvard and MIT data ignore student goals and any information that would indicate whether students wanted to complete the course, get a good grade, get a certificate, or just sample some material ● Without this information, the aggregate behavior lacks context ○ We don’t know if a certain student intended to just audit a course, sample it, or attempt to complete it ○ We don’t know if students started the course intending to complete it but became frustrated
  • 61. Data process Questions (XIV) ● The value of learner behavior patterns, which can only be learned by viewing data patterns over time ● If you want to “share best practices to improve teaching and learning”, then you need data organized around the learner ○ With transactions captured over time – not just in aggregate ○ What we have now is an honest start, but a very limited data set
  • 62. Data process Data model So, what data model do we need to answer our questions?
  • 63. Data process Data model (II) The data model, or concept map, describes the concepts and relationships that the organization uses in its daily work, expressed in its own language. It enables the whole organization to participate in its maintenance
  • 64. Data process Data model (III) Source: http://www.economist.com/news/finance-and-economics/21578041-containers-have-been-more-important-globalisation-freer-trade-humble Source: http://www.economist.com/blogs/economist-explains/2013/05/economist-explains-14
  • 65. Data process Data model (IV) The best approach that we have found for this task is Reinmann’s theory of eLearning functions [Reinmann2006]
  • 66. Data process Data model (V) [Reinmann2006]
  • 67. Data process Data model (VI) Example [Mazza2012]
  • 68. Data process Data model (VII) Example This model answers the monitoring questions: ● Which ways of eLearning make it possible to reach the given objectives? ● By which means (functions, tools) does the LMS enable these ways of learning? ● How is the use of these means traced in the log files (activity log codes)? [Mazza2012]
  • 69. Data process Data model (VIII) [Mazza2012]
  • 70. Data process Data sources Today, much of our data comes in unstructured or semi-structured form, yet it may still be of value in understanding more about our learners
  • 71. Data process Data sources (II) “Learning is a complex social activity” [Siemens2012] Lots of data Lots of tools Humans to make sense
  • 72. Data process Data sources (III) Traditional data sources: ● Student data: demographics, qualification aim, modules taken, results, etc. ● Student feedback data: end of module survey and others ● Student activity data: delivery data, completion, pass rates, etc.
  • 73. Data process Data sources (IV) ● The world of technology has changed [Eaton2012] ○ 80% of the world’s information is unstructured ○ Unstructured data are growing at 15 times the rate of structured information ○ Raw computational power is growing at such an enormous rate that we almost have a supercomputer in our hands ○ Access to information is available to all
  • 74. Data process Data sources (V) Source: http://www.bigdata-startups.com/BigData-startup/understanding-sources-big-data-infographic/
  • 75. Data process Data sources (VI) ● RDBMS (SQL Server, DB2, Oracle, MySQL, PostgreSQL, Sybase IQ, etc.) ● NoSQL Data: HBase, Cassandra, MongoDB ● OLAP (Mondrian, Palo, XML/A) ● Web (REST, SOAP, XML, JSON) ● Files (CSV, Fixed, Excel, etc.) ● ERP (SAP, Salesforce, OpenERP) ● Hadoop Data: HDFS, Hive ● Web Data: Twitter, Facebook, Log Files, Web Logs ● Others: LDAP/Active Directory, Google Analytics, etc.
  • 76. Data process Use cases 1) Student data: XML
  • 77. Data process Use cases 1) Student data: XML
  • 78. Data process Use cases 1) Student data: XML
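As a rough illustration of capturing student data from XML, the sketch below flattens one record per student and module. The XML layout, the `SAMPLE` document and the `capture_students` function are hypothetical, since the actual file shown on the slides is not reproduced in this transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real student data file may be structured differently
SAMPLE = """<students>
  <student id="s1"><name>Student A</name><module code="M101" result="7.5"/></student>
  <student id="s2"><name>Student B</name><module code="M101" result="5.0"/></student>
</students>"""


def capture_students(xml_text):
    """Flatten the XML into one dictionary per (student, module) pair."""
    root = ET.fromstring(xml_text)
    records = []
    for student in root.findall("student"):
        for module in student.findall("module"):
            records.append({
                "student_id": student.get("id"),
                "name": student.findtext("name"),
                "module": module.get("code"),
                "result": float(module.get("result")),
            })
    return records


records = capture_students(SAMPLE)
print(records)
```

Flat records like these are easy to load into a relational table in the later aggregation step.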
  • 79. Data process Use cases (II) 2) Moodle: MySQL database ● mdl_forum: id, course, name ● mdl_user: id, username, firstname, lastname ● mdl_forum_discussions: id, name, userid, timemodified, usermodified ● mdl_forum_posts: id, userid, discussion, message, modified, created
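A query over these Moodle tables might look like the sketch below, which counts forum posts per user and discussion. It uses only the columns listed on the slide; the sample rows are invented, and an in-memory SQLite database stands in for the Moodle MySQL database so the example is self-contained.

```python
import sqlite3

# In-memory SQLite stands in for the Moodle MySQL database (assumption)
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE mdl_user (id INTEGER PRIMARY KEY, username TEXT, firstname TEXT, lastname TEXT);
CREATE TABLE mdl_forum_discussions (id INTEGER PRIMARY KEY, name TEXT, userid INTEGER,
                                    timemodified INTEGER, usermodified INTEGER);
CREATE TABLE mdl_forum_posts (id INTEGER PRIMARY KEY, userid INTEGER, discussion INTEGER,
                              message TEXT, modified INTEGER, created INTEGER);
-- Invented sample rows, for illustration only
INSERT INTO mdl_user VALUES (1, 'student_a', 'A', 'One'), (2, 'student_b', 'B', 'Two');
INSERT INTO mdl_forum_discussions VALUES (10, 'Week 1', 1, 0, 1);
INSERT INTO mdl_forum_posts VALUES
    (100, 1, 10, 'first', 0, 0), (101, 2, 10, 'reply', 0, 0), (102, 1, 10, 'again', 0, 0);
""")

# Count the posts each user wrote in each discussion
posts_per_user = db.execute("""
SELECT u.username, d.name, COUNT(p.id)
FROM mdl_forum_posts p
JOIN mdl_user u ON u.id = p.userid
JOIN mdl_forum_discussions d ON d.id = p.discussion
GROUP BY u.username, d.name
ORDER BY u.username
""").fetchall()
print(posts_per_user)
```

Against a real installation the same SELECT would run through a MySQL client, and the table prefix may differ from `mdl_`.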
  • 80. Data process Use cases (III) 3) MediaWiki: MySQL database ● user: user_real_name, user_editcount ● recentchanges: rc_old_len, rc_new_len ● revision: rev_timestamp ● page: page_counter, page_len ● Joins: rev_user = user_id, rev_page = page_id, user_id = rc_user
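The join keys on this slide can be exercised with a small sketch that computes revisions per user and net bytes changed per user. The column names come from the slide and may not match the schema of a current MediaWiki release; `rev_id` and `rc_id` are added only as row identifiers, the sample rows are invented, and SQLite stands in for MySQL.

```python
import sqlite3

# In-memory SQLite stands in for the MediaWiki MySQL database (assumption)
wiki = sqlite3.connect(":memory:")
wiki.executescript("""
CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, user_real_name TEXT, user_editcount INTEGER);
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_counter INTEGER, page_len INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_page INTEGER,
                       rev_timestamp TEXT);
CREATE TABLE recentchanges (rc_id INTEGER PRIMARY KEY, rc_user INTEGER,
                            rc_old_len INTEGER, rc_new_len INTEGER);
-- Invented sample rows, for illustration only
INSERT INTO "user" VALUES (1, 'Student A', 2), (2, 'Student B', 1);
INSERT INTO page VALUES (7, 12, 500);
INSERT INTO revision VALUES
    (1, 1, 7, '20140101000000'), (2, 2, 7, '20140102000000'), (3, 1, 7, '20140103000000');
INSERT INTO recentchanges VALUES (1, 1, 0, 300), (2, 2, 300, 350), (3, 1, 350, 500);
""")

# Revisions per user, joining on rev_user = user_id and rev_page = page_id
revisions_per_user = wiki.execute("""
SELECT u.user_real_name, COUNT(*)
FROM revision r
JOIN "user" u ON r.rev_user = u.user_id
JOIN page p ON r.rev_page = p.page_id
GROUP BY u.user_id
ORDER BY u.user_id
""").fetchall()

# Net bytes changed per user, joining on user_id = rc_user
bytes_per_user = wiki.execute("""
SELECT u.user_real_name, SUM(rc.rc_new_len - rc.rc_old_len)
FROM recentchanges rc
JOIN "user" u ON u.user_id = rc.rc_user
GROUP BY u.user_id
ORDER BY u.user_id
""").fetchall()
print(revisions_per_user, bytes_per_user)
```

The two measures are computed in separate queries so that joining `revision` and `recentchanges` together does not multiply rows.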
  • 81. Data process Use cases (IV) 4) Google Doc: Google API
  • 82. Data process Use cases (IV) 4) Google Doc: Google API
  • 83. Data process Use cases (IV) 4) Google Doc: Google API
  • 84. Data process Use cases (V) 4) Google Doc: Google API
  • 85. Data process Use cases (VI) 4) Google Doc: Google API
  • 86. Data process Use cases (VII) 4) Google Doc: Google API
  • 87. Data process Use cases (VIII) 4) Google Doc: Google API
  • 88. Data process Use cases (IX) 4) Google Doc: Google API
  • 89. Data process Use cases (X) 4) Google Doc: Google API
  • 90. Data process Use cases (XI) 4) Google Doc: Google API
  • 91. Data process Use cases (VI) 5) Twitter: API
  • 92. References
    [CampbellOblinger2007] Campbell, John P., Peter B. DeBlois, and Diana G. Oblinger. "Academic analytics: A new tool for a new era." Educause Review 42.4 (2007): 40.
    [Clow2012] Clow, Doug. "The learning analytics cycle: closing the loop effectively." Proceedings of the 2nd International Conference on Learning Analytics and Knowledge. ACM, 2012.
    [Cooper2012] Cooper, Adam. "What is analytics? Definition and essential characteristics." CETIS Analytics Series 1.5 (2012): 1-10.
    [DronAnderson2009] Dron, J., & Anderson, T. (2009). On the design of collective applications. In Proceedings of the 2009 International Conference on Computational Science and Engineering, 4, 368–374.
    [Dyckhoff2010] Dyckhoff, Anna Lea, et al. "Design and Implementation of a Learning Analytics Toolkit for Teachers." Educational Technology & Society 15.3 (2012): 58-76.
    [Eaton2012] Chris Eaton, Dirk Deroos, Tom Deutsch, George Lapis & Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, p.XV. McGraw-Hill, 2012.
    [GayPryke2002] Cultural Economy: Cultural Analysis and Commercial Life (Culture, Representation and Identity series) Paul du Gay (Editor), Michael Pryke. 2002.
    [HR2012] NMC Horizon Report 2012 http://www.nmc.org/publications/horizon-report-2012-higher-ed-edition
    [Jenkins2013] BBC Radio 4, Start the Week, Big Data and Analytics, first broadcast 11 February 2013 http://www.bbc.co.uk/programmes/b01qhqfv
    [Khan2012] http://www.emergingedtech.com/2012/04/exploring-the-khan-academys-use-of-learning-data-and-learning-analytics/
    [LACE2013] Learning Analytics Community Exchange http://www.laceproject.eu/
    [LAK2011] 1st International Conference on Learning Analytics and Knowledge, 27 February - 1 March 2011, Banff, Alberta, Canada https://tekri.athabascau.ca/analytics/
    [Mazza2012] Mazza, Riccardo, et al. "MOCLog–Monitoring Online Courses with log data." Proceedings of the 1st Moodle Research Conference. 2012.
    [Reinmann2006] Reinmann, G. (2006). Understanding e-learning: an opportunity for Europe? European Journal of Vocational Training, 38, 27-42.
    [SiemensBaker2012] Siemens & Baker (2012). Learning Analytics and Educational Data Mining: Towards Communication and Collaboration. Learning Analytics and Knowledge 2012. Available in .pdf format at http://users.wpi.edu/~rsbaker/LAKs%20reformatting%20v2.pdf
  • 93. Data Analytics process in Learning and Academic Analytics projects Day 1: Data selection and capture Alex Rayón Jerez alex.rayon@deusto.es DeustoTech Learning – Deusto Institute of Technology – University of Deusto Avda. Universidades 24, 48007 Bilbao, Spain www.deusto.es