This document outlines the roles of data scientists, data engineers, and other participants in analyzing data to generate insights, and the relationships among them. It shows how data scientists articulate questions and hypotheses, which direct experiments and the choice of analytical methods. Data engineers implement these methods and determine the appropriate data sources and tools. Results and insights are then shared with relevant parties, informing future questions and the refinement of models and methods.
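The relationships named in this summary can be written down as subject-verb-object triples. What follows is a rough sketch, assuming only the pairings the summary states explicitly. The `Relation` type and the `relations_for` helper are illustrative names that do not appear in the model, and the diagram contains further relationships that are not listed here.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    """One labeled edge: subject --verb--> object."""
    subject: str
    verb: str
    obj: str


# Only the relationships stated in the summary paragraph (assumption:
# each verb applies to every object named alongside it in that sentence).
RELATIONS: List[Relation] = [
    Relation("Data Scientist", "articulates", "Questions"),
    Relation("Data Scientist", "articulates", "Hypotheses"),
    Relation("Data Scientist", "directs", "Experiments"),
    Relation("Data Scientist", "directs", "Analytical Methods"),
    Relation("Data Engineer", "implements", "Analytical Methods"),
    Relation("Data Engineer", "determines", "Data Sources"),
    Relation("Data Engineer", "determines", "Analytical Tools"),
]


def relations_for(subject: str) -> List[Relation]:
    """Return every relation whose subject matches the given role."""
    return [r for r in RELATIONS if r.subject == subject]


if __name__ == "__main__":
    for relation in relations_for("Data Engineer"):
        print(f"{relation.subject} {relation.verb} {relation.obj}")
```

Keeping the edges as plain data makes it easy to extend the list as more of the diagram is transcribed.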
Empirical discovery concept model
Data Sets
Data Sources
Models
Analytical Tools
Empirical Method
Experiments
Hypotheses
Results
Questions or beliefs
Predictions
Conclusions
Domain
Analytical Methods
Insight Consumer
Data Scientist
Articulates
Directs & applies
Creates & refines
Effected by
Lead to
Tested by
Use / require
Motivate
Creates & refines
Generate
Achieves
Informed by & shares
Inform
Understands
Defines & evolves
Informs
Data Engineer
Implements
Determines
Applied to
Validates
Applied to
Development Corpus
External Sources
Production Corpus
Mirrors
Applied to
Reference Initial Interim New
Drawn from
Implemented as
Implements
Informs
What is the question?
How will we answer the question?
What data will we use?
What analytical method will we use?
What tools will we use?
What are the results?
What do the results mean?
What did we learn / discover?
Who should we inform?
What is the next question?
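The ten questions above read as an ordered cycle in which the last question leads back to the first. Here is a minimal sketch of that reading, assuming the questions are strictly sequential (the diagram itself may allow other paths). The names `QUESTIONS` and `next_question` are illustrative and are not part of the model.

```python
from typing import List

# The ten guiding questions, in the order they appear in the concept model.
QUESTIONS: List[str] = [
    "What is the question?",
    "How will we answer the question?",
    "What data will we use?",
    "What analytical method will we use?",
    "What tools will we use?",
    "What are the results?",
    "What do the results mean?",
    "What did we learn / discover?",
    "Who should we inform?",
    "What is the next question?",
]


def next_question(current: str) -> str:
    """Return the question that follows `current`.

    Assumption: the cycle is strictly sequential, and the final question
    wraps around to the first one to start a new round of discovery.
    Raises ValueError if `current` is not one of the listed questions.
    """
    index = QUESTIONS.index(current)
    return QUESTIONS[(index + 1) % len(QUESTIONS)]


if __name__ == "__main__":
    for number, question in enumerate(QUESTIONS, start=1):
        print(f"{number:2d}. {question}")
```

Treating the list as a cycle reflects the idea that results and insights feed the next round of questions.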
Manages
Manages
Published as
Empirical Discovery Method Concept Model
Joe Lamantia
Product Strategist: Big Data and Discovery
Oracle
Joe.Lamantia@Oracle.com
v 4.2 | June 2014
Understands
Exploratory Investigative
Model Building
Validation Training
Informs
Production Models
Insights
Data Products
Measure Test Algorithm