From sports to scientific research, a surprising range
of industries will begin to find value in big data.....
Digital Health Technologies
These are some of the most important DIGITAL HEALTH CATEGORIES.....
• Digital Imaging – (MRI / CT / X-Ray / Ultrasound)
• Robotic Surgery – (Microsurgery / Remote Surgery)
• Patient Monitoring – (Clinical Trials / Health / Wellbeing)
• Biomedical Data – (Data Streaming / Biomedical Analytics)
• Epidemiology – (Disease Transmission / Contact Management)
• Emergency Incident Management – (Response Teams / Alerts and Alarms)
Here are a few of the most important DIGITAL MONITORING SMART APPS.....
• Activity Monitor – (Pedometer / GPS)
• Position Monitor – (Falling / Fainting / Fitting)
• Breathing Monitor – (Breathing Rate / SATS Level)
• Sleep Monitor – (Light Sleep / Deep Sleep / REM / Apnoea)
• Blood Monitor – (Glucose / Oxygen / Hormones / Organ Function)
• Cardiac Monitor – (Heart Rhythm / Blood Pressure / Cardiac Events)
Digital Health Technologies
These are some of the most influential FUTURE DIGITAL HEALTH leaders: -
– Huawei - John Frieslaar (Digital Futures)
– Cisco - Andrew Green (Digital Healthcare)
– ElationEMR - Kyna Fong (Digital Imaging)
– Microsoft - John Coplin (Digital Healthcare)
– Google - Eze Vidra (Head of Campus at Tech City)
– GE Healthcare - Catherine Yang (Digital Healthcare)
– MIT – Prof Alex “Sandy” Pentland (Digital Epidemiology)
– Telefónica Digital – Matthew Key – CEO (Digital Healthcare)
– Open University – Dr. Blaine Price (Digital Patient Monitoring)
– UCSD – Prof. Larry Smarr (FuturePatient – Digital Patient Monitoring)
– Telefónica – Dr. Mike Short CBE (Digital Futures and the Smart Ward)
– Thames Valley Health Innovation and Education Cluster – David Doughty
– Department for Business, Innovation & Skills – Richard Foggie, KTN Executive
– Science City Research Alliance – Sarah Knaggs (Strategic Project Manager)
Digital Healthcare – Executive Summary
• Digital Healthcare is a cluster of new and emerging applications and technologies that exploit digital, mobile
and cloud platforms for treating and supporting patients. The term "Digital Healthcare" is necessarily broad
and generic, because this innovation-driven approach, built on Bioinformatics and Medical Analytics, is
applied to a very wide range of social and health problems - from monitoring patients in intensive care,
general wards, in convalescence or at home – to helping general practitioners make better informed and
more accurate diagnoses, and improving the effect of prescription and referral decisions for clinical treatment.
• Bioinformatics and Medical Analytics utilises Data Science to provide actionable clinical insights. Digital
Healthcare has evolved from the need for more proactive and efficient healthcare service delivery, and
seeks to offer new and improved types of pro-active and preventive monitoring and medical care at reduced
cost – using methods that are only possible thanks to emerging SMAC Digital Technology.
Digital Healthcare Technologies – Bioinformatics and Medical Analytics: -
– Digital Patient Monitoring
– Biomedical Data Streaming
– Biomedical Data Science and Analytics
– Epidemiology, Clinical Trials, Morbidity and Actuarial Outcomes
• Novel and emerging high-impact Biomedical Health Technologies such as Bioinformatics and Medical
Analytics are transforming the way that Healthcare Service Providers can deliver Digital Healthcare globally
– and Digital Health Technology entrepreneurs, investors and researchers are becoming increasingly
interested in and attracted to this important and rapidly expanding Life Sciences industry sector.
Digital Healthcare – Executive Summary
• While many industries can benefit from SMAC digital technology – Smart Devices, Mobile Platforms,
Analytics and the Cloud – this is especially the case for Life Sciences, Pharma and Healthcare
industry sectors – resulting in more accurate diagnosis, improved treatment regimes, more reliable
prognosis, better patient monitoring, care and clinical outcomes. Let’s take a look at some of the
Digital Technologies that are bringing significant improvements and benefits to Healthcare.
• Today, thanks to the regulatory compliance requirements of HIPAA, HITECH, PCI DSS and ISO
27001, the reluctance to adopt Digital Technology has been overcome, and Digital Healthcare
adoption is gaining increased traction. Many of the security features required for data protection and
patient confidentiality are being addressed by Digital Healthcare service providers, thereby relieving
healthcare delivery organizations of tedious and complex security and data protection frameworks.
Biomedical Data Analytics:
• The exploitation of data by applying analytical methods such as statistics, predictive and quantitative
models to patient segments or groups of the population will provide better insights and achieve better
outcomes. As far back as 2010, there was evidence that: “93 percent of healthcare providers
identified the digital information explosion as the major factor which will drive organizational change
over the next 5 years.”
(Related article: Cloud and healthcare: A revolution is coming)
Digital Healthcare – Executive Summary
Data Security and Privacy:
• Today, thanks to the regulatory compliance requirements of HIPAA, HITECH, PCI DSS and
ISO 27001, reluctance to adopt emerging technologies is starting to be addressed and digital
technology is beginning to gain traction - bear in mind also that many of the security features
required for data security and protection are addressed by the service providers, thereby
relieving the healthcare organization of tedious and complex security frameworks.
Mobility:
• Mobility Services, where Smart Devices, Smart Apps, Mobile Platforms and Cloud
Infrastructure provide the backbone for medical personnel to access all sorts of patient
information from anywhere - and from a wide range of mobile devices.
Collaboration with patients:
• Mobility means that complete patient records are now available to healthcare professionals
anytime, anywhere – allowing physicians to access historical patient case records, images
and clinical data to fine-tune their diagnosis and make informed decisions on treatment –
thus reducing diagnosis latency, increasing accuracy and improving patient care and clinical
outcomes from initial consultation to specialist referrals. Some scenarios are illustrated in
the following: -
• Physician Collaboration Solutions (PCS) •
• PCS solutions offer video conferencing to facilitate remote consultations and care
continuity, allowing patients to be viewed remotely. PCS allows physicians to consult with
patients and even perform remote robotic surgery. This is dubbed “tele-health solutions.”
Digital Healthcare – Executive Summary
• Electronic Medical Records (EMR) •
• Every piece of information pertaining to a specific patient is recorded and stored. The solution is
designed to capture and provide a patient’s data at any time of the patient’s monitoring
cycle, including the complete medical records and history.
• Patient Information Exchange (PIE) •
• This allows for the healthcare information to be shared electronically across organizations
within a region, community or hospital system. There are currently several Digital
Healthcare cloud service providers addressing this market, taking the role of collecting and
distributing medical information from and among multiple organizations.
• The New York Times has published an interesting article illustrating the use of the cloud
in healthcare - leveraging big data in the cloud to manage patient relationships and clinical
outcomes.
Collaboration among peers:
• Technology can provide medical assistance to doctors in the field, be it in remote areas or
in emergency relief operations through satellite communications. Refer to the Remote
Assistance for Medical Teams Deployed Abroad (T4MOD project) which could easily
find its place in the Digital Healthcare cloud space.
4D Geospatial Analytics in Digital Healthcare
GIS Mapping and Spatial Analysis
• 4D Geospatial Analytics is the
Geographic profiling and analysis of
large aggregated datasets in order to
determine a ‘natural’ structure of
clusters or groupings – this provides an
important basic technique for many
statistical and analytic applications.
• Environmental and Demographic
Geospatial Cluster Analysis – based
on geographic distribution or profile
similarities – is a statistical method
whereby no prior assumptions are
made concerning the nature of internal
data structures (the number and type of
groups and hierarchies). Geo-spatial
and geodemographic techniques are
frequently used in order to profile and
segment populations using ‘natural’
groupings such as shared or common
behavioural traits – Medical, Clinical
Trial, Morbidity or Actuarial outcomes -
along with many other common factors
and shared characteristics.....
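A minimal sketch of one clustering method that makes no prior assumption about the number of groups: density-based clustering (DBSCAN). The slides do not name a specific algorithm, so the choice of DBSCAN, the function name `dbscan`, the parameters `eps` and `min_pts`, and the sample coordinates are all illustrative assumptions, not the author's method.

```python
from math import hypot


def dbscan(points, eps, min_pts):
    """Label each (x, y) point with a cluster id (0, 1, ...) or -1 for noise.

    eps     : neighbourhood radius (hypothetical tuning parameter)
    min_pts : minimum neighbours (including the point itself) for a core point
    """
    labels = [None] * len(points)

    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# Made-up planar coordinates: two dense groups and one isolated point.
sample_points = [(0, 0), (0, 1), (1, 0), (1, 1),
                 (10, 10), (10, 11), (11, 10),
                 (50, 50)]
sample_labels = dbscan(sample_points, eps=1.5, min_pts=3)
```

The number of clusters is not passed in; it emerges from the density of the data. Real geographic coordinates would need a great-circle distance rather than the planar distance used here.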
GIS Mapping and Spatial Analysis
• GIS MAPPING and SPATIAL DATA ANALYSIS •
• A Geographic Information System (GIS) integrates hardware, software and digital data
capture and streaming devices – including machine generated data capture such as Computer-
aided Design (CAD) information from land and building surveys, Global Positioning System
(GPS) terrestrial location data, wearable technology and biomedical data streams – in order to
acquire, manage, analyse, distribute, communicate and display every type of static and mobile
geographically dependent location data, along with data streams such as imaging data feeds –
including personal, transportation and environment, HD CCTV, aerial and satellite image data.....
• Spatial Data Analysis is a set of techniques for analysing 3-dimensional spatial (Geographic)
data and location (Positional) object data overlays. GIS Software that implements spatial data
analysis techniques requires access to both the locations of objects and their physical attributes.
Spatial statistics extends traditional statistics to support the analysis of geographic data. Spatial
Data Analysis provides techniques to describe the distribution of data in a geographic space
(descriptive spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster
analysis), identify and measure spatial relationships (clusters and spatial regression), and create
3D surface models from sampled data (spatial interpolation, often categorised as geo-statistics).
• The results of spatial data analysis are largely dependent upon the type, quantity,
distribution and data quality of the spatial objects which are subject to analysis…
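Spatial interpolation, mentioned above, can be sketched with inverse distance weighting (IDW), one of the simplest interpolation schemes. The slides do not say which interpolator is meant, so IDW, the function name `idw` and the `power` parameter are assumptions made for illustration.

```python
from math import hypot


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from sampled points.

    samples : list of (x, y, value) tuples
    power   : distance exponent; larger values give nearby samples more weight
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # query point coincides with a sample
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Made-up readings at two locations; the midpoint is weighted equally by both.
readings = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(readings, 1.0, 0.0)
```

Evaluating `idw` over a regular grid gives a simple surface model. As the slide notes, the result depends heavily on the quantity and distribution of the samples.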
GIS Mapping and Spatial Analysis
Figure: GIS Gazetteer – Biomedical Clusters
The Cone™ – Actionable Clinical Insights
The Cone™ – Patient Model
The Cone™ - Patient Model
turning Biomedical Data Streams, Social Intelligence, Patient
Monitoring and Analytics – into Actionable Clinical Insights…
• Acute – (10%) Active Patient Monitoring – Alerts and Alarms
• Chronic – (20%) Passive Monitoring – Biomedical Data Streaming
• Casuals – (30%) Walk-in – on-demand Monitoring and Treatment
• Indifferent – (40%) Annual Screening – Health-check and Review
Figure: The Cone™ – Patient Clusters (Acute 10%, Chronic 20%, Casuals 30%, Indifferent 40%), with Electronic Medical Records (G-cloud EMR)
Figure: The Cone™ Patient Biomedical Analytics – Actionable Clinical Insights (Presentation, Clustering, Biomedical Profile)
Figure: Hybrid Cone – 3 Dimensions; Biomedical Epidemiology – Groups (Streams), Types (Segments); Biomedical Analytics
The Cone™ - Eight Primitives
Primitive | Domain | Function | Product
Who ? | People – Patient | Patient Information System | Electronic Medical Records (EMR)
Where ? | Places – Location | 1st Responders, Emergency Services, GP, Nurse, Doctor | Command / Control / Geospatial Analytics
When ? | Medical Incident / Event | Event Type - Referral, Walk-in, Appointment, Emergency | Incident Management – Event Type / Time / Date
What ? | Emergency / Medical / Clinical Procedure | Investigate / Test / Diagnose / Treatment / Follow-up | Patient Administration / Patient Care Systems
Why ? | Reason / Motivation / Cause / Outcome | Triage Patient Status - Acute, Chronic, Casual, Indifferent | Biomedical Information Streaming and Analytics
How ? | Patient Medical Data | Automatic Streaming of Biomedical Data to Cloud | Mobile Platforms / IoT, Smart Devices / Apps
Which ? | Investigation / Test / Observe / Diagnosis | Healthcare Provider - GPs Surgery, Clinics, Hospitals | Patient Administration / Patient Care Systems
Via ? | Referral Channel / Health Service Delivery Partner | Healthcare Service Provider – Surgery, Clinics, Hospitals | Healthcare Service Partner / Procedure
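One way to read the eight primitives is as the fields of a single event record that can be grouped along any dimension. The sketch below is an illustration only: the class name `MedicalFact`, the field types, the helper `count_by` and the sample values are hypothetical and are not taken from the Cone™ product.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MedicalFact:
    """One medical event described by the eight primitives (hypothetical layout)."""
    who: str        # patient identifier
    where: str      # location of the event
    when: datetime  # time and date of the event
    what: str       # clinical procedure carried out
    why: str        # triage status: Acute / Chronic / Casual / Indifferent
    how: str        # source of the patient medical data
    which: str      # healthcare provider
    via: str        # referral channel or delivery partner


def count_by(facts, primitive):
    """Count facts grouped by one primitive, e.g. 'why' or 'where'."""
    return Counter(getattr(fact, primitive) for fact in facts)


# Made-up records, for illustration only.
facts = [
    MedicalFact("patient-1", "Ward A", datetime(2015, 1, 5, 9, 30), "Diagnosis",
                "Acute", "Cardiac monitor", "Hospital", "Emergency"),
    MedicalFact("patient-2", "Home", datetime(2015, 1, 6, 14, 0), "Follow-up",
                "Chronic", "Smart App", "GP Surgery", "Referral"),
    MedicalFact("patient-3", "Home", datetime(2015, 1, 7, 11, 15), "Test",
                "Chronic", "Smart App", "Clinic", "Referral"),
]
status_counts = count_by(facts, "why")
```

Grouping by `why` gives the patient segments, and grouping by `where` gives the input for the geospatial analysis described elsewhere in the deck.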
The Patient Cone™ – EIGHT PRIMITIVES
Figure: star diagram with a central Cone™ MEDICAL FACT (WHO ? WHAT ? WHERE ? HOW ? WHEN ? WHY ?) surrounded by the Event, Party, Geographic, Motivation, Time and Data Dimensions, plus a Procedure Dimension (WHICH ?) and a Channel Dimension (VIA ?).
Label groups shown in the figure:
• Indifferent / Casuals / Chronic / Acute
• Clinical Notes / Images / Graphs / Biomedical Data / Lab Test Results / Cardiac Activity / Brain Activity
• Consultation / Clinical Tests / Diagnosis / Treatment
• Appointment / Attendance / Phone Call / Letter
• Location / Attitude / Movement
• Region / Country, State / County, City / Town, Street / Building, Postcode
• Person / Organisation
• Procedure / Prescription (WHICH ?)
• Channel / Partner, Hospital / Clinic (VIA ?)
• Walk-in / Emergency / Referral / Follow-up (Event)
• Outer labels: Patient Data, Delivery Channel, Environment Data, Subject Location, Biomedical Data, Motivation, Patient, Time / Date
Version 3 – Healthcare
The Biomedical Cone™ – Converting Data Streams into Actionable Insights
Figure: data-flow diagram. Labels recoverable from the figure:
• Biomedical Data Streaming – People, Places and Events
• Social Media Monitoring; Experian Mosaic
• Electronic Medical Records (EMR) – Individuals, Households, Geo-demographics, Patient Streaming, Patient Segmentation
• Patient Records – Medical History, Key Events
• Biomedical Data – Clinical and Biomedical Data; Images (X-Ray, CT, MRI); Procedures and Interventions; Prescriptions and Treatment
• Big Data Analytics – Salesforce, Anomaly 42, Cone, Unica
• The Cone™ Patient Biomedical Analytics – Actionable Medical Insights
• Patient Monitoring Platform; Health Campaigns; End User
• Intervention – Treatment, Smart Apps
Proof-of-concept and Prototype
The Patient Pyramid™ approach is lean, agile, smart and creative: -
• We start by providing a custom Pyramid™ Enterprise Application as a proof of concept.
We then work with client key stakeholders to scope a detailed brief which articulates a
business problem domain that the Patient Pyramid™ can help understand and resolve.
• We then harvest all current and past patient records along with any other available internal
and public domain biomedical data – in order to establish a baseline Patient Pyramid™.
• This is augmented by overlaying external data - Social Intelligence and other live
streamed Biomedical and Patient Lifestyle Data that drives our new real-time Patient
Pyramid™ view describing the six primitives - who / what / why / where / when and how.
• Finally, we exploit social intelligence for Patient Lifestyle Understanding – creating new
actionable insights to inform creative medical campaign solutions against the agreed brief.
• Post proof-of-concept, we can then agree a Pyramid™ Enterprise Application fixed term
licence along with Patient Pyramid™ add-ons, enhancements, consulting, mentoring,
training and support – on-line, on-site, on-demand - whenever and wherever required.
4D Geospatial Analytics in Digital Healthcare
Digital Futures: - Creating new roles and value chains
Novel and emerging Biomedical Health Technologies are transforming the way that
Healthcare Providers can deliver Healthcare globally – with Digital Health
Technology entrepreneurs and investors becoming increasingly attracted to this
rapidly growing industry sector.
Healthcare Delivery is currently undergoing a global transformation – with Digital
Health Technologies leading the way. Companies such as BT Health, Blueprint
Health, BUPA, Microsoft (John Coplin), Telefónica Digital (Dr. Mike Short) and
Rock Health are all shaping novel and emerging Digital Healthcare Technologies -
bringing new and innovative business propositions to market.
4D Geospatial Analytics
Geo-spatial and geodemographic
techniques are frequently used to
profile, stream and segment human
populations using ‘natural’ groupings
such as shared or common
behavioural traits – Medical, Clinical
Trial, Morbidity or Actuarial outcomes
– along with many other common
factors and shared characteristics.....
The profiling and analysis of large
aggregated datasets in order to
determine a ‘natural’ structure of
clusters or groupings, provides an
important basic technique for many
statistical and analytic applications.
Based on geographic distribution or
profile similarities – Geospatial
Clustering is a statistical method
whereby no prior assumptions are
made concerning the nature of
internal data structures (the number
and type of groups and hierarchies).
4D Geospatial Analytics
Figure: GIS Gazetteer – Biomedical Clusters
The Flow of Information through Time
• Space-Time is a four-dimensional (4D) integrated dimensional cluster consisting of the
three Spatial dimensions (x, y and z axes) plus Time (the fourth dimension - t). Space-
Time exists in discrete packages (Temporal Planes) - with the whole of Space-Time
existing as an endless stack of Temporal Planes extending from the remote Past, through
into our Present, and onwards to the distant Future. Events exist as a line through this
stack of Temporal Planes. Thus Time Present is always inextricably woven into both Time
Past and Time Future. Every item of Global Content in the Present is somehow connected
with both Past and Future temporal planes in a timeline which is composed of a sequence
of temporal planes stacked one on top of another. The “arrow of time” governs the flow of
Space-Time which can only flow in a single direction - relentlessly towards the future.
• Space-Time does not flow uniformly – the path of the “arrow of time” may be deflected or
changed by various factors – gravitational fields, dark matter, dark energy, dark flow,
hidden dimensions or unknown Membranes in Hyperspace. There may also exist “hidden
external forces” (unseen interactions) that create disturbance in the temporal plane stack
which marks the passage of time - with the potential to create eddies, vortices and
whirlpools along the trajectory of Time (chaos, disorder and uncertainty) – which in turn
possess the capacity to generate ripples and waves (randomness and disruption) – thus
changing the course of the Space-Time continuum. “Weak Signals” are “Ghosts in the
Machine” – echoes of these subliminal temporal interactions – that may contain
insights or clues about possible future “Wild card” or “Black Swan” random events.
The Flow of Information through Time
• String Theory physicists and mathematicians postulate that Space-Time exists in discrete
packages (Temporal Planes) - with the whole of Space-Time existing as an endless stack
of Temporal Planes extending from the remote Past, through into our Present, and
onwards to the distant Future. Thus Time Present is always inextricably woven into both
Time Past and Time Future. This yields the intriguing possibility of glimpses through the
mists of time into the outcomes of future Event Paths – both isolated Events and linked
Event Clusters – as any item of Data or Information (Global Content) may contain faint
traces which offer insights into the future trajectory of Past, Present and Future Events.
• If all future timelines were linear in nature - then every event would unfold in an unerringly
predictable manner towards a known and certain conclusion. The future is, however, both
unknown and unknowable (Hawking Paradox). Events exist as a line through this stack of
Temporal Planes. Future timelines are non-linear (branched) with an infinite multitude of
possible alternative futures – rendering future outcomes as uncertain and unpredictable.
Chaos Theory suggests to us that even the most ethereal and subliminal system inputs
originating from invisible random events in the Space-Time continuum, are able to project
minute unknown forces so small as to be undetectable, which may then simply disappear
– or become amplified over time through numerous system cycles to grow in influence and
impact – slowly deviating predicted Space-Time trajectories far away from their original
estimated path – thus fundamentally altering the flow and outcome of Future Events.
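The amplification of minute inputs described above is usually called sensitive dependence on initial conditions. A small sketch using the logistic map, a standard textbook example of chaotic behaviour, shows the effect. The map, the parameter `r = 4.0`, the starting values and the function name `logistic_orbit` are illustrative assumptions and do not come from the slides.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two trajectories whose starting points differ by one part in ten billion.
baseline = logistic_orbit(0.2)
perturbed = logistic_orbit(0.2 + 1e-10)

# Absolute gap between the two trajectories at every step.
gaps = [abs(p - q) for p, q in zip(baseline, perturbed)]
```

The gap starts far below anything measurable and grows until the two trajectories bear no resemblance to each other, which is why long-range prediction of such systems is impractical even when the rule itself is fully known.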
4D Geospatial Analytics – The Temporal Wave
• The Temporal Wave is a novel and innovative method for Visual Modelling and Exploration
of Geospatial “Big Data” - simultaneously within a Time (history) and Space (geographic)
context. The problems encountered in exploring and analysing vast volumes of spatial–
temporal information in today's data-rich landscape are becoming increasingly difficult to
manage effectively. Overcoming the problem of data volume and scale in a Time
(history) and Space (location) context requires not only the traditional location–space and
attribute–space analysis common in GIS Mapping and Spatial Analysis - but also the
additional dimension of time–space analysis. The Temporal Wave supports a new method
of Visual Exploration for Geospatial (location) data within a Temporal (timeline) context.
• This time-visualisation approach integrates Geospatial (location) data with Temporal
(timeline) data, along with data visualisation techniques - thus improving accessibility,
exploration and analysis of the huge amounts of geo-spatial data used to support geo-visual
“Big Data” analytics. The Temporal Wave combines the strengths of both linear timeline
and cyclical wave-form analysis – and is able to represent data both within a Space
(geographic) and Time (history) context simultaneously – and even at different levels of
granularity. Linear and cyclic trends in space-time data may be represented in combination
with other graphic representations typical for location–space and attribute–space data-
types. The Temporal Wave can be used in multiple roles for exploring very large scale
datasets containing Geospatial (location) data within a Temporal (timeline) context - as an
integrated Space-Time data reference system, as a Space-Time continuum representation
and animation tool, and as Space-Time interaction, simulation and analysis tool.
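The combination of linear and cyclic trends can be illustrated with a classical decomposition: fit a straight line to a time series, then average what is left over by position within the cycle. This is a generic sketch under stated assumptions, not the Temporal Wave method itself; the function names, the fixed `period` and the synthetic series are all hypothetical.

```python
def linear_trend(values):
    """Least-squares intercept and slope for values observed at t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return y_mean - slope * t_mean, slope


def cyclic_profile(values, period):
    """Average detrended value at each position within a cycle of known length."""
    intercept, slope = linear_trend(values)
    residuals = [y - (intercept + slope * t) for t, y in enumerate(values)]
    return [sum(residuals[p::period]) / len(residuals[p::period])
            for p in range(period)]


# Synthetic series: a steady rise of 2 per step plus a repeating 4-step cycle.
cycle = [0.0, 3.0, 0.0, -3.0]
series = [2.0 * t + cycle[t % 4] for t in range(40)]

trend_intercept, trend_slope = linear_trend(series)
profile = cyclic_profile(series, period=4)
```

The slope recovers the linear component and the profile recovers the repeating component, up to a small error because the two are not perfectly separable in a finite sample.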
4D Geospatial Analytics – The Temporal Wave
• The problems encountered in exploring, analysing and extracting insights from the vast
volumes of spatial–temporal information in today's data-rich landscape are becoming
increasingly difficult to manage effectively. Overcoming the problem of data
volume and scale in an integrated Time (history) and Space (location) context requires
not only the traditional location–space and attribute–space analysis common in GIS Mapping
and Spatial Analysis - but also the additional dimension of Space-Time analysis. The
Temporal Wave supports a new method of Visual Exploration for Geospatial (location)
data within a Temporal (timeline) context. The Temporal Wave is a novel and innovative
method for Visual Modelling, Exploration and Analysis of the Space-Time dimension
fundamental to understanding Geospatial “Big Data” – through simultaneously visualising
and displaying complex data within a Time (history) and Space (geographic) context.
Figure: Simplexity – a Simplicity / Complexity axis (increasing element and interaction density) shown with Order / Chaos, Enthalpy / Entropy and the “arrow of time”; labelled regions: Linear Systems, Ordered Complexity, Complex Adaptive Systems (CAS), Disordered Complexity
4D Geospatial Analytics – The Temporal Wave
• The Temporal Wave time-visualisation approach integrates Geospatial (location) data
within a Temporal (timeline) dataset - along with other data visualisation techniques - thus
improving accessibility, exploration and analysis of the huge amounts of geo-spatial data
used to support geo-visual “Big Data” analytics. The Temporal Wave combines the
strengths of both linear timeline and cyclical wave-form analysis – and is able to represent
complex data both within a Time (history) and Space (geographic) context simultaneously
– even at different levels of granularity. Linear and cyclic trends in space-time data may be
represented in combination with other graphic representations typical for location–space
and attribute–space data-types. The Temporal Wave can be deployed and used in roles
as diverse as a Space-Time data reference system, as a Space-Time continuum
representation tool, and as Space-Time display / interaction / simulation / analysis tool.
Digital Healthcare – Technical Appendices
4D Geospatial Analytics – London Timeline
• How did London evolve from its creation as a Roman city in AD 43 into the
crowded, chaotic cosmopolitan megacity we see today? The London Evolution
Animation takes a holistic view of what has been constructed in the capital over
different historical periods – what has been lost, what saved and what protected.
• Greater London covers 600 square miles. Up until the 17th century, however,
the capital city was crammed largely into a single square mile which today is
marked by the skyscrapers which are a feature of the financial district of the City.
• This visualisation, originally created for the Almost Lost exhibition by the Bartlett
Centre for Advanced Spatial Analysis (CASA), explores the historic evolution of
the city by plotting a timeline of the development of the road network - along with
documented buildings and other features – through 4D geospatial analysis of a
vast number of diverse geographic, archaeological and historic data sets.
• Unlike other historical cities such as Athens or Rome, with an obvious patchwork
of districts from different periods, London's individual structures, scheduled sites
and listed buildings were in many cases constructed gradually, from parts assembled
during different periods. Researchers who have tried previously to locate and
document archaeological structures and research historic references will know
that these features, when plotted, appear scrambled up like pieces of different
jigsaw puzzles – all scattered across the contemporary London cityscape.
History of Digital Epidemiology
• Doctor John Snow (15 March 1813 – 16
June 1858) was an English physician and a
leading figure in the adoption of anaesthesia
and medical hygiene. John Snow is largely
credited with sparking and pursuing a total
transformation in Public Health and epidemic
disease management and is considered one
of the fathers of modern epidemiology in part
because of his work in tracing the source of
a cholera outbreak in Soho, London, in 1854.
• John Snow’s investigation of, and findings on,
the Broad Street cholera outbreak - which
occurred in 1854 near Broad Street in the
London district of Soho in England - inspired
fundamental changes in both the clean and
waste water systems of London, which led to
further similar changes in other cities, and a
significant improvement in understanding of
Public Health around the whole of the world.
History of Digital Epidemiology
• The Broad Street cholera outbreak of
1854 was a severe outbreak of cholera
which occurred near Broad Street in
the London district of Soho in England.
• This cholera outbreak is best known for
statistical analysis and study of the
epidemic by the physician John Snow
and his discovery that cholera is spread
by contaminated water. This knowledge
drove improvement in Public Health with
mass construction of sanitation facilities
from the middle of the 19th century.
• Later, the term "focus of infection" would
be used to describe factors such as the
Broad Street pump – where Social and
Environmental conditions may result in
the outbreak of local infectious diseases.
History of Digital Epidemiology
• It was the study of
cholera epidemics,
particularly in
Victorian England
during the middle of
the 19th century,
which laid the
foundation for
epidemiology - the
applied observation
and surveillance of
epidemics and the
statistical analysis of
public health data.
• This discovery came
at a time when the
miasma theory of
disease transmission
by noxious “foul air”
prevailed in the
medical community.
History of Digital Epidemiology
Modern epidemiology has its origin with the study of Cholera
Broad Street cholera outbreak of 1854
History of Digital Epidemiology
Modern epidemiology has its origin with the study of Cholera.
• It was the study of cholera epidemics, particularly in Victorian England
during the middle of the 19th century, that laid the foundation for the science
of epidemiology - the applied observation and surveillance of epidemics and
the statistical analysis of public health data. It was during a time when the
miasma theory of disease transmission prevailed in the medical community.
• John Snow is largely credited with sparking and pursuing a transformation in
Public Health and epidemic disease management from the extant paradigm
in which communicable illnesses were thought to have been carried by
bad, malodorous airs, or “miasmas” - towards a new paradigm which would
begin to recognize that virulent contagious and infectious diseases are
communicated by various other means – such as water being polluted by
human sewage. This new approach to disease management recognised that
contagious diseases were either directly communicable through contact with
infected individuals - or via vectors of infection (water, in the case of cholera)
which are susceptible to contamination by viral and bacterial agents.
History of Digital Epidemiology
• This map is John Snow’s
famous plot of the 1854
Broad Street Cholera
Outbreak in London. By
plotting epidemic data on a
map like this, John Snow
was able to identify that the
outbreak was centred on a
specific water pump.
• Interviews confirmed that
outlying cases were from
people who would regularly
walk past the pump and
take a drink. At his urging the
handle was removed from the
pump, by which time the
outbreak was already subsiding.
• The cause of cholera
(the bacterium Vibrio cholerae)
was unknown at the time,
and Snow’s important work
with cholera in London
during the 1850s is
considered the beginning of
modern epidemiology.
Some have even gone so
far as to describe Snow’s
Broad Street Map as the
world’s first GIS.
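The mapping idea can be sketched in a few lines: assign each case to its nearest water source and count the cases per source. The coordinates and pump names below are made up for illustration and are not Snow's data; the function names are hypothetical, and straight-line distance stands in for walking distance.

```python
from collections import Counter
from math import hypot


def nearest_source(case, sources):
    """Return the name of the source closest to a case location (x, y)."""
    return min(sources,
               key=lambda name: hypot(case[0] - sources[name][0],
                                      case[1] - sources[name][1]))


def cases_per_source(cases, sources):
    """Count how many cases fall closest to each source."""
    return Counter(nearest_source(case, sources) for case in cases)


# Hypothetical pump locations and case locations on an arbitrary grid.
pumps = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0)}
case_locations = [(1.0, 1.0), (2.0, 0.0), (0.0, 3.0), (9.0, 1.0)]

tally = cases_per_source(case_locations, pumps)
```

A source whose tally is far higher than its neighbours' is a candidate focus of infection that merits follow-up, such as the interviews described above.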
Clinical Risk Types
Figure: two radial diagrams with lettered segments, Clinical Risk Types and Morbidity Risk Types.
• Clinical Risk Group – Human Risk (Employee or Service Provider, Patient), Process Risk, Legal Risk, 3rd Party Risk, Technology Risk, Sponsorship Risk (Stakeholders), Compliance Risk (Performance, Finance, Standards), Patient Risk (Patients)
• Morbidity Risk Group – Trauma Risk, Immunological System Risk, Disease Risk, Shock Risk, Cardiovascular System Risk, Pulmonary System Risk, Toxicity Risk, Organ Failure Risk, Triage Risk (Airways, Cognitive, Bleeding), Neurological System Risk, Predation Risk, Environment Risk
Risk Complexity Map
• Case Study • Pandemics
• Pandemics - during a pandemic episode, such as the recent Ebola outbreak, current
policies emphasise the need to ground decision-making on empirical evidence. This section
studies the tension that remains in decision-making processes when there is a sudden and
unpredictable change of course in an outbreak – or when key evidence is weak or ‘silent’.
• The current focus in epidemiology is on the ‘known unknowns’ - factors with which we
are familiar in the pandemic risk assessment processes. These risk processes cover, for
example, monitoring the course of the pandemic, estimating the most affected age groups,
and assessing population-level clinical and pharmaceutical interventions. This section
looks for the ‘unknown unknowns’ - factors with a lack of, or silence, of evidence, which
we have only limited or weak understanding in the pandemic risk assessment processes.
• Pandemic risk assessment shows, that any developing, new and emerging or sudden and
unpredictable change in the pandemic situation does not accumulate a robust body of
evidence for decision making. These uncertainties may be conceptualised as ‘unknown
unknowns’, or “silent evidence”. Historical and archaeological pandemic studies indicate
that there may well have been evidence that was not discovered, known or recognised.
This section looks at a new method to discover “silent evidence” - unknown factors - that
affect pandemic risk assessment - by focusing on the tension under pressure that impacts
upon the actions of key decision-makers in the pandemic risk decision-making process.
Antonine Plague (Smallpox), AD 165–180
Pandemic Black Swan Events
Black Swan | Pandemic Type / Location | Impact | Date
Malaria | For the entirety of human history, Malaria has been a pathogen | The Malaria pathogen kills more humans than any other disease | 20 kya – present
Smallpox (Antonine Plague) | Smallpox – Roman Empire / Italy | Smallpox is the 2nd worst killer | 165–180
Black Death (Plague of Justinian) | Bubonic Plague – Roman Empire | 50 million people died | 6th century
Black Death (Late Middle Ages) | Bubonic Plague – Europe | 75 to 200 million people died | 1340–1400
Smallpox | Amazonian Basin Indians | 90% of Amazonian Indians died | 16th century
Tuberculosis | Western Europe | 900 deaths per 100,000 pop. | 18th – 19th c
Syphilis | Global pandemic – potentially fatal | 10% of Victorian men carriers | 19th century
1st Cholera Pandemic | Global pandemic | Started in the Bay of Bengal | 1817–1823
2nd Cholera Pandemic | Global pandemic | Arrived in London in 1832 | 1826–1837
Spanish Flu | Global pandemic | 50 million people died | 1918
Smallpox | Global pandemic | 300 million people died in 20th c | Eradicated 20th c
Poliomyelitis | Global pandemic | Contracted by up to 500,000 persons per year | 1950s – 1960s
AIDS | Global pandemic – mostly fatal | 10% of Sub-Saharans are carriers | Late 20th century
Ebola | West African epidemic – 50% fatal | Sub-Saharan Africa epicentre | 1976–2015
For the entirety of human history, Malaria has
been the most lethal pathogen to attack man
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
1 Malaria (Parasitic Biological Disease)
The Malaria pathogen has killed more humans than any other disease. Malaria
may have been a human pathogen for the entire history of our species. Human
malaria most likely originated in Africa and has coevolved along with its hosts,
mosquitoes and non-human primates. Humans could have originally caught
Plasmodium falciparum from gorillas. The first evidence of malaria parasites is
approximately 30 million years old, found in mosquitoes preserved in amber from
the Palaeogene period. About 10,000 years ago - a period which coincides with the
development of agriculture (Neolithic revolution) - malaria started having a major
impact on human survival. A consequence was natural selection for sickle-cell
disease, thalassaemias, glucose-6-phosphate dehydrogenase deficiency,
ovalocytosis, elliptocytosis and loss of the Gerbich antigen (glycophorin C) and
the Duffy antigen on erythrocytes because such blood disorders confer a selective
advantage against malarial infection (balancing selection). The first description of
malaria dates back more than 4,000 years, to about 2700 B.C. in China, where ancient
writings refer to symptoms now commonly associated with malaria. Early anti-malarial
treatments were first developed in China from the Qinghao plant, which contains
the active ingredient artemisinin, re-discovered and still used in anti-malaria drugs
today. The three major types of inherited genetic resistance to malaria (sickle-cell
disease, thalassaemias, and glucose-6-phosphate dehydrogenase deficiency)
were all present in the Mediterranean world 2,000 years ago, at the peak of the
Roman Empire. The role of epidemics and disease in the ultimate decline and fall
of the Roman Empire has been largely overlooked by Epidemiology researchers.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
2 Smallpox (Viral Biological Disease)
The history of smallpox holds a unique place in medical history. One of the
deadliest viral diseases known to man, it is the first disease to be prevented by
vaccination - and also the only human disease to have been eradicated from the
face of the earth by vaccination. Smallpox plagued human populations for
thousands of years. Researchers who examined the mummy of Egyptian
pharaoh Ramses V (died 1157 BCE) observed scarring similar to that from
smallpox on his remains. Ancient Sanskrit medical texts, dating from about
1500 BCE, describe a smallpox-like illness. Smallpox was most likely
present in Europe by about 300 CE - although there are no unequivocal
records of smallpox in Europe before the 6th century CE. It has been
suggested that it was a major component of the Plague of Athens that
occurred in 430 BCE, during the Peloponnesian War, and was described
by Thucydides. A recent analysis of the description of clinical features
provided by Galen during the Antonine Plague that swept through the
Roman Empire and Italy in 165–180, indicates that the probable cause was
smallpox. In 1796, after noting Smallpox immunity amongst milkmaids –
Edward Jenner carried out his now famous experiment on eight-year-old
James Phipps, using Cow Pox as a vaccine to confer immunity to Smallpox.
Some estimates indicate that 20th century worldwide deaths from smallpox
numbered more than 300 million. The last known naturally occurring case of
smallpox occurred in Somalia in 1977.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
3 Bubonic Plague (Bacterial Biological Disease)
The Bubonic Plague – or Black Death – was one of the most devastating
pandemics in human history, killing an estimated 75 to 200 million people
and peaking in Europe in the years 1348–50 CE. The Bubonic Plague is a
bacterial disease – spread by fleas carried by Asian Black Rats - which
originated in or near China and then travelled to Italy, overland along the Silk
Road, or by sea along the Silk Route. From Italy the Black Death spread
onwards through other European countries. Research published in 2002
suggests that the Black Death began in the spring of 1346 in the Russian
steppe region, where a plague reservoir stretched from the north-western
shore of the Caspian Sea into southern Russia. Although there were
several competing theories as to the etiology of the Black Death, analysis of
DNA from victims in northern and southern Europe published in 2010 and
2011 indicates that the pathogen responsible was the Yersinia pestis
bacterium, possibly causing several forms of plague. The first recorded
epidemic ravaged the Byzantine Empire during the sixth century, and was
named the Plague of Justinian after emperor Justinian I, who was infected
but survived through extensive treatment. The epidemic is estimated to have
killed approximately 50 million people in the Roman Empire alone. During
the Late Middle Ages (1340–1400) Europe experienced the most deadly
disease outbreak in its history when the Black Death, the infamous pandemic
of bubonic plague, peaked in 1348–50, killing about one third of the European population.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
4 Syphilis (Bacterial Biological Disease)
Syphilis - the exact origin of syphilis is unknown. There are two primary
hypotheses: one proposes that syphilis was carried from the Americas to
Europe by the crew of Christopher Columbus, the other proposes that
syphilis previously existed in Europe but went unrecognized. These are
referred to as the "Columbian" and "pre-Columbian" hypotheses. In late 2011
newly published evidence suggested that the Columbian hypothesis is valid.
The appearance of syphilis in Europe at the end of the 1400s heralded
decades of death as the disease raged across the continent. The first
outbreak of syphilis in Europe was recorded in 1494/1495
in Naples, Italy, during a French invasion. First spread by returning French
troops, the disease was known as the “French Pox”, and it was not until
1530 that the term "syphilis" was first applied by the Italian physician and
poet Girolamo Fracastoro. By the 1800s it had become endemic, carried by
as many as 10% of men in some areas - in late Victorian London this may
have been as high as 20%. Potentially fatal, and associated with extramarital sex
and prostitution, syphilis was accompanied by enormous social stigma. The
secretive nature of syphilis helped it spread - disgrace was such that many
sufferers hid their symptoms, while others carrying the latent form of the
disease were unaware they even had it. Treponema pallidum, the syphilis
causal organism, was first identified by Fritz Schaudinn and Erich Hoffmann
in 1905. The first effective treatment (Salvarsan) was developed in 1910
by Paul Ehrlich which was followed by the introduction of penicillin in 1943.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
5 Tuberculosis (Bacterial Biological Disease)
Tuberculosis - studies of the evolutionary origins of Mycobacterium tuberculosis
indicate that its most recent common ancestor was a human-specific
pathogen, which encountered an evolutionary bottleneck leading to
diversification. Analysis of mycobacterial interspersed repetitive units has
allowed dating of this evolutionary bottleneck to approximately 40,000 years
ago, which corresponds to the period subsequent to the expansion of Homo
sapiens out of Africa. This analysis of mycobacterial interspersed repetitive
units also dated the Mycobacterium bovis lineage as dispersing some 6,000
years ago. Tuberculosis existed 15,000 to 20,000 years ago, and has been
found in human remains from ancient Egypt, India, and China. Human
bones from the Neolithic show the presence of the bacteria, which may be
linked to early farming and animal domestication. Evidence of tubercular
decay has been found in the spines of Egyptian mummies, and TB was
common both in ancient Greece and Imperial Rome. Tuberculosis reached
its peak in the 18th century in Western Europe, with a mortality as high as
900 deaths per 100,000 - due to malnutrition and overcrowded housing with
poor ventilation and sanitation. Although relatively little is known about its
frequency before the 19th century, the incidence of consumption (pulmonary tuberculosis),
“the captain of all men of death”, is thought to have peaked between the end
of the 18th century and the end of the 19th century. With the advent of HIV there
has been a dramatic resurgence of tuberculosis with more than 8 million
new cases reported each year worldwide and more than 2 million deaths.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
6 Cholera (Bacterial Biological Disease)
Cholera is a severe infection in the small intestine caused by the bacterium
Vibrio cholerae, contracted by drinking water or eating food contaminated
with the bacterium. Cholera symptoms include profuse watery diarrhoea and
vomiting. The primary danger posed by cholera is severe dehydration, which
can lead to rapid death. Cholera can now be treated with re-hydration and
prevented by vaccination. Cholera outbreaks in recorded history have
indeed been explosive and the global proliferation of the disease is seen by
most scholars to have occurred in six separate pandemics, with the seventh
pandemic still rampant in many developing countries around the world. The
first recorded instance of cholera was described in 1563 in an Indian medical
report. In modern times, the story of the disease begins in 1817 when it
spread from its ancient homeland of the Ganges Delta in the Bay of Bengal
in North East India - to the rest of the world. The first cholera pandemic
raged from 1817-1823, the second from 1826-1837. The disease reached
Britain during October 1831 - and finally arrived in London in 1832 (13,000
deaths) with subsequent major outbreaks in 1841, 1848 (21,000 deaths),
1854 (15,000 deaths) and 1866. Surgeon John Snow – by studying the
outbreak centred around the Broad Street well in 1854 – traced the source
of cholera to drinking water which was contaminated by infected human
faeces – ending the “miasma” or “bad air” theory of cholera transmission.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
7 Poliomyelitis (Viral Biological Disease)
The history of poliomyelitis (polio) infections extends into prehistory.
Ancient Egyptian paintings and carvings depict otherwise healthy people
with withered limbs, and children walking with canes at a young age. It is
theorized that the Roman Emperor Claudius was stricken as a child, and this
caused him to walk with a limp for the rest of his life. Perhaps the earliest
recorded case of poliomyelitis is that of Sir Walter Scott. At the time, polio
was not known to medicine. In 1773 Scott was said to have developed "a
severe teething fever which deprived him of the power of his right leg."
Poliomyelitis has been described under many names: Dental Paralysis,
Infantile Spinal Paralysis, Essential Paralysis of Children, Regressive
Paralysis, Myelitis of the Anterior Horns and Paralysis of the Morning.
In 1789 the first clinical description of poliomyelitis was provided by the
British physician Michael Underwood as "a debility of the lower extremities”.
Although major polio epidemics were unknown before the 20th century, the
disease has caused paralysis and death for much of human history. Over
millennia, polio survived quietly as an endemic pathogen until the 1880s
when major epidemics began to occur in Europe; soon after, widespread
epidemics appeared in the United States. By 1910, frequent epidemics
became regular events throughout the developed world, primarily in cities
during the summer months. At its peak in the 1940s and 1950s, polio would
maim, paralyse or kill over half a million people worldwide every year.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
8 Typhoid (Bacterial Biological Disease)
Typhoid fever is an acute illness associated with a high fever that
is most often caused by the Salmonella typhi bacteria. Typhoid may also be
caused by Salmonella paratyphi, a related bacterium that usually leads to a
less severe illness. The bacteria are spread via deposition in water or food
by a human carrier. An estimated 16–33 million cases of typhoid fever occur
annually. Its incidence is highest in children and young adults between 5 and
19 years old. These cases as of 2010 caused about 190,000 deaths up from
137,000 in 1990. Historically, in the pre-antibiotic era, the case fatality rate of
typhoid fever was 10-20%. Today, with prompt treatment, it is less than 1%.
9 Dysentery (Bacterial / Parasitic Biological Disease)
Dysentery (the Flux or the bloody flux) is a form of gastroenteritis – a type
of inflammatory disorder of the intestine, especially of the colon, resulting in
severe diarrhea containing blood and mucus in the feces accompanied by
fever, abdominal pain and rectal tenesmus (a feeling of incomplete defecation),
caused by any kind of gastric infection. Conservative estimates suggest
that 90 million cases of Bacterial Dysentery (Shigellosis) are contracted
annually, killing at least 100,000. Amoebic Dysentery (Amebiasis) infects
some 50 million people each year, with over 50,000 cases resulting in death.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
10 Spanish Flu (Viral Biological Disease)
In the United States, the Spanish Flu was first observed in Haskell County,
Kansas, in January 1918, prompting a local doctor, Loring Miner to warn the
U.S. Public Health Service's academic journal. On 4th March 1918, army cook
Albert Gitchell reported sick at Fort Riley, Kansas. A week later on 11th March
1918, over 100 soldiers were in hospital and the Spanish Flu virus had now
reached Queens, New York. Within days, 522 men had reported sick at the
army camp. In August 1918, a more virulent strain appeared simultaneously
in Brest in Brittany, France; in Freetown, Sierra Leone; and in the U.S., in Boston,
Massachusetts. It is estimated that in 1918, between 20-40% of the world's
population became infected by Spanish Flu - with 50 million deaths globally.
11 HIV / AIDS (Viral Biological Disease)
AIDS was first reported in America in 1981 – and provoked reactions which
echoed those associated for so long with syphilis. Many of the earliest cases
were among homosexual men - creating a climate of prejudice and moral
panic. Fear of catching this new and terrifying disease was also widespread
among the public. The observed time-lag between contracting HIV and the
onset of AIDS, coupled with new drug treatments, changed perceptions.
Increasingly it was seen as a chronic but manageable disease. The global
story was very different - by the mid-1980s it became clear that the virus had
spread, largely unnoticed, throughout the rest of the world. The nature of this
global pandemic varies from region to region, with poorer areas hit hardest. In
parts of sub-Saharan Africa nearly 1 in 10 adults carries the virus - a statistic
which is reminiscent of the spread of syphilis in parts of Europe in the 1800s.
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
12 Ebola (Haemorrhagic Viral Biological Disease)
Ebola is a highly lethal Haemorrhagic Viral Biological Disease, which has
caused at least 16 confirmed outbreaks in Africa between 1976 and 2015.
Ebola Virus Disease (EVD) is found in wild great apes and kills up to 90% of
humans infected - making it one of the deadliest diseases known to man. It is
so dangerous that it is considered to be a potential Category A bioterrorism agent
– on a par with anthrax, smallpox, and bubonic plague. The current outbreak
of EVD has seen confirmed cases in Guinea, Liberia and Sierra Leone,
countries in an area of West Africa where the disease has not previously
occurred. There were also a handful of suspected cases in neighbouring Mali,
but these patients were found to have contracted other diseases.
For each epidemic, transmission was quantified in different settings (illness in
the community, hospitalization, and traditional burial) and predictive analytics
simulated various epidemic scenarios to explore the impact of medical control
interventions on an emerging epidemic. A key medical parameter was the
rapid institution of control measures. For both epidemic profiles identified,
increasing the rate of hospitalization reduced the predicted epidemic size.
Over 4000 suspected cases of EVD have been recorded, with the majority of
them in Guinea. The current outbreak has so far resulted in over 2000
deaths. These figures will continue to rise as more patients die and as test
results confirm that they were infected with Ebola.
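The scenario analysis described above (transmission in different settings, with the hospitalization rate as a control measure) can be illustrated with a toy compartmental model. This is a minimal sketch under stated assumptions, not the model used in the study the slide summarises: every parameter value is hypothetical, the traditional-burial setting is omitted for brevity, and hospital transmission is assumed to be lower than community transmission.

```python
def simulate_epidemic(hospitalization_rate, days=730, population=10_000.0,
                      beta_community=0.30, beta_hospital=0.05,
                      removal_community=0.10, removal_hospital=0.15):
    """Return the cumulative number of new infections after `days` days.

    Compartments: S (susceptible), I (infectious in the community),
    H (hospitalised), R (removed: recovered or dead).
    All rates are per day and purely illustrative.
    """
    s, i, h, r = population - 1.0, 1.0, 0.0, 0.0
    for _ in range(days):
        # New infections arise from contact with community and hospital cases.
        new_infections = (beta_community * i + beta_hospital * h) * s / population
        admitted = hospitalization_rate * i
        removed_from_community = removal_community * i
        removed_from_hospital = removal_hospital * h
        s -= new_infections
        i += new_infections - admitted - removed_from_community
        h += admitted - removed_from_hospital
        r += removed_from_community + removed_from_hospital
    return population - s


slow_admission = simulate_epidemic(hospitalization_rate=0.05)
fast_admission = simulate_epidemic(hospitalization_rate=0.50)
print(f"epidemic size, slow admission: {slow_admission:.0f}")
print(f"epidemic size, fast admission: {fast_admission:.0f}")
```

Under these assumed rates, moving cases into hospital faster shortens the time they spend transmitting in the community, so the simulated epidemic is smaller. The effect would reverse if hospital transmission were assumed to be higher than community transmission.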
Pandemic Black Swan Event Types
Type Force Epidemiology Black Swan Event
13 Future Bacterial Pandemic Infections (Bacterial Biological Disease)
Bacteria were most likely the real killers in the 1918 H1N1 Flu Pandemic - the
vast majority of deaths in the 1918–1919 influenza pandemic resulted directly
from secondary bacterial pneumonia, caused by common upper respiratory-
tract bacteria. Less substantial data from the subsequent 1957 and 1968 Flu
pandemics are consistent with these findings. If severe pandemic influenza is
largely a problem of viral-bacterial co-pathogenesis, pandemic planning needs
to go beyond addressing the viral cause alone (influenza vaccines and
antiviral drugs). The diagnosis, prophylaxis, treatment and prevention of
secondary bacterial pneumonia - as well as stockpiling of antibiotics and
bacterial vaccines – should be high priorities for future pandemic planning.
14 Future Viral Pandemic Infections (Viral Biological Disease)
What was Learned from Reconstructing the 1918 Spanish Flu Virus
Comparing pandemic H1N1 influenza viruses at the molecular level yields key
insights into pathogenesis – the way animal viruses mutate to cross species.
The availability of these two H1N1 virus genomes separated by over 90 years,
provided an unparalleled opportunity to study and recognise genetic properties
associated with virulent pandemic viruses - allowing for a comprehensive
assessment of emerging influenza viruses with human pandemic potential.
There are only four to six mutations required within the first three days of viral
infection in a new human host, to change an animal virus to become highly
virulent and infectious to human beings. Candidate pathogens for future
possible Human Pandemics include Anthrax, Ebola, Lassa Fever, Rift Valley
Fever, SARS, MERS, H1N1 Swine Flu (2009) and H7N9 Avian / Bat Flu (2013).
Clustering in “Big Data”
“A Cluster is a group of the same or similar data elements
which are aggregated – or closely distributed – together”
Clustering is a technique used to explore content and
understand information in every business sector and scientific
field that collects and processes very large volumes of data
Clustering is an essential tool for any “Big Data” problem
Multiple Factor Regression Analysis
In a multiple regression case, where there are two or more independent
variables, the fitted model is a regression plane (or, with more than two
variables, a hyperplane) which cannot be visualised within the
constraints of a two dimensional plot…..
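To make the idea of a regression plane concrete, the sketch below fits y = b0 + b1·x1 + b2·x2 by ordinary least squares, solving the normal equations with Gaussian elimination. It is an illustration only: the function names and the small data set are hypothetical, and a statistics package would normally be used instead of hand-written linear algebra.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (m[k][n] - tail) / m[k][k]
    return x


def fit_multiple_regression(predictors, response):
    """Return [intercept, slope_1, slope_2, ...] minimising squared error."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(d[p] * d[q] for d in design) for q in range(k)] for p in range(k)]
    xty = [sum(d[p] * y for d, y in zip(design, response)) for p in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
response = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]
coefficients = fit_multiple_regression(predictors, response)
print([round(c, 6) for c in coefficients])
```

With two predictors the three coefficients define a plane in three dimensions, which is why a flat scatter plot of y against a single variable cannot show the whole fit.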
Data Visualisation - Tufte in R
"The idea behind Tufte in R is to use R - the easiest and most powerful
open-source statistical analysis programming language - to replicate
the excellent data visualisation practices developed by Edward Tufte”
- Diego Marinho de Oliveira - Lead Data Scientist / Ph.D. candidate
• “Big Data” refers to vast aggregations (super sets) consisting of numerous individual
datasets (structured and unstructured) - whose size and scope is beyond the capability of
conventional transactional (OLTP) or analytics (OLAP) Database Management Systems
and Enterprise Software Tools to capture, store, analyse and manage. Examples of “Big
Data” include the vast and ever changing amounts of data generated in social networks
where we maintain Blogs and have conversations with each other, news data streams,
geo-demographic data, internet search and browser logs, as well as the ever-growing
amount of machine data generated by pervasive smart devices - monitors, sensors and
detectors in the environment – captured via the Smart Grid, then processed in the Cloud –
and delivered to end-user Smart Phones and Tablets via Intelligent Agents and Alerts.
• Data Set Mashing and “Big Data” Global Content Analysis – drives Horizon Scanning,
Monitoring and Tracking processes by taking numerous, apparently un-related RSS and
other Information Streams and Data Feeds, loading them into Very Large Scale (VLS)
DWH Structures and Document Management Systems for Real-time Analytics – searching
for and identifying possible signs of relationships hidden in data (Facts/Events) – in order to
discover and interpret previously unknown Data Relationships driven by hidden Clustering
Forces – revealed via “Weak Signals” indicating emerging and developing Application
Scenarios, Patterns and Trends - in turn predicting possible, probable and alternative
global transformations which may unfold as future “Wild Card” or “Black Swan” events.
“Big Data”
Clustering in “Big Data”
• The profiling and analysis of
large aggregated datasets in
order to determine a ‘natural’
structure of groupings provides
an important technique for many
statistical and analytic
applications. Cluster analysis
on the basis of profile similarities
or geographic distribution is a
method where no prior
assumptions are made
concerning the number of
groups or group hierarchies and
internal structure. Geo-
demographic techniques are
frequently used in order to
profile and segment populations
by ‘natural’ groupings - such as
common behavioural traits,
Clinical Trial, Morbidity or
Actuarial outcomes - along with
many other shared
characteristics and common
factors.....
Clustering in “Big Data”
• “BIG DATA” ANALYTICS – PROFILING, CLUSTERING and 4D GEOSPATIAL ANALYSIS •
• The profiling and analysis of large aggregated datasets in order to determine a ‘natural’
structure of data relationships or groupings, is an important starting point forming the basis of
many mapping, statistical and analytic applications. Cluster analysis of implicit similarities -
such as time-series demographic or geographic distribution - is a critical technique where no
prior assumptions are made concerning the number or type of groups that may be found, or
their relationships, hierarchies or internal data structures. Geospatial and demographic
techniques are frequently used in order to profile and segment populations by ‘natural’
groupings. Shared characteristics or common factors such as Behaviour / Propensity or
Epidemiology, Clinical, Morbidity and Actuarial outcomes – allow us to discover and explore
previously unknown, concealed or unrecognised insights, patterns, trends or data relationships.
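One simple way to find ‘natural’ groupings without fixing the number of groups in advance is to link every pair of points that lie within a chosen distance of each other and treat each connected set as a cluster (single-linkage grouping with a distance threshold). The sketch below is illustrative only: the function name, the `max_gap` parameter and the sample points are hypothetical, and the pairwise comparison would not scale to genuinely large data sets without spatial indexing.

```python
import math


def cluster_by_distance(points, max_gap):
    """Group points into clusters; points closer than max_gap are linked."""
    parent = list(range(len(points)))

    def find(k):
        # Follow parent links to the cluster representative (with path halving).
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a in range(len(points)):
        for b in range(a + 1, len(points)):
            if math.dist(points[a], points[b]) <= max_gap:
                parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for k, point in enumerate(points):
        groups.setdefault(find(k), []).append(point)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical locations: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
clusters = cluster_by_distance(points, max_gap=1.5)
print([len(c) for c in clusters])
```

The number of clusters falls out of the data and the chosen threshold, rather than being supplied up front. Choosing `max_gap` is itself a modelling decision.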
• PREDICTIVE ANALYTICS and EVENT FORECASTING •
• Predictive Analytics and Event Forecasting use Horizon Scanning, Tracking and Monitoring
methods combined with Cycle, Pattern and Trend Analysis techniques for Event Forecasting
and Propensity Models in order to anticipate a wide range of business, economic, social and
political Future Events – ranging from micro-economic Market phenomena such as forecasting
Market Sentiment and Price Curve movements - to large-scale macro-economic Fiscal
phenomena using Weak Signal processing to predict future Wild Card and Black Swan Events
- such as Monetary System shocks.
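As a very small illustration of monitoring a data stream for unusual movements, the sketch below flags observations that deviate sharply from a trailing window of recent values, using a rolling z-score. This technique is swapped in purely for illustration and is not the Weak Signal method the slide refers to. The function name, window size, threshold and sample series are all hypothetical.

```python
from statistics import mean, stdev


def flag_unusual_points(series, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` observations."""
    flagged = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(series[t] - mu) / sigma > threshold:
            flagged.append(t)
    return flagged


# Hypothetical daily counts with one abrupt spike.
daily_counts = [10, 11, 10, 12, 11, 10, 11, 25, 11, 10]
unusual = flag_unusual_points(daily_counts)
print(unusual)
```

A detector of this kind only reports that something changed. Interpreting the change still requires the Cycle, Pattern and Trend Analysis described above.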
Digital Healthcare - Patient Experience and Journey
• The last decade has seen an unprecedented explosion in mobile platforms
as the internet and mobile worlds came of age. It is no longer acceptable
to have only a bricks-and-mortar clinical presence – patient-focused
healthcare providers are now expected to deliver their Patient Experience
and Journey via internet websites, mobile phones and more recently tablets.
Social Intelligence – The Emerging Big Data Stack
[Figure: layered stack diagram. Stages shown: Data Acquisition – High-Volume Data Flows; Targeting – Map / Reduce; Consume – End-User Data. Layers and their labels:]
– Data Discovery and Collection: News Feeds and Digital Media, Global Internet Content, Social Mapping, Social Media, Social CRM
– Mobile Enterprise Platforms (MEAP’s): Smart Devices, Smart Apps, Smart Grid
– Analytics Engines - Hadoop: Apache Hadoop Framework (HDFS, MapReduce), Matlab, “R”, Autonomy, Vertica
– Data Delivery and Consumption: Clinical Trial, Morbidity and Actuarial Outcomes; Market Sentiment and Price Curve Forecasting; Horizon Scanning, Tracking and Monitoring; Weak Signal, Wild Card and Black Swan Event Forecasting
– Data Presentation and Display: Excel, Web, Mobile
– Data Management Processes: Data Audit, Data Profile, Data Quality Reporting, Data Quality Improvement
– Performance Acceleration: GPU’s – massive parallelism; SSD’s – in-memory processing; DBMS – ultra-fast data replication
– Data Management Tools: DataFlux, Embarcadero, Informatica, Talend
– Info. Management Tools: Business Objects, Cognos, Hyperion, Microstrategy
– Data Warehouse Appliances: Teradata, SAP HANA, Netezza (now IBM), Greenplum (now EMC2), Extreme Data xdg
– Other tools shown (grouping not recoverable): Biolap, Jedox, Sagent, Polaris; Ab Initio, Ascential, Genio, Orchestra
GIS MAPPING and SPATIAL DATA ANALYSIS
• A Geographic Information System (GIS) integrates hardware, software and
digital data capture devices for acquiring, managing, analysing, distributing and
displaying all forms of geographically dependent location data – including
machine generated data such as Computer-aided Design (CAD) data from land
and building surveys, Global Positioning System (GPS) terrestrial location data -
as well as all kinds of data streams - HDCCTV, aerial and satellite image data.....
GIS Mapping and Spatial Analysis
• Spatial Data Analysis is a set of techniques for analysing 3-dimensional spatial
(Geographic) data and location (Positional) object data overlays. Software that
implements spatial analysis techniques requires access to both the locations of
objects and their physical attributes. Spatial statistics extends traditional statistics
to support the analysis of geographic data. Spatial Data Analysis provides
techniques to describe the distribution of data in the geographic space (descriptive
spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster
analysis), identify and measure spatial relationships (spatial regression), and
create a surface from sampled data (spatial interpolation, usually categorized as
geo-statistics).
• The results of spatial data analysis are largely dependent upon the type,
quantity, distribution and data quality of the spatial objects under analysis.
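Of the techniques listed above, spatial interpolation is the easiest to show in a few lines. The sketch below estimates a value at an unsampled location using inverse distance weighting (IDW), one common interpolation method chosen here for illustration. The function name, the `power` parameter and the sample readings are hypothetical.

```python
import math


def idw_interpolate(samples, x, y, power=2.0):
    """Estimate the value at (x, y) from (sx, sy, value) samples,
    weighting each sample by 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical readings at two sampled locations.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(samples, 1.0, 0.0)
print(midpoint_estimate)
```

As the slide notes, results depend heavily on the quantity and distribution of the samples: sparse or clustered sampling gives estimates dominated by whichever points happen to be nearest.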
World-wide Visitor Count – GIS Mapping
Geo-demographic Clustering in “Big Data”
• GEODEMOGRAPHIC PROFILING – CLUSTERING IN “BIG DATA” •
• The profiling and analysis of large aggregated datasets in order to determine a
‘natural’ or implicit structure of data relationships or groupings where no prior
assumptions are made concerning the number or type of groups discovered or group
relationships, hierarchies or internal data structures - in order to discover hidden data
relationships - is an important starting point forming the basis of many statistical and
analytic applications. The subsequent explicit Cluster Analysis of the discovered data
relationships is a critical technique which attempts to explain the nature, cause and
effect of those implicit profile similarities or geographic distributions. Demographic
techniques are frequently used in order to profile and segment populations using
‘natural’ groupings - such as common behavioural traits, Clinical, Morbidity or Actuarial
outcomes, along with many other shared characteristics and common factors – and
then attempt to understand and explain those natural group affinities and geographical
distributions using methods such as Causal Layer Analysis (CLA).....
GIS Mapping and Spatial Analysis
• Spatial Data Analysis is a set of techniques for analysing spatial (Geographic)
location data. The results of spatial analysis are dependent on the locations of
the objects being analysed. Software that implements spatial analysis techniques
requires access to both the locations of objects and their physical attributes.
• Spatial statistics extends traditional statistics to support the analysis of geographic
data. Spatial Data Analysis provides techniques to describe the distribution of data in
the geographic space (descriptive spatial statistics), analyse the spatial patterns of the
data (spatial pattern or cluster analysis), identify and measure spatial relationships
(spatial regression), and create a surface from sampled data (spatial interpolation,
usually categorized as geo-statistics).
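The descriptive and interpolation techniques named above can be sketched in a few lines. This is a minimal illustration only: the function names, the planar (x, y) coordinates and the choice of inverse-distance weighting as the interpolation method are assumptions made for the example, not taken from any particular GIS product.

```python
import math

def mean_centre(points):
    """Descriptive spatial statistic: the average x and average y of the points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Spread of the points around their mean centre (a spatial standard deviation)."""
    cx, cy = mean_centre(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))

def idw(samples, target, power=2.0):
    """Spatial interpolation by inverse-distance weighting (assumed method).

    samples is a list of ((x, y), value) pairs; nearer samples get more weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # the target coincides with a sampled location
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical sample locations and readings.
sites = [(0, 0), (2, 0), (0, 2), (2, 2)]
readings = [((0, 0), 10.0), ((2, 0), 20.0), ((0, 2), 30.0), ((2, 2), 40.0)]
print(mean_centre(sites), standard_distance(sites), idw(readings, (1, 1)))
```

Both functions depend only on the locations of the objects and one attribute value per object, which is the minimum input that spatial analysis software needs.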
BTSA Induction Cluster Map
Geo-Demographic Profile Clusters
Targeting – Map / Reduce
Consume – End-User Data
Data Acquisition – High-Volume
– Mobile Enterprise Platforms (MEAPs): Smart Devices, Smart Apps, Smart Grid
– Data Delivery and Consumption: Clinical Trial, Morbidity and Actuarial Outcomes; Market Sentiment and Price Curve Forecasting; Horizon Scanning, Tracking and Monitoring; Weak Signal, Wild Card and Black Swan Event Forecasting
– Data Discovery and Collection: News Feeds and Digital Media, Global Internet Content, Social Mapping, Social Media, Social CRM
– Analytics Engines - Hadoop: Apache Hadoop Framework (HDFS, MapReduce), MATLAB, “R”, Autonomy, Vertica
– Data Management Processes: Data Audit, Data Profile, Data Quality Reporting, Data Quality Improvement, Data Extract, Transform, Load
– Performance Acceleration: GPUs (massive parallelism), SSDs (in-memory processing), DBMS (ultra-fast data replication)
– Data Presentation and Display: Excel, Web, Mobile
– Data Management Tools: DataFlux, Embarcadero, Informatica, Talend
– Info. Management Tools: Business Objects, Cognos, Hyperion, Microstrategy, Biolap, Jedox, Sagent, Polaris
– Data Warehouse Appliances: Teradata, SAP HANA, Netezza (now IBM), Greenplum (now EMC2), Extreme Data xdg, Zybert Gridbox
– ETL tools: Ab Initio, Ascential, Genio, Orchestra
Clustering Phenomena in “Big Data”
“A Cluster is a group of profiled data similarities aggregated closely together”
• Cluster Analysis is a technique used to explore very large volumes of structured
and unstructured data - transactional data, machine-generated (automatic) data,
social media and internet content, and geo-demographic information - in order to
discover previously unknown, unrecognised or hidden logical data relationships.
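As one illustration of how a cluster of "profiled data similarities" can be found, the sketch below uses k-means, a common clustering algorithm. The deck does not name a specific algorithm, so k-means, the function name and the sample points are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly moving centroids (k-means).

    Returns the final centroids and the clusters from the last assignment pass.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the observed points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, clusters

# Hypothetical two-attribute profiles forming two obvious groups.
profiles = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
centres, groups = kmeans(profiles, k=2)
print(groups)
```

k-means requires the number of groups in advance; where no prior assumption about the number of groups can be made, a hierarchical or density-based method would be the closer match.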
Event Clusters and Connectivity
[Diagram: Events A to H shown as connected nodes]
The above is an illustration of Event relationships - how Events might be connected. Any detailed,
intimate understanding of the connection between Events may help us to answer questions such as: -
• If Event A occurs does it make Event B or H more or less likely to occur ?
• If Event B occurs what effect does it have on Events C,D,E, F and G ?
Answering questions such as these allows us to plan our Event Management approach and Risk
mitigation strategy – and to decide how better to focus our Incident / Event resources and effort…..
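The first question above (does Event A make Event B or H more or less likely?) can be approached by comparing a conditional probability with the baseline rate. The sketch below is illustrative only: the observation periods are invented, and the function names are hypothetical.

```python
def conditional_probability(observations, given, target):
    """Estimate P(target | given) from a list of per-period event sets."""
    with_given = [obs for obs in observations if given in obs]
    if not with_given:
        return None  # the conditioning event was never observed
    return sum(1 for obs in with_given if target in obs) / len(with_given)

def lift(observations, given, target):
    """Ratio of P(target | given) to P(target).

    Above 1: 'given' makes 'target' more likely. Below 1: less likely.
    """
    base = sum(1 for obs in observations if target in obs) / len(observations)
    cond = conditional_probability(observations, given, target)
    if cond is None or base == 0:
        return None
    return cond / base

# Hypothetical log: the set of Events seen in each observation period.
periods = [{"A", "B"}, {"A", "B"}, {"A"}, {"B"}, {"H"}, set(), {"A", "B", "H"}, {"H"}]
print(lift(periods, "A", "B"), lift(periods, "A", "H"))
```

A lift well above or below 1 suggests a connection worth investigating; on its own it does not establish which Event is the cause.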
Event Clusters and Connectivity
• Aggregated Events include coincident, related, connected and interconnected Events: -
• Coincident - two or more Events appear simultaneously in the same domain –
but they arise from different triggers (unrelated causal events)
• Related - two or more Events materialise in the same domain sharing common
Event features or characteristics (they may share a hidden common trigger or
cause – and so are candidates for further analysis and investigation)
• Connected - two or more Events materialise in the same domain due to the same
trigger (common cause)
• Interconnected - two or more Events materialise together in an Event cluster, series
or “storm” - the previous (prior) Event triggering the subsequent (next) Event
in an Event Series
• A series of Aggregated Events may result in a significant cumulative impact - and is
therefore frequently identified incorrectly as a Wild-card or Black Swan Event - rather
than simply as an event cluster or event “storm”.
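The four aggregated Event types can be expressed as a simple classification rule. This is a sketch under stated assumptions: the `Event` record, its `trigger` and `features` fields and the `classify` function are hypothetical names, and both Events are assumed to have already been observed in the same domain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str
    trigger: str                      # name of the trigger, or of the Event that caused this one
    features: frozenset = frozenset()

def classify(a, b):
    """Label a pair of Events using the four definitions above."""
    if a.trigger == b.name or b.trigger == a.name:
        return "interconnected"       # one Event triggered the other
    if a.trigger == b.trigger:
        return "connected"            # same trigger (common cause)
    if a.features & b.features:
        return "related"              # shared features, possible hidden common cause
    return "coincident"               # different triggers, nothing in common

# Hypothetical Events mirroring the four cases.
e1 = Event("E1", "Trigger A", frozenset({"flood"}))
e2 = Event("E2", "Trigger B", frozenset({"fire"}))
e3 = Event("E3", "Trigger C", frozenset({"flood", "night"}))
e4 = Event("E4", "Trigger A")
e5 = Event("E5", "E4")
print(classify(e1, e2), classify(e1, e3), classify(e1, e4), classify(e4, e5))
```

The checks run from the strongest relationship to the weakest, so a pair that qualifies under several definitions receives the most specific label.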
Event Clusters and Connectivity
[Diagram: Risk Events 1 to 8 shown as connected nodes]
The above is an illustration of Event relationships - how Risk Events might be connected. A detailed and
intimate understanding of Event clusters and the connection between Events may help us to understand: -
• What is the relationship between Events 1 and 8, and what impact do they have on Events 2 - 7 ?
• Events 2 - 5 and Events 6 and 7 occur in clusters – what are the factors influencing these clusters ?
Answering questions such as these allows us to plan our Risk Event management approach and mitigation
strategy – and to decide how to better focus our resources and effort on Risk Events and fraud management.
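Once links between Events are known, the clusters themselves can be recovered as the connected components of the Event graph. In the sketch below the links between Events 2 to 5 and between Events 6 and 7 are assumed for illustration (the original diagram's exact links are not recoverable), and the function name is hypothetical.

```python
def event_clusters(events, links):
    """Return groups of Events joined directly or indirectly by links."""
    neighbours = {e: set() for e in events}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    clusters = []
    for start in events:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from this Event.
        stack, cluster = [start], set()
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(neighbours[node] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters

# Assumed links: Events 2-5 form a chain, Events 6 and 7 are paired.
links = [(2, 3), (3, 4), (4, 5), (6, 7)]
print(event_clusters(range(1, 9), links))
```

Events that end up alone in a cluster (here Events 1 and 8) are the ones whose relationship to the rest still needs to be explained.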
[Diagram: Event Cluster – Claimant 1, Claimant 2, Residence, Vehicle, Risk Event]
Aggregated Event Types
• Coincident Events – Trigger A leads to Event A; Trigger B leads to Event B
• Related Events – Trigger 1 leads to Event C; Trigger 2 leads to Event D
• Connected Events – a single Trigger leads to both Event E and Event F
• Inter-connected Events – a Trigger leads to Event G, which in turn leads to Event H
Event Complexity Map
Big Data – Products
The MapReduce technique has spilled over into many other disciplines that process vast
quantities of information, including science, industry and systems management. The Apache
Hadoop Library has become the most popular implementation of MapReduce – with
framework implementations from Cloudera, Hortonworks and MapR.
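The MapReduce technique itself is small enough to show in a single process. The sketch below is not Hadoop code: it imitates the map, shuffle and reduce phases in plain Python on a word-count task, and all function names are chosen for the example.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (key, value) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]

def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def map_reduce(records):
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

print(map_reduce(["big data", "Big clusters"]))
```

In a real cluster each phase runs in parallel across many nodes, with the input split into blocks beforehand; the logic of each phase stays the same.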
“Big Data” Applications
• Science and Technology
– Pattern, Cycle and Trend Analysis
– Horizon Scanning, Monitoring and Tracking
– Weak Signals, Wild Cards, Black Swan Events
• Multi-channel Retail Analytics
– Customer Profiling and Segmentation
– Human Behaviour / Predictive Analytics
• Global Internet Content Management
– Social Media Analytics
– Market Data Management
– Global Internet Content Management
• Smart Devices and Smart Apps
– Call Detail Records
– Internet Content Browsing
– Media / Channel Selections
– Movies, Video Games and Playlists
• Broadband / Home Entertainment
– Call Detail Records
– Internet Content Browsing
– Media / Channel Selections
– Movies, Video Games and Playlists
• Smart Metering / Home Energy
– Energy Consumption Detail Records
• Civil and Military Intelligence
– Digital Battlefields of the Future – Data Gathering
– Future Combat Systems - Intelligence Database
– Person of Interest Database – Criminal Enterprise, Political organisations and Terrorist Cell networks
– Remote Warfare - Threat Viewing / Monitoring / Identification / Tracking / Targeting / Elimination
– HDCCTV Automatic Character / Facial Recognition
• Security
– Security Event Management - HDCCTV, Proximity and Intrusion Detection, Motion and Fire Sensors
– Emergency Incident Management - Response Services Command, Control and Co-ordination
• Biomedical Data Streaming
– Care in the Community
– Assisted Living at Home
– Smart Hospitals and Clinics
• Internet of Things (IoT)
– SCADA Remote Sensing, Monitoring and Control
– Smart Grid Data (machine-generated data)
– Vehicle Telemetry Management
– Intelligent Building Management
– Smart Homes Automation
Comparing Data in RDBMS, Appliances and Hadoop
Feature | RDBMS DWH | DWH Appliance | Hadoop Cluster
Data size | Gigabytes | Terabytes | Petabytes
Access | Interactive and batch | Interactive and batch | Batch
Structure | Fixed schema | Fixed schema | Unstructured schema
Language | SQL | SQL | Non-procedural Languages (NoSQL, Hive, Pig, etc)
Data Integrity | High | High | Low
Architecture | Shared memory - SMP | Shared nothing - MPP | Hadoop DFS
Virtualisation | Partitions / Regions | MPP / Nodal | MPP / Clustered
Scaling | Nonlinear | Nodal / Linear | Clustered / Linear
Updates | Read and write | Write once, read many | Write once, read many
Selects | Row-based | Set-based | Column-based
Latency | Low – Real-time | Low – Near Real-time | High – Historic Information
Figure 1: Comparing RDBMS to MapReduce
“Big Data” – Analysing and Informing
• “Big Data” is now a torrent raging through every aspect of the global economy – both the
public sector and private industry. Global enterprises generate enormous volumes of
transactional data – capturing trillions of bytes of information from the internal and
external environment. Data Sources include Social Media, Internet Content, Remote
Sensors, Monitors and Controllers, and transactions from their own internal business
operations – global markets, supply chain, business partners, customers and suppliers.
1. SENSE LAYER – Remote Monitoring and Control Devices – WHAT and WHEN?
2. COMMUNICATION LAYER – Mobile Enterprise Platforms (3G / WiFi + 4G / LTE) – VIA ?
3. SERVICE LAYER – 4D Geospatial / Real-time / Predictive Analytics – WHY?
4. GEO-DEMOGRAPHIC LAYER – Social Media, People and Places – WHO and WHERE ?
5. INFORMATION LAYER – “Big Data” and Internet Content data set “mashing” – HOW ?
6. INFRASTRUCTURE LAYER – Cloud Services / Hadoop Clusters / GPGPUs / SSDs
“Big Data” – Analysing and Informing
COMMUNICATION LAYER – Mobile Enterprise Platforms (3G / WiFi + 4G / LTE)
Biomedical Smart Apps – VIA ?
SERVICE LAYER – 4D Geospatial / Real-time / Predictive Analytics – HOW ?
INFORMATION LAYER – “Big Data” Analytics MapReduce / Data Set “mashing”
Data Science / Causal Layer Analysis – WHY ?
INFRASTRUCTURE LAYER – Cloud Service Platforms
Hadoop Clusters / GPGPUs / SSDs
SENSE LAYER – Remote Monitoring and Control Devices – WHAT and WHEN ?
GEO-DEMOGRAPHIC LAYER – People and Places – WHO and WHERE?
“Big Data” – Analysing and Informing
• SENSE LAYER – Remote Monitoring and Control – WHAT and WHEN?
– Remote Sensing – Sensors, Monitors, Detectors, Smart Appliances / Devices
– Remote Viewing – Satellite, Airborne, Mobile and Fixed HDCCTV
– Remote Monitoring, Command and Control – SCADA
• GEO-DEMOGRAPHIC LAYER – People and Places – WHO and WHERE?
– Person and Social Network Directories - Personal and Social Media Data
– Location and Property Gazetteers - Building Information Models (BIM)
– Mapping and Spatial Analysis - Topology, Landscape, Global Positioning Data
• COMMUNICATION LAYER – Mobile Enterprise Platforms and the Smart Grid
– Connectivity - Smart Devices, Smart Apps, Smart Grid
– Integration - Mobile Enterprise Application Platforms (MEAPs)
– Backbone – Wireless and Optical Next Generation Network (NGN) Architectures
“Big Data” – Analysing and Informing
SERVICE LAYER – 4D Geospatial / Real-time / Predictive Analytics – WHY?
COMMUNICATION LAYER – Mobile Enterprise Platforms (3G / WiFi + 4G / LTE)
Biomedical Smart Apps – VIA ?
[Diagram labels: Market Survey Data; TV Set-top Box Channel Selections; Smart App Playlists; Geographic & Demographic Survey Data; Entertainment; Factory, Office & Warehouse; Wearable & Personal Technology; Transport; Public Buildings; Smart Homes; Public house; Mall, Shop, Store; Smart Kiosks & Cubicles; Mobile Smart Apps; CCTV / ANPR; Social Intelligence; Campaign Management; e-Business Smart Apps; Big Data Analytics; The Pyramid™ Customer Loyalty & Brand Affinity; The Pyramid™ Analytics Smart Apps]
INFRASTRUCTURE LAYER – Cloud Services
Hadoop Clusters / GPGPUs / SSDs
SENSE LAYER – Remote Monitoring, Data and Control Devices – WHAT and WHEN ?
“Big Data” – Analysing and Informing
• SERVICE LAYER – Real-time Analytics – WHY?
– Global Mapping and Spatial Analysis
– Service Aggregation, Intelligent Agents and Alerts
– Data Analysis, Data Mining and Statistical Analysis
– Optical and Wave-form Analysis and Recognition, Pattern and Trend Analysis
– Big Data - Hadoop Clusters / GPGPUs / SSDs
• INFORMATION LAYER – “Big Data” and Data Set “mashing” – HOW?
– Content – Structured and Unstructured Data and Content
– Information – Atomic Data, Aggregated, Ordered and Ranked Information
– Transactional Data Streams – Smart Devices, EPOS, Internet, Mobile Networks
• INFRASTRUCTURE LAYER – Cloud Service Platforms
– Cloud Models – Public, Private, Mixed / Hybrid, Enterprise, Secure and G-Cloud
– Infrastructure – Network, Storage and Servers
– Applications – COTS Software, Utilities, Enterprise Services
– Security – Principles, Policies, Users, Profiles and Directories, Data Protection
“DATA SCIENCE” – my own special area of Business expertise
Targeting – Split / Map / Shuffle / Reduce
Consume – End-User Data
Data Provisioning – High-Volume Data Flows
– Mobile Enterprise Platforms (MEAP’s)
Apache Hadoop Framework
HDFS, MapReduce, MATLAB, “R”
Autonomy, Vertica
Smart Devices
Smart Apps
Smart Grid
Clinical Trial, Morbidity and Actuarial Outcomes
Market Sentiment and Price Curve Forecasting
Horizon Scanning, Tracking and Monitoring
Weak Signal, Wild Card and Black Swan Event Forecasting
– Data Delivery and Consumption
News Feeds and Digital Media
Global Internet Content
Social Mapping
Social Media
Social CRM
– Data Discovery and Collection
– Analytics Engines - Hadoop
– Data Presentation and Display
Excel
Web
Mobile
– Data Management Processes
Data Audit
Data Profile
Data Quality Reporting
Data Quality Improvement
Data Extract, Transform, Load
– Performance Acceleration
GPU’s – massive parallelism
SSD’s – in-memory processing
DBMS – ultra-fast data replication
– Data Management Tools
DataFlux
Embarcadero
Informatica
Talend
– Info. Management Tools
Business Objects
Cognos
Hyperion
Microstrategy
Biolap
Jedox
Sagent
Polaris
Teradata
SAP HANA
Netezza (now IBM)
Greenplum (now Pivotal)
Extreme Data xdg
Zybert Gridbox
– Data Warehouse Appliances
Ab Initio
Ascential
Genio
Orchestra
The Emerging “Big Data” Stack
Information Management Strategy
Data Acquisition Strategy
Big Data – Process Overview
[Diagram labels – stages: Big Data Provisioning, Big Data Platform, Big Data Management, Big Data Administration, Analytics, Big Data Consumption; roles: Data Scientists, Data Architects, Data Analysts, Data Administrators, Data Managers, Hadoop Platform Engineering Team, Big Data Consumers; flows: Data Stream, Insights, Revenue Stream]
Split-Map-Shuffle-Reduce Process
[Diagram labels: Raw Data, Data Provisioning, Split, Map, Shuffle, Reduce, Key / Value Pairs, Actionable Insights]
Apache Hadoop Component Stack
HDFS – Hadoop Distributed File System (HDFS)
MapReduce – Scalable Data Applications Framework
Pig – Procedural Language which abstracts low-level MapReduce operators
Zookeeper – High-reliability distributed cluster co-ordination
Hive – Structured Data Access Management
HBase – Hadoop Database Management System
Oozie – Job Management and Data Flow Co-ordination
Mahout – Scalable Knowledge-base Framework
Data Management Component Stack
Informatica – Informatica Big Data Edition / Vibe Data Stream
Drill – Data Analysis Framework
Millwheel – Data Analytics on-the-fly + Extract / Transform / Load Framework
Flume – Extract / Transform / Load
Sqoop – Extract / Transform / Load
Scribe – Extract / Transform / Load
Talend – Extract / Transform / Load
Pentaho – Extract / Transform / Load Framework + Data Reporting on-the-fly
Big Data Storage Platforms
Autonomy – HP Unstructured Data DBMS
Vertica – HP Columnar DBMS
MongoDB – High-availability DBMS
CouchDB – Couchbase Database Server for Big Data with NoSQL / Hadoop Integration
Pivotal – Pivotal Big Data Suite: GreenPlum, GemFire, SQLFire, HAWQ
Cassandra – Cassandra Distributed Database for Big Data with NoSQL and Hadoop Integration
NoSQL – NoSQL Database for Oracle, SQL/Server, Couchbase etc.
Riak – Basho Technologies Riak Big Data DBMS with NoSQL / Hadoop Integration
Big Data Analytics Engines and Appliances
Alpine – Alpine Data Studio - Advanced Big Data Analytics
Karmasphere – Karmasphere Studio and Analyst - Hadoop Customer Analytics
Kognitio – Kognitio In-memory Big Data Analytics MPP Platform
Skytree – Skytree Server Artificial Intelligence / Machine Learning Platform
Redis – Open source key-value database for AWS, Pivotal etc.
Teradata – Teradata Appliance for Hadoop
Neo4j – Neo4j Graph Database for Big Data
InfiniDB – Columnar MPP open-source DB version hosted on GitHub
Big Data Analytics and Visualisation Platforms
Tableau – Tableau Big Data Visualisation Engine
Eclipse – Symentec Eclipse - Big Data Visualisation
Mathematica – Mathematical Expressions and Algorithms
StatGraphics – Statistical Expressions and Algorithms
FastStats – Numerical computation, visualization and programming toolset
MATLAB – Data Acquisition and Analysis Application Development Toolkit
R – “R” Statistical Programming / Algorithm Language
Revolution – Revolution Analytics Framework and Library for “R”
Hadoop / Big Data Extended Infrastructure Stack
SSD – Solid State Drive, configured as cached memory / fast HDD
CUDA – Compute Unified Device Architecture
GPGPU – General-Purpose Graphics Processing Unit Architecture
IMDG – In-memory Data Grid (extended cached memory)
Vibe – High Velocity / High Volume Machine / Automatic Data Streaming
Splunk – High Velocity / High Volume Machine / Automatic Data Streaming
Ambari – High-availability distributed cluster co-ordination
YARN – Hadoop Resource Scheduling
Big Data Extended Architecture Stack
Cloud-based Big-Data-as-a-Service and Analytics
AWS – Amazon Web Services (AWS) Big Data-as-a-Service (BDaaS): Elastic Compute Cloud (EC2) and Simple Storage Service (S3)
1010 Data – Big Data Discovery, Visualisation and Sharing Cloud Platform
SAP HANA – SAP HANA Cloud - In-memory Big Data Analytics Appliance
Azure – Microsoft Azure Data-as-a-Service (DaaS) and Analytics
Anomaly 42 – Anomaly 42 Smart-Data-as-a-Service (SDaaS) and Analytics
Workday – Workday Big-Data-as-a-Service (BDaaS) and Analytics
Google Cloud – Google Cloud Platform: Cloud Storage, Compute Platform, Firebrand API Resource Framework
Apigee – Apigee API Resource Framework
Data Warehouse Appliance / Real-time Analytics Engine Price Comparison
Manufacturer | Server Configuration | Cached Memory | Server Type | Software Platform | Cost (est.)
SAP HANA | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 6,000,000
Teradata | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 1,000,000
Netezza (now IBM) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 180,000
IBM ex5 (non-HANA configuration) | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 120,000
Greenplum (now Pivotal) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 20,000
XtremeData xdb (BO BW) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 18,000
Zybert Gridbox | 48-node (4 Channels x 12 CPU) | 20 Terabytes | SMP | Open Source | $ 60,000
Apache Hadoop - Framework Distributions
FEATURE | Hortonworks | Teradata Hadoop | Cloudera | MapR | Pivotal
Open Source Hadoop Library | Hcatalog | (Hortonworks) | Impala | MapR | HD
Support | Yes | Yes | Yes | Yes | Yes
Professional Services | Yes | Yes | Yes | Yes | Yes
Catalogue Extensions | Yes | Yes | Yes | Yes | Yes
Management Extensions | Yes for three of the five distributions
Architecture Extensions | Yes for two of the five distributions
Infrastructure Extensions | Yes for two of the five distributions
[Diagram: feature stacks for Hortonworks, Teradata, Cloudera, MapR and Pivotal HD. Every stack includes Library, Support, Services and Catalogue; three add Management, and two of those also add Resilience, Availability and Performance.]
Hortonworks
Cloudera with Impala
EMC Pivotal HD distribution
Hortonworks Hcatalog System
MapR with MapR Control System
Gartner Magic Quadrant for BI and Analytics Platforms
Apache Hadoop - Framework Distributions
FEATURE | Intel Hadoop | Microsoft HDInsight | Informatica Vibe | IBM BigInsights | DataStax Enterprise
Open Source Hadoop Library | Distribution | (Hortonworks) | Vibe | Symphony | Analytics
Support | Yes | Yes | Yes | Yes | Yes
Professional Services | Yes | Yes | Yes | Yes | Yes
Catalogue Extensions | Yes | Yes | Yes | Yes | Yes
Management Extensions | Yes for three of the five distributions
Architecture Extensions | Yes for two of the five distributions
Infrastructure Extensions | Yes for two of the five distributions
[Diagram: feature stacks for Intel Hadoop, Hortonworks, Vibe, Symphony and DataStax. Every stack includes Library, Support, Services and Catalogue; three add Management, and two of those also add Resilience, Availability and Performance.]
Intel HD
Microsoft HD
IBM BigInsights
Informatica Vibe
DataStax Enterprise
Gartner Magic Quadrant for BI
Apache Hadoop – Cloud Hadoop Platforms
FEATURE | HP HAVEn | AWS EMR | SAP HANA Mono-Clustered Big Data Cloud Solution
Open Source Hadoop Library | HP HAVEn | Elastic MapReduce | SAP HANA
Support | Yes | Yes | Yes
Professional Services | Yes | Yes | Yes
Catalogue Extensions | Yes | Yes | Yes
Management Extensions | Yes for one of the three platforms
Architecture Extensions | Yes for one of the three platforms
Infrastructure Extensions | Yes for one of the three platforms
[Diagram: feature stacks for AWS EMR and SAP HANA, showing Library, Support, Services and Catalogue]
HP HAVEn
HP HAVEn
AWS EMR
SAP HANA Mono-Clustered
Big Data Cloud Solution
HP HAVEn Big Data Platform
IBM BigInsights
IBM Platform Symphony: -
Parallel Computing and Application Grid management solution
Informatica / Hortonworks Vibe
Telco 2.0 “Big Data” Analytics Architecture
SAP HANA Hortonworks Real-time Big Data Architecture
Turing Institute
• In his Budget announcement, the chancellor, George Osborne, pledged government
support for the Turing Institute, a specialist centre named after the great computer
pioneer Alan Turing, which will provide a British home for studying Data Science and
Big Data Analytics. Clustering and Wave-form algorithms in Big Data are the key to
unlocking Cycles, Patterns and Trends in complex (non-linear) systems – Cosmology,
Climate and Weather, Economics and Fiscal Policy – in order to forecast future trends,
outcomes and events with far greater accuracy.
• The chancellor announced that a £42m Alan Turing Institute is to be founded to
ensure that Britain leads the way in Data Science and Big Data Analytics for
studying complex (non-linear) systems - Clustering and Wave-form algorithmic research
in both Deterministic (human activity) and Stochastic (random, chaotic) processes.
• Drawing on the name of the famous British mathematician and computer pioneer Alan
Turing - who led the Enigma code-breaking work during the Second World War at
Bletchley Park - the institute is intended to help British companies by bringing together
expertise and experience in tackling the challenges of understanding both deterministic
and stochastic systems – such as Weather, Climate, Economics, Econometrics and the
impact of Fiscal Policy – which require massive data sets and computational power.
Enigma Machine
Turing Institute
• The Turing Institute comes at a time when Data Science, Big Data Analytics and
complex system algorithm research is front and centre on the commercial stage. The
Turing Institute will be the first step to realising the UK’s digital innovation potential.
Exploitation of big data by applying analytical methods - statistical analysis, predictive
and quantitative modelling - provides deeper insights and achieves brighter outcomes.
• The UK needs a centre of excellence capable of nurturing the talent required to make
British Data Science and Big Data Technology world-class. The cornerstone for the
new digital technologies isn’t just infrastructure, but the talent that’s needed to found,
innovate and grow technology firms and create a knowledge-based digital economy.
• The tender to house the institute will be produced this year. It may be a brand-new
facility or use existing facilities and space in a university, a Treasury spokesman said.
Its funding will come from the Department for Business, Innovation and Skills, and its
chief will report to the science minister, David Willetts. Executive appointments and
establishment numbers for the Turing Institute have yet to be announced.
• "The intention is for this work to benefit British companies to take a critical advantage
in the field of Data Science – algorithms, analytics and big data," said the spokesman.
The “Bombe” at Bletchley Park
Turing Institute
• Alan Turing was a pivotal figure in mathematics and computing and has long been
recognised as such by fellow mathematicians and computer scientists for his ground-
breaking work on Computational Theory. There already exists a Turing Institute at
Glasgow University, and an Alan Turing Institute in the Netherlands, as well as the Alan
Turing building at the Manchester Institute for Mathematical Sciences.
• Alan Turing’s code-breaking work using “the Bombe” - an electromechanical decryption
system - led to the deciphering of the German "Enigma" codes, which used highly
complex encryption. His cryptanalysis work is claimed to have saved hundreds or even
thousands of lives and shortened WWII by as much as two years. Turing had earlier
formalised Computational Theory, which underpins modern computer science through the
separation of data from algorithms – sequences of instructions – in computer programming languages.
• Osborne's announcement marks further official rehabilitation of a scientist who many see
as having been badly treated by the British establishment after his work during WWII.
Turing, who was homosexual, was convicted of indecency in March 1952, and lost his
security clearance with GCHQ - the successor to Bletchley Park. Turing killed himself in
June 1954 - but was only given an official pardon by the UK government in December
2013 after a series of public campaigns for recognition of his achievements.
Digital Village – Strategic Partners
• Digital Village is a consortium of Future Management and Future Systems Consulting firms for
Digital Marketing and Lifestyle Strategy – Social Media / Big Data Analytics / Mobile / Cloud
Computing / GPS/GIS / Next Generation Enterprise (NGE) / Digital Business Transformation
• Colin Mallett Former Chief Scientist @ BT Laboratories, Martlesham Heath
– Board Member @ SH&BA and Visiting Fellow @ University of Hertfordshire
• Ian Davey Founder and MD @ Atlantic Forces
– Telephone: +44 (0) 203 4026 225 (Mobile)
– +44 (0) 7581 178414 (Office)
– Email: Ian@atlanticforce.co
• Nigel Tebbutt 奈杰尔 泰巴德
– Future Business Models & Emerging Technologies @ INGENERA
– Telephone: +44 (0) 7832 182595 (Mobile)
– +44 (0) 121 445 5689 (Office)
– Email: Nigel-Tebbutt@hotmail.com (Private)
Digital Village - Strategic Enterprise Management (SEM) Framework ©
Proof-of-concept and Prototype
The Patient Pyramid™ approach is lean, agile, smart and creative: -
• We start by providing a custom Pyramid™ Enterprise Application as a proof of concept.
We then work with client key stakeholders to scope a detailed brief which articulates a
business problem domain that the Patient Pyramid™ can help understand and resolve.
• We then harvest all current and past patient records along with any other available internal
and public domain biomedical data – in order to establish a baseline Patient Pyramid™.
• This is augmented by overlaying external data - Social Intelligence and other live
streamed Patient Lifestyle / Biomedical data that drives our new real-time Patient
Pyramid™ view describing the six primitives - who / what / why / where / when and how.
• Finally, we exploit social intelligence for Patient Lifestyle understanding – creating new
actionable insights to inform creative medical campaign solutions against the agreed brief.
• Post proof-of-concept, we then agree a Pyramid™ Enterprise Application fixed term
licence along with Patient Pyramid™ consulting, mentoring, training and support – on-
line, on-site, on-demand - whenever and wherever required.
4D Geospatial Analytics in Digital Healthcare PDF

More Related Content

What's hot

How the Internet of Things Is Transforming Medical Devices
How the Internet of Things Is Transforming Medical DevicesHow the Internet of Things Is Transforming Medical Devices
How the Internet of Things Is Transforming Medical DevicesCognizant
 
¿Cómo puede ayudarlo Qlik a descubrir más valor en sus datos de IoT?
¿Cómo puede ayudarlo Qlik a descubrir más valor en sus datos de IoT?¿Cómo puede ayudarlo Qlik a descubrir más valor en sus datos de IoT?
¿Cómo puede ayudarlo Qlik a descubrir más valor en sus datos de IoT?Data IQ Argentina
 
Big Data for Development
Big Data for DevelopmentBig Data for Development
Big Data for DevelopmentJoud Khattab
 
Intelligent Maintenance: Mapping the #IIoT Process
Intelligent Maintenance: Mapping the #IIoT ProcessIntelligent Maintenance: Mapping the #IIoT Process
Intelligent Maintenance: Mapping the #IIoT ProcessDan Yarmoluk
 
Future Internet of IoT- A Survey of Healthcare Internet of Things (HIoT) : A ...
Future Internet of IoT- A Survey of Healthcare Internet of Things (HIoT) : A ...Future Internet of IoT- A Survey of Healthcare Internet of Things (HIoT) : A ...
Future Internet of IoT- A Survey of Healthcare Internet of Things (HIoT) : A ...M Shamim Iqbal
 
EiTESAL IOT DAY 26-10-2016
EiTESAL IOT DAY 26-10-2016EiTESAL IOT DAY 26-10-2016
EiTESAL IOT DAY 26-10-2016EITESANGO
 
Final presentation version of prominent role of io t
Final presentation version of prominent role of io t Final presentation version of prominent role of io t
Final presentation version of prominent role of io t Helyxon Healthcare
 
The competitive landscape of the Internet of Things
The competitive landscape of the Internet of ThingsThe competitive landscape of the Internet of Things
The competitive landscape of the Internet of ThingsIoTAnalytics
 
Internet of things
Internet of thingsInternet of things
Internet of thingsmmaslo
 
Glassbeam Drives Analytics Innovation
Glassbeam Drives Analytics InnovationGlassbeam Drives Analytics Innovation
Glassbeam Drives Analytics InnovationHarbor Research
 
Big Data, CEP and IoT : Redefining Healthcare Information Systems and Analytics
Big Data, CEP and IoT : Redefining Healthcare Information Systems and AnalyticsBig Data, CEP and IoT : Redefining Healthcare Information Systems and Analytics
Big Data, CEP and IoT : Redefining Healthcare Information Systems and AnalyticsTauseef Naquishbandi
 
Why B2B should embrace IoE
Why B2B should embrace IoEWhy B2B should embrace IoE
Why B2B should embrace IoEJerome Petit
 
PreScouter Internet of Medical Things: Industry Roundtable Webinar
PreScouter Internet of Medical Things: Industry Roundtable WebinarPreScouter Internet of Medical Things: Industry Roundtable Webinar
PreScouter Internet of Medical Things: Industry Roundtable WebinarPreScouter
 
Connected Medical Devices in the Internet of Things
Connected Medical Devices in the Internet of ThingsConnected Medical Devices in the Internet of Things
Connected Medical Devices in the Internet of ThingsReal-Time Innovations (RTI)
 
AED Final Project Abstract
AED Final Project AbstractAED Final Project Abstract
AED Final Project AbstractRamya Reddy
 

What's hot (20)

4D Geospatial Analytics in Digital Healthcare PDF

  • 1. From sports to scientific research, a surprising range of industries will begin to find value in big data.....
  • 2. Digital Health Technologies
    These are some of the most important DIGITAL HEALTH CATEGORIES:
      • Digital Imaging – (MRI / CTI / X-Ray / Ultrasound)
      • Robotic Surgery – (Microsurgery / Remote Surgery)
      • Patient Monitoring – (Clinical Trials / Health / Wellbeing)
      • Biomedical Data – (Data Streaming / Biomedical Analytics)
      • Epidemiology – (Disease Transmission / Contact Management)
      • Emergency Incident Management – (Response Teams / Alerts and Alarms)
    Here are a few of the most important DIGITAL MONITORING SMART APPS:
      • Activity Monitor – (Pedometer / GPS)
      • Position Monitor – (Falling / Fainting / Fitting)
      • Breathing Monitor – (Breathing Rate / SATS Level)
      • Sleep Monitor – (Light Sleep / Deep Sleep / REM / Apnoea)
      • Blood Monitor – (Glucose / Oxygen / Hormones / Organ Function)
      • Cardiac Monitor – (Heart Rhythm / Blood Pressure / Cardiac Events)
  • 3. Digital Health Technologies
    These are some of the most influential FUTURE DIGITAL HEALTH leaders:
      – Huawei – John Frieslaar (Digital Futures)
      – Cisco – Andrew Green (Digital Healthcare)
      – ElationEMR – Kyna Fong (Digital Imaging)
      – Microsoft – John Coplin (Digital Healthcare)
      – Google – Eze Vidra (Head of Campus at Tech City)
      – GE Healthcare – Catherine Yang (Digital Healthcare)
      – MIT – Prof Alex “Sandy” Pentland (Digital Epidemiology)
      – Telefónica Digital – Mathew Key, CEO (Digital Healthcare)
      – Open University – Dr. Blain Price (Digital Patient Monitoring)
      – UCLA – Prof. Larry Smarr (FuturePatient – Digital Patient Monitoring)
      – Telefónica – Dr. Mike Short CBE (Digital Futures and the Smart Ward)
      – Thames Valley Health Innovation and Education Cluster – David Doughty
      – Department of Business, Industry & Skills – Richard Foggie, KTN Executive
      – Science City Research Alliance – Sarah Knaggs (Strategic Project Manager)
  • 4. Digital Healthcare – Executive Summary
    • Digital Healthcare is a cluster of new and emerging applications and technologies that exploit digital, mobile and cloud platforms for treating and supporting patients. The term "Digital Healthcare" is necessarily broad and generic, because this approach, driven by innovation in Bioinformatics and Medical Analytics, is applied to a very wide range of social and health problems: from monitoring patients in intensive care, on general wards, in convalescence or at home, to helping general practitioners make better informed and more accurate diagnoses and improving the effect of prescription and referral decisions for clinical treatment.
    • Bioinformatics and Medical Analytics use Data Science to provide actionable clinical insights. Digital Healthcare has evolved from the need for more proactive and efficient healthcare service delivery. It seeks to offer new and improved types of proactive and preventive monitoring and medical care at reduced cost, using methods that are only possible thanks to emerging SMAC Digital Technology.
    Digital Healthcare Technologies – Bioinformatics and Medical Analytics:
      – Digital Patient Monitoring
      – Biomedical Data Streaming
      – Biomedical Data Science and Analytics
      – Epidemiology, Clinical Trials, Morbidity and Actuarial Outcomes
    • Novel and emerging high-impact Biomedical Health Technologies such as Bioinformatics and Medical Analytics are transforming the way that Healthcare Service Providers can deliver Digital Healthcare globally. Digital Health Technology entrepreneurs, investors and researchers are becoming increasingly interested in this important and rapidly expanding Life Sciences industry sector.
  • 5.
  • 6. Digital Healthcare – Executive Summary
    • While many industries can benefit from SMAC digital technology (Smart Devices, Mobile Platforms, Analytics and the Cloud), this is especially the case for the Life Sciences, Pharma and Healthcare industry sectors, where it results in more accurate diagnosis, improved treatment regimes, more reliable prognosis, and better patient monitoring, care and clinical outcomes. Let’s take a look at some of the Digital Technologies that are bringing significant improvements and benefits to Healthcare.
    • Today, thanks to the compliance requirements of HIPAA, HITECH, PCI DSS and ISO 27001, the reluctance to adopt Digital Technology has been overcome, and Digital Healthcare adoption is gaining traction. Many of the security features required for data protection and patient confidentiality are being addressed by Digital Healthcare service providers, which relieves healthcare delivery organizations of tedious and complex security and data protection frameworks.
    Biomedical Data Analytics:
    • The exploitation of data by applying analytical methods such as statistics and predictive and quantitative models to patient segments or population groups will provide better insights and achieve better outcomes. As far back as 2010, there was evidence that: “93 percent of healthcare providers identified the digital information explosion as the major factor which will drive organizational change over the next 5 years.” (Related article: Cloud and healthcare: A revolution is coming)
  • 7. Digital Healthcare – Executive Summary
    Data Security and Privacy:
    • Today, thanks to the compliance requirements of HIPAA, HITECH, PCI DSS and ISO 27001, reluctance to adopt emerging technologies is starting to be addressed and digital technology is beginning to gain traction. Bear in mind also that many of the security features required for data security and protection are addressed by the service providers, which relieves the healthcare organization of tedious and complex security frameworks.
    Mobility:
    • With Mobility Services, Smart Devices, Smart Apps, Mobile Platforms and Cloud Infrastructure provide the backbone for medical personnel to access all sorts of patient information from anywhere, and from a wide range of mobile devices.
    Collaboration with patients:
    • Mobility means that complete patient records are now available to healthcare professionals anytime, anywhere. Physicians can access historical patient case records, images and clinical data to fine-tune their diagnosis and make informed decisions on treatment, which reduces diagnosis latency, increases accuracy and improves patient care and clinical outcomes from initial consultation to specialist referral. Some scenarios are illustrated in the following:
    • Physician Collaboration Solutions (PCS)
      • PCS solutions offer video conferencing to facilitate remote consultations and care continuity, allowing patients to be viewed remotely. PCS allows physicians to consult with patients and even perform remote robotic surgery. These are dubbed “tele-health solutions.”
  • 8. Digital Healthcare – Executive Summary
    • Electronic Medical Records (EMR)
      • Every piece of information pertaining to a specific patient is recorded and stored. The solution is designed to capture and provide a patient’s data at any time in the patient’s monitoring cycle, including the complete medical records and history.
    • Patient Information Exchange (PIE)
      • This allows healthcare information to be shared electronically across organizations within a region, community or hospital system. There are currently several Digital Healthcare cloud service providers addressing this market, taking the role of collecting and distributing medical information from and among multiple organizations.
      • The New York Times has published an interesting article illustrating the use of the cloud in healthcare: leveraging big data in the cloud to manage patient relationships and clinical outcomes.
    Collaboration among peers:
    • Technology can provide medical assistance to doctors in the field, be it in remote areas or in emergency relief operations, through satellite communications. Refer to the Remote Assistance for Medical Teams Deployed Abroad (T4MOD project), which could easily find its place in the Digital Healthcare cloud space.
  • 9. 4D Geospatial Analytics in Digital Healthcare
  • 10. GIS Mapping and Spatial Analysis • 4D Geospatial Analytics is the Geographic profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of clusters or groupings – this provides an important basic technique for many statistical and analytic applications. • Environmental and Demographic Geospatial Cluster Analysis – based on geographic distribution or profile similarities – is a statistical method whereby no prior assumptions are made concerning the nature of internal data structures (the number and type of groups and hierarchies). Geo-spatial and geodemographic techniques are frequently used in order to profile and segment populations using ‘natural’ groupings such as shared or common behavioural traits – Medical, Clinical Trial, Morbidity or Actuarial outcomes - along with many other common factors and shared characteristics.....
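As an illustration of clustering with no prior assumption about the number of groups, the sketch below applies a simple density-based scheme (in the style of DBSCAN) to latitude/longitude points. It is not taken from the slides: the function names, the sample coordinates and the `eps_km` / `min_pts` thresholds are all assumptions chosen for the example.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def density_clusters(points, eps_km, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    The number of clusters is not supplied; it emerges from the point density.
    """
    labels = [None] * len(points)

    def neighbours(i):
        # All points (including i itself) within eps_km of point i.
        return [j for j, q in enumerate(points)
                if haversine_km(points[i], q) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_pts:  # core point: keep expanding
                queue.extend(nearby)
    return labels

# Hypothetical case locations: two dense groups and one isolated point.
points = [
    (51.50, -0.12), (51.51, -0.13), (51.50, -0.11),
    (53.48, -2.24), (53.49, -2.25), (53.48, -2.23),
    (55.95, -3.19),
]
labels = density_clusters(points, eps_km=5.0, min_pts=3)
```

In practice a library implementation with a spatial index would replace the quadratic neighbour search used here.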
  • 11. GIS Mapping and Spatial Analysis
    GIS MAPPING and SPATIAL DATA ANALYSIS
    • A Geographic Information System (GIS) integrates hardware, software and digital data capture and streaming devices in order to acquire, manage, analyse, distribute, communicate and display every type of static and mobile geographically dependent location data. Sources include machine-generated data such as Computer-aided Design (CAD) information from land and building surveys, Global Positioning System (GPS) terrestrial location data, wearable technology and biomedical data streams, along with imaging data feeds such as personal, transportation and environment data, HD CCTV, and aerial and satellite image data.
    • Spatial Data Analysis is a set of techniques for analysing 3-dimensional spatial (Geographic) data and location (Positional) object data overlays. GIS software that implements spatial data analysis techniques requires access to both the locations of objects and their physical attributes. Spatial statistics extends traditional statistics to support the analysis of geographic data. Spatial Data Analysis provides techniques to describe the distribution of data in a geographic space (descriptive spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster analysis), identify and measure spatial relationships (clusters and spatial regression), and create 3D surface models from sampled data (spatial interpolation, often categorised as geo-statistics).
    • The results of spatial data analysis are largely dependent upon the type, quantity, distribution and data quality of the spatial objects which are subject to analysis.
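Spatial interpolation, the last technique named above, can be sketched with inverse-distance weighting (IDW), one common interpolation method. The slides do not say which method is intended, so IDW is an assumption here, as are the planar (x, y) coordinates, the sample values and the function name.

```python
def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: iterable of (xi, yi, value) in planar coordinates (an assumption;
    real GIS data would first be projected to a suitable coordinate system).
    """
    num = 0.0
    den = 0.0
    for xi, yi, value in samples:
        d2 = (x - xi) ** 2 + (y - yi) ** 2
        if d2 == 0.0:
            return value                 # exactly on a sample point
        weight = 1.0 / d2 ** (power / 2)
        num += weight * value
        den += weight
    return num / den

# Hypothetical sampled readings at three locations.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]

# Estimate a small surface on a 3 x 3 grid covering the sampled area.
surface = [[idw(samples, float(gx), float(gy)) for gx in range(3)]
           for gy in range(3)]
```

Each grid value is a weighted average of the samples, so it always lies between the smallest and largest sampled value; this is why the result depends so heavily on the quantity and distribution of the samples.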
  • 12. GIS Mapping and Spatial Analysis GIS Gazetteer – Biomedical Clusters
  • 13. The Cone™ – Actionable Clinical Insights
  • 14. The Cone™ – Patient Model The Cone™ - Patient Model turning Biomedical Data Streams, Social Intelligence, Patient Monitoring and Analytics – into Actionable Clinical Insights… • Acute – (10%) Active Patient Monitoring – Alerts and Alarms • Chronic – (20%) Passive Monitoring – Biomedical Data Streaming • Casuals – (30%) Walk-in – on-demand Monitoring and Treatment • Indifferent – (40%) Annual Screening – Health-check and Review
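The four segments and their percentage shares can be written down as a small lookup table. The sketch below is only an illustration of the arithmetic: the `Segment` type, the `expected_counts` helper and the population figure are assumptions, while the names, shares and monitoring modes are those listed on the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    name: str
    share_pct: int      # share of the patient population, as given on the slide
    monitoring: str

CONE_SEGMENTS = (
    Segment("Acute", 10, "Active patient monitoring: alerts and alarms"),
    Segment("Chronic", 20, "Passive monitoring: biomedical data streaming"),
    Segment("Casuals", 30, "Walk-in: on-demand monitoring and treatment"),
    Segment("Indifferent", 40, "Annual screening: health-check and review"),
)

def expected_counts(population):
    """Expected number of patients per segment for a given population size."""
    return {s.name: population * s.share_pct // 100 for s in CONE_SEGMENTS}

# Hypothetical practice list of 5,000 patients.
counts = expected_counts(5000)
```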
  • 16. The Cone™ - Patient Clusters
    [Diagram: The Cone™ – Patient Biomedical Analytics, Actionable Clinical Insights. Clusters: Acute - 10%; Chronic - 20%; Casuals - 30%; Indifferent - 40%. Other labels: Presentation; Clustering; Biomedical Profile; Biomedical Epidemiology – Groups (Streams), Types (Segments); Hybrid Cone – 3 Dimensions; Biomedical Analytics.]
  • 17. The Cone™ - Eight Primitives
    Primitive | Domain | Function | Product
    Who? | People – Patient | Patient Information System | Electronic Medical Records (EMR)
    Where? | Places – Location | 1st Responders, Emergency Services, GP, Nurse, Doctor | Command / Control / Geospatial Analytics
    When? | Medical Incident / Event | Event Type - Referral, Walk-in, Appointment, Emergency | Incident Management – Event Type / Time / Date
    What? | Emergency / Medical / Clinical Procedure | Investigate / Test / Diagnose / Treatment / Follow-up | Patient Administration / Patient Care Systems
    Why? | Reason / Motivation / Cause / Outcome | Triage Patient Status - Acute, Chronic, Casual, Indifferent | Biomedical Information Streaming and Analytics
    How? | Patient Medical Data | Automatic Streaming of Biomedical Data to Cloud | Mobile Platforms / IoT, Smart Devices / Apps
    Which? | Investigation / Test / Observe / Diagnosis | Healthcare Provider - GPs Surgery, Clinics, Hospitals | Patient Administration / Patient Care Systems
    Via? | Referral Channel / Health Service Delivery Partner | Healthcare Service Provider – Surgery, Clinics, Hospitals | Healthcare Service Partner / Procedure
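One way to picture the eight primitives is as the fields of a single record describing a medical event. The sketch below is a hypothetical data structure, not the Cone™ product's schema: the class name, field types and sample values are all assumptions; only the eight primitive names come from the slide.

```python
from dataclasses import dataclass, fields
from datetime import datetime

@dataclass(frozen=True)
class MedicalFact:
    who: str          # patient identifier
    where: str        # location of the incident or consultation
    when: datetime    # time and date of the event
    what: str         # procedure: investigate / test / diagnose / treat / follow-up
    why: str          # triage status: Acute, Chronic, Casual or Indifferent
    how: str          # source of the patient medical data
    which: str        # healthcare provider: GP surgery, clinic or hospital
    via: str          # referral channel or delivery partner

# Hypothetical example record.
fact = MedicalFact(
    who="patient-001",
    where="GP surgery",
    when=datetime(2015, 5, 11, 9, 30),
    what="Diagnose",
    why="Chronic",
    how="Wearable device stream",
    which="Clinic",
    via="Referral",
)

primitive_names = [f.name for f in fields(MedicalFact)]
```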
  • 18. The Patient Cone™ – EIGHT PRIMITIVES
    [Diagram: the Cone™ MEDICAL FACT surrounded by eight dimensions, Version 3 – Healthcare. Dimensions: Event, Party, Geographic, Motivation, Time, Data, Procedure (WHICH?), Channel (VIA?). Questions: WHO? WHAT? WHERE? HOW? WHEN? WHY? WHICH? VIA? Member lists as shown: Indifferent, Casuals, Chronic, Acute; Clinical Notes, Images / Graphs, Biomedical Data, Lab Test Results, Cardiac Activity, Brain Activity; Consultation, Clinical Tests, Diagnosis, Treatment; Appointment, Attendance, Phone Call, Letter; Location, Attitude, Movement; Region / Country, State / County, City / Town, Street / Building, Postcode; Person, Organisation; Procedure, Prescription; Channel / Partner, Hospital / Clinic; Walk-in, Emergency, Referral, Follow-up. Other labels: Patient Data, Delivery Channel, Environment Data, Subject Location, Biomedical Data, Event, Motivation, Patient, Time / Date.]
  • 19. The Biomedical Cone™ – Converting Data Streams into Actionable Insights
    [Diagram: The Cone™ – Patient Biomedical Analytics, Actionable Medical Insights. Labels: Salesforce; Anomaly 42; Cone; Unica; End User; BIG DATA ANALYTICS; BIOMEDICAL DATA; Patient Monitoring Platform; INTERVENTION (Treatment, Smart Apps); Electronic Medical Records (EMR); Individuals, Households, Geo-demographics, Patient Streaming, Patient Segmentation; PATIENT RECORDS (Medical History, Key Events); Insights; Biomedical Data Streaming; People, Places and Events; Health Campaigns; Clinical and Biomedical Data; Images – X-Ray, CTI, MRI; Procedures and Interventions; Prescriptions and Treatment; Social Media Monitoring; EXPERIAN MOSAIC.]
  • 20. Proof-of-concept and Prototype The Patient Pyramid™ approach is lean, agile, smart and creative: - • We start by providing a custom Pyramid™ Enterprise Application as a proof of concept. We then work with client key stakeholders to scope a detailed brief which articulates a business problem domain that the Patient Pyramid™ can help understand and resolve. • We then harvest all current and past patient records along with any other available internal and public domain biomedical data – in order to establish a baseline Patient Pyramid™. • This is augmented by overlaying external data - Social Intelligence and other live streamed Biomedical and Patient Lifestyle Data that drives our new real-time Patient Pyramid™ view describing the six primitives - who / what / why / where / when and how. • Finally, we exploit social intelligence for Patient Lifestyle Understanding – creating new actionable insights to inform creative medical campaign solutions against the agreed brief. • Post proof-of-concept, we can then agree a Pyramid™ Enterprise Application fixed term licence along with Patient Pyramid™ add-ons, enhancements, consulting, mentoring, training and support – on-line, on-site, on-demand - whenever and wherever required.
  • 21. 4D Geospatial Analytics in Digital Healthcare
    Digital Futures: Creating new roles and value chains
    Novel and emerging Biomedical Health Technologies are transforming the way that Healthcare Providers can deliver Healthcare globally, with Digital Health Technology entrepreneurs and investors becoming increasingly attracted to this rapidly growing industry sector. Healthcare Delivery is currently undergoing a global transformation, with Digital Health Technologies leading the way. Companies such as BT Health, Blueprint Health, BUPA, Microsoft (John Coplin), Telefónica Digital (Dr. Mike Short) and Rock Health are all shaping novel and emerging Digital Healthcare Technologies and bringing new and innovative business propositions to market.
  • 22. 4D Geospatial Analytics Geo-spatial and geodemographic techniques are frequently used to profile, stream and segment human populations using ‘natural’ groupings such as shared or common behavioural traits – Medical, Clinical Trial, Morbidity or Actuarial outcomes – along with many other common factors and shared characteristics..... The profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of clusters or groupings, provides an important basic technique for many statistical and analytic applications. Based on geographic distribution or profile similarities – Geospatial Clustering is a statistical method whereby no prior assumptions are made concerning the nature of internal data structures (the number and type of groups and hierarchies).
  • 23. 4D Geospatial Analytics GIS Gazetteer – Biomedical Clusters
  • 24. The Flow of Information through Time
    • Space-Time is a four-dimensional (4D) integrated dimensional cluster consisting of the three Spatial dimensions (x, y and z axes) plus Time (the fourth dimension, t). Space-Time exists in discrete packages (Temporal Planes), with the whole of Space-Time existing as an endless stack of Temporal Planes extending from the remote Past, through our Present, and onwards to the distant Future. Events exist as a line through this stack of Temporal Planes. Thus Time Present is always inextricably woven into both Time Past and Time Future. Every item of Global Content in the Present is somehow connected with both Past and Future temporal planes in a timeline which is composed of a sequence of temporal planes stacked one on top of another. The “arrow of time” governs the flow of Space-Time, which can only flow in a single direction: relentlessly towards the future.
    • Space-Time does not flow uniformly. The path of the “arrow of time” may be deflected or changed by various factors: gravitational fields, dark matter, dark energy, dark flow, hidden dimensions or unknown Membranes in Hyperspace. There may also exist “hidden external forces” (unseen interactions) that create disturbance in the temporal plane stack which marks the passage of time, with the potential to create eddies, vortices and whirlpools along the trajectory of Time (chaos, disorder and uncertainty). These in turn possess the capacity to generate ripples and waves (randomness and disruption), thus changing the course of the Space-Time continuum. “Weak Signals” are “Ghosts in the Machine”: echoes of these subliminal temporal interactions that may contain insights or clues about possible future “Wild card” or “Black Swan” random events.
  • 25. The Flow of Information through Time • String Theory physicists and mathematicians postulate that Space-Time exists in discrete packages (Temporal Planes) - with the whole of Space-Time existing as an endless stack of Temporal Planes extending from the remote Past, through into our Present, and onwards to the distant Future. Thus Time Present is always inextricably woven into both Time Past and Time Future. This yields the intriguing possibility of glimpses through the mists of time into the outcomes of future Event Paths – both isolated Events and linked Event Clusters – as any item of Data or Information (Global Content) may contain faint traces which offer insights into the future trajectory of Past, Present and Future Events. • If all future timelines were linear in nature - then every event would unfold in an unerringly predictable manner towards a known and certain conclusion. The future is, however, both unknown and unknowable (Hawking Paradox). Events exist as a line through this stack of Temporal Planes. Future timelines are non-linear (branched) with an infinite multitude of possible alternative futures – rendering future outcomes as uncertain and unpredictable. Chaos Theory suggests to us that even the most ethereal and subliminal system inputs originating from invisible random events in the Space-Time continuum, are able to project minute unknown forces so small as to be undetectable, which may then simply disappear – or become amplified over time through numerous system cycles to grow in influence and impact – slowly deviating predicted Space-Time trajectories far away from their original estimated path – thus fundamentally altering the flow and outcome of Future Events.
  • 26. 4D Geospatial Analytics – The Temporal Wave • The Temporal Wave is a method for Visual Modelling and Exploration of Geospatial “Big Data” - simultaneously within a Time (history) and Space (geographic) context. The problems encountered in exploring and analysing vast volumes of spatial–temporal information in today's data-rich landscape are becoming increasingly difficult to manage effectively. Overcoming the problem of data volume and scale in a Time (history) and Space (location) context requires not only the traditional location–space and attribute–space analysis common in GIS Mapping and Spatial Analysis - but also the additional dimension of time–space analysis. The Temporal Wave supports a new method of Visual Exploration for Geospatial (location) data within a Temporal (timeline) context. • This time-visualisation approach integrates Geospatial (location) data with Temporal (timeline) data, along with data visualisation techniques - thus improving accessibility, exploration and analysis of the huge amounts of geo-spatial data used to support geo-visual “Big Data” analytics. The Temporal Wave combines the strengths of both linear timeline and cyclical wave-form analysis – and is able to represent data within a Space (geographic) and Time (history) context simultaneously – even at different levels of granularity. Linear and cyclic trends in space-time data may be represented in combination with other graphic representations typical for location–space and attribute–space data-types. The Temporal Wave can be used in multiple roles for exploring very large scale datasets containing Geospatial (location) data within a Temporal (timeline) context - as an integrated Space-Time data reference system, as a Space-Time continuum representation and animation tool, and as a Space-Time interaction, simulation and analysis tool.
  • 27. 4D Geospatial Analytics – The Temporal Wave • The problems encountered in exploring, analysing and extracting insights from the vast volumes of spatial–temporal information in today's data-rich landscape are becoming increasingly difficult to manage effectively. Overcoming the problem of data volume and scale in an integrated Time (history) and Space (location) context requires not only the traditional location–space and attribute–space analysis common in GIS Mapping and Spatial Analysis - but also the additional dimension of Space-Time analysis. The Temporal Wave supports a new method of Visual Exploration for Geospatial (location) data within a Temporal (timeline) context. It is a method for Visual Modelling, Exploration and Analysis of the Space-Time dimension fundamental to understanding Geospatial “Big Data” – through simultaneously visualising and displaying complex data within a Time (history) and Space (geographic) context. [Figure: The “arrow of time” - axes running from Simplicity to Complexity (increasing element and interaction density), from Order to Chaos, and from Enthalpy to Entropy; regions labelled Linear Systems, Simplexity, Ordered Complexity, Complex Adaptive Systems (CAS) and Disordered Complexity]
  • 28. 4D Geospatial Analytics – The Temporal Wave • The Temporal Wave time-visualisation approach integrates Geospatial (location) data within a Temporal (timeline) dataset - along with other data visualisation techniques - thus improving accessibility, exploration and analysis of the huge amounts of geo-spatial data used to support geo-visual “Big Data” analytics. The Temporal Wave combines the strengths of both linear timeline and cyclical wave-form analysis – and is able to represent complex data within a Time (history) and Space (geographic) context simultaneously – even at different levels of granularity. Linear and cyclic trends in space-time data may be represented in combination with other graphic representations typical for location–space and attribute–space data-types. The Temporal Wave can be deployed in roles as diverse as a Space-Time data reference system, a Space-Time continuum representation tool, and a Space-Time display / interaction / simulation / analysis tool. [Figure: The “arrow of time” - axes running from Simplicity to Complexity (increasing element and interaction density), from Order to Chaos, and from Enthalpy to Entropy; regions labelled Linear Systems, Simplexity, Ordered Complexity, Complex Adaptive Systems (CAS) and Disordered Complexity]
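The idea of viewing the same geospatial events on a linear timeline and on a cyclical wave-form at once can be illustrated with a small space-time binning sketch. This is not the Temporal Wave method itself, only a minimal illustration under stated assumptions: the function name `space_time_key`, the 0.5-degree grid cell size and the sample events are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def space_time_key(lat, lon, when, cell_deg=0.5):
    """Map one event to (grid cell, calendar day, hour of day).

    The grid cell is the Space component; the calendar day is a position on
    the linear timeline; the hour of day is a phase in a daily cycle.
    The cell size is an arbitrary assumption for illustration.
    """
    cell = (int(lat // cell_deg), int(lon // cell_deg))
    return cell, when.date().isoformat(), when.hour


# Hypothetical events: (latitude, longitude, timestamp)
events = [
    (51.51, -0.13, datetime(2015, 3, 1, 8, 15)),
    (51.52, -0.12, datetime(2015, 3, 1, 8, 40)),
    (51.51, -0.14, datetime(2015, 3, 2, 8, 5)),
    (48.85, 2.35, datetime(2015, 3, 1, 17, 30)),
]

keys = [space_time_key(lat, lon, when) for lat, lon, when in events]

# Linear view: events per cell per calendar day.
linear_counts = Counter((cell, day) for cell, day, _ in keys)
# Cyclic view: events per cell per hour of day, folded across all days.
cyclic_counts = Counter((cell, hour) for cell, _, hour in keys)
```

Changing `cell_deg`, or binning by week instead of day, gives the same data at a different level of granularity.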
  • 29. Digital Healthcare – Technical Appendices
  • 30. 4D Geospatial Analytics – London Timeline
  • 31. 4D Geospatial Analytics – London Timeline • How did London evolve from its creation as a Roman city in AD 43 into the crowded, chaotic cosmopolitan megacity we see today? The London Evolution Animation takes a holistic view of what has been constructed in the capital over different historical periods – what has been lost, what saved and what protected. • Greater London covers 600 square miles. Up until the 17th century, however, the capital was crammed largely into a single square mile, marked today by the skyscrapers of the City financial district. • This visualisation, originally created for the Almost Lost exhibition by the Bartlett Centre for Advanced Spatial Analysis (CASA), explores the historic evolution of the city by plotting a timeline of the development of the road network - along with documented buildings and other features – through 4D geospatial analysis of a vast number of diverse geographic, archaeological and historic data sets. • Unlike other historical cities such as Athens or Rome, with an obvious patchwork of districts from different periods, London's individual structures, scheduled sites and listed buildings were in many cases constructed gradually, from parts assembled during different periods. Researchers who have previously tried to locate and document archaeological structures and research historic references will know that these features, when plotted, appear scrambled like pieces of different jigsaw puzzles, scattered across the contemporary London cityscape.
  • 32. History of Digital Epidemiology • Doctor John Snow (15 March 1813 – 16 June 1858) was an English physician and a leading figure in the adoption of anaesthesia and medical hygiene. John Snow is largely credited with sparking and pursuing a total transformation in Public Health and epidemic disease management, and is considered one of the fathers of modern epidemiology, in part because of his work in tracing the source of a cholera outbreak in Soho, London, in 1854. • John Snow's investigation into the Broad Street cholera outbreak - which occurred in 1854 near Broad Street in the London district of Soho - inspired fundamental changes in both the clean and waste water systems of London, which led to similar changes in other cities and a significant improvement in the understanding of Public Health around the world.
  • 33. History of Digital Epidemiology • The Broad Street cholera outbreak of 1854 was a severe outbreak of cholera which occurred near Broad Street in the London district of Soho. • This cholera outbreak is best known for the statistical analysis and study of the epidemic by the physician John Snow, and for his discovery that cholera is spread by contaminated water. This knowledge drove improvement in Public Health, with mass construction of sanitation facilities from the middle of the 19th century. • Later, the term "focus of infection" would be used to describe factors such as the Broad Street pump – where Social and Environmental conditions may result in the outbreak of local infectious diseases.
  • 34. History of Digital Epidemiology • It was the study of cholera epidemics, particularly in Victorian England during the middle of the 19th century, which laid the foundation for epidemiology - the applied observation and surveillance of epidemics and the statistical analysis of public health data. • This discovery came at a time when the miasma theory of disease transmission by noxious “foul air” prevailed in the medical community.
  • 35. History of Digital Epidemiology • Modern epidemiology has its origin in the study of cholera. • Broad Street cholera outbreak of 1854
  • 36. History of Digital Epidemiology • Modern epidemiology has its origin in the study of cholera. • It was the study of cholera epidemics, particularly in Victorian England during the middle of the 19th century, that laid the foundation for the science of epidemiology - the applied observation and surveillance of epidemics and the statistical analysis of public health data. This took place at a time when the miasma theory of disease transmission prevailed in the medical community. • John Snow is largely credited with sparking and pursuing a transformation in Public Health and epidemic disease management from the extant paradigm, in which communicable illnesses were thought to be carried by bad, malodorous airs, or "miasmas" - towards a new paradigm which would begin to recognise that virulent contagious and infectious diseases are communicated by various other means – such as water polluted by human sewage. This new approach to disease management recognised that contagious diseases were either directly communicable through contact with infected individuals - or spread via vectors of infection (water, in the case of cholera) which are susceptible to contamination by viral and bacterial agents.
  • 37. History of Digital Epidemiology • This map is John Snow’s famous plot of the 1854 Broad Street Cholera Outbreak in London. By plotting epidemic data on a map like this, John Snow was able to identify that the outbreak was centred on a specific water pump. • Interviews confirmed that outlying cases were from people who would regularly walk past the pump and take a drink. He had the handle removed from the water pump and the outbreak ended almost overnight. • The cause of cholera (the bacterium Vibrio cholerae) was unknown at the time, and Snow’s important work with cholera in London during the 1850s is considered the beginning of modern epidemiology. Some have even gone so far as to describe Snow’s Broad Street Map as the world’s first GIS.
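The spatial reasoning behind Snow's map, counting cases around each candidate source, can be sketched in a few lines. This is only an illustration of the idea, not a reconstruction of Snow's analysis: the pump names other than Broad Street, all coordinates and the helper functions are hypothetical, and straight-line distance stands in for walking distance.

```python
from collections import Counter
from math import hypot

# Hypothetical pump positions (x, y) in arbitrary map units; not Snow's data.
PUMPS = {
    "Broad Street": (0.0, 0.0),
    "Pump B": (5.0, 1.0),
    "Pump C": (-4.0, 3.0),
}

# Hypothetical locations of recorded cases.
CASES = [(0.2, 0.1), (-0.5, 0.4), (0.3, -0.6), (1.0, 0.8), (4.6, 1.2), (-0.2, -0.3)]


def nearest_pump(case, pumps):
    """Return the name of the pump closest to a case (straight-line distance)."""
    return min(
        pumps,
        key=lambda name: hypot(case[0] - pumps[name][0], case[1] - pumps[name][1]),
    )


def cases_per_pump(cases, pumps):
    """Count how many cases fall nearest to each pump."""
    return Counter(nearest_pump(case, pumps) for case in cases)


counts = cases_per_pump(CASES, PUMPS)
most_suspect_pump, case_count = counts.most_common(1)[0]
```

A pump that attracts a disproportionate share of nearby cases is a candidate source. The interviews described above remain necessary to explain the outlying cases.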
  • 38. History of Digital Epidemiology Broad Street cholera outbreak of 1854
  • 39. Clinical Risk Types [Diagram: Clinical Risk Types, with segments lettered A to J in two groups, the Clinical Risk Group and the Morbidity Risk Group. Stakeholders shown: Patient / Patients, Employee or Service Provider. Risk types shown: Human Risk, Process Risk, Legal Risk, 3rd Party Risk, Technology Risk, Sponsorship Risk, Compliance Risk (Performance, Finance, Standards), Patient Risk, Trauma Risk, Morbidity Risk, Triage Risk (Airways, Cognitive, Bleeding), Disease Risk, Shock Risk, Toxicity Risk, Organ Failure Risk, Immunological System Risk, Cardiovascular System Risk, Pulmonary System Risk, Neurological System Risk, Predation Risk, Environment Risk]
  • 41. • Case Study • Pandemics
  • 42. • Case Study • Pandemics • Pandemics - during a pandemic episode, such as the recent Ebola outbreak, current policies emphasise the need to ground decision-making on empirical evidence. This section studies the tension that remains in decision-making processes when there is a sudden and unpredictable change of course in an outbreak – or when key evidence is weak or ‘silent’. • The current focus in epidemiology is on the ‘known unknowns’ - factors with which we are familiar in the pandemic risk assessment processes. These risk processes cover, for example, monitoring the course of the pandemic, estimating the most affected age groups, and assessing population-level clinical and pharmaceutical interventions. This section looks for the ‘unknown unknowns’ - factors for which evidence is lacking or silent, and of which we have only limited or weak understanding in the pandemic risk assessment processes. • Pandemic risk assessment shows that a new, emerging, or sudden and unpredictable change in the pandemic situation does not accumulate a robust body of evidence for decision-making. These uncertainties may be conceptualised as ‘unknown unknowns’, or “silent evidence”. Historical and archaeological pandemic studies indicate that there may well have been evidence that was not discovered, known or recognised. This section looks at a new method to discover “silent evidence” - unknown factors that affect pandemic risk assessment - by focusing on the tension under pressure that impacts upon the actions of key decision-makers in the pandemic risk decision-making process.
  • 44. Pandemic Black Swan Events
    Black Swan Pandemic | Type / Location | Impact | Date
    Malaria | For the entirety of human history, Malaria has been a pathogen | The Malaria pathogen kills more humans than any other disease | 20 kya – present
    Smallpox (Antonine Plague) | Smallpox – Roman Empire / Italy | Smallpox is the 2nd worst killer | 165–180
    Black Death (Plague of Justinian) | Bubonic Plague – Roman Empire | 50 million people died | 6th century
    Black Death (Late Middle Ages) | Bubonic Plague – Europe | 75 to 200 million people died | 1340–1400
    Smallpox | Amazonian Basin Indians | 90% Amazonian Indians died | 16th century
    Tuberculosis | Western Europe, 18th – 19th c | 900 deaths per 100,000 pop. | 18th – 19th c
    Syphilis | Global pandemic – invariably fatal | 10% of Victorian men carriers | 19th century
    1st Cholera Pandemic | Global pandemic | Started in the Bay of Bengal | 1817–1823
    2nd Cholera Pandemic | Global pandemic | (arrived in London in 1832) | 1826–1837
    Spanish Flu | Global pandemic | 50 million people died | 1918
    Smallpox | Global pandemic | 300 million people died in 20th c | Eliminated 20th c
    Poliomyelitis | Global pandemic | Contracted by up to 500,000 persons per year | 1950s – 1960s
    AIDS | Global pandemic – mostly fatal | 10% Sub-Saharans are carriers | Late 20th century
    Ebola | West African epidemic – 50% fatal | Sub-Saharan Africa epicentre | Late 20th century
  • 45. For the entirety of human history, Malaria has been the most lethal pathogen to attack man
  • 46. Pandemic Black Swan Event Types • 1. Type: Malaria • Force: Parasitic Biological Disease • Epidemiology Black Swan Event: The Malaria pathogen has killed more humans than any other disease. Malaria may have been a human pathogen for the entire history of our species. Human malaria most likely originated in Africa and has coevolved along with its hosts, mosquitoes and non-human primates. Humans could have originally caught Plasmodium falciparum from gorillas. The first evidence of malaria parasites is approximately 30 million years old, found in mosquitoes preserved in amber from the Palaeogene period. About 10,000 years ago, a period which coincides with the development of agriculture (Neolithic revolution), malaria started having a major impact on human survival. A consequence was natural selection for sickle-cell disease, thalassaemias, glucose-6-phosphate dehydrogenase deficiency, ovalocytosis, elliptocytosis and loss of the Gerbich antigen (glycophorin C) and the Duffy antigen on erythrocytes, because such blood disorders confer a selective advantage against malarial infection (balancing selection). The first description of malaria dates back to about 2700 B.C. in China, where ancient writings refer to symptoms now commonly associated with malaria. Early anti-malarial treatments were first developed in China from the Qinghao plant, which contains the active ingredient artemisinin, re-discovered and still used in anti-malaria drugs today. The three major types of inherited genetic resistance to malaria (sickle-cell disease, thalassaemias, and glucose-6-phosphate dehydrogenase deficiency) were all present in the Mediterranean world 2,000 years ago, at the peak of the Roman Empire. The role of epidemics and disease in the ultimate decline and fall of the Roman Empire has been largely overlooked by Epidemiology researchers.
  • 48. Pandemic Black Swan Event Types • 2. Type: Smallpox • Force: Viral Biological Disease • Epidemiology Black Swan Event: The history of smallpox holds a unique place in medical history. One of the deadliest viral diseases known to man, it is the first disease to be prevented by vaccination - and also the only human disease to have been eradicated from the face of the earth by vaccination. Smallpox plagued human populations for thousands of years. Researchers who examined the mummy of Egyptian pharaoh Ramses V (died 1157 BCE) observed scarring similar to that from smallpox on his remains. Ancient Sanskrit medical texts, dating from about 1500 BCE, describe a smallpox-like illness. Smallpox was most likely present in Europe by about 300 CE – although there are no unequivocal records of smallpox in Europe before the 6th century CE. It has been suggested that it was a major component of the Plague of Athens that occurred in 430 BCE, during the Peloponnesian War, and was described by Thucydides. A recent analysis of the description of clinical features provided by Galen during the Antonine Plague, which swept through the Roman Empire and Italy in 165–180, indicates that the probable cause was smallpox. In 1796, after noting Smallpox immunity amongst milkmaids, Edward Jenner carried out his now famous experiment on eight-year-old James Phipps, using Cowpox as a vaccine to confer immunity to Smallpox. Some estimates indicate that 20th century worldwide deaths from smallpox numbered more than 300 million. The last known naturally occurring case of smallpox was recorded in Somalia in 1977.
  • 49. Pandemic Black Swan Event Types • 3. Type: Bubonic Plague • Force: Bacterial Biological Disease • Epidemiology Black Swan Event: The Bubonic Plague – or Black Death – was one of the most devastating pandemics in human history, killing an estimated 75 to 200 million people and peaking in Europe in the years 1348–50 CE. The Bubonic Plague is a bacterial disease – spread by fleas carried by Asian Black Rats - which originated in or near China and then travelled to Italy, overland along the Silk Road, or by sea along the Silk Route. From Italy the Black Death spread onwards through other European countries. Research published in 2002 suggests that the Black Death began in the spring of 1346 in the Russian steppe region, where a plague reservoir stretched from the north-western shore of the Caspian Sea into southern Russia. Although there were several competing theories as to the aetiology of the Black Death, analysis of DNA from victims in northern and southern Europe published in 2010 and 2011 indicates that the pathogen responsible was the Yersinia pestis bacterium, possibly causing several forms of plague. The first recorded epidemic ravaged the Byzantine Empire during the sixth century, and was named the Plague of Justinian after emperor Justinian I, who was infected but survived. That epidemic is estimated to have killed approximately 50 million people in the Roman Empire alone. During the Late Middle Ages (1340–1400) Europe experienced the most deadly disease outbreak in its history when the Black Death, the infamous pandemic of bubonic plague, arrived in 1347 and went on to kill an estimated one third of the European population.
  • 50. Pandemic Black Swan Event Types • 4. Type: Syphilis • Force: Bacterial Biological Disease • Epidemiology Black Swan Event: The exact origin of syphilis is unknown. There are two primary hypotheses: one proposes that syphilis was carried from the Americas to Europe by the crew of Christopher Columbus, the other proposes that syphilis previously existed in Europe but went unrecognised. These are referred to as the "Columbian" and "pre-Columbian" hypotheses. In late 2011 newly published evidence suggested that the Columbian hypothesis is valid. The appearance of syphilis in Europe at the end of the 1400s heralded decades of death as the disease raged across the continent. The first outbreak of syphilis in Europe was recorded in 1494/1495 in Naples, Italy, during a French invasion. First spread by returning French troops, the disease was known as the “French Pox”, and it was not until 1530 that the term "syphilis" was first applied by the Italian physician and poet Girolamo Fracastoro. By the 1800s it had become endemic, carried by as many as 10% of men in some areas - in late Victorian London this may have been as high as 20%. Often fatal, and associated with extramarital sex and prostitution, syphilis was accompanied by enormous social stigma. The secretive nature of syphilis helped it spread - disgrace was such that many sufferers hid their symptoms, while others carrying the latent form of the disease were unaware they even had it. Treponema pallidum, the syphilis causal organism, was first identified by Fritz Schaudinn and Erich Hoffmann in 1905. The first effective treatment (Salvarsan) was developed in 1910 by Paul Ehrlich, followed by the introduction of penicillin in 1943.
  • 51. Pandemic Black Swan Event Types • 5. Type: Tuberculosis • Force: Bacterial Biological Disease • Epidemiology Black Swan Event: Study of the evolutionary origins of Mycobacterium tuberculosis indicates that the most recent common ancestor was a human-specific pathogen, which encountered an evolutionary bottleneck leading to diversification. Analysis of mycobacterial interspersed repetitive units has allowed dating of this evolutionary bottleneck to approximately 40,000 years ago, which corresponds to the period subsequent to the expansion of Homo sapiens out of Africa. The same analysis dated the Mycobacterium bovis lineage as dispersing some 6,000 years ago. Tuberculosis existed 15,000 to 20,000 years ago, and has been found in human remains from ancient Egypt, India, and China. Human bones from the Neolithic show the presence of the bacteria, which may be linked to early farming and animal domestication. Evidence of tubercular decay has been found in the spines of Egyptian mummies, and TB was common both in ancient Greece and Imperial Rome. Tuberculosis reached its peak in the 18th century in Western Europe, with mortality as high as 900 deaths per 100,000 - due to malnutrition and overcrowded housing with poor ventilation and sanitation. Although relatively little is known about its frequency before the 19th century, the incidence of consumption, “the captain of all men of death”, is thought to have peaked between the end of the 18th century and the end of the 19th century. With the advent of HIV there has been a dramatic resurgence of tuberculosis, with more than 8 million new cases reported each year worldwide and more than 2 million deaths.
  • 52. Pandemic Black Swan Event Types • 6. Type: Cholera • Force: Bacterial Biological Disease • Epidemiology Black Swan Event: Cholera is a severe infection of the small intestine caused by the bacterium Vibrio cholerae, contracted by drinking water or eating food contaminated with the bacterium. Cholera symptoms include profuse watery diarrhoea and vomiting. The primary danger posed by cholera is severe dehydration, which can lead to rapid death. Cholera can now be treated with re-hydration and prevented by vaccination. Cholera outbreaks in recorded history have been explosive, and the global proliferation of the disease is seen by most scholars to have occurred in six separate pandemics, with the seventh pandemic still rampant in many developing countries around the world. The first recorded instance of cholera was described in 1563 in an Indian medical report. In modern times, the story of the disease begins in 1817, when it spread from its ancient homeland of the Ganges Delta in the Bay of Bengal in North East India to the rest of the world. The first cholera pandemic raged from 1817-1823, the second from 1826-1837. The disease reached Britain during October 1831 - and finally arrived in London in 1832 (13,000 deaths), with subsequent major outbreaks in 1841, 1848 (21,000 deaths), 1854 (15,000 deaths) and 1866. The physician John Snow – by studying the outbreak centred around the Broad Street well in 1854 – traced the source of cholera to drinking water which was contaminated by infected human faeces – ending the “miasma” or “bad air” theory of cholera transmission.
  • 53. Pandemic Black Swan Event Types • 7. Type: Poliomyelitis • Force: Viral Biological Disease • Epidemiology Black Swan Event: The history of poliomyelitis (polio) infections extends into prehistory. Ancient Egyptian paintings and carvings depict otherwise healthy people with withered limbs, and children walking with canes at a young age. It is theorised that the Roman Emperor Claudius was stricken as a child, and this caused him to walk with a limp for the rest of his life. Perhaps the earliest recorded case of poliomyelitis is that of Sir Walter Scott. At the time, polio was not known to medicine. In 1773 Scott was said to have developed "a severe teething fever which deprived him of the power of his right leg." Poliomyelitis has been described under various names: Dental Paralysis, Infantile Spinal Paralysis, Essential Paralysis of Children, Regressive Paralysis, Myelitis of the Anterior Horns and Paralysis of the Morning. In 1789 the first clinical description of poliomyelitis was provided by the British physician Michael Underwood as "a debility of the lower extremities". Although major polio epidemics were unknown before the 20th century, the disease has caused paralysis and death for much of human history. Over millennia, polio survived quietly as an endemic pathogen until the 1880s, when major epidemics began to occur in Europe; soon after, widespread epidemics appeared in the United States. By 1910, frequent epidemics had become regular events throughout the developed world, primarily in cities during the summer months. At its peak in the 1940s and 1950s, polio would maim, paralyse or kill over half a million people worldwide every year.
  • 54. Pandemic Black Swan Event Types • 8. Type: Typhoid Fever • Force: Bacterial Biological Disease • Epidemiology Black Swan Event: Typhoid fever is an acute illness associated with a high fever that is most often caused by the Salmonella typhi bacterium. Typhoid may also be caused by Salmonella paratyphi, a related bacterium that usually leads to a less severe illness. The bacteria are spread via deposition in water or food by a human carrier. An estimated 16–33 million cases of typhoid fever occur annually. Its incidence is highest in children and young adults between 5 and 19 years old. As of 2010 these cases caused about 190,000 deaths, up from 137,000 in 1990. Historically, in the pre-antibiotic era, the case fatality rate of typhoid fever was 10-20%. Today, with prompt treatment, it is less than 1%. • 9. Type: Dysentery • Force: Bacterial / Parasitic Biological Disease • Epidemiology Black Swan Event: Dysentery (the Flux or the bloody flux) is a form of gastroenteritis – a type of inflammatory disorder of the intestine, especially of the colon, resulting in severe diarrhoea containing blood and mucus in the faeces, accompanied by fever, abdominal pain and rectal tenesmus (a feeling of incomplete defecation), and caused by bacterial or parasitic infection of the gut. Conservative estimates suggest that 90 million cases of Bacterial Dysentery (Shigellosis) are contracted annually, killing at least 100,000. Amoebic Dysentery (Amoebiasis) infects some 50 million people each year, with over 50,000 cases resulting in death.
  • 55. Pandemic Black Swan Event Types • 10. Type: Spanish Flu • Force: Viral Biological Disease • Epidemiology Black Swan Event: In the United States, the Spanish Flu was first observed in Haskell County, Kansas, in January 1918, prompting a local doctor, Loring Miner, to warn the U.S. Public Health Service's academic journal. On 4th March 1918, army cook Albert Gitchell reported sick at Fort Riley, Kansas. A week later, on 11th March 1918, over 100 soldiers were in hospital and the Spanish Flu virus had reached Queens, New York. Within days, 522 men had reported sick at the army camp. In August 1918, a more virulent strain appeared simultaneously in Brest, France, in Freetown, Sierra Leone, and in Boston, Massachusetts, in the U.S. It is estimated that in 1918 between 20% and 40% of the world's population became infected by Spanish Flu - with 50 million deaths globally. • 11. Type: HIV / AIDS • Force: Viral Biological Disease • Epidemiology Black Swan Event: AIDS was first reported in America in 1981 – and provoked reactions which echoed those associated for so long with syphilis. Many of the earliest cases were among homosexual men - creating a climate of prejudice and moral panic. Fear of catching this new and terrifying disease was also widespread among the public. The observed time-lag between contracting HIV and the onset of AIDS, coupled with new drug treatments, changed perceptions. Increasingly it was seen as a chronic but manageable disease. The global story was very different - by the mid-1980s it became clear that the virus had spread, largely unnoticed, throughout the rest of the world. The nature of this global pandemic varies from region to region, with poorer areas hit hardest. In parts of sub-Saharan Africa nearly 1 in 10 adults carries the virus - a statistic which is reminiscent of the spread of syphilis in parts of Europe in the 1800s.
  • 56. Pandemic Black Swan Event Types • 12. Type: Ebola • Force: Haemorrhagic Viral Biological Disease • Epidemiology Black Swan Event: Ebola is a highly lethal Haemorrhagic Viral Biological Disease, which has caused at least 16 confirmed outbreaks in Africa between 1976 and 2015. Ebola Virus Disease (EVD) is found in wild great apes and kills up to 90% of humans infected - making it one of the deadliest diseases known to man. It is so dangerous that it is considered to be a potential Category A bioterrorism agent – on a par with anthrax, smallpox, and bubonic plague. The current outbreak of EVD has seen confirmed cases in Guinea, Liberia and Sierra Leone, countries in an area of West Africa where the disease has not previously occurred. There were also a handful of suspected cases in neighbouring Mali, but these patients were found to have contracted other diseases. For each epidemic, transmission was quantified in different settings (illness in the community, hospitalisation, and traditional burial) and predictive analytics simulated various epidemic scenarios to explore the impact of medical control interventions on an emerging epidemic. A key medical parameter was the rapid institution of control measures. For both epidemic profiles identified, increasing the rate of hospitalisation reduced the predicted epidemic size. Over 4000 suspected cases of EVD have been recorded, with the majority of them in Guinea. The current outbreak has so far resulted in over 2000 deaths. These figures will continue to rise as more patients die and as test results confirm that they were infected with Ebola.
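The finding that a higher hospitalisation rate reduces the predicted epidemic size can be illustrated with a toy compartmental model. This is a minimal sketch, not the model used in the study referred to above: it omits the traditional-burial setting, and every rate, the population size and the function name `simulate_epidemic` are assumptions chosen only for illustration.

```python
def simulate_epidemic(hosp_rate, days=365, dt=0.1, population=10_000.0,
                      beta_community=0.3, beta_hospital=0.05, removal_rate=0.1):
    """Toy S-I-H-R model stepped with the Euler method.

    S: susceptible, I: infectious in the community, H: hospitalised,
    R: removed (recovered or dead). Hospitalised patients are assumed to
    transmit less than community cases (beta_hospital < beta_community).
    All parameter values are illustrative assumptions, not fitted estimates.
    Returns the cumulative number of infections after `days`.
    """
    s, i, h, r = population - 1.0, 1.0, 0.0, 0.0
    for _ in range(int(days / dt)):
        new_infections = (beta_community * i + beta_hospital * h) * s / population * dt
        admissions = hosp_rate * i * dt
        removed_community = removal_rate * i * dt
        removed_hospital = removal_rate * h * dt
        s -= new_infections
        i += new_infections - admissions - removed_community
        h += admissions - removed_hospital
        r += removed_community + removed_hospital
    return population - s


# Compare a slow and a fast hospitalisation rate (per day, hypothetical values).
size_slow = simulate_epidemic(hosp_rate=0.05)
size_fast = simulate_epidemic(hosp_rate=0.30)
```

With these assumed rates, moving cases out of the community faster lowers the effective reproduction number, so `size_fast` comes out smaller than `size_slow`.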
  • 57. Pandemic Black Swan Event Types Ebola is a highly lethal Haemorrhagic Viral Biological Disease, which has caused at least 16 confirmed outbreaks in Africa between 1976 and 2015.
  • 58. Pandemic Black Swan Event Types Type Force Epidemiology Black Swan Event 13 Future Bacterial Pandemic Infections Bacterial Biological Disease Bacteria were most likely the real killers in the 1918 H1N1 Flu Pandemic - the vast majority of deaths in the 1918–1919 influenza pandemic resulted directly from secondary bacterial pneumonia, caused by common upper respiratory- tract bacteria. Less substantial data from the subsequent 1957 and 1968 Flu pandemics are consistent with these findings. If severe pandemic influenza is largely a problem of viral-bacterial co-pathogenesis, pandemic planning needs to go beyond addressing the viral cause alone (influenza vaccines and antiviral drugs). The diagnosis, prophylaxis, treatment and prevention of secondary bacterial pneumonia - as well as stockpiling of antibiotics and bacterial vaccines – should be high priorities for future pandemic planning. 14 Future Viral Pandemic infections Viral Biological Disease What was Learned from Reconstructing the 1918 Spanish Flu Virus Comparing pandemic H1N1 influenza viruses at the molecular level yields key insights into pathogenesis – the way animal viruses mutate to cross species. The availability of these two H1N1 virus genomes separated by over 90 years, provided an unparalleled opportunity to study and recognise genetic properties associated with virulent pandemic viruses - allowing for a comprehensive assessment of emerging influenza viruses with human pandemic potential. There are only four to six mutations required within the first three days of viral infection in a new human host, to change an animal virus to become highly virulent and infectious to human beings. Candidate viral gene pools for future possible Human Pandemics include Anthrax, Ebola, Lassa Fever, Rift Valley Fever, SARS, MIRS, H1N1 Swine Flu (2009) and H7N9 Avian / Bat Flu (2013).
  • 60. Clustering in “Big Data” “A Cluster is a group of the same or similar data elements which are aggregated – or closely distributed – together.” Clustering is a technique used to explore content and understand information in every business sector and scientific field that collects and processes very large volumes of data. Clustering is an essential tool for any “Big Data” problem.
  • 61. Multiple Factor Regression Analysis In a multiple regression case with two independent variables, the fitted model is a regression plane in three dimensions; with more than two, it is a hyperplane. Neither can be visualised in full within the constraints of a two dimensional chart…..
  • 62. Multiple Factor Regression Analysis In a multiple regression case with two independent variables, the fitted model is a regression plane in three dimensions; with more than two, it is a hyperplane. Neither can be visualised in full within the constraints of a two dimensional chart…..
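The regression plane described on the two slides above can be sketched in a few lines. This is only an illustration, assuming ordinary least squares solved through the normal equations; the function name `fit_plane` and the sample observations are hypothetical and do not come from the deck.

```python
def fit_plane(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    # Design matrix with a leading 1 for the intercept term.
    X = [[1.0, x1, x2] for x1, x2, _ in rows]
    y = [row[2] for row in rows]
    n = 3
    # Augmented system (X^T X | X^T y).
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes the observations are not degenerate (X^T X is invertible).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Made-up observations (x1, x2, y) generated from y = 1 + 2*x1 + 3*x2.
data = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6), (2, 1, 8)]
b0, b1, b2 = fit_plane(data)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With two independent variables the three coefficients define a plane; each further variable adds one coefficient and one more dimension, which is why the fitted surface cannot be drawn directly on a flat chart.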
  • 63. Data Visualisation - Tufte in R "The idea behind Tufte in R is to use R - the easiest and most powerful open-source statistical analysis programming language - to replicate the excellent data visualisation practices developed by Edward Tufte" - Diego Marinho de Oliveira - Lead Data Scientist / Ph.D. candidate
  • 64. “Big Data” • “Big Data” refers to vast aggregations (super sets) consisting of numerous individual datasets (structured and unstructured) - whose size and scope are beyond the capability of conventional transactional (OLTP) or analytics (OLAP) Database Management Systems and Enterprise Software Tools to capture, store, analyse and manage. Examples of “Big Data” include the vast and ever-changing amounts of data generated in social networks where we maintain Blogs and have conversations with each other, news data streams, geo-demographic data, internet search and browser logs, as well as the ever-growing amount of machine data generated by pervasive smart devices - monitors, sensors and detectors in the environment – captured via the Smart Grid, then processed in the Cloud – and delivered to end-user Smart Phones and Tablets via Intelligent Agents and Alerts. • Data Set Mashing and “Big Data” Global Content Analysis – drives Horizon Scanning, Monitoring and Tracking processes by taking numerous, apparently unrelated RSS and other Information Streams and Data Feeds, loading them into Very Large Scale (VLS) DWH Structures and Document Management Systems for Real-time Analytics – searching for and identifying possible signs of relationships hidden in data (Facts / Events) – in order to discover and interpret previously unknown Data Relationships driven by hidden Clustering Forces – revealed via “Weak Signals” indicating emerging and developing Application Scenarios, Patterns and Trends - in turn predicting possible, probable and alternative global transformations which may unfold as future “Wild Card” or “Black Swan” events.
  • 65. Clustering in “Big Data” • The profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of groupings provides an important technique for many statistical and analytic applications. Cluster analysis on the basis of profile similarities or geographic distribution is a method where no prior assumptions are made concerning the number of groups or group hierarchies and internal structure. Geo-demographic techniques are frequently used in order to profile and segment populations by ‘natural’ groupings - such as common behavioural traits, Clinical Trial, Morbidity or Actuarial outcomes - along with many other shared characteristics and common factors.....
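As a rough illustration of clustering without fixing the number of groups in advance, the sketch below links any two points that lie within a distance threshold and reports the connected groups (single-linkage clustering with a cut-off). The function name `cluster`, the `max_gap` threshold and the sample points are assumptions made for this example; the deck does not name a specific algorithm.

```python
import math

def cluster(points, max_gap):
    """Group points that are within max_gap of each other, directly or via a chain of neighbours."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge the groups of every pair of points that are close enough.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())

# Hypothetical 2D profile scores or map coordinates.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
clusters = cluster(points, max_gap=2.0)
print(len(clusters), sorted(len(c) for c in clusters))
```

The number of clusters falls out of the data, but the distance threshold is still a modelling choice; the pairwise loop is quadratic, so very large datasets would need a spatial index or a distributed approach.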
  • 66. Clustering in “Big Data” • “BIG DATA” ANALYTICS – PROFILING, CLUSTERING and 4D GEOSPATIAL ANALYSIS • The profiling and analysis of large aggregated datasets in order to determine a ‘natural’ structure of data relationships or groupings is an important starting point forming the basis of many mapping, statistical and analytic applications. Cluster analysis of implicit similarities - such as time-series demographic or geographic distribution - is a critical technique where no prior assumptions are made concerning the number or type of groups that may be found, or their relationships, hierarchies or internal data structures. Geospatial and demographic techniques are frequently used in order to profile and segment populations by ‘natural’ groupings. Shared characteristics or common factors - such as Behaviour / Propensity or Epidemiology, Clinical, Morbidity and Actuarial outcomes – allow us to discover and explore previously unknown, concealed or unrecognised insights, patterns, trends or data relationships. • PREDICTIVE ANALYTICS and EVENT FORECASTING • Predictive Analytics and Event Forecasting use Horizon Scanning, Tracking and Monitoring methods combined with Cycle, Pattern and Trend Analysis techniques and Propensity Models in order to anticipate a wide range of business, economic, social and political Future Events – ranging from micro-economic Market phenomena such as forecasting Market Sentiment and Price Curve movements - to large-scale macro-economic Fiscal phenomena, using Weak Signal processing to predict future Wild Card and Black Swan Events - such as Monetary System shocks.
  • 68. Digital Healthcare - Patient Experience and Journey • The last decade has seen an unprecedented explosion in mobile platforms as the internet and mobile worlds came of age. It is no longer acceptable to have only a bricks-and-mortar clinical presence – patient-focused healthcare providers are now expected to deliver their Patient Experience and Journey via internet websites, mobile phones and, more recently, tablets.
  • 69. Social Intelligence – The Emerging Big Data Stack [Diagram labels] Targeting – Map / Reduce Consume – End-User Data Data Acquisition – High-Volume Data Flows – Mobile Enterprise Platforms (MEAPs) Apache Hadoop Framework HDFS, MapReduce, MATLAB, “R”, Autonomy, Vertica Smart Devices Smart Apps Smart Grid Clinical Trial, Morbidity and Actuarial Outcomes Market Sentiment and Price Curve Forecasting Horizon Scanning, Tracking and Monitoring Weak Signal, Wild Card and Black Swan Event Forecasting – Data Delivery and Consumption News Feeds and Digital Media Global Internet Content Social Mapping Social Media Social CRM – Data Discovery and Collection – Analytics Engines - Hadoop – Data Presentation and Display Excel Web Mobile – Data Management Processes Data Audit Data Profile Data Quality Reporting Data Quality Improvement – Performance Acceleration GPUs – massive parallelism SSDs – in-memory processing DBMS – ultra-fast data replication – Data Management Tools DataFlux Embarcadero Informatica Talend – Info. Management Tools Business Objects Cognos Hyperion Microstrategy Biolap Jedox Sagent Polaris Teradata SAP HANA Netezza (now IBM) Greenplum (now EMC2) XtremeData xdb – Data Warehouse Appliances Ab Initio Ascential Genio Orchestra
  • 70. GIS MAPPING and SPATIAL DATA ANALYSIS • A Geographic Information System (GIS) integrates hardware, software and digital data capture devices for acquiring, managing, analysing, distributing and displaying all forms of geographically dependent location data – including machine generated data such as Computer-aided Design (CAD) data from land and building surveys, Global Positioning System (GPS) terrestrial location data - as well as all kinds of data streams - HDCCTV, aerial and satellite image data.....
  • 71. GIS Mapping and Spatial Analysis • A Geographic Information System (GIS) integrates hardware, software and digital data capture devices for acquiring, managing, analysing, distributing and displaying all forms of geographically dependent location data – including machine generated data such as Computer-aided Design (CAD) data from land and building surveys, Global Positioning System (GPS) terrestrial location data - as well as all kinds of data streams - HDCCTV, aerial and satellite image data..... • Spatial Data Analysis is a set of techniques for analysing 3-dimensional spatial (Geographic) data and location (Positional) object data overlays. Software that implements spatial analysis techniques requires access to both the locations of objects and their physical attributes. Spatial statistics extends traditional statistics to support the analysis of geographic data. Spatial Data Analysis provides techniques to describe the distribution of data in the geographic space (descriptive spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster analysis), identify and measure spatial relationships (spatial regression), and create a surface from sampled data (spatial interpolation, usually categorized as geo-statistics). • The results of spatial data analysis are largely dependent upon the type, quantity, distribution and data quality of the spatial objects under analysis.
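To make "create a surface from sampled data" concrete, here is a minimal sketch of inverse distance weighting (IDW), a simple deterministic interpolation method. It is a stand-in chosen for brevity, not a geo-statistical method such as kriging, and the function name `idw`, the `power` parameter and the sample readings are assumptions for illustration only.

```python
import math

def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples by inverse distance weighting."""
    num = den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the query point coincides with a sample
        w = 1.0 / d ** power  # nearer samples get larger weights
        num += w * value
        den += w
    return num / den

# Hypothetical sensor readings at the four corners of a 2 x 2 square.
samples = [(0, 0, 10.0), (2, 0, 20.0), (0, 2, 30.0), (2, 2, 40.0)]
centre = idw(samples, 1, 1)
print(centre)
```

Evaluating `idw` over a grid of (x, y) positions yields the interpolated surface; as the last bullet of the slide notes, the result depends heavily on how many samples there are and how they are distributed.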
  • 72. World-wide Visitor Count – GIS Mapping
  • 73. Geo-demographic Clustering in “Big Data” • GEODEMOGRAPHIC PROFILING – CLUSTERING IN “BIG DATA” • Profiling and analysing large aggregated datasets to determine a ‘natural’ or implicit structure of data relationships or groupings is an important starting point for many statistical and analytic applications. No prior assumptions are made concerning the number or type of groups discovered, or their relationships, hierarchies or internal data structures; the aim is to discover hidden data relationships. The subsequent explicit Cluster Analysis of the discovered data relationships is a critical technique which attempts to explain the nature, cause and effect of those implicit profile similarities or geographic distributions. Demographic techniques are frequently used in order to profile and segment populations using ‘natural’ groupings - such as common behavioural traits, Clinical, Morbidity or Actuarial outcomes, along with many other shared characteristics and common factors – and then attempt to understand and explain those natural group affinities and geographical distributions using methods such as Causal Layered Analysis (CLA).....
  • 74. GIS Mapping and Spatial Analysis • A Geographic Information System (GIS) integrates hardware, software and digital data capture devices for acquiring, managing, analysing, distributing and displaying all forms of geographically dependent location data – including machine generated data such as Computer-aided Design (CAD) data from land and building surveys, Global Positioning System (GPS) terrestrial location data - as well as all kinds of data streams - HDCCTV, aerial and satellite image data..... • Spatial Data Analysis is a set of techniques for analysing spatial (Geographic) location data. The results of spatial analysis are dependent on the locations of the objects being analysed. Software that implements spatial analysis techniques requires access to both the locations of objects and their physical attributes. • Spatial statistics extends traditional statistics to support the analysis of geographic data. Spatial Data Analysis provides techniques to describe the distribution of data in the geographic space (descriptive spatial statistics), analyse the spatial patterns of the data (spatial pattern or cluster analysis), identify and measure spatial relationships (spatial regression), and create a surface from sampled data (spatial interpolation, usually categorized as geo-statistics).
  • 78. [Diagram labels] Targeting – Map / Reduce Consume – End-User Data Data Acquisition – High-Volume – Mobile Enterprise Platforms (MEAPs) – Data Delivery and Consumption – Data Discovery and Collection – Analytics Engines - Hadoop – Data Management Processes – Performance Acceleration Apache Hadoop Framework HDFS, MapReduce, MATLAB, “R”, Autonomy, Vertica Smart Devices Smart Apps Smart Grid Clinical Trial, Morbidity and Actuarial Outcomes Market Sentiment and Price Curve Forecasting Horizon Scanning, Tracking and Monitoring Weak Signal, Wild Card and Black Swan Event Forecasting News Feeds and Digital Media Global Internet Content Social Mapping Social Media Social CRM Data Audit Data Profile Data Quality Reporting Data Quality Improvement Data Extract, Transform, Load GPUs – massive parallelism SSDs – in-memory processing DBMS – ultra-fast data replication – Data Presentation and Display – Data Management Tools – Info. Management Tools – Data Warehouse Appliances Excel Web Mobile DataFlux Embarcadero Informatica Talend Business Objects Cognos Hyperion Microstrategy Biolap Jedox Sagent Polaris Teradata SAP HANA Netezza (now IBM) Greenplum (now EMC2) XtremeData xdb Zybert Gridbox Ab Initio Ascential Genio Orchestra
  • 79. Clustering Phenomena in “Big Data” “A Cluster is a group of profiled data similarities aggregated closely together” • Cluster Analysis is a technique which is used to explore very large volumes of structured and unstructured data - transactional, machine generated (automatic), social media and internet content and geo-demographic information - in order to discover previously unknown, unrecognised or hidden logical data relationships.
  • 80. Event Clusters and Connectivity [Figure: network diagram linking Events A to H] The above is an illustration of Event relationships - how Events might be connected. Any detailed, intimate understanding of the connection between Events may help us to answer questions such as: - • If Event A occurs, does it make Event B or H more or less likely to occur? • If Event B occurs, what effect does it have on Events C, D, E, F and G? Answering questions such as these allows us to plan our Event Management approach and Risk mitigation strategy – and to decide how better to focus our Incident / Event resources and effort…..
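The first question on the slide above (does Event A make Event B more or less likely?) can be explored by comparing a conditional probability with a baseline. The sketch below is one simple way to do that, assuming a log of observation windows is available; the function names and the observation data are hypothetical.

```python
def baseline_probability(observations, target):
    """Fraction of observation windows in which the target event occurred."""
    return sum(target in obs for obs in observations) / len(observations)

def conditional_probability(observations, given, target):
    """Estimate P(target | given) from a list of sets of co-occurring events."""
    with_given = [obs for obs in observations if given in obs]
    if not with_given:
        return None  # the conditioning event was never observed
    return sum(target in obs for obs in with_given) / len(with_given)

# Hypothetical observation windows, each listing the events that occurred.
observations = [
    {"A", "B"}, {"A", "B", "C"}, {"A"}, {"B"},
    {"C"}, {"A", "B", "H"}, {"H"}, {"C", "D"},
]
p_b = baseline_probability(observations, "B")
p_b_given_a = conditional_probability(observations, "A", "B")
lift = p_b_given_a / p_b  # above 1: A makes B more likely; below 1: less likely
print(p_b, p_b_given_a, lift)
```

A lift well above or below 1 suggests a connection worth investigating, but co-occurrence alone does not show which event, if either, is the trigger.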
  • 81. Event Clusters and Connectivity • Aggregated Events include coincident, related, connected and interconnected Events: - • Coincident - two or more Events appear simultaneously in the same domain – but they arise from different triggers (unrelated causal events) • Related - two or more Events materialise in the same domain sharing common Event features or characteristics (they may share a hidden common trigger or cause – and so are candidates for further analysis and investigation) • Connected - two or more Events materialise in the same domain due to the same trigger (common cause) • Interconnected - two or more Events materialise together in an Event cluster, series or “storm” - the previous (prior) Event triggering the subsequent (next) Event in an Event Series….. • A series of Aggregated Events may result in a significant cumulative impact - and is therefore frequently identified incorrectly as a Wild-card or Black Swan Event - rather than simply as an event cluster or event “storm”.....
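The four aggregated event types listed above can be expressed as a small decision rule. This is a sketch only: the event record layout (`id`, `trigger`, `features`) and the function name `classify_pair` are hypothetical, and the two events are assumed to have already been observed together in the same domain.

```python
def classify_pair(first, second):
    """Label two co-occurring events with one of the four aggregated event types.

    Each event is a dict with hypothetical keys 'id', 'trigger' and 'features'.
    """
    if second["trigger"] == first["id"] or first["trigger"] == second["id"]:
        return "interconnected"  # one event triggered the other
    if first["trigger"] is not None and first["trigger"] == second["trigger"]:
        return "connected"       # same trigger (common cause)
    if first["features"] & second["features"]:
        return "related"         # shared features, possible hidden common cause
    return "coincident"          # simultaneous, but unrelated triggers

# Made-up events for illustration.
g = {"id": "G", "trigger": "T1", "features": {"fire"}}
h = {"id": "H", "trigger": "G", "features": {"flood"}}
e = {"id": "E", "trigger": "T2", "features": {"outage"}}
f = {"id": "F", "trigger": "T2", "features": {"delay"}}
c = {"id": "C", "trigger": "T3", "features": {"claim", "vehicle"}}
d = {"id": "D", "trigger": "T4", "features": {"claim"}}

for pair in [(g, h), (e, f), (c, d), (g, e)]:
    print(pair[0]["id"], pair[1]["id"], classify_pair(*pair))
```

In practice the trigger of an event is often unknown, which is why the "related" category exists: shared features flag pairs for further investigation.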
  • 82. Event Clusters and Connectivity [Figure: network diagram linking Risk Events 1 to 8; an Event Cluster links Claimant 1, Claimant 2, Residence, Vehicle and Risk Event] The above is an illustration of Event relationships - how Risk Events might be connected. A detailed and intimate understanding of Event clusters and the connection between Events may help us to understand: - • What is the relationship between Events 1 and 8, and what impact do they have on Events 2 - 7? • Events 2 - 5 and Events 6 and 7 occur in clusters – what are the factors influencing these clusters? Answering questions such as these allows us to plan our Risk Event management approach and mitigation strategy – and to decide how to better focus our resources and effort on Risk Events and fraud management.
  • 83. Aggregated Event Types [Figure, four panels: Coincident Events (Events A and B, with separate Triggers A and B); Related Events (Events C and D, with Triggers 1 and 2); Connected Events (Events E and F, with one shared Trigger); Inter-connected Events (Events G and H, with one Trigger)]
  • 85. From sports to scientific research, a surprising range of industries will begin to find value in big data.....
  • 86. Big Data – Products The MapReduce technique has spilled over into many other disciplines that process vast quantities of information, including science, industry, and systems management. The Apache Hadoop Library has become the most popular implementation of MapReduce – with commercial distributions from Cloudera, Hortonworks and MapR.
  • 87. “Big Data” Applications • Science and Technology – Pattern, Cycle and Trend Analysis – Horizon Scanning, Monitoring and Tracking – Weak Signals, Wild Cards, Black Swan Events • Multi-channel Retail Analytics – Customer Profiling and Segmentation – Human Behaviour / Predictive Analytics • Global Internet Content Management – Social Media Analytics – Market Data Management – Global Internet Content Management • Smart Devices and Smart Apps – Call Detail Records – Internet Content Browsing – Media / Channel Selections – Movies, Video Games and Playlists • Broadband / Home Entertainment – Call Detail Records – Internet Content Browsing – Media / Channel Selections – Movies, Video Games and Playlists • Smart Metering / Home Energy – Energy Consumption Detail Records • Civil and Military Intelligence – Digital Battlefields of the Future – Data Gathering – Future Combat Systems - Intelligence Database – Person of Interest Database – Criminal Enterprise, Political organisations and Terrorist Cell networks – Remote Warfare - Threat Viewing / Monitoring / Identification / Tracking / Targeting / Elimination – HDCCTV Automatic Character / Facial Recognition • Security – Security Event Management - HDCCTV, Proximity and Intrusion Detection, Motion and Fire Sensors – Emergency Incident Management - Response Services Command, Control and Co-ordination • Biomedical Data Streaming – Care in the Community – Assisted Living at Home – Smart Hospitals and Clinics • Internet of Things (IoT) – SCADA Remote Sensing, Monitoring and Control – Smart Grid Data (machine generated data) – Vehicle Telemetry Management – Intelligent Building Management – Smart Homes Automation
  • 88. Comparing Data in RDBMS, Appliances and Hadoop (Figure 1: Comparing RDBMS to MapReduce)
      Feature | RDBMS DWH | DWH Appliance | Hadoop Cluster
      Data size | Gigabytes | Terabytes | Petabytes
      Access | Interactive and batch | Interactive and batch | Batch
      Structure | Fixed schema | Fixed schema | Unstructured schema
      Language | SQL | SQL | Non-procedural Languages (NoSQL, Hive, Pig, etc)
      Data Integrity | High | High | Low
      Architecture | Shared memory - SMP | Shared nothing - MPP | Hadoop DFS
      Virtualisation | Partitions / Regions | MPP / Nodal | MPP / Clustered
      Scaling | Nonlinear | Nodal / Linear | Clustered / Linear
      Updates | Read and write | Write once, read many | Write once, read many
      Selects | Row-based | Set-based | Column-based
      Latency | Low – Real-time | Low – Near Real-time | High – Historic Information
  • 89. “Big Data” – Analysing and Informing • “Big Data” is now a torrent raging through every aspect of the global economy – both the public sector and private industry. Global enterprises generate enormous volumes of transactional data – capturing trillions of bytes of information from the internal and external environment. Data Sources include Social Media, Internet Content, Remote Sensors, Monitors and Controllers, and transactions from their own internal business operations – global markets, supply chain, business partners, customers and suppliers. 1. SENSE LAYER – Remote Monitoring and Control Devices – WHAT and WHEN? 2. COMMUNICATION LAYER – Mobile Enterprise Platforms (3G / WiFi + 4G / LTE) – VIA? 3. SERVICE LAYER – 4D Geospatial / Real-time / Predictive Analytics – WHY? 4. GEO-DEMOGRAPHIC LAYER – Social Media, People and Places – WHO and WHERE? 5. INFORMATION LAYER – “Big Data” and Internet Content data set “mashing” – HOW? 6. INFRASTRUCTURE LAYER – Cloud Services / Hadoop Clusters / GPGPUs / SSDs
  • 90. “Big Data” – Analysing and Informing COMMUNICATION LAYER – Mobile Enterprise Platforms (3G / WiFi + 4G / LTE) Biomedical Smart Apps – VIA ? SERVICE LAYER – 4D Geospatial / Real-time / Predictive Analytics – HOW ? INFORMATION LAYER – “Big Data” Analytics MapReduce / Data Set “mashing” Data Science / Causal Layer Analysis – WHY ? INFRASTRUCTURE LAYER – Cloud Service Platforms Hadoop Clusters / GPGPUs / SSDs SENSE LAYER – Remote Monitoring and Control Devices – WHAT and WHEN ? GEO-DEMOGRAPHIC LAYER – People and Places – WHO and WHERE?
  • 91. “Big Data” – Analysing and Informing • SENSE LAYER – Remote Monitoring and Control – WHAT and WHEN? – Remote Sensing – Sensors, Monitors, Detectors, Smart Appliances / Devices – Remote Viewing – Satellite, Airborne, Mobile and Fixed HDCCTV – Remote Monitoring, Command and Control – SCADA • GEO-DEMOGRAPHIC LAYER – People and Places – WHO and WHERE? – Person and Social Network Directories - Personal and Social Media Data – Location and Property Gazetteers - Building Information Models (BIM) – Mapping and Spatial Analysis - Topology, Landscape, Global Positioning Data • COMMUNICATION LAYER – Mobile Enterprise Platforms and the Smart Grid – Connectivity - Smart Devices, Smart Apps, Smart Grid – Integration - Mobile Enterprise Application Platforms (MEAPs) – Backbone – Wireless and Optical Next Generation Network (NGN) Architectures
  • 92. “Big Data” – Analysing and Informing SERVICE LAYER – 4D Geospatial / Real-time / Predictive Analytics – WHY? COMMUNICATION LAYER – Mobile Enterprise Platforms (3G / WiFi + 4G / LTE) Biomedical Smart Apps – VIA? [Diagram labels: Market Survey Data; TV Set-top Box Channel Selections; Smart App Playlists; Geographic & Demographic Survey Data; Entertainment; Factory Office & Warehouse; Wearable & Personal Technology; Transport; Public Buildings; Smart Homes; Public house; Mall, Shop, Store; Smart Kiosks & Cubicles; Mobile Smart Apps; CCTV / ANPR; Social Intelligence; Campaign Management; e-Business Smart Apps; Big Data Analytics; The Pyramid™ Customer Loyalty & Brand Affinity; The Pyramid™ Analytics Smart Apps] INFRASTRUCTURE LAYER – Cloud Services Hadoop Clusters / GPGPUs / SSDs SENSE LAYER – Remote Monitoring, Data and Control Devices – WHAT and WHEN?
  • 93. “Big Data” – Analysing and Informing • SERVICE LAYER – Real-time Analytics – WHY? – Global Mapping and Spatial Analysis – Service Aggregation, Intelligent Agents and Alerts – Data Analysis, Data Mining and Statistical Analysis – Optical and Wave-form Analysis and Recognition, Pattern and Trend Analysis – Big Data - Hadoop Clusters / GPGPUs / SSDs • INFORMATION LAYER – “Big Data” and Data Set “mashing” – HOW? – Content – Structured and Unstructured Data and Content – Information – Atomic Data, Aggregated, Ordered and Ranked Information – Transactional Data Streams – Smart Devices, EPOS, Internet, Mobile Networks • INFRASTRUCTURE LAYER – Cloud Service Platforms – Cloud Models – Public, Private, Mixed / Hybrid, Enterprise, Secure and G-Cloud – Infrastructure – Network, Storage and Servers – Applications – COTS Software, Utilities, Enterprise Services – Security – Principles, Policies, Users, Profiles and Directories, Data Protection
  • 94. “DATA SCIENCE” – my own special area of Business expertise The Emerging “Big Data” Stack [Diagram labels] Targeting – Split / Map / Shuffle / Reduce Consume – End-User Data Data Provisioning – High-Volume Data Flows – Mobile Enterprise Platforms (MEAPs) Apache Hadoop Framework HDFS, MapReduce, MATLAB, “R”, Autonomy, Vertica Smart Devices Smart Apps Smart Grid Clinical Trial, Morbidity and Actuarial Outcomes Market Sentiment and Price Curve Forecasting Horizon Scanning, Tracking and Monitoring Weak Signal, Wild Card and Black Swan Event Forecasting – Data Delivery and Consumption News Feeds and Digital Media Global Internet Content Social Mapping Social Media Social CRM – Data Discovery and Collection – Analytics Engines - Hadoop – Data Presentation and Display Excel Web Mobile – Data Management Processes Data Audit Data Profile Data Quality Reporting Data Quality Improvement Data Extract, Transform, Load – Performance Acceleration GPUs – massive parallelism SSDs – in-memory processing DBMS – ultra-fast data replication – Data Management Tools DataFlux Embarcadero Informatica Talend – Info. Management Tools Business Objects Cognos Hyperion Microstrategy Biolap Jedox Sagent Polaris Teradata SAP HANA Netezza (now IBM) Greenplum (now Pivotal) XtremeData xdb Zybert Gridbox – Data Warehouse Appliances Ab Initio Ascential Genio Orchestra Information Management Strategy Data Acquisition Strategy
  • 95. Big Data – Process Overview [Diagram labels: Analytics; Big Data Management; Big Data Provisioning; Big Data Platform; Big Data Consumption; Data Stream; Data Scientists; Data Architects; Data Analysts; Big Data Administration; Revenue Stream; Data Administrators; Data Managers; Hadoop Platform Engineering Team; Insights]
  • 96. Split-Map-Shuffle-Reduce Process [Diagram labels: Raw Data; Data Provisioning; Split; Map; Shuffle; Reduce; Key / Value Pairs; Actionable Insights; Big Data Consumers]
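The four stages named on the slide above can be illustrated with a single-process word count. This is a teaching sketch in plain Python, not Hadoop code: the function names and the sample records are made up, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict

def split(raw, size):
    """Split: cut the raw records into fixed-size chunks."""
    return [raw[i:i + size] for i in range(0, len(raw), size)]

def map_chunk(chunk):
    """Map: emit a (key, value) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_groups(grouped):
    """Reduce: collapse each key's list of values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical raw data: three text records.
raw_data = ["big data big insights", "data streams", "big clusters"]
chunks = split(raw_data, 2)
pairs = [pair for chunk in chunks for pair in map_chunk(chunk)]
counts = reduce_groups(shuffle(pairs))
print(counts)
```

Because each chunk is mapped independently and each key is reduced independently, both stages can be distributed; only the shuffle needs to move data between workers.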
  • 97. Apache Hadoop Component Stack HDFS – Hadoop Distributed File System (HDFS); MapReduce – Scalable Data Applications Framework; Pig – Procedural Language – abstracts low-level MapReduce operators; Zookeeper – High-reliability distributed cluster co-ordination; Hive – Structured Data Access Management; HBase – Hadoop Database Management System; Oozie – Job Management and Data Flow Co-ordination; Mahout – Scalable Knowledge-base Framework
  • 98. Data Management Component Stack Informatica – Informatica Big Data Edition / Vibe Data Stream; Drill – Data Analysis Framework; Millwheel – Data Analytics on-the-fly + Extract – Transform – Load Framework; Flume – Extract – Transform – Load; Sqoop – Extract – Transform – Load; Scribe – Extract – Transform – Load; Talend – Extract – Transform – Load; Pentaho – Extract – Transform – Load Framework + Data Reporting on-the-fly
  • 99. Big Data Storage Platforms Autonomy – HP Unstructured Data DBMS; Vertica – HP Columnar DBMS; MongoDB – High-availability DBMS; CouchDB – Couchbase Database Server for Big Data with NoSQL / Hadoop Integration; Pivotal – Pivotal Big Data Suite – GreenPlum, GemFire, SQLFire, HAWQ; Cassandra – Cassandra Distributed Database for Big Data with NoSQL and Hadoop Integration; NoSQL – NoSQL Database for Oracle, SQL/Server, Couchbase etc.; Riak – Basho Technologies Riak Big Data DBMS with NoSQL / Hadoop Integration
  • 100. Big Data Analytics Engines and Appliances Alpine – Alpine Data Studio - Advanced Big Data Analytics; Karmasphere – Karmasphere Studio and Analyst – Hadoop Customer Analytics; Kognitio – Kognitio In-memory Big Data Analytics MPP Platform; Skytree – Skytree Server Artificial Intelligence / Machine Learning Platform; Redis – Redis is an open source key-value database for AWS, Pivotal etc.; Teradata – Teradata Appliance for Hadoop; Neo4j Crunchbase Neo4j - Graph Database for Big Data InfiniDB Columnar MPP open-source DB version hosted on GitHub
  • 101. Big Data Analytics and Visualisation Platforms Tableau Tableau - Big Data Visualisation Engine Eclipse Symentec Eclipse - Big Data Visualisation Mathematica Mathematical Expressions and Algorithms StatGraphics Statistical Expressions and Algorithms FastStats Numerical computation, visualization and programming toolset MATLAB R Data Acquisition and Analysis Application Development Toolkit “R” Statistical Programming / Algorithm Language Revolution Revolution Analytics Framework and Library for “R”
  • 102. Hadoop / Big Data Extended Infrastructure Stack SSD – Solid State Drive (SSD) – configured as cached memory / fast HDD; CUDA – CUDA (Compute Unified Device Architecture); GPGPU – GPGPU (General Purpose Graphical Processing Unit Architecture); IMDG – IMDG (In-memory Data Grid – extended cached memory); Vibe – High Velocity / High Volume Machine / Automatic Data Streaming; Splunk – High Velocity / High Volume Machine / Automatic Data Streaming; Ambari – High-availability distributed cluster co-ordination; YARN – Hadoop Resource Scheduling
  • 103. Cloud-based Big-Data-as-a-Service and Analytics AWS Amazon Web Services (AWS) – Big Data-as-a-Service (BDaaS) Elastic Compute Cloud (EC2) and Simple Storage Service (S3) 1010 Data Big Data Discovery, Visualisation and Sharing Cloud Platform SAP HANA SAP HANA Cloud - In-memory Big Data Analytics Appliance Azure Microsoft Azure Data-as-a-Service (DaaS) and Analytics Anomaly 42 Anomaly 42 Smart-Data-as-a-Service (SDaaS) and Analytics Workday Workday Big-Data-as-a-Service (BDaaS) and Analytics Google Cloud Google Cloud Platform – Cloud Storage, Compute Platform, Firebrand API Resource Framework Apigee Apigee API Resource Framework
  • 104. Data Warehouse Appliance / Real-time Analytics Engine Price Comparison
      Manufacturer | Server Configuration | Cached Memory | Server Type | Software Platform | Cost (est.)
      SAP HANA | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 6,000,000
      Teradata | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 1,000,000
      Netezza (now IBM) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Proprietary | $ 180,000
      IBM ex5 (non-HANA configuration) | 32-node (4 Channels x 8 CPU) | 1.3 Terabytes | SMP | Proprietary | $ 120,000
      Greenplum (now Pivotal) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 20,000
      XtremeData xdb (BO BW) | 20-node (2 Channels x 10 CPU) | 1 Terabyte | MPP | Open Source | $ 18,000
      Zybert Gridbox | 48-node (4 Channels x 12 CPU) | 20 Terabytes | SMP | Open Source | $ 60,000
  • 105. Apache Hadoop - Framework Distributions FEATURE Hortonworks Teradata Hadoop Cloudera MAPR Pivotal Open Source Hadoop Library Hcatalog (Hortonworks) Impala MAPR HD Support Yes Yes Yes Yes Yes Professional Services Yes Yes Yes Yes Yes Catalogue Extensions Yes Yes Yes Yes Yes Management Extensions Yes Yes Yes Architecture Extensions Yes Yes Infrastructure Extensions Yes Yes Teradata Cloudera MAPR Pivotal HD Library Support Services Catalogue Management Library Support Services Catalogue Library Support Services Catalogue Management Resilience Availability Performance Library Support Services Catalogue Management Resilience Availability Performance Library Support Services Catalogue Hortonworks Cloudera with Impala EMC Pivotal HD distribution Hortonworks Hcatalog System MAPR with MAPR Control System
  • 106. Gartner Magic Quadrant for BI and Analytics Platforms
  • 107. Apache Hadoop - Framework Distributions FEATURE Intel Hadoop Microsoft HDInsight Informatica Vibe IBM BigInsights DataStax Enterprise Open Source Hadoop Library Distribution (Hortonworks) Vibe Symphony Analytics Support Yes Yes Yes Yes Yes Professional Services Yes Yes Yes Yes Yes Catalogue Extensions Yes Yes Yes Yes Yes Management Extensions Yes Yes Yes Architecture Extensions Yes Yes Infrastructure Extensions Yes Yes Hortonworks Vibe Symphony Library Support Services Catalogue Management Library Support Services Catalogue Library Support Services Catalogue Management Resilience Availability Performance Library Support Services Catalogue Intel Hadoop DataStax Library Support Services Catalogue Management Resilience Availability Performance Intel HD Microsoft HD IBM BigInsights Informatica Vibe DataStax Enterprise
  • 109. Apache Hadoop – Cloud Hadoop Platforms FEATURE HP HAVEn AWS EMR SAP HANA Mono-Clustered Big Data Cloud Solution Open Source Hadoop Library HP HAVEn Elastic MapReduce SAP HANA Support Yes Yes Yes Professional Services Yes Yes Yes Catalogue Extensions Yes Yes Yes Management Extensions Yes Architecture Extensions Yes Infrastructure Extensions Yes AWS EMR SAP HANA Library Support Services Catalogue HP HAVEn HP HAVEn AWS EMR SAP HANA Mono-Clustered Big Data Cloud Solution
  • 110. HP HAVEn Big Data Platform
  • 112. IBM BigInsights IBM Platform Symphony: - Parallel Computing and Application Grid management solution
  • 114. Telco 2.0 “Big Data” Analytics Architecture
  • 115. SAP HANA Hortonworks Real-time Big Data Architecture
  • 117. Turing Institute • In his Budget announcement, the chancellor, George Osborne pledged government support for the Turing Institute, a specialist centre named after the great computer pioneer Alan Turing – which will provide a British home for studying Data Science and Big Data Analytics. Clustering and Wave-form algorithms in Big Data are the key to unlocking Cycles, Patterns and Trends in complex (non-linear) systems – Cosmology, Climate and Weather, Economics and Fiscal Policy – in order to forecast future trends, outcomes and events with far greater accuracy. • The chancellor, George Osborne has announced a £42m Alan Turing Institute is to be founded to ensure that Britain leads the way in Data Science, Big Data Analytics for studying complex (non-linear) systems - Clustering and Wave-form algorithmic research in both Deterministic (human activity) and Stochastic (random, chaotic) processes. • Drawing on the name of the famous British mathematician and computer pioneer Alan Turing - who led the Enigma code-breaking work during the second world war at Bletchley Park - the institute is intended to help British companies by bringing together expertise and experience in tackling the challenges of understanding both deterministic and stochastic systems – such as Weather, Climate, Economics, Econometrics and the impact of Fiscal Policy – which require massive data sets and computational power.
  • 119. Turing Institute • The Turing Institute comes at a time when Data Science, Big Data Analytics and complex-system algorithm research are front and centre on the commercial stage. The Turing Institute will be the first step towards realising the UK’s digital innovation potential. Exploiting big data with analytical methods – statistical analysis, predictive and quantitative modelling – provides deeper insights and supports better outcomes. • The UK needs a centre of excellence capable of nurturing the talent required to make British Data Science and Big Data Technology world-class. The cornerstone of the new digital technologies isn’t just infrastructure, but the talent needed to found, innovate and grow technology firms and create a knowledge-based digital economy. • The tender to house the institute will be produced this year. It may be a brand-new facility or use existing facilities and space in a university, a Treasury spokesman said. Its funding will come from the Department for Business, Innovation and Skills, and its chief will report to the science minister, David Willetts. Executive appointments and establishment numbers for the Turing Institute have yet to be announced. • "The intention is for this work to benefit British companies to take a critical advantage in the field of Data Science – algorithms, analytics and big data," said the spokesman.
  • 120. The “Bombe” at Bletchley Park
  • 121. Turing Institute • Alan Turing was a pivotal figure in mathematics and computing and has long been recognised as such by fellow mathematicians and computer scientists for his ground-breaking work on Computational Theory. There already exists a Turing Institute at Glasgow University, and an Alan Turing Institute in the Netherlands, as well as the Alan Turing building at the Manchester Institute for Mathematical Sciences. • Alan Turing’s code-breaking work using “the Bombe” – an electromechanical decryption system – led to the deciphering of the German "Enigma" codes, which used highly complex encryption. His cryptanalysis work is claimed to have saved hundreds or even thousands of lives and shortened WWII by as much as two years. Before the war, in 1936, Turing had formalised the Computational Theory which underpins modern computer science through the separation of data from algorithms – sequences of instructions – in computer programming languages. • Osborne's announcement marks further official rehabilitation of a scientist whom many see as having been badly treated by the British establishment after his work during WWII. Turing, who was homosexual, was convicted of indecency in March 1952, and lost his security clearance with GCHQ – the successor to Bletchley Park. Turing killed himself in June 1954, but was only given an official pardon by the UK government in December 2013 after a series of public campaigns for recognition of his achievements.
  • 123. Digital Village – Strategic Partners • Digital Village is a consortium of Future Management and Future Systems Consulting firms for Digital Marketing and Lifestyle Strategy – Social Media / Big Data Analytics / Mobile / Cloud Computing / GPS/GIS / Next Generation Enterprise (NGE) / Digital Business Transformation • Colin Mallett Former Chief Scientist @ BT Laboratories, Martlesham Heath – Board Member @ SH&BA and Visiting Fellow @ University of Hertfordshire – Telephone: (Mobile) – (Office) – Email: (Office) • Ian Davey Founder and MD @ Atlantic Forces – Telephone: +44 (0) 203 4026 225 (Mobile) – +44 (0) 7581 178414 (Office) – Email: Ian@atlanticforce.co • Nigel Tebbutt 奈杰尔 泰巴德 – Future Business Models & Emerging Technologies @ INGENERA – Telephone: +44 (0) 7832 182595 (Mobile) – +44 (0) 121 445 5689 (Office) – Email: Nigel-Tebbutt@hotmail.com (Private) Digital Village - Strategic Enterprise Management (SEM) Framework ©
  • 125. Proof-of-concept and Prototype The Patient Pyramid™ approach is lean, agile, smart and creative: • We start by providing a custom Pyramid™ Enterprise Application as a proof of concept. We then work with the client's key stakeholders to scope a detailed brief which articulates a business problem domain that the Patient Pyramid™ can help to understand and resolve. • We then harvest all current and past patient records, along with any other available internal and public domain biomedical data, in order to establish a baseline Patient Pyramid™. • This baseline is augmented by overlaying external data – Social Intelligence and other live-streamed Patient Lifestyle / Biomedical data – which drives our new real-time Patient Pyramid™ view describing the six primitives: who / what / why / where / when and how. • Finally, we exploit social intelligence for Patient Lifestyle understanding, creating new actionable insights to inform creative medical campaign solutions against the agreed brief. • Post proof-of-concept, we then agree a Pyramid™ Enterprise Application fixed-term licence along with Patient Pyramid™ consulting, mentoring, training and support – on-line, on-site, on-demand – whenever and wherever required.
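The baseline-plus-overlay steps above can be sketched in a few lines of code. This is only an illustrative sketch, not the Patient Pyramid™ implementation: the `Observation` record, the `build_view` function and all sample data are hypothetical, and the only element taken from the slide is the idea of describing each item by the six primitives (who / what / why / where / when / how).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One data point described by the six primitives (hypothetical schema)."""
    who: str        # patient identifier
    what: str       # event or measurement
    why: str        # reason or clinical context
    where: str      # location
    when: datetime  # timestamp
    how: str        # source or method of capture


def build_view(baseline, live):
    """Merge historical and streamed observations into per-patient timelines."""
    view = defaultdict(list)
    for obs in list(baseline) + list(live):
        view[obs.who].append(obs)
    for timeline in view.values():
        timeline.sort(key=lambda o: o.when)  # oldest first
    return dict(view)


# Baseline: harvested from existing patient records (invented sample data).
baseline = [
    Observation("patient-001", "HbA1c test", "diabetes review", "clinic",
                datetime(2014, 1, 10, 9, 0), "patient record"),
]
# Overlay: live-streamed lifestyle / biomedical data (invented sample data).
live = [
    Observation("patient-001", "glucose reading", "self-monitoring", "home",
                datetime(2014, 3, 2, 7, 30), "wearable stream"),
    Observation("patient-001", "step count", "activity tracking", "home",
                datetime(2014, 2, 1, 20, 0), "smartphone app"),
]

view = build_view(baseline, live)
for obs in view["patient-001"]:
    print(obs.when.isoformat(), obs.what, "via", obs.how)
```

Keeping every source in one record shape means the baseline and the streamed overlay can be queried together by any of the six primitives; a production system would also need consent handling, de-identification and persistent storage, none of which is shown here.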