Cognizant 20-20 Insights
Big Data is the Future of Healthcare
With big data poised to change the healthcare ecosystem, organizations
need to devote time and resources to understanding this phenomenon and
realizing the envisioned benefits.
cognizant 20-20 insights | september 2012

Executive Summary
Big data is already changing the way business decisions are made, and
it’s still early in the game. However, because big data exceeds the
capacity and capabilities of conventional storage, reporting and
analytics systems, it demands new problem-solving approaches. With the
convergence of powerful computing, advanced database technologies,
wireless data, mobility and social networking, it is now possible to
bring together and process big data in many profitable ways.

Big data solutions attempt to cost-effectively solve the challenges of
large and fast-growing data volumes and realize their potential
analytical value. For instance, trend analytics allow you to figure out
what happened, while root cause and predictive analytics enable
understanding of why it happened and what is likely to happen in the
future. Meanwhile, opportunity and innovative analytics can be applied
to identifying opportunities and improving the future.

All healthcare constituents (members, payers, providers, groups,
researchers, governments, etc.) will be impacted by big data, which can
predict how these players are likely to behave, encourage desirable
behavior and minimize less desirable behavior. These applications of
big data can be tested, refined and optimized quickly and inexpensively
and will radically change healthcare delivery and research. Leveraging
big data will certainly be part of the solution to controlling
spiraling healthcare costs.

Simply by witnessing how big data has transformed consumer IT, it is
clear that the promise of big data in healthcare is immense (think
Google, Facebook and Apple’s Siri, which all rely on processing and
transmitting massive amounts of data). While its potential in
healthcare has not been fulfilled, the question is not if, but when.

This white paper will define big data, explore the opportunities and
challenges it poses for healthcare, and recommend solutions and
technologies that will help the healthcare industry take full advantage
of this burgeoning trend.

What Is Big Data?
A large amount of data becomes “big data” when it meets three criteria:
volume, variety and velocity (see Figure 1). Here is a look at all
three:

• Volume: Big data means there is a lot of data: terabytes or even
petabytes (1,000 terabytes). This is perhaps the most immediate
challenge of big data, as it requires scalable storage and support for
complex, distributed queries across multiple data sources. While many
organizations already have the basic capacity to store large volumes of
data, the challenge is being able to identify, locate, analyze and
aggregate specific pieces of data in a vast, partially structured data
set.

• Variety: Big data is an aggregation of many types of data, both
structured and unstructured, including multimedia, social media, blogs,
Web server logs, financial transactions, GPS and RFID tracking
information, audio/video streams and Web content. While standard
techniques and technologies exist to deal with large volumes of
structured data, it becomes a significant challenge to analyze and
process a large amount of highly variable data and turn it into
actionable information. But this is also where the potential of big
data lies, as effective analytics allow you to make better decisions
and realize opportunities that would not otherwise exist.
• Velocity: While traditional data warehouse analytics tend to be based
on periodic (daily, weekly or monthly) loads and updates of data, big
data is processed and analyzed in real- or near-real-time. This is
important in healthcare for areas such as clinical decision support,
where access to up-to-date information is vital for correct and timely
decision-making and elimination of errors. Current data is needed to
support automated decision-making; after all, you can’t use
five-minute-old data to cross a busy street. Without current data,
automated decisions cannot be trusted, forcing expensive and
time-consuming manual reviews of each decision.

What Big Data Looks Like
THE WORLD’S INFORMATION IS DOUBLING EVERY TWO YEARS, with a colossal
1.8 zettabytes to be created and replicated in 2011.
New information being created in 2011 also includes replicated
information such as shared documents or duplicated DVDs.
In terms of sheer volume, 1.8 ZB of data is equivalent to:
• Every person in the United States tweeting 3 tweets per minute
(4,320 tweets per day per person) for 26,976 years non-stop
• Over 200 billion HD movies, each 120 minutes long; it would take one
person 47 million years of 24/7 viewing to watch every movie
Storing 1.8 ZB of information would take:
• 57.5 billion 32 GB Apple iPads. With that many iPads we could build a
mountain of iPads that is 25 times higher than Mount Fuji (Mount Fuji:
3,776 meters; Mount iPad: 94,400 meters)
Source: “Extracting Value from Chaos,” IDC Universe study, 2011; Mashable.com,
http://mashable.com/2011/06/28/data-infographic/
Figure 1

Big Data = Big Opportunities
Big data has many implications for patients, providers, researchers,
payers and other healthcare constituents. It will impact how these
players engage with the healthcare ecosystem, especially when external
data, regionalization, globalization, mobility and social networking
are involved (see Figure 2).

Bringing the Patient into the Loop
The healthcare model is undergoing an inversion. In the old model,
facilities and other providers were incented to keep patients in
treatment; that is, more inpatient days translated to more revenue. The
trend with new models, including accountable care organizations (ACO),
is to incent and compensate providers to keep patients healthy.

At the same time, patients are increasingly demanding information about
their healthcare options so that they understand their choices and can
participate in decisions about their care. Patients are also an
important element in keeping healthcare costs down and improving
outcomes. Providing patients with accurate and up-to-date information
and guidance rather than just data will help them make better decisions
and better adhere to treatment programs.

In addition to data that is readily available, such as demographics and
medical history, another data source is information that patients
divulge about themselves. When combined with outcomes, high-quality
data provided by patients can become a valuable source of information
for researchers and others looking to reduce costs, boost outcomes and
improve treatment. Several challenges exist with self-reported data:

• Accuracy: People tend to understate their weight and the degree to
which they engage in negative behaviors such as smoking; meanwhile,
they tend to overstate positive behaviors, such as exercise. These
inaccuracies can be accounted for by adjusting for these biases, and
big data processing can improve accuracy over time.

• Privacy concerns: People are generally reluctant to divulge
information about themselves because of privacy and other concerns.
Creative ways need to be found to encourage and incent them to do so
without adversely impacting data quality. Effective mechanisms and
assurances must be put into place to ensure the privacy of the data
that patients submit, including de-identification prior to external
access.

• Consistency: Standards need to be defined and implemented to promote
consistency in self-reported data across the healthcare system to
eliminate local discrepancies and increase the usefulness of data.
Usage guidelines should then follow from the standards.

• Facility: Mechanisms based on e-health and m-health, such as mobility
and social networking, need to be creatively employed to ease members’
ability to self-report. Providing access to some de-identified data can
simultaneously improve levels of self-reporting as a community develops
among members.

Sectors Positioned for Greater Gains from Big Data
Historical productivity growth in the U.S., 2000-2008.
[Bubble chart. Vertical axis: percent growth (-3.5 to 24.0). Horizontal
axis: big data value potential index (low to high). Sectors plotted:
computer and electronic products; information services; manufacturing;
administration, support and waste management; wholesale trade;
transportation and warehousing; real estate and rental; professional
services; finance and insurance; healthcare providers; utilities;
government; retail trade; accommodation and food; arts and
entertainment; management of companies; natural resources; other
services; educational services; construction. Sectors are grouped into
Clusters A through E. Bubble sizes denote relative size of GDP.]
Clusters reflect big data scale as measured by industry segment.
Source: U.S. Bureau of Labor Statistics; McKinsey Global Institute
Figure 2

Improving Quality with External Data
As progress is made toward initiatives such as electronic health
records (EHR), more and more external data will become available, and
this will become an integration challenge. External sources include the
Nationwide Health Information Network (NHIN), health information
exchanges (HIE), health information organizations (HIO) and regional
health information organizations (RHIO). As sources and volume of
information increase, so will expectations.

In addition to integrating data within the healthcare system, there are
many potential benefits of integrating data from outside of the
healthcare system. While integrating external data poses similar
challenges to integrating internal data, there are also additional
challenges, such as privacy, security and legal concerns, as well as
questions about authenticity, accuracy and consistency.

As an example, external data about healthy people holds immense
potential value for research and the future delivery of healthcare.
Typical healthcare data includes only people visiting doctors and
hospitals, which biases that data toward people seeking treatment.
Adding anonymous data from large numbers of healthy people could help
establish baselines, draw correlations and help with understanding the
nature of illnesses. More data, effectively used, leads to better
information and decisions, and more meaningful efforts.

Implications of Regionalization, Globalization
External data will come from different medical systems in various
regions and countries. Effectively working across these disparate data
repositories can help identify local knowledge and best practices and
leverage them regionally and globally. Aggregating data regionally and
globally also provides healthcare researchers with larger populations
for clinical studies, trending and disease monitoring for epidemics, as
well as early detection and the potential for improved results.

As data becomes less local and more regional and global, the quality of
both data and metadata will improve over time as a result of increased
data scrutiny and the efforts and contributions of big data innovators
across the broader healthcare data ecosystem. At the same time, sharing
data on a global basis will lead to security challenges, as well as
issues resulting from different standards, terminology and language
barriers.

Information Demands Drive Mobility
In many domains, mobility is a solution looking for a problem. Big data
changes that. Demand for ubiquitous access to information mandates
mobility and other technologies that provide access on demand. As data
becomes more current, it will be necessary to get information into the
hands of people with an immediate need for it, such as for clinical
decision support. Users will also demand access to this data so they
have precise and complete information to make the best possible
healthcare decisions. Quality of care and improved outcomes will be the
ultimate benefits.
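The bias adjustment described under "Accuracy" can be sketched as follows. This is a minimal illustration, not a validated method: it assumes a hypothetical validation subsample in which each person's self-reported weight is paired with a clinically measured weight, fits a least-squares line to that subsample, and applies the line to new self-reports. The function names and the sample numbers are invented for illustration.

```python
from statistics import mean


def fit_calibration(reported, measured):
    """Fit measured ~ slope * reported + intercept by ordinary least squares."""
    if len(reported) != len(measured) or len(reported) < 2:
        raise ValueError("need at least two paired observations")
    r_bar, m_bar = mean(reported), mean(measured)
    sxx = sum((r - r_bar) ** 2 for r in reported)
    if sxx == 0:
        raise ValueError("reported values must not all be identical")
    sxy = sum((r - r_bar) * (m - m_bar) for r, m in zip(reported, measured))
    slope = sxy / sxx
    intercept = m_bar - slope * r_bar
    return slope, intercept


def adjust(reported_value, slope, intercept):
    """Apply the fitted calibration to one self-reported value."""
    return slope * reported_value + intercept


# Hypothetical validation subsample (kg): people under-report by a few kg.
reported = [60.0, 70.0, 80.0, 90.0]
measured = [62.0, 73.0, 84.0, 95.0]

slope, intercept = fit_calibration(reported, measured)
print(round(adjust(75.0, slope, intercept), 1))
```

A straight line is only one possible correction; with enough paired data, the adjustment could be fitted separately by age group or other attributes.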
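The de-identification step mentioned under "Privacy concerns" might look like the sketch below. It is illustrative only and does not by itself satisfy any regulatory standard: the record layout, the field names and the `deidentify` helper are all hypothetical. It drops direct identifiers, replaces the patient ID with a keyed hash (HMAC-SHA-256) so records can still be linked without revealing the ID, and coarsens the birth date to a year.

```python
import hashlib
import hmac

# Hypothetical direct identifiers to strip before data leaves the organization.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "email"}


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable keyed hash of the ID; the key must stay internal."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record that is safer to share externally."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonymize(record["patient_id"], secret_key)
    # Generalize the full birth date (YYYY-MM-DD) to the birth year only.
    birth_date = cleaned.pop("birth_date", None)
    if birth_date is not None:
        cleaned["birth_year"] = int(birth_date[:4])
    return cleaned


# Hypothetical self-reported record.
record = {
    "patient_id": "P-001",
    "name": "Jane Doe",
    "phone": "555-0100",
    "birth_date": "1970-05-17",
    "self_reported_weight_kg": 68.0,
}
shared = deidentify(record, secret_key=b"example-key-not-for-production")
print(sorted(shared))
```

A keyed hash is used rather than a plain hash so that someone holding a list of known patient IDs cannot recompute the pseudonyms without the key.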
Big Data, Social Media and Healthcare
Social media will increase communication between patients, providers
and communities (e.g., patients with similar conditions and providers
with similar specialties). This will not only work to globalize and
democratize healthcare, but it is also a potentially important source
of big data. Social networking data poses challenges such as volume,
lack of structure and velocity, as well as new challenges around
integration and accuracy.

For example, if a group of patients is discussing quality of care about
a provider, there will likely never be 100% consensus. Patient
experiences will be different, and there will be biases based on
accidents, misunderstandings and other factors. The challenge will be
to create useful information out of this collection of data to provide
information such as provider ratings and improvement guidance.

Big Data = Big Challenges
The problem in healthcare isn’t the lack of data but the lack of
information that can be used to support decision-making, planning and
strategy. As an example, a single patient stay generates thousands of
data elements, including diagnoses, procedures, medications, medical
supplies, lab results and billing. These need to be validated,
processed and integrated into a large data source to enable meaningful
analysis. Multiply this by all the patient stays across the system and
combine it with the large number of points where data is generated and
stored, and the scope of the big data challenge begins to emerge. And
this is only a small part of the healthcare data landscape.

Outlined below are some of the specific challenges of healthcare big
data, including healthcare as a technology laggard, data fragmentation,
security, standards and timeliness.

Healthcare as a Technology Laggard
Healthcare is notoriously slow to redefine and redesign processes and
tends to be a laggard in adopting technology that impacts the
healthcare system, outside of some specific areas such as care delivery
and research. In addition, the healthcare technology landscape includes
vast areas of legacy technology, causing further complications.

Fragmentation
In healthcare, big data challenges are compounded by the fragmentation
and dispersion of data among the various stakeholders, including
payers, providers, labs, ancillary vendors, data vendors, standards
organizations, financial institutions and regulatory agencies.
Solutions for big data will break the traditional model, in which all
data is loaded into a warehouse. Data federation will emerge as a
solution in which the big data architecture is based on a collection of
nodes within and outside the enterprise and accessed through a layer
that integrates the data and analytics.

The biggest obstacle to effective use of big data is the nature of
healthcare information. Payers, providers, research centers and other
constituents all have their own silos of data. These are fundamentally
difficult to integrate because of concerns about privacy and propriety,
the complex and fragmented nature of the data, as well as the different
schemas and standards underlying the data and lack of metadata within
each silo. Even if everyone shared their data, there would be enough
challenges integrating it within the silo, much less outside it.

Although groups such as HIE, RHIO and NHIN are working to facilitate
the exchange of healthcare data, adoption has been slow, as they have
faced numerous challenges.

Security
The entire healthcare system can realize benefits from democratizing
big data access; for example, researchers can more easily collaborate,
engage in peer review and eliminate duplication of efforts. Researchers
will also be able to more readily identify opportunities where they can
contribute and collaborate.

The cloud makes exposing and sharing big data easy and relatively
inexpensive. However, significant security and privacy concerns exist,
including compliance with the Health Insurance Portability and
Accountability Act (HIPAA). A credentialing process could facilitate
and automate this access, but there are complexities and challenges.
Since providers, patients and other interested parties such as
researchers need secure access, data access should be controlled by
group, role and function. Finally, the security of the data once it
leaves the cloud also needs to be assured. Big data can also be used to
identify patterns and irregularities that indicate security threats and
other types of fraud, helping to prevent them.

Standards
Dealing with the myriad of standards (and lack thereof) creates
interoperability challenges, at least through the medium term. Big data
solution
architectures have to be flexible enough to cope with not only the
additional sources but also the evolution of schemas and structures
used for transporting and storing data. To ensure analytics are
meaningful, accurate and suitable, metadata and semantic layers are
needed that accurately define the data and provide business context and
guidance, including appropriate and inappropriate uses of the data.
This evolution of standards will eventually improve data quality.

Timeliness
Data timeliness is a challenge in various healthcare settings, such as
clinical decision support, whether for making decisions or providing
information that guides decisions. Big data can make decision support
simpler, faster and ultimately more accurate because decisions are
based on higher volumes of data that are more current and relevant. In
some cases, there is a very limited window for clinical decision
support, significantly smaller than the time it takes to run a report
or analytic query. Careful attention to data and query structure, scope
and execution is needed to ensure that the constraints of the
processing windows are observed while still obtaining the best possible
answer.

In other cases, streams of data containing complex and varied events
without an overarching structure need to be mined. In this case, those
events have to be turned into meaningful measures in real time that
are, in turn, suitable for rapid analysis. In many cases, the only
practical solution is to discard most of the data after analyzing it
and selectively store those results. It’s a tradeoff between the
competitive advantage gained from the shorter feedback loop and the
quality of the information that is being fed back.

Capturing only processed data, streaming or otherwise, creates
information at the expense of losing the underlying data. The
underlying principle of big data is to keep everything, but in some
cases that’s just not practical or even useful; sometimes the hoarder
reflex has to be checked and rational decisions made.

Big Data = Technology Choices
There are numerous technology solutions for dealing with big data,
ranging from on-site to cloud and from open source to proprietary.
On-site options that can tame big data include Teradata, Vertica (HP)
and Netezza (IBM). All of these solutions tend to have low time to
value and maintenance but relatively high total cost of ownership.

Cloud-hosted software as a service (SaaS) solutions can help reduce the
barriers of participating in the big data arena. Google and Amazon
implement MapReduce-based solutions to process huge datasets using a
large number of computers (e.g., terabytes of data on thousands of
computers). MapReduce algorithms take large problems and divide them
into a set of discrete tasks that can then be distributed to a large
number of computers for processing and the results combined into a
problem solution. Other cloud-based solutions include Tableau, which
supports visualization.

Open-source Hadoop is a framework used by many companies as a
high-performance, scalable and relatively low-cost option for dealing
with big data. Training, professional services and support are needed
to effectively deploy Hadoop solutions using the open source framework.
Vendors such as Greenplum (a division of EMC), Microsoft, IBM and
Oracle have commercialized Hadoop and aligned and integrated it with
the rest of their database and analytic offerings.

SaaS is an important technology for democratizing the results of big
data. SaaS-based solutions allow healthcare entities that control
subsets of data to expose access through services that eliminate some
of the aggregation and integration challenges. Additional services that
facilitate analytics, both basic and advanced, can be made part of the
overall offering.

Recommendations
To successfully identify and implement big data solutions and benefit
from the value that big data can bring, healthcare organizations need
to devote time and resources to visioning and planning. This will
provide the foundation needed for strong execution. Without this
preparation, organizations will not realize the envisioned benefits of
big data and will risk being left behind by competitors.

Our recommendations for healthcare organizations looking to leverage
big data include:

• Establish a business intelligence center of excellence with a focus
on big data.
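The divide, distribute and combine pattern described for MapReduce under "Big Data = Technology Choices" can be illustrated on a single machine. The sketch below is not Hadoop or any vendor's API: it assumes a hypothetical list of encounter records tagged with diagnosis codes, splits them into chunks, counts codes per chunk in a pool of worker threads (the map step), and merges the partial counts (the reduce step).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split(records, n_chunks):
    """Divide the input into roughly equal, independent chunks."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: count diagnosis codes within one chunk."""
    return Counter(code for _, code in chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial results."""
    return left + right


def count_codes(records, n_workers=4):
    """Run the map step in parallel, then fold the partial counts together."""
    chunks = split(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical (encounter_id, diagnosis_code) pairs.
encounters = [(1, "E11"), (2, "I10"), (3, "E11"), (4, "J45"), (5, "I10"), (6, "E11")]
totals = count_codes(encounters, n_workers=3)
print(totals.most_common())
```

In a real cluster the chunks would live on different machines and the framework would handle scheduling and failures; only the shape of the computation carries over.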
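The approach described under "Timeliness", turning a stream of events into measures and keeping only the results, can be sketched as a running summary. This is an assumption-laden illustration: the heart-rate readings, the alert threshold and the `RunningSummary` class are hypothetical. It uses Welford's online algorithm, so each reading updates the count, mean and variance and is then discarded; only readings above the threshold are stored.

```python
import math


class RunningSummary:
    """Keeps count, mean and variance of a stream without storing it."""

    def __init__(self, alert_threshold):
        self.alert_threshold = alert_threshold
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0     # running sum of squared deviations from the mean
        self.alerts = []   # the only raw values that are retained

    def update(self, value):
        # Welford's online update for mean and variance.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        if value > self.alert_threshold:
            self.alerts.append(value)

    @property
    def stdev(self):
        """Sample standard deviation; 0.0 until two values have been seen."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


# Hypothetical heart-rate stream (beats per minute).
summary = RunningSummary(alert_threshold=120)
for reading in [72, 75, 71, 130, 74, 73]:
    summary.update(reading)

print(summary.count, round(summary.mean, 1), summary.alerts)
```

The tradeoff named in the text is visible here: the summary stays small and current, but anything not captured by the chosen measures cannot be recovered later.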