Big Data and Analytics: The New Underpinning for Supply Chain Success? - 17 FEB 2015

Big Data & Analytics:
The New Underpinning for
Supply Chain Success?
Insights on Big Data and Analytics: Uncovering the Potential
for the Supply Chain
02/17/2015
By Lora Cecere
Founder and CEO
Supply Chain Insights LLC

Contents
Research Methodology
Open Content Research
Disclosure
Executive Overview
What Is Big Data?
Current State and the Evolving Opportunity
Defining Your Big Data Opportunity
Using Analytics
Why SAP HANA Is Not the End-State for Big Data
Barriers
Misconceptions
Recommendations
Conclusion
Appendix
Other Reports in This Series
Terms to Know
About Supply Chain Insights, LLC
About Lora Cecere

Research Methodology
The Supply Chain Insights team focuses on bringing supply chain research to business visionaries. In
this report, we share insights on big data and analytics. This is the third year the Supply Chain
Insights team has tracked the adoption of big data concepts by supply chain leaders through a
quantitative study.
This report compares the trends. It is based on a mixed-methods analysis: completion of a
quantitative survey on big data and analytics, followed by qualitative interviews with 18 supply chain
business/technology leaders.
To participate in the quantitative study, the respondents had to have a basic familiarity with big data
and analytics. In this process, we find that the knowledge level of the supply chain business leader is
low: one in two respondents was disqualified from the study due to a lack of understanding of big data
and analytics, and only 18% of respondents have a big data and analytics initiative.
Demographic data from the quantitative survey is shared in the Appendix, and an overview of the
study is shared in Figure 1.
Figure 1. Big Data Quantitative Study Overview

Open Content Research
This report is shared using the principles of Open Content research. It is intended for you to read,
share, and use to improve your supply chain decisions. Please share this data freely within your
company and across your industry. All we ask for in return is attribution when you use the materials in
this report. We publish under the Creative Commons License Attribution-Noncommercial-Share Alike
3.0 United States and you will find our citation policy here.
Disclosure
Your trust is important to us. In our business, we are open and transparent about our financial
relationships and our research processes; and, we never share the names of respondents or give
attribution to the open comments collected in the research. This research was 100% funded by the
Supply Chain Insights team.
In the development of our research our philosophy is, “You give to us and we give to you.” As a part
of this philosophy, we share data with all respondents; and if interested, we will share our insights
with the respondents on a one-hour phone call with their team. We are committed to delivering
thought-leading content. It is our goal to be the place where visionaries turn to gain an understanding
of the future of supply chain management.

Executive Overview
Today, data is everywhere, yet nowhere. The world’s per capita capacity to store information has
doubled every 40 months since the 1980s; and as of 2012, every day globally, 2.5 exabytes of data
are created [1]. As a result, social and customer data piles on the doorstep of the corporation, and
operational data sits in the creases and cracks between functions. While many companies invested in
data warehouse technologies and advanced applications for optimization, a common complaint in
qualitative interviews with business leaders is “I cannot get to my data.” One business leader likened
it to a Hotel California where, “The data checks into the system, but does not check out.” In most
companies with heterogeneous information technology landscapes, simple reporting is still a major
problem.
In the face of growing data, companies struggle with the basics. The question is, “Why pursue a big
data and analytics strategy if the company cannot do basic reporting?” No doubt about it, the current
state of analytics is a barrier to building supply chain excellence. It is hard to have a data-driven
discussion if you can’t get access to data.
Figure 2. Current State of Supply Chains
[1] IBM, “What Is Big Data? Bringing the Data to the Enterprise”, http://www.ibm.com/big-data/us/en/, 02/16/2015

The answer lies with business goals and requirements. With the need for real-time decision making
amid growing complexity, the business pain for the supply chain leaders is higher. As shown in Figure
2, based on recent research for the Race for Supply Chain 2020 Study, only 20% of companies feel
that their supply chains are working well. Companies are struggling to make supply chains proactive,
agile and outside-in [2]. Big data and new forms of analytics offer a major opportunity to close these
gaps. In short, we cannot achieve the goal with traditional solutions.
While executives talk about big data strategies, it is largely just that… talk. Few companies know
where to start. Helping them find that starting point is the goal of this research report.
What Is Big Data?
While the term big data is overhyped and overused in the press, we find the concepts and the
potential opportunities are not well-understood by supply chain teams. Let’s start with a definition. For
the purposes of this report, we define big data as data volume greater than a petabyte, and systems
and solutions that utilize a variety of data types with increasing velocity. It is about volume, data
variety and velocity. While we will show in the data for this report that most supply chain leaders do
not face challenges with data volume, they do have many opportunities to harness
data variety and drive better decisions through increasing data velocity.
It is a big data opportunity. The use of big data analytics offers major opportunities for business
leaders to close the gaps shown in Figure 2; but, as outlined in this report, there are many obstacles
to overcome to use new forms of analytics. To help you, the reader, let’s start with a simple example.
The first and second generations of supply chain planning solutions are based on linear, deterministic
optimization. The depth of optimization in these systems is not sufficient for the global multinational
company. Why? Supply chains are non-linear, complex systems with changing and variable lead
times. Concurrent optimization in combination with cognitive learning can deliver greater insights and
flexibility. The technologies are new, and there is no fixed, or guaranteed, ROI. This leaves the supply
chain leader in a conundrum. How can they find new funds to embrace new techniques when most of
the funding for supply chain technology is focused on maintaining and upgrading Advanced Planning
Systems (APS) and Enterprise Resource Planning (ERP)? For all, it is a dilemma.
[2] Supply Chain Insights, Race for Supply Chain 2020, December 2014, http://www.slideshare.net/loracecere

Figure 3. Mergers and Acquisitions [3]
The reality is that information technology (IT) budgets are tighter than ever and the average company
is struggling with massive data complexity. With a surge in mergers and acquisitions in the last
decade, coupled with the rise of global multinationals, the average company today has five to seven
Enterprise Resource Planning (ERP) solutions and two to three instances of both demand and supply
planning [4]. Maintaining these systems on tighter IT budgets is a problem. This complexity makes
finding funding for big data and analytics initiatives even harder.
Current State and the Evolving Opportunity
Meanwhile, in traditional supply chains, the business processes and software chug along processing
structured data. The data is read and written to relational databases with fixed structures. It is a
heterogeneous environment with many systems. Despite attempts at IT standardization, there are
many systems and versions of software. As a result, access to data is problematic with most
companies struggling to define and execute simple reporting strategies. To these teams, the concepts
of big data supply chains can seem grandiose. Companies will question why they should go after
something new when they are struggling with what they have today. Teams unfamiliar with big data do
not realize that many of the concepts in big data analytics can make their current environments
simpler and easier to manage.
[3] IMAA, http://www.imaa-institute.org/statistics-mergers-acquisitions.html#MergersAcquisitions_Worldwide, 2/16/2015
[4] Supply Chain Insights, Maximizing the Return in Supply Chain Planning, http://supplychaininsights.com/maximizing_the_roi_in_supply_chain_planning/
It is a study of contrasts, and the change is happening slowly. In Table 1, we contrast the current
state and the evolving opportunities with technologies for supply chain management.
Table 1. Current State versus Evolving Technologies
Where should people start? The big data analytics journey should be built with the goal in mind. The
focus should be on some large area of business opportunity. The challenge is to rewire the brain of
the organization to imagine how these new technologies and approaches can transform business
processes. In this report, we share some research to help the teams get started.
Defining Your Big Data Opportunity
Most supply chain opportunities do not lie with better handling of data volume; instead, the bulk of the
opportunities are in the management of disparate data sets and streaming data. In this survey, 12%
of ERP instances were greater than 10 terabytes (and a long way from a petabyte of data). Here are
some examples of big data and analytics opportunities:
• Quality and Warranty Data. Most quality and warranty data is unstructured. This data is typically
in email dialogue between customer service and distributors or rating & review data on e-commerce
sites. Text mining and unstructured text analytics, in combination with sentiment analysis, are seldom
used today; but when they are utilized, companies can sense quality problems 4-6 weeks earlier than
the normal call center processes.
• Redefining Supply Chain Visibility. Demand and supply volatility have never been higher. The
use of unstructured data and social listening, in combination with concurrent optimization and
cognitive learning, can increase supply chain visibility to redefine agility.
• The Internet of Things and Predictive Maintenance. Today, most manufacturing locations have
programmable logic controllers and sensors. The factory of today is ready to be automated. It is an
Internet of Things opportunity. To harness this sensor data and better schedule factories based on
equipment conditions, companies need to embrace the Internet of Things and build systems to
analyze and use streaming data. This approach can redefine product maintenance.
• Cold Chains and Directed Put-Away. Today’s cold chains record average temperatures. When
handling products that are temperature sensitive like oncology drugs, fruit and vegetables, and
meat, the average temperature is not sufficient. Instead, the use of RFID temperature sensing and
streaming data enables intelligent workflow. Products that have been exposed to heat for long
periods of time are cross-docked on unloading directly to customers, while products that are not
exposed to heat are put away into normal storage.
• Counterfeiting and Use of QR Codes. The use of QR codes with a positive scan verifies product
identity. With increasing product counterfeiting, this is growing in importance.
• Manless Vehicles and Telematics. Google, Apple and Uber are redefining driving to be manless
based on telematics and sensing. This use of geolocation codes and actual traffic conditions in
concert with streaming data from car sensors is redefining driving.
The possibilities are endless. The approaches are limited by our current thinking and well-established
paradigms. It is an opportunity to answer supply chain problems that have long been outside of our
reach. The new techniques for big data supply chains enable the harnessing of insights from data that
is not structured to fit neatly into rows and columns. The world where machines learn from, and
interact with, sensors may seem foreign and futuristic; but today, it is being implemented by 9% of
early adopters [5]. On one end of the spectrum, companies have groups of employees sweating on
Excel entries, and on the other end of the continuum, companies are beginning to take small steps to
drive technology adoption for big data analytics.
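As a rough illustration of drawing a signal from data that does not fit into rows and columns, the sketch below counts complaint-related words in free-text comments and totals them by week, so that a rising week stands out. It is a deliberately simple stand-in for the text mining and sentiment analysis described earlier: the word list, the threshold and the function names are assumptions made for this example, and a real program would rely on a curated ontology or a trained model.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical complaint vocabulary, invented for this sketch.
COMPLAINT_TERMS = {"broken", "leak", "leaking", "defective", "cracked", "refund"}


def complaint_hits(text):
    """Count words in one free-text comment that match the complaint vocabulary."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for word in words if word in COMPLAINT_TERMS)


def weekly_complaints(comments):
    """Aggregate complaint hits by ISO (year, week) from (date, text) pairs."""
    totals = Counter()
    for day, text in comments:
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += complaint_hits(text)
    return totals


def flag_weeks(totals, threshold):
    """Return the weeks, in order, whose complaint count meets the threshold."""
    return sorted(week for week, count in totals.items() if count >= threshold)


comments = [
    (date(2015, 2, 2), "Arrived on time, works great."),
    (date(2015, 2, 10), "Seal is leaking and the lid is cracked."),
    (date(2015, 2, 11), "Defective unit, I want a refund!"),
]
flagged = flag_weeks(weekly_complaints(comments), threshold=3)
```

The design choice here is to keep the raw text and derive the weekly signal on demand, rather than forcing each comment into a fixed set of columns before it can be analyzed.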
[5] Supply Chain Insights, Imagine the Supply Chain of the Future, http://supplychaininsights.com/imagine-the-supply-chain-of-the-future/

Figure 4. Familiarity with Big Data Concepts
Today, we find that 18% of companies are looking at big data opportunities; and only one in two
supply chain professionals is somewhat familiar with big data concepts. As shown in Figure 4, only
8% consider themselves to be extremely familiar. The challenge for all is rewiring the organizational
mental models to use big data and analytics concepts.
Companies that see it as an opportunity are usually larger (greater than $500 million) and have
started a big data initiative. These groups are frequently seen in discrete manufacturing environments
with more mature analytics teams like the high-tech and electronics, and semiconductor industries.
The teams are cross-functional and multi-disciplinary in approach.

Figure 5. Identifying the Big Data Opportunity
Figure 6. Presence of Big Data and Analytics Teams

Companies know that there is business opportunity with big data and analytics. As a result, 43% of
companies surveyed plan to have a big data initiative within ten years. The average company aims
to have a big data and analytics initiative within three years.
For those with a big data initiative, the focus is on the mining of unstructured data and the use of the
Internet of Things (streaming data). The greatest opportunity is in improving product traceability and
supply chain visibility. The use of unstructured data and streaming data from the Internet of Things is
lower performing.
Figure 7. Focus of Big Data Teams
Where there is focus, the best get better and the gap widens between leaders and
laggards. When companies self-assess their capabilities on data usage, companies with a big data
initiative significantly out-rank their peer group as shown in Figure 8.

Figure 8. Self-Assessment and Comparison of Companies’ Use of Data between Those with and without Big Data Analytics Teams
Using Analytics
When most companies refer to analytics, they often mean reporting. For many organizations with
disparate systems, simple reporting is an issue. While we do not want to trivialize the pain with
enterprise reporting, the definition for the purposes of this report is much broader. Instead, it is about
the use of descriptive, predictive, prescriptive, and cognitive analytics to learn, sense, and adapt.
Most organizations are familiar with descriptive analytics (reporting and Business Activity
Monitoring (BAM)) and predictive analytics (optimization), but they are less familiar with prescriptive
analytics (artificial intelligence) and cognitive learning (use of rules-based ontologies to learn and
adapt). The first- and second-generation supply chain planning and execution systems were based
upon descriptive and predictive analytics. The third generation of analytics, which is on the horizon
within the next two years, will include prescriptive analytics and cognitive learning. It will come
from a new set of technology vendors.

Figure 9. Evolution of Analytics
Often, when companies come back from a big data conference, there is a mistaken belief that big
data and analytics requires a data scientist and offline processes. We see this quite differently. In the
evolution of big data analytics, we believe that we will have more sensor- and computer-generated
data and greater dependency on “black boxes” of increasing capabilities to reason, think and predict.
The challenge will be to fine-tune and use these new forms of analytics. Companies have not
adjusted to the first generation of black boxes very well, and cognitive computing and artificial
intelligence are the next step.
For companies that are stuck in Excel ghettos, this is a change management issue. Moving planners
who are wedded to spreadsheets is a tough shift. As analytics become easier to use, companies are
well-served to limit the organization’s dependencies on spreadsheets and move to the new
generation of planning and analytics. This shift needs to be hit head-on, as many planners have
defined their positions by becoming spreadsheet jockeys. As a result, as can be seen in Figure 10,
the current focus is on visualization.
“This is new language and a new way of thinking. Where do I go to learn these new concepts?”
Supply Chain Leader at a Food and Beverage Company

Figure 10. Current Focus
For the supply chain leader, and their teams, big data is a new concept. It comes with new terms and
revolutionary thinking. The best place to start is visualization, and the second place to focus is
cloud-based analytics.
The world of supply chain applications—which has been defined by neat, nice packaged applications
where the vendors are well-known—is changing. Over the past ten years these older technologies
have consolidated and matured, and prices have fallen. This has happened just in time to enable many
manufacturers and retailers to roll out multiyear implementations for their global teams.
Traditional supply chain applications evolved to use transactional data to improve the supply chain
response. The foundational element of supply chain systems is order and shipment data. These data
forms are used extensively in the three primary applications of supply chain management: Enterprise
Resource Planning (ERP), Advanced Planning Systems (APS) and Supply Chain Execution (SCE).
The genesis of Enterprise Resource Planning (ERP) systems was to improve the order-to-cash and
procure-to-pay functionality and maintain a common code of accounts for financial accounting.
Similarly, Advanced Planning Systems (APS) applied predictive analytics to these two data types to
plan and improve the supply chain response. In parallel, Supply Chain Execution (SCE) systems
evolved to improve organizational order-to-shipment capabilities.
The gap between the importance and the perceived performance of enterprise applications for supply
chain management has never been wider (reference Supply Chain Insights Reports, Voice of the Supply
Chain Leader, 2012 and 2013). For the business leader, it is not just about data. It is about solving
the business problem, which involves predictive and prescriptive analytics. As supply chain leaders try
to tackle new problems, most do not realize they are entering the world of big data and analytics;
it just happens. The term is not in their vocabulary. They just want to do more, and solve new
problems, with new forms of data. They are frustrated with current systems. The analytical projects to
consider are:
• Drive New Customer Insights through Advanced Analytics. Traditional business analytics are
set to answer the questions that leaders know to ask. But what about the questions that are
important but companies do not know to ask? As companies build risk mitigation strategies, an
important question is, “How long does it take the company to learn about product and service
failures in the market?” These are the questions that the organization does not know to ask.
Techniques like text mining and rules-based ontologies are used to build listening capabilities to
learn early and mitigate issues quickly.
Why SAP HANA Is Not the End-State for Big Data
With the heavy use of SAP architectures by supply chain teams and the aggressive marketing
techniques by SAP, many companies that we work with are considering the investment in SAP
HANA. While it is our belief that SAP HANA will facilitate data reporting for ERP systems that are
primarily columnar in nature, we do not see this as a foundational Big Data Strategy.
Why? We believe this for three reasons:
- Nature of the Data: Supply chain data is horizontal and time-phased, while transactional data
from ERP is columnar in nature. As a result, SAP HANA is a natural complement to ERP, but does
not offer the same advantages to planning.
- Data Variety: The greatest opportunity for supply chain leaders is in harnessing the variety of
data that is available. The SAP HANA architecture is limited to the use of structured data.
- Not the End State: To take advantage of streaming data and data in pools, companies need to
use multiple approaches.

• Listen, Test and Learn. Today’s technologies allow corporations to know their customers and get
direct feedback in the form of ratings and reviews, blog comments, and feedback through social
media. These data forms are largely unstructured. As digital marketing programs become digital
business, organizations are seeking to listen cross-functionally to customer sentiment and use
advanced analytics to test and learn in vitro to the market response. Fewer than 3% of supply chain
leaders can effectively listen to social data and use it cross-functionally to understand customer
sentiment. For most companies, social data is limited to the digital marketing team.
• Sense Before Responding. The traditional supply chain responds. It is often a late and
inappropriate response. It is based on history, not current market data. As a result, the traditional
supply chain is poor at sensing changes in either demand or supply. As companies mature, they
quickly realize that sole reliance on order and shipment data increases the latency and delays the
time to respond to market shifts, thus putting the supply chain on the back foot.
• Cognitive Learning: Adapt to Change. Today’s supply chains are hard-wired. They are inflexible.
The response is based on average values and simple “if-then-else” logic. Supply chain leaders
today are looking for more flexibility in their systems, but they are not sure what this means.
Leaders are turning to new forms of predictive analytics (rules-based ontologies) to map “multiple
ifs to multiple thens” through learning systems. The combination of new forms of pattern
recognition, optimization, and learning systems is improving the organization’s ability to
respond.
• Rethink Supply Chain Visibility. Geolocation, mapping data and visualization along with supply
sensing transmission (e.g., sensors on items, totes, trucks, and rail cars) transform supply chain
visibility from near real-time to real-time data feeds augmented by actual location information. For
many, this is transformational.
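The idea of mapping “multiple ifs to multiple thens” can be sketched as a small rule table in which each rule holds several conditions and several resulting actions. The example below is only an assumption-laden illustration: the signal names, thresholds and actions are invented, and it shows the static mapping alone, not the learning step in which a cognitive system would adjust or add rules over time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A rule that fires its actions only when every condition holds."""
    name: str
    conditions: List[Callable[[Dict], bool]]
    actions: List[str]


# Hypothetical signals, thresholds and actions, invented for illustration.
RULES = [
    Rule(
        name="demand spike with thin stock",
        conditions=[
            lambda s: s["pos_sales_change_pct"] > 20,
            lambda s: s["days_of_inventory"] < 7,
        ],
        actions=["expedite replenishment", "alert demand planner"],
    ),
    Rule(
        name="supplier delay",
        conditions=[lambda s: s["supplier_delay_days"] > 3],
        actions=["switch to alternate supplier", "alert demand planner"],
    ),
]


def recommend(signals, rules=RULES):
    """Collect the actions of every rule whose conditions all hold,
    preserving order and dropping duplicates."""
    recommended = []
    for rule in rules:
        if all(condition(signals) for condition in rule.conditions):
            for action in rule.actions:
                if action not in recommended:
                    recommended.append(action)
    return recommended
```

Unlike a single “if-then-else” on an average value, several conditions can combine to trigger several actions at once, and more than one rule can fire for the same set of signals.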
These initiatives are spread throughout the organization. Most are in their infancy. One by one,
companies are trying to use new forms of data to improve supply chain excellence. However, as they
work on the projects, they stumble into new territory. They stumble into the world of big data supply
chains where data no longer can fit within relational databases, and analysis requires new forms of
parallel processing. They learn, albeit sometimes the hard way, that it requires a new approach.
They learn that they cannot stuff these new forms of data into yesterday’s systems.
Barriers
While there are many opportunities, there are also a number of barriers that need to be hit head-on to
unleash the power of big data and analytics. Here are the top five:

1. Current Paradigms. Moving forward requires companies to learn a new language (reference terms
in the appendix) and start to build new processes using new techniques. Companies need to start with
the end in mind, and not be confined by traditional thinking. Could cognitive learning redefine master
data? Could Hadoop on non-relational databases be used to drive new insights and discovery for
distributor data? Could sentiment data along with rating-and-review data help to better understand
consumer insights on new products? These are all possibilities that are feasible today. However, it
requires the rethinking of existing paradigms.
Action Item: Invest development funding for analytics testing into the Supply Chain Center of
Excellence and fund new initiatives to test and learn new techniques.
2. Funding. Current IT spending is tied up in system upgrades and maintenance support of licensed
systems. Testing big data and analytics concepts with the tight budgets in IT departments requires
funding by the business department. Work cooperatively with the IT department to test new concepts.
3. The Unknown. Current project approaches require a definitive Return on Investment (ROI) and the
detailing of an “as is” and a “to be” state. The investment in big data and analytics techniques requires
investment funding. The projects are small and iterative with an unknown and undefined ROI. This
requires a rewiring of project methodologies.
4. New Ecosystem. The providers of big data and analytics are in an ecosystem of technology
providers that are not well-known to the supply chain leader. It requires investing time to get to know a
new group of technology providers and build relationships and knowledgebases to use a new set of
tools.
Action Item: Invest training monies into developing new skills with new technology approaches.
5. Shortage of Talent. The United States needs 140,000 to 190,000 more workers with “deep
analytical” expertise and 1.5 million more data-literate managers [6]. The lack of science, technology and
engineering expertise is at the root of the issue. At the Supply Chain Insights Global Summit, Intel
reported hiring a co-op student two years prior to matriculation. Why? The talent pool is tight. Intel saw
a student with promise, and did not want to miss out on the opportunity to hire a scarce and valuable
resource.
[6] Lohr, S. (2012, February 12). New York Times. New York Times, p. 1.
Action Item: Actively use student co-op opportunities to identify talent. Find the naturally
curious and analytical employees and train them on big data techniques. Infuse these
employees into the Supply Chain Center of Excellence to apply these new techniques on
clouds, lakes and streams.
Misconceptions
There are also many misconceptions on the use of big data and analytics. Here are the five that we
hear the most frequently:
• Big Data and Analytics Requires Data Cleaning. Many well-intended articles state that a
company needs to get its house in order before starting on a big data and analytics initiative. Few
see the reality that a big data initiative can help the company get that house in order. A simple
analogy is that it’s like believing you must clean your house before you hire a housekeeper. For
example, e-commerce pure-plays like Google and Amazon never complain about master data
management. Their data volumes are larger and the processing is more complex, but they use
different techniques that help them to sidestep some of the conventional problems of master data
management.
• Big Data and Analytics Concepts Require Vast Sums of Money. Start small by funding data
visualization (like Spotfire and Tableau) and in-memory reporting (like Qlikview and SAS). These
initiatives are $100,000-$500,000 and can add enormous value.
• Big Data Initiatives Focus only on Data Volumes. The power of big data and analytics for the
supply chain leader largely lies in the use of disparate data and streaming analytics. It is an
opportunity for all. Just because you do not have a database greater than a petabyte, don’t wait to
start.
• I Can Move Forward with Big Data and Analytics with My Existing Technology Landscape
Providers. The technologies are spawning a new breed of technology providers. Most have deep
experience in the insurance, hospitality and finance industries. The analytics are very different from
the world of technology vendors that most supply chain leaders know and understand.
• I Will Be Successful If My Big Data and Analytics Initiative Is Led by IT. While the majority of
the big data and analytics initiatives are being led by IT, we don’t think that this is the place to start.
We do endorse a cross-functional team, but believe that the leadership should come from the
business leaders.

Recommendations
Step back and form a team of naturally curious people that want to learn and understand the new
opportunity of big data and analytics. Form your own team, start small, and actively learn from others.
Here are five recommendations:
1. Start Small and Learn as You Go. The easiest place to start, and one of the most rewarding, is
in-memory reporting coupled with visualization technologies. Build a data-driven culture through the use of
these techniques and wean the organization off Excel spreadsheets.
2. Train a Team on Big Data and Analytics Techniques. Invest in learning about new concepts. Invite
big data and analytics vendors to an open house and attend their training classes. Challenge the team
to return from each training class with three to four new opportunities.
3. Imagine the Future of Supply Chain and Formulate a Roadmap. Following the training, host a
strategy session to envision the future. Challenge the teams to think past existing paradigms.
4. Closely Follow the Work of Leaders. Closely follow the work of leaders in the use of artificial
intelligence and cognitive learning, and the Internet of Things. Actively network to build insights.
5. Stabilize the Investment in Legacy Systems. Free up funding for new forms of big data analytics
by stabilizing legacy investments. Use the savings from big data analytics to self-fund the next phases.
Conclusion
Supply Chain Insights has conducted this study for the past three years, and over that time
we find that companies are making little progress on the adoption of big data
strategies. The benefit lies in teasing out insights from the variety of data that is available.
Unstructured, mobile, streaming, and geolocation data offers great promise to improve supply chain
processes; however, it cannot happen without embracing big data techniques.
Start slowly, focus on the use of the different data types, and build with the end in mind. Leaders will
capture the art of the possible, while laggards will be limited to insights available only in rows and
columns. Now is the time to prepare, invest and re-evaluate what is possible.

Appendix
In this section we share the demographic information of survey respondents, as well as additional
charts referenced in the report to substantiate the findings.
The participants in this research answered the surveys of their own free will. There was no exchange
of currency to drive an improved response rate. The primary incentive made to stimulate the
response was an offer to share and discuss the survey results in the form of Open Content research
at the end of the study.
The names, both of individual respondents and companies participating, are held in confidence. The
demographics are shared to help the readers of this report gain a better perspective on the results.
The demographics and additional charts are found in Figures A–F.
Figure A. Respondent by Company Type

Figure B. Overview of Companies in the Research by Industry
Figure C. Characteristics of Respondents by Company Size

Figure D. Characteristics of Respondents by Role within the Company
Figure E. Respondent by Role

Figure F. Size of ERP Instances
Other Reports in This Series:
This is the third year of studying the adoption of big data concepts by supply chain leaders. Readers
may gain added value by accessing complimentary reports on the Supply Chain Insights website:
Big Data: Go Big or Go Home
Big Data Handbook
Imagine the Supply Chain of the Future

Terms to Know
Early adopters of big data systems have defined a new set of techniques and terms to know. These
are provided to help the supply chain leader become conversant, but not an expert, in reading about
big data systems:
Apache Hadoop. An open source Apache Software Foundation project, written in Java, used for
storing and retrieving data and metadata for computation in big data systems. It is a platform
consisting of a distributed file system and a distributed parallel processing framework.
Apache Hive. A data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and
analysis. While initially developed by Facebook, Apache Hive is now used and developed by other
companies such as Netflix. Amazon maintains a software fork of Apache Hive that is included in
Amazon Elastic MapReduce on Amazon Web Services.
Apache Spark. Spark is a fast and general processing engine compatible with Hadoop data. It can
run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS,
HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch
processing (similar to MapReduce) and new workloads like streaming, interactive queries, and
machine learning. In contrast to Hadoop's two-stage disk-based MapReduce paradigm, Spark's in-
memory primitives provide performance up to 100 times faster for certain applications.
Artificial Intelligence. The intelligence exhibited by machines or software.
Beowulf Clusters. A computer cluster of what are normally identical, commodity-grade computers
networked into a small local area network with libraries and programs installed which allow
processing to be shared among them. The result is a high-performance parallel computing cluster
from inexpensive personal computer hardware.
Cascading. A software abstraction layer for Apache Hadoop. Cascading is used to create and
execute complex data processing workflows on a Hadoop cluster using any JVM-based language
(Java, JRuby, Clojure, etc.), hiding the underlying complexity of MapReduce jobs.
Cognitive Computing. Advanced analytic computing that adds the relationship of people and
thinking to artificial intelligence through the use of rules-based ontologies to enable machine learning
based on context and adaptive rule-sets.

MapReduce. Developed by Google to support distributed computing on large data sets on computer
clusters. It is a parallel programming model for distributed data processing designed to address the
needs of naïvely parallel problems, where the work splits into independent pieces. There are three phases:
Map Phase: Reads the input, filters it, and distributes the results as output.
Shuffle and Sort Phase: Takes the outputs from the Map phase and sends them to the reducers.
Reduce Phase: Collects the answers to the sub-problems and combines the results.
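For readers who want to see the three phases in motion, the following is a minimal single-machine sketch in Python. It is not Hadoop or Google code. The order records, the function names, and the choice to sum units per product are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical order records as (product, units); not data from the study.
ORDERS = [("widget", 3), ("gadget", 5), ("widget", 2), ("gizmo", 1), ("gadget", 4)]


def map_phase(records):
    """Read each input record, filter it, and emit a (key, value) pair."""
    for product, units in records:
        if units > 0:  # drop empty order lines
            yield product, units


def shuffle_and_sort(pairs):
    """Group the mapped values by key and order the keys for the reducers."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Combine the values collected for each key into a single result."""
    return {key: sum(values) for key, values in grouped}


def run_job(records):
    """Chain the three phases in order."""
    return reduce_phase(shuffle_and_sort(map_phase(records)))


if __name__ == "__main__":
    print(run_job(ORDERS))
```

In a real cluster the map and reduce functions run on many machines at once and the framework performs the shuffle over the network. Here everything runs in one process so the flow of data stays visible.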
Ontology. An ontology is an explicit specification of a conceptualization. Rules-based ontologies
provide a roadmap for artificial intelligence by defining the vocabulary for queries and assertions to be
exchanged among agents. Rules-based ontologies enable the mapping of “multiple ifs to multiple
thens.”
Parallel Processing. Distributing data and business processing across multiple servers
simultaneously to reduce data processing times.
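The split, process, and combine pattern behind this definition can be sketched in Python as below. This is an illustration under stated assumptions: a local thread pool stands in for the multiple servers, and the shipment records and function names are hypothetical. A thread pool shows the pattern only. Real gains come from spreading the chunks across separate processors or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, n_chunks):
    """Split items into n_chunks contiguous slices of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def total_units(shipments):
    """Work done by one worker: sum the units in its slice."""
    return sum(units for _, units in shipments)


def parallel_total(shipments, workers=4):
    """Hand each slice to a worker, then combine the partial sums."""
    parts = chunk(shipments, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(total_units, parts))
    return sum(partials)


if __name__ == "__main__":
    # Hypothetical shipments as (shipment id, units).
    shipments = [("S%d" % i, i) for i in range(1, 11)]
    print(parallel_total(shipments, workers=3))
```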
Pattern Recognition. Techniques to sense patterns in data that can be used in decision making.
Pig. A programming language often used to simplify MapReduce programming.
Ratings and Review Data. Consumer product and service evaluation data. It is largely unstructured.
Sentiment Analysis. The use of natural language processing, computational linguistics, and text
analytics to identify and extract meaning from customer data.
Social Data. Data from social networks like LinkedIn, Facebook, Pinterest, and Twitter.
Structured Data. Transactional data that can easily be represented by rows and columns and stored
in relational databases. Today’s supply chain applications run on structured data. Examples include
orders, shipments, forecasts, costs, etc.
Survival Data Mining. A use of predictive analytics to identify when something is likely to occur in a
defined time span.
Text Mining. The process of mining unstructured text for pattern recognition and context.
Unstructured Data. Data that cannot be easily represented in relational databases. Common
unstructured data in supply chains includes quality, documents, maps, pictures, videos, email, and
customer service and warranty data.

About Supply Chain Insights, LLC
Founded in February 2012 by Lora Cecere, Supply Chain Insights LLC is focused on delivering
independent, actionable, and objective advice for supply chain leaders. If you need to know
which practices and technologies make the biggest difference to corporate performance, turn to us.
We are a company dedicated to this research. Our goal is to help you understand supply chain
trends, evolving technologies and which metrics matter.
About Lora Cecere
Lora Cecere (Twitter ID @lcecere) is the Founder of Supply Chain Insights LLC and
the author of the popular enterprise software blog Supply Chain Shaman, currently read
by 5,000 supply chain professionals. She also writes as a LinkedIn Influencer and
is a contributor for Forbes. She has written three books. The first book, Bricks
Matter (co-authored with Charlie Chase), was published in 2012. The second book, The
Shaman’s Journal, was published in September 2014, and the third book, Supply Chain
Metrics That Matter, was published in December 2014.
With over twelve years as a research analyst with AMR Research, Gartner Group, and Altimeter
Group, and now as a Founder of Supply Chain Insights, Lora understands supply chain. She has
worked with over 600 companies on their supply chain strategy and speaks at over 50 conferences a
year on the evolution of supply chain processes and technologies. Her research is designed for the
early adopter seeking first mover advantage.
