Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
NFLabs Profile
1. NFLABS
SIMPLIFYING BIG DATA
Private and Confidential. Please do not distribute. November 2012
Copyright(c) 2012 NFLabs, Inc. All rights reserved.
2. AGENDA
• Executive Summary & Market Opportunity
• Product Overview
• Business Strategy
• Achievements
• Team
• Case Studies
• Appendix
3. EXECUTIVE SUMMARY: What is Big Data?
“It is a tagline to describe tools, technology, and
procedures to manage and process large, complex
data sets within a short period of time.”
4. 3V’s of Big Data
[Figure: the three V's of big data]
• VOLUME: KB, MB, GB, TB, PB
• VELOCITY: batch, periodic, near real-time, real-time
• VARIETY: table, reports, database, CMS, web, mobile, social, logs, photo, audio, videos
5. Why is Big Data important?
“Every day, we create 2.5 quintillion bytes of data,
so much that 90% of the data in the world today has
been created in the last two years alone.” IBM
7. So what?
“It is an opportunity to find insights in new and
emerging types of data and content, to make your
business more agile, and to answer questions that
were previously considered beyond your reach.
UNTIL NOW, THERE WAS NO PRACTICAL WAY TO HARVEST
THIS OPPORTUNITY.” IBM
8. Ok. So what are some big data technologies/tools?
Most big data tools are open source, but…
9. Big Data commercial landscape is large and complex
10. Because the big data market is HUGE
Big Data Market Forecast (US$ Billions)
Year    US       Korea
2012    $5.1     $0.4
2013    $10.2    $0.7
2014    $16.8    $1.2
2015    $32.1    $2.3
2016    $48.0    $3.4
2017    $53.4    $3.7
Korea market size calculated per percentage of US GDP. Source: Wikibon 2012
$5.46 Billion Total Addressable Market Opportunity with 58% CAGR ($360 Million just for Korea)
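The growth rate quoted on this slide can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes that $5.1 billion is the 2012 US figure and $53.4 billion the 2017 US figure, and that the $5.46 billion total is the 2012 US figure plus the $360 million quoted for Korea; the deck does not state either assumption explicitly. Because the endpoints are rounded chart labels, the result lands close to, not exactly on, the quoted 58%.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumed endpoints read from the chart labels (US$ billions).
US_2012 = 5.1
US_2017 = 53.4
KOREA_2012 = 0.36  # the "$360 Million just for Korea" figure

growth = cagr(US_2012, US_2017, years=5)
total_addressable = US_2012 + KOREA_2012  # assumed composition of the total

print(f"Implied US CAGR 2012-2017: {growth:.1%}")
print(f"2012 total addressable market: ${total_addressable:.2f}B")
```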
11. So where do I start?
NFLABS
• We are a company formed by technologists, entrepreneurs, and developers focused on
simplifying big data. We believe:
  • Big data will change the world, both at the enterprise and consumer level
  • Big data is too complex: the only way to accelerate adoption is to make it simple
  • Open source is AWESOME: some of the best code is open source, and the power of
  community is far stronger than the power of enterprise
• We aim to deliver two simple things:
  • Flexibility of platform: We love Hadoop, but we believe you shouldn’t be limited to the
  Hadoop ecosystem. Different enterprises will have different platform needs.
  • Simplicity of data insight: We don’t believe you need a PhD or the ability to see the matrix
  to gain insight into data. We believe data is like Lego: Exploration + Creativity = Beautiful
  Creation (and Value).
12. Company Overview
• NAME: NFLabs Inc.
• ESTABLISHMENT: September 8, 2011
• CEO: Sejun Ra
• OPERATIONS: Seoul, KR and San Jose, CA* (EST January 2013)
13. PRODUCT OVERVIEW
PELOTON
(P) PLATFORM: DEPLOY, MANAGE, MONITOR
• Custom build your own big data platform in minutes.
(D) DATA: EXPLORE, ANALYZE, CONSUME
• Explore and analyze all your data in real time.
14. ARCHITECTURE
[Architecture diagram]
• Clients: Web Client (Peloton Manager), Analytics IDE (e.g. R-Studio)
• OPEN APIs (REST, Connectors)
• PELOTON ENGINE, from DATA SOURCE to DATA INSIGHT
  • Search: real-time index, template-based
  • Explore: across data type, across data store
  • Analyze: real-time, no ETL
• DATA SOURCE: agent-based and agent-less collection (Machine Logs, CRM, ERP)
• CONNECTOR
• DATA STORE: HDFS, HIVE DW, RDBMS
15. MAIN FEATURES: DEPLOY
• Ease of deployment: Click and Save. No coding or terminal necessary.
• Deploy only what you need: An ever-expanding package depository to deploy only needed
components, from Hadoop and sub-components to our custom-built R-Spark for real-time
R-based analysis.
• Auto-deploy: Create triggers to auto-deploy additional components.
16. MAIN FEATURES: MANAGE & MONITOR
• Monitor: View the state of all your deployments and their components.
• Manage: Remotely add, delete, or change configurations of all your deployments.
• Alerts: Set alerts to inform you of errors via email or SMS.
• Dashboard: Set up command center dashboards using flexible widgets to monitor and
view all your services in one window.
17. MAIN FEATURES: EXPLORE & ANALYZE
• Template-based analysis: Use our built-in log mappers to analyze all your machine logs.
• Real-time search: Search or view your data in real time as it streams to our platform.
• Query and compare: Perform advanced SQL queries directly against data in HDFS. No
need to ETL to an RDBMS.
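To make the idea of a template-based log mapper concrete, here is a minimal sketch of how a reusable template could turn raw machine-log lines into structured records. The deck does not describe how Peloton's mappers are implemented, so the template (Apache Common Log Format expressed as a regular expression), the field names, and the `map_line` helper are all hypothetical illustrations.

```python
import re

# Hypothetical template: Apache Common Log Format as named regex groups.
APACHE_COMMON = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def map_line(line, template=APACHE_COMMON):
    """Apply a template to one log line; return a dict, or None if no match."""
    match = template.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert numeric fields so they can be filtered and aggregated later.
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
record = map_line(sample)
print(record)
```

Supporting another log type in this sketch means adding another compiled template; lines that match no template come back as `None` and can be routed to a fallback.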
18. STRATEGY: Product
All within one system, one command center.
• VALUE ADDED MODULES (RETAIL ANALYTICS, SHARK ANALYTICS): Continue to add analytical
algorithmic templates and integrate with popular IDEs and clouds.
• DEPOSITORY / PACKAGE ECOSYSTEM: Continue to curate and upgrade the best open
source big data packages available.
19. ACHIEVEMENTS
[Timeline figure: Q1 2012 to Q1 2013]
• CLOSED CONSULTING PROJECT
• CLOSED GOVERNMENT PROJECT
• RENEW SERVICE CONTRACT WITH SKBB
• BIG DATA APPLIANCE LAUNCH WITH HYOSUNG
• SILICON VALLEY OFFICE LAUNCH
• PARTNERSHIP SIGNED WITH HYOSUNG INFORMATION SYSTEM (HIS) TO JOINTLY BUILD BIG DATA APPLIANCE
• TECHNOLOGY PARTNERSHIPS WITH HORTONWORKS AND CLOUDERA
• LAUNCH OF PELOTON v1.0 (FULLY PACKAGED SOFTWARE)
• ACCEPTED TO SPARKLABS INCUBATOR PROGRAM
• LAUNCH OF PELOTON BETA
• REGISTRATION OF CLOUDVFS PATENT
• REGISTRATION OF SMART ROUTING PATENT
20. TEAM
NFLABS
• BEST AND MOST EXPERIENCED HADOOP DEVELOPERS IN KOREA
• MULTI-DECADE EXPERIENCE IN SALES, BUSINESS DEVELOPMENT, MANAGEMENT
• EXPERIENCE IN VARIOUS VERTICALS (CLOUD, NETWORK, MOBILE, OPEN SOURCE)
• OVER 20 YEARS OF GLOBAL WORKING EXPERIENCE (EU, US, APAC)
• MULTI-YEAR DISTRIBUTED PLATFORM EXPERIENCE
21. CASE STUDIES
COMPANY OVERVIEW
SK broadband is one of Korea’s largest telecommunication companies. Their services
range from IPTV to network hosting services.

SK broadband customers’ data needs were growing rapidly, so they needed a storage
solution that would automatically scale while keeping cost investments low, both in
terms of infrastructure and management overhead.

SOLUTION
Peloton provides a multi-tier storage system with a near real-time log processing engine.
Our Hadoop-based Peloton storage cluster is over 1 PB, distributed across SK’s
network. The Peloton log processing engine also collects and processes over 1 TB of
logs every day. To add more capacity, they simply add commodity servers.
SK currently has only 2 engineers managing the whole system.

BEFORE NFLABS
They had a choice between a very expensive SAN solution or a monolithic NAS build-out
that would be a nightmare to manage and require multiple engineers.

CURRENTLY SERVICING SK BROADBAND’S TOP ENTERPRISE CUSTOMERS
22. CASE STUDIES
COMPANY OVERVIEW
KT (Korea Telecom) is the largest telecommunications operator in Korea. They provide
services ranging from mobile to IPTV, from hosting to cloud services.

KT is a leader in innovation and is preparing to build a marketplace for its mobile
customers. The marketplace required a high-performing data processing platform that
would scale automatically if needed, delivered in less than 6 months at low cost.

SOLUTION
Peloton’s Hadoop-based platform with HIVE, HBASE, MapReduce, and ElasticSearch
provides one of the best platforms for large-scale, high-performance data processing.
And as Peloton is cloud enabled, it can be set up with triggers to automatically deploy
new Peloton instances if necessary.

AFTER NFLABS
Instead of massively building out a costly relational database, KT was able to leverage
commodity servers to handle expected data sets of over 100 TB per day. And with the
help of Peloton, their analytical systems were able to conduct both large-scale batch
analysis and near real-time request-based assessments.

TO SERVICE KT’S 20 MILLION+ CUSTOMER BASE
23. CASE STUDIES
CONSORTIUM OVERVIEW
In 2011, the Korean government tasked KT to build the next generation content
delivery network (CDN). As part of a consortium led by KT, NFLabs was tasked to build
the next generation cloud storage back-end and log processing platform for this CDN.

SOLUTION
Peloton was chosen among many solutions to be the foundation for this nationally
approved cloud storage and log processing engine. Peloton provided the fault-tolerant
(auto replication), decentralized (no single point of failure), elastic (scale up/down)
solution the consortium was seeking.

BEFORE NFLABS
The KT Consortium reviewed many solutions based on different DFS systems (Lustre,
Gluster), SAN/NAS alternatives, and separate log processing solutions, but could
not find a fully integrated end-to-end solution like Peloton.

NATIONALLY APPROVED TECHNOLOGY
24. APPENDIX
25. CLOUDVFS*: UNIVERSAL CONNECTOR
CloudVFS is a programmatic interface with support for most standard storage
connectors/controllers. But the power of CloudVFS is its native programmatic
architecture that allows events (load balancing, failover, data movement, etc.) to be
customized based on triggers.

Wide Range Support of Connectors
• Support of most storage and DB connectors: HDFS, S3, POSIX, and many more
• Configurable: dynamic load/unload

Programmatic Interface (Load Balancing, Replication, Failover, Custom)
• Program different events: Load Balancing, Replication, Failover, and Custom algorithm plugins
• Configurable: dynamic load/unload

*Patent Pending
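As an illustration of trigger-based event customization with dynamically loaded and unloaded plugins, here is a minimal sketch of an event registry. It is not the CloudVFS API, which this deck does not document; the `EventBus` class, the `node_down` event name, and the `failover_to_replica` handler are hypothetical names chosen for the example.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical registry mapping event names to pluggable handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def load(self, event, handler):
        """Dynamically attach a handler (plugin) to an event."""
        self._handlers[event].append(handler)

    def unload(self, event, handler):
        """Dynamically detach a previously loaded handler."""
        self._handlers[event].remove(handler)

    def fire(self, event, **context):
        """Trigger an event and collect each handler's decision."""
        return [handler(**context) for handler in self._handlers[event]]


def failover_to_replica(node, replicas):
    """Example custom algorithm: pick the first replica that is not the failed node."""
    healthy = [replica for replica in replicas if replica != node]
    return healthy[0] if healthy else None


bus = EventBus()
bus.load("node_down", failover_to_replica)
decisions = bus.fire("node_down", node="store-1", replicas=["store-1", "store-2"])
print(decisions)
```

Load balancing or replication policies would plug in the same way in this sketch: each is a function registered against the event that should trigger it, and can be unloaded without touching the others.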
26. COMPARATIVE VALUE
Pure Analytics Software
• EASE OF USE: Steep learning curve. UI cluttered with excessive features. Integration with
legacy storage may not work.
• COSTS: Need to buy separate storage, ETL, data processors, and all necessary connectors.
• BENEFITS: Powerful feature set to conduct complex analytics.

Cloud Based Analytics Service
• EASE OF USE: Inconsistent layout and features. Difficult to transfer data or connect to
data source. Potential privacy issues.
• COSTS: Low initial investment. Cost per use very high. May need to pay storage
separately. Cost prohibitive to do large data sets.
• BENEFITS: Low initial investment. Nothing on premise.

Build-your-own stack
• EASE OF USE: Can be customized to user base.
• COSTS: Very high initial investment: HW, expertise/support, integrators, architects,
developers, etc. Very long time to market.
• BENEFITS: Can be built to match company’s needs.

Peloton
• EASE OF USE: Easily manage, search, and analyze from one single platform and UI.
• COSTS: One purchase. No need to buy separate software for platform and analytics
(or consultants/expertise).
• BENEFITS: Low TCO. Integrates with legacy. No technical expertise necessary. Store and
analyze data immediately.

Peloton offers the simplest and fastest way to start leveraging big data technologies.