Network Traffic Search using
Apache HBase

Evans Ye @ TWHUG 2014 Q1
2014/3/8
Who am I
• Evans Ye @

– Dumbo Team
• Dumbo In Taiwan Blog

– Talk in TWHUG 2013 Q4
• Building Hadoop Based Big Data Environment

– Apache Bigtop Contributor
3/10/2014

Copyright 2013 Trend Micro Inc.
Agenda
• Problem to Solve

• Solution Design
• Flume ETL Process

• Experience Sharing
• Future Work

Security Department:
Hey SPN, I have a big data
problem…


Step aside and let the professionals handle it!
Network Traffic Analysis Example
[Diagram: C&C 1, C&C 2 and C&C 3 on the INTERNET; on the INTRANET, VICTIM 1 and VICTIM 2 in the TW branch, VICTIM 3 and VICTIM 4 in the US branch]
Find Malicious Connections by Searching
Netflow logs

• ArcSight Common Event Format
– Volume: 250 GB / 180 million records per day
Valuable Fields in Netflow log
• src: source ip

• dst: destination ip
• spt: source port

• dpt: destination port
• proto: protocol (TCP, UDP, …)
• rt: timestamp in epoch milliseconds, e.g. 1386018915000
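
To make the field list concrete, here is a minimal Python sketch that pulls these six fields out of one CEF record. The sample record, vendor and product names are made up for illustration, and the parser is a simplification: it assumes the extension is a run of space-separated key=value pairs and does not handle escaped characters.

```python
import re

# Fields the slide lists as valuable; everything else in the record is ignored.
WANTED = ("src", "dst", "spt", "dpt", "proto", "rt")

# key=value pairs in the CEF extension; a value runs until the next " key=" or end of line.
_PAIR = re.compile(r"([\w.]+)=(.*?)(?=\s+[\w.]+=|$)")

def extract_fields(cef_line):
    """Return the wanted extension fields of one CEF record as a dict."""
    # The extension is everything after the 7th '|' of the header
    # (escaped pipes inside the header are not handled in this sketch).
    extension = cef_line.split("|", 7)[-1]
    pairs = dict(_PAIR.findall(extension))
    return {key: pairs[key] for key in WANTED if key in pairs}

# Hypothetical sample record (documentation-range IP addresses).
sample = ("CEF:0|ExampleVendor|ExampleProduct|1.0|100|netflow|1|"
          "src=192.0.2.10 dst=198.51.100.7 spt=51234 dpt=443 proto=TCP "
          "rt=1386018915000 in=1200")
fields = extract_fields(sample)
```

All values stay strings here; converting rt to an integer is left to the caller.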

Search for Connections

• Query against the Netflow Logger: about 8~10 min
Big Data Problem

Choosing The Right Tool
• Big data solutions

• Why HBase?
– We wanted to try it and figure out the limitations of HBase Thrift
– We wanted to see how HBase performs when dealing with this kind of problem
Solution Design
Architecture
• Data Source: send Netflow via syslog
• HBase Thrift Server: talk to HBase using C++, Python, PHP, Ruby, Perl…
• Query front end: a simple Python web framework, only one file, under 150k

User Requirement
• Searchable Fields
– src: source ip
– dst: destination ip
– spt: source port
– dpt: destination port
– proto: protocol (TCP, UDP, …)
– rt: timestamp, e.g. 1386018915000

• Values
– in, cn2, ad.tcp__flags

HBase Rowkey Design – First Attempt
• Compose searchable fields to be rowkey

• For client query, scan by applying HBase Filter
– RowFilter (=, 'regexstring:^src#dst#[^#]*#spt#dpt#proto$')
– See HBase Thrift Filter doc
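
As a rough illustration of the first attempt, the sketch below builds that filter string from whichever fields a user filled in. The rowkey layout src#dst#rt#spt#dpt#proto is an assumption inferred from the regex on the slide, and the helper names are hypothetical. Escaping is done with Python's re.escape; how the HBase filter parser treats backslashes should be checked before using such a string for real.

```python
import re

# Assumed rowkey layout, inferred from the filter on the slide:
# src#dst#rt#spt#dpt#proto, with '#' as the separator.
FIELDS = ("src", "dst", "rt", "spt", "dpt", "proto")

def rowkey_regex(**criteria):
    """Build a rowkey regex; fields that are not given match anything but '#'."""
    parts = [re.escape(str(criteria[name])) if name in criteria else "[^#]*"
             for name in FIELDS]
    return "^" + "#".join(parts) + "$"

def row_filter(**criteria):
    """Wrap the regex in the filter-language form shown on the slide."""
    return "RowFilter (=, 'regexstring:%s')" % rowkey_regex(**criteria)

pattern = rowkey_regex(src="192.0.2.10", dpt=443)
sample_key = "192.0.2.10#198.51.100.7#1386018915000#51234#443#TCP"
matches = re.match(pattern, sample_key) is not None
```

Every rowkey in the table still has to be tested against this regex on the server, which is what the following slides run into.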

RD Style Search Portal

Performance
• Test on 12 million sample data

• The search performance……
• Since we need to store at least 3 months of data for queries, the performance might not be good enough…

Lesson Learned
• Avoid full table scan
– HBase Filters only keep unwanted data from being sent to the client side
– On the server side, HBase still needs to compare all the rowkeys when applying filters
– → Set STARTROW and STOPROW

Avoid Full Table Scan
• HBase is natively designed to store data sorted by rowkey
• Scanning rows is fast when a rowkey prefix is specified

– This can only be fast when the source ip is specified
– How about destination ip, port, protocol, …?
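
The prefix idea can be sketched without a cluster: keep rowkeys in sorted order, derive a start row and an exclusive stop row from the prefix, and read only the slice in between. This is a toy simulation with a Python list standing in for the table; the function names are hypothetical.

```python
import bisect

def prefix_range(prefix):
    """Return (start_row, stop_row) covering exactly the keys that start with prefix."""
    start = prefix
    # The stop row is exclusive: bump the last byte so it sorts right after every
    # key with this prefix. A trailing 0xFF byte is not handled in this sketch.
    stop = prefix[:-1] + bytes([prefix[-1] + 1])
    return start, stop

def scan(sorted_keys, start, stop):
    """Simulate a range scan over rowkeys held in sorted order."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]

keys = sorted([
    b"192.0.2.10#198.51.100.7",
    b"192.0.2.10#203.0.113.5",
    b"192.0.2.100#198.51.100.7",
    b"192.0.2.9#198.51.100.7",
])
start, stop = prefix_range(b"192.0.2.10#")
hits = scan(keys, start, stop)
```

Including the "#" separator in the prefix keeps 192.0.2.100 out of a search for 192.0.2.10. The same trick does nothing for a query that only knows the destination ip, because that field is not at the front of the key.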

Rethink The User Requirement
• Searchable Fields
– src: source ip
– dst: destination ip
– spt: source port
– dpt: destination port
– proto: protocol
– rt: timestamp
– required: at least one IP (src or dst)

• Users want to track down suspicious connections
– A query needs to include at least one IP

HBase Rowkey Design – Second Attempt !
– Search on source ip

– Search on destination ip

– Put the netflow timestamp into the HBase timestamp to leverage HBase TimeRange Scan
– Set VERSIONS => 2147483647 to avoid collisions
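
A toy model may help show why the version count matters. In the sketch below the row is src#dst, the qualifier is spt#dpt#proto and the cell timestamp is the netflow rt, following the formats named elsewhere in the deck. Two flows with the same five fields then land in the same cell and differ only by timestamp, so they survive only if enough versions are kept. The class and the sample values are made up for illustration; this is not HBase code.

```python
class VersionedTable:
    """Toy stand-in for one HBase table: (row, qualifier) -> {timestamp: value}."""

    def __init__(self):
        self.cells = {}

    def put(self, row, qualifier, timestamp, value):
        # Same row and qualifier with a new timestamp adds a version
        # instead of overwriting the previous one.
        self.cells.setdefault((row, qualifier), {})[timestamp] = value

    def versions(self, row, qualifier, min_ts, max_ts):
        """Return (timestamp, value) pairs with min_ts <= timestamp < max_ts, newest first."""
        stored = self.cells.get((row, qualifier), {})
        hits = [(ts, value) for ts, value in stored.items() if min_ts <= ts < max_ts]
        return sorted(hits, reverse=True)

def store(table, record):
    """Write one netflow record using the second-attempt layout."""
    row = "%(src)s#%(dst)s" % record
    qualifier = "%(spt)s#%(dpt)s#%(proto)s" % record
    table.put(row, qualifier, int(record["rt"]), record["value"])

table = VersionedTable()
flow = {"src": "192.0.2.10", "dst": "198.51.100.7", "spt": "51234",
        "dpt": "443", "proto": "TCP", "rt": "1386018915000", "value": "in=1200"}
store(table, flow)
store(table, dict(flow, rt="1386018975000", value="in=300"))
recent = table.versions("192.0.2.10#198.51.100.7", "51234#443#TCP",
                        1386018900000, 1386018960000)
```

The half-open time range mirrors how an HBase TimeRange is defined (minimum inclusive, maximum exclusive).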
HBase Rowkey Design – Second Attempt !

• Search the other fields by applying a QualifierFilter:
– QualifierFilter (=, 'regexstring:^spt#dpt#proto$')

Check The User Requirement
• Searchable Fields
– src: source ip → specify STARTROW/STOPROW
– dst: destination ip → specify STARTROW/STOPROW
– spt: source port → apply qualifier filter
– dpt: destination port → apply qualifier filter
– proto: protocol → apply qualifier filter
– rt: timestamp → specify HBase TimeRange
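
Put together, the mapping from query fields to scan parameters might look like the following sketch. The table names src_table and dst_table, the function name and the shape of the returned dict are all assumptions for illustration; only the row format, the qualifier format and the filter syntax come from the slides.

```python
def plan_scan(src=None, dst=None, spt=None, dpt=None, proto=None, time_range=None):
    """Map a user query onto scan parameters, one mechanism per field."""
    if src is None and dst is None:
        raise ValueError("a query needs at least one IP")
    # Table names are hypothetical; the IP that is given picks the table.
    if src is not None:
        table, first, second = "src_table", src, dst
    else:
        table, first, second = "dst_table", dst, src
    if second is None:
        # Prefix scan: '$' is the character right after '#', so it closes the range.
        start_row, stop_row = first + "#", first + "$"
    else:
        # Both IPs known: scan exactly one row (the stop row is exclusive).
        start_row = first + "#" + second
        stop_row = start_row + "\x00"
    plan = {"table": table, "start_row": start_row, "stop_row": stop_row}
    if any(v is not None for v in (spt, dpt, proto)):
        parts = ["[^#]*" if v is None else str(v) for v in (spt, dpt, proto)]
        plan["filter"] = "QualifierFilter (=, 'regexstring:^%s$')" % "#".join(parts)
    if time_range is not None:
        plan["time_range"] = time_range  # (min_ts, max_ts) in epoch milliseconds
    return plan

plan = plan_scan(dst="198.51.100.7", dpt=443,
                 time_range=(1386018000000, 1386019000000))
```

A query without any IP is rejected up front, which matches the rethought requirement that a search needs at least one IP.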
Deliver New Portal

Performance
• Test on 70 million sample data

• The search performance……
• Enough?
– Since malicious connections won't have a large volume, 80% of queries should be answered within a second

• Duplicate issue:
– Since we only store the needed fields in HBase, the data volume is only 150MB/day → 300MB/day once duplicated
– Storing 3 months of data = 13.5GB → 27GB once duplicated (GZed)
(record count = 12 billion)
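
The storage figures can be checked with a few lines of arithmetic. This assumes "3 months" means 90 days and 1 GB = 1000 MB, which is what reproduces the numbers on the slide.

```python
DAYS = 90          # "3 months", taken as 90 days
MB_PER_DAY = 150   # needed fields only, from the slide
TABLES = 2         # every record is written to two tables

single_gb = MB_PER_DAY * DAYS / 1000    # one copy of the data, in GB
duplicated_gb = single_gb * TABLES      # both tables together, in GB
```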
Test on Even Larger Data
• Test on 240 million sample data

• The search performance……
• The query time stays stable for 80% of query cases

Flume ETL Process
Architecture
• Data Source: send Netflow via syslog
• Query → HBase Thrift Server

Serializer
1. Extract needed fields from Netflow log
2. Create HBase put object for Sink to execute

Flume Process
Data Source → Flume Spooling Directory Source → Flume file Channel → Flume HBase Sink (with Serializer)
Dual Table Write
• Data Source (Infosec) → Flume Spooling Directory Source
• Channel1 → Sink1
• Channel2 → Sink2

flume.conf
…
agent1.sinks.sink1.serializer.rowKey = src, dst
agent1.sinks.sink2.serializer.rowKey = dst, src

Duplicate, Again!
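
What the two rowKey settings amount to can be sketched in a few lines: the same event goes to both sinks, and each sink's serializer joins the named fields in its own order. The rowKey property belongs to the talk's own serializer; the Python below only mimics its effect, with hypothetical function names and the "#" separator taken from the src#dst format used elsewhere in the deck.

```python
# Serializer settings as shown in the flume.conf excerpt.
SINK_ROWKEYS = {"sink1": "src, dst", "sink2": "dst, src"}

def make_rowkey(record, rowkey_spec):
    """Join the record fields named in rowkey_spec with '#'."""
    names = [name.strip() for name in rowkey_spec.split(",")]
    return "#".join(record[name] for name in names)

def fan_out(record):
    """Return the rowkey each sink would write for one event copied to both channels."""
    return {sink: make_rowkey(record, spec) for sink, spec in SINK_ROWKEYS.items()}

rowkeys = fan_out({"src": "192.0.2.10", "dst": "198.51.100.7"})
```

Each event is buffered and serialized twice in this setup, which is the duplication the slide complains about.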
More Elegant Way
• Step1: A put triggers the prePut Coprocessor
• Step2: Put to dst table in dst#src format in coprocessor
• Step3: Do regular put to src table in src#dst format

Data Source (Infosec) → Flume Spooling Directory Source → Channel1 → Sink1 → src table (prePut Coprocessor hooked) → dst table
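
The three steps can be mimicked with a small in-memory model. A real HBase coprocessor is a Java class running inside the region server, so the Python below is only a simulation of the control flow, with made-up class and function names; it is not how the talk's coprocessor was written.

```python
class Table:
    """Toy table: rowkey -> {qualifier: value}, with optional pre-put hooks."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.pre_put_hooks = []

    def put(self, row, qualifier, value):
        # Step 1: the put triggers every registered hook before the write happens.
        for hook in self.pre_put_hooks:
            hook(row, qualifier, value)
        # Step 3: the regular put to this table.
        self.rows.setdefault(row, {})[qualifier] = value

def mirror_to(dst_table):
    """Build a hook that writes the same cell to dst_table under the swapped key."""
    def hook(row, qualifier, value):
        src, dst = row.split("#", 1)
        # Step 2: put to the dst table in dst#src format. The write goes straight
        # into rows so the mirrored put cannot trigger hooks again.
        dst_table.rows.setdefault(dst + "#" + src, {})[qualifier] = value
    return hook

src_table = Table("src")
dst_table = Table("dst")
src_table.pre_put_hooks.append(mirror_to(dst_table))
src_table.put("192.0.2.10#198.51.100.7", "51234#443#TCP", "in=1200")
```

With the hook in place, Flume needs only one channel and one sink; the second copy is produced on the HBase side.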
Experience Sharing
& Future Work
Experience Sharing
• Thrift
– Thrift is not a first-class citizen of HBase; for example, Thrift does not support Scan with TimeRange and Version
– New Filters (for example, FuzzyRowFilter) are not supported, since Thrift has its own Filter Language

• Bottle
– It doesn't hurt to delete web backend code that was implemented with Bottle

Experience Sharing
• Flume
– There is also a Flume Syslogudp Source, but it does not work well without extra work
• 768 bytes per message limitation (fixed in FLUME-2130)
• Still has a 2048 bytes limitation on the netty event decoder
• Data may be lost because messages get concatenated...

– Spooling Directory Source is much more stable

Future Work
• Make the index table transparent to clients
– Use a coprocessor to hook on the client scan and decide which table to scan

• Make Thrift scan support specifying version:
– For now I use scan to fetch rows and qualifiers, then use getVer to fetch the different versions (Thrift does support "version" on get)
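
The scan-then-getVer workaround could look roughly like the sketch below. The call names scannerOpenWithScan, scannerGetList, scannerClose and getVer follow the HBase Thrift (version 1) interface as best understood here, so the argument lists should be checked against the generated bindings in use. FakeClient, Cell and RowResult are stand-ins invented so the function can be tried without a cluster, and the time filter is applied on the client as an assumption about how the gap was bridged.

```python
from collections import namedtuple

# Minimal stand-ins for the Thrift-generated TCell and TRowResult types.
Cell = namedtuple("Cell", "value timestamp")
RowResult = namedtuple("RowResult", "row columns")

def fetch_versions(client, table, scan, max_versions, min_ts, max_ts):
    """Scan for rows and qualifiers, then call getVer per cell and filter by time."""
    found = []
    scanner_id = client.scannerOpenWithScan(table, scan, {})
    try:
        while True:
            batch = client.scannerGetList(scanner_id, 100)
            if not batch:
                break
            for result in batch:
                for column in result.columns:
                    cells = client.getVer(table, result.row, column, max_versions, {})
                    # The time range is applied client-side in this sketch.
                    found.extend((result.row, column, cell.timestamp, cell.value)
                                 for cell in cells if min_ts <= cell.timestamp < max_ts)
    finally:
        client.scannerClose(scanner_id)
    return found

class FakeClient:
    """In-memory stand-in for a Thrift client, only for trying the function out."""

    def __init__(self, data):
        self.data = data      # {row: {column: [Cell, ...]}}, newest cell first
        self.pending = None

    def scannerOpenWithScan(self, table, scan, attributes):
        # Like a scan without versions, expose only the newest cell per column.
        self.pending = [RowResult(row, {col: cells[0] for col, cells in cols.items()})
                        for row, cols in sorted(self.data.items())]
        return 1

    def scannerGetList(self, scanner_id, nb_rows):
        batch, self.pending = self.pending[:nb_rows], self.pending[nb_rows:]
        return batch

    def scannerClose(self, scanner_id):
        self.pending = None

    def getVer(self, table, row, column, num_versions, attributes):
        return self.data[row][column][:num_versions]

client = FakeClient({"192.0.2.10#198.51.100.7": {"f:51234#443#TCP": [
    Cell("in=900", 1386018915000), Cell("in=1200", 1386018900000)]}})
rows = fetch_versions(client, "src_table", None, 10, 1386018910000, 1386018920000)
```

This costs one extra round trip per cell, which is why scan support for versions is listed as future work.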

Questions?
Thank you !

Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 

Network Traffic Search using Apache HBase

  • 1. Network Traffic Search using Apache HBase Evans Ye @ TWHUG 2014 Q1 2014/3/8
  • 2. Who am I • Evans Ye @ – Dumbo Team • Dumbo In Taiwan Blog – Talk in TWHUG 2013 Q4 • Building Hadoop Based Big Data Environment – Apache Bigtop Contributor 3/10/2014 Copyright 2013 Trend Micro Inc.
  • 3. Agenda • Problem to Solve • Solution Design • Flume ETL Process • Experience Sharing • Future Work
  • 4. Security Department: Hey SPN, I have a big data problem… Step aside and let the professionals handle it!
  • 5. Network Traffic Analysis Example [diagram: C&C 1, C&C 2, and C&C 3 on the Internet; VICTIM 1 and VICTIM 2 in the TW branch and VICTIM 3 and VICTIM 4 in the US branch on the intranet]
  • 6. Find Malicious Connections by Searching Netflow Logs • ArcSight Common Event Format – Volume: 250 GB / 180 million records per day
  • 7. Valuable Fields in Netflow Log • src: source IP • dst: destination IP • spt: source port • dpt: destination port • proto: protocol (TCP, UDP…) • rt: timestamp, e.g. 1386018915000
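The field list above can be pulled out of a CEF line with a few lines of Python. This is a minimal sketch, not the parser used in the talk: it assumes the standard CEF layout (seven pipe-delimited header fields followed by space-separated `key=value` pairs), ignores escaped pipes, and uses a made-up sample line whose vendor and product names are hypothetical.

```python
import re

# Keys the slide lists as valuable; every other extension key is ignored.
FIELDS = ("src", "dst", "spt", "dpt", "proto", "rt")

# A value runs lazily until the next " key=" or the end of the line.
_PAIR = re.compile(r"([\w.]+)=(.*?)(?=\s+[\w.]+=|$)")

def extract_fields(cef_line):
    """Return the valuable fields found in one CEF-formatted line."""
    # The extension is whatever follows the seventh pipe of the header.
    extension = cef_line.split("|", 7)[-1]
    pairs = dict(_PAIR.findall(extension))
    return {key: pairs[key] for key in FIELDS if key in pairs}

# Hypothetical sample line, for illustration only.
sample = ("CEF:0|ExampleVendor|ExampleFlow|1.0|100|netflow|1|"
          "src=10.1.2.3 dst=203.0.113.7 spt=51234 dpt=80 "
          "proto=TCP rt=1386018915000 in=420")
record = extract_fields(sample)
```

`rt` is kept as a string here; callers that need epoch milliseconds can convert it with `int(record["rt"])`.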
  • 9. Big Data Problem
  • 10. Choosing The Right Tool • Big data solutions • Why HBase? – We want to try out HBase Thrift and figure out its limitations – We want to see how HBase performs when dealing with this kind of problem
  • 12. Architecture [diagram: Data Source sends Netflow via syslog; HBase Thrift Server: talk to HBase using C++, Python, PHP, Ruby, Perl…; web portal: a simple Python web framework, only one file, under 150k; Query]
  • 13. User Requirement • Searchable Fields – src: source IP – dst: destination IP – spt: source port – dpt: destination port – proto: protocol (TCP, UDP…) – rt: timestamp, e.g. 1386018915000 • Values – in, cn2, ad.tcp__flags
  • 14. HBase Rowkey Design – First Attempt • Compose the rowkey from the searchable fields • For client queries, scan with an HBase filter applied – RowFilter (=, 'regexstring:^src#dst#[^#]*#spt#dpt#proto$') – See the HBase Thrift Filter doc
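A rough sketch of the first attempt follows: join the searchable fields with `#` to form the rowkey, and build the `RowFilter` string with `[^#]*` standing in for any field the user left blank. The field order is an assumption: the slide shows only the pattern `src#dst#[^#]*#spt#dpt#proto`, so treating the third slot as `rt` is a guess, and the helper names are hypothetical.

```python
import re

# Assumed rowkey layout; the third slot being "rt" is a guess from the slide.
KEY_ORDER = ("src", "dst", "rt", "spt", "dpt", "proto")

def make_rowkey(record):
    """Join the searchable fields into one '#'-delimited rowkey."""
    return "#".join(str(record[field]) for field in KEY_ORDER)

def make_row_filter(**criteria):
    """Build a filter-language RowFilter string; blank fields match anything."""
    parts = [
        re.escape(str(criteria[field])) if field in criteria else "[^#]*"
        for field in KEY_ORDER
    ]
    return "RowFilter (=, 'regexstring:^" + "#".join(parts) + "$')"

rowkey = make_rowkey({"src": "10.1.2.3", "dst": "203.0.113.7",
                      "rt": 1386018915000, "spt": 51234,
                      "dpt": 80, "proto": "TCP"})
row_filter = make_row_filter(src="10.1.2.3", dpt=80)
```

The dots in the IP address are escaped so that they match literally instead of acting as regex wildcards.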
  • 15. RD Style Search Portal
  • 16. Performance • Test on 12 million sample records • The search performance…… • Since we need to store at least 3 months of data for queries, the performance might not be good enough…
  • 17. Lesson Learned • Avoid full table scans – HBase filters only keep unwanted data from being sent to the client side – On the server side, every rowkey still has to be compared when filters are applied – → Set STARTROW and STOPROW
  • 18. Avoid Full Table Scan • HBase is natively designed to store data sorted by rowkey • Scanning rows is fast when a rowkey prefix is specified – This can only be fast when the source IP is specified – How about destination IP, port, protocol, …?
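Because rows are sorted by rowkey, a prefix can be turned into a STARTROW/STOPROW pair so the scan touches only the matching slice. The helper below is a small sketch of that calculation; `prefix_range` is a hypothetical name, and the trailing `#` delimiter is included in the prefix so that `10.1.2.3` does not also match `10.1.2.30`.

```python
def prefix_range(prefix: bytes):
    """Return (start_row, stop_row) covering every rowkey that starts with prefix.

    The stop row is exclusive, as in an HBase scan.
    """
    stop = bytearray(prefix)
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    while stop and stop[-1] == 0xFF:
        stop.pop()
    if not stop:
        # No finite upper bound: an empty stop row means "to the end of the table".
        return prefix, b""
    stop[-1] += 1
    return prefix, bytes(stop)

start_row, stop_row = prefix_range(b"10.1.2.3#")
```

With the first-attempt key layout this only helps queries that fix the source IP, which is the limitation the slide points out.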
  • 19. Rethink The User Requirement • Searchable Fields – src: source IP (required) – dst: destination IP – spt: source port – dpt: destination port – proto: protocol – rt: timestamp • Users want to track down suspicious connections – A query needs to have at least one IP
  • 20. HBase Rowkey Design – Second Attempt! – Search on source IP – Search on destination IP – Put the netflow timestamp into the HBase timestamp to leverage HBase TimeRange scans – Set VERSIONS => 2147483647 to avoid collisions
  • 21. HBase Rowkey Design – Second Attempt! • Search the other searchable fields by applying a QualifierFilter: – QualifierFilter (=, 'regexstring:^spt#dpt#proto$')
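The write side of the second attempt might look like the sketch below: one record becomes a cell keyed `src#dst` and a mirrored cell keyed `dst#src`, with `spt#dpt#proto` as the qualifier and `rt` as the cell timestamp. The key formats come from the later slides; the table labels, the tuple layout, and the `to_cells` helper are assumptions made for illustration.

```python
def to_cells(rec, value):
    """Map one netflow record to its cells as (row, qualifier, timestamp, value)."""
    qualifier = f"{rec['spt']}#{rec['dpt']}#{rec['proto']}"
    timestamp = int(rec["rt"])  # netflow time (ms) becomes the cell timestamp
    return {
        "src_table": (f"{rec['src']}#{rec['dst']}", qualifier, timestamp, value),
        "dst_table": (f"{rec['dst']}#{rec['src']}", qualifier, timestamp, value),
    }

flow = {"src": "10.1.2.3", "dst": "203.0.113.7",
        "spt": 51234, "dpt": 80, "proto": "TCP", "rt": 1386018915000}
# Same connection seen one minute later: same row and qualifier, new timestamp.
later = dict(flow, rt=1386018975000)

first = to_cells(flow, "payload-1")
second = to_cells(later, "payload-2")
```

The two flows above land in the same row and column and differ only in timestamp, so they are stored as versions of one cell. That is presumably why the slide raises the version limit: with a small limit, older records of a repeated connection would be discarded.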
  • 22. Check The User Requirement • Searchable Fields – src: source IP → specify STARTROW/STOPROW – dst: destination IP → specify STARTROW/STOPROW – spt: source port → apply qualifier filter – dpt: destination port → apply qualifier filter – proto: protocol → apply qualifier filter – rt: timestamp → specify HBase TimeRange
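The mapping above can be read as a small query planner: the IPs pick the table and the row range, the ports and protocol become a qualifier filter, and the timestamp becomes a time range. The sketch below only computes those parameters and does not talk to HBase; `plan_query`, the table names, and the returned dictionary layout are all hypothetical.

```python
import re

def plan_query(src=None, dst=None, spt=None, dpt=None, proto=None, time_range=None):
    """Turn search criteria into scan parameters for the two-table design."""
    if src is None and dst is None:
        raise ValueError("a query needs at least one IP")
    # Use the table whose rowkey starts with an IP we were given.
    if src is not None:
        table, first, second = "src_table", src, dst
    else:
        table, first, second = "dst_table", dst, src
    if second is None:
        # Prefix scan: '$' is the byte right after '#', so the range covers
        # every key starting with "<first>#" and nothing else.
        start, stop = f"{first}#", f"{first}$"
    else:
        # Both IPs known: scan exactly one row (the stop row is exclusive).
        start = f"{first}#{second}"
        stop = start + "\x00"
    qualifier_filter = None
    if any(v is not None for v in (spt, dpt, proto)):
        parts = ["[^#]*" if v is None else re.escape(str(v))
                 for v in (spt, dpt, proto)]
        qualifier_filter = ("QualifierFilter (=, 'regexstring:^"
                            + "#".join(parts) + "$')")
    return {"table": table, "start_row": start, "stop_row": stop,
            "filter": qualifier_filter, "time_range": time_range}

plan = plan_query(dst="203.0.113.7", dpt=80,
                  time_range=(1386018000000, 1386019000000))
```

How `time_range` is applied depends on the client; the plan simply carries it through.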
  • 23. Deliver New Portal
  • 24. Performance • Test on 70 million sample records • The search performance…… • Enough? – Since malicious connections won't have large volume, 80% of queries should be answered within a second • Duplication issue: – Since we only store the needed fields in HBase, the data volume is only 150MB/day → 300MB/day when duplicated – Storing 3 months of data = 13.5GB → 27GB when duplicated (GZed) (record count = 12 billion)
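The storage figures on this slide can be checked with a line of arithmetic. The sketch assumes "3 months" means 90 days and decimal units (1 GB = 1000 MB); the function name is hypothetical.

```python
DAILY_MB = 150        # needed fields only, per the slide
RETENTION_DAYS = 90   # "3 months", assumed to mean 90 days
COPIES = 2            # one copy keyed by source IP, one keyed by destination IP

def retained_gb(daily_mb, days, copies=1):
    """Total retained volume in decimal gigabytes."""
    return daily_mb * days * copies / 1000

single_copy_gb = retained_gb(DAILY_MB, RETENTION_DAYS)
duplicated_gb = retained_gb(DAILY_MB, RETENTION_DAYS, COPIES)
```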
  • 25. Test on Even Larger Data • Test on 240 million sample records • The search performance…… • The query time is robust in 80% of query cases
  • 26. Flume ETL Process
  • 28. Serializer 1. Extract the needed fields from the Netflow log 2. Create an HBase Put object for the sink to execute [diagram: Flume process from Data Source to HBase: Flume Spooling Directory Source → Flume File Channel → Flume HBase Sink, with the Serializer in the sink]
  • 29. Dual Table Write [diagram: Data Source (Infosec) → Flume Spooling Directory Source → Channel1 → Sink1 and Channel2 → Sink2] flume.conf: … agent1.sinks.sink1.serializer.rowKey = src, dst agent1.sinks.sink2.serializer.rowKey = dst, src Duplicate, Again!
  • 30. More Elegant Way • Step 1: A put triggers the prePut coprocessor • Step 2: The coprocessor puts to the dst table in dst#src format • Step 3: The regular put goes to the src table in src#dst format [diagram: Data Source (Infosec) → Flume Spooling Directory Source → Channel1 → Sink1 → src table, with a prePut coprocessor hooked on it that writes to the dst table]
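The three steps above can be modeled in plain Python to show the control flow. This is only a simulation with in-memory dictionaries: a real prePut hook is a Java coprocessor running inside the region server, and the `Table` class and `mirror_to` helper here are hypothetical stand-ins.

```python
class Table:
    """In-memory stand-in for an HBase table with an optional pre-put hook."""

    def __init__(self, name, pre_put=None):
        self.name = name
        self.rows = {}
        self.pre_put = pre_put  # plays the role of the prePut coprocessor

    def put(self, row, qualifier, timestamp, value):
        if self.pre_put is not None:
            # Step 1: the incoming put triggers the hook first.
            self.pre_put(row, qualifier, timestamp, value)
        # Step 3: the regular put proceeds on this table.
        self.rows.setdefault(row, {})[(qualifier, timestamp)] = value

def mirror_to(dst_table):
    """Build a hook that writes the same cell to dst_table under dst#src."""
    def hook(row, qualifier, timestamp, value):
        src, dst = row.split("#", 1)
        # Step 2: put to the dst table with the two IPs swapped.
        dst_table.put(f"{dst}#{src}", qualifier, timestamp, value)
    return hook

dst_table = Table("dst")
src_table = Table("src", pre_put=mirror_to(dst_table))
src_table.put("10.1.2.3#203.0.113.7", "51234#80#TCP", 1386018915000, "payload")
```

With this arrangement the client issues a single put, so one Flume channel and one sink are enough; the second copy is produced on the server side.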
  • 31. Experience Sharing & Future Work
  • 32. Experience Sharing • Thrift – Thrift is not a first-class citizen of HBase; for example, Thrift does not support Scan with TimeRange and Version – It does not support new filters (for example, FuzzyRowFilter), since Thrift has its own filter language • Bottle – It won't hurt when you delete web backend code that is implemented with Bottle
  • 33. Experience Sharing • Flume – There is also a Flume Syslogudp Source, but it does not work well without extra work • 768 bytes per message limitation (fixed in FLUME-2130) • Still has a 2048-byte limitation in the netty event decoder • Data may be lost due to concatenated messages... – Spooling Directory Source is much more stable
  • 34. Future Work • Make the index table transparent to clients – Use a coprocessor to hook the client scan and decide which table to scan • Make the Thrift scan support specifying versions: – Currently I use a scan to fetch rows and qualifiers, then use getVer to fetch the different versions (Thrift does support "version" on get)