Collecting AWS Logs & Introducing Splunk New S3 Compatible Storage (SmartStore)

© 2019 SPLUNK INC.
Splunk User Group Edinburgh

Harry McLaren
● Managing Consultant at ECS Security
● Member of SplunkTrust (MVP)
● Leader of the Splunk User Group Edinburgh

Introduction to ECS Security
Splunk Partner - UK
– Security Consultancy & Managed SOC Provider
– Splunk Revolution Award & Splunk Partner of the Year

Agenda
• Housekeeping: Event Overview & House Rules
• AWS Log Collection at Scale (Tomasz Dziwok)
• An Overview of SmartStore (Harry McLaren)
• Configuration Monitoring TA for Splunk (Tomasz Dziwok)

Splunk [Official] User Group
“The overall goal is to create an authentic, ongoing
user group experience for our users, where
they contribute and get involved”
● Technical Discussions
● Sharing Environment
● Build Trust
● No Sales!

AWS Log Collection at Scale
Tomasz Dziwok

Amazon Web Services Log Collection

SQS-Based S3 Inputs
▶ Data flow
• Data is sent to S3 as files
• An SNS topic is notified about each new file
• SNS notifies SQS
• A Splunk heavy forwarder (HF) polls SQS
• If successful, Splunk pulls the S3 file
▶ Advantages
• HA & Distributed
• Highly Versatile
▶ Drawbacks
• Not really real-time (requires pull)
• Uses S3 Bucket as a "buffer"
• Rather complex
▶ Recommended for:
• Access Logs (ELB, CloudFront, S3)
• Config
• CloudTrail
▶ Usable for:
• Any data that can be logged to S3
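
The hand-off from SQS to S3 in the flow above can be sketched in Python. This is a minimal illustration, not the add-on's implementation: it assumes the SNS subscription does not use raw message delivery (so the S3 event arrives as a JSON string inside the SNS envelope), and the function name, bucket name and object key are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body):
    """Return the (bucket, key) pairs referenced by one SQS message body."""
    envelope = json.loads(body)
    # With SNS in front of SQS, the S3 event is a JSON string nested
    # under the "Message" field of the SNS notification envelope.
    if envelope.get("Type") == "Notification" and "Message" in envelope:
        event = json.loads(envelope["Message"])
    else:
        event = envelope  # assumed: S3 notified the queue directly
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if s3 is None:
            continue  # not an object-level S3 record
        bucket = s3["bucket"]["name"]
        key = unquote_plus(s3["object"]["key"])  # keys arrive URL-encoded
        objects.append((bucket, key))
    return objects


# Hypothetical message shaped like an SNS-wrapped S3 "object created" event.
s3_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-logs"},
                "object": {"key": "AWSLogs/date%3D2019-01-01/elb.log.gz"},
            },
        }
    ]
}
sqs_body = json.dumps({"Type": "Notification", "Message": json.dumps(s3_event)})
print(s3_objects_from_sqs_body(sqs_body))
```

Each returned pair is what the forwarder would then download from S3. A real collector would also delete the SQS message only after the file has been read successfully.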

Kinesis Data Firehose Inputs (not Kinesis Data Streams)
▶ Data flow: VPC Flow Logs
• Sent to CloudWatch Logs
• Forwarded to Kinesis Data Firehose
• Pre-processed in Lambda
• Sent to Splunk HEC
▶ Data flow: CloudWatch Events
• Forwarded to Firehose
• Sent to Splunk HEC
▶ Advantages
• HA & Distributed
• Real-time (push based)
• Very Scalable
▶ Drawbacks
• Few supported sources
• One Firehose required per sourcetype
• Not possible to configure exclusively in the AWS Web Console
▶ Recommended for:
• VPC Flow Logs
• GuardDuty Events
▶ Usable for:
• Other events of interest (e.g. EC2 events)
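
The "pre-processed in Lambda" step of the VPC Flow Logs path can be sketched as a Firehose transformation handler. This is a minimal sketch under stated assumptions, not the blueprint AWS or Splunk ships: it assumes each Firehose record holds a base64-encoded, gzip-compressed CloudWatch Logs payload, that the delivery stream targets the HEC event endpoint, and that `aws:cloudwatchlogs:vpcflow` is the desired sourcetype.

```python
import base64
import gzip
import json

SOURCETYPE = "aws:cloudwatchlogs:vpcflow"  # assumed sourcetype for this stream


def transform_record(record):
    """Convert one Firehose record of CloudWatch Logs data into HEC events."""
    payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))
    if payload.get("messageType") != "DATA_MESSAGE":
        # Control messages carry no log events, so tell Firehose to drop them.
        return {"recordId": record["recordId"], "result": "Dropped"}
    events = [
        json.dumps(
            {
                "time": log_event["timestamp"] / 1000.0,  # milliseconds to seconds
                "sourcetype": SOURCETYPE,
                "source": payload.get("logGroup", ""),
                "event": log_event["message"],
            }
        )
        for log_event in payload["logEvents"]
    ]
    data = base64.b64encode("\n".join(events).encode("utf-8")).decode("ascii")
    return {"recordId": record["recordId"], "result": "Ok", "data": data}


def handler(event, context):
    """Lambda entry point: Firehose expects one result per incoming record."""
    return {"records": [transform_record(r) for r in event["records"]]}
```

One sourcetype is hard-coded per handler here, which mirrors the "one Firehose per sourcetype" drawback listed on the slide.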

AWS API-Based Inputs
▶ Data flow
• A Splunk HF polls the AWS API
• The response is indexed
• There is no Step 3
▶ Advantages
• Can collect from sources that are otherwise not collectable at all
• Very easy to configure
▶ Drawbacks
• Not HA
• Not Distributed
• Pull Required
▶ Recommended for:
• Inspector
• Config Rules
• CloudWatch
• "Description"
• Billing
▶ Usable for:
• Virtually anything available in AWS

AWS Permissions
The most challenging part of the process
NO!
YES!

AWS Permissions : Splunk
https://docs.splunk.com/Documentation/AddOns/released/AWS/ConfigureAWSpermissions
MAYBE?

Linking Arbitrary S3 Bucket to SNS
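
Linking a bucket to a topic largely comes down to a topic access policy that lets the S3 service publish. The sketch below builds such a policy as JSON. It is an illustration only: the function name, ARNs, bucket name and account ID are hypothetical, and the bucket's own event notification configuration still has to be set up separately.

```python
import json


def sns_topic_policy_for_bucket(topic_arn, bucket_name, account_id):
    """Build an SNS topic policy that allows one S3 bucket to publish events."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowS3ToPublish",
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "SNS:Publish",
                "Resource": topic_arn,
                "Condition": {
                    # Restrict publishing to the named bucket in the named account.
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket_name}"},
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical identifiers, for illustration only.
print(
    sns_topic_policy_for_bucket(
        "arn:aws:sns:eu-west-2:123456789012:example-log-events",
        "example-logs",
        "123456789012",
    )
)
```

Scoping the condition to a single bucket and account, rather than allowing any principal, is in the spirit of the least-privilege approach the permissions slides argue for.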

AWS/Splunk Debugging
▶ Permissions
▶ Check data flow in AWS
• S3 Bucket
• Traffic charts
▶ CA Validity
▶ Permissions

AWS/Splunk Debugging
▶ Splunk _internal
• index=_internal source!="/opt/splunk/var/log/splunk/splunkd_ui_access.log" (err* OR warn* OR crit*)
▶ AWS Policy Simulator
• https://policysim.aws.amazon.com/home/index.jsp

An Overview of SmartStore
Harry McLaren

Splunk SmartStore
Maintain Performance & Availability (Lower Storage Cost)
➢ Independently scale compute (CPUs) and data storage up or down based on business demands
➢ Automatically evaluates users’ data access patterns (via an app-aware cache), placing actively accessed data in local storage for real-time analytics; inactive data is moved to low-cost remote storage (any S3-compatible environment)
[Diagram: Search, Indexers, Storage]

Splunk SmartStore
Codename: S2 - Available in Splunk Enterprise 7.2+

Storage Architecture
▶ Local Storage
• Hot buckets are always on local storage [homePath]
− No change from classic architecture
▶ Remote Storage
• Buckets are copied to the [remotePath] when they roll from Hot → Warm
− Remote storage must provide data protection
− Splunk does not provide resiliency for buckets in remote storage
▶ Cache Manager
• Recently read buckets are also kept in the local cache
− New indexer functionality (not a new role)
− Each indexer has a cache manager that operates independently
− Retrieves buckets from the remote store when needed
− Evicts buckets from the local cache [homePath]

Getting Data In
Non-Clustered Deployments
1. Data arrives and is written to a Hot bucket
− This occurs using the standard indexing pipeline
2. The bucket rolls to warm
3. Bucket is registered with the cache manager
4. Cache manager uploads the bucket to the remote store
5. Bucket remains local and searchable until evicted by the cache manager
[Diagram: steps 1-5 shown between Hot/Cache Storage [homePath] and Remote Storage [remotePath]]

Getting Data In
Clustered Deployments
1. Data arrives and is written to a Hot bucket
2. Hot bucket streams to cluster peer(s) according to RF
3. Replication completes and the buckets roll to warm
4. Buckets are registered with their cache managers
5. Cache manager on source peer uploads the bucket to the remote store
6. Source peer notifies replication peers that the bucket was uploaded successfully
7. Replication peers delete their local copy of the bucket and retain a stub
8. Cached copy remains on the source peer until evicted by the local cache manager
[Diagram: steps 1-8 shown across two Hot/Cache Storage peers and Remote Storage]
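
The eight clustered steps can be traced with a toy simulation. This is a sketch of the sequence described on the slide, not of Splunk internals: the `Peer` class, the state names and the in-memory "remote store" are all hypothetical, and the number of replication peers stands in for RF minus one.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    # bucket_id -> state of the local copy: "hot", "warm" or "stub"
    local: dict = field(default_factory=dict)


def ingest_bucket(bucket_id, source, replicas, remote_store):
    """Walk one bucket through the clustered SmartStore ingest steps."""
    # Steps 1-2: the source writes a hot bucket and streams it to its peers.
    source.local[bucket_id] = "hot"
    for peer in replicas:
        peer.local[bucket_id] = "hot"
    # Step 3: replication completes and every copy rolls to warm.
    for peer in [source, *replicas]:
        peer.local[bucket_id] = "warm"
    # Steps 4-5: after registration, only the source peer uploads the bucket.
    remote_store.add(bucket_id)
    # Steps 6-7: peers are told the upload succeeded and keep only a stub.
    for peer in replicas:
        peer.local[bucket_id] = "stub"
    # Step 8: the source keeps its warm copy cached until it is evicted.


source = Peer("idx1")
replicas = [Peer("idx2"), Peer("idx3")]  # assumed RF of 3
remote_store = set()
ingest_bucket("bucket_001", source, replicas, remote_store)
print(source.local, [p.local for p in replicas], remote_store)
```

After the run, one full copy sits on the source peer, one in the remote store, and the replication peers hold stubs only, which is why the remote store itself has to provide data protection.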

Storage Configuration
indexes.conf, server.conf, limits.conf
▶ Volume Configuration
Conf File: indexes.conf
Parameters:
[volume:<volume_name>]
storageType = remote
path = <scheme>://<remote-location-specifier>
remote.s3.endpoint = <URL of S3 API>
remote.s3.secret_key =
remote.s3.access_key =
▶ Index Configuration
Conf File: indexes.conf
Parameters:
[<index_name>]
homePath =
coldPath = <path required, but not used>
remotePath = volume:<volume_name>/<index_name>
maxGlobalDataSizeMB =
frozenTimePeriodInSecs =
▶ Enabled Per-Index
− You can have a mix of SmartStore and classic indexes on the same indexer
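
To make the placeholders concrete, the sketch below renders the two stanzas with example values. Everything filled in is an assumption for illustration: the volume name, bucket, endpoint, index name, paths and limits are hypothetical, credentials are left out on purpose, and only the parameters shown on the slide are included, so this is not a complete index definition.

```python
def render_stanza(name, settings):
    """Render one .conf stanza as text, keeping the key order given."""
    lines = [f"[{name}]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


# Hypothetical remote volume backed by an S3 bucket.
volume = render_stanza(
    "volume:remote_store",
    {
        "storageType": "remote",
        "path": "s3://example-smartstore-bucket/indexes",
        "remote.s3.endpoint": "https://s3.eu-west-2.amazonaws.com",
    },
)

# Hypothetical SmartStore-enabled index that points at the volume above.
index = render_stanza(
    "example_index",
    {
        "homePath": "$SPLUNK_DB/example_index/db",
        "coldPath": "$SPLUNK_DB/example_index/colddb",  # required, but not used
        "remotePath": "volume:remote_store/example_index",
        "maxGlobalDataSizeMB": 500000,  # example size cap for the index
        "frozenTimePeriodInSecs": 7776000,  # 90 days
    },
)

print(volume + "\n\n" + index)
```

Note how `remotePath` refers to the volume by its stanza name, prefixed with `volume:`, which is what ties an individual index to the remote store.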

Searching with SmartStore
Some assertions about Splunk searches…
▶ Typically over near-term data
− Research has shown that 97% of searches look back 24 hours or less
− By default, the cache manager will attempt to cache buckets with recent events
▶ Typically have spatial and temporal locality
− If I find an event at a specific time or in a log, I will likely run additional searches against data at that time or in that log
− By default, the cache manager will attempt to cache recently accessed buckets

Hot Buckets
1. Search request is received
2. Indexer generates a list of relevant buckets to be searched
3. Search process is spawned
4. Spawned process reads the bucket list
5. Hot buckets are searched in the same manner as “classic” search
[Diagram: steps 1-5 shown against Hot/Cache Storage (bucket state: HOT) and Remote Storage]

Cached Buckets
1. Search process “opens” the bucket with the cache manager (Indexer)
2. Cache manager tells the search process that the bucket is local and available for search
3. Search process searches the bucket
4. Search process “closes” the bucket with the cache manager
[Diagram: steps 1-4 shown against Hot/Cache Storage (bucket state: CACHED) and Remote Storage]

Remote Buckets
1. Search process “opens” the bucket with the cache manager, but it isn’t in cache
2. Search process waits
3. Cache manager fetches the bucket from the remote store
4. Cache manager tells the search process that the bucket is local and available for search
5. Search process searches the bucket
6. Search process “closes” the bucket with the cache manager
7. Bucket remains in cache until evicted by the cache manager
[Diagram: steps 1-7 shown against Hot/Cache Storage (bucket state: CACHED) and Remote Storage]
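
The open, wait, fetch, search, close and evict cycle for cached and remote buckets can be modelled in a few lines. This is a minimal sketch, not Splunk's cache manager: the class and method names are hypothetical, the remote store is a plain dictionary, and least-recently-used eviction with a fixed bucket count is an assumption chosen to match the "recently accessed buckets" behaviour described in the slides.

```python
from collections import OrderedDict


class CacheManager:
    """Toy per-indexer cache manager with least-recently-used eviction."""

    def __init__(self, remote_store, capacity):
        self.remote_store = remote_store  # bucket_id -> bucket data
        self.capacity = capacity  # maximum number of cached buckets
        self.cache = OrderedDict()  # least recently used bucket first
        self.open_buckets = set()
        self.fetches = 0

    def open(self, bucket_id):
        """Make a bucket local, fetching it from the remote store if needed."""
        if bucket_id in self.cache:
            self.cache.move_to_end(bucket_id)  # cache hit: mark as recently used
        else:
            # Cache miss: the search process waits while the bucket is fetched.
            self.cache[bucket_id] = self.remote_store[bucket_id]
            self.fetches += 1
        self.open_buckets.add(bucket_id)
        self._evict()
        return self.cache[bucket_id]

    def close(self, bucket_id):
        """Release a bucket; it stays cached until eviction removes it."""
        self.open_buckets.discard(bucket_id)
        self._evict()

    def _evict(self):
        # Drop least recently used buckets that no search currently has open.
        for bucket_id in list(self.cache):
            if len(self.cache) <= self.capacity:
                break
            if bucket_id not in self.open_buckets:
                del self.cache[bucket_id]


remote = {"b1": "events-1", "b2": "events-2", "b3": "events-3"}
manager = CacheManager(remote, capacity=2)
for bucket_id in ["b1", "b2", "b1", "b3"]:
    manager.open(bucket_id)  # steps 1-4: open, possibly wait for a fetch
    manager.close(bucket_id)  # step 6: close once the search is done
print(list(manager.cache), manager.fetches)
```

In this run the second search of `b1` is served from cache, while fetching `b3` pushes out `b2`, the bucket that was used least recently.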

Resources
▶ Splunk Docs
• About SmartStore:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Indexer/AboutSmartStore
• SmartStore Architecture:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Indexer/SmartStorearchitecture
• How Indexing Works in SmartStore:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Indexer/SmartStoreindexing
• How Search Works in SmartStore:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Indexer/SmartStoresearching

Now Available
▶ Open Source
• https://gitlab.com/ecs_public_projects/splunk/ta-confversion
▶ SplunkBase
• https://splunkbase.splunk.com/app/4364/
▶ Continued Development
• Merge Requests Welcome
• Version 1.1 to release by end of week

Get Involved!
● Splunk User Group Edinburgh
– https://usergroups.splunk.com/group/splunk-user-group-edinburgh.html
– https://www.linkedin.com/groups/12013212
● Splunk’s Slack Group
– Register via http://splunk-usergroups.signup.team/
– Channel: #edinburgh
● Present & Share at the User Group?
Connect:
‣ Harry McLaren | harry.mclaren@ecssecurity.co.uk | @cyberharibu | harrymclaren.co.uk
