Storage is changing. We need new algorithms to deal
with it.
We are witnessing at least two revolutions in storage: (1) massive datasets and workloads, and (2) the rise of
scale-out commodity hardware. This whitepaper describes the Acunu Data Platform, and how Acunu is allowing
massive data workloads to take full advantage of
today’s hardware.

Acunu is rewriting the storage stack in the Linux ker-
nel for Massive Data thanks to world-class engineer-
ing and algorithms research.

Massive Data Workloads.

How have workloads changed? The workloads that
massive datasets demand of hardware typically
exhibit three main features:

•   Continuously high ingest rates (many thousands of
    updates/s, typically high-entropy, random updates)

•   Individual pieces of data are small, and aren’t valu-
    able in isolation (for example, stock ticks or ses-
    sion IDs)

•   Continual range queries are important for analytics
    (such as those demanded by Apache Hadoop)

This is in stark contrast to the ‘load, then query’
regimes of more traditional databases.

Understanding massive data means being able to
extract features and trends while the data is
continually updated. Existing platforms and
solutions cannot do this at scale with predictably
high performance. This is where Acunu comes in.

The first revolution is the rise of non-relational, or ‘NoSQL’, databases such as Cassandra, and analytics frameworks
and tools such as Hadoop. The driving force is using clusters of commodity machines to ingest large volumes of data,
process it, and serve it. Earlier technologies such as MySQL are traditionally cumbersome to operate at the scales
needed here. For many deployments in both enterprise and non-enterprise settings, these technologies are likely to
account for the majority of data stored where features such as high availability at low cost are more important than
transactional durability.

The second revolution is a hardware one. Commodity machines now typically possess many cores, and bear closer
resemblance to a supercomputer of the 90s than to a desktop of the same era. Hard drive capacity and sequential
bandwidth have been doubling every 18 months, as predicted; yet random IO performance has not improved. Solid-state
drives (SSDs) offer 2-3 orders of magnitude better random IO performance than hard drives. Clearly these have huge
potential to revolutionize the database world, if only the software stack can harness their performance.

Fundamental research = new possibilities.
The Acunu Storage Core is based on fundamental, patent-pending, algorithms and engineering research. This isn’t just
a better implementation of an existing idea, or about a shinier UI or management console (although our management
stack is also pretty cool). We are doing world-class research, engineering, patenting, and we publish at top confer-
ences. Why? This allows us to do things simply not
possible before. Here are some examples.

Fast, full versioning.

Versioning of large data sets is an incredibly powerful
tool. Not just low-performance snapshots for backups,
but high-performance, concurrently accessible
clones and snapshots of live datasets for test and
development, offering many users different, writable
views of the same large dataset, going back in time,
and much more.

Traditionally, the state-of-the-art in algorithms for ver-
sioning large data sets is based on a data structure
known as the ‘copy-on-write B-tree’ (CoW B-tree) -
this is ubiquitous in file systems and databases in-
cluding ZFS, WAFL, Btrfs, and more. The CoW B-tree (and most of its variants, such as append-only trees, log file sys-
tems, redirect-on-write, etc.) has three fundamental problems - (1) it is space-inefficient (and thus requires frequent
garbage collection); (2) it relies on random IO to scale (and thus performs poorly on rotational drives); and (3) it cannot
perform fast updates, even on SSDs.
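
To make the copy-on-write behaviour described above concrete, here is a minimal sketch. It is not a real B-tree and not
code from any of the file systems named: it is a fixed-depth tree with hypothetical `FANOUT` and `DEPTH` constants,
chosen only to show that every update rewrites each node on the root-to-leaf path while older versions stay readable.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

FANOUT = 4  # children per node (illustrative assumption)
DEPTH = 3   # node levels from root to leaf (illustrative assumption)


@dataclass(frozen=True)
class Node:
    """Immutable node: internal nodes hold child nodes, leaf nodes hold values."""
    slots: Tuple[Optional[object], ...] = (None,) * FANOUT


def _digits(key: int) -> list:
    """Split a key into one slot index per level, most significant first."""
    assert 0 <= key < FANOUT ** DEPTH
    return [(key // FANOUT ** (DEPTH - 1 - level)) % FANOUT for level in range(DEPTH)]


def cow_put(root: Node, key: int, value: object) -> Tuple[Node, int]:
    """Return (new_root, nodes_copied); the old root is left untouched."""
    copied = 0

    def rec(node: Optional[Node], path: list) -> Node:
        nonlocal copied
        node = node or Node()
        copied += 1  # every node on the root-to-leaf path is rewritten
        index, rest = path[0], path[1:]
        child = value if not rest else rec(node.slots[index], rest)
        return Node(node.slots[:index] + (child,) + node.slots[index + 1:])

    return rec(root, _digits(key)), copied


def get(root: Node, key: int) -> Optional[object]:
    """Look up a key in one version of the tree."""
    node = root
    for index in _digits(key):
        if node is None:
            return None
        node = node.slots[index]
    return node


v1, copied_v1 = cow_put(Node(), 27, "a")   # first version
v2, copied_v2 = cow_put(v1, 27, "b")       # second version; v1 is still valid
```

Each call copies `DEPTH` nodes regardless of how small the value is, and subtrees off the updated path are shared
between versions rather than duplicated.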

Acunu has invented a fundamentally new data structure - the Stratified B-tree - that addresses all the above problems.
Some details of this revolutionary data structure have been published: see [Twigg, Byde - Stratified B-trees and ver-
sioned dictionaries, USENIX HotStorage’11].

Designed for SSDs

Existing storage schemes do not address the fact that SSDs require addressing in a fundamentally different way. Al-
though they present a SATA/SAS interface and are sector-addressed, this is only to allow them to be a drop-in replace-
ment for hard drives. Extracting maximum performance and lifetimes requires two things: (1) the storage stack to un-
derstand how they operate; and (2) new data structures and algorithms that exploit their design characteristics.

By understanding how SSDs fundamentally work, Acunu has been able to engineer data structures that allow unprece-
dented long-term write performance, while guaranteeing device endurance.

Not just peak performance, but predictable performance.

By eliminating JVM-based garbage collection and memory management issues, and carefully controlling hardware ac-
cess from within the Linux kernel, Acunu is able to offer predictably high performance, even under sustained high loads,
with both ingest and analytic range queries - the perfect ingredients for any real-time analytics platform. Watch carefully
in future versions as Acunu begins to deploy fundamentally new offerings here, exploiting our back-end algorithmic
advantage.

SSDs - it’s all about endurance.
Flash SSDs are a fundamental change in storage technology, yet many systems treat them as if they were rotating hard
drives. Indeed, the legacy storage stack is filled with implicit assumptions about rotational drives. To exploit SSDs fully,
we need new algorithms and a stack that understands how flash SSDs fundamentally work.

What’s the problem?

Let’s start by considering why in-place updates to B-trees fail to give good performance on SSDs. The figure below
shows what happens to a fresh Intel X25M Flash SSD [1] under a simple workload: repeatedly write a random 512KB buffer
to a random 512KB-aligned offset. The device’s stated capacity is 160GB, and once the total volume written reaches
around this point, performance drops off dramatically. The take-away message is this: to get consistently high
performance from this device, we need to do something else. B-trees, or any other random-write-intensive data
structure, won’t work.
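
The workload above can be sketched roughly as follows. This is an illustration only, under loud assumptions: it uses
synchronous `os.pwrite` against an ordinary scratch file, whereas the measurement in [1] used Linux AIO direct to the
device at queue depth 32. The function name and parameters are hypothetical.

```python
import os
import random
import tempfile

BLOCK = 512 * 1024  # 512KB buffers and 512KB alignment, as in the workload above


def random_aligned_writes(path: str, capacity: int, count: int, seed: int = 0) -> list:
    """Write `count` random BLOCK-sized buffers at random BLOCK-aligned offsets.

    Returns the offsets written, in order. Assumption: a plain file stands in
    for the raw device, so this does not reproduce the measured numbers.
    """
    rng = random.Random(seed)
    slots = capacity // BLOCK          # number of aligned positions available
    offsets = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        for _ in range(count):
            offset = rng.randrange(slots) * BLOCK
            os.pwrite(fd, os.urandom(BLOCK), offset)  # high-entropy payload
            offsets.append(offset)
        os.fsync(fd)                   # push the writes down to the storage layer
    finally:
        os.close(fd)
    return offsets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scratch = os.path.join(tmp, "scratch.bin")
        written = random_aligned_writes(scratch, capacity=8 * BLOCK, count=16)
        print(f"wrote {len(written) * BLOCK} bytes")
```

To approach the experiment described here, one would keep writing until the total volume exceeds the device capacity
and record throughput over time.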

The reason for the drop-off once the write volume reaches the device capacity is quite complex, and depends on the
internal structure of the device; if you’re interested, read this great report [2] for a simulation-based analysis of
different SSD architectures. The basic reason is that although the flash memory chips have a 512KB erase block, most
SSDs implement an internal log structure (the magic ‘flash translation layer’ or FTL) for several reasons, most notably
because the bandwidth of the individual memory chips is relatively low, and to enable wear leveling and error
correction. This often makes the ‘effective’ logical erase block size much larger, typically hundreds of MB for
recent MLC devices. The result is that writes are at the mercy of the device’s FTL, which is the part manufacturers
keep quiet and closed.

Log file systems.

Many emerging file systems and storage products argue that append-only B-trees are perfectly suited to today’s hard-
ware, particularly SSDs. Is this true? The append-only B-tree has two major problems, which Acunu’s fundamental al-
gorithms research finally overcomes.

The CoW B-tree has a potentially big space blowup: to rewrite a 16-byte key/value pair in a tree of depth 3 with 256K
block size, you may have to do 3x256K random reads and then write 768K of data. In practice, some of these nodes
are cached and don’t need rewriting, but for random updates to large datasets, this is pretty close. Even if you don’t
care about space utilisation, when the device is full, you’ll be writing, on average, a lot of data per small random update,
and this means you’re no longer fast at writing. Unfortunately, other than heuristic tweaks or giving your machine gigan-
tic amounts of RAM, this is an inherent problem for append-only CoW indexes.
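
The arithmetic in the example above can be spelled out as a short calculation. This is a worst-case sketch that
assumes no node on the path is cached; the function and field names are hypothetical.

```python
def cow_update_cost(depth: int, block_size: int, payload: int) -> dict:
    """Worst-case IO for one uncached copy-on-write update of `payload` bytes."""
    bytes_read = depth * block_size      # read every node on the root-to-leaf path
    bytes_written = depth * block_size   # write a fresh copy of each of those nodes
    return {
        "bytes_read": bytes_read,
        "bytes_written": bytes_written,
        # bytes written to storage per byte of user data in this one update
        "write_amplification": bytes_written / payload,
    }


# The case from the text: 16-byte key/value pair, depth 3, 256K blocks.
cost = cow_update_cost(depth=3, block_size=256 * 1024, payload=16)
```

Here `cost["bytes_written"]` is 786432 bytes, i.e. the 768K from the text. The ratio is a per-update worst case;
caching the upper levels of the tree lowers it in practice.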

The classic Achilles heel of a log file system is garbage collection (cleaning) — recovering invalidated (e.g. overwritten)
blocks in order to reclaim sufficiently large contiguous regions of free space so that future writes can be efficient. Very
few guarantees are known for garbage collection in log file systems, particularly when the system does not experience
idle time, or is under low free space conditions. To make matters worse, the space blowup described above means that
CoW trees generate a lot of extra work for the garbage collector — at a 50x space blowup, the garbage collector has to
work 50x harder to keep ahead of the input stream.

Soules et al. (2003) [3] compare the metadata efficiency of a versioning file system using both CoW B-trees and a struc-
ture (CVFS) based on the Multi-version B-tree (MVBT) [4]. They find that, in many cases, the size of the CoW metadata
index exceeds the dataset size. In one trace, the versioned data occupies 123GB, yet the CoW metadata requires
152GB while the CVFS metadata requires 4GB, a saving of 97%.

Stratified B-trees.

Acunu has invented a fundamentally new data structure, the Stratified B-tree [5,6], that dominates CoW B-trees, with or
without log file systems. They can be written without append-only logs and heuristic-based garbage collectors. They
are the first data structure to offer provably optimal performance for full versioning (allowing updates in far less than 1
IO per update on average), use asymptotically optimal O(N) space, offer an optimal range of trade-offs between up-
dates and queries, and can generally avoid performing random IO for both updates and range queries. In particular, one
construction offers updates three orders of magnitude faster than CoW B-trees, and can answer range queries around
one order of magnitude faster than the CoW B-tree!
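
To give some intuition for how updates can cost far less than 1 IO each, here is a small sketch of a generic
merge-based dictionary built from sorted runs of doubling size. This is not the Stratified B-tree, whose construction
is given in [5,6]; it is unversioned, and the class name, the write counter and the `entries_per_block` figure are
all assumptions made for illustration.

```python
import bisect
import heapq
from typing import List, Optional


class DoublingArrays:
    """Sorted runs of size 1, 2, 4, ...; inserts only ever merge whole runs."""

    def __init__(self) -> None:
        self.levels: List[Optional[List[int]]] = []
        self.elements_written = 0  # total elements written out by all merges

    def insert(self, key: int) -> None:
        runs = [[key]]
        level = 0
        # Like incrementing a binary counter: gather every full level on the way up.
        while level < len(self.levels) and self.levels[level] is not None:
            runs.append(self.levels[level])
            self.levels[level] = None
            level += 1
        if level == len(self.levels):
            self.levels.append(None)
        merged = list(heapq.merge(*runs))  # one sequential pass over sorted runs
        self.levels[level] = merged
        self.elements_written += len(merged)

    def contains(self, key: int) -> bool:
        for run in self.levels:
            if run:
                i = bisect.bisect_left(run, key)
                if i < len(run) and run[i] == key:
                    return True
        return False


n = 1024
store = DoublingArrays()
for k in reversed(range(n)):
    store.insert(k)

entries_per_block = 64  # hypothetical: entries that fit in one storage block
writes_per_insert = store.elements_written / n
block_writes_per_insert = writes_per_insert / entries_per_block
```

Because merges write whole sorted runs sequentially, each block write carries many entries, so the amortized number
of block writes per insert stays well below one. The sketch counts writes only and ignores the reads done by merges.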




[1] Model Number: INTEL SSDSA2M160G2GC, Firmware Revision: 2CV102M, writes use Linux AIO direct to device
with queue depth 32.

[2] http://research.microsoft.com/apps/pubs/?id=63596

[3] http://www.hpl.hp.com/personal/Craig_Soules/papers/fast03.pdf

[4] http://portal.acm.org/citation.cfm?id=765851.765854

[5] A Twigg et al., Stratified B-trees and versioned dictionaries, USENIX HotStorage’11, 2011.

[6] A Byde, A Twigg, Stratified B-trees and versioned dictionaries (version with proofs), arXiv.org, 2011.

About Acunu.
Acunu is reengineering the storage stack from the ground up for the age of Massive Data. Based on fundamental
algorithms research and world-class engineering, the Acunu Platform allows applications such as Apache Cassandra and
Hadoop, along with many others, to (1) drive today’s commodity hardware harder than ever before, including many-core
architectures, SSDs and large SATA drives; (2) exploit new features in the Acunu Core (such as fast cloning and
versioning); and (3) obtain predictable, reliable high performance. Storage is the key to understanding Massive Data
and gaining competitive advantage. The Acunu Open Platform lets companies do this more quickly, easily and cheaply.

Acunu was founded in 2009 by researchers and engineers from Cambridge, Oxford, and several well-known high-tech
companies. We are backed by some of Europe’s top VCs, with total funding over $5.0M. We are based in London and
California.

Founders.

Dr Tim Moreton, CEO: Tim is an expert in distributed file systems. He holds a PhD from Cambridge, where he built a
distributed file system for the Xen project. He was previously at Tideway (now BMC), where he was lead engineer on a
number of data center projects.

Dr Andy Twigg, CTO: Andy has an outstanding track record of theoretical and applied computing research. He has
held positions at Cambridge University, Microsoft Research, Thomson Research and Oxford University. His PhD in 2006
on compact routing algorithms was nominated for the BCS Best Dissertation Award. He holds a Junior Research Fel-
lowship at Oxford University, where he is a member of the CS department.

Tom Wilkie, VP Engineering: Tom was one of the first UK employees at XenSource before its acquisition by Citrix in
2007. He worked on the XenCenter management stack and numerous customer projects. He has a BA in Computer
Science from Cambridge.

Dr John Wilkes, Technical Advisor: John is an advisor to Acunu. John led the Storage Systems group at HP Labs for
15 years, before moving to Google in 2008. John received his PhD from Cambridge in 1984, an Outstanding Contribu-
tion award from SNIA in 2001 and was made an ACM Fellow in 2002.

Acunu - Research overview

  • 1.
  • 2. Storage is changing. We need new algorithms to deal with it.

    We are witnessing at least two revolutions in storage: (1) massive datasets and workloads, and (2) the rise of scale-out commodity hardware. This whitepaper describes the Acunu Data Platform, and how Acunu is allowing massive data workloads to take full advantage of today’s hardware.

    Acunu is rewriting the storage stack in the Linux kernel for Massive Data thanks to world-class engineering and algorithms research.

    Massive Data Workloads.

    How have workloads changed? The workloads that massive datasets demand of hardware typically exhibit three main features:

    • Continuously high ingest rates (many thousands of updates/s, typically high-entropy, random updates)
    • Individual pieces of data are small, and aren’t valuable in isolation (for example, stock ticks or session IDs)
    • Continual range queries are important for analytics (such as demanded by Apache Hadoop)

    This is in stark contrast to the ‘load, then query’ regimes of more traditional databases.

    Understanding massive data means being able to extract features and trends, all while the data is continually updated. Existing platforms and solutions cannot do this at scale, with predictably high performance. This is where Acunu comes in.

    The first revolution is the rise of non-relational, or ‘NoSQL’, databases such as Cassandra, and analytics frameworks and tools such as Hadoop. The driving force is using clusters of commodity machines to ingest large volumes of data, process it, and serve it. Earlier technologies such as MySQL are cumbersome to operate at the scales needed here. For many deployments in both enterprise and non-enterprise settings, these technologies are likely to account for the majority of data stored where features such as high availability at low cost are more important than transactional durability.

    The second revolution is a hardware one. Commodity machines now typically possess many cores, and bear closer resemblance to a supercomputer of the 90s than a desktop of the same era. Hard drive capacity and sequential bandwidth have been doubling every 18 months, as predicted; yet random IO performance has not improved. Solid-state drives (SSDs) offer 2-3 orders of magnitude better random IO performance than hard drives. Clearly these have huge potential to revolutionize the database world, if only the software stack can harness their performance.
  • 3. Fundamental research = new possibilities.

    The Acunu Storage Core is based on fundamental, patent-pending algorithms and engineering research. This isn’t just a better implementation of an existing idea, or about a shinier UI or management console (although our management stack is also pretty cool). We are doing world-class research, engineering, patenting, and we publish at top conferences. Why? This allows us to do things simply not possible before. Here are some examples.

    Fast, full versioning.

    Versioning of large data sets is an incredibly powerful tool. Not just low-performance snapshots for backups, but high-performance, concurrently accessible clones and snapshots of live datasets for test and development, offering many users different, writeable views of the same large dataset, going back in time, and much more.

    Traditionally, the state-of-the-art in algorithms for versioning large data sets is based on a data structure known as the ‘copy-on-write B-tree’ (CoW B-tree) - this is ubiquitous in file systems and databases including ZFS, WAFL, Btrfs, and more. The CoW B-tree (and most of its variants, such as append-only trees, log file systems, redirect-on-write, etc.) has three fundamental problems - (1) it is space-inefficient (and thus requires frequent garbage collection); (2) it relies on random IO to scale (and thus performs poorly on rotational drives); and (3) it cannot perform fast updates, even on SSDs.

    Acunu has invented a fundamentally new data structure - the Stratified B-tree - that addresses all the above problems. Some details of this revolutionary data structure have been published: see [Twigg, Byde - Stratified B-trees and versioned dictionaries, USENIX HotStorage’11].

    Designed for SSDs.

    Existing storage schemes do not address the fact that SSDs require addressing in a fundamentally different way. Although they present a SATA/SAS interface and are sector-addressed, this is only to allow them to be a drop-in replacement for hard drives. Extracting maximum performance and lifetime requires two things: (1) the storage stack to understand how they operate; and (2) new data structures and algorithms that exploit their design characteristics. By understanding how SSDs fundamentally work, Acunu has been able to engineer data structures that allow unprecedented long-term write performance, while guaranteeing device endurance.

    Not just peak performance, but predictable performance.

    By eliminating JVM-based garbage collection and memory management issues, and carefully controlling hardware access from within the Linux kernel, Acunu is able to offer predictably high performance, even under sustained high loads, with both ingest and analytic range queries - the perfect ingredients for any real-time analytics platform. Watch carefully in future versions as Acunu begins to deploy fundamentally new offerings here, exploiting our back-end algorithmic advantage.
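
To make the versioning semantics described under ‘Fast, full versioning’ above concrete, here is a minimal sketch in which snapshots stay fixed while clones remain independently writeable. The class and method names (`VersionedDict`, `clone`, `put`, `get`) are hypothetical, and the parent-chain overlay is a deliberately naive stand-in: it is neither a CoW B-tree nor Acunu’s Stratified B-tree, and it makes no attempt at IO efficiency.

```python
class VersionedDict:
    """Toy versioned dictionary (illustrative only, not Acunu's design).

    Each version stores only the keys written in it and points at its
    parent; a lookup walks up the chain until the key is found.
    """

    def __init__(self):
        self._parent = [None]    # parent version id of each version
        self._delta = [{}]       # keys written directly in each version
        self._frozen = [False]   # frozen versions are read-only snapshots

    def clone(self, version):
        """Freeze `version` as a snapshot and return a new writeable child."""
        self._frozen[version] = True
        self._parent.append(version)
        self._delta.append({})
        self._frozen.append(False)
        return len(self._parent) - 1

    def put(self, version, key, value):
        if self._frozen[version]:
            raise ValueError("version is a read-only snapshot")
        self._delta[version][key] = value

    def get(self, version, key, default=None):
        v = version
        while v is not None:
            if key in self._delta[v]:
                return self._delta[v][key]
            v = self._parent[v]
        return default


store = VersionedDict()
store.put(0, "AAPL", 101)
dev = store.clone(0)     # version 0 becomes a snapshot
test = store.clone(0)    # a second writeable view of the same data
store.put(dev, "AAPL", 105)
print(store.get(0, "AAPL"), store.get(dev, "AAPL"), store.get(test, "AAPL"))
# prints: 101 105 101
```

In this toy, lookup cost grows with the depth of the version chain and nothing is laid out for sequential IO, which is exactly the kind of cost a purpose-built versioned index has to avoid.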
  • 4. SSDs - it’s all about endurance.

    Flash SSDs are a fundamental change in storage technology, yet many systems treat them as if they were rotating hard drives. Indeed, the legacy storage stack is filled with implicit assumptions about rotational drives. To exploit SSDs fully, we need new algorithms and a stack that understands how flash SSDs fundamentally work.

    What’s the problem?

    Let’s start by considering why in-place updates to B-trees fail to give good performance on SSDs. The figure below shows what happens to a fresh Intel X25M Flash SSD [1] under a simple workload: repeatedly write a random 512KB buffer to a random 512KB-aligned offset. The device’s stated capacity is 160GB, and once the total volume written approaches this point, performance drops off dramatically. The take-away message is this: to get consistently high performance from this device, we need to do something else. B-trees, or any other random-write-intensive data structure, won’t work.

    The reason for the drop-off once the write volume reaches the device capacity is quite complex, and depends on the internal structure of the device. If you’re interested, read this great report [2] for a simulation-based analysis of different SSD architectures. The basic reason is that although the flash memory chips have a 512KB erase block, most SSDs implement an internal log structure (the magic ‘flash translation layer’ or FTL) for several reasons, most notably because the bandwidth of these individual memory chips is relatively low, and to enable wear leveling and error correction. This often makes the effective logical erase block size much larger, typically around 100s of MBs for recent MLC devices. The result is that writes are at the mercy of the device’s FTL, which is the part manufacturers keep quiet and closed.

    Log file systems.

    Many emerging file systems and storage products argue that append-only B-trees are perfectly suited to today’s hardware, particularly SSDs. Is this true? The append-only B-tree has two major problems, which Acunu’s fundamental algorithms research finally overcomes.

    The CoW B-tree has a potentially big space blowup: to rewrite a 16-byte key/value pair in a tree of depth 3 with 256K block size, you may have to do 3x256K random reads and then write 768K of data. In practice, some of these nodes are cached and don’t need rewriting, but for random updates to large datasets, this is pretty close. Even if you don’t care about space utilisation, when the device is full, you’ll be writing, on average, a lot of data per small random update, and this means you’re no longer fast at writing. Unfortunately, other than heuristic tweaks or giving your machine gigantic amounts of RAM, this is an inherent problem for append-only CoW indexes.

    The classic Achilles heel of a log file system is garbage collection (cleaning): recovering invalidated (e.g. overwritten) blocks in order to reclaim sufficiently large contiguous regions of free space so that future writes can be efficient. Very few guarantees are known for garbage collection in log file systems, particularly when the system does not experience idle time, or is under low free space conditions. To make matters worse, the space blowup described above means that
  • 5. CoW trees generate a lot of extra work for the garbage collector: at a 50x space blowup, the garbage collector has to work 50x harder to keep ahead of the input stream.

    Soules et al. (2003) [3] compare the metadata efficiency of a versioning file system using both CoW B-trees and a structure (CVFS) based on the Multi-version B-tree (MVBT) [4]. They find that, in many cases, the size of the CoW metadata index exceeds the dataset size. In one trace, the versioned data occupies 123GB, yet the CoW metadata requires 152GB while the CVFS metadata requires 4GB, a saving of 97%.

    Stratified B-trees.

    Acunu has invented a fundamentally new data structure, the Stratified B-tree [5,6], that dominates CoW B-trees, with or without log file systems. Stratified B-trees can be written without append-only logs and heuristic-based garbage collectors. They are the first data structure to offer provably optimal performance for full versioning (allowing updates in far less than 1 IO per update on average), use asymptotically optimal O(N) space, offer an optimal range of trade-offs between updates and queries, and can generally avoid performing random IO for both updates and range queries. In particular, one construction offers updates three orders of magnitude faster than CoW B-trees, and can answer range queries around one order of magnitude faster than the CoW B-tree!

    [1] Model Number: INTEL SSDSA2M160G2GC, Firmware Revision: 2CV102M, writes use Linux AIO direct to device with queue depth 32.
    [2] http://research.microsoft.com/apps/pubs/?id=63596
    [3] http://www.hpl.hp.com/personal/Craig_Soules/papers/fast03.pdf
    [4] http://portal.acm.org/citation.cfm?id=765851.765854
    [5] A Twigg et al., Stratified B-trees and versioned dictionaries, USENIX HotStorage’11, 2011.
    [6] A Byde, A Twigg, Stratified B-trees and versioned dictionaries (version with proofs), arXiv.org, 2011.
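
The figures quoted in the CoW discussion above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, and the cost model is the uncached worst case from the text, in which one small update rewrites every block on the root-to-leaf path.

```python
KIB = 1024


def cow_update_cost(depth, block_size, payload):
    """Bytes moved when one update rewrites a full root-to-leaf path.

    Assumes no node is cached, so every block on the path is read and
    then rewritten (the worst case described in the text).
    """
    path_bytes = depth * block_size
    return {
        "bytes_read": path_bytes,
        "bytes_written": path_bytes,
        "write_amplification": path_bytes / payload,
    }


def metadata_saving(baseline_gb, alternative_gb):
    """Fraction of metadata space saved relative to the baseline."""
    return 1 - alternative_gb / baseline_gb


# 16-byte key/value pair, tree of depth 3, 256K blocks (numbers from the text).
cost = cow_update_cost(depth=3, block_size=256 * KIB, payload=16)
print(cost["bytes_written"] // KIB)        # 768 (KiB written per update)
print(cost["write_amplification"])         # 49152.0

# Soules et al.: 152GB of CoW metadata versus 4GB of CVFS metadata.
print(round(metadata_saving(152, 4) * 100))  # 97 (percent)
```

The per-update amplification here is an upper bound; as the text notes, cached nodes reduce it in practice, but it shows why a small random update can cost hundreds of kilobytes of writes.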
  • 6. About Acunu.

    Acunu is reengineering the storage stack from the ground up for the age of Massive Data. Based on fundamental algorithms research and world-class engineering, the Acunu Platform allows applications such as Apache Cassandra and Hadoop, along with many others, to (1) drive today’s commodity hardware harder than ever before, including many-core architectures, SSDs and large SATA drives; (2) exploit new features in the Acunu Core (such as fast cloning and versioning); and (3) obtain predictable, reliable high performance. Storage is the key to understanding Massive Data, and gaining competitive advantage. The Acunu Open Platform lets companies do this quicker, easier and cheaper.

    Acunu was founded in 2009 by researchers and engineers from Cambridge, Oxford, and several well-known high-tech companies. We are backed by some of Europe’s top VCs, with total funding over $5.0M. We are based in London and California.

    Founders.

    Dr Tim Moreton, CEO: Tim is an expert in distributed file systems. He holds a PhD from Cambridge, where he built a distributed file system for the Xen project. He was previously at Tideway (now BMC), where he was lead engineer on a number of data center projects.

    Dr Andy Twigg, CTO: Andy has an outstanding track record of theoretical and applied computing research. He has held positions at Cambridge University, Microsoft Research, Thomson Research and Oxford University. His PhD in 2006 on compact routing algorithms was nominated for the BCS Best Dissertation Award. He holds a Junior Research Fellowship at Oxford University, where he is a member of the CS department.

    Tom Wilkie, VP Engineering: Tom was one of the first UK employees at XenSource before its acquisition by Citrix in 2007. He worked on the XenCenter management stack and numerous customer projects. He has a BA in Computer Science from Cambridge.

    Dr John Wilkes, Technical Advisor: John is an advisor to Acunu. John led the Storage Systems group at HP Labs for 15 years, before moving to Google in 2008. John received his PhD from Cambridge in 1984, an Outstanding Contribution award from SNIA in 2001 and was made an ACM Fellow in 2002.