SlideShare a Scribd company logo
1 of 56
Download to read offline
Java Collections
The Force Awakens
Darth @RaoulUK
Darth @RichardWarburto
#javaforceawakens
Evolution can be interesting ...
Java 1.2 Java 10?
Collection API Improvements
Persistent & Immutable Collections
Performance Improvements
Collection bugs
1. Element access (Off-by-one error, ArrayOutOfBound)
2. Concurrent modification
3. Check-then-Act
Scenario 1
List<String> jedis = new ArrayList<>(asList("Luke", "yoda"));
for (String jedi: jedis) {
if (Character.isLowerCase(jedi.charAt(0))) {
jedis.remove(jedi);
}
}
Scenario 2
Map<String, BigDecimal> movieViews = new HashMap<>();
BigDecimal views = movieViews.get(MOVIE);
if(views != null) {
movieViews.put(MOVIE, views.add(BigDecimal.ONE));
}
views != nullmoviesViews.get movieViews.put
Then
Check Act
Reducing scope for bugs
● ~280 bugs in 28 projects including Cassandra, Lucene
● ~80% check-then-act bugs discovered are put-if-absent
● Library designers can help by updating APIs as new idioms emerge
● Different data structures can provide alternatives by restricting reads &
updates to reduce scope for bugs
CHECK-THEN-ACT Misuse of Java Concurrent Collections
http://dig.cs.illinois.edu/papers/checkThenAct.pdf
Java 9 API updates
Collection factory methods
● Non-goal to provide persistent immutable collections
● http://openjdk.java.net/jeps/269
Live Demo using jShell
http://iteratrlearning.com/java9/2016/11/09/java9-collection-factory-methods
Collection API Improvements
Persistent & Immutable Collections
Performance Improvements
Categorising Collections
Mutable
Immutable
Non-Persistent Persistent
Unsynchronized Concurrent
Unmodifiable View
Available in
Core Library
Mutable
● Popular friends include ArrayList, HashMap, TreeSet
● Memory-efficient modification operations
● State can be accidentally modified
● Can be thread-safe, but requires careful design
Unmodifiable
List<String> jedis = new ArrayList<>();
jedis.add("Luke Skywalker");
List<String> cantChangeMe = Collections.unmodifiableList(jedis);
// java.lang.UnsupportedOperationException
//cantChangeMe.add("Darth Vader");
System.out.println(cantChangeMe); // [Luke Skywalker]
jedis.add("Darth Vader");
System.out.println(cantChangeMe); // [Luke Skywalker, Darth Vader]
Immutable & Non-persistent
● No updates
● Flexibility to convert source in a more efficient representation
● No locking in context of concurrency
● Satisfies co-variant subtyping requirements
● Can be copied with modifications to create a new version (can be
expensive)
Immutable vs. Mutable hierarchy
ImmutableList MutableList
+ ImmutableList<T> toImmutable()
java.util.List
+ MutableList<T> toList()
Eclipse Collections (formaly GSCollections) https://projects.eclipse.org/projects/technology.collections/
ListIterable
Immutable and Persistent
● Changing source produces a new (version) of the collection
● Resulting collections shares structure with source to avoid full copying
on updates
LISP anyone?
Persistent List (aka Cons)
public final class Cons<T> implements ConsList<T> {
private final T head;
private final ConsList<T> tail;
public Cons(T head, ConsList<T> tail) {
this.head = head; this.tail = tail;
}
@Override
public ConsList<T> add(T e) {
return new Cons(e, this);
}
}
Updating Persistent List
A B C X Y Z
Before
Updating Persistent List
A B C X Y Z
Before
A B D
After
Blue nodes indicate new copies
Purple nodes indicates nodes we wish to update
Concatenating Two Persistent Lists
A B C
X Y Z
Before
Concatenating Two Persistent Lists
- Poor locality due to pointer chasing
- Copying of nodes
A B C
X Y Z
Before
A B C
After
Persistent List
● Structural sharing: no need to copy full structure
● Poor locality due to pointer chasing
● Copying becomes more expensive with larger lists
● Poor Random Access and thus Data Decomposition
Updating Persistent Binary Tree
Before
Updating Persistent Binary Tree
After
Persistent Array
How do we get the immutability benefits with performance of mutable
variants?
Trie
root
10 4520
3. Picking the right branch is done by using
parts of the key as a lookup
1. Branch factor
not limited to
binary
2. Leaf nodes
contain actual
values
a
a e
b
c
b c f
Persistent Array (Bitmapped Vector Trie)
... ...
... ...
... ...
... ...
.
.
.
.
.
.
1 31
0 1 31
Level 1 (root)
Level 2
Leaf nodes
Trade-offs
● Large branching factor facilitates iteration but hinders updates
● Small branching factor facilitates updates but hinders traversal
Java Persistent Collections
- Not available as part of Java Core Library
- Existing projects includes
- PCollections: https://github.com/hrldcpr/pcollections
- Port of Clojure DS: https://github.com/krukow/clj-ds
- Port of Scala DS: https://github.com/andrewoma/dexx
- Now also in Javaslang: http://javaslang.io
Memory usage survey
10,000,000 elements, heap < 32GB
int[] : 40MB
Integer[]: 160MB
ArrayList<Integer>: 215MB
PersistentVector<Integer>: 214MB (Clojure-DS)
Vector<Integer>: 206MB (Dexx, port of Scala-DS)
Data collected using Java Object Layout:
http://openjdk.java.net/projects/code-tools/jol/
Takeaways
● Immutable collections reduce the scope for bugs
● Always a compromise between programming safety and performance
● Performance of persistent data structure is improving
Collection API Improvements
Persistent & Immutable Collections
Performance Improvements
O(N)
O(1)
O(HYPERSPACE)
Primitive specialised collections
● Collections often hold boxed representations of primitive values
● Java 8 introduced IntStream, LongStream, DoubleStream and
primitive specialised functional interfaces
● Other libraries, eg: Agrona, Koloboke and Eclipse-Collections provide
primitive specialised collections today.
● Valhalla investigates primitive specialised generics
Java 8 Lazy Collection Initialization
Many allocated HashMaps and ArrayLists never written to, eg Null object
pattern
Java 8 adds Lazy Initialization for the default initialization case
Typically 1-2% reduction in memory consumption
http://www.javamagazine.mozaicreader.com/MarApr2016/Twitter#&pageS
et=28&page=0
HashMaps Basics
...
Han Solo
hash = 72309
Chewbacca
hash = 72309
Chaining Probing
HashMaps
a separate data
structure for
collision lookups
Store inline and
have a probing
sequence
Aliases: Palpatine vs Darth Sidious
Chaining Probing
HashMaps
aka Closed
Addressing
aka Open Hashing
aka Open
Addressing
aka Closed
Hashing
Chaining Probing
HashMaps
Linked List Based Tree Based
java.util.HashMap
Chaining Based HashMap
Historically maintained a LinkedList in the case of a collision
Problem: with high collision rates that the HashMap approaches O(N)
lookup
java.util.HashMap in Java 8
Starts by using a List to store colliding values.
Trees used when there are over 8 elements
Tree based nodes use about twice the memory
Make heavy collision lookup case O(log(N)) rather than O(N)
Relies on keys being Comparable
https://github.com/RichardWarburton/map-visualiser
So which HashMap is best?
Example Jar-Jar Benchmark
call get() on a single value for a map
of size 1
No model of the different factors that
affect things!
Tree Optimization - 60% Collisions
Tree Optimization - 10% Collisions
Probing vs Chaining
Probing Maps usually have lower memory consumption
Small Maps: Probing never has long clusters, can be up to 91% faster.
In large maps with high collision rates, probing scales poorly and can be
significantly slower.
Takeaways
There’s no clearcut “winner”.
JDK Implementations try to minimise worst case.
Linear Probing requires a good hashCode() distribution, Often hashmaps
“precondition” their hashes.
IdentityHashMap has low memory consumption and is fast, use it!
3rd Party libraries offer probing HashMaps, eg Koloboke & Eclipse-Collections.
Conclusions
Any Questions?
www.iteratrlearning.com
● Modern Development with Java 8
● Reactive and Asynchronous Java
● Java Software Development Bootcamp
#javaforceawakens
Further reading
Fast Functional Lists, Hash-Lists, Deques and Variable Length Arrays
https://infoscience.epfl.ch/record/64410/files/techlists.pdf
Smaller Footprint for Java Collections
http://www.lirmm.fr/~ducour/Doc-objets/ECOOP2012/ECOOP/ecoop/356.pdf
Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections
http://michael.steindorfer.name/publications/oopsla15.pdf
RRB-Trees: Efficient Immutable Vectors
https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf
Further reading
Doug Lea’s Analysis of the HashMap implementation tradeoffs
http://www.mail-archive.com/core-libs-dev@openjdk.java.net/msg02147.html
Java Specialists HashMap article
http://www.javaspecialists.eu/archive/Issue235.html
Sample and Benchmark Code
https://github.com/RichardWarburton/Java-Collections-The-Force-Awakens

More Related Content

What's hot

Java Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and TuningJava Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and TuningCarol McDonald
 
How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01
How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01
How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01Jérôme Rocheteau
 
Modern Java Workshop
Modern Java WorkshopModern Java Workshop
Modern Java WorkshopSimon Ritter
 
Oscon keynote: Working hard to keep it simple
Oscon keynote: Working hard to keep it simpleOscon keynote: Working hard to keep it simple
Oscon keynote: Working hard to keep it simpleMartin Odersky
 
Debugging Your Production JVM
Debugging Your Production JVMDebugging Your Production JVM
Debugging Your Production JVMkensipe
 
A Survey of Concurrency Constructs
A Survey of Concurrency ConstructsA Survey of Concurrency Constructs
A Survey of Concurrency ConstructsTed Leung
 
Advanced Production Debugging
Advanced Production DebuggingAdvanced Production Debugging
Advanced Production DebuggingTakipi
 
Scala eXchange opening
Scala eXchange openingScala eXchange opening
Scala eXchange openingMartin Odersky
 
Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)Marcos García
 
Python Workshop. LUG Maniapl
Python Workshop. LUG ManiaplPython Workshop. LUG Maniapl
Python Workshop. LUG ManiaplAnkur Shrivastava
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Jug trojmiasto 2014.04.24 tricky stuff in java grammar and javac
Jug trojmiasto 2014.04.24  tricky stuff in java grammar and javacJug trojmiasto 2014.04.24  tricky stuff in java grammar and javac
Jug trojmiasto 2014.04.24 tricky stuff in java grammar and javacAnna Brzezińska
 
Clojure concurrency
Clojure concurrencyClojure concurrency
Clojure concurrencyAlex Navis
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Martin Odersky
 
Clojure for Java developers
Clojure for Java developersClojure for Java developers
Clojure for Java developersJohn Stevenson
 

What's hot (20)

Java Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and TuningJava Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and Tuning
 
How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01
How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01
How Green are Java Best Coding Practices? - GreenDays @ Rennes - 2014-07-01
 
Modern Java Workshop
Modern Java WorkshopModern Java Workshop
Modern Java Workshop
 
Oscon keynote: Working hard to keep it simple
Oscon keynote: Working hard to keep it simpleOscon keynote: Working hard to keep it simple
Oscon keynote: Working hard to keep it simple
 
Debugging Your Production JVM
Debugging Your Production JVMDebugging Your Production JVM
Debugging Your Production JVM
 
A Survey of Concurrency Constructs
A Survey of Concurrency ConstructsA Survey of Concurrency Constructs
A Survey of Concurrency Constructs
 
Advanced Production Debugging
Advanced Production DebuggingAdvanced Production Debugging
Advanced Production Debugging
 
Scala eXchange opening
Scala eXchange openingScala eXchange opening
Scala eXchange opening
 
Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)Quick introduction to Java Garbage Collector (JVM GC)
Quick introduction to Java Garbage Collector (JVM GC)
 
Python Workshop. LUG Maniapl
Python Workshop. LUG ManiaplPython Workshop. LUG Maniapl
Python Workshop. LUG Maniapl
 
Lazy java
Lazy javaLazy java
Lazy java
 
Java best practices
Java best practicesJava best practices
Java best practices
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Streams in Java 8
Streams in Java 8Streams in Java 8
Streams in Java 8
 
Jug trojmiasto 2014.04.24 tricky stuff in java grammar and javac
Jug trojmiasto 2014.04.24  tricky stuff in java grammar and javacJug trojmiasto 2014.04.24  tricky stuff in java grammar and javac
Jug trojmiasto 2014.04.24 tricky stuff in java grammar and javac
 
Clojure concurrency
Clojure concurrencyClojure concurrency
Clojure concurrency
 
Kotlin
KotlinKotlin
Kotlin
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
 
Clojure for Java developers
Clojure for Java developersClojure for Java developers
Clojure for Java developers
 
Java 7
Java 7Java 7
Java 7
 

Similar to Java collections the force awakens

Java 7 Whats New(), Whats Next() from Oredev
Java 7 Whats New(), Whats Next() from OredevJava 7 Whats New(), Whats Next() from Oredev
Java 7 Whats New(), Whats Next() from OredevMattias Karlsson
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)
IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)
IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)Spark Summit
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"Jihyun Ahn
 
Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011Agora Group
 
Concurrency Constructs Overview
Concurrency Constructs OverviewConcurrency Constructs Overview
Concurrency Constructs Overviewstasimus
 
あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法x1 ichi
 
Getting started with Clojure
Getting started with ClojureGetting started with Clojure
Getting started with ClojureJohn Stevenson
 
Eric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build systemEric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build systemGuardSquare
 
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorganShared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorganHazelcast
 
Scala clojure techday_2011
Scala clojure techday_2011Scala clojure techday_2011
Scala clojure techday_2011Thadeu Russo
 
Framework engineering JCO 2011
Framework engineering JCO 2011Framework engineering JCO 2011
Framework engineering JCO 2011YoungSu Son
 
Eric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build systemEric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build systemGuardSquare
 

Similar to Java collections the force awakens (20)

Devoxx
DevoxxDevoxx
Devoxx
 
Java 7 Whats New(), Whats Next() from Oredev
Java 7 Whats New(), Whats Next() from OredevJava 7 Whats New(), Whats Next() from Oredev
Java 7 Whats New(), Whats Next() from Oredev
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)
IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)
IndexedRDD: Efficeint Fine-Grained Updates for RDD's-(Ankur Dave, UC Berkeley)
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
 
Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011
 
Concurrency Constructs Overview
Concurrency Constructs OverviewConcurrency Constructs Overview
Concurrency Constructs Overview
 
あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法
 
Clojure And Swing
Clojure And SwingClojure And Swing
Clojure And Swing
 
Getting started with Clojure
Getting started with ClojureGetting started with Clojure
Getting started with Clojure
 
Eric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build systemEric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build system
 
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorganShared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
 
Lambdas puzzler - Peter Lawrey
Lambdas puzzler - Peter LawreyLambdas puzzler - Peter Lawrey
Lambdas puzzler - Peter Lawrey
 
Scala clojure techday_2011
Scala clojure techday_2011Scala clojure techday_2011
Scala clojure techday_2011
 
Forgive me for i have allocated
Forgive me for i have allocatedForgive me for i have allocated
Forgive me for i have allocated
 
Framework engineering JCO 2011
Framework engineering JCO 2011Framework engineering JCO 2011
Framework engineering JCO 2011
 
Eric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build systemEric Lafortune - The Jack and Jill build system
Eric Lafortune - The Jack and Jill build system
 

More from RichardWarburton

Fantastic performance and where to find it
Fantastic performance and where to find itFantastic performance and where to find it
Fantastic performance and where to find itRichardWarburton
 
Production profiling what, why and how technical audience (3)
Production profiling  what, why and how   technical audience (3)Production profiling  what, why and how   technical audience (3)
Production profiling what, why and how technical audience (3)RichardWarburton
 
Production profiling: What, Why and How
Production profiling: What, Why and HowProduction profiling: What, Why and How
Production profiling: What, Why and HowRichardWarburton
 
Production profiling what, why and how (JBCN Edition)
Production profiling  what, why and how (JBCN Edition)Production profiling  what, why and how (JBCN Edition)
Production profiling what, why and how (JBCN Edition)RichardWarburton
 
Production Profiling: What, Why and How
Production Profiling: What, Why and HowProduction Profiling: What, Why and How
Production Profiling: What, Why and HowRichardWarburton
 
Jvm profiling under the hood
Jvm profiling under the hoodJvm profiling under the hood
Jvm profiling under the hoodRichardWarburton
 
Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)RichardWarburton
 
Twins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingTwins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingRichardWarburton
 
Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8RichardWarburton
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behaveRichardWarburton
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behaveRichardWarburton
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictabilityRichardWarburton
 
Simplifying java with lambdas (short)
Simplifying java with lambdas (short)Simplifying java with lambdas (short)
Simplifying java with lambdas (short)RichardWarburton
 
Lambdas myths-and-mistakes
Lambdas myths-and-mistakesLambdas myths-and-mistakes
Lambdas myths-and-mistakesRichardWarburton
 
Lambdas: Myths and Mistakes
Lambdas: Myths and MistakesLambdas: Myths and Mistakes
Lambdas: Myths and MistakesRichardWarburton
 

More from RichardWarburton (20)

Fantastic performance and where to find it
Fantastic performance and where to find itFantastic performance and where to find it
Fantastic performance and where to find it
 
Production profiling what, why and how technical audience (3)
Production profiling  what, why and how   technical audience (3)Production profiling  what, why and how   technical audience (3)
Production profiling what, why and how technical audience (3)
 
Production profiling: What, Why and How
Production profiling: What, Why and HowProduction profiling: What, Why and How
Production profiling: What, Why and How
 
Production profiling what, why and how (JBCN Edition)
Production profiling  what, why and how (JBCN Edition)Production profiling  what, why and how (JBCN Edition)
Production profiling what, why and how (JBCN Edition)
 
Production Profiling: What, Why and How
Production Profiling: What, Why and HowProduction Profiling: What, Why and How
Production Profiling: What, Why and How
 
Jvm profiling under the hood
Jvm profiling under the hoodJvm profiling under the hood
Jvm profiling under the hood
 
How to run a hackday
How to run a hackdayHow to run a hackday
How to run a hackday
 
Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)
 
Twins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingTwins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional Programming
 
Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behave
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behave
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
Simplifying java with lambdas (short)
Simplifying java with lambdas (short)Simplifying java with lambdas (short)
Simplifying java with lambdas (short)
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
The Bleeding Edge
The Bleeding EdgeThe Bleeding Edge
The Bleeding Edge
 
Lambdas myths-and-mistakes
Lambdas myths-and-mistakesLambdas myths-and-mistakes
Lambdas myths-and-mistakes
 
Caching in
Caching inCaching in
Caching in
 
Lambdas: Myths and Mistakes
Lambdas: Myths and MistakesLambdas: Myths and Mistakes
Lambdas: Myths and Mistakes
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Java collections the force awakens

  • 1. Java Collections The Force Awakens Darth @RaoulUK Darth @RichardWarburto #javaforceawakens
  • 2. Evolution can be interesting ... Java 1.2 Java 10?
  • 3. Collection API Improvements Persistent & Immutable Collections Performance Improvements
  • 4. Collection bugs 1. Element access (Off-by-one error, ArrayOutOfBound) 2. Concurrent modification 3. Check-then-Act
  • 5. Scenario 1 List<String> jedis = new ArrayList<>(asList("Luke", "yoda")); for (String jedi: jedis) { if (Character.isLowerCase(jedi.charAt(0))) { jedis.remove(jedi); } }
  • 6. Scenario 2 Map<String, BigDecimal> movieViews = new HashMap<>(); BigDecimal views = movieViews.get(MOVIE); if(views != null) { movieViews.put(MOVIE, views.add(BigDecimal.ONE)); } views != nullmoviesViews.get movieViews.put Then Check Act
  • 7. Reducing scope for bugs ● ~280 bugs in 28 projects including Cassandra, Lucene ● ~80% check-then-act bugs discovered are put-if-absent ● Library designers can help by updating APIs as new idioms emerge ● Different data structures can provide alternatives by restricting reads & updates to reduce scope for bugs CHECK-THEN-ACT Misuse of Java Concurrent Collections http://dig.cs.illinois.edu/papers/checkThenAct.pdf
  • 8. Java 9 API updates Collection factory methods ● Non-goal to provide persistent immutable collections ● http://openjdk.java.net/jeps/269 Live Demo using jShell http://iteratrlearning.com/java9/2016/11/09/java9-collection-factory-methods
  • 9. Collection API Improvements Persistent & Immutable Collections Performance Improvements
  • 10. Categorising Collections Mutable Immutable Non-Persistent Persistent Unsynchronized Concurrent Unmodifiable View Available in Core Library
  • 11. Mutable ● Popular friends include ArrayList, HashMap, TreeSet ● Memory-efficient modification operations ● State can be accidentally modified ● Can be thread-safe, but requires careful design
  • 12. Unmodifiable List<String> jedis = new ArrayList<>(); jedis.add("Luke Skywalker"); List<String> cantChangeMe = Collections.unmodifiableList(jedis); // java.lang.UnsupportedOperationException //cantChangeMe.add("Darth Vader"); System.out.println(cantChangeMe); // [Luke Skywalker] jedis.add("Darth Vader"); System.out.println(cantChangeMe); // [Luke Skywalker, Darth Vader]
  • 13.
  • 14. Immutable & Non-persistent ● No updates ● Flexibility to convert source in a more efficient representation ● No locking in context of concurrency ● Satisfies co-variant subtyping requirements ● Can be copied with modifications to create a new version (can be expensive)
  • 15. Immutable vs. Mutable hierarchy ImmutableList MutableList + ImmutableList<T> toImmutable() java.util.List + MutableList<T> toList() Eclipse Collections (formaly GSCollections) https://projects.eclipse.org/projects/technology.collections/ ListIterable
  • 16. Immutable and Persistent ● Changing source produces a new (version) of the collection ● Resulting collections shares structure with source to avoid full copying on updates
  • 18. Persistent List (aka Cons) public final class Cons<T> implements ConsList<T> { private final T head; private final ConsList<T> tail; public Cons(T head, ConsList<T> tail) { this.head = head; this.tail = tail; } @Override public ConsList<T> add(T e) { return new Cons(e, this); } }
  • 19. Updating Persistent List A B C X Y Z Before
  • 20. Updating Persistent List A B C X Y Z Before A B D After Blue nodes indicate new copies Purple nodes indicates nodes we wish to update
  • 21. Concatenating Two Persistent Lists A B C X Y Z Before
  • 22. Concatenating Two Persistent Lists - Poor locality due to pointer chasing - Copying of nodes A B C X Y Z Before A B C After
  • 23. Persistent List ● Structural sharing: no need to copy full structure ● Poor locality due to pointer chasing ● Copying becomes more expensive with larger lists ● Poor Random Access and thus Data Decomposition
  • 26. Persistent Array How do we get the immutability benefits with performance of mutable variants?
  • 27. Trie root 10 4520 3. Picking the right branch is done by using parts of the key as a lookup 1. Branch factor not limited to binary 2. Leaf nodes contain actual values a a e b c b c f
  • 28. Persistent Array (Bitmapped Vector Trie) ... ... ... ... ... ... ... ... . . . . . . 1 31 0 1 31 Level 1 (root) Level 2 Leaf nodes
  • 29. Trade-offs ● Large branching factor facilitates iteration but hinders updates ● Small branching factor facilitates updates but hinders traversal
  • 30. Java Persistent Collections - Not available as part of Java Core Library - Existing projects includes - PCollections: https://github.com/hrldcpr/pcollections - Port of Clojure DS: https://github.com/krukow/clj-ds - Port of Scala DS: https://github.com/andrewoma/dexx - Now also in Javaslang: http://javaslang.io
  • 31. Memory usage survey 10,000,000 elements, heap < 32GB int[] : 40MB Integer[]: 160MB ArrayList<Integer>: 215MB PersistentVector<Integer>: 214MB (Clojure-DS) Vector<Integer>: 206MB (Dexx, port of Scala-DS) Data collected using Java Object Layout: http://openjdk.java.net/projects/code-tools/jol/
  • 32. Takeaways ● Immutable collections reduce the scope for bugs ● Always a compromise between programming safety and performance ● Performance of persistent data structure is improving
  • 33. Collection API Improvements Persistent & Immutable Collections Performance Improvements
  • 34.
  • 36. Primitive specialised collections ● Collections often hold boxed representations of primitive values ● Java 8 introduced IntStream, LongStream, DoubleStream and primitive specialised functional interfaces ● Other libraries, eg: Agrona, Koloboke and Eclipse-Collections provide primitive specialised collections today. ● Valhalla investigates primitive specialised generics
  • 37. Java 8 Lazy Collection Initialization Many allocated HashMaps and ArrayLists never written to, eg Null object pattern Java 8 adds Lazy Initialization for the default initialization case Typically 1-2% reduction in memory consumption http://www.javamagazine.mozaicreader.com/MarApr2016/Twitter#&pageS et=28&page=0
  • 38.
  • 39. HashMaps Basics ... Han Solo hash = 72309 Chewbacca hash = 72309
  • 40. Chaining Probing HashMaps a separate data structure for collision lookups Store inline and have a probing sequence
  • 41. Aliases: Palpatine vs Darth Sidious
  • 42. Chaining Probing HashMaps aka Closed Addressing aka Open Hashing aka Open Addressing aka Closed Hashing
  • 44. java.util.HashMap Chaining Based HashMap Historically maintained a LinkedList in the case of a collision Problem: with high collision rates that the HashMap approaches O(N) lookup
  • 45. java.util.HashMap in Java 8 Starts by using a List to store colliding values. Trees used when there are over 8 elements Tree based nodes use about twice the memory Make heavy collision lookup case O(log(N)) rather than O(N) Relies on keys being Comparable https://github.com/RichardWarburton/map-visualiser
  • 46. So which HashMap is best?
  • 47. Example Jar-Jar Benchmark call get() on a single value for a map of size 1 No model of the different factors that affect things!
  • 48. Tree Optimization - 60% Collisions
  • 49. Tree Optimization - 10% Collisions
  • 50. Probing vs Chaining Probing Maps usually have lower memory consumption Small Maps: Probing never has long clusters, can be up to 91% faster. In large maps with high collision rates, probing scales poorly and can be significantly slower.
  • 51. Takeaways There’s no clearcut “winner”. JDK Implementations try to minimise worst case. Linear Probing requires a good hashCode() distribution, Often hashmaps “precondition” their hashes. IdentityHashMap has low memory consumption and is fast, use it! 3rd Party libraries offer probing HashMaps, eg Koloboke & Eclipse-Collections.
  • 53.
  • 54. Any Questions? www.iteratrlearning.com ● Modern Development with Java 8 ● Reactive and Asynchronous Java ● Java Software Development Bootcamp #javaforceawakens
  • 55. Further reading Fast Functional Lists, Hash-Lists, Deques and Variable Length Arrays https://infoscience.epfl.ch/record/64410/files/techlists.pdf Smaller Footprint for Java Collections http://www.lirmm.fr/~ducour/Doc-objets/ECOOP2012/ECOOP/ecoop/356.pdf Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections http://michael.steindorfer.name/publications/oopsla15.pdf RRB-Trees: Efficient Immutable Vectors https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf
  • 56. Further reading Doug Lea’s Analysis of the HashMap implementation tradeoffs http://www.mail-archive.com/core-libs-dev@openjdk.java.net/msg02147.html Java Specialists HashMap article http://www.javaspecialists.eu/archive/Issue235.html Sample and Benchmark Code https://github.com/RichardWarburton/Java-Collections-The-Force-Awakens