Se ha denunciado esta presentación.
Utilizamos tu perfil de LinkedIn y tus datos de actividad para personalizar los anuncios y mostrarte publicidad más relevante. Puedes cambiar tus preferencias de publicidad en cualquier momento.

[Jfokus] Riding the Jet Streams

199 visualizaciones

Publicado el

Java 8 introduced the Stream API as a modern, functional, and very powerful tool for processing collections of data. One of the main benefits of the Stream API is that it hides the details of iteration over the underlying data set, allowing for parallel processing within a single JVM, using a fork/join framework. I will talk about a Stream API implementation that enables parallel processing across many machines and many JVMs. With an explanation of internals of the implementation, I will give an introduction to the general design behind stream processing using DAG (directed acyclic graph) engines and how an actor-based implementation can provide in-memory performance while still leveraging industry-wide known frameworks as Java Streams API.
https://www.jfokus.se/jfokus/talks.jsp#RidingtheJetStreams

Publicado en: Tecnología
  • Hi there! Get Your Professional Job-Winning Resume Here - Check our website! http://bit.ly/resumpro
       Responder 
    ¿Estás seguro?    No
    Tu mensaje aparecerá aquí
  • Sé el primero en recomendar esto

[Jfokus] Riding the Jet Streams

  1. 1. Riding Jet Streams @gAmUssA @hazelcast #jfokus #hazelcastjet http://bit.ly/streams_jfokus2017
  2. 2. @gAmUssA @hazelcast #jfokus #hazelcastjet Solutions Architect @Hazelcast Developer Advocate @Hazelcast @gamussa in internetz Please, follow me on Twitter I’m very interesting © > whoami
  3. 3. @gAmUssA @hazelcast #jfokus #hazelcastjet Agenda Quick refresh on Java 8 Streams Distribute and Conquer Distributed Data Distributed Streams How we did all this
  4. 4. @gAmUssA @hazelcast #jfokus #hazelcastjet Example: Word Count Map<Integer, String> where keys are line numbers and values are lines. Find how many times each word occurs
  5. 5. @gAmUssA @hazelcast #jfokus #hazelcastjet What needs to be done? Iterate through all the lines Split the line into words Update running total of counts with new word
  6. 6. fillMapWithData("war_and_peace_eng.txt", source); for (String line : source.values()) { for (String word : PATTERN.split(line)) { if (word.length() >= 5) count.compute( cleanWord(word).toLowerCase(), (w, c) -> c == null ? 1 : c + 1 ); } } System.out.println(count.get("andrew")); Iterate through all the lines
  7. 7. fillMapWithData("war_and_peace_eng.txt", source); for (String line : source.values()) { for (String word : PATTERN.split(line)) { if (word.length() >= 5) count.compute( cleanWord(word).toLowerCase(), (w, c) -> c == null ? 1 : c + 1 ); } } System.out.println(count.get("andrew")); Split the line into words
  8. 8. fillMapWithData("war_and_peace_eng.txt", source); for (String line : source.values()) { for (String word : PATTERN.split(line)) { if (word.length() >= 5) count.compute( cleanWord(word).toLowerCase(), (w, c) -> c == null ? 1 : c + 1 ); } } System.out.println(count.get("andrew")); Update running total of counts with new word
  9. 9. fillMapWithData("war_and_peace_eng.txt", source); for (String line : source.values()) { for (String word : PATTERN.split(line)) { if (word.length() >= 5) count.compute( cleanWord(word).toLowerCase(), (w, c) -> c == null ? 1 : c + 1 ); } } System.out.println(count.get("andrew")); Print the result
  10. 10. java.util.stream
  11. 11. @gAmUssA @hazelcast #jfokus #hazelcastjet Java 8 Streams… An abstraction represents a sequence of elements Is not a data structure Convey elements from a source through a pipeline of operations Operation doesn’t modify a source
  12. 12. @gAmUssA @hazelcast #jfokus #hazelcastjet Why I should care about Stream API? You’re Java developer
  13. 13. What does regular Java developer think about Scala? advanced
  14. 14. @gAmUssA @hazelcast #jfokus #hazelcastjet Why I should care about Stream API? You’re Java developer Many Java developers know Java It’s all about data processing
  15. 15. @gAmUssA @hazelcast #jfokus #hazelcastjet java.util.stream map(), flatMap(), filter() reduce(), collect() sorted(), distinct() Intermediate operation Terminal operation Stateful Intermediate (Blocking) operation
  16. 16. @gAmUssA @hazelcast #jfokus #hazelcastjet Why would one need a cluster? One does not simply fit all Big Data in one machine
  17. 17. @gAmUssA @hazelcast #jfokus #hazelcastjet Problem Data doesn’t fit just one machine
  18. 18. @gAmUssA @hazelcast #jfokus #hazelcastjet Why would one need a cluster? One does not simply put all Big Data in one machine Data is too important to have it only one machine
  19. 19. Replication on Sharding? http://book.mixu.net/distsys/single-page.html
  20. 20. @gAmUssA @hazelcast #jfokus #hazelcastjet Another Requirements Easy to use Simple API Embeddable Cloud Native
  21. 21. @gAmUssA @hazelcast #jfokus #hazelcastjet What’s Hazelcast IMDG? In-memory Data Grid Apache v2 Licensed Distributed Caches (IMap, JCache) Java Collections (IList, ISet, IQueue) Messaging (Topic, RingBuffer) Computation (ExecutorService, M-R)
  22. 22. 1 900 StarsOn GitHub 100%Open Source 134 contributors
  23. 23. @gAmUssA @hazelcast #jfokus #hazelcastjet Green Primary Green Backup Green Shard
  24. 24. @gAmUssA @hazelcast #jfokus #hazelcastjet What’s the problem? Use IMap.values().stream() ? Or IMap.entrySet().stream() ? 3 3
  25. 25. @gAmUssA @hazelcast #jfokus #hazelcastjet Problem Data doesn’t fit just one machine
  26. 26. @gAmUssA @hazelcast #jfokus #hazelcastjet EASY (actually, not)! Implement serializable version of the interfaces Introducing DistributedStream 3 7
  27. 27. 3 8
  28. 28. 4 0 Jet Streams
  29. 29. jet.hazelcast.org
  30. 30. @gAmUssA @hazelcast #jfokus #hazelcastjet What’s Hazelcast Jet? General purpose distributed data processing framework Based on Direct Acyclic Graph to model data flow Built on top of Hazelcast IMDG Comparable to Apache Spark or Apache Flink 4 2
  31. 31. @gAmUssA @hazelcast #jfokus #hazelcastjet DAG vertex vertex vertex vertex SOURCE SINK
  32. 32. Benchmarks Compared to Spark, Flink, Hadoop doing word count, running on a cluster of 9 nodes, 40 cores each
  33. 33. @gAmUssA @hazelcast #jfokus #hazelcastjet Future (It’s bright!) Processing guarantees for stream processing Streaming features (windowing, triggering) Higher level streaming and batching APIs Integration with additional Hazelcast structures (ICache, IQueue ..)
  34. 34. @gAmUssA @hazelcast #jfokus #hazelcastjet Future (It’s bright!) Event sourcing / CQRS Off-heap memory support RxJava More connectors to additional sources (JMS, JDBC..)
  35. 35. @gAmUssA @hazelcast #jfokus #hazelcastjet Grab while it’s hot! jet.hazelcast.org hazelcast/hazelcast-jet http://bit.ly/streams_jfokus2017 documentation Source on Github Presentation materials
  36. 36. @gAmUssA @hazelcast #jfokus #hazelcastjet Conclusion Java Stream API provides very white range of data processing tools War And Piece – is a Big (a lot of data) Book! Now we’re pretty sure that Andrew and Pierre are the main characters
  37. 37. SlidesCarnival icons are editable shapes. This means that you can: ● Resize them without losing quality. ● Change fill color and opacity. Isn’t that nice? :) Examples:
  38. 38. Now you can use any emoji as an icon! And of course it resizes without losing quality and you can change the color. How? Follow Google instructions https://twitter.com/googledocs/status/730087240156643328 ✋👆👉👍👤👦👧👨👩👪💃🏃💑❤😂 😉😋😒😭👶😸🐟🍒🍔💣📌📖🔨🎃🎈 🎨🏈🏰🌏🔌🔑and many more... 😉
  39. 39. @gAmUssA @hazelcast #jfokus #hazelcastjet Extra graphics

×