Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams Talk at Columbia University

Safe Automated Refactoring for Intelligent
Parallelization of Java 8 Streams
Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, Syed Ahmed
Columbia University, April 25, 2019
Based on work to appear at the ACM/IEEE International Conference on
Software Engineering (ICSE ’19), Montreal, Canada and the IEEE International
Working Conference on Source Code Analysis and Manipulation (SCAM ’18),
Madrid, Spain (Distinguished Paper Award).

Streaming APIs
• Streaming APIs are widely-available in today’s mainstream,
Object-Oriented programming languages [Biboudis et al., 2015].
• Incorporate MapReduce-like operations on native data structures like
collections.
• Can make writing parallel code easier, less error-prone (avoid data
races, thread contention).

Streaming API Example in Java >= 8
Consider this simple Widget class consisting of a color and a weight:
// Widget class:
public class Widget {

    // enumeration:
    public enum Color {
        RED,
        BLUE,
        GREEN
    }

    // instance fields:
    private Color color;
    private double weight;

    // constructor:
    Widget(Color c, double w) {
        this.color = c;
        this.weight = w;
    }

    // accessors/mutators:
    public Color getColor() {
        return this.color;
    }

    public double getWeight() {
        return this.weight;
    } // ...
}

Consider the following Widget client code:
// an "unordered" collection of widgets.
Collection<Widget> unorderedWidgets = new HashSet<>();
// populate the collection ...

Now suppose we would like to sort the collection by weight using the
Java 8 Streaming API:
// sort widgets by weight.
List<Widget> sortedWidgets = unorderedWidgets
    .stream()
    .sorted(Comparator.comparing(Widget::getWeight))
    .collect(Collectors.toList());

Now suppose we would like to sort the collection by weight using the
Java 8 Streaming API in parallel:
// sort widgets by weight in parallel.
List<Widget> sortedWidgets = unorderedWidgets
    .parallelStream()
    .sorted(Comparator.comparing(Widget::getWeight))
    .collect(Collectors.toList());
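As a minimal, self-contained sketch of the two variants above: the class name ParallelSortSketch, the helper sortedWeights, and the weight-only Widget stand-in are hypothetical names introduced only for illustration. The point is that the two pipelines differ only in the stream factory method and should produce the same sorted result.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

class ParallelSortSketch {
    // Minimal stand-in for the Widget class from the slides (weight only).
    static final class Widget {
        private final double weight;
        Widget(double weight) { this.weight = weight; }
        double getWeight() { return weight; }
    }

    // Sorts by weight; the only difference between modes is the stream factory method.
    static List<Double> sortedWeights(Collection<Widget> widgets, boolean parallel) {
        return (parallel ? widgets.parallelStream() : widgets.stream())
                .sorted(Comparator.comparing(Widget::getWeight))
                .map(Widget::getWeight)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A HashSet has no defined encounter order, like unorderedWidgets in the slides.
        Collection<Widget> unorderedWidgets = new HashSet<>(Arrays.asList(
                new Widget(3.5), new Widget(1.0), new Widget(2.25)));
        System.out.println(sortedWeights(unorderedWidgets, false));
        System.out.println(sortedWeights(unorderedWidgets, true));
    }
}
```

Whether the parallel variant is actually faster depends on data size and hardware; this sketch only illustrates that the result is unchanged.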

Without using the Streaming API, running this code in parallel, i.e.,
having multiple iterations occur at once, would have required the use of
explicit threads.
The parallelizable operation (e.g., sorted()) would need to be isolated
and placed into a thread object, forked, and then joined.
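Example
Thread t = new Thread(() -> { /* your code here */ });
t.start(); // fork; calling run() directly would execute on the current thread
// ...
t.join(); // join; throws InterruptedException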
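To make the fork/join steps concrete, here is a hedged sketch of summing a list with two explicit threads next to the stream equivalent. The names ExplicitThreadsSketch, parallelSum, and streamSum are hypothetical, and summing stands in for the sorting example only because it is shorter to split by hand.

```java
import java.util.Arrays;
import java.util.List;

class ExplicitThreadsSketch {
    // Sums the list in two explicitly forked threads, then joins them.
    static double parallelSum(List<Double> weights) {
        int mid = weights.size() / 2;
        double[] partial = new double[2]; // one slot per thread, so no slot is shared

        Thread left = new Thread(() -> {
            for (double w : weights.subList(0, mid)) partial[0] += w;
        });
        Thread right = new Thread(() -> {
            for (double w : weights.subList(mid, weights.size())) partial[1] += w;
        });

        left.start();  // fork: start() runs the body on a new thread
        right.start();
        try {
            left.join();  // join: wait for both workers before reading their results
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return partial[0] + partial[1];
    }

    // The same computation expressed with the streaming API.
    static double streamSum(List<Double> weights) {
        return weights.parallelStream().mapToDouble(Double::doubleValue).sum();
    }

    public static void main(String[] args) {
        List<Double> weights = Arrays.asList(1.0, 2.0, 3.0, 4.0);
        System.out.println(parallelSum(weights));
        System.out.println(streamSum(weights));
    }
}
```

The explicit version has to decide how to split the work, where to store partial results, and when to wait; the stream version leaves those decisions to the library.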

Problem
• MapReduce traditionally runs in highly-distributed environments
with no shared memory.
• Streaming APIs typically execute on a single node under multiple
threads or cores in a shared memory space.
• Collections reside in local memory.
• Issues may arise from close ties between shared memory and the
operations.
• Developers must manually determine whether running stream code
in parallel is efficient yet interference-free.
• Requires thorough understanding of the API.
• Error-prone, possibly requiring complex analysis.
• Omission-prone, optimization opportunities may be missed.
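The interference concern can be illustrated with a small sketch. The class InterferenceSketch and both method names are hypothetical; the first method shows a lambda that mutates shared memory from a parallel stream, and the second shows a side-effect-free formulation of the same count.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class InterferenceSketch {
    // Unsafe: the lambda mutates shared state, so parallel runs may lose updates.
    static int countEvensWithSideEffect(List<Integer> values) {
        int[] counter = {0};
        values.parallelStream().forEach(v -> {
            if (v % 2 == 0) counter[0]++; // data race on counter[0]
        });
        return counter[0];
    }

    // Safe: no shared mutable state; the stream library combines partial counts.
    static long countEvens(List<Integer> values) {
        return values.parallelStream().filter(v -> v % 2 == 0).count();
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.range(0, 100_000).boxed().collect(Collectors.toList());
        System.out.println("side-effect free: " + countEvens(values));
        // The next value may vary between runs because of the race.
        System.out.println("with side effect: " + countEvensWithSideEffect(values));
    }
}
```

Deciding by hand which of these two shapes a given pipeline has is the kind of analysis the slides describe as error-prone.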

Motivating Example
1  List<Widget> sortedWidgets
2      = unorderedWidgets
3          .stream()
4          .sorted(Comparator
5              .comparing(
6                  Widget::getWeight))
7          .collect(
8              Collectors.toList());
Refactored line 3: .parallelStream() (replacing .stream()).
• We can perform the transformation at line 3 because the operations
do not access shared memory, i.e., no side-effects.
• Had the stream been ordered, however, running in parallel may
result in worse performance due to sorted() requiring multiple
passes and data buffering.
• Such operations are called stateful intermediate operations (SIOs).
• Maintaining data ordering is detrimental to parallel performance.

Motivating Example
1   // collect weights over 43.2
2   // into a set in parallel.
3   Set<Double>
4       heavyWidgetWeightSet =
5       orderedWidgets
6           .parallelStream()
7           .map(Widget::getWeight)
8           .filter(w -> w > 43.2)
9           .collect(
10              Collectors.toSet());
• No optimizations are available here because there is no SIO.
• No performance degradation.

Motivating Example
1  // sequentially collect into
2  // a list, skipping first
3  // 1000.
4  List<Widget>
5      skippedWidgetList =
6      orderedWidgets
7          .stream()
8          .skip(1000)
9          .collect(
• Like sorted(), skip() is also an SIO.
• But, the stream is ordered, making parallelism counterproductive.
• Could unorder (via unordered()) to improve parallel performance.
• But, doing so would alter semantics due to the target collection
being ordered (line 10).
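The effect of unordering before skip() can be explored with the following sketch. The class SkipOrderingSketch and its helpers are hypothetical, and integers stand in for widgets. With an ordered source, skip(n) must drop exactly the first n elements in both execution modes; once the stream is unordered, a parallel run is free to drop any n elements, which is why the result can differ when it is collected into a list.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class SkipOrderingSketch {
    // Skips n elements, optionally in parallel and optionally after unordering.
    static List<Integer> skipFirst(List<Integer> source, long n, boolean parallel, boolean unorder) {
        Stream<Integer> s = parallel ? source.parallelStream() : source.stream();
        if (unorder) {
            s = s.unordered(); // lifts the ordering constraint for downstream operations
        }
        return s.skip(n).collect(Collectors.toList());
    }

    // Builds the ordered list [from, to).
    static List<Integer> range(int from, int to) {
        return IntStream.range(from, to).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ordered = range(0, 2000);
        // Ordered: both modes must yield exactly 1000..1999.
        System.out.println(skipFirst(ordered, 1000, false, false).equals(range(1000, 2000)));
        System.out.println(skipFirst(ordered, 1000, true, false).equals(range(1000, 2000)));
        // Unordered and parallel: 1000 elements remain, but which ones is not guaranteed.
        System.out.println(skipFirst(ordered, 1000, true, true).size());
    }
}
```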

Motivating Example
1   // collect the first green
2   // widgets into a list.
3   List<Widget> firstGreenList
4       = orderedWidgets
5           .stream()
6           .filter(w -> w.getColor()
7               == Color.GREEN)
8           .unordered()
9           .limit(5)
10          .collect(
• limit() is an SIO and the stream is ordered.
• But, the stream is unordered before limit().
• It’s safe and advantageous to run in parallel.
• A stream’s ordering does not only depend on its source.
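One way to observe that ordering depends on both the source and the intermediate operations is to inspect the ORDERED characteristic of a stream's spliterator. This is only a runtime illustration, not the static analysis described in the talk; the class OrderingCheckSketch and the helper isOrdered are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;

class OrderingCheckSketch {
    // Reports whether the stream currently has a defined encounter order.
    // spliterator() is a terminal operation, so the stream is consumed by this check.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(3, 1, 2);
        Collection<Integer> set = new HashSet<>(list);

        System.out.println(isOrdered(list.stream()));             // ordered source
        System.out.println(isOrdered(set.stream()));              // unordered source
        System.out.println(isOrdered(set.stream().sorted()));     // sorted() imposes an order
        System.out.println(isOrdered(list.stream().unordered())); // unordered() removes it
    }
}
```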

Motivating Example
1   // collect distinct widget
2   // weights into a TreeSet.
3   Set<Double>
4       distinctWeightSet =
5       orderedWidgets
6           .stream()
7           .parallel()
9           .distinct()
10          .collect(Collectors
11              .toCollection(
12                  TreeSet::new));
• Computation is already in parallel (line 7).
• distinct() is an SIO and the stream is ordered.
• Can we keep it in parallel? No, because TreeSets are ordered.
• De-parallelize on line 7.

Motivating Example
2   // colors into a HashSet.
3   Set<Color>
4       distinctColorSet =
5       orderedWidgets
6           .parallelStream()
7           .map(Widget::getColor)
8           .distinct()
9           .collect(HashSet::new,
10              Set::add,
11              Set::addAll);
Refactored line 8: .unordered().distinct() (replacing .distinct()).
• Direct form of collect() (line 11).
• Since the reduction is to an unordered collection, we can unorder
immediately before distinct() (line 8) to improve performance.
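A self-contained sketch of the refactored pipeline follows. The class UnorderedDistinctSketch and the method distinctColors are hypothetical names, and a list of colors stands in for the mapped widget stream. Because the result is a HashSet, which has no encounter order, unordering before distinct() should not change the resulting set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UnorderedDistinctSketch {
    enum Color { RED, BLUE, GREEN }

    // Collects distinct colors into a HashSet using the three-argument collect().
    static Set<Color> distinctColors(List<Color> colors) {
        return colors.parallelStream()
                .unordered() // safe here: the target collection is unordered
                .distinct()
                .collect(HashSet::new, Set::add, Set::addAll);
    }

    public static void main(String[] args) {
        List<Color> colors = Arrays.asList(
                Color.RED, Color.GREEN, Color.RED, Color.BLUE, Color.GREEN);
        System.out.println(distinctColors(colors).size());
    }
}
```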

Background: Static Analysis and Automated Refactoring
• Static analysis is the process of examining source code to
understand how the code works without running it.
  • Does not rely on test suites.
  • Undecidable in the general case (Rice’s Theorem). Instead, uses
  approximations.
• Refactoring is the process of restructuring code for improved design,
better performance, and other non-functional enhancements.
  • The semantics (meaning) of the code remains intact.
  • Examples include renaming a method (function) and pulling up
  members in sibling classes to a super class to reduce redundancy.
  • Essential part of agile development.
• Automated refactoring works by combining static analysis, type
theory, machine learning, and other front-end compiler technologies
to produce code changes that would have been made by an expert
human developer.
  • Very much a problem of automated software engineering.

Solution
• Devised a fully-automated, semantics-preserving refactoring
approach.
• Embodied by an open source refactoring tool named Optimize
Streams.
• Transforms Java 8 stream code for improved performance.
• Based on:
  • Novel ordering analysis.
    • Infers when maintaining ordering is necessary for semantics
    preservation.
  • Typestate analysis [Fink et al., 2008; Strom and Yemini, 1986].
    • Augments the type system with “state.”
    • Traditionally used for preventing resource usage errors.
    • Requires interprocedural and alias analyses.
    • Novel adaptation for possibly immutable objects (streams).

Solution Highlights
• First to integrate automated refactoring with typestate analysis
(to the best of our knowledge).
• Uses the WALA static analysis framework (http://wala.sf.net) and the
SAFE typestate analysis engine (http://git.io/vxwBs).
• Combines analysis results from varying intermediate representations
(SSA, AST).

Identifying Refactoring Preconditions
• Refactoring preconditions are conditions that must hold to guarantee
that the transformation is type-correct and semantics-preserving.
• Our refactoring is (conceptually) split into two:
  • Convert Sequential Stream to Parallel.
  • Optimize Parallel Stream.

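Table 1: Convert Sequential Stream to Parallel preconditions.
(exe: execution mode; ord: ordering; se: side-effects; SIO: stateful
intermediate operations; ROM: reduction ordering matters.)
      exe   ord     se   SIO   ROM   transformation
P1    seq   unord   F    N/A   N/A   Convert to para.
P2    seq   ord     F    F     N/A   Convert to para.
P3    seq   ord     F    T     F     Unorder and convert to para.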

Table 2: Optimize Parallel Stream preconditions.
      exe    ord   SIO   ROM   transformation
P4    para   ord   T     F     Unorder.
P5    para   ord   T     T     Convert to seq.
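The rows of Tables 1 and 2 can be read as a decision procedure. The sketch below encodes them directly; it is an illustration of the tables only, not the tool's implementation. The class PreconditionSketch, the enums, and the "none" result for combinations not listed in either table are assumptions made here.

```java
class PreconditionSketch {
    enum Mode { SEQ, PARA }
    enum Ordering { ORD, UNORD }

    // Returns the transformation named in Tables 1 and 2, or "none" (an assumption
    // of this sketch) when no precondition row matches.
    static String decide(Mode exe, Ordering ord, boolean sideEffects, boolean sio, boolean rom) {
        if (exe == Mode.SEQ) {
            if (sideEffects) return "none";                       // P1-P3 all require se = F
            if (ord == Ordering.UNORD) return "Convert to para."; // P1
            if (!sio) return "Convert to para.";                  // P2
            if (!rom) return "Unorder and convert to para.";      // P3
            return "none";
        }
        // Table 2 has no side-effects column, so sideEffects is not consulted here.
        if (ord == Ordering.ORD && sio) {
            return rom ? "Convert to seq." : "Unorder.";          // P5 : P4
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(decide(Mode.SEQ, Ordering.UNORD, false, false, false));
        System.out.println(decide(Mode.SEQ, Ordering.ORD, false, true, false));
        System.out.println(decide(Mode.PARA, Ordering.ORD, false, true, true));
    }
}
```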

DFA for Determining Stream Execution Mode
States: ⊥ (start), seq, para.
• ⊥ → seq: Col.stream(), BufferedReader.lines(), Files.lines(Path),
JarFile.stream(), Pattern.splitAsStream(), Random.ints()
• ⊥ → para: Col.parallelStream()
• seq → seq: BaseStream.sequential()
• seq → para: BaseStream.parallel()
• para → seq: BaseStream.sequential()
• para → para: BaseStream.parallel()
Figure 1: A subset of the relation E_→ in E = (E_S, E_Λ, E_→).
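The execution-mode automaton can be sketched as a small transition function. This is an illustrative encoding, not the tool's typestate implementation: the class ExecutionModeDfaSketch, the use of strings as call labels, and the choice to ignore calls outside the listed subset are all assumptions made here. Only two of the sequential source methods are included for brevity.

```java
import java.util.Arrays;
import java.util.List;

class ExecutionModeDfaSketch {
    enum Mode { BOTTOM, SEQ, PARA }

    // One DFA step: maps the current state and a call label to the next state.
    static Mode step(Mode current, String call) {
        switch (call) {
            case "Col.stream()":
            case "Files.lines(Path)":
                // Sequential sources leave the start state for seq.
                return current == Mode.BOTTOM ? Mode.SEQ : current;
            case "Col.parallelStream()":
                return current == Mode.BOTTOM ? Mode.PARA : current;
            case "BaseStream.sequential()":
                return current == Mode.BOTTOM ? current : Mode.SEQ;
            case "BaseStream.parallel()":
                return current == Mode.BOTTOM ? current : Mode.PARA;
            default:
                return current; // assumption: other calls do not change the mode
        }
    }

    // Runs the automaton over a sequence of calls, starting from the bottom state.
    static Mode run(List<String> calls) {
        Mode state = Mode.BOTTOM;
        for (String call : calls) {
            state = step(state, call);
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("Col.stream()", "BaseStream.parallel()")));
        System.out.println(run(Arrays.asList("Col.parallelStream()", "BaseStream.sequential()")));
    }
}
```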

DFA for Determining Stream Ordering
States: ⊥ (start), ord, unord.
• ⊥ → ord: Arrays.stream(T[]), Stream.of(T...), IntStream.range(),
Stream.iterate(), BitSet.stream(), Col.parallelStream()
• ⊥ → unord: Stream.generate(), HashSet.stream(),
PriorityQueue.stream(), CopyOnWrite.parallelStream(),
BeanContextSupport.stream(), Random.ints()
• unord → ord: Stream.sorted()
• unord → unord: BaseStream.unordered(), Stream.concat(unordered),
Stream.concat(ordered)
• ord → ord: Stream.sorted(), Stream.concat(ordered)
• ord → unord: BaseStream.unordered(), Stream.concat(unordered)
Figure 2: A subset of the relation O_→ in O = (O_S, O_Λ, O_→).
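The ordering transitions for intermediate operations can be summarized by three small functions. This is a hedged sketch of the relation in Figure 2 only; the class OrderingDfaSketch and its method names are hypothetical. The concat rule reflects that the figure keeps a stream ordered only when an ordered stream is concatenated with another ordered stream.

```java
class OrderingDfaSketch {
    enum Ordering { ORD, UNORD }

    // Stream.sorted(): the result is ordered from either state.
    static Ordering sorted(Ordering current) {
        return Ordering.ORD;
    }

    // BaseStream.unordered(): the result is unordered from either state.
    static Ordering unordered(Ordering current) {
        return Ordering.UNORD;
    }

    // Stream.concat(other): ordered only if both operands are ordered.
    static Ordering concat(Ordering current, Ordering other) {
        return (current == Ordering.ORD && other == Ordering.ORD)
                ? Ordering.ORD
                : Ordering.UNORD;
    }

    public static void main(String[] args) {
        System.out.println(concat(Ordering.ORD, Ordering.ORD));
        System.out.println(concat(Ordering.ORD, Ordering.UNORD));
        System.out.println(sorted(unordered(Ordering.ORD)));
    }
}
```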

Optimize Streams Eclipse Refactoring Plug-in
• Implemented an open source refactoring tool named Optimize
Streams.
• Publicly available as an open source Eclipse IDE (http://eclipse.org)
plug-in (available at http://git.io/vpTLk).
• Can be used by projects not using Eclipse.
• Includes fully-functional UI, preview pane, and refactoring unit tests.

Results
• Applied to 11 Java projects of varying size and domain with a total
of ∼642 KSLOC.
• 36.31% of candidate streams were refactorable.
• Observed an average speedup of 3.49 during performance testing.
• See [Khatchadourian et al., 2018, 2019] for more details, including
user feedback, as well as tool and data set engineering challenges.

Results
Table 3: Experimental results.
subject      KLOC     eps   k   str   rft   P1   P2   P3   t (m)
htm.java      41.14    21   4    34    10    0   10    0    1.85
JacpFX        23.79   195   4     4     3    3    0    0    2.31
jdp*          19.96    25   4    28    15    1   13    1   31.88
jdk8-exp*      3.43   134   4    26     4    0    4    0    0.78
jetty        354.48   106   4    21     7    3    4    0   17.85
jOOQ         154.01    43   4     5     1    0    1    0   12.94
koral          7.13    51   3     6     6    0    6    0    1.06
monads         1.01    47   2     1     1    0    1    0    0.05
retroλ         5.14     1   4     8     6    3    3    0    0.66
streamql       4.01    92   2    22     2    0    2    0    0.72
threeten      27.53    36   2     2     2    0    2    0    0.51
Total        641.65   751   4   157    57   10   46    1   70.60
* jdp is java-design-patterns and jdk8-exp is jdk8-experiments.

Refactoring Failures
Table 4: Refactoring failures.
failure                                   pc   cnt
F1. InconsistentPossibleExecutionModes           1
F2. NoStatefulIntermediateOperations      P5     1
F3. NonDeterminableReductionOrdering             5
F4. NoTerminalOperations                        13
F5. CurrentlyNotHandled                         16
F6. ReduceOrderingMatters                 P3    19
F7. HasSideEffects                        P1     4
                                          P2    41
Total                                          100

Performance Evaluation
Table 5: Average run times of JMH benchmarks (su: speedup).
#    benchmark                       orig (s/op)      refact (s/op)    su
1    shouldRetrieveChildren           0.011 (0.001)    0.002 (0.000)   6.57
2    shouldConstructCar               0.011 (0.001)    0.001 (0.000)   8.22
3    addingShouldResultInFailure      0.014 (0.000)    0.004 (0.000)   3.78
4    deletionShouldBeSuccess          0.013 (0.000)    0.003 (0.000)   3.82
5    addingShouldResultInSuccess      0.027 (0.000)    0.005 (0.000)   5.08
6    deletionShouldBeFailure          0.014 (0.000)    0.004 (0.000)   3.90
7    specification.AppTest.test      12.666 (5.961)   12.258 (1.880)   1.03
8    CoffeeMakingTaskTest.testId      0.681 (0.065)    0.469 (0.009)   1.45
9    PotatoPeelingTaskTest.testId     0.676 (0.062)    0.465 (0.008)   1.45
10   SpatialPoolerLocalInhibition     1.580 (0.168)    1.396 (0.029)   1.13
11   TemporalMemory                   0.013 (0.001)    0.006 (0.000)   1.97
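The reported average speedup can be checked against the su column of Table 5 with a short calculation. The class SpeedupAverageSketch is a hypothetical name; the values are copied from the table, and the arithmetic mean is assumed to be the averaging method.

```java
import java.util.Arrays;
import java.util.Locale;

class SpeedupAverageSketch {
    // The su column of Table 5, in benchmark order.
    static final double[] SPEEDUPS = {
        6.57, 8.22, 3.78, 3.82, 5.08, 3.90, 1.03, 1.45, 1.45, 1.13, 1.97
    };

    // Arithmetic mean of the given values (NaN for an empty array).
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Sum is 38.40 over 11 benchmarks, so the mean rounds to 3.49.
        System.out.println(String.format(Locale.ROOT, "%.2f", mean(SPEEDUPS))); // prints 3.49
    }
}
```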

Conclusion
• Optimize Streams is an open source, automated refactoring tool
that assists developers with writing optimal Java 8 Stream code.
• Integrates an Eclipse refactoring with the advanced static analyses
offered by WALA and SAFE.
• 11 Java projects totaling ∼642 thousand lines of code were used
in the tool’s assessment.
• An average speedup of 3.49 on the refactored code was observed as
part of an experimental study.

Future Work
• Handle more advanced ways of relating ASTs to SSA-based IR.
• Incorporate more kinds of (complex) reductions.
  • Those involving maps.
• Applicability of the tool to other streaming APIs and languages.
• Refactoring side-effect producing code.
  • Result would be code that is amenable to our refactoring.
• Finding other kinds of bugs and misuses of Streaming APIs.
  • Related to non-termination, non-determinism, etc.

Broader Vision
Assist developers not previously familiar with functional programming to
use functional language-inspired programming constructs and APIs in
increasingly pervasive mainstream Object-Oriented (OO) languages that
incorporate such constructs.
Includes empirical studies on how developers use functional-inspired
constructs in real, mainstream OO programs, providing feedback to language
and API designers and a better understanding of this hybrid paradigm.

For Further Reading
Biboudis, Aggelos, Nick Palladinos, George Fourtounis, and Yannis Smaragdakis
(2015). “Streams à la carte: Extensible Pipelines with Object Algebras”. In:
ECOOP, pp. 591–613. doi: 10.4230/LIPIcs.ECOOP.2015.591.
Fink, Stephen J., Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay (May
2008). “Effective Typestate Verification in the Presence of Aliasing”. In: ACM
TOSEM 17.2, pp. 9:1–9:34. doi: 10.1145/1348250.1348255.
Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (Sept.
2018). “A Tool for Optimizing Java 8 Stream Software via Automated
Refactoring”. In: International Working Conference on Source Code Analysis and
Manipulation. SCAM ’18. Engineering Track. Distinguished Paper Award. IEEE.
IEEE Press, pp. 34–39. doi: 10.1109/SCAM.2018.00011.
Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (May
2019). “Safe Automated Refactoring for Intelligent Parallelization of Java 8
Streams”. In: International Conference on Software Engineering. ICSE ’19.
Technical Track. To appear. ACM/IEEE. ACM.
Strom, Robert E. and Shaula Yemini (Jan. 1986). “Typestate: A programming
language concept for enhancing software reliability”. In: IEEE TSE SE-12.1,
pp. 157–171. doi: 10.1109/tse.1986.6312929.
