Safe Automated Refactoring for Intelligent
Parallelization of Java 8 Streams
Raffi Khatchadourian Yiming Tang Mehdi Bagherzadeh Syed Ahmed
Columbia University, April 25, 2019
Based on work to appear at the ACM/IEEE International Conference on
Software Engineering (ICSE ’19), Montreal, Canada and the IEEE International
Working Conference on Source Code Analysis and Manipulation (SCAM ’18),
Madrid, Spain (Distinguished Paper Award).
Introduction
Streaming APIs
• Streaming APIs are widely available in today’s mainstream,
object-oriented programming languages [Biboudis et al., 2015].
• They incorporate MapReduce-like operations on native data structures
such as collections.
• They can make writing parallel code easier and less error-prone (e.g.,
by avoiding data races and thread contention).
Streaming API Example in Java >= 8
Consider this simple “widget” class consisting of a “color” and a “weight”:
// Widget class:
public class Widget {

    // enumeration:
    public enum Color {
        RED,
        BLUE,
        GREEN
    }

    // instance fields:
    private Color color;
    private double weight;

    // constructor:
    Widget(Color c, double w) {
        this.color = c;
        this.weight = w;
    }

    // accessors/mutators:
    public Color getColor() {
        return this.color;
    }

    public double getWeight() {
        return this.weight;
    }
    // ...
}
Consider the following Widget client code:
// an "unordered" collection of widgets.
Collection<Widget> unorderedWidgets = new HashSet<>();
// populate the collection ...
Now suppose we would like to sort the collection by weight using the
Java 8 Streaming API:
// sort widgets by weight.
List<Widget> sortedWidgets = unorderedWidgets
.stream()
.sorted(Comparator.comparing(Widget::getWeight))
.collect(Collectors.toList());
Now suppose we would like to sort the collection by weight using the
Java 8 Streaming API in parallel:
// sort widgets by weight.
List<Widget> sortedWidgets = unorderedWidgets
.parallelStream()
.sorted(Comparator.comparing(Widget::getWeight))
.collect(Collectors.toList());
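The two snippets above can be compared side by side. The following is a minimal, self-contained sketch; the class name `SortDemo`, the helper `sortedWeights`, and the trimmed-down `Widget` (weight only) are assumptions made for illustration and are not part of the talk.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

class SortDemo {
    // Minimal stand-in for the Widget class shown earlier (weight only).
    static final class Widget {
        private final double weight;
        Widget(double weight) { this.weight = weight; }
        double getWeight() { return weight; }
    }

    // Sorts widgets by weight; the flag selects stream() or parallelStream().
    static List<Double> sortedWeights(Collection<Widget> widgets, boolean parallel) {
        return (parallel ? widgets.parallelStream() : widgets.stream())
                .sorted(Comparator.comparing(Widget::getWeight))
                .map(Widget::getWeight)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Collection<Widget> unorderedWidgets = new HashSet<>();
        for (double w : new double[] {5.0, 1.5, 3.25, 2.0}) {
            unorderedWidgets.add(new Widget(w));
        }
        // Both calls yield the same sorted list; only the execution mode differs.
        System.out.println(sortedWeights(unorderedWidgets, false));
        System.out.println(sortedWeights(unorderedWidgets, true));
    }
}
```

Only the stream-creating call changes between the two modes; the rest of the pipeline is identical.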
Without using the Streaming API, running this code in parallel, i.e.,
having multiple iterations occur at once, would have required the use of
explicit threads.
The parallelizable operation (e.g., sorted()) would need to be isolated
and placed into a thread object, forked, and then joined.
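Example
Thread t = new Thread(() -> { /* your code here */ });
t.start(); // start() forks a new thread; run() would execute on the calling thread
// ...
t.join(); // wait for the forked thread to finish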
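To make the fork/join steps concrete, here is a rough sketch of sorting with explicit threads: the input is split in two, each half is sorted in its own thread, and the halves are merged after both threads are joined. The class `ExplicitThreadSort` and its two-way split are illustrative assumptions, not code from the talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ExplicitThreadSort {
    // Sorts 'values' using two worker threads plus a sequential merge.
    static List<Double> sort(List<Double> values) {
        int mid = values.size() / 2;
        List<Double> left = new ArrayList<>(values.subList(0, mid));
        List<Double> right = new ArrayList<>(values.subList(mid, values.size()));

        Thread t1 = new Thread(() -> Collections.sort(left));
        Thread t2 = new Thread(() -> Collections.sort(right));
        t1.start(); // fork
        t2.start();
        try {
            t1.join(); // join: wait for both workers to finish
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sorting", e);
        }

        // Merge the two sorted halves.
        List<Double> merged = new ArrayList<>(values.size());
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            merged.add(left.get(i) <= right.get(j) ? left.get(i++) : right.get(j++));
        }
        while (i < left.size()) merged.add(left.get(i++));
        while (j < right.size()) merged.add(right.get(j++));
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList(3.0, 1.0, 2.0, 5.0, 4.0)));
    }
}
```

Compared with the one-word change from stream() to parallelStream(), the splitting, thread management, and merging are all written by hand here.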
Motivation
Problem
• MapReduce traditionally runs in highly-distributed environments
with no shared memory.
• Streaming APIs typically execute on a single node under multiple
threads or cores in a shared memory space.
  • Collections reside in local memory.
• Issues may arise from close ties between shared memory and the
operations.
• Developers must manually determine whether running stream code
in parallel is efficient yet interference-free.
  • Requires thorough understanding of the API.
  • Error-prone, possibly requiring complex analysis.
  • Omission-prone, optimization opportunities may be missed.
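As one illustration of the shared-memory concern, consider a pipeline whose lambda writes to a list outside the stream. The sketch below is an assumption-laden example (the class and method names are invented here): the first method is only safe sequentially, while the second avoids shared mutable state and can run in either mode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InterferenceDemo {
    // Side-effecting version: the lambda mutates 'out', a non-thread-safe list.
    // Switching this to parallelStream() could lose or reorder elements.
    static List<Integer> doubledWithSideEffect(List<Integer> input) {
        List<Integer> out = new ArrayList<>();
        input.stream().map(x -> x * 2).forEach(out::add); // sequential only
        return out;
    }

    // Side-effect-free version: collect() performs the reduction itself,
    // so the pipeline is safe to run sequentially or in parallel.
    static List<Integer> doubledWithCollector(List<Integer> input, boolean parallel) {
        Stream<Integer> s = parallel ? input.parallelStream() : input.stream();
        return s.map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(doubledWithSideEffect(input));
        System.out.println(doubledWithCollector(input, true));
    }
}
```

Deciding which of these two situations a given pipeline is in is the kind of manual judgment the bullets above describe.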
Motivating Example
Before:
1 List<Widget> sortedWidgets
2   = unorderedWidgets
3     .stream()
4     .sorted(Comparator
5       .comparing(
6         Widget::getWeight))
7     .collect(
8       Collectors.toList());
After (line 3 replaces stream() with parallelStream()):
1 List<Widget> sortedWidgets
2   = unorderedWidgets
3     .parallelStream()
4     .sorted(Comparator
5       .comparing(
6         Widget::getWeight))
7     .collect(
8       Collectors.toList());
• We can perform the transformation at line 3 because the operations
do not access shared memory, i.e., no side-effects.
• Had the stream been ordered, however, running in parallel may
result in worse performance due to sorted() requiring multiple
passes and data buffering.
  • Such operations are called stateful intermediate operations (SIOs).
  • Maintaining data ordering is detrimental to parallel performance.
Motivating Example
Before and after are identical:
// collect weights over 43.2
// into a set in parallel.
Set<Double> heavyWidgetWeightSet = orderedWidgets
    .parallelStream()
    .map(Widget::getWeight)
    .filter(w -> w > 43.2)
    .collect(Collectors.toSet());
• No optimizations are available here because there is no SIO.
• No performance degradation.
Motivating Example
Before and after are identical:
1 // sequentially collect into
2 // a list, skipping first
3 // 1000.
4 List<Widget>
5   skippedWidgetList =
6     orderedWidgets
7     .stream()
8     .skip(1000)
9     .collect(
10      Collectors.toList());
• Like sorted(), skip() is also an SIO.
• But, the stream is ordered, making parallelism counterproductive.
  • Could unorder (via unordered()) to improve parallel performance.
  • But, doing so would alter semantics due to the target collection
  being ordered (line 10).
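The semantic difference can be sketched with plain integers standing in for widgets. The class `SkipOrderingDemo` and its methods are hypothetical names chosen for this example. On an ordered source, skip(n) must drop exactly the first n elements in either execution mode; once the stream is unordered, any n elements may be dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SkipOrderingDemo {
    // Ordered skip: drops the first n elements in encounter order.
    static List<Integer> skipFirst(List<Integer> ordered, int n, boolean parallel) {
        Stream<Integer> s = parallel ? ordered.parallelStream() : ordered.stream();
        return s.skip(n).collect(Collectors.toList());
    }

    // Unordered skip: drops some n elements, not necessarily the first n.
    static List<Integer> skipAny(List<Integer> ordered, int n) {
        return ordered.parallelStream()
                .unordered()
                .skip(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> src = new ArrayList<>();
        for (int i = 0; i < 5000; i++) src.add(i);

        List<Integer> tail = src.subList(1000, 5000);
        System.out.println(skipFirst(src, 1000, false).equals(tail)); // same as subList
        System.out.println(skipFirst(src, 1000, true).equals(tail));  // order still kept
        // Only the size is guaranteed here; which elements survive may vary.
        System.out.println(skipAny(src, 1000).size());
    }
}
```

Because the result is collected into a list, a caller could observe which elements were skipped, which is why unordering is not semantics-preserving in this case.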
Motivating Example
Before:
// collect the first green
// widgets into a list.
List<Widget> firstGreenList = orderedWidgets
    .stream()
    .filter(w -> w.getColor() == Color.GREEN)
    .unordered()
    .limit(5)
    .collect(Collectors.toList());
After (stream() replaced with parallelStream()):
// collect the first green
// widgets into a list.
List<Widget> firstGreenList = orderedWidgets
    .parallelStream()
    .filter(w -> w.getColor() == Color.GREEN)
    .unordered()
    .limit(5)
    .collect(Collectors.toList());
• limit() is an SIO and the stream is ordered.
• But, the stream is unordered before limit().
• It’s safe and advantageous to run in parallel.
• A stream’s ordering does not only depend on its source.
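A runnable version of the parallel form might look like the sketch below. The class `UnorderedLimitDemo`, the `id` field, and the method `anyFiveGreen` are assumptions added so the example is self-contained. Since unordered() precedes limit(), the pipeline may return any five green widgets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class UnorderedLimitDemo {
    enum Color { RED, BLUE, GREEN }

    // Minimal stand-in for Widget; 'id' is a hypothetical field for printing.
    static final class Widget {
        private final Color color;
        private final int id;
        Widget(Color color, int id) { this.color = color; this.id = id; }
        Color getColor() { return color; }
        int getId() { return id; }
    }

    // unordered() lifts the encounter-order constraint before the
    // stateful limit() runs, so any five matches are acceptable.
    static List<Widget> anyFiveGreen(List<Widget> orderedWidgets) {
        return orderedWidgets.parallelStream()
                .filter(w -> w.getColor() == Color.GREEN)
                .unordered()
                .limit(5)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Widget> widgets = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            widgets.add(new Widget(Color.values()[i % 3], i));
        }
        for (Widget w : anyFiveGreen(widgets)) {
            System.out.println(w.getId() + " " + w.getColor());
        }
    }
}
```

Which five widgets are printed may differ between runs; only the count and the color are fixed by the pipeline.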
Motivating Example
Before:
1 // collect distinct widget
2 // weights into a TreeSet.
3 Set<Double>
4   distinctWeightSet =
5     orderedWidgets
6     .stream()
7     .parallel()
8     .map(Widget::getWeight)
9     .distinct()
10    .collect(Collectors
11      .toCollection(
12        TreeSet::new));
After: identical except that line 7 (.parallel()) is removed, so the
stream runs sequentially.
• Computation is already in parallel (line 7).
• distinct() is an SIO and the stream is ordered.
• Can we keep it in parallel? No, because TreeSets are ordered.
• De-parallelize on line 7.
Motivating Example
Before:
1 // collect distinct widget
2 // colors into a HashSet.
3 Set<Color>
4   distinctColorSet =
5     orderedWidgets
6     .parallelStream()
7     .map(Widget::getColor)
8     .distinct()
9     .collect(HashSet::new,
10      Set::add,
11      Set::addAll);
After (line 8 inserts unordered() before distinct()):
1 // collect distinct widget
2 // colors into a HashSet.
3 Set<Color>
4   distinctColorSet =
5     orderedWidgets
6     .parallelStream()
7     .map(Widget::getColor)
8     .unordered().distinct()
9     .collect(HashSet::new,
10      Set::add,
11      Set::addAll);
• Computation is already in parallel (line 6).
• Direct form of collect() (line 11).
• Since the reduction is to an unordered collection, we can unorder
immediately before distinct() (line 8) to improve performance.
Approach
Background: Static Analysis and Automated Refactoring
• Static analysis is the process of examining source code to
understand how the code works without running it.
  • Does not rely on test suites.
  • Undecidable in the general case (Rice’s Theorem). Instead, uses
  approximations.
• Refactoring is the process of restructuring code for improved design,
better performance, and other non-functional enhancements.
  • The semantics (meaning) of the code remains intact.
  • Examples include renaming a method (function) and pulling up
  members in sibling classes to a superclass to reduce redundancy.
  • Essential part of agile development.
• Automated refactoring works by combining static analysis, type
theory, machine learning, and other front-end compiler technologies
to produce code changes that would have been made by an expert
human developer.
  • Very much a problem of automated software engineering.
Solution
• Devised a fully-automated, semantics-preserving refactoring
approach.
• Embodied by an open source refactoring tool named Optimize
Streams.
• Transforms Java 8 stream code for improved performance.
• Based on:
  • Novel ordering analysis.
    • Infers when maintaining ordering is necessary for semantics
    preservation.
  • Typestate analysis [Fink et al., 2008; Strom and Yemini, 1986].
    • Augments the type system with “state.”
    • Traditionally used for preventing resource usage errors.
    • Requires interprocedural and alias analyses.
    • Novel adaptation for possibly immutable objects (streams).
Solution Highlights
• First to integrate automated refactoring with typestate analysis. [1]
• Uses the WALA static analysis framework [2] and the SAFE typestate
analysis engine. [3]
• Combines analysis results from varying intermediate representations
(SSA, AST).
[1] To the best of our knowledge.
[2] http://wala.sf.net
[3] http://git.io/vxwBs
Identifying Refactoring Preconditions
• Refactoring preconditions are conditions that must hold to guarantee
that the transformation is type-correct and semantics-preserving.
• Our refactoring is (conceptually) split into two:
  • Convert Sequential Stream to Parallel.
  • Optimize Parallel Stream.
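Table 1: Convert Sequential Stream to Parallel preconditions.
(exe: execution mode; ord: ordering; se: side-effects; SIO: stateful
intermediate operation; ROM: reduction ordering matters; T/F: true/false.)
     exe  ord    se  SIO  ROM  transformation
P1   seq  unord  F   N/A  N/A  Convert to para.
P2   seq  ord    F   F    N/A  Convert to para.
P3   seq  ord    F   T    F    Unorder and convert to para.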
Table 2: Optimize Parallel Stream preconditions.
     exe   ord  SIO  ROM  transformation
P4   para  ord  T    F    Unorder.
P5   para  ord  T    T    Convert to seq.
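Tables 1 and 2 can be read as a small decision procedure. The sketch below encodes rows P1 through P5 as a single function; it is only an illustration of the tables, not the Optimize Streams implementation, and every name in it (`PreconditionSketch`, `decide`, the enums) is hypothetical. Combinations not listed in either table map to `NONE`.

```java
class PreconditionSketch {
    enum Exec { SEQ, PARA }
    enum Ordering { ORD, UNORD }
    enum Action {
        CONVERT_TO_PARA, UNORDER_AND_CONVERT_TO_PARA, UNORDER, CONVERT_TO_SEQ, NONE
    }

    // Returns the transformation for the first matching row, or NONE.
    static Action decide(Exec exe, Ordering ord,
                         boolean sideEffects, boolean sio, boolean rom) {
        if (exe == Exec.SEQ) {
            if (sideEffects) return Action.NONE;           // P1-P3 all require se = F
            if (ord == Ordering.UNORD) return Action.CONVERT_TO_PARA;  // P1
            if (!sio) return Action.CONVERT_TO_PARA;                   // P2
            return rom ? Action.NONE
                       : Action.UNORDER_AND_CONVERT_TO_PARA;          // P3
        }
        // Parallel streams (Table 2 has no side-effects column).
        if (ord == Ordering.ORD && sio) {
            return rom ? Action.CONVERT_TO_SEQ : Action.UNORDER;       // P5 / P4
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        // Sorting the HashSet-backed widgets: seq, unord, no side-effects (P1).
        System.out.println(decide(Exec.SEQ, Ordering.UNORD, false, true, true));
        // distinct() into a TreeSet: para, ord, SIO, ordering matters (P5).
        System.out.println(decide(Exec.PARA, Ordering.ORD, false, true, true));
    }
}
```

Under this reading, the skip(1000) example (sequential, ordered, SIO, ordering matters) matches no row and is left alone, which agrees with the earlier slide.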
DFA for Determining Stream Execution Mode
States: ⊥ (start), seq, para.
⊥ → seq: Col.stream(), BufferedReader.lines(), Files.lines(Path),
JarFile.stream(), Pattern.splitAsStream(), Random.ints()
⊥ → para: Col.parallelStream()
seq → seq: BaseStream.sequential()
seq → para: BaseStream.parallel()
para → seq: BaseStream.sequential()
para → para: BaseStream.parallel()
Figure 1: A subset of the relation E→ in E = (E_S, E_Λ, E→).
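At run time, the execution mode that this automaton tracks statically can be observed through `BaseStream.isParallel()`. The snippet below is a small sketch of that correspondence; the class name `ExecutionModeDemo` and the method `modes` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExecutionModeDemo {
    // Reports the execution mode reached after each chain of calls.
    static boolean[] modes() {
        List<Integer> col = new ArrayList<>(Arrays.asList(1, 2, 3));
        return new boolean[] {
            col.stream().isParallel(),                          // seq
            col.parallelStream().isParallel(),                  // para
            col.stream().parallel().isParallel(),               // seq -> para
            col.parallelStream().sequential().isParallel(),     // para -> seq
            col.stream().parallel().sequential().isParallel()   // last call wins
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(modes()));
    }
}
```

The static analysis has to predict this flag without running the code, which is where the typestate formulation comes in.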
DFA for Determining Stream Ordering
States: ⊥ (start), ord, unord.
⊥ → ord: Arrays.stream(T[]), Stream.of(T...), IntStream.range(),
Stream.iterate(), BitSet.stream(), Col.parallelStream()
⊥ → unord: Stream.generate(), HashSet.stream(), PriorityQueue.stream(),
CopyOnWrite.parallelStream(), BeanContextSupport.stream(), Random.ints()
ord → ord: Stream.sorted(), Stream.concat(ordered)
ord → unord: BaseStream.unordered(), Stream.concat(unordered)
unord → ord: Stream.sorted()
unord → unord: BaseStream.unordered(), Stream.concat(unordered),
Stream.concat(ordered)
Figure 2: A subset of the relation O→ in O = (O_S, O_Λ, O→).
21
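The edges between the ord and unord states follow a simple rule: sorted() always yields an ordered stream, unordered() always yields an unordered one, and concat() stays ordered only when both operands are ordered. A minimal sketch of those four edges follows. The class and enum names are hypothetical, and the start state ⊥ with its stream-creating calls is left out for brevity.

```java
// Hypothetical model of the ord/unord edges of the ordering DFA in Figure 2.
class OrderingDfa {

    enum Ordering { ORD, UNORD }

    // Operations that label the edges between ord and unord.
    enum Op { SORTED, UNORDERED, CONCAT_ORDERED, CONCAT_UNORDERED }

    // One transition: the ordering after `op` is applied in state `current`.
    static Ordering step(Ordering current, Op op) {
        switch (op) {
            case SORTED:
                return Ordering.ORD;   // ord -> ord, unord -> ord
            case UNORDERED:
                return Ordering.UNORD; // ord -> unord, unord -> unord
            case CONCAT_UNORDERED:
                return Ordering.UNORD; // one unordered operand is enough
            case CONCAT_ORDERED:
                return current;        // ordered only if the receiver was
            default:
                throw new IllegalArgumentException("Unknown operation: " + op);
        }
    }

    public static void main(String[] args) {
        Ordering state = Ordering.UNORD;            // e.g. HashSet.stream()
        state = step(state, Op.SORTED);             // now ordered
        state = step(state, Op.CONCAT_UNORDERED);   // unordered again
        System.out.println(state);
    }
}
```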
Evaluation
Optimize Streams Eclipse Refactoring Plug-in
• Implemented an open source refactoring tool named Optimize
Streams.
• Publicly available as an open source Eclipse IDE (http://eclipse.org)
plug-in at http://git.io/vpTLk.
• Can be used by projects not using Eclipse.
• Includes fully functional UI, preview pane, and refactoring unit tests.
Results
• Applied to 11 Java projects of varying size and domain with a total
of ∼642 KSLOC.
• 36.31% of candidate streams were refactorable.
• Observed an average speedup of 3.49 during performance testing.
• See [Khatchadourian et al., 2018, 2019] for more details, including
user feedback, as well as tool and data set engineering challenges.
Table 3: Experimental results.
subject     KLOC    eps  k  str  rft  P1  P2  P3  t (m)
htm.java    41.14   21   4  34   10   0   10  0   1.85
JacpFX      23.79   195  4  4    3    3   0   0   2.31
jdp*        19.96   25   4  28   15   1   13  1   31.88
jdk8-exp*   3.43    134  4  26   4    0   4   0   0.78
jetty       354.48  106  4  21   7    3   4   0   17.85
jOOQ        154.01  43   4  5    1    0   1   0   12.94
koral       7.13    51   3  6    6    0   6   0   1.06
monads      1.01    47   2  1    1    0   1   0   0.05
retroλ      5.14    1    4  8    6    3   3   0   0.66
streamql    4.01    92   2  22   2    0   2   0   0.72
threeten    27.53   36   2  2    2    0   2   0   0.51
Total       641.65  751  4  157  57   10  46  1   70.60
* jdp is java-design-patterns and jdk8-exp is jdk8-experiments.
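The 36.31% figure can be reproduced from Table 3, assuming that str counts candidate streams and rft counts refactorable ones (an interpretation consistent with the reported percentage, not stated on the slide). The class name below is hypothetical.

```java
// Hypothetical check of the refactorable-stream rate using Table 3.
class RefactorableRate {

    // str and rft columns of Table 3, in row order (htm.java ... threeten).
    static final int[] STR = {34, 4, 28, 26, 21, 5, 6, 1, 8, 22, 2};
    static final int[] RFT = {10, 3, 15, 4, 7, 1, 6, 1, 6, 2, 2};

    static int sum(int[] values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Percentage of candidate streams that were refactorable.
    static double percentRefactorable() {
        return 100.0 * sum(RFT) / sum(STR);
    }

    public static void main(String[] args) {
        System.out.printf("%d of %d streams (%.2f%%)%n",
                sum(RFT), sum(STR), percentRefactorable());
    }
}
```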
Refactoring Failures
Table 4: Refactoring failures.
failure                                  pc   cnt
F1. InconsistentPossibleExecutionModes   -    1
F2. NoStatefulIntermediateOperations     P5   1
F3. NonDeterminableReductionOrdering     -    5
F4. NoTerminalOperations                 -    13
F5. CurrentlyNotHandled                  -    16
F6. ReduceOrderingMatters                P3   19
F7. HasSideEffects                       P1   4
                                         P2   41
Total                                         100
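To make the most frequent failure, HasSideEffects, concrete, here is a hedged sketch of the kind of pipeline a side-effect precondition is presumably meant to reject, next to a side-effect-free equivalent. The class and method names are hypothetical, and the example does not claim to reproduce what the tool's analysis actually flags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of a side-effecting pipeline and a pure rewrite.
class SideEffectExample {

    // Has a side effect: the lambda mutates a list captured from outside the
    // pipeline. ArrayList is not thread-safe, so this should stay sequential.
    static List<Integer> doubledWithSideEffect(List<Integer> input) {
        List<Integer> results = new ArrayList<>();
        input.stream().forEach(x -> results.add(x * 2));
        return results;
    }

    // Side-effect free: the result is built by the reduction itself, which
    // keeps the pipeline a candidate for parallel execution.
    static List<Integer> doubledWithoutSideEffect(List<Integer> input) {
        return input.stream()
                .map(x -> x * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        System.out.println(doubledWithSideEffect(input)
                .equals(doubledWithoutSideEffect(input)));
    }
}
```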
Performance Evaluation
Table 5: Average run times of JMH benchmarks.
#   benchmark                      orig (s/op)      refact (s/op)    su
1   shouldRetrieveChildren         0.011 (0.001)    0.002 (0.000)    6.57
2   shouldConstructCar             0.011 (0.001)    0.001 (0.000)    8.22
3   addingShouldResultInFailure    0.014 (0.000)    0.004 (0.000)    3.78
4   deletionShouldBeSuccess        0.013 (0.000)    0.003 (0.000)    3.82
5   addingShouldResultInSuccess    0.027 (0.000)    0.005 (0.000)    5.08
6   deletionShouldBeFailure        0.014 (0.000)    0.004 (0.000)    3.90
7   specification.AppTest.test     12.666 (5.961)   12.258 (1.880)   1.03
8   CoffeeMakingTaskTest.testId    0.681 (0.065)    0.469 (0.009)    1.45
9   PotatoPeelingTaskTest.testId   0.676 (0.062)    0.465 (0.008)    1.45
10  SpatialPoolerLocalInhibition   1.580 (0.168)    1.396 (0.029)    1.13
11  TemporalMemory                 0.013 (0.001)    0.006 (0.000)    1.97
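The reported average speedup of 3.49 is consistent with the arithmetic mean of the su column in Table 5. The sketch below assumes that this is how the average was computed; the class name is hypothetical.

```java
// Hypothetical check of the average speedup using the su column of Table 5.
class AverageSpeedup {

    // Speedups of benchmarks 1-11, in table order.
    static final double[] SPEEDUPS = {
        6.57, 8.22, 3.78, 3.82, 5.08, 3.90, 1.03, 1.45, 1.45, 1.13, 1.97
    };

    // Arithmetic mean of the given values.
    static double mean(double[] values) {
        double total = 0.0;
        for (double value : values) {
            total += value;
        }
        return total / values.length;
    }

    public static void main(String[] args) {
        System.out.printf("%.2f%n", mean(SPEEDUPS));
    }
}
```

The mean is dominated by the first six benchmarks; the remaining five all show speedups below 2.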
Conclusion
• Optimize Streams is an open source, automated refactoring tool
that assists developers with writing optimal Java 8 Stream code.
• Integrates an Eclipse refactoring with the advanced static analyses
offered by WALA and SAFE.
• 11 Java projects totaling ∼642 thousand lines of code were used
in the tool’s assessment.
• An average speedup of 3.49 on the refactored code was observed as
part of an experimental study.
Future Work
• Handle more advanced ways of relating ASTs to SSA-based IR.
• Incorporate more kinds of (complex) reductions.
  • Those involving maps.
• Applicability of the tool to other streaming APIs and languages.
• Refactoring side-effect producing code.
  • Result would be code that is amenable to our refactoring.
• Finding other kinds of bugs and misuses of Streaming APIs.
  • Related to non-termination, non-determinism, etc.
Broader Vision
Assist developers not previously familiar with functional programming to
use functional language-inspired programming constructs and APIs in
increasingly pervasive mainstream Object-Oriented (OO) languages that
incorporate such constructs.
Includes empirical studies on how developers use functional-inspired
constructs in real, mainstream OO programs, providing feedback to language
and API designers and a better understanding of this hybrid paradigm.
For Further Reading
Biboudis, Aggelos, Nick Palladinos, George Fourtounis, and Yannis Smaragdakis
(2015). “Streams à la carte: Extensible Pipelines with Object Algebras”. In:
ECOOP, pp. 591–613. doi: 10.4230/LIPIcs.ECOOP.2015.591.
Fink, Stephen J., Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay (May
2008). “Effective Typestate Verification in the Presence of Aliasing”. In: ACM
TOSEM 17.2, pp. 9:1–9:34. doi: 10.1145/1348250.1348255.
Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (Sept.
2018). “A Tool for Optimizing Java 8 Stream Software via Automated
Refactoring”. In: International Working Conference on Source Code Analysis and
Manipulation. SCAM ’18. Engineering Track. Distinguished Paper Award. IEEE.
IEEE Press, pp. 34–39. doi: 10.1109/SCAM.2018.00011.
Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (May
2019). “Safe Automated Refactoring for Intelligent Parallelization of Java 8
Streams”. In: International Conference on Software Engineering. ICSE ’19.
Technical Track. To appear. ACM/IEEE. ACM.
Strom, Robert E. and Shaula Yemini (Jan. 1986). “Typestate: A programming
language concept for enhancing software reliability”. In: IEEE TSE SE-12.1,
pp. 157–171. doi: 10.1109/tse.1986.6312929.

Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams Talk at Columbia University

  • 1. Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams Raffi Khatchadourian Yiming Tang Mehdi Bagherzadeh Syed Ahmed Columbia University, April 25, 2019 Based on work to appear at the ACM/IEEE International Conference on Software Engineering (ICSE ’19), Montreal, Canada and the IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM ’18), Madrid, Spain (Distinguished Paper Award).
  • 3. Streaming APIs • Streaming APIs are widely-available in today’s mainstream, Object-Oriented programming languages [Biboudis et al., 2015]. 1
  • 4. Streaming APIs • Streaming APIs are widely-available in today’s mainstream, Object-Oriented programming languages [Biboudis et al., 2015]. • Incorporate MapReduce-like operations on native data structures like collections. 1
  • 5. Streaming APIs • Streaming APIs are widely-available in today’s mainstream, Object-Oriented programming languages [Biboudis et al., 2015]. • Incorporate MapReduce-like operations on native data structures like collections. • Can make writing parallel code easier, less error-prone (avoid data races, thread contention). 1
  • 6. Streaming API Example in Java >= 8 Consider this simple “widget” class consisting of a “color” and “weight:” 1 // Widget class: 2 public class Widget { 3 4 // enumeration: 5 public enum Color { 6 RED, 7 BLUE, 8 GREEN 9 }; 10 11 // instance fields: 12 private Color color; 13 private double weight; 14 15 // continued ... 16 // constructor: 17 Widget(Color c, double w){ 18 this.color = c; 19 this.weight = w; 20 } 21 22 // accessors/mutators: 23 public Color getColor() { 24 return this.color; 25 } 26 27 public double getWeight(){ 28 return this.weight; 29 } // ... 30 } 2
  • 7. Streaming API Example in Java >= 8
      Consider the following Widget client code:

        // an "unordered" collection of widgets.
        Collection<Widget> unorderedWidgets = new HashSet<>();
        // populate the collection ...
  • 8. Streaming API Example in Java >= 8
      Now suppose we would like to sort the collection by weight using the Java 8 Streaming API:

        // sort widgets by weight.
        List<Widget> sortedWidgets = unorderedWidgets
            .stream()
            .sorted(Comparator.comparing(Widget::getWeight))
            .collect(Collectors.toList());
  • 9. Streaming API Example in Java >= 8
      Now suppose we would like to sort the collection by weight using the Java 8 Streaming API in parallel:

        // sort widgets by weight.
        List<Widget> sortedWidgets = unorderedWidgets
            .parallelStream()
            .sorted(Comparator.comparing(Widget::getWeight))
            .collect(Collectors.toList());
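The two sorting pipelines above can be tried side by side. The following is a minimal sketch, assuming a cut-down `Widget` that carries only a weight; the class name `SortWidgetsDemo` and the helpers `sortedWeights` and `sampleWidgets` are hypothetical and not part of the talk.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SortWidgetsDemo {
    // Cut-down stand-in for the Widget class from the slides (weight only).
    static final class Widget {
        private final double weight;
        Widget(double weight) { this.weight = weight; }
        double getWeight() { return weight; }
    }

    // Sorts by weight, sequentially or in parallel, and returns the weights.
    static List<Double> sortedWeights(Collection<Widget> widgets, boolean parallel) {
        Stream<Widget> stream = parallel ? widgets.parallelStream() : widgets.stream();
        return stream
                .sorted(Comparator.comparing(Widget::getWeight))
                .map(Widget::getWeight)
                .collect(Collectors.toList());
    }

    // An "unordered" collection, as in the slides.
    static Collection<Widget> sampleWidgets() {
        Collection<Widget> widgets = new HashSet<>();
        for (double w : new double[] {5.0, 1.5, 43.2, 12.0, 7.25}) {
            widgets.add(new Widget(w));
        }
        return widgets;
    }

    public static void main(String[] args) {
        Collection<Widget> widgets = sampleWidgets();
        System.out.println(sortedWeights(widgets, false));
        System.out.println(sortedWeights(widgets, true));
    }
}
```

Both calls should yield the same sorted list, because the pipeline has no side effects; only the execution mode differs.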
  • 10. Streaming API Example in Java >= 8
      Without the Streaming API, running this code in parallel, i.e., having multiple iterations occur at once, would have required explicit threads. The parallelizable operation (e.g., sorted()) would need to be isolated and placed into a thread object, forked, and then joined.
      Example:

        Thread t = new Thread(() -> { /* your code here */ });
        t.start(); // fork (calling run() would execute on the current thread)
        // ...
        t.join();  // wait for the forked thread to finish
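To make the fork/join bookkeeping concrete, here is a small sketch that uses summation as a simpler stand-in for sorting. The class `ExplicitThreadsDemo` and its methods are hypothetical names chosen for illustration; the point is only to contrast manual thread management with a one-line parallel stream.

```java
import java.util.Arrays;

final class ExplicitThreadsDemo {
    // Sums the array using two explicitly managed threads.
    static double sumInTwoThreads(double[] weights) {
        int mid = weights.length / 2;
        double[] partial = new double[2]; // one slot per thread, so no shared writes

        Thread left = new Thread(() -> {
            for (int i = 0; i < mid; i++) {
                partial[0] += weights[i];
            }
        });
        Thread right = new Thread(() -> {
            for (int i = mid; i < weights.length; i++) {
                partial[1] += weights[i];
            }
        });

        left.start();  // fork
        right.start();
        try {
            left.join();   // join: wait for both halves before combining
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return partial[0] + partial[1];
    }

    // The same computation expressed as a parallel stream.
    static double sumWithStream(double[] weights) {
        return Arrays.stream(weights).parallel().sum();
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 2.0, 3.0, 4.0, 5.0};
        System.out.println(sumInTwoThreads(weights));
        System.out.println(sumWithStream(weights));
    }
}
```

The stream version leaves partitioning, forking, and joining to the library, which is the convenience the slide describes.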
  • 19. Problem
      • MapReduce traditionally runs in highly-distributed environments with no shared memory.
      • Streaming APIs typically execute on a single node under multiple threads or cores in a shared memory space.
          • Collections reside in local memory.
          • Issues may arise from close ties between shared memory and the operations.
      • Developers must manually determine whether running stream code in parallel is efficient yet interference-free.
          • Requires thorough understanding of the API.
          • Error-prone, possibly requiring complex analysis.
          • Omission-prone, optimization opportunities may be missed.
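The interference risk can be illustrated with a side-effecting lambda. This is a sketch under assumptions: `SideEffectDemo`, `unsafeSquares`, and `safeSquares` are hypothetical names, and the unsafe variant is shown only as a counterexample and is never called here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SideEffectDemo {
    // Unsafe in parallel: the lambda mutates a shared, unsynchronized ArrayList.
    // It may lose elements or throw an exception; do not rely on its result.
    static List<Integer> unsafeSquares(int n) {
        List<Integer> out = new ArrayList<>();
        IntStream.range(0, n).parallel().forEach(i -> out.add(i * i));
        return out;
    }

    // Safe: no shared mutable state; the collector merges per-thread results.
    static List<Integer> safeSquares(int n) {
        return IntStream.range(0, n).parallel()
                .mapToObj(i -> i * i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(safeSquares(10));
    }
}
```

Spotting the difference between these two methods is the kind of manual reasoning the slide calls error-prone: both compile and both look parallel, but only the second is interference-free.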
  • 24. Motivating Example
      Before:

        1 List<Widget> sortedWidgets
        2   = unorderedWidgets
        3     .stream()
        4     .sorted(Comparator
        5       .comparing(
        6         Widget::getWeight))
        7     .collect(
        8       Collectors.toList());

      After (only line 3 changes):

        3     .parallelStream()

      • We can perform the transformation at line 3 because the operations do not access shared memory, i.e., no side-effects.
      • Had the stream been ordered, however, running in parallel may result in worse performance due to sorted() requiring multiple passes and data buffering.
      • Such operations are called stateful intermediate operations (SIOs).
      • Maintaining data ordering is detrimental to parallel performance.
  • 27. Motivating Example

        // collect weights over 43.2
        // into a set in parallel.
        Set<Double> heavyWidgetWeightSet =
          orderedWidgets
            .parallelStream()
            .map(Widget::getWeight)
            .filter(w -> w > 43.2)
            .collect(Collectors.toSet());

      • No optimizations are available here because there is no SIO.
      • No performance degradation.
  • 32. Motivating Example

         1 // sequentially collect into
         2 // a list, skipping first
         3 // 1000.
         4 List<Widget>
         5   skippedWidgetList =
         6   orderedWidgets
         7     .stream()
         8     .skip(1000)
         9     .collect(
        10       Collectors.toList());

      • Like sorted(), skip() is also an SIO.
      • But, the stream is ordered, making parallelism counterproductive.
      • Could unorder (via unordered()) to improve parallel performance.
      • But, doing so would alter semantics due to the target collection being ordered (line 10).
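The semantic point about skip() can be checked on a small ordered list. The sketch below uses integers in place of widgets; `SkipDemo` and its helper methods are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class SkipDemo {
    // Builds the ordered list [from, to).
    static List<Integer> range(int from, int to) {
        return IntStream.range(from, to).boxed().collect(Collectors.toList());
    }

    // On an ordered stream, skip(n) drops the first n elements in encounter
    // order, whether the stream is sequential or parallel.
    static List<Integer> skipFirst(List<Integer> ordered, long n, boolean parallel) {
        Stream<Integer> stream = parallel ? ordered.parallelStream() : ordered.stream();
        return stream.skip(n).collect(Collectors.toList());
    }

    // After unordered(), a parallel skip(n) may drop any n elements, so only
    // the size of the result is predictable, not its contents.
    static List<Integer> skipAnyUnordered(List<Integer> ordered, long n) {
        return ordered.parallelStream().unordered().skip(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = range(0, 2000);
        System.out.println(skipFirst(input, 1000, false).equals(range(1000, 2000)));
        System.out.println(skipAnyUnordered(input, 1000).size());
    }
}
```

The ordered variants always return elements 1000 through 1999, which is the behavior that unordering would put at risk when the result is collected into a list.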
  • 37. Motivating Example
      Before:

        // collect the first green
        // widgets into a list.
        List<Widget> firstGreenList
          = orderedWidgets
            .stream()
            .filter(w -> w.getColor() == Color.GREEN)
            .unordered()
            .limit(5)
            .collect(Collectors.toList());

      After (only the stream creation changes):

            .parallelStream()

      • limit() is an SIO and the stream is ordered.
      • But, the stream is unordered before limit().
      • It’s safe and advantageous to run in parallel.
      • A stream’s ordering does not only depend on its source.
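A runnable version of the refactored pipeline shows what is and is not guaranteed once the stream has been unordered. This is a sketch with a cut-down `Widget` (color only); `UnorderedLimitDemo`, `anyGreen`, and `sample` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class UnorderedLimitDemo {
    enum Color { RED, BLUE, GREEN }

    // Cut-down stand-in for the Widget class from the slides (color only).
    static final class Widget {
        private final Color color;
        Widget(Color color) { this.color = color; }
        Color getColor() { return color; }
    }

    // Returns n green widgets; because of unordered(), these need not be the
    // first n in the source list.
    static List<Widget> anyGreen(List<Widget> widgets, int n) {
        return widgets.parallelStream()
                .filter(w -> w.getColor() == Color.GREEN)
                .unordered()
                .limit(n)
                .collect(Collectors.toList());
    }

    // Builds an ordered list that cycles through RED, BLUE, GREEN.
    static List<Widget> sample(int count) {
        List<Widget> widgets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            widgets.add(new Widget(Color.values()[i % 3]));
        }
        return widgets;
    }

    public static void main(String[] args) {
        System.out.println(anyGreen(sample(300), 5).size());
    }
}
```

The original code already gave up the "first five" guarantee by calling unordered(), which is why switching to a parallel stream does not change its meaning.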
  • 42. Motivating Example

         1 // collect distinct widget
         2 // weights into a TreeSet.
         3 Set<Double>
         4   distinctWeightSet =
         5   orderedWidgets
         6     .stream()
         7     .parallel()
         8     .map(Widget::getWeight)
         9     .distinct()
        10     .collect(Collectors
        11       .toCollection(
        12         TreeSet::new));

      • Computation is already in parallel (line 7).
      • distinct() is an SIO and the stream is ordered.
      • Can we keep it in parallel? No, because TreeSets are ordered.
      • De-parallelize on line 7.
  • 46. Motivating Example
      Before:

         1 // collect distinct widget
         2 // colors into a HashSet.
         3 Set<Color>
         4   distinctColorSet =
         5   orderedWidgets
         6     .parallelStream()
         7     .map(Widget::getColor)
         8     .distinct()
         9     .collect(HashSet::new,
        10       Set::add,
        11       Set::addAll);

      After (only line 8 changes):

         8     .unordered().distinct()

      • Computation is already in parallel (line 6).
      • Direct form of collect() (line 11).
      • Since the reduction is to an unordered collection, we can unorder immediately before distinct() (line 8) to improve performance.
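That the inserted unordered() call does not change the result can be checked directly. The sketch below streams colors rather than widgets to stay short; `DistinctColorsDemo` and `distinctColors` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

final class DistinctColorsDemo {
    enum Color { RED, BLUE, GREEN }

    // Collects distinct colors into a HashSet using the direct form of
    // collect(), optionally unordering the stream before distinct().
    static Set<Color> distinctColors(List<Color> colors, boolean unorderFirst) {
        Stream<Color> stream = colors.parallelStream();
        if (unorderFirst) {
            stream = stream.unordered();
        }
        return stream.distinct().collect(HashSet::new, Set::add, Set::addAll);
    }

    public static void main(String[] args) {
        List<Color> colors = Arrays.asList(Color.RED, Color.GREEN, Color.RED, Color.GREEN);
        System.out.println(distinctColors(colors, false).equals(distinctColors(colors, true)));
    }
}
```

Because a HashSet has no defined iteration order, the two variants are indistinguishable to callers, so the refactoring is free to drop the ordering constraint.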
  • 56. Background: Static Analysis and Automated Refactoring
      • Static analysis is the process of examining source code to understand how the code works without running it.
          • Does not rely on test suites.
          • Undecidable in the general case (Rice’s Theorem). Instead, uses approximations.
      • Refactoring is the process of restructuring code for improved design, better performance, and other non-functional enhancements.
          • The semantics (meaning) of the code remains intact.
          • Examples include renaming a method (function) and pulling up members in sibling classes to a super class to reduce redundancy.
          • Essential part of agile development.
      • Automated refactoring works by combining static analysis, type theory, machine learning, and other front-end compiler technologies to produce code changes that would have been made by an expert human developer.
          • Very much a problem of automated software engineering.
  • 67. Solution
      • Devised a fully-automated, semantics-preserving refactoring approach.
      • Embodied by an open source refactoring tool named Optimize Streams.
      • Transforms Java 8 stream code for improved performance.
      • Based on:
          • Novel ordering analysis.
              • Infers when maintaining ordering is necessary for semantics preservation.
          • Typestate analysis [Fink et al., 2008; Strom and Yemini, 1986].
              • Augments the type system with “state.”
              • Traditionally used for preventing resource usage errors.
              • Requires interprocedural and alias analyses.
              • Novel adaptation for possibly immutable objects (streams).
  • 70. Solution Highlights
      • First to integrate automated refactoring with typestate analysis [1].
      • Uses the WALA static analysis framework [2] and the SAFE typestate analysis engine [3].
      • Combines analysis results from varying intermediate representations (SSA, AST).
      [1] To the best of our knowledge.
      [2] http://wala.sf.net
      [3] http://git.io/vxwBs
  • 74. Identifying Refactoring Preconditions
      • Refactoring preconditions are conditions that must hold to guarantee that the transformation is type-correct and semantics-preserving.
      • Our refactoring is (conceptually) split into two:
          • Convert Sequential Stream to Parallel.
          • Optimize Parallel Stream.
  • 75. Identifying Refactoring Preconditions
      Table 1: Convert Sequential Stream to Parallel preconditions.

             exe   ord     se   SIO   ROM   transformation
        P1   seq   unord   F    N/A   N/A   Convert to para.
        P2   seq   ord     F    F     N/A   Convert to para.
        P3   seq   ord     F    T     F     Unorder and convert to para.

      exe = execution mode, ord = ordering, se = side effects, SIO = stateful intermediate operation, ROM = reduction ordering matters.
  • 76. Identifying Refactoring Preconditions
      Table 2: Optimize Parallel Stream preconditions.

             exe    ord   SIO   ROM   transformation
        P4   para   ord   T     F     Unorder.
        P5   para   ord   T     T     Convert to seq.
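Tables 1 and 2 can be read as a single decision function. The sketch below encodes the five rows; treating every combination that matches no row as "no transformation" is an assumption made here, and the names `PreconditionTable`, `Mode`, `Ordering`, `Action`, and `decide` are hypothetical rather than taken from the tool.

```java
final class PreconditionTable {
    enum Mode { SEQ, PARA }
    enum Ordering { ORD, UNORD }
    enum Action {
        CONVERT_TO_PARALLEL,
        UNORDER_AND_CONVERT_TO_PARALLEL,
        UNORDER,
        CONVERT_TO_SEQUENTIAL,
        NONE // assumption: no table row matches
    }

    // exe, ord, sideEffects (se), sio (SIO), and rom (ROM) mirror the table columns.
    static Action decide(Mode exe, Ordering ord, boolean sideEffects, boolean sio, boolean rom) {
        if (exe == Mode.SEQ) {
            if (sideEffects) {
                return Action.NONE; // every row of Table 1 requires se = F
            }
            if (ord == Ordering.UNORD) {
                return Action.CONVERT_TO_PARALLEL; // P1
            }
            if (!sio) {
                return Action.CONVERT_TO_PARALLEL; // P2
            }
            return rom ? Action.NONE : Action.UNORDER_AND_CONVERT_TO_PARALLEL; // P3
        }
        // exe == PARA: Table 2 has no side-effects column, so se is not consulted.
        if (ord == Ordering.ORD && sio) {
            return rom ? Action.CONVERT_TO_SEQUENTIAL : Action.UNORDER; // P5 / P4
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(Mode.SEQ, Ordering.ORD, false, true, false));
        System.out.println(decide(Mode.PARA, Ordering.ORD, false, true, true));
    }
}
```

In the actual approach the inputs are inferred by static analysis; here they are simply passed in as booleans.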
  • 77. DFA for Determining Stream Execution Mode
      [Figure 1: A subset of the relation E_→ in E = (E_S, E_Λ, E_→).]
      States: ⊥ (start), seq, para.
      ⊥ → seq: Col.stream(), BufferedReader.lines(), Files.lines(Path), JarFile.stream(), Pattern.splitAsStream(), Random.ints()
      ⊥ → para: Col.parallelStream()
      seq, para → seq: BaseStream.sequential()
      seq, para → para: BaseStream.parallel()
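The execution-mode automaton is small enough to model directly. The following sketch abstracts the creation methods into two symbols; `ExecutionModeDfa`, `State`, `Call`, `step`, and `run` are hypothetical names, and rejecting transitions the figure does not show is an assumption of this sketch.

```java
final class ExecutionModeDfa {
    enum State { BOTTOM, SEQ, PARA }

    // STREAM stands for any sequential creation call (e.g., Col.stream()),
    // PARALLEL_STREAM for Col.parallelStream().
    enum Call { STREAM, PARALLEL_STREAM, SEQUENTIAL, PARALLEL }

    static State step(State current, Call call) {
        switch (call) {
            case STREAM:
                if (current == State.BOTTOM) return State.SEQ;
                break;
            case PARALLEL_STREAM:
                if (current == State.BOTTOM) return State.PARA;
                break;
            case SEQUENTIAL:
                if (current != State.BOTTOM) return State.SEQ;
                break;
            case PARALLEL:
                if (current != State.BOTTOM) return State.PARA;
                break;
        }
        throw new IllegalArgumentException("no transition from " + current + " on " + call);
    }

    // Runs the automaton from the start state over a sequence of calls.
    static State run(Call... calls) {
        State state = State.BOTTOM;
        for (Call call : calls) {
            state = step(state, call);
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(run(Call.STREAM, Call.PARALLEL));
    }
}
```

For example, the pipeline `stream().parallel()` from the TreeSet slide ends in the para state, matching the observation that its computation is already parallel.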
  • 78. DFA for Determining Stream Ordering
      [Figure 2: A subset of the relation O_→ in O = (O_S, O_Λ, O_→).]
      States: ⊥ (start), ord, unord.
      ⊥ → ord: Arrays.stream(T[]), Stream.of(T...), IntStream.range(), Stream.iterate(), BitSet.stream(), Col.parallelStream()
      ⊥ → unord: Stream.generate(), HashSet.stream(), PriorityQueue.stream(), CopyOnWrite.parallelStream(), BeanContextSupport.stream(), Random.ints()
      Edge labels among ord and unord: Stream.sorted(), BaseStream.unordered(), Stream.concat(ordered), Stream.concat(unordered)
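The ordering property that this automaton tracks statically can also be observed at run time through a stream's spliterator. This is only a runtime probe for illustration, not how the tool works; `OrderingProbe` and `isOrdered` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

final class OrderingProbe {
    // Reports whether the stream currently carries the ORDERED characteristic.
    // Calling spliterator() consumes the stream, so probe a fresh stream each time.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(3, 1, 2);
        Set<Integer> set = new HashSet<>(list);

        System.out.println(isOrdered(list.stream()));             // list source
        System.out.println(isOrdered(set.stream()));              // HashSet source
        System.out.println(isOrdered(set.stream().sorted()));     // sorted() imposes an order
        System.out.println(isOrdered(list.stream().unordered())); // unordered() drops it
    }
}
```

The probe reflects the point that a stream's ordering depends on the intermediate operations applied to it as well as on its source.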
  • 83. Optimize Streams Eclipse Refactoring Plug-in
      • Implemented an open source refactoring tool named Optimize Streams.
      • Publicly available as an open source Eclipse IDE [4] plug-in [5].
      • Can be used by projects not using Eclipse.
      • Includes a fully functional UI, preview pane, and refactoring unit tests.
      [4] http://eclipse.org
      [5] Available at http://git.io/vpTLk
  • 87. Results
    • Applied to 11 Java projects of varying size and domain with a total of ∼642 KSLOC.
    • 36.31% of candidate streams were refactorable.
    • Observed an average speedup of 3.49 during performance testing.
    • See [Khatchadourian et al., 2018, 2019] for more details, including user feedback, as well as tool and data set engineering challenges.
  • 88. Results
    Table 3: Experimental results.
      subject     KLOC    eps  k  str  rft  P1  P2  P3  t (m)
      htm.java    41.14    21  4   34   10   0  10   0   1.85
      JacpFX      23.79   195  4    4    3   3   0   0   2.31
      jdp*        19.96    25  4   28   15   1  13   1  31.88
      jdk8-exp*    3.43   134  4   26    4   0   4   0   0.78
      jetty      354.48   106  4   21    7   3   4   0  17.85
      jOOQ       154.01    43  4    5    1   0   1   0  12.94
      koral        7.13    51  3    6    6   0   6   0   1.06
      monads       1.01    47  2    1    1   0   1   0   0.05
      retroλ       5.14     1  4    8    6   3   3   0   0.66
      streamql     4.01    92  2   22    2   0   2   0   0.72
      threeten    27.53    36  2    2    2   0   2   0   0.51
      Total      641.65   751  4  157   57  10  46   1  70.60
    * jdp is java-design-patterns and jdk8-exp is jdk8-experiments.
  • 89. Refactoring Failures
    Table 4: Refactoring failures.
      failure                                  pc   cnt
      F1. InconsistentPossibleExecutionModes          1
      F2. NoStatefulIntermediateOperations     P5     1
      F3. NonDeterminableReductionOrdering            5
      F4. NoTerminalOperations                       13
      F5. CurrentlyNotHandled                        16
      F6. ReduceOrderingMatters                P3    19
      F7. HasSideEffects                       P1     4
                                               P2    41
      Total                                         100
  • 90. Performance Evaluation
    Table 5: Average run times of JMH benchmarks.
      #   benchmark                     orig (s/op)     refact (s/op)   su
      1   shouldRetrieveChildren         0.011 (0.001)   0.002 (0.000)  6.57
      2   shouldConstructCar             0.011 (0.001)   0.001 (0.000)  8.22
      3   addingShouldResultInFailure    0.014 (0.000)   0.004 (0.000)  3.78
      4   deletionShouldBeSuccess        0.013 (0.000)   0.003 (0.000)  3.82
      5   addingShouldResultInSuccess    0.027 (0.000)   0.005 (0.000)  5.08
      6   deletionShouldBeFailure        0.014 (0.000)   0.004 (0.000)  3.90
      7   specification.AppTest.test    12.666 (5.961)  12.258 (1.880)  1.03
      8   CoffeeMakingTaskTest.testId    0.681 (0.065)   0.469 (0.009)  1.45
      9   PotatoPeelingTaskTest.testId   0.676 (0.062)   0.465 (0.008)  1.45
      10  SpatialPoolerLocalInhibition   1.580 (0.168)   1.396 (0.029)  1.13
      11  TemporalMemory                 0.013 (0.001)   0.006 (0.000)  1.97
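    As a quick arithmetic check, the reported average speedup of 3.49 matches the arithmetic mean of the su column in Table 5. The sketch below only reproduces that calculation from the values printed in the table; the class name is hypothetical. The per-row speedups themselves cannot be recomputed from the rounded run times shown (for example, 0.011 / 0.002 = 5.5, not 6.57), so they were presumably derived from unrounded measurements.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper that averages the "su" column of Table 5.
public class SpeedupAverage {

    // Speedup values copied from rows 1 to 11 of Table 5.
    static final double[] SPEEDUPS = {
        6.57, 8.22, 3.78, 3.82, 5.08, 3.90, 1.03, 1.45, 1.45, 1.13, 1.97
    };

    // Arithmetic mean; NaN for an empty input.
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // The sum is 38.40 over 11 benchmarks, which rounds to 3.49.
        System.out.println(String.format(Locale.ROOT, "%.2f", mean(SPEEDUPS)));
    }
}
```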
  • 95. Conclusion
    • Optimize Streams is an open source, automated refactoring tool that assists developers with writing optimal Java 8 Stream code.
    • Integrates an Eclipse refactoring with the advanced static analyses offered by WALA and SAFE.
    • 11 Java projects totaling ∼642 thousand lines of code were used in the tool’s assessment.
    • An average speedup of 3.49 on the refactored code was observed as part of an experimental study.
  • 103. Future Work
    • Handle more advanced ways of relating ASTs to SSA-based IR.
    • Incorporate more kinds of (complex) reductions.
      • Those involving maps.
    • Applicability of the tool to other streaming APIs and languages.
    • Refactoring side-effect producing code.
      • Result would be code that is amenable to our refactoring.
    • Finding other kinds of bugs and misuses of Streaming APIs.
      • Related to non-termination, non-determinism, etc.
  • 104. Broader Vision
    Assist developers not previously familiar with functional programming in using functional language-inspired programming constructs and APIs in the increasingly pervasive mainstream Object-Oriented (OO) languages that incorporate such constructs.
    This includes empirical studies on how developers use functional-inspired constructs in real, mainstream OO programs, providing feedback to language and API designers and a better understanding of this hybrid paradigm.
  • 105. For Further Reading
    Biboudis, Aggelos, Nick Palladinos, George Fourtounis, and Yannis Smaragdakis (2015). “Streams à la carte: Extensible Pipelines with Object Algebras”. In: ECOOP, pp. 591–613. doi: 10.4230/LIPIcs.ECOOP.2015.591.
    Fink, Stephen J., Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay (May 2008). “Effective Typestate Verification in the Presence of Aliasing”. In: ACM TOSEM 17.2, pp. 9:1–9:34. doi: 10.1145/1348250.1348255.
    Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (Sept. 2018). “A Tool for Optimizing Java 8 Stream Software via Automated Refactoring”. In: International Working Conference on Source Code Analysis and Manipulation. SCAM ’18. Engineering Track. Distinguished Paper Award. IEEE Press, pp. 34–39. doi: 10.1109/SCAM.2018.00011.
    Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (May 2019). “Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams”. In: International Conference on Software Engineering. ICSE ’19. Technical Track. To appear. ACM/IEEE.
    Strom, Robert E. and Shaula Yemini (Jan. 1986). “Typestate: A programming language concept for enhancing software reliability”. In: IEEE TSE SE-12.1, pp. 157–171. doi: 10.1109/tse.1986.6312929.