Latency tracing in distributed Java applications

1
JAVA DAY KYIV 2017
LATENCY TRACING
IN DISTRIBUTED JAVA
APPLICATIONS
KONSTANTIN SLISENKO
LEAD SOFTWARE ENGINEER
NOV 5, 2017

2
kanstantsin_slisenka@epam.com
Konstantin Slisenko
Java Team Lead in EPAM
Financial services, trading solutions
Speaker at Java Meetups
github.com/kslisenko

5
AGENDA
1. What is latency tracing?
2. How does it work?
3. Live demo

6
A LITTLE STORY ABOUT ONE SYSTEM

9
Response time: 150ms
[Diagram: services owned by Team 1, Team 2 and Team 3; every call is labeled 150ms]

10
Response time: 300ms
[Diagram: services owned by Team 1, Team 2 and Team 3; every call is labeled 300ms]

11
Response time: timeout
[Diagram: services owned by Team 1, Team 2 and Team 3; every call is labeled timeout]

12
Response time: timeout
[Diagram: services owned by Team 1, Team 2 and Team 3; every call is labeled timeout; a frustrated user waits for the response]

15
1. Whose fault?
Where?
2. Why?

16
1. Whose fault?
Where?
2. Why?
3. How to prevent?

17
PROFILERS, LOGS, METRICS?

18
“When systems involve not just dozens of subsystems
but dozens of engineering teams, even our best and
most experienced engineers routinely guess wrong about
the root cause of poor end-to-end performance”
https://research.google.com/pubs/pub36356.html

19
THE MOMENT WHEN YOU NEED LATENCY TRACING

20
AGENDA
1. What is latency tracing?
2. How does it work?
3. Live demo

22
HOW IT WORKS
[Diagram: the request is tagged with ID = 1; the service reports ID=1, 300 ms]

23
HOW IT WORKS
[Diagram: ID = 1 is passed on to a second service; reported timings: ID=1 300 ms, ID=1 150 ms]

24
HOW IT WORKS
[Diagram: ID = 1 is passed on to a third service; reported timings: ID=1 300 ms, ID=1 150 ms, ID=1 120 ms]
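
The collecting side of this picture can be sketched in a few lines of Java. This is only an illustration: the class, the in-memory map and the service names are assumptions made for the example, and the mapping of 150 ms and 120 ms to particular services is a guess. Only the ID and the three timings come from the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimingCollectorSketch {

    /** One timing report sent by a service for a given request ID. */
    static final class Report {
        final String requestId;
        final String service;
        final long millis;

        Report(String requestId, String service, long millis) {
            this.requestId = requestId;
            this.service = service;
            this.millis = millis;
        }
    }

    // Reports grouped by request ID, kept in arrival order.
    private final Map<String, List<Report>> byRequestId = new LinkedHashMap<>();

    /** Stores a report under the request ID it belongs to. */
    void accept(Report report) {
        byRequestId.computeIfAbsent(report.requestId, id -> new ArrayList<>()).add(report);
    }

    /** Returns every report that shares the given request ID (empty if none). */
    List<Report> reportsFor(String requestId) {
        List<Report> reports = byRequestId.get(requestId);
        return reports == null ? new ArrayList<Report>() : reports;
    }

    public static void main(String[] args) {
        TimingCollectorSketch collector = new TimingCollectorSketch();
        // Timings from the slide; the service names are made up.
        collector.accept(new Report("1", "service-1", 300));
        collector.accept(new Report("1", "service-2", 150));
        collector.accept(new Report("1", "service-3", 120));
        for (Report r : collector.reportsFor("1")) {
            System.out.println("ID=" + r.requestId + " " + r.service + " " + r.millis + " ms");
        }
    }
}
```

Because every report carries the same ID, the collector can line up the timings of one request across services that otherwise know nothing about each other.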

25
TRACES AND SPANS
Service 1: span id: 1, no parent id
Service 2: parent id: 1
Service 3: parent id: 1

26
TRACES AND SPANS
Service 1: span id: 1, no parent id
  Service 2: span id: 2, parent id: 1
    JAVA CODE: span id: 3, parent id: 2
    DB CALL: span id: 4, parent id: 2
  Service 3: span id: 5, parent id: 1
    JAVA CODE: span id: 6, parent id: 5
    DB CALL: span id: 7, parent id: 5
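
As a rough illustration of how parent ids turn a flat list of spans into a call tree, the sketch below rebuilds the seven spans from this slide. The `Span` class and the rendering format are assumptions made for the example; real tracers carry more fields (trace id, timestamps, tags).

```java
import java.util.ArrayList;
import java.util.List;

final class SpanTreeSketch {

    static final class Span {
        final int spanId;
        final Integer parentId; // null means "no parent id" (the root span)
        final String name;

        Span(int spanId, Integer parentId, String name) {
            this.spanId = spanId;
            this.parentId = parentId;
            this.name = name;
        }
    }

    /** The seven spans from the slide, as a flat list like a collector would receive. */
    static List<Span> exampleTrace() {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(1, null, "Service 1"));
        spans.add(new Span(2, 1, "Service 2"));
        spans.add(new Span(3, 2, "JAVA CODE"));
        spans.add(new Span(4, 2, "DB CALL"));
        spans.add(new Span(5, 1, "Service 3"));
        spans.add(new Span(6, 5, "JAVA CODE"));
        spans.add(new Span(7, 5, "DB CALL"));
        return spans;
    }

    /** Renders the call tree by following parent ids, two spaces per nesting level. */
    static String render(List<Span> spans) {
        StringBuilder out = new StringBuilder();
        appendChildren(spans, null, 0, out);
        return out.toString();
    }

    // Appends every span whose parent is parentId, then recurses into its children.
    private static void appendChildren(List<Span> spans, Integer parentId, int depth, StringBuilder out) {
        for (Span span : spans) {
            boolean isChild = (parentId == null)
                    ? span.parentId == null
                    : parentId.equals(span.parentId);
            if (!isChild) {
                continue;
            }
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(span.name).append(" (span id: ").append(span.spanId).append(")\n");
            appendChildren(spans, span.spanId, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(render(exampleTrace()));
    }
}
```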

27
TRACES AND SPANS
Service 1: span id: 1, no parent id
  Service 2: span id: 2, parent id: 1
    JAVA CODE: span id: 3, parent id: 2
    DB CALL: span id: 4, parent id: 2
  Service 3: span id: 5, parent id: 1
    JAVA CODE: span id: 6, parent id: 5
    DB CALL: span id: 7, parent id: 5
[Overlay: the spans are attributed to Team 1, Team 2 and Team 3]

28
1. Whose fault?
Where?
2. Why?
3. How to prevent?

29
HOW DO I ADD THIS
TO MY PROJECT?

30
SO, THE PLAN IS
1. Pass request IDs between tiers
2. Measure and report processing time
3. Collect traces and spans

31
Communication protocols
• Pass trace/span IDs
• Use HTTP headers, JMS message properties
• Modify custom protocols
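
A minimal sketch of passing the IDs in message headers, assuming the outgoing and incoming headers are available as a plain `Map`. The header names `X-Trace-Id` and `X-Parent-Span-Id` are hypothetical; each tracer defines its own.

```java
import java.util.HashMap;
import java.util.Map;

final class TraceHeadersSketch {

    // Hypothetical header names, chosen only for this example.
    static final String TRACE_ID = "X-Trace-Id";
    static final String PARENT_SPAN_ID = "X-Parent-Span-Id";

    /** Caller side: copy the current IDs into the outgoing message headers. */
    static void inject(String traceId, String spanId, Map<String, String> headers) {
        headers.put(TRACE_ID, traceId);
        headers.put(PARENT_SPAN_ID, spanId);
    }

    /** Callee side: the trace ID sent by the caller, or null when absent. */
    static String traceId(Map<String, String> headers) {
        return headers.get(TRACE_ID);
    }

    /** Callee side: the caller's span ID, which becomes the parent id of the new span. */
    static String parentSpanId(Map<String, String> headers) {
        return headers.get(PARENT_SPAN_ID);
    }

    public static void main(String[] args) {
        // A plain HashMap is case-sensitive; real HTTP header lookup is not.
        Map<String, String> headers = new HashMap<>();
        inject("1", "2", headers);
        System.out.println("trace id: " + traceId(headers) + ", parent id: " + parentSpanId(headers));
    }
}
```

The same idea carries over to JMS message properties, although JMS property names follow Java identifier rules, so names without hyphens would be needed there.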

32
Entry points
• Intercept communication frameworks (HTTP, JMS, RPC, …)
• Start new traces
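
What an entry-point interceptor decides can be sketched as follows: reuse the trace ID that arrived with the request, or start a new trace when there is none. The header name, the `handle` wrapper and the use of a random UUID as the ID are assumptions made for the example, not the behavior of any particular framework.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

final class EntryPointSketch {

    // Hypothetical header name, chosen only for this example.
    static final String TRACE_ID_HEADER = "X-Trace-Id";

    /** Continues the caller's trace if one was sent, otherwise starts a new one. */
    static String traceIdFor(Map<String, String> requestHeaders) {
        String incoming = requestHeaders.get(TRACE_ID_HEADER);
        if (incoming == null || incoming.isEmpty()) {
            return UUID.randomUUID().toString(); // new trace starts here
        }
        return incoming;
    }

    /** Stand-in for a framework interceptor: resolves the trace ID, then calls the handler. */
    static <R> R handle(Map<String, String> requestHeaders, Function<String, R> handler) {
        return handler.apply(traceIdFor(requestHeaders));
    }

    public static void main(String[] args) {
        Map<String, String> fromUpstream = new HashMap<>();
        fromUpstream.put(TRACE_ID_HEADER, "1");
        System.out.println(handle(fromUpstream, id -> "continuing trace " + id));
        System.out.println(handle(new HashMap<String, String>(), id -> "started trace " + id));
    }
}
```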

33
Method execution flow
• Measure execution time
• Report new spans
• Capture method arguments
• Thread locals for trace/span IDs
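
The four points above can be sketched with a small wrapper around a method body. Everything here is hypothetical (the `traced` helper, the `SpanRecord` class, the in-memory `REPORTED` list); instrumentation tools generate the equivalent of this wrapper rather than asking you to call it by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class MethodTracingSketch {

    /** A finished span: what ran, under which parent, with which arguments, for how long. */
    static final class SpanRecord {
        final String method;
        final String parentMethod; // null for the first traced method on this thread
        final String arguments;
        final long nanos;

        SpanRecord(String method, String parentMethod, String arguments, long nanos) {
            this.method = method;
            this.parentMethod = parentMethod;
            this.arguments = arguments;
            this.nanos = nanos;
        }
    }

    // The span currently executing on this thread.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // Stand-in for a reporter that would ship spans to a collector.
    static final List<SpanRecord> REPORTED = new ArrayList<>();

    static <T> T traced(String method, Object[] args, Supplier<T> body) {
        String parent = CURRENT.get();
        CURRENT.set(method);
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long nanos = System.nanoTime() - start;
            synchronized (REPORTED) {
                REPORTED.add(new SpanRecord(method, parent, Arrays.toString(args), nanos));
            }
            // Restore the parent so that sibling calls are attributed correctly.
            if (parent == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(parent);
            }
        }
    }

    public static void main(String[] args) {
        traced("handleOrder", new Object[] {42}, () ->
                traced("loadFromDb", new Object[] {"orders", 42}, () -> "row"));
        for (SpanRecord s : REPORTED) {
            System.out.println(s.method + s.arguments + " parent=" + s.parentMethod + " " + s.nanos + " ns");
        }
    }
}
```

Inner spans finish first, so they are reported before their parents; a collector has to tolerate that ordering.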

34
Asynchronous invocation
• Intercept new thread starting
• Pass trace/span IDs to the new threads
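
Thread locals do not follow a task onto another thread, which is why the hand-off has to be intercepted. A minimal sketch, assuming the trace ID lives in a `ThreadLocal` and tasks are submitted as `Callable`s; the `propagating` wrapper and the `traceIdSeenByWorker` demo helper are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AsyncPropagationSketch {

    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Captures the trace ID on the submitting thread and installs it on the worker thread. */
    static <T> Callable<T> propagating(final Callable<T> task) {
        final String captured = TRACE_ID.get(); // read here, on the submitting thread
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                return task.call();
            } finally {
                // Pooled threads are reused, so leave no stale ID behind.
                if (previous == null) {
                    TRACE_ID.remove();
                } else {
                    TRACE_ID.set(previous);
                }
            }
        };
    }

    /** Demo helper: sets a trace ID, runs a task on a worker thread, returns what the worker saw. */
    static String traceIdSeenByWorker(String traceId) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        TRACE_ID.set(traceId);
        try {
            Callable<String> readTraceId = () -> TRACE_ID.get();
            return pool.submit(propagating(readTraceId)).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            TRACE_ID.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker saw trace id: " + traceIdSeenByWorker("1"));
    }
}
```

Submitting `readTraceId` without the wrapper would make the worker see no trace ID at all, because the worker thread has its own, empty thread-local slot.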

35
WHAT NEEDS TO BE CHANGED
Communication protocols
• Pass trace/span IDs
• Use HTTP headers, JMS message properties
• Modify custom protocols
Method execution flow
• Measure execution time
• Report new spans
• Capture method arguments
• Thread locals for trace/span IDs
Entry points
• Intercept communication frameworks (HTTP, JMS, RPC, …)
• Start new traces
Asynchronous invocation
• Intercept new thread starting
• Pass trace/span IDs to the new threads

36
HOW DO I MODIFY
MY JAVA APP?

38
Instrumentation in Java
Source code

39
Instrumentation in Java
Source code
Byte code

40
Instrumentation in Java
On the fly: custom class loader, Java agents, run-time aspects
Bytecode libraries: JAVASSIST, ASM, …

41
Instrumentation in Java
On the fly: custom class loader, Java agents, run-time aspects
Offline: compile-time aspects
Bytecode libraries: JAVASSIST, ASM, …
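
To make the "Java agents" option concrete, here is a minimal sketch built on the standard `java.lang.instrument` API. The transformer only counts the classes it would instrument and leaves the bytecode untouched; the class names and the `com/example/` package prefix are hypothetical.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// In a packaged agent this class would typically be public and named
// in the Premain-Class attribute of the agent jar's manifest.
final class TracingAgentSketch {

    /** Sees each class as it is loaded; returning null leaves the bytecode unchanged. */
    static final class CountingTransformer implements ClassFileTransformer {
        final String packagePrefix; // internal form with slashes, e.g. "com/example/"
        final AtomicInteger matched = new AtomicInteger();

        CountingTransformer(String packagePrefix) {
            this.packagePrefix = packagePrefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (className != null && className.startsWith(packagePrefix)) {
                matched.incrementAndGet();
                // A real tracing agent would rewrite classfileBuffer here
                // (for example with Javassist or ASM) to add timing code.
            }
            return null; // null means "no transformation"
        }
    }

    /** Invoked by the JVM before main() when started with -javaagent:<agent jar>. */
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new CountingTransformer("com/example/"));
    }
}
```

The appeal of this approach is that the application's source code and build stay unchanged; the instrumentation is switched on with a JVM flag.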

43
COMMERCIAL
Magic Quadrant for Application Performance Monitoring Suites (21 December 2016)
https://www.gartner.com/doc/reprints?id=1-3OGTPY9&ct=161221
OPEN-SOURCE
Java Performance Monitoring: 5 Open Source Tools You Should Know (19 January 2017)
https://dzone.com/articles/java-performance-monitoring-5-open-source-tools-you-should-know
www.stagemonitor.org
github.com/naver/pinpoint
www.moskito.org
glowroot.org
kamon.io
zipkin.io

44
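A vendor-neutral open standard for distributed tracing
http://opentracing.io

import io.opentracing.Span;
import io.opentracing.Tracer;

Tracer tracer = ...;
Span parentSpan = ...;
// Start a child span of parentSpan for one unit of work
Span span = tracer
    .buildSpan("someWork")
    .asChildOf(parentSpan.context())
    .withTag("foo", "bar")
    .start();
try {
    // Do things
} finally {
    // Always finish the span so that its duration is reported
    span.finish();
}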

45
AGENDA
1. What is latency tracing?
2. How does it work?
3. Live demo

47
http://github.com/kslisenko/java-performance

48
1. HTTP

49
1. HTTP
2. JMS

50
1. HTTP
2. JMS
3. Custom protocol

51
LET’S GO!
github.com/kslisenko/java-performance

53
LATENCY TRACING ISSUES AND LIMITATIONS
1. Computation and I/O overhead
2. Custom protocols
3. Reactive streams, batch processing
4. Security and privacy

54
Latency tracing
• A must-have for microservices
• Best used in production
• At the very least, during performance testing
