Warsaw MuleSoft Meetup #12 Effective Streaming
2. Agenda
● Introductions & Community Updates
● Effective Streaming in Mule 4
● Quiz & Lottery
5. Speaker: Jacek Białecki
● IT Expert at Roche
● More than 21 years of experience in IT
● MuleSoft Certifications: MCIA, MCPA, MCD
● Salesforce Certifications: Administrator, Platform App Builder, Platform Developer I
● I like technology :)
https://www.linkedin.com/in/jacekbialecki/
6. Organizer: Patryk Bandurski
● Subject Matter Expert at PwC Poland
● MuleSoft Ambassador
● MuleSoft Meetup Leader for Warsaw, Poland
● Working with MuleSoft products for over 10 years
● A Salesforce Trailblazer
https://trailhead.salesforce.com/trailblazers/patryk-bandurski
Check out my integration blog
https://ambassadorpatryk.com/blog
7. Share the event
● Share the Meetup on your social media
● Tag the event using
#MuleSoftMeetup
#WarsawMuleSoftMeetup
Thanks ☺
8. Lottery
● How does it work?
○ During the event, a winner is selected at random from the attendees present.
○ One winner at a time!
○ Three winners at the event
○ I will ask the winners to send me a Direct Message with their email address
● Sponsored prize: each of today's three lottery winners receives a $30 Amazon voucher
Go to www.menti.com and use the code 9155 3697
13. All contents © MuleSoft, LLC
Agenda
● Introduction to Streams
● Streams in Mule 4
● DEMO
● Streaming in DataWeave
● DEMO
● Summary & performance considerations
16. Concept of Streams
Scenario - read the whole input into memory
Problem: MEMORY CONSUMPTION (Out Of Memory)
17. Concept of Streams
Scenario - splitting files
18. Concept of Streams
Scenario - pagination
https://api.com/items?offset=200&limit=100
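The pagination scenario can be sketched in plain Java. This is only an illustration, not Mule code: the class name, the fakeApi helper and the item names are hypothetical, and the remote call is replaced by an in-memory list so the sketch runs on its own. It shows that the client asks for one page at a time using offset and limit, so only one page is held in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

class PaginationSketch {
    // Builds the URL for one page, following the offset/limit pattern on the slide.
    static String pageUrl(int offset, int limit) {
        return "https://api.com/items?offset=" + offset + "&limit=" + limit;
    }

    // Hypothetical stand-in for the remote API: serves slices of an in-memory list.
    static BiFunction<Integer, Integer, List<String>> fakeApi(int totalItems) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < totalItems; i++) {
            all.add("item-" + i);
        }
        return (offset, limit) -> {
            int from = Math.min(offset, all.size());
            int to = Math.min(offset + limit, all.size());
            return all.subList(from, to);
        };
    }

    // Requests page after page until a short (or empty) page signals the end.
    // Only one page is held at a time; returns the number of items processed.
    static int processAll(BiFunction<Integer, Integer, List<String>> fetchPage, int limit) {
        int processed = 0;
        int offset = 0;
        while (true) {
            List<String> page = fetchPage.apply(offset, limit);
            processed += page.size();
            if (page.size() < limit) {
                break;
            }
            offset += limit;
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(pageUrl(200, 100));
        System.out.println(processAll(fakeApi(250), 100));
    }
}
```

With 250 items and a limit of 100, the loop fetches pages at offsets 0, 100 and 200 and stops after the third, shorter page.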
19. Concept of Streams
Scenario - read data as a stream
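As a rough illustration of the difference between the two scenarios, the plain Java sketch below counts records in two ways: by loading the whole input first, and by reading it line by line. The class and method names and the sample data are assumptions made for this example; nothing here is Mule API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamReadSketch {
    // Small in-memory sample so the sketch is self-contained.
    static InputStream sample() {
        String data = "name,lastName,age\nmariano,achaval,37\nleandro,shokida,30\n";
        return new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
    }

    // Loads everything first: memory use grows with the size of the input.
    static int countLinesInMemory(InputStream in) {
        try {
            String whole = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return whole.isEmpty() ? 0 : whole.split("\n").length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads line by line: only the current line (plus a small buffer) is held.
    static int countLinesStreaming(InputStream in) {
        int count = 0;
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLinesInMemory(sample()));
        System.out.println(countLinesStreaming(sample()));
    }
}
```

Both methods return the same count; the difference is how much of the input has to be in memory at once.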
22. Streams in Mule 4:
● Non-Repeatable Streams (available in Mule 3)
● In-Memory Repeatable Stream (new in Mule 4)
● File-Stored Repeatable Stream (new in Mule 4)
23. Streams in Mule 4: Non-Repeatable Streams
● Cannot be read more than once
● Cannot be consumed simultaneously
● Very performant but needs careful treatment
○ No additional memory buffers
○ No I/O operations (disk based buffers)
● Suitable for processing LARGE streams
24. Streams in Mule 4: Non-Repeatable Streams
● Cannot be read more than once (examples)
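The read-once behaviour can be reproduced with an ordinary Java InputStream. The sketch below is an analogy under stated assumptions, not Mule's implementation: the class and method names are hypothetical, and makeRepeatable simply buffers all bytes in memory to imitate the idea behind a repeatable stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class ReadOnceSketch {
    // Reads whatever is left in the stream and returns it as text.
    static String consume(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Imitates a repeatable stream: buffer the bytes once, then hand out
    // a fresh stream over the same buffer to every consumer.
    static Supplier<InputStream> makeRepeatable(InputStream in) {
        try {
            byte[] buffered = in.readAllBytes();
            return () -> new ByteArrayInputStream(buffered);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream once = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("first read:  '" + consume(once) + "'");
        System.out.println("second read: '" + consume(once) + "'"); // nothing left

        Supplier<InputStream> repeatable =
            makeRepeatable(new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println("repeatable:  '" + consume(repeatable.get()) + "' and '"
            + consume(repeatable.get()) + "'");
    }
}
```

The second read of the plain stream returns an empty string because the data was already consumed, while the buffered variant pays for repeatability with extra memory.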
25. Streams in Mule 4: Non-Repeatable Streams
● Cannot be consumed simultaneously (example)
26. New in Mule 4: In-Memory Repeatable Stream
27. New in Mule 4: File-Stored Repeatable Stream
Uses the Kryo serializer. It is better than the standard Java serializer but still cannot serialize everything.
A typical case is a POJO containing an org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl, which is in use in the NetSuite or Microsoft Dynamics CRM connectors.
Only available in Mule Enterprise Edition (Mule EE)
28. Stream vs Iterable connectors
Stream connectors:
• File
• FTP
• HTTP
• Sockets
Iterable connectors:
• Database
• Salesforce
31. Could we build a solution using DW and a Non-Repeatable stream...?
32. Streaming in DataWeave
● DataWeave supports end-to-end streaming through a flow
● Speeds the processing of large documents without overloading memory
● When in deferred mode, DW passes streamed output data directly to the next message processor
● The basic unit of the stream is specific to the data format:
○ a record in a CSV document
○ an element of an array in a JSON document
○ a collection in an XML document
● Streaming accesses each unit of the stream sequentially; it doesn't support random access to a document
● Streaming is not enabled by default
33. Enabling Streaming in DataWeave
● streaming property, for reading source data as a stream (MIME Type section)
34. Enabling Streaming in DataWeave
● deferred writer property, for passing an output stream directly to the next message processor in a flow
35. Streaming CSV in DataWeave
● Each row below the CSV header is a streamable record
Sample CSV:
name,lastName,age
mariano,achaval,37
leandro,shokida,30
pedro,achaval,4
christian,chibana,25
sara,achaval,2
matias,achaval,8
Sample DW script:
%dw 2.0
output application/json deferred=true
---
payload map (record) ->
  {
    FullName: record.lastName ++ "," ++ record.name,
    Age: record.age
  }
36. Streaming CSV in DataWeave
● Each row below the CSV header is a streamable record
Sample CSV:
name,lastName,age
mariano,achaval,37
leandro,shokida,30
pedro,achaval,4
christian,chibana,25
sara,achaval,2
matias,achaval,8
Sample DW script:
%dw 2.0
output application/json deferred=true
---
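[payload[-2], payload[-1], payload[3]]
ERROR HERE: negative index selectors ([-2], [-1]) need random access, and payload is referenced more than once; neither is compatible with streaming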
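Why a negative index clashes with streaming can be shown with a forward-only Java Iterator. This is an analogy rather than DataWeave internals; the class and method names are made up for the illustration. To find the last record, every record has to be consumed, and nothing is left afterwards for another selector.

```java
import java.util.Iterator;
import java.util.List;

class ForwardOnlySketch {
    // Emulates payload[-1] on a forward-only source:
    // every record has to be consumed before the last one is known.
    static <T> T last(Iterator<T> records) {
        T current = null;
        while (records.hasNext()) {
            current = records.next();
        }
        return current;
    }

    public static void main(String[] args) {
        Iterator<String> records =
            List.of("mariano", "leandro", "pedro", "christian", "sara", "matias").iterator();
        System.out.println(last(records));      // the last record
        System.out.println(records.hasNext());  // the source is now exhausted
    }
}
```

After last(...) returns, the iterator is exhausted, so a further access such as the fourth record would require either re-reading the source or keeping all records in memory.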
37. Streaming JSON in DataWeave
● The unit of a JSON stream is each element in an array
Sample JSON:
[
{
"firstName": "John",
"lastName": "Smith",
"age": "37"
},
{
"firstName": "Foo",
"lastName": "Bar",
"age": "30"
}
]
Sample DW script:
%dw 2.0
output application/json deferred=true
---
payload map (record) ->
  {
    FullName: record.firstName ++ "," ++ record.lastName,
    Age: record.age
  }
38. Streaming XML in DataWeave
● XML is more complicated than JSON because there are no arrays in the document
● To enable XML streaming, the collectionPath property has to be provided
Sample XML:
<order>
  <header>
    <date>Wed Nov 15 13:45:28 EST 2006</date>
    <customer number="1">Joe</customer>
  </header>
  <order-items>
    <order-item id="31">
      <product>111</product>
      <quantity>2</quantity>
      <price>8.90</price>
    </order-item>
    <order-item id="31">
      <product>222</product>
      <quantity>7</quantity>
      <price>5.20</price>
    </order-item>
    (...)
  </order-items>
</order>
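The idea of treating the repeated order-item elements as the streamed collection can be sketched with the JDK's StAX pull parser. This is a plain Java analogy, not how DataWeave implements collectionPath; the class name, the SAMPLE constant and the productIds method are assumptions for this example. The parser moves forward through the document and handles one order-item at a time without building the whole tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class XmlCollectionSketch {
    // Shortened version of the sample order from the slide.
    static final String SAMPLE =
        "<order><header><customer number=\"1\">Joe</customer></header>"
        + "<order-items>"
        + "<order-item id=\"31\"><product>111</product><quantity>2</quantity></order-item>"
        + "<order-item id=\"31\"><product>222</product><quantity>7</quantity></order-item>"
        + "</order-items></order>";

    // Walks the document front to back and collects the product of each order-item.
    static List<String> productIds(String xml) {
        List<String> products = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            boolean insideItem = false;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("order-item")) {
                        insideItem = true;
                    } else if (insideItem && name.equals("product")) {
                        products.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("order-item")) {
                    insideItem = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return products;
    }

    public static void main(String[] args) {
        System.out.println(productIds(SAMPLE));
    }
}
```

In this sketch the path to the collection (order, then order-items, then each order-item) is hard-coded in the element checks; in DataWeave that role is played by the collectionPath property mentioned on the slide.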
39. Validate a Script using @StreamCapable()
● The @StreamCapable() validator checks a script against the following criteria:
○ The variable is referenced only once.
○ No index selector is set for negative access, such as [-1].
○ No reference to the variable is found in a nested lambda.
● If all criteria are met, the selected data is streamable
Sample DW script:
%dw 2.0
@StreamCapable()
input payload application/csv
output application/json deferred=true
---
payload map (record) ->
{
// script here...
}
41. Let's summarize
● Non-Repeatable Streams
○ The input stream is read only once
○ No extra memory or performance overhead in comparison to repeatable streams
● In-Memory Repeatable Streams
○ The default for the Mule Kernel (formerly called Mule Runtime Community Edition)
○ Uses a default max buffer size of 500 objects (Iterable) or 1024KB (binary)
○ The buffer is extended from the initial size until the max buffer size is reached
○ If the stream exceeds the max buffer size, the application fails
● File-Stored Repeatable Stream
○ The default for Mule EE (and only available in Mule EE)
○ By default it stores 500 objects (Iterable) or 512KB (binary) in its memory buffer
○ If the stream exceeds the memory buffer, the Kryo serializer writes data to disk
43. Performance considerations
● Repeatable streams introduced in Mule 4 are a good compromise between performance and convenience/robustness. It's not a coincidence that they are set as defaults ;-)
● In most cases (input data relatively small) Repeatable streams should be used, as they hide the complexity of streams
● For large data (gigabytes or more) Non-Repeatable streams should definitely be considered, as they provide exceptionally good performance and low memory consumption
● What about Batch Jobs?
● Once you decide that Non-Repeatable streams will be used, you have to ensure that:
○ The input stream is read only once
○ The input stream is not read simultaneously by multiple threads
○ Transform Message / DataWeave is used properly and supports streaming
44. Please remember:
Conduct proper performance testing before going to PROD!
47. Trivia Quiz
● Remember!
○ The quicker you respond, the more points you earn
○ Only correct answers count ☺
○ Only one voucher per winner per month
○ Training account with the given email
● Sponsored prize: each of today's three quiz winners receives a free voucher for MuleSoft online training
Go to www.menti.com and use the code 9155 3697
50. Share your knowledge
● Become a speaker and share your knowledge with our community
● Submit your idea via this form: https://tinyurl.com/become-speaker
or via email: patryk.bandurski@gmail.com
51. What's next?
● Share:
○ Tweet using the hashtag #MuleSoftMeetups
○ Invite your network to join: https://meetups.mulesoft.com/warsaw/
● Feedback:
○ Fill out the feedback survey and suggest topics for upcoming events
○ Contact MuleSoft at meetups@mulesoft.com for ways to improve the program