Enviar búsqueda
Cargar
Financial big data loosely coupled, highly structured - andrew elmore
•
1 recomendación
•
1,189 vistas
A
andrewelmore
Seguir
Denunciar
Compartir
Denunciar
Compartir
1 de 44
Descargar ahora
Descargar para leer sin conexión
Recomendados
Business analysts and sdlc
Business analysts and sdlc
Aniket Sharma
Presentation deploying cloud based services
Presentation deploying cloud based services
xKinAnx
数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战
Weiwei Fang
IPv6 Address Planning
IPv6 Address Planning
Deploy360 Programme (Internet Society)
Meetup milano #3 all you need to know before creating your vpc
Meetup milano #3 all you need to know before creating your vpc
Gonzalo Marcos Ansoain
2022-Db2-Securing_Your_data_in_motion.pdf
2022-Db2-Securing_Your_data_in_motion.pdf
Roland Schock
Cloud Computing-10012021-v4-hosting by Rinko Kabiraj.ppt
Cloud Computing-10012021-v4-hosting by Rinko Kabiraj.ppt
tanim26
Stabilizing Ceph
Stabilizing Ceph
Ceph Community
Recomendados
Business analysts and sdlc
Business analysts and sdlc
Aniket Sharma
Presentation deploying cloud based services
Presentation deploying cloud based services
xKinAnx
数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战
Weiwei Fang
IPv6 Address Planning
IPv6 Address Planning
Deploy360 Programme (Internet Society)
Meetup milano #3 all you need to know before creating your vpc
Meetup milano #3 all you need to know before creating your vpc
Gonzalo Marcos Ansoain
2022-Db2-Securing_Your_data_in_motion.pdf
2022-Db2-Securing_Your_data_in_motion.pdf
Roland Schock
Cloud Computing-10012021-v4-hosting by Rinko Kabiraj.ppt
Cloud Computing-10012021-v4-hosting by Rinko Kabiraj.ppt
tanim26
Stabilizing Ceph
Stabilizing Ceph
Ceph Community
Postgis_Topology_to_secure_data_integrity,_simple_API_and_clean_up_messy_simp...
Postgis_Topology_to_secure_data_integrity,_simple_API_and_clean_up_messy_simp...
Lars Aksel Opsahl
CourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on Spark
DataWorks Summit
Desktop Private Cloud
Desktop Private Cloud
Paul Morse
Istio as an enabler for migrating to microservices (edition 2022)
Istio as an enabler for migrating to microservices (edition 2022)
Ahmed Misbah
Introducing LCS to Digital Design Verification
Introducing LCS to Digital Design Verification
Daniele Loiacono
CourboSpark
CourboSpark
Christophe Salperwyck
Bitcoin Decision Point - April 2017
Bitcoin Decision Point - April 2017
Jeff Garzik
ISO 20022 for funds presentation
ISO 20022 for funds presentation
Koen Vierendeels
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
Ambassador Labs
SKS_Presentation(2008.10)
SKS_Presentation(2008.10)
aprilnewman
New idc architecture
New idc architecture
Mason Mei
Trending Topics in Payments
Trending Topics in Payments
Nicole Cuilis-Gregor
bcPro Introductory Offer
bcPro Introductory Offer
NghiaCao8
2019 touraine tech - Marteau Hubert
2019 touraine tech - Marteau Hubert
HubertMARTEAU
Cryptocurrencies overview
Cryptocurrencies overview
Trector Rancor
State of GeoServer 2012
State of GeoServer 2012
Jody Garnett
onos-day-dkim-20150914-lkin
onos-day-dkim-20150914-lkin
Dongkyun Kim
The Future of Service Mesh
The Future of Service Mesh
All Things Open
Using Graphs to Take Down Fraudsters in Real Time
Using Graphs to Take Down Fraudsters in Real Time
Neo4j
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
HostedbyConfluent
Más contenido relacionado
Similar a Financial big data loosely coupled, highly structured - andrew elmore
Postgis_Topology_to_secure_data_integrity,_simple_API_and_clean_up_messy_simp...
Postgis_Topology_to_secure_data_integrity,_simple_API_and_clean_up_messy_simp...
Lars Aksel Opsahl
CourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on Spark
DataWorks Summit
Desktop Private Cloud
Desktop Private Cloud
Paul Morse
Istio as an enabler for migrating to microservices (edition 2022)
Istio as an enabler for migrating to microservices (edition 2022)
Ahmed Misbah
Introducing LCS to Digital Design Verification
Introducing LCS to Digital Design Verification
Daniele Loiacono
CourboSpark
CourboSpark
Christophe Salperwyck
Bitcoin Decision Point - April 2017
Bitcoin Decision Point - April 2017
Jeff Garzik
ISO 20022 for funds presentation
ISO 20022 for funds presentation
Koen Vierendeels
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
Ambassador Labs
SKS_Presentation(2008.10)
SKS_Presentation(2008.10)
aprilnewman
New idc architecture
New idc architecture
Mason Mei
Trending Topics in Payments
Trending Topics in Payments
Nicole Cuilis-Gregor
bcPro Introductory Offer
bcPro Introductory Offer
NghiaCao8
2019 touraine tech - Marteau Hubert
2019 touraine tech - Marteau Hubert
HubertMARTEAU
Cryptocurrencies overview
Cryptocurrencies overview
Trector Rancor
State of GeoServer 2012
State of GeoServer 2012
Jody Garnett
onos-day-dkim-20150914-lkin
onos-day-dkim-20150914-lkin
Dongkyun Kim
The Future of Service Mesh
The Future of Service Mesh
All Things Open
Using Graphs to Take Down Fraudsters in Real Time
Using Graphs to Take Down Fraudsters in Real Time
Neo4j
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
HostedbyConfluent
Similar a Financial big data loosely coupled, highly structured - andrew elmore
(20)
Postgis_Topology_to_secure_data_integrity,_simple_API_and_clean_up_messy_simp...
Postgis_Topology_to_secure_data_integrity,_simple_API_and_clean_up_messy_simp...
CourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on Spark
Desktop Private Cloud
Desktop Private Cloud
Istio as an enabler for migrating to microservices (edition 2022)
Istio as an enabler for migrating to microservices (edition 2022)
Introducing LCS to Digital Design Verification
Introducing LCS to Digital Design Verification
CourboSpark
CourboSpark
Bitcoin Decision Point - April 2017
Bitcoin Decision Point - April 2017
ISO 20022 for funds presentation
ISO 20022 for funds presentation
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
SKS_Presentation(2008.10)
SKS_Presentation(2008.10)
New idc architecture
New idc architecture
Trending Topics in Payments
Trending Topics in Payments
bcPro Introductory Offer
bcPro Introductory Offer
2019 touraine tech - Marteau Hubert
2019 touraine tech - Marteau Hubert
Cryptocurrencies overview
Cryptocurrencies overview
State of GeoServer 2012
State of GeoServer 2012
onos-day-dkim-20150914-lkin
onos-day-dkim-20150914-lkin
The Future of Service Mesh
The Future of Service Mesh
Using Graphs to Take Down Fraudsters in Real Time
Using Graphs to Take Down Fraudsters in Real Time
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Financial big data loosely coupled, highly structured - andrew elmore
1.
Financial Big Data Loosely
coupled, highly structured Andrew Elmore
2.
Agenda
• Why traditional RDBMSs aren’t an ideal fit • Why storage format is important • Using NoSQL stores to store & query financial data • Examples using GemFire, GigaSpaces & Integration Objects • Scaling complex processing using grids Copyright © 2013 C24 Technologies Ltd.
3.
Message Structure
4.
The Humble MT103
{1:F01MIDLGB22AXXX0548100000}{2:I103BKTRUS33XBR • A commonly used SWIFT DN2}{3:{108:valid}}{4: :20:8861198-0706 message :23B:CRED :32A:000612USD73025, :33B:USD73025, :50K:GIAN ANGELO IMPORTS NAPLES :52A:EFGHITMM500 • Used for credit transfers :53A:BCITUS33 :54A:IRVTUS3N :57A:BNPAFRPPGRE :59:/20041010050500001M02606 KILLY S.A. • Mid-complexity compared to GRENOBLE :70:Invoice: 100001 :71A:SHA other SWIFT messages :72:/ABCD/narrative //more narrative /EFGH/narrative -} Copyright © 2013 C24 Technologies Ltd.
5.
MT103 Structure Copyright ©
2013 C24 Technologies Ltd.
6.
MT103 Structure
• The MT103 Block 4 has 22 fields • Some are optional • Some repeat a variable number of times • Some are grouped into (repeating) sequences Copyright © 2013 C24 Technologies Ltd.
7.
MT103 Structure
• Some fields have multiple options Copyright © 2013 C24 Technologies Ltd.
8.
MT103 Structure
• Each field can be composed of multiple components Copyright © 2013 C24 Technologies Ltd.
9.
The Full MT103
Definition Copyright © 2013 C24 Technologies Ltd.
10.
To Make Matters
Worse... • There are hundreds of SWIFT messages • Each organisation typically uses a subset • The standard changes every year • Firms may need to support multiple versions concurrently • This is just one of many standards • FIX, FpML, SEPA, ... Copyright © 2013 C24 Technologies Ltd.
11.
RDBMS & Financial
Messages • Storing the full message in a relational structure is very time consuming and fragile • In practice it’s not viable • What options do we have? Copyright © 2013 C24 Technologies Ltd.
12.
Lossy Storage
• If we only care about a subset of fields, we can construct a simpler storage structure {1:F01MIDLGB22AXXX0548100000}{2:I… :20:8861198-0706 :23B:CRED • Information thrown away :32A:000612USD73025, 100000 USD 73025 BCIT :33B:USD73025, :50K:GIAN ANGELO IMPORTS however is lost forever NAPLES :52A:EFGHITMM500 • Advances in processing capacity :53A:BCITUS33 :54A:IRVTUS3N however mean that new applications :57A:BNPAFRPPGRE :59:/20041010050500001M02606 are being found for historical data KILLY S.A. GRENOBLE :70:Invoice: 100001 :71A:SHA :72:/ABCD/narrative • Hybrid solution is to store //more narrative /EFGH/narrative -} original source message alongside in a CLOB • Not readily accessible but at least it’s available Copyright © 2013 C24 Technologies Ltd.
13.
XML Overload
• As message formats are hierarchical, we can construct an XML representation of the messages 100000 USD 73025 BCIT <?xml version="1.0"?> • Store this in a CLOB <MT103 xmlns="http://www.c24.biz/IO/SWIFT2012"> <Block1> <ApplicationID>F</ApplicationID> <ServiceID>01</ServiceID> <LTAddress> <BankCode>MIDL</BankCode> <CountryCode>GB</CountryCode> • Either create additional columns for indexed <LocationCode> <LocationCode1>2</LocationCode1> <LocationCode2>2</LocationCode2> values or use RDBMS with XML index support </LocationCode> <LogicalTerminalCode>A</ LogicalTerminalCode> <BranchCode>XXX</BranchCode> </LTAddress> <SessionNumber>0548</SessionNumber> <SequenceNumber>100005</ SequenceNumber> </Block1> <Iblock2> • Extraction of data either requires multiple XPath …. </MT103> queries or parsing the XML to resurrect the original message/object • Slower than querying simple cols and (probably) an artificial format Copyright © 2013 C24 Technologies Ltd.
14.
Message Format
• Our in-memory structure needs to model all the data we need for parsing/processing in an efficient structure • e.g. a tree of (Java) objects • Ideally we’ll persist the same structure directly into the store • Persistence and loading are therefore fast • Do we even need to persist to disk? • We need ready access to the original message format too • The store must be able to index arbitrary attributes in the persisted object tree for querying • The store should be capable of extracting individual values from the object tree Copyright © 2013 C24 Technologies Ltd.
15.
Future Proofing
• New clients == new message formats • Same purpose, just different presentation • Systems architected to work regardless of format & transport Flat File, XML, Native, ZIP, (S)FTP, .. SWIFT FIX JMS, MQ, AMQP, ... Parse Validate Process FpML HTTP, REST, WebServices, ... SMTP Copyright © 2013 C24 Technologies Ltd.
16.
C24 Integration Objects
• Pre-built models for common standards • Updated as standards evolve • Easy to import/create new models • Java binding toolkit • Parse messages into queryable objects Copyright © 2013 C24 Technologies Ltd.
17.
Partners & Platforms Copyright
© 2013 C24 Technologies Ltd.
18.
Typical Integration Flow
• Linear process, potentially replicated File Channel Adapter ErrorChannel JMS C24 Store Unmarshalling Validating Process Channel Adapter Channel Adapter Transformer Message Filter Valid Region Invalid ErrorChannel ErrorChannel HTTP Channel Adapter ErrorChannel Copyright © 2013 C24 Technologies Ltd.
19.
Acquisition
<!-- Read the SWIFT message files --> <int-file:inbound-channel-adapter directory="data/swift" filename-pattern="*.dat" prevent-duplicates="true" channel="parse-swift-message-channel" id="inbound-transport-swift-file"> <poller fixed-rate="100" task-executor="swiftThreadPool"/> </int-file:inbound-channel-adapter> <task:executor id="swiftThreadPool" pool-size="8"/> <channel id="parse-swift-message-channel"/> Copyright © 2013 C24 Technologies Ltd.
20.
Flow-based Processing
<!-- SWIFT Parsing --> <c24:model id="swiftMessageModel" base-element="biz.c24.io.swift2012.MT103Element"/> <int-c24:unmarshalling-transformer input-channel="parse-swift-message-channel" output-channel="validate-message-channel" model-ref="swiftMessageModel" source-factory-ref="textualSourceFactory"/> <!-- Validate and filter out invalid messages --> <int-c24:validating-selector id="c24Validator" throw-exception-on-rejection="true"/> <filter input-channel="validate-message-channel" output-channel="process-message-channel" ref="c24Validator" discard-channel="errorChannel"/> <!-- Process - stub out for now--> <bridge input-channel="process-message-channel" output-channel="persist-valid-message-channel"/> <!-- Persist --> <channel id="persist-valid-message-channel"/> Copyright © 2013 C24 Technologies Ltd.
21.
Persistence
<int-gfe:outbound-channel-adapter id="validStore" channel="persist-valid-message-channel" region="valid"> <int-gfe:cache-entries> <beans:entry key="headers[id]" value="payload"/> </int-gfe:cache-entries> </int-gfe:outbound-channel-adapter> Copyright © 2013 C24 Technologies Ltd.
22.
Flow-based Architecture
• All of the scalability is at the front-end • Requires us to spread input load evenly across processing units Acquire Parse Validate Store • Alternatively we can use the store and make our flow event- based Copyright © 2013 C24 Technologies Ltd.
23.
Event-based Architecture
Parser Validator Acquirer Processor Store • The store ‘triggers’ processing • Listeners • Subscriptions • Querying • ... Copyright © 2013 C24 Technologies Ltd.
24.
Scaling Up
Parser Validator Parser Validator Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Processor Acquirer Processor Store • Multiple components can register an interest in the events • We can scale out and add redundancy by replicating the listeners Copyright © 2013 C24 Technologies Ltd.
25.
Scaling Out
• We can control what is replicated and to where • The node that parses the message doesn’t have to be the same as the one that processes it Parser Validator Parser Validator Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Processor Acquirer Processor Parser Validator Parser Validator Store Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Processor Acquirer Processor Replicate Store Store Parser Validator Parser Validator Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Parser Validator Processor Acquirer Processor Acquirer Processor Copyright © 2013 C24 Technologies Ltd.
26.
Ingesting Data
27.
Demo Time
• GemFire Groovy DSL is still in development • Allows simple definition of regions and event-based handlers region 'new', shortcut: REPLICATE, { listener ( afterCreate: { EntryEvent e -> try { // Parse the raw message String message = e.getNewValue(); source.setReader(new StringReader(message)); parsedRegion.put(e.getKey(), source.readObject(element)); } catch(Exception ex) { exceptionRegion.put(e.getNewValue(), ex); } finally { newRegion.remove(e.getKey()); } } ) } Copyright © 2013 C24 Technologies Ltd.
28.
Demo Takeaways
• We had the full message available to us • With a meaningful schema • We can query on and extract the full message or individual values • Approach is message format neutral • We just need to know the hierarchical location of the information we’re interested in • Our persistence code is extremely simple • We can index on any [derived] attribute Copyright © 2013 C24 Technologies Ltd.
29.
Scaling Out
• Nodes can be replicated and sharded/partitioned • Queries are automatically distributed • Data can be ingested on any node • The same event-based architecture works well for distributed processing & analysis • And as we’ve seen our data grids allow us to embed our logic in them Copyright © 2013 C24 Technologies Ltd.
30.
Processing
31.
Real-Time Analytics
• Complex Event Processing • Inter-dependent set of variables & rules • Typically very high volume & latency critical • Run on multiple nodes to cope with • Load • Volume • Resilience Copyright © 2013 C24 Technologies Ltd.
32.
Rule Engines
• Track dependencies between rules • Recalculate when necessary • Trigger behaviour when all rules are met • Changes typically triggered by time & message state change • Fits well with our event-based architecture • See RETE algorithms & derivates for creating highly performant rule trees Copyright © 2013 C24 Technologies Ltd.
33.
A Simple Architecture
• We’ll use Gigaspaces XAP for some variety • Distributed in-memory data grid + elastic application platform • Sample application consists of 3 key areas: • ‘Space Object’ ie what are we going to be storing • Feeder - how do we get data into the grid • Processors - what happens to data once it’s in the grid mvn os:create -DgroupId=my.group -DartifactId=Test1 -Dtemplate=basic mvn package mvn os:deploy -Dlocators=localhost:4174 • Message Processors • ParseProcessor - parse raw messages into domain objects • RuleProcessor - trigger rule updates on message state change Copyright © 2013 C24 Technologies Ltd.
34.
Configuring XAP
• Multiple ways to mark up our code • We’ll use annotations for simplicity • First we need the type we’re going to store in the space Copyright © 2013 C24 Technologies Ltd.
35.
The Space Class
@SpaceClass public class Message<T extends ComplexDataObject> implements Serializable { public static enum State { NEW, VALID, PROCESSED, INVALID }; private String id = null; private State state = null; private String rawMessage = null; private T message = null; private Throwable failure = null; @SpaceId(autoGenerate=true) public String getId() { return id; } Copyright © 2013 C24 Technologies Ltd.
36.
The Feeder
• Simple implementation to read raw messages from the filesystem • In practice we’ll be acquiring from multiple sources • Use an outbound channel adapater as a feeder? public class Feeder implements InitializingBean, DisposableBean { @GigaSpaceContext private GigaSpace gigaSpace; ... public void loadFiles() { ...; for(File file : dir.listFiles()) { Scanner scanner = new Scanner(file); String fileContent = scanner.useDelimiter("A").next(); scanner.close(); gigaSpace.write(new Message(fileContent)); } } Copyright © 2013 C24 Technologies Ltd.
37.
ParseProcessor
• Declare the Processor and tell it when to trigger @EventDriven @Notify @NotifyType(write = true, update = true) public class ParseProcessor<T extends ComplexDataObject> { /** * We process any messages in a new state * @return */ @EventTemplate Message templateMessage() { Message msg = new Message(); msg.setState(Message.State.NEW); return msg; } Copyright © 2013 C24 Technologies Ltd.
38.
ParseProcessor
• ...and how to process the messages it receives @SpaceDataEvent public Message<T> processData(Message<T> data) { try { log.info("Parsing " + data.getId()); source.setReader(new StringReader(data.getRawMessage())); data.setMessage((T)source.readObject(element)); data.setState(State.VALID); log.info("Parsed " + data.getId()); } catch(Throwable ex) { log.info("Message invalid " + data.getId()); data.setFailure(ex); data.setState(State.INVALID); } return data; } Copyright © 2013 C24 Technologies Ltd.
39.
RuleProcessor
/** * We process any messages in a valid state * @return */ @EventTemplate Message templateMessage() { Message msg = new Message(); msg.setState(Message.State.VALID); return msg; } @SpaceDataEvent public Message<T> processData(Message<T> data) { log.info("Processing " + data.getId()); engine.process(data.getMessage()); data.setState(State.PROCESSED); return data; } Copyright © 2013 C24 Technologies Ltd.
40.
Demo
41.
General Concerns
42.
Serialisation
• At some point your objects are likely to be serialized • ...and it won’t be during your local testing • Create unit tests MyType obj = ...; ByteArrayOutputStream bos = new ByteArrayOutputStream(); ObjectOutputStream oos = new ObjectOutputStream(bos); oos.writeObject(obj); oos.close(); • Check what you’re actually serialising assertThat(bos.toByteArray().length, is(lessThan(some size))); Copyright © 2013 C24 Technologies Ltd.
43.
Interface Semantics
• Exclusive lock vs shared lock vs take • Who gets notified? • Callback thread behaviour • What happens if we hold on to the thread for a long time? • What if we generate an exception? • None of these technologies make coordination problems disappear • They can however make it easier to apply good practice Copyright © 2013 C24 Technologies Ltd.
44.
Thank You!
? andrew.elmore@c24.biz @andrew_elmore Copyright © 2013 C24 Technologies Ltd.
Descargar ahora