
The UCLouvain Public Defense of my EMJD-DC Double Doctorate Ph.D. degree

This is the 2nd defense of my Ph.D. double degree.
More details - https://kkpradeeban.blogspot.com/2019/08/my-phd-defense-software-defined-systems.html


  1. 1. Software-Defined Systems for Network-Aware Service Composition and Workflow Placement Pradeeban Kathiravelu Supervisors: Prof. Luís Veiga Prof. Peter Van Roy Louvain-la-Neuve, Belgium. August 23rd, 2019
  2. 2. 2/37 Promising Trends of the Internet ● Growth of the Internet bandwidth. – ↟ demand and ↡ $$$. ● Innovation with cloud ecosystems. – Service providers and tenants. – Dedicated connectivity* of the cloud providers. ● Increasing geographical presence. ● Well-provisioned network → Low latency links. * James Hamilton, VP, AWS (AWS re:invent 2016).
  3. 3. 3/37 Challenges in Practice ● Disparities – Pricing. ● E.g., IP Transit price per Mbps, 2014 – USA: $0.94 – Kazakhstan: $15 – Uzbekistan: $347 – Latency. ● Multi-domain Workflows? – Interoperability – Control
  4. 4. 4/37 A Case for Cloud-Assisted Networks ● A large-scale overlay network built over cloud VMs ● Can a network overlay built over cloud instances be a better connectivity provider? – High-performance – Cost effectiveness – Network Services on the fly!
  5. 5. 5/37 Low Latency with Cloud Routes • Cloud-based data transfer A → Z via the path A → B → Z. – Cloud region B is closer to the origin server A. – B and Z are cloud VMs connected by a cloud overlay.
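The route selection on this slide can be sketched as follows. This is a minimal Python sketch; the A/B/Z names follow the slide, but the latency figures are invented purely to illustrate when the cloud detour A → B → Z beats the direct Internet path A → Z.

```python
# Hypothetical end-to-end latencies in ms (illustrative only, not measured values).
latency = {
    ("A", "Z"): 180,  # congested public-Internet path
    ("A", "B"): 30,   # origin server A to its nearby cloud region B
    ("B", "Z"): 90,   # well-provisioned cloud overlay between VMs B and Z
}

def best_path(direct, via_region):
    """Return the faster of the direct Internet path and the cloud-assisted detour."""
    direct_ms = latency[direct]
    detour_ms = latency[(direct[0], via_region)] + latency[(via_region, direct[1])]
    if detour_ms < direct_ms:
        return [direct[0], via_region, direct[1]], detour_ms
    return list(direct), direct_ms

path, ms = best_path(("A", "Z"), "B")  # detour: 30 + 90 = 120 ms < 180 ms direct
```

The same comparison generalizes to selecting among several candidate cloud regions; here a single intermediate region B keeps the sketch minimal.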
  6. 6. 6/37 Network Services Dilemma ● Network Services: On-Premise vs. Centralized Cloud? Edge! ● Network Service Chaining (NSC) ● Service chain placement abiding by tenant Service Level Objectives (SLOs).
  7. 7. 7/37 Can Network Softwarization help? ● Network Control, Reusability, and Interoperability. ● Typically focuses on a single provider. ● Network-awareness for multi-domain workflows.
  8. 8. 8/37 Enablers of Network Softwarization ● Software-Defined Networking (SDN) ● Network Functions Virtualization (NFV) – Network middleboxes → Virtual Network Functions (VNFs) ● Software-Defined Systems (SDS) – Storage, Security, Data center, .. – Improved configurability
  9. 9. 9/37 Motivation ● Better control for tenants composing service workflows. – Optimal placement abiding by their policies & SLOs. ● Challenges: technical, economic, and policy.
  10. 10. 10/37 Thesis Goals ● Network-Aware Service Composition and Workflow Placement. ● Scale: Intra-Domain → Multi-Domain → Edge → The Internet.
  11. 11. 11/37 Q1: Execution Migration Across Development Stages Can we seamlessly scale and migrate network applications through network softwarization across development and deployment stages? Scale: Data center (CoopIS’16, SDS’15, and IC2E’16)
  12. 12. 12/37 Q2: Economic & Performance Benefits Can network softwarization offer economic and performance benefits to the end users? Scale: Data center → Inter-cloud (Networking’18 and IM’17)
  13. 13. 13/37 Q3: Service Chain Placement Can we efficiently chain services from several edge and cloud providers to compose tenant workflows, by federating SDN deployments of the providers, using SOA? Scale: Multi-domain → Edge (ETT’18, ICWS’16, and SDS’16)
  14. 14. 14/37 Q4: Interoperability Can we enhance the interoperability of diverse network applications, by leveraging network softwarization and SOA? Scale: Data center → Multi-domain and Edge (CLUSTER’18, DAPD’19, SDS’17, and CoopIS’15)
  15. 15. 15/37 Q5: Application to Big Data Can we improve the performance, modularity, and reusability of big data applications, by leveraging network softwarization and SOA? Scale: Data center → the Internet (CCPE’19 and SDS’18)
  16. 16. 16/37 Thesis Contributions
  17. 17. 17/37 [1] SENDIM: Unified Network Modeling and Deployment
  18. 18. 18/37 [2] SMART: Application-level tenant policies to network with middleboxes
  19. 19. 19/37 [3] NetUber: Cloud-Assisted Networks as a Connectivity Provider • A third-party virtual connectivity provider with no fixed infrastructure. – Better network paths compared to public Internet paths.
  20. 20. 20/37 NetUber Application Scenarios • Cheaper data transfers between two endpoints. • Higher throughput and lower latency. • Network services. • Alternative to Software-as-a-Service replication.
  21. 21. 21/37 NetUber Inter-Cloud Architecture • Deploy SaaS applications in one or a few regions. – Fast access from more regions with NetUber. Ohio London Belgium AWS GCP
  22. 22. 22/37 Monetary Costs to Operate NetUber A. Cost of Cloud VMs (per second). – Spot instances: volatile, but up to 90% savings. B. Cost of Bandwidth (per transferred data volume). C. Cost to connect to the cloud provider (per port-hour).
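The three cost components above add up to NetUber's monthly operating cost; a minimal sketch follows. The function name and all pricing figures are assumptions for illustration, not real AWS prices or thesis numbers.

```python
def netuber_monthly_cost(hours, vm_spot_per_hour, data_tb, egress_per_gb, port_per_hour):
    """Sum the three NetUber cost components:
    (A) spot VM time, (B) bandwidth per transferred volume, (C) cloud port-hours."""
    vm = hours * vm_spot_per_hour          # A: spot instances (billed per time unit)
    bandwidth = data_tb * 1024 * egress_per_gb  # B: egress billed per GB transferred
    port = hours * port_per_hour           # C: dedicated port to the cloud provider
    return vm + bandwidth + port

# Illustrative figures only: a 730-hour month, $0.50/h spot VM,
# $0.02/GB egress, and $2.25/h for a 10 Gbps port.
cost_30tb = netuber_monthly_cost(730, 0.50, 30, 0.02, 2.25)
```

Comparing such a volume-dependent total against a flat-rate leased line is what yields break-even points like the "<50 TB/month" figure on the evaluation slides.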
  23. 23. 23/37 Evaluation • NetUber prototype with AWS r4.8xlarge spot instances. • Cheaper point-to-point connectivity. ● Better throughput and reduced latency & jitter. – Origin: RIPE Atlas Probes and our distributed servers. – Destination: VMs of multiple AWS regions. ● Network Services: Compression
  24. 24. 24/37 1) Cost: NetUber vs. connectivity providers • 10 Gbps point-to-point connectivity: from EU & USA. – Cheaper for data transfers <50 TB/month.
  25. 25. 25/37 2) Latency (Ping times): ISP vs. NetUber (via region, % Improvement) • NetUber cuts Internet latencies by up to 30%. • Direct Connect would make NetUber even faster.
  26. 26. 26/37 3) Throughput: ISP, NetUber, and Selectively Using NetUber ● Better throughput with NetUber via near cloud region. – Selective use of overlay when no proximate region.
  27. 27. 27/37 4) Jitter: ISP vs. NetUber ● NetUber for latency-sensitive web applications.
  28. 28. 28/37 Related Work • Connectivity provider that does not own the infrastructure. – Low latency cloud-assisted overlay network. – Better data rate than ISPs. • Previous research does not consider economic aspects. – A cheaper alternative (< 50 TB/month). • Similar industrial efforts. – Voxility, an alternative to transit providers. – Teridion, Internet fast lanes for SaaS providers.
  29. 29. 29/37 [4] Évora: Service Chain Orchestration ● SDN with Message-Oriented Middleware (MOM). – For multi-domain edge environments. ● Graph-based algorithm – To incrementally construct user workflows ● as service chains at the edge. – Place and migrate user service chains. ● Adhering to the user policies.
  30. 30. 30/37 Évora Orchestration: Deployment Architecture
  31. 31. 31/37 1) Initialize Orchestrator in each User Device ● Construct a service graph in the user device. ― As a snapshot of the service instances at the edge.
  32. 32. 32/37 2) Identify Potential Workflow Placements ● Construct potential chains incrementally. – Subgraphs from service graph to match user chain. – Noting individual service properties. ● A complete match? – Save as a potential service chain placement.
  33. 33. 33/37 3) Service Chain Placement ● Calculate a penalty value for potential placements. – Normalized values: Cost, Latency, and Throughput. – α,β,γ ← User-specified weights. ● Place NSC on composition with minimal penalty value. – Mixed Integer Linear Problem. – Extensible with powers and more properties.
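The penalty computation above can be sketched as a weighted sum over the normalized properties. The thesis formulates placement as a Mixed Integer Linear Problem; this sketch instead enumerates pre-computed candidate placements, each given as invented, already-normalized (cost, latency, throughput) values, so it only illustrates the objective, not the solver.

```python
def penalty(placement, alpha, beta, gamma):
    """Weighted penalty over normalized cost (C), latency (L), and throughput (T).
    Lower cost and latency are better; higher throughput is better, hence (1 - T)."""
    c, l, t = placement  # each assumed normalized to [0, 1]
    return alpha * c + beta * l + gamma * (1 - t)

def place_chain(candidates, alpha=1.0, beta=1.0, gamma=1.0):
    """Choose the candidate service-chain placement with the minimal penalty."""
    return min(candidates, key=lambda p: penalty(p, alpha, beta, gamma))

# Invented candidate placements as (cost, latency, throughput) triples.
candidates = [(0.2, 0.7, 0.9), (0.5, 0.3, 0.6), (0.9, 0.1, 0.8)]
```

With equal weights the cheap-but-slow first candidate wins; raising the latency weight (β = 10) flips the choice to the low-latency third candidate, which is exactly the effect of the user-specified α, β, γ on the slide.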
  34. 34. 34/37 Evaluation ● Model a sample edge environment. – Service nodes and a user device. – User policies for the service workflow. ● Microbenchmark Évora workflow placement. – Effectiveness in satisfying user policies. – Efficacy in closeness to optimal results. ● ↡ Penalty value ➡ ↟ Quality of Experience
  35. 35. 35/37 User Policies with Two Properties ● Equal weights to 2 properties among T, C, and L. ● Darker circles – compositions with minimal penalty. – The ones that Évora chooses (circled). T↑ & C↓ T↑ & L↓ C↓ & L↓
  36. 36. 36/37 User Policies with Three Properties T C L ● T↑, C ↓, and L ↓ with weighted properties: – Prominence to one (w=10) than the other two (w=3). ● Radius – Monthly Cost ● Effectively satisfying the user policies.
  37. 37. 37/37 Conclusion ● Seamless migration across development and deployments. ● A case for Cloud-Assisted Networks as a connectivity provider. ● Composing & placing workflows in multi-domain networks. ● Increased interoperability with network softwarization & SOA. ● Applicability of our contributions in the context of Big Data. Future Work ● NetUber as an enterprise connectivity provider. ● Adaptive network service chains on hybrid networks. Thank you! Questions?
  38. 38. 38/37 Additional Slides
  39. 39. 39/37 (0) *Overview*
  40. 40. 40/37 Publications
  41. 41. 41/37 Multitenancy and the Tenant Users of a Cloud Environment
  42. 42. 42/37 Contributions and Relationships
  43. 43. 43/37 Why SOA for our SDS? ● Beyond data center scale. – Thanks to the standardization of services. ● SOA and RESTful reference architectures. – Multiple implementation approaches such as Message-Oriented Middleware (MOM). ● Publish/subscribe to a message broker over the Internet. ● Service endpoints to hand over messages to the broker. ● Flexibility, modularity, loose-coupling, and adaptability.
  44. 44. 44/37 OpenDaylight ● Incremental development of OSGi bundles – Checkpointing and versioning of the modules. ● State of executions and transactions – Stored in the controller distributed data tree.
  45. 45. 45/37 What has MOM got to do with the controller? ● Expose the internals from the controller (e.g. OpenDaylight) – Through a message-based northbound API ● e.g. AMQP (Advanced Message Queuing Protocol). – Publish/Subscribe with a broker (e.g. ActiveMQ). ● What can be exposed – Data tree (internal data structures of the controller) – Remote procedure calls (RPCs) – Notifications. ● Thanks to the Model-Driven Service Abstraction Layer (MD-SAL) of OpenDaylight. – Compatible internal representation of the data plane. – Messaging4Transport Project.
  46. 46. 46/37 State-aware Adaptive Scaling ● Adaptive scaling through shared state. – Horizontal scalability through In-Memory Data Grids (IMDGs). – State of the executions for scaling decisions. ● Pause-and-resume executions. – Parallel multi-tenant executions.
  47. 47. 47/37 (1) *SENDIM* ● Simulation, Emulation, aNd Deployment Integration Middleware ● CoopIS’16, SDS’15, and IC2E’16
  48. 48. 48/37 Introduction ● Networks simulated or emulated at early stages. ● Programmable networks → continuous development. – Native integration of emulators into SDN. – Network simulators supporting SDN and emulation. – Cloud simulators extended for clouds with SDN. ● Lack of “Software-Defined” network simulators. – Policy/algorithms locked in simulator-imperative code. ● Demand for easy migration and programmability.
  49. 49. 49/37 Motivation ● An integrated network simulation and emulation. ● Extend SDN controllers for cloud network simulations. – Bring the benefits of SDN to its own simulations! ● Reusability, Scalability, Easy migration, . . . – Run control plane code in controller itself (portability). – Simulate the data plane (scalability, efficiency). ● by programmatically invoking the southbound.
  50. 50. 50/37 Integrated Modeling and Development
  51. 51. 51/37 Our Proposal: SENDIM ● Separation of the Application Logic From the Execution Environment.
  52. 52. 52/37 SENDIM Execution
  53. 53. 53/37 “Software-Defined Simulations” ● Application Logic expressed in “descriptors”. – Deployed into the SDN controller, with a Java API. ● System simulated in the simulation sandbox.
  54. 54. 54/37
  55. 55. 55/37
  56. 56. 56/37 Prototype Implementation ● Oracle Java 1.8.0 - Development language. ● Apache Maven 3.1.1 - Build the bundles and execute the scripts. ● Infinispan 7.2.0.Final - Distributed cluster. ● Apache Karaf 3.0.3 - OSGi run time. ● OpenDaylight Beryllium - Default controller. ● Multiple deployment options: – As a stand-alone simulator. – Distributed execution with an SDN controller. – As a bundle in an OSGi-based SDN controller.
  57. 57. 57/37 Evaluation ● A cluster of up to 6 identical computers. – Intel Core™ i7-4700MQ CPU @ 2.40GHz, 8 CPUs. – 8 GB memory, Ubuntu 14.04 LTS 64-bit. ● Simulating routing algorithms in a fat-tree topology. – Up to 100,000 nodes and changing degrees. ● Simulation Performance: Benchmark against CloudSimSDN/Cloud2Sim. ● Evaluate the migration performance. – Emulation (Mininet) → Simulation (SENDIM) – Simulation (SENDIM) → Emulation (Mininet)
  58. 58. 58/37 Automated Code Migration: Simulation → Emulation ● Time taken to programmatically convert a SENDIM simulation script into a Mininet script.
  59. 59. 59/37 Modeling Performance ● Network Construction Efficiency and Adaptiveness. – Simulate when resources are scarce for emulation.
  60. 60. 60/37 Simulation Performance and Scalability ● Higher performance for larger simulations. ● Smart scale-out → Higher horizontal scalability
  61. 61. 61/37 Performance with Incremental Updates ● Smaller simulations: up to 1000 nodes. ● SENDIM: controller and middleware execution completion time.
  62. 62. 62/37 Performance with Incremental Updates ● Initial execution takes longer - Initializations.
  63. 63. 63/37 Performance with Incremental Updates ● Faster, once SENDIM & controller initialized.
  64. 64. 64/37 Test-driven Development ● Faster executions once the system is initialized.
  65. 65. 65/37 Subsequent Incremental Updates ● Even faster executions for subsequent simulations.
  66. 66. 66/37 Deploy Changesets to the Controller ● No change in the simulated environment.
  67. 67. 67/37 Revert Changesets from the Controller ● No change in simulated environment
  68. 68. 68/37 Scale/Migrate Simulated Environment ● No change in controller.
  69. 69. 69/37
  70. 70. 70/37 Key Findings ● SENDIM, Separation of execution from the infrastructure. – Easy migration between simulations and emulations. – Enabling an incremental modeling of cloud networks. ● Performance and scalability. – Reuse the same controller code to simulate larger deployments. – Adaptive parallel and distributed simulations. Future Work ● Extension points for easy migrations. – More emulator and controller integrations.
  71. 71. 71/37 (2) *NetUber* (Complementary Slides) ● Networking’18
  72. 72. 72/37 Cost of Cloud Spot VMs ● 10 Gbps R4 instance pairs offered only a maximum of 1.2 Gbps of inter-region data transfer. – 10 Gbps only inside a placement group.
  73. 73. 73/37 Price disparity is real! Cost of Bandwidth Regions 1 - 9 (US, Canada, and EU) much cheaper than the others.
  74. 74. 74/37 Potential for Network Services ● NetUber uses memory-optimized R4 spot instances. – Each with 244 GB memory, 32 vCPU, and 10 GbE interface. ● Deploy network services at the instances – Value-added services for the customer. ● Encryption, WAN-Optimizer, load balancer, .. – Services for cost-efficiency. ● Compression.
  75. 75. 75/37 (3) *SMART* ● SDN Middlebox Architecture for Reliable Transfers. ● EI2N’16 and IM’17
  76. 76. 76/37 Introduction ● Differentiated QoS in multi-tenant cloud networks. – Different priorities among tenant processes. – Application-level user preferences and system policies. – Performance guarantees at the network-level. ● Network is shared among the tenants. – SLA guarantee despite congestion for critical flows.
  77. 77. 77/37 Motivation ● Cross-layer optimization of clouds with SDN. – Centralized network-as-a-service control plane.
  78. 78. 78/37 Our Proposal: SMART ● Cross-layer architecture for differentiated QoS of flows. ● FlowTags - Software middlebox to tag the network flows with contextual information. – Application-level preferences to the control plane as tags. – Dynamic flow routing modifications based on the tags. ● Timely delivery of priority flows by dynamically diverting them or cloning them to a less congested path. – Selective Redundancy – Adaptive approach in cloning and diverting.
  79. 79. 79/37 SMART Approach ● Divert or clone subflows by setting breakpoints in the priority flows, to avert congestion. – Trade-off of redundancy to ensure the SLA. – Adaptiveness with contextual information.
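The clone/divert decision at a breakpoint can be sketched as below. The field names, congestion measure, and threshold values are all assumptions for illustration; the thesis' actual adaptive algorithm consumes richer contextual tags.

```python
def smart_action(flow, congestion, sla_deadline_ms, clone_threshold=0.9):
    """Decide, at a breakpoint, what to do with a flow under congestion.
    `congestion` is the assumed utilization [0, 1] of the flow's next link."""
    if not flow["priority"]:
        return "route"               # non-priority flows keep the regular path
    if flow["elapsed_ms"] >= sla_deadline_ms:
        return "violated"            # SLA already missed; nothing to salvage
    if congestion >= clone_threshold:
        return "clone"               # redundancy: duplicate the subflow on a second path
    if congestion >= 0.5:
        return "divert"              # reroute the subflow via a less congested path
    return "route"
```

Cloning trades bandwidth redundancy for delivery assurance on near-saturated links, while diverting handles milder congestion without duplicating traffic, matching the redundancy/SLA trade-off stated on the slide.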
  80. 80. 80/37
  81. 81. 81/37
  82. 82. 82/37 SMART Deployment
  83. 83. 83/37 SMART Workflow
  84. 84. 84/37 I: Tag Generation for Priority Flows ● Tag generation query and response. – between hosts and FlowTags controller. ● A centralized controller for FlowTags. ● Tag the flows at the origin. ● FlowTagger software middlebox. – A generator of the tags. – Invoked by the host application layer. – Similar to the FlowTags-capable middleboxes for NATs
  85. 85. 85/37 II: Regular Routing until Policy Violation
  86. 86. 86/37 III: When a Threshold is Met ● Controller is triggered through the OpenFlow API. ● A series of control flows inside the control plane. ● Modify flow entries in the relevant switches.
  87. 87. 87/37 SMART Control Flows: Rules Manager ● A software middlebox in the control plane. ● Consumes the tags from the packet. – Similar to FlowTags-capable firewalls.
  88. 88. 88/37 Rules Manager Tags Consumption ● Interprets the tags – as input to the SMART Enhancer
  89. 89. 89/37 SMART Enhancer ● Gets the input to the enhancement algorithms. ● Decides the flow modifications. – Breakpoint node and packet. – Clone/divert decisions.
  90. 90. 90/37 Prototype Implementation ● Developed in Oracle Java 1.8.0. ● OpenDaylight Beryllium as the core SDN controller. ● Enhancer & Rules Manager middlebox: controller extensions. – Deployed in OpenDaylight Karaf runtime as OSGi bundles. ● FlowTags middlebox controller deployed with SDN controller. – FlowTags, originally a POX extension. ● Network nodes and flows emulated with Mininet. – Larger scale cloud deployments simulated.
  91. 91. 91/37 Evaluation Strategy ● Data center network with 1024 nodes and leaf-spine topology. – Path lengths of more than two hops. – Up to 100,000 short flows. ● Flow completion time < 1 s. ● A few non-priority elephant flows. – SLA → maximum permitted flow completion time for priority flows. – Uniformly randomized congestion. ● hitting a few uplinks of nodes concurrently. ● an overwhelming number of flows through the same nodes and links. ● Benchmark: SMART enhancements over base routing algorithms. – Performance (SLA-awareness), redundancy, and overhead.
  92. 92. 92/37 SMART Adaptive Clone/Replicate ● Replicate subsequent flows once a previous flow was cloned. – Shortest path and Equal-Cost Multi-Path (ECMP)
  93. 93. 93/37 Related Work ● Multipath TCP (MPTCP) uses the available multiple paths between the nodes concurrently to route the flows. – Performance, bandwidth utilization, & congestion control – through distributed load balancing. ● ProgNET: WS-Agreement and SDN for SLA-aware clouds. ● pFabric for deadline-constrained data flows with minimal completion time. ● QJump, a Linux traffic control module for latency-sensitive applications.
  94. 94. 94/37 Key Findings ● SMART leverages redundancy in the flows – Improve the SLA of the priority flows. ● Cross-layer optimizations through tagging the flows. – For differentiated QoS. Future Work ● Implementation of SMART on a real data center network. ● Evaluate against the related work quantitatively.
  95. 95. 95/37 (4) *Mayan* ● Software-Defined Service Compositions ● ICWS’16 and SDS’16
  96. 96. 96/37 Introduction ● eScience workflows – Computation-intensive. – Execute on highly distributed networks. ● Complex service composition workflows – To automate scientific and enterprise business processes.
  97. 97. 97/37 Motivation ● Better orchestration of service workflow compositions in wide area networks. ● Software-Defined Service Composition
  98. 98. 98/37 Our Proposal: Mayan ● SDN-based approach for adaptively composing multi-domain service workflows – An efficient service instance selection. – Loose coupling of service definitions and implementations. – Availability of a logically centralized control plane. ● State of executions and transactions stored in the controller distributed data tree. – Clustered and federated deployments with MOM.
  99. 99. 99/37 Alternative Representations
  100. 100. 100/37 Mayan Services Registry: Modeling Language
  101. 101. 101/37 Service Composition Representation ● <Service3,(<Service1, Input1>, <Service2, Input2>)>
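The nested representation ⟨Service3, (⟨Service1, Input1⟩, ⟨Service2, Input2⟩)⟩ can be evaluated recursively: resolve each input that is itself a composition, then invoke the outer service. The three service implementations below are hypothetical stand-ins for deployed service instances.

```python
def compose(node, services):
    """Evaluate a nested composition <ServiceK, (inputs...)>, where each input is
    either a literal value or another <Service, inputs> pair."""
    name, args = node
    resolved = [compose(a, services) if isinstance(a, tuple) else a for a in args]
    return services[name](*resolved)

# Hypothetical implementations standing in for deployed service instances.
services = {
    "Service1": lambda x: x + 1,
    "Service2": lambda x: x * 2,
    "Service3": lambda a, b: a + b,
}

# <Service3, (<Service1, 3>, <Service2, 4>)>
workflow = ("Service3", (("Service1", (3,)), ("Service2", (4,))))
result = compose(workflow, services)
```

In Mayan the service names resolve through the services registry to one of several alternative implementations or deployments; here a plain dictionary plays that role.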
  102. 102. 102/37 Service Instances: Alternative Implementations and Deployments
  103. 103. 103/37 Solution Architecture ● Mayan Controller Farm: Inter-Domain Compositions
  104. 104. 104/37
  105. 105. 105/37
  106. 106. 106/37
  107. 107. 107/37 Evaluation ● Evaluation Environment: – Smaller physical deployments in a cluster. – Larger deployments as simulations and emulations (Mininet). ● Evaluation Strategy: – A workflow performing distributed data cleaning and consolidation. ● A distributed web service composition vs. ● the Mayan approach with the extended SDN architecture.
  108. 108. 108/37 Speedup and Horizontal Scalability ● No performance degradation for larger deployments.
  109. 109. 109/37 Controller Throughput ● No. of messages entirely processed by the controller. – Publisher → Controller → Receiver. ● 5,000 messages/s at a concurrency of 10 million messages.
  110. 110. 110/37 Processing Time ● Total time to process the complete set of messages – Against a varying number of messages. ● Linear scaling with the number of parallel messages. – 10 million messages in 40 minutes.
  111. 111. 111/37 Success Rate ● Success rate of the controller vs. number of messages processed in parallel. – 99.5% for up to 10 million parallel messages.
  112. 112. 112/37 Scalability of the Mayan Controller ● Presented results for a single stand-alone controller. ● Mayan is designed as a federated deployment. – Scales horizontally to ● manage a wider area with a more substantial number of service nodes and improved latency. ● handle more concurrent messages in each controller domain.
  113. 113. 113/37 Key Findings ● SDN-based approach that enables efficient and flexible large-scale service composition workflows. – Multi-tenant and multi-domain executions. – Service composition with web services and distributed execution frameworks. ● Related work on SDN for distributed frameworks and service workflows. – Palantir: SDN for MapReduce performance with the network proximity data.
  114. 114. 114/37 (5) *Évora* (Complementary Slides) ● ETT’18
  115. 115. 115/37 A User-Defined NSC Among the Edge Nodes
  116. 116. 116/37 Problem Scale: Representation of the Service Graph from the Node Graph ● The number of links in this service graph grows – linearly with the number of edges or links between the edge nodes, – exponentially with the average number of services per edge node.
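A quick way to see the blow-up described above: under the simplifying assumption that every service on a node may chain to every service on an adjacent node, each node-graph edge already contributes s × s service-level links. This sketch captures only that per-edge quadratic factor, a lower bound on the growth the slide describes.

```python
def service_graph_links(node_edges, services_per_node):
    """Estimate service-graph links from the node graph, assuming full
    service-to-service connectivity across each node-graph edge (s * s per edge)."""
    s = services_per_node
    return node_edges * s * s

# For a fixed topology, doubling the services per node quadruples the
# service-level links the orchestrator must consider.
small = service_graph_links(100, 4)
large = service_graph_links(100, 8)
```

This is why Évora's greedy incremental matching, rather than exhaustive graph matching, matters once edge nodes host many services.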
  117. 117. 117/37 Two are given more prominence (weight = 10) than the third (weight = 3).
  118. 118. 118/37 MILP and Graph Matching can be Computationally Intensive ● But initialization happens once per user chain with a given policy. – This procedure does not repeat once initialized, – unless updates are received from the edge network. ● A new node with the service offering at the edge. ● An existing node or a service offering fails to respond. ● Services in each NSC are typically 5 – 10. – The Évora algorithm follows a greedy approach, rather than typical graph matching.
  119. 119. 119/37 Performance and Scalability of Évora Orchestrator Algorithms
  120. 120. 120/37
  121. 121. 121/37
  122. 122. 122/37
  123. 123. 123/37 (6) *SD-CPS* ● Software-Defined Cyber-Physical Systems ● CLUSTER’18, SDS’17, and M4IoT’15
  124. 124. 124/37 Cyber-Physical System (CPS) ● A system composed of cyber and physical elements. ● Challenges in CPS. – Modeling – Large-scale heterogeneous execution environments. – Decision making: communication and coordination. – Management and orchestration of the intelligent agents.
  125. 125. 125/37 Motivation ● An SDS to address the challenges of CPS. Desired Properties in a new CPS framework ● Easy to adopt from current CPS approaches. ● Should not introduce more/new challenges.
  126. 126. 126/37 ● An SDS framework for CPS workflows at the edge. – CPS workload as edge service workflows. ● A dual (physical and virtual/cyber) execution environment for CPS executions. – Efficient CPS modelling and simulations. – Mitigate the unpredictability of the physical execution environment. ● Resilience for critical flows with a differentiated QoS. – End-to-end delivery guarantees. Our Proposal: Software-Defined Cyber-Physical Systems (SD-CPS)
  127. 127. 127/37 SD-CPS Controller Architecture
  128. 128. 128/37 Controller Farm and Software-Defined Sensor Networks
  129. 129. 129/37 Modeling and Simulating CPS ● Cyberspace to model the smart devices as virtual intelligent agents. ● Mapped interactions between the actors in physical & cyber spaces. ● Incrementally model and load from the controller farm.
  130. 130. 130/37 Evaluation Environment ● Edge nodes and service resource requirements – with properties normalized. ● Resource requirement – Negative value: even the smallest node satisfies it. – High positive value: higher demand for the resource.
  131. 131. 131/37 Service Deployment Over the Nodes ● How each service is deployed across nodes. ● How each node hosts several services.
  132. 132. 132/37 Parallel Execution of 1 million workflows ● Minimal idling nodes. ● High resource utilization.
  133. 133. 133/37 Related Work ● SDN for Heterogeneous Devices. – Sensor OpenFlow: SD-Wireless Sensor Networks. ● Scaling SDN: Clustering SDN controller with Akka. ● OpenDaylight Federation ● Conceptual Data Tree projects. ● SDS for Smart Environments. ● Albatross: Taming challenges of distributed systems
  134. 134. 134/37 Key Findings ● Increased resource efficiency using edge workflows. ● An approach to mitigate the design and operations challenges in CPS. ● Benefits of SDN to CPS. – Unified and centralized control. – Improved QoS, management, and resilience. – Reduced repeated effort in modeling.
  135. 135. 135/37 (7) *Óbidos* ● On-demand Big Data Integration, Distribution, and Orchestration System ● DAPD’19, CoopIS’15, and DMAH’17
  136. 136. 136/37 Introduction ● Volume, variety, and distribution of big data are rising. – Structured, semi-structured, unstructured, or ill-formed. ● Integration of data is crucial for data science. – Multiple types of data: Imaging, clinical, and genomic. – Numerous data sources: No shared messaging protocol. – Do we really need to integrate all the data? ● Sharing of integrated data and results for reproducibility.
  137. 137. 137/37 Human-in-the-loop On-Demand Data Integration ● Service-based data access through APIs. – Thanks to specifications such as HL7 FHIR. ● The researchers possess domain knowledge. ● Integrate On-Demand. – Avoid eager loading of binary data or its textual metadata. – Use the researcher query as an input in loading data. ● Scalable storage in-house. – Load, integrate, index, and query unstructured data.
  138. 138. 138/37 Data Sharing Intra-Organization ● Load data only once per organization. – Bandwidth and storage efficiency.
  139. 139. 139/37 Data Sharing Inter-Organization ● Do not duplicate data! – We “own” our interest, not the data. ● Point to the data in the data sources. – Pointers to data like Dropbox Shared Links. ● Avoids outdated duplicate data. ● Easy to maintain. ● APIs – Access the list of research data sets.
  140. 140. 140/37 Problems ● How to.. – Load data from several big data sources. ● Avoid repeated loading and near duplicate data. – Integrate disparate data and persist for future accesses. – Share pointers to data internally and externally.
  141. 141. 141/37 Our Proposal: Óbidos ● Define subsets of data that are of interest. – using the hierarchical structure of medical data. ● Medical Images (DICOM), Clinical data, .. ● User query → Narrow down the search space. On-demand Big Data Integration, Distribution & Orchestration System
  142. 142. 142/37 Óbidos Approach ● Hybrid of virtual and materialized data integration approaches. – Lazy load of metadata: Load the matching subset of metadata. – Store integrated data and query results → scalable storage. ● Track already loaded data. – Near duplicate detection. – Download only updates (changesets). ● Efficient SQL queries on NoSQL storage. ● Share pointers to the datasets rather than the dataset itself. ● Generic design; implementation for medical research data.
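The lazy, tracked loading and pointer-based sharing above can be sketched as follows. The class, method, and field names are hypothetical, and the "tags" matching stands in for Óbidos' query-driven narrowing of the search space.

```python
class ObidosLoader:
    """Sketch of Óbidos-style on-demand integration (names are hypothetical):
    lazily load only metadata matching the researcher's interest, track what is
    already local to avoid duplicates, and share pointers instead of binary data."""

    def __init__(self, source):
        self.source = source   # {study_id: metadata} at a remote repository
        self.loaded = {}       # local integrated store; doubles as duplicate tracker

    def load(self, interest):
        """Fetch only the not-yet-loaded studies matching the interest (a changeset)."""
        fetched = [sid for sid, meta in self.source.items()
                   if interest in meta["tags"] and sid not in self.loaded]
        for sid in fetched:
            self.loaded[sid] = self.source[sid]
        return fetched

    def replicaset(self, study_ids):
        """Share pointers (like Dropbox shared links), never the data itself."""
        return [("pointer", sid) for sid in study_ids if sid in self.loaded]

loader = ObidosLoader({"s1": {"tags": ["ct"]},
                       "s2": {"tags": ["mri"]},
                       "s3": {"tags": ["ct"]}})
first = loader.load("ct")    # loads s1 and s3
second = loader.load("ct")   # nothing new: already tracked locally
```

The second `load` returning nothing is the behavior behind the constant Óbidos load times on the later evaluation slides: only updates (changesets) cross the network.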
  143. 143. 143/37 Óbidos Architecture
  144. 144. 144/37
  145. 145. 145/37 Data Sharing with Óbidos
  146. 146. 146/37
  147. 147. 147/37 Data Structures of the Replicaset Holder
  148. 148. 148/37 Evaluation ● Evaluation Data: – Clinical data and TCIA DICOM imaging collections. ● Benchmark Óbidos against eager and lazy ETL. – Performance of loading and querying data. ● Óbidos (inter- and intra- org) against binary data sharing. – Space/bandwidth efficiency of data sharing.
  149. 149. 149/37 Workload Characterization Various Entries in Evaluated Collections
  150. 150. 150/37 Data Load Time Change in total data volume (Same query and same interest) ● Load time ↟ for eager & lazy ETL with the total volume. ● Load time for Óbidos remains constant.
  151. 151. 151/37 Change in studies of interest (Same query and constant total data volume) Data Load Time ● Load time for eager and lazy ETL remains constant. ● Load time increases for Óbidos with the interest. – Converges to the load time of lazy ETL.
  152. 152. 152/37 Load Time from the Remote Data Sources ● Eager and lazy ETL take much longer – To load more data and metadata over the Internet.
  153. 153. 153/37 Query Completion Time for the Integrated Data Repository ● Corresponding data already loaded in Óbidos. ● Indexed scalable NoSQL architecture of Óbidos → Better performance.
  154. 154. 154/37 Efficiency in Sharing Medical Research Data ● Replicaset – Pointers of marginal size, yet increasing with entries of the same granularity.
  155. 155. 155/37 Key Findings ● Óbidos offers on-demand service-based big data integration. – Fast and resource-efficient data analysis. – SQL queries over a NoSQL data store for the integrated data. – Efficient data sharing without replicating the actual data. Future Work – Consume data from repositories beyond the medical domain. ● EUDAT – Óbidos distributed virtual data warehouses. ● Leverage the proximity in data integration and sharing.
  156. 156. 156/37 (8) *Mayan-DS* ● Software-Defined Data Services (SDDS) ● CCPE’19 and SDS’18 (Best Paper Award) ● Work-in-progress
  157. 157. 157/37 Introduction ● Data services: Service APIs to big data → Interoperability. ● Related data and services distributed far from each other → Bad performance with scale. ● Chaining of data services. – Composing chains of numerous data services. – Data access → Data cleaning → Data integration. ● How to scale out efficiently? – How to minimize communication overheads?
  158. 158. 158/37 Motivation ● Software-Defined Networking (SDN). – A unified controller to the data plane devices. – Brings network awareness to the applications. ● Data services – Make big data executions interoperable. ● Can we bring SDN to the Data Services? – Software-Defined Data Services (SDDS)
  159. 159. 159/37 Our Proposal: Software-Defined Data Services (SDDS) ● SDDS as a generic approach for data services. – Extending and leveraging SDN. ● Mayan-DS, an SDDS framework. – Efficient management of data services. – Interoperability and scalability.
  160. 160. 160/37 Solution Architecture
  161. 161. 161/37 SDDS Approach ● Define all the data operations as interoperable services. ● SDN for distributing data and service executions – Inside a data center (e.g. Software-Defined Data Centers). – Beyond data centers (extend SDN with MOM). ● Optimal placement of data and service execution. – Minimize communication overhead and data movements. ● Execute data service on the best-fit server, until interrupted.
  162. 162. 162/37 Efficient Data and Execution Placement ● Notation: {i, j} – related data objects; D – datasets of interest; n – execution node; ξ – spread of the related data objects.
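The placement objective can be sketched with the slide's notation: choose the execution node n that minimizes the spread ξ of the related data objects. Modeling ξ as the sum of distances from n to each object's location is an assumption for illustration, as are the node names and latency figures.

```python
def place_execution(related_locations, nodes, distance):
    """Pick the execution node n minimizing the spread (xi) of related data
    objects {i, j}: here, the sum of distances from n to each object's node."""
    return min(nodes, key=lambda n: sum(distance[(n, loc)] for loc in related_locations))

# Hypothetical modeled latencies in ms; zero to self, symmetric between nodes.
nodes = ["n1", "n2", "n3"]
distance = {("n1", "n1"): 0,  ("n1", "n3"): 50,
            ("n2", "n1"): 20, ("n2", "n3"): 20,
            ("n3", "n1"): 50, ("n3", "n3"): 0}

# Related objects {i, j} live on n1 and n3; n2 sits between the data.
best = place_execution(["n1", "n3"], nodes, distance)
```

Executing the data service on the chosen best-fit node, until an update interrupts, is the behavior stated on the SDDS approach slide.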
  163. 163. 163/37 Prototype Implementation
  164. 164. 164/37 Simulated Environment (with Modeled Latency in ms)
  165. 165. 165/37 Ping Times (ms) Between Two Nodes: Regular Internet vs. Mayan-DS
  166. 166. 166/37 Latency: Ping Times of Mayan-DS ● Up to 33% reduction in latency – with a fraction of the path through a direct link. ● 75% or more reduction with significant portion of direct link.
  167. 167. 167/37 Key Findings ● Software-Defined Data Services (SDDS) for interoperability and scalability in big data executions. ● Mayan-DS leverages SDN for big data workflows at Internet-scale. ● Limited focus of industrial offerings. – Storage or one or a few specific services. Future Work ● Extend Mayan-DS for edge and IoT/CPS environments.
