
Snaplogic Live: Big Data in Motion


Watch this recorded demonstration of SnapLogic from our team of experts who answer your hybrid cloud and big data integration questions.
demo, ipaas, elastic integration, cloud data, app integration, data integration, hybrid cloud integration, big data, big data integration




  1. SnapLogic Live – Big Data in Motion
  2. Source: Internap http://www.internap.com/resources/infographic-data-motion-vs-data-rest/
  3. The Data Lake – “Just as data integration is the foundation of the data warehouse, an end-to-end data processing capability is the core of the data lake. The new environment needs a new workhorse.” – Mark Madsen, Third Nature (snaplogic.com/resources)
  4. Current Data Lake Architecture
     • Data Acquisition – collect and integrate data from multiple sources (Kafka, Sqoop, Flume) into HDFS, AWS S3 or MS Azure Blob, in batch or streaming mode. Sources: on-prem apps and data (ERP, CRM, RDBMS), cloud apps and data (CRM, HCM, social), IoT data (sensors, wearables, devices).
     • Data Management – add information and improve data with Spark, Python, Scala, Java, R or Pig; schedule and manage with Oozie and Ambari.
     • Data Access – organize and prepare data for visualization from HDFS, AWS S3, MS Azure Blob or Hive; real-time access via Impala, HiveSQL or SparkSQL. Targets: lakeshore data marts (MS Azure, AWS Redshift) and BI/analytics tools (Tableau, MS PowerBI/Azure, AWS QuickSight).
  5. Modern Data Lake Architecture – Powered by SnapLogic
     • Data Acquisition – collect and integrate data from multiple sources using SnapLogic pipelines with standard mode execution.
     • Data Management – sort, aggregate, join, merge and transform; SnapLogic abstracts and operationalizes the work as MapReduce or Spark pipelines, and handles scheduling and management.
     • Data Access – organize and prepare data for visualization using SnapLogic pipelines with standard mode execution.
     • Same sources (on-prem, cloud and IoT) and targets (lakeshore data marts, BI/analytics) as the current architecture, with batch, streaming and real-time support.
  6. SnapLogic in the Modern Data Fabric – Source → Ingest → Store & Process → Deliver → Consume. Data integration and transformation ingests from on-prem applications, relational databases, cloud applications, NoSQL databases, web logs and the Internet of Things, and delivers to data warehouses and data marts (including HANA) as well as big data platforms and data lakes.
  7. Modern Architecture: Hybrid and Elastic
     • Streams: no data is stored or cached.
     • Secure: 100% standards-based.
     • Elastic: scales out and handles both data and app integration use cases.
     • The cloud-based Designer, Manager and Dashboard exchange only metadata; data flows through a Cloudplex, Groundplex or Hadooplex on the appropriate side of the firewall, connecting databases, on-prem apps, big data, and cloud apps and data.
  8. Discussion – snaplogic.com. Unified platform, self-service UX, modern architecture, connected: 400+ Snaps.

Editor's notes

  • http://blog.econocom.com/en/blog/whats-a-data-lake/
  • This is just a sampling of the available technologies that may go into a data lake.

    To date, most data lake deployments have been built through manual coding, open source tools and custom integration.

    Manual coding of data processing applications is common because data processing is thought of in terms of application-specific work. Unfortunately, this manual effort is a dead-end investment over the long term because the underlying technologies are constantly changing.

    Older data warehouse environments and ETL type integration tools are good at what they do, but they can’t meet many of the new needs. The new environments are focused on data processing, but require a lot of manual work.

    The data lake must incorporate aspects of old data warehouse environments like connecting to and extracting data from ERP or transaction processing systems, yet do this without clunky and inefficient tools like Sqoop. The data lake also must support new capabilities like reliable collection of large volumes of events at high speed and
    timely processing to make data available immediately. It must also support data coming from multiple sources in a hybrid model. This exceeds the abilities of traditional data integration tools.
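One of those new needs — reliable collection of large volumes of events with timely processing — is what micro-batched ingestion tools provide. The sketch below is illustrative only (the class and its methods are invented for this example, not part of SnapLogic, Kafka or Flume): it buffers incoming events and flushes them to a sink as newline-delimited JSON once a batch fills.

```python
import io
import json

class MicroBatchCollector:
    """Toy event collector: buffers events and flushes them in
    micro-batches, the pattern ingestion tools serve at scale.
    Illustrative names only, not any vendor's API."""

    def __init__(self, sink, batch_size=100):
        self.sink = sink          # any writable file-like object
        self.batch_size = batch_size
        self.buffer = []

    def collect(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Land one newline-delimited JSON record per event, then clear.
        for event in self.buffer:
            self.sink.write(json.dumps(event, sort_keys=True) + "\n")
        self.buffer.clear()

sink = io.StringIO()
collector = MicroBatchCollector(sink, batch_size=2)
for i in range(5):
    collector.collect({"sensor": "s1", "reading": i})
collector.flush()  # drain the partial final batch
lines = sink.getvalue().splitlines()
print(len(lines))  # 5 records landed
```

In a real deployment the sink would be HDFS, S3 or a message broker rather than an in-memory buffer, and batching would also be triggered by time, not just count.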

  • SnapLogic accelerates development of a modern data lake through:

    Data acquisition: collecting and integrating data from multiple sources. SnapLogic goes beyond developer tools such as Sqoop and Flume with a cloud-based visual pipeline designer, and pre-built connectors for 300+ structured and unstructured data sources, enterprise applications and APIs.

    Data transformation: adding information and transforming data. SnapLogic minimizes the manual tasks associated with data shaping and makes data scientists and analysts more efficient. SnapLogic includes Snaps for tasks such as transformations, joins and unions without scripting.

    Data access: organizing and preparing data for delivery and visualization. SnapLogic makes data processed on Hadoop or Spark easily available to off-cluster applications and data stores such as statistical packages and business intelligence tools.
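To make the transformation step concrete, here is what a join and a union do to record streams. This is a plain-Python sketch for illustration only — SnapLogic exposes these operations as configurable Snaps rather than code, and the function names here are invented:

```python
# Illustrative record-stream operations; not SnapLogic's API.
def join(left, right, key):
    """Inner-join two lists of dict records on a shared key."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

def union(*streams):
    """Concatenate several record streams into one."""
    return [record for stream in streams for record in stream]

orders = [{"order_id": 1, "cust_id": "a"}, {"order_id": 2, "cust_id": "b"}]
customers = [{"cust_id": "a", "name": "Acme"}]

joined = join(orders, customers, "cust_id")
print(joined)  # [{'order_id': 1, 'cust_id': 'a', 'name': 'Acme'}]

combined = union(joined, [{"order_id": 3, "cust_id": "c", "name": None}])
print(len(combined))  # 2
```

The value of a visual pipeline tool is that analysts configure the `key` and the input streams instead of writing and maintaining this kind of scripting themselves.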

  • Here is an example of a SnapLogic deployment.

    The SnapLogic control plane – including the Designer, Manager and Dashboard – does not store your data; it holds metadata only.
    When a pipeline is executed, the platform looks for the associated Snaplex or Hadooplex. The plex dynamically scales out, adding more nodes as needed.
    We like to say that SnapLogic “respects data gravity” and runs as close to the data as need be. If you are integrating only cloud applications, it would make no sense to run your integrations behind the firewall. Similarly, if you’re doing ground-to-ground or cloud-to-ground integration, you may want to run your Snaplex on Windows or Linux servers.

    Note that the dotted line is sending instructions via metadata to the plex, which is waiting to run. The solid line indicates how data moves bi-directionally between systems.
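The control-plane/data-plane split described above can be sketched in a few lines. This is a toy model with invented class names, not SnapLogic's API; the point is that the control plane stores only pipeline metadata while a "plex" worker touches the actual records:

```python
# Toy sketch of the control-plane / data-plane split; illustrative only.
class Plex:
    """Execution node: receives instructions, moves and transforms data."""
    def run(self, pipeline_meta, records):
        transform = pipeline_meta["transform"]
        return [transform(r) for r in records]

class ControlPlane:
    """Holds pipeline definitions (metadata only), never the data."""
    def __init__(self):
        self.pipelines = {}

    def register(self, name, transform):
        self.pipelines[name] = {"name": name, "transform": transform}

    def execute(self, name, plex, records):
        # The "dotted line" in the slide: only metadata travels to the
        # plex, which performs the actual data movement itself.
        return plex.run(self.pipelines[name], records)

cp = ControlPlane()
cp.register("uppercase", lambda r: r.upper())
result = cp.execute("uppercase", Plex(), ["erp", "crm"])
print(result)  # ['ERP', 'CRM']
```

Notice that after execution the control plane still holds only the pipeline definition — no records — which is the property the slide's metadata-versus-data distinction is making.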
