http://www.learntek.org/product/apache-flink/
http://www.learntek.org
Learntek is a global online training provider for Big Data Analytics, Hadoop, Machine Learning, Deep Learning, IoT, AI, Cloud Technology, DevOps, Digital Marketing, and other IT and management courses. We are dedicated to designing, developing, and implementing training programs for students, corporate employees, and business professionals.
2. Apache Flink
The following topics will be covered in our
Apache Flink Online Training:
Copyright © 2015 Learntek. All Rights Reserved.
3. What is Apache Flink?
Apache Flink is an open source stream processing framework developed by
the Apache Software Foundation. The core of Apache Flink is a distributed
streaming dataflow engine written in Java and Scala. Apache Flink’s dataflow
programming model provides event-at-a-time processing on both finite and
infinite datasets. At a basic level, Flink programs consist of streams and
transformations.
4. What is Apache Flink? (continued)
Conceptually, a stream is a (potentially never-ending) flow of data records, and
a transformation is an operation that takes one or more streams as input, and
produces one or more output streams as a result. Programs can be written in
Java, Scala, Python, and SQL and are automatically compiled and optimized into
dataflow programs that are executed in a cluster or cloud environment.
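The stream and transformation idea can be sketched in plain Python. This is only an illustration of the concept, not the Flink API: the helper names map_stream, filter_stream, and sensor_readings are hypothetical, and generators stand in for streams. The same transformations are applied to a finite list and to a never-ending source, one record at a time.

```python
import itertools


def map_stream(fn, stream):
    """Transformation: takes one input stream, produces one output stream."""
    for record in stream:
        yield fn(record)  # each record is handled as it arrives


def filter_stream(predicate, stream):
    """Transformation: forwards only the records that satisfy the predicate."""
    for record in stream:
        if predicate(record):
            yield record


def sensor_readings():
    """Hypothetical unbounded source: emits 0, 1, 2, ... without end."""
    return itertools.count(start=0)


def is_even(value):
    return value % 2 == 0


def times_ten(value):
    return value * 10


# Finite (bounded) input: the pipeline ends when the data ends.
bounded = [1, 2, 3, 4, 5, 6]
bounded_result = list(map_stream(times_ten, filter_stream(is_even, bounded)))

# Infinite (unbounded) input: the same pipeline never ends by itself,
# so only the first three output records are taken here.
unbounded = map_stream(times_ten, filter_stream(is_even, sensor_readings()))
unbounded_head = list(itertools.islice(unbounded, 3))

print(bounded_result)
print(unbounded_head)
```

Because the generators are lazy, nothing is computed until a record is requested, which is why the same code works on the unbounded source.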
5. Why Apache Flink?
• Flink provides a high-throughput, low-latency streaming engine as well as
support for event-time processing and state management. Flink applications
are fault-tolerant in the event of machine failure and support exactly-once
semantics. Flink executes arbitrary dataflow programs in a data-parallel and
pipelined manner. Flink’s pipelined runtime system enables the execution of
bulk/batch and stream processing programs. Furthermore, Flink’s runtime
supports the execution of iterative algorithms natively.
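Event-time processing and state management can be illustrated with a small plain-Python sketch. It is not Flink code: the function tumbling_window_counts and the sample events are hypothetical, and it leaves out the parts a real engine needs, such as watermarks to decide when a window is complete and checkpoints to recover state after a failure. It only shows that records are grouped by the timestamp they carry, not by arrival order, and that a counter is kept per key and window.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size_ms):
    """Count events per (key, window), using each event's own timestamp.

    events: iterable of (key, event_time_ms) pairs, possibly out of order.
    Returns a dict mapping (key, window_start_ms) to a count.
    """
    counts = defaultdict(int)  # keyed state: one counter per (key, window)
    for key, event_time_ms in events:
        # Align the timestamp down to the start of its fixed-size window.
        window_start = event_time_ms - (event_time_ms % window_size_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


# Hypothetical input; the event ("a", 4999) arrives after ("a", 7000)
# but still lands in the earlier window because of its event time.
events = [("a", 1000), ("b", 1500), ("a", 7000), ("a", 4999), ("b", 5000)]
result = tumbling_window_counts(events, window_size_ms=5000)

for (key, window_start), count in sorted(result.items()):
    print(key, window_start, count)
```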
6. Flink Introduction
• Architecture
• Distributed Execution
• Job Manager
• Task Manager
• Features
• Deploying Flink on Google Cloud and AWS
7. Data Stream API
• Execution environment
• Data sources
• Transformations
• Data sinks
• Connectors
8. Batch Processing API
• Data sources
• Transformations
• Broadcast Variable
• Connectors to various Systems
• Iterations
9. Structured data handling using the Table API
• Registering tables
• Accessing the registered table
• Operators
• Data types
• SQL
10. Complex event processing
• Introduction to CEP and Flink CEP
• Event Streams
• Pattern API
• Continuity
• Selecting from Pattern
12. Integration between Flink and Hadoop
• Flink-Yarn Session
• Job Submission to Flink
• Execution of a Flink job on YARN
• Flink and YARN interaction details