Blogs

Getting Started With Pyflink on AWS - Part 3 AWS Managed Flink and MSK

September 4, 2023 | 13 min read | Data Streaming | Getting Started With Pyflink on AWS | Amazon MSK, Apache Kafka, AWS, Docker, Kpow, Pyflink, Python

In this series of posts, we discuss a Flink (Pyflink) application that reads from and writes to Kafka topics. In the previous posts, I demonstrated a Pyflink app that targets a local Kafka cluster as well as a Kafka cluster on Amazon MSK. The app was executed in a virtual environment as well as in a local Flink cluster for improved monitoring. In this post, the app will be deployed via Amazon Managed Service for Apache Flink.

August 28, 2023 | 20 min read | Data Streaming | Getting Started With Pyflink on AWS | Amazon MSK, Apache Flink, Apache Kafka, AWS, Kpow, Pyflink, Python

In this series of posts, we discuss a Flink (Pyflink) application that reads from and writes to Kafka topics. In part 1, an app that targets a local Kafka cluster was created. In this post, we will update the app by connecting it to a Kafka cluster on Amazon MSK. The Kafka cluster is authenticated by IAM, so the app requires an additional jar dependency. As Amazon Managed Service for Apache Flink does not allow you to specify multiple pipeline jar files, we have to build a custom Uber Jar that combines multiple jar files. As in part 1, the app will be executed in a virtual environment as well as in a local Flink cluster for improved monitoring, this time with the updated pipeline jar file.

August 17, 2023 | 16 min read | Data Streaming | Getting Started With Pyflink on AWS | Apache Flink, Apache Kafka, Docker, Kpow, Pyflink, Python

Apache Flink is widely used for building real-time stream processing applications. On AWS, Amazon Managed Service for Apache Flink is the easiest option to develop a Flink app as it provides the underlying infrastructure. Updating a guide from AWS, this series of posts discusses how to develop and deploy a Flink (Pyflink) application via this service (formerly Amazon Kinesis Data Analytics, or KDA) where the data source and sink are Kafka topics. In part 1, the app will be developed locally targeting a Kafka cluster created by Docker. Furthermore, it will be executed in a virtual environment as well as in a local Flink cluster for improved monitoring.

August 10, 2023 | 16 min read | Data Streaming | Kafka, Flink and DynamoDB for Real Time Fraud Detection | Amazon DynamoDB, Apache Flink, Apache Kafka, AWS, Docker, Kpow, Python

Apache Flink is widely used for building real-time stream processing applications. On AWS, Amazon Managed Service for Apache Flink is the easiest option to develop a Flink app as it provides the underlying infrastructure. Re-implementing a solution from an AWS workshop, this series of posts discusses how to develop and deploy a fraud detection app using Kafka, Flink and DynamoDB. Part 1 covers local development using Docker, while deployment via Amazon Managed Service for Apache Flink (formerly Amazon Kinesis Data Analytics, or KDA) will be discussed in part 2.

July 20, 2023 | 14 min read | Data Streaming, Security | Kafka Development With Docker | Apache Kafka, Docker, Python

In the previous posts, we discussed how to implement client authentication with TLS (SSL or TLS/SSL) and with SASL. One of the key benefits of client authentication is that it enables user access control. In this post, we will discuss how to configure Kafka authorization with Java and Python client examples while SASL is kept for client authentication.

July 13, 2023 | 11 min read | Data Streaming, Security | Kafka Development With Docker | Apache Kafka, Docker, Python, SASL

In the previous post, we discussed TLS (SSL or TLS/SSL) authentication to improve security. It enforces two-way verification where a client certificate is verified by Kafka brokers. Client authentication can also be enabled by Simple Authentication and Security Layer (SASL), and we will discuss how to implement SASL authentication with Java and Python client examples in this post.

July 6, 2023 | 14 min read | Data Streaming, Security | Kafka Development With Docker | Apache Kafka, Docker, Python, TLS

To improve security, we can extend TLS (SSL or TLS/SSL) encryption either by enforcing two-way verification, where a client certificate is verified by Kafka brokers (SSL authentication), or by choosing a separate authentication mechanism, which is typically Simple Authentication and Security Layer (SASL). In this post, we will discuss how to implement SSL authentication with Java and Python client examples, while SASL authentication is covered in the next post.

July 3, 2023 | 14 min read | Data Integration, Data Streaming | Kafka Connect for AWS Services Integration | Amazon DynamoDB, Amazon MSK, Apache Camel, Apache Kafka, AWS, Kafka Connect, Kpow

As part of investigating how to utilize Kafka Connect effectively for AWS services integration, I demonstrated how to develop the Camel DynamoDB sink connector using Docker in Part 2. Fake order data was generated using the MSK Data Generator source connector, and the sink connector was configured to consume the topic messages to ingest them into a DynamoDB table. In this post, I will illustrate how to deploy the data ingestion applications using Amazon MSK and MSK Connect.

June 29, 2023 | 13 min read | Data Streaming, Security | Kafka Development With Docker | Apache Kafka, Docker, Python, TLS

We can configure Kafka clients and other components to use TLS (SSL or TLS/SSL) encryption to secure communication. It is a one-way verification process where a server certificate is verified by a client during the SSL handshake. Moreover, we can improve security by adding client authentication. In this post, we will discuss how to configure SSL encryption with Java and Python client examples, while client authentication will be covered in later posts.

June 22, 2023 | 12 min read | Data Streaming | Kafka Development With Docker | Apache Kafka, AWS, AWS Glue Schema Registry, Docker, Kpow, Python

In Part 4, we developed Kafka producer and consumer applications using the kafka-python package without integrating a schema registry. Later, in Part 5, we discussed the benefits of a schema registry when developing Kafka applications. In this post, I'll demonstrate how to enhance the existing applications by integrating AWS Glue Schema Registry.

Getting Started With Pyflink on AWS - Part 3 AWS Managed Flink and MSK

Getting Started With Pyflink on AWS - Part 2 Local Flink and MSK

Getting Started With Pyflink on AWS - Part 1 Local Flink and Local Kafka

Kafka, Flink and DynamoDB for Real Time Fraud Detection - Part 1 Local Development

Kafka Development With Docker - Part 11 Kafka Authorization

Kafka Development With Docker - Part 10 SASL Authentication

Kafka Development With Docker - Part 9 SSL Authentication

Kafka Connect for AWS Services Integration - Part 3 Deploy Camel DynamoDB Sink Connector

Kafka Development With Docker - Part 8 SSL Encryption

Kafka Development With Docker - Part 7 Producer and Consumer With Glue Schema Registry