Data Streaming

Kafka Development With Docker - Part 7 Producer and Consumer With Glue Schema Registry

June 22, 202312 min read Data Streaming Kafka Development With Docker Apache Kafka AWS AWS Glue Schema Registry Docker Kpow Python

In Part 4, we developed Kafka producer and consumer applications using the kafka-python package without integrating schema registry. Later we discussed the benefits of schema registry when developing Kafka applications in Part 5. In this post, I'll demonstrate how to enhance the existing applications by integrating AWS Glue Schema Registry.

June 15, 202312 min read Data Integration Data Streaming Kafka Development With Docker Apache Kafka AWS AWS Glue Schema Registry Docker Kafka Connect Kpow

In Part 3, we developed a data ingestion pipeline using Kafka Connect source and sink connectors without enabling schemas. Later we discussed the benefits of schema registry when developing Kafka applications in Part 5. In this post, I'll demonstrate how to enhance the existing data ingestion pipeline by integrating AWS Glue Schema Registry.

June 8, 20237 min read Data Streaming Kafka Development With Docker Apache Kafka AWS AWS Glue Schema Registry Kafka Connect Schema Registry

The Glue Schema Registry supports features to manage and enforce schemas on data streaming applications using convenient integrations with Apache Kafka and other AWS managed services. In order to utilise those features, we need to use the client library. In this post, I'll illustrate how to build the client library after introducing how it works to integrate the Glue Schema Registry with Kafka producer and consumer apps.

June 4, 202313 min read Data Streaming Kafka Connect for AWS Services Integration Amazon DynamoDB Apache Camel Apache Kafka AWS Docker Kafka Connect

The suite of Apache Camel Kafka connectors and the Kinesis Kafka connector from the AWS Labs can be effective for building data ingestion pipelines that integrate AWS services. In this post, I will illustrate how to develop the Camel DynamoDB sink connector using Docker. Fake order data will be generated using the MSK Data Generator source connector, and the sink connector will be configured to consume the topic messages to ingest them into a DynamoDB table.

June 1, 20237 min read Data Streaming Kafka Development With Docker Apache Kafka Docker Python

Kafka includes the Producer/Consumer APIs that allow client applications to send/read streams of data to/from topics in a Kafka cluster. While the main Kafka project maintains only the Java clients, there are several open source projects that provide the Kafka client APIs in Python. In this post, I'll demonstrate how to develop producer/consumer applications using the kafka-python package.

May 25, 20239 min read Data Integration Data Streaming Kafka Development With Docker Apache Kafka Docker Kafka Connect

Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. In this post, I will illustrate how to set up a data ingestion pipeline using Kafka connectors. Fake customer and order data will be ingested into the corresponding topics using the MSK Data Generator source connector. The topic messages will then be saved into a S3 bucket using the Confluent S3 sink connector.

May 18, 20238 min read Data Streaming Kafka Development With Docker Apache Kafka Docker Kafka-Ui Kpow

A Kafka management app can be a good companion for development, which helps monitor and manage resources on an easy-to-use user interface. An app can be more useful if it supports features that are desirable for Kafka development on AWS. Those features cover IAM access control and integration with MSK Connect and Glue Schema Registry. In this post, I'll introduce several management apps that meet those requirements.

May 4, 20239 min read Data Streaming Kafka Development With Docker Apache Kafka Data Streaming Docker

Apache Kafka is one of the key technologies for modern data streaming architectures on AWS. Developing and testing Kafka-related applications can be easier using Docker and Docker Compose. In this series of posts, I will demonstrate reference implementations of those applications in Dockerized environments.

May 3, 20234 min read Data Integration Data Streaming Kafka Connect for AWS Services Integration Amazon MSK Apache Kafka AWS Kafka Connect

Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. It can be used to build real-time data pipeline on AWS effectively. In this post, I will introduce available Kafka connectors mainly for AWS services integration. Also, developing and deploying some of them will be covered in later posts.

April 12, 202326 min read Data Streaming Apache Kafka AWS Glue Schema Registry AWS Lambda Kpow Python

Glue Schema Registry provides a centralized repository for managing and validating schemas for topic message data. Its features can be utilized by many AWS services when building data streaming applications. In this post, we will discuss how to integrate Python Kafka producer and consumer apps in AWS Lambda with the Glue Schema Registry.

Kafka Development With Docker - Part 7 Producer and Consumer With Glue Schema Registry

Kafka Development With Docker - Part 6 Kafka Connect With Glue Schema Registry

Kafka Development With Docker - Part 5 Glue Schema Registry

Kafka Connect for AWS Services Integration - Part 2 Develop Camel DynamoDB Sink Connector

Kafka Development With Docker - Part 4 Producer and Consumer

Kafka Development With Docker - Part 3 Kafka Connect

Kafka Development With Docker - Part 2 Management App

Kafka Development With Docker - Part 1 Cluster Setup

Kafka Connect for AWS Services Integration - Part 1 Introduction

Integrate Glue Schema Registry With Your Python Kafka App