Amazon MSK

Use External Schema Registry With MSK Connect – Part 2 MSK Deployment

April 3, 2022 | 7 min read | Data Integration, Data Streaming | Integrate Schema Registry With MSK Connect | Amazon ECS, Amazon MSK, Apache Kafka, Apicurio Registry, AWS, Change Data Capture (CDC), Debezium, Docker, Kafka Connect

We'll continue the discussion of a Change Data Capture (CDC) solution with a schema registry, this time covering its deployment to AWS. All major resources are deployed in private subnets, and a VPN is used to access them, which improves the developer experience. Apicurio Registry serves as the schema registry and is deployed as an ECS service. So that the connectors can access the registry, the Confluent Avro Converter is packaged together with the connector sources. The post ends by illustrating how the schema registry manages schema evolution.
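As a rough illustration of the converter setup mentioned above, the sketch below builds the Kafka Connect properties that point the Confluent Avro Converter at a schema registry. The helper name `avro_converter_settings` and the registry host are hypothetical, and the `/apis/ccompat/v6` path assumes an Apicurio Registry 2.x deployment exposing its Confluent-compatible API; the post itself may configure this differently.

```python
import json


def avro_converter_settings(registry_url: str) -> dict:
    """Return Kafka Connect converter properties for the Confluent Avro Converter.

    The same registry URL is used for both record keys and record values.
    """
    converter = "io.confluent.connect.avro.AvroConverter"
    return {
        "key.converter": converter,
        "key.converter.schema.registry.url": registry_url,
        "value.converter": converter,
        "value.converter.schema.registry.url": registry_url,
    }


# Hypothetical registry address (assumption, not from the post). Apicurio
# Registry 2.x is assumed to serve a Confluent-compatible API at this path.
REGISTRY_URL = "http://registry.example.internal:8080/apis/ccompat/v6"

settings = avro_converter_settings(REGISTRY_URL)
print(json.dumps(settings, indent=2))
```

These properties would be merged into the rest of a connector's configuration; they only take effect if the converter JARs are available on the connector's plugin path, which is why the converter is bundled with the connector sources.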

December 19, 2021 | 11 min read | Data Engineering, Data Integration, Data Streaming | Data Lake Demo Using Change Data Capture | Amazon EMR, Amazon MSK, Apache Hudi, Apache Kafka, AWS, Change Data Capture (CDC), Debezium, Kafka Connect

Change data capture (CDC) on Amazon MSK and ingesting data using Apache Hudi on Amazon EMR can be used to build an efficient data lake solution. In this post, we'll build a Hudi DeltaStreamer app on Amazon EMR and use the resulting Hudi table with Athena and QuickSight to build a dashboard.

December 12, 2021 | 17 min read | Data Engineering, Data Integration, Data Streaming | Data Lake Demo Using Change Data Capture | Amazon EMR, Amazon MSK, Apache Hudi, Apache Kafka, AWS, Change Data Capture (CDC), Debezium, Kafka Connect

Change data capture (CDC) on Amazon MSK and ingesting data using Apache Hudi on Amazon EMR can be used to build an efficient data lake solution. In this post, we'll build CDC with Amazon MSK and MSK Connect.

December 5, 2021 | 18 min read | Data Engineering, Data Integration, Data Streaming | Data Lake Demo Using Change Data Capture | Amazon EMR, Amazon MSK, Apache Hudi, Apache Kafka, AWS, Change Data Capture (CDC), Debezium, Kafka Connect

Change data capture (CDC) on Amazon MSK and ingesting data using Apache Hudi on Amazon EMR can be used to build an efficient data lake solution. As a starting point, we’ll discuss the source database and CDC streaming infrastructure in the local environment.

Data Lake Demo Using Change Data Capture (CDC) on AWS – Part 3 Implement Data Lake

Data Lake Demo Using Change Data Capture (CDC) on AWS – Part 2 Implement CDC

Data Lake Demo Using Change Data Capture (CDC) on AWS – Part 1 Local Development