We'll discuss how to create a Spark local dev environment for EMR using Docker and/or VSCode. A range of Spark development examples are demonstrated and Glue Catalog integration is illustrated as well.
We'll continue the discussion of a Change Data Capture (CDC) solution with a schema registry and its deployment to AWS. All major resources are deployed in private subnets and VPN is used to access them in order to improve developer experience. The Apicurio registry is used as the schema registry service and it is deployed as an ECS service. In order for the connectors to have access to the registry, the Confluent Avro Converter is packaged together with the connector sources. The post ends with illustrating how schema evolution is managed by the schema registry.
We'll discuss a Change Data Capture (CDC) architecture with a schema registry. As a starting point, a local development environment is set up using Docker Compose. The Debezium and Confluent S3 connectors are deployed with the Confluent Avro converter and the Apicurio registry is used as the schema registry service. A quick example is shown to illustrate how schema evolution can be managed by the schema registry.
Change data capture (CDC) on Amazon MSK and ingesting data using Apache Hudi on Amazon EMR can be used to build an efficient data lake solution. As a starting point, we’ll discuss the source database and CDC streaming infrastructure in the local environment.
In this post, it is demonstrated how AWS Lambda can be integrated with Apache Airflow using a custom operator inspired by the ECS Operator.
Traefik is a modern HTTP reverse proxy and load balancer. In this post, it'll be demonstrated how path-based routing can be set up by Traefik with Docker. Also a centralized authentication will be illustrated with the Forward Authentication feature of Traefik.
In this post, I'll illustrate how a web service is created using FastAPI framework where tasks are sent to multiple workers. The workers are built with Celery and Rserve. Redis is used as a message broker/result backend for Celery and a key-value store for Rserve. Demos can be run in both Docker Compose and Kubernetes.
In this post, I'll demonstrate how to create a Linux development environment on Windows using WSL. Also an example app (Rserve web service with a sidecar container) on Minikube will be demonstrated.
LocalStack provides an easy-to-use test/mocking framework for developing AWS applications. In this post, I'll demonstrate how to utilize LocalStack for development using a web service.