Jaehyeon Kim

Serverless Application Model (SAM) for Data Professionals

July 18, 2022 | 7 min read | Development | AWS, AWS Lambda, AWS SAM, Python, S3, Serverless Application Model (SAM)

We'll discuss how to build a serverless data processing application using the Serverless Application Model (SAM). A Lambda function is developed that is triggered whenever an object is created in an S3 bucket. Third-party packages are necessary for data processing, and they are made available through Lambda layers.
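
As a rough illustration of the trigger described above, here is a minimal sketch of a Python Lambda handler that reads the bucket name and object key from an S3 object-created event. It is not the handler from the post: the function body, the sample event, and the bucket and key names are hypothetical, and the event is trimmed to the fields the handler reads.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect the bucket and key of every object named in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications (a space becomes '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append({"bucket": bucket, "key": key})
    return {"processed": objects}


# Hypothetical sample event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "input/sales+2022.csv"},
            }
        }
    ]
}

result = lambda_handler(sample_event, None)
```

In a real application, the loop body is where the data processing would happen, with any third-party packages imported from a Lambda layer.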

Data Warehousing ETL Demo With Apache Iceberg on EMR Local Environment

June 26, 2022 | 12 min read | Data Engineering | Amazon EMR, Apache Iceberg, Apache Spark, AWS, PySpark, Python

We'll discuss how to implement data warehousing ETL using Iceberg for data storage/management and Spark for data processing. A PySpark ETL app will be used for demonstration in an EMR local environment. Finally, the ETL results will be queried by Athena for verification.

Local Development of AWS Glue 3.0 and Later

November 14, 2021 | 8 min read | Data Engineering | AWS, AWS Glue, Docker, PySpark, Python

AWS Glue 3.0 was recently released, but a Docker image for this version has not been published. In this post, I'll illustrate how to create a development environment for AWS Glue 3.0 (and later versions) by building a custom Docker image.

AWS Glue Local Development With Docker and Visual Studio Code

August 20, 2021 | 9 min read | Data Engineering | Apache Spark, AWS, AWS Glue, Docker, PySpark, Python

In this post, I'll demonstrate how to build development environments for AWS Glue 1.0 and 2.0 using the Docker image and the Visual Studio Code Remote - Containers extension.

Thoughts on Apache Airflow AWS Lambda Operator

April 13, 2020 | 9 min read | Data Engineering | Apache Airflow, AWS, AWS Lambda, Docker, Python

In this post, I'll demonstrate how AWS Lambda can be integrated with Apache Airflow using a custom operator inspired by the ECS Operator.

Dynamic Routing and Centralized Auth With Traefik, Python and R Example

November 29, 2019 | 9 min read | Development | Docker, FastAPI, Python, R, Traefik

Traefik is a modern HTTP reverse proxy and load balancer. In this post, I'll demonstrate how to set up path-based routing with Traefik and Docker. I'll also illustrate centralized authentication using Traefik's Forward Authentication feature.

Distributed Task Queue With Python and R Example

November 15, 2019 | 8 min read | Development | FastAPI, Python, R, Rserve

In this post, I'll illustrate how a web service is created using the FastAPI framework, where tasks are sent to multiple workers. The workers are built with Celery and Rserve. Redis is used as a message broker/result backend for Celery and a key-value store for Rserve. Demos can be run in both Docker Compose and Kubernetes.
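
The pattern of handing tasks to multiple workers can be sketched with Python's standard library alone. This is only an in-process stand-in, assuming threads and `queue.Queue` in place of the Celery, Rserve and Redis setup used in the post; the `worker` and `run_tasks` names and the squaring task are hypothetical.

```python
import queue
import threading


def worker(tasks, results):
    """Pull (task_id, number) items until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        task_id, number = item
        results[task_id] = number * number  # stand-in for real processing


def run_tasks(numbers, n_workers=3):
    """Send each number to a pool of workers and return results in input order."""
    tasks = queue.Queue()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for task_id, number in enumerate(numbers):
        tasks.put((task_id, number))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return [results[i] for i in range(len(numbers))]


squares = run_tasks([1, 2, 3, 4])
```

A broker such as Redis plays the role of the queue here, which is what lets the workers run in separate processes or containers rather than threads.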

Linux Dev Environment on Windows

November 1, 2019 | 12 min read | Development | Docker, Kubernetes, Minikube, Python, R, WSL

In this post, I'll demonstrate how to create a Linux development environment on Windows using WSL. I'll also demonstrate an example app (an Rserve web service with a sidecar container) running on Minikube.

AWS Local Development With LocalStack

July 20, 2019 | 6 min read | Development | AWS, AWS Lambda, Docker, Flask, LocalStack, Python

LocalStack provides an easy-to-use test/mocking framework for developing AWS applications. In this post, I'll demonstrate how to utilize LocalStack for development using a web service.

Serverless Data Product POC Backend Part IV - Serving R ML Model via S3

April 17, 2017 | 9 min read | Development | Series: Serverless Data Product | Amazon API Gateway, AWS, AWS Lambda, Python, R

The previous posts discussed how to package and deploy an R machine learning model with AWS Lambda and how to expose the Lambda function via Amazon API Gateway. In this post, I'll demonstrate how to host a web application that services the backend API in a serverless environment.

Jaehyeon Kim

Developer Experience at Factor House | Technical Content Creator

Copyright © 2023-2025 Jaehyeon Kim. All Rights Reserved.