AWS Glue

Data Build Tool (Dbt) for Effective Data Transformation on AWS – Part 5 Athena

December 6, 2022 | 15 min read | Data Engineering | DBT for Effective Data Transformation on AWS | Amazon Athena, Amazon QuickSight, AWS, AWS Glue, Dbt

The data build tool (dbt) is an effective data transformation tool that supports key AWS analytics services: Redshift, Glue, EMR and Athena. In the last part of the dbt on AWS series, we discuss data transformation pipelines using dbt on Amazon Athena. Subsets of IMDb data are used as the source, and data models are developed in multiple layers following dbt best practices.

October 9, 2022 | 18 min read | Data Engineering | DBT for Effective Data Transformation on AWS | Amazon QuickSight, Apache Spark, AWS, AWS Glue, Dbt

The data build tool (dbt) is an effective data transformation tool that supports key AWS analytics services: Redshift, Glue, EMR and Athena. In part 2 of the dbt on AWS series, we discuss data transformation pipelines using dbt on AWS Glue. Subsets of IMDb data are used as the source, and data models are developed in multiple layers following dbt best practices.

November 14, 2021 | 8 min read | Data Engineering | AWS, AWS Glue, Docker, PySpark, Python

AWS Glue 3.0 was recently released, but a Docker image for this version has not been published. In this post, I’ll illustrate how to create a development environment for AWS Glue 3.0 (and later versions) by building a custom Docker image.

August 20, 2021 | 9 min read | Data Engineering | Apache Spark, AWS, AWS Glue, Docker, PySpark, Python

In this post, I'll demonstrate how to build development environments for AWS Glue 1.0 and 2.0 using the Docker image and the Visual Studio Code Remote - Containers extension.

Data Build Tool (Dbt) for Effective Data Transformation on AWS – Part 5 Athena

Data Build Tool (Dbt) for Effective Data Transformation on AWS – Part 2 Glue

Local Development of AWS Glue 3.0 and Later

AWS Glue Local Development With Docker and Visual Studio Code