Parquet on Jaehyeon Kim

https://jaehyeon.me/tags/parquet/
Copyright © 2023-2026 Jaehyeon Kim. All Rights Reserved.

One Simulation, Two Pipelines: Batch Training and Live Inference with Dynamic DES v0.8.1
Thu, 21 May 2026 00:00:00 +0000
https://jaehyeon.me/blog/2026-05-25-dynamic-des-parquet-support/

Training a machine learning model on simulated data is straightforward until you try to deploy it. The disconnect usually happens at the pipeline level: training requires massive, historical batch data (like Parquet files in an S3 bucket), but production inference requires real-time, event-driven streams (like Kafka or Redis). Maintaining two separate simulation codebases, one for generating training data and another for streaming live events, introduces friction, schema mismatches, and duplicated engineering effort.
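
To make the "one simulation, two pipelines" idea concrete, here is a minimal sketch in plain Python. It does not use the Dynamic DES API: the `Event` schema, `simulate`, `to_batch`, and `to_stream` are hypothetical names made up for illustration, and the two sinks are in-memory stand-ins for a Parquet writer and a Kafka or Redis publisher.

```python
import random
from dataclasses import dataclass, asdict
from typing import Callable, Iterator


@dataclass(frozen=True)
class Event:
    # Hypothetical schema, defined once and shared by both pipelines.
    event_id: int
    timestamp: float
    value: float


def simulate(n_events: int, seed: int = 0) -> Iterator[Event]:
    """Single simulation source: yields events one at a time."""
    rng = random.Random(seed)
    clock = 0.0
    for event_id in range(n_events):
        clock += rng.expovariate(1.0)  # exponential inter-arrival times
        yield Event(event_id, clock, rng.gauss(0.0, 1.0))


def to_batch(events: Iterator[Event]) -> list[dict]:
    """Batch sink: collect all rows (stand-in for writing a Parquet file)."""
    return [asdict(e) for e in events]


def to_stream(events: Iterator[Event], publish: Callable[[dict], None]) -> int:
    """Streaming sink: hand each row to a publisher as it is produced
    (stand-in for a Kafka or Redis client). Returns the number sent."""
    count = 0
    for e in events:
        publish(asdict(e))
        count += 1
    return count


# The same simulation code feeds both sinks.
batch_rows = to_batch(simulate(5, seed=42))

streamed_rows: list[dict] = []
sent = to_stream(simulate(5, seed=42), streamed_rows.append)
```

Because both sinks consume the same generator and the same dataclass, the rows seen in training and in live inference share one schema by construction, which is the kind of mismatch the paragraph above warns about.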