Big Data

Self-Service Data Platform via a Multi-Tenant SQL Gateway

July 17, 2025 | 15 min read | Big Data, Data Architecture, Data Engineering, Data Platform, Data Streaming, Apache Flink, Apache Kyuubi, Apache Ranger, Apache Spark, Data Governance, Data Lakehouse, Data Lineage, Hive Metastore, Marquez, Multi-Tenancy, OpenLineage, Self-Service Analytics, SQL Gateway, Trino

Providing direct access to big data engines like Spark and Flink often creates chaos. A gateway-centric architecture solves this by introducing a robust control plane. This article presents a detailed blueprint using Apache Kyuubi, a multi-tenant SQL gateway, to provision and manage on-demand Spark, Flink, and Trino engines. Learn how this model delivers true self-service analytics with centralized governance, finally resolving the conflict between user empowerment and platform stability.

May 6, 2025 | 6 min read | Big Data, Data Architecture, Data Engineering, Data Streaming, Apache Flink, Apache Iceberg, Apache Paimon, Fluss

The world of data is converging. The traditional divide between batch processing for historical analytics and stream processing for real-time insights is blurring, and businesses demand architectures that handle both seamlessly. Enter the “Streamhouse”, an evolution of the Lakehouse concept designed with streaming as a first-class citizen.

Today, we’ll introduce three key open-source technologies shaping this space: Apache Paimon™, Fluss, and Apache Iceberg. While each has unique strengths, their true power lies in how they can be integrated to build robust, flexible, and performant data platforms.

Meet the Streamhouse Trio - Paimon, Fluss, and Iceberg for Unified Data Architectures