Data Architecture

Self-Service Data Platform via a Multi-Tenant SQL Gateway

July 17, 2025 | 15 min read | Big Data, Data Architecture, Data Engineering, Data Platform, Data Streaming, Apache Flink, Apache Kyuubi, Apache Ranger, Apache Spark, Data Governance, Data Lakehouse, Data Lineage, Hive Metastore, Marquez, Multi-Tenancy, OpenLineage, Self-Service Analytics, SQL Gateway, Trino

Providing direct access to big data engines like Spark and Flink often creates chaos. A gateway-centric architecture solves this by introducing a robust control plane. This article presents a detailed blueprint using Apache Kyuubi, a multi-tenant SQL gateway, to provision and manage on-demand Spark, Flink, and Trino engines. Learn how this model delivers true self-service analytics with centralized governance, finally resolving the conflict between user empowerment and platform stability.

May 6, 2025 | 6 min read | Big Data, Data Architecture, Data Engineering, Data Streaming, Apache Flink, Apache Iceberg, Apache Paimon, Fluss

The world of data is converging. The traditional divide between batch processing for historical analytics and stream processing for real-time insights is becoming increasingly blurry, and businesses demand architectures that handle both seamlessly. Enter the “Streamhouse”, an evolution of the Lakehouse concept designed with streaming as a first-class citizen.

Today, we’ll introduce three key open-source technologies shaping this space: Apache Paimon™, Fluss, and Apache Iceberg. While each has unique strengths, their true power lies in how they can be integrated to build robust, flexible, and performant data platforms.

November 2, 2023 | 7 min read | Data Architecture, Data Streaming, Apache Flink, Apache Kafka, Streaming Analytics

Stream processing is becoming increasingly popular with companies large and small because it provides superior solutions for established use cases such as data analytics, ETL, and transactional applications, and also enables novel applications, software architectures, and business opportunities. Starting from traditional data infrastructures and application and data development patterns, this post introduces stateful stream processing and shows how far it can improve on those patterns. A consulting company can partner with its clients on their journey to adopting stateful stream processing, which can open up significant opportunities. Those opportunities are summarised at the end.

Self-Service Data Platform via a Multi-Tenant SQL Gateway

Meet the Streamhouse Trio - Paimon, Fluss, and Iceberg for Unified Data Architectures

Benefits and Opportunities of Stateful Stream Processing