apache / fluss

Apache Fluss is a streaming storage built for real-time analytics.

๐Ÿ“ .github
๐Ÿ“ .idea
๐Ÿ“ .mvn
๐Ÿ“ docker
๐Ÿ“ fluss-client
๐Ÿ“ fluss-common
๐Ÿ“ fluss-dist
๐Ÿ“ fluss-filesystems
๐Ÿ“ fluss-flink
๐Ÿ“ fluss-jmh
๐Ÿ“ fluss-kafka
๐Ÿ“ fluss-lake
๐Ÿ“ fluss-metrics
๐Ÿ“ fluss-protogen
๐Ÿ“ fluss-rpc
๐Ÿ“ fluss-server
๐Ÿ“ fluss-spark
๐Ÿ“ fluss-test-utils
๐Ÿ“ helm
๐Ÿ“ tools
๐Ÿ“ website
๐Ÿ“„ .asf.yaml
๐Ÿ“„ .gitignore
๐Ÿ“„ .scalafmt.conf
๐Ÿ“„ copyright.txt
๐Ÿ“„ DISCLAIMER
๐Ÿ“„ LICENSE
๐Ÿ“„ mvnw
๐Ÿ“„ mvnw.cmd
๐Ÿ“„ NOTICE
๐Ÿ“„ pom.xml
๐Ÿ“„ README.md

Documentation | QuickStart | Development

What is Apache Fluss (Incubating)?

Apache Fluss (Incubating) is a streaming storage system built for real-time analytics. It can serve as the real-time data layer for Lakehouse architectures.

It bridges the gap between data streaming and the data Lakehouse by enabling low-latency, high-throughput data ingestion and processing, and it integrates seamlessly with popular compute engines such as Apache Flink. Support for Apache Spark and StarRocks is coming soon.

Fluss (German: river, pronounced /flus/) lets streaming data continuously converge, distribute, and flow into lakes, like a river 🌊

Features

  • Sub-Second Latency: Low-latency streaming reads/writes optimized for real-time applications with Apache Flink.
  • Columnar Stream: 10x improvement in streaming read performance with efficient pushdown projections.
  • Streaming & Lakehouse Unification: Unified data streaming and Lakehouse with low latencies for powerful analytics.
  • Real-Time Updates: Cost-efficient partial updates for large-scale data without expensive join operations.
  • Changelog Generation: Complete changelogs for streaming processors, streamlining analytics workflows.
  • Lookup Queries: Ultra-high QPS for primary key lookups, enabling efficient dimension table serving.

Building

Prerequisites for building Apache Fluss:

  • Unix-like environment (we use Linux, Mac OS X, Cygwin, WSL)
  • Git
  • Maven (we require version >= 3.8.6)
  • Java 11

git clone https://github.com/apache/fluss.git
cd fluss
./mvnw clean package -DskipTests

Apache Fluss is now installed in build-target. The build command uses the Maven Wrapper (mvnw), which ensures that the correct Maven version is used.
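
Before running the build, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the Fluss repository: the function names `version_ge`, `java_major`, and `check_prerequisites` are hypothetical helpers, and the strict "Java 11" check simply mirrors the list above (whether other Java versions also work is not stated here). The Maven check is informational only, since the wrapper takes care of the Maven version.

```shell
# Hypothetical helpers (not shipped with Fluss) for checking build prerequisites.

# version_ge A B: succeeds if dotted version A >= B (numeric fields only).
version_ge() {
  _a=$1 _b=$2
  while [ -n "$_a" ] || [ -n "$_b" ]; do
    _x=${_a%%.*}; _y=${_b%%.*}              # leading field of each version
    [ "${_x:-0}" -gt "${_y:-0}" ] && return 0
    [ "${_x:-0}" -lt "${_y:-0}" ] && return 1
    case $_a in *.*) _a=${_a#*.} ;; *) _a= ;; esac   # drop the compared field
    case $_b in *.*) _b=${_b#*.} ;; *) _b= ;; esac
  done
  return 0                                   # all fields equal
}

# java_major VERSION: prints the major version ("1.8.0_381" -> 8, "11.0.20" -> 11).
java_major() {
  _v=$1
  case $_v in 1.*) _v=${_v#1.} ;; esac       # legacy "1.x" numbering
  echo "${_v%%[._-]*}"
}

# check_prerequisites: returns non-zero if Git or Java 11 is missing.
check_prerequisites() {
  _status=0
  command -v git >/dev/null 2>&1 || { echo "git not found" >&2; _status=1; }

  if command -v java >/dev/null 2>&1; then
    _jv=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    if [ "$(java_major "$_jv")" != 11 ]; then
      echo "found Java $_jv, but the prerequisites list Java 11" >&2
      _status=1
    fi
  else
    echo "java not found" >&2
    _status=1
  fi

  # Informational only: ./mvnw ensures the correct Maven version for the build.
  if command -v mvn >/dev/null 2>&1; then
    _mv=$(mvn -v 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')
    version_ge "$_mv" 3.8.6 || echo "note: system Maven $_mv is older than 3.8.6" >&2
  fi
  return "$_status"
}
```

With these functions loaded into the current shell, the build can be gated on the check: `check_prerequisites && ./mvnw clean package -DskipTests`.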

Contributing

Apache Fluss (Incubating) is open source, and we'd love your help to keep it growing! Join the discussions, open issues to report bugs or request features, contribute code and documentation, or help us improve the project in any other way. All contributions are welcome!

License

The Apache Fluss (Incubating) project is licensed under the Apache License 2.0.