This Week in Databend #82
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .
Special Note: This Week in Databend will be gradually migrated to the Databend Blog. We will keep the content in sync until the final migration is complete.
Check out what we've done this week to make Databend even better for you.
Features & Improvements ✨
- select from stage support uri with connection options (#9926)
- Iceberg/create-catalog (#9017)
- enrich core pipelines processors (#10098)
- create stage, select stage, copy, infer_schema support named file format (#10084)
- query result cache (#10042)
- table data cache (#9772)
- native storage format support nested data types (#9798)
Code Refactoring 🎉
- support exchange sorting (#10149)
- add check processor graph completed (#10166)
- apply constant folder at physical plan builder (#9889)
- use accumulating to impl single state aggregator (#10125)
Build/Testing/CI Infra Changes 🔌
Bug Fixes 🔧
- no longer return Variant as common super type (#9961)
- allow auto cast from string and variant (#10111)
- fix limit query hang in cluster mode (#10006)
- wrong column statistics when contain tuple type (#10068)
- compact not work as expected with add column (#10070)
- fix add column min/max stat bug (#10137)
What's On In Databend
Stay connected with the latest news about Databend.
Query Result Cache
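As of this week, Databend supports caching query results! The pipeline below duplicates each upstream output so that one copy continues downstream while the other is written to the result cache.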
             ┌─────────┐ 1  ┌─────────┐ 1
             │         ├───►│         ├───►Dummy───►Downstream
Upstream────►│Duplicate│ 2  │         │ 3
             │         ├───►│         ├───►Dummy───►Downstream
             └─────────┘    │         │
                            │ Shuffle │
             ┌─────────┐ 3  │         │ 2  ┌─────────┐
             │         ├───►│         ├───►│  Write  │
Upstream────►│Duplicate│ 4  │         │ 4  │ Result  │
             │         ├───►│         ├───►│  Cache  │
             └─────────┘    └─────────┘    └─────────┘
- PR | feat(query): query result cache
- Docs | RFC: Query Result Cache
- Tracking Issue | RFC: query result cache
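The idea behind the pipeline above can be sketched in a few lines of Python. This is only an illustration, not Databend's implementation: `ResultCache`, `run_query`, the hash-of-query-text key, and the TTL are all hypothetical names and assumptions made for the example. It shows the "duplicate" step, where every block is both passed downstream and collected for the cache writer.

```python
import hashlib
import time
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

Row = tuple
Block = List[Row]  # hypothetical stand-in for a data block: a list of rows


class ResultCache:
    """Toy in-process result cache (assumption: keyed by query text, with a TTL)."""

    def __init__(self, ttl_secs: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_secs = ttl_secs
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[Block]]] = {}

    @staticmethod
    def key_for(sql: str) -> str:
        # Assumption: the cache key is a hash of the trimmed query text.
        return hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[List[Block]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, blocks = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the expired entry
            return None
        return blocks

    def put(self, key: str, blocks: List[Block]) -> None:
        self._entries[key] = (self._clock() + self._ttl_secs, blocks)


def run_query(sql: str,
              execute: Callable[[str], Iterable[Block]],
              cache: ResultCache) -> Iterator[Block]:
    """Serve from the cache when possible; otherwise execute and fill the cache."""
    key = cache.key_for(sql)
    cached = cache.get(key)
    if cached is not None:
        yield from cached
        return
    collected: List[Block] = []
    for block in execute(sql):
        collected.append(block)  # copy kept for the cache writer
        yield block              # copy that flows on downstream
    # Reached only if the consumer read every block, so partial results are never cached.
    cache.put(key, collected)
```

Because the cache entry is written only after the upstream is exhausted, a query that is cancelled halfway leaves no entry behind in this sketch.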
Table Data Cache
Databend now supports table data cache:
- disk cache: raw (compressed) column data of a data block.
- in-memory cache (experimental): deserialized column objects of a data block.
For cache-friendly workloads, the performance gains are significant.
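To make the two tiers concrete, here is a minimal Python sketch of how such a lookup could be layered. It is an assumption-laden illustration rather than Databend's code: `TieredColumnCache`, `encode_column`, and the `fetch_remote` callback are hypothetical, a dictionary stands in for files on disk, and `zlib` plus JSON stand in for the real compression and column encoding.

```python
import json
import zlib
from typing import Callable, Dict, List


def encode_column(values: List[int]) -> bytes:
    """Hypothetical on-storage encoding: JSON text compressed with zlib."""
    return zlib.compress(json.dumps(values).encode("utf-8"))


class TieredColumnCache:
    """Memory tier holds deserialized columns; disk tier holds compressed bytes."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote
        self._disk: Dict[str, bytes] = {}        # stand-in for files on local disk
        self._memory: Dict[str, List[int]] = {}  # deserialized column objects

    def read_column(self, key: str) -> List[int]:
        # 1. Fastest path: the column is already deserialized in memory.
        if key in self._memory:
            return self._memory[key]
        # 2. Next: compressed bytes on local disk, avoiding a remote read.
        raw = self._disk.get(key)
        if raw is None:
            # 3. Slowest path: fetch from remote storage and keep a disk copy.
            raw = self._fetch_remote(key)
            self._disk[key] = raw
        column = json.loads(zlib.decompress(raw))
        self._memory[key] = column
        return column

    def drop_memory_tier(self) -> None:
        """Simulate memory pressure; the disk tier is left intact."""
        self._memory.clear()
```

In this sketch a hit in the memory tier skips both decompression and deserialization, while a hit in the disk tier only skips the remote read, which is why repeated reads of the same blocks benefit the most.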
Deb Source & Systemd Support
Databend now offers an official Deb package source and supports managing the service with systemd.
For DEB822 Source Format:
sudo curl -L -o /etc/apt/sources.list.d/datafuselabs.sources https://repo.databend.rs/deb/datafuselabs.sources
sudo apt update
sudo apt install databend
sudo systemctl start databend-meta
sudo systemctl start databend-query
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Service Activation Progress Report
When a Query/Meta node starts, it should run its startup checks and print the result of each one explicitly, so that users can diagnose faults and confirm the node's status. For example:
storage check succeed
meta check failed: timeout, no response. endpoints: xxxxxxxx .
status check failed: address already in use.
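A report in this shape could be produced by running a list of named checks and printing one line per result. The Python sketch below is only an illustration of the proposal: `run_startup_checks` and the three check functions are hypothetical, and the real checks would probe storage, the meta service, and the listening address.

```python
from typing import Callable, List, Tuple

# A check is a name plus a callable that raises an exception on failure.
Check = Tuple[str, Callable[[], None]]


def run_startup_checks(checks: List[Check]) -> List[str]:
    """Run every check, even after a failure, and return one report line each."""
    report: List[str] = []
    for name, check in checks:
        try:
            check()
        except Exception as err:  # report the failure instead of aborting startup
            report.append(f"{name} check failed: {err}")
        else:
            report.append(f"{name} check succeed")
    return report


# Hypothetical checks that mimic the sample report.
def check_storage() -> None:
    pass  # pretend the storage backend is reachable


def check_meta() -> None:
    raise TimeoutError("timeout, no response")


def check_status() -> None:
    raise OSError("address already in use")


if __name__ == "__main__":
    checks = [("storage", check_storage), ("meta", check_meta), ("status", check_status)]
    for line in run_startup_checks(checks):
        print(line)
```

Running all checks instead of stopping at the first failure gives the user the full picture in a single start attempt.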
Issue 10193: Feature: output the necessary progress when starting a query/meta node
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.
You can check the changelog of Databend Nightly for details about our latest developments.
Thanks a lot to the contributors for their excellent work this week.
Connect With Us
We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.
DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.
- Databend Official Website
- GitHub Discussions (Feature requests, bug reports, and contributions)
- Twitter (Stay in the know)
- Slack Channel (Chat with the community)