This Week in Databend #74
Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com.
Special Note: This Week in Databend will be gradually migrated to the Databend Blog. We will keep the content in sync until the final migration is complete.
Check out what we've done this week to make Databend even better for you.
Features & Improvements ✨
- remove stream when a watch client is dropped (#9334)
- support selectivity estimation for range predicates (#9398)
- support copy on error (#9312)
- support databend-local (#9282)
- external storage support location part prefix (#9381)
- rangefilter support in (#9330)
- try to improve object storage io read (#9335)
- support table compression (#9370)
- add more metrics for fuse compact and block write (#9399)
- add no-fail-fast support (#9391)
Code Refactoring 🎉
- adopt rustls entirely, removing all deps to native-tls (#9358)
- split fuse source to read data and deserialize (#9353)
- avoid io copy in read parquet data (#9365)
- add uncompressed buffer for parquet reader (#9379)
- add read/write settings (#9359)
Bug Fixes 🔧
- fix align_flush with header only (#9327)
- use logical CPU number as default value of num_cpus (#9396)
- the data type on both sides of the union does not match (#9361)
- false alarm (warning log) about query not exists (#9380)
- refactor sqllogictest http client and fix expression string like (#9363)
What's On In Databend
Stay connected with the latest news about Databend.
Inspired by clickhouse-local, databend-local lets you process local files quickly without launching a Databend cluster.
> export CONFIG_FILE=tests/local/config/databend-local.toml
> cargo run --bin=databend-local -- --sql="SELECT * FROM tbl1" --table=tbl1=/path/to/databend/docs/public/data/books.parquet
exec local query: SELECT * FROM tbl1
+------------------------------+---------------------+------+
| title                        | author              | date |
+------------------------------+---------------------+------+
| Transaction Processing       | Jim Gray            | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
| Transaction Processing       | Jim Gray            | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
+------------------------------+---------------------+------+
4 rows in set. Query took 0.015 seconds.
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Compressing Short Strings
When running the same queries over short strings, Databend usually reads more data than other databases such as Snowflake. For example:
SELECT SearchPhrase, MIN(URL), COUNT(*) AS c
FROM hits
WHERE URL LIKE '%google%' AND SearchPhrase <> ''
GROUP BY SearchPhrase
ORDER BY c DESC
LIMIT 10;
Such queries might be more efficient if short strings (URLs, etc.) were compressed.
Issue 9001: performance: compressing for short strings
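To give a feel for why short strings are hard to compress one value at a time, here is a minimal Python sketch. It is not Databend's implementation and not necessarily the approach proposed in the issue. It uses the standard library's zlib with a preset dictionary as a stand-in for any shared-dictionary scheme, and the sample URLs, the `SHARED_DICT` contents, and the helper names are all hypothetical.

```python
import zlib

# Hypothetical sample of short URL values; not taken from the hits dataset.
urls = [
    b"https://www.google.com/search?q=databend",
    b"https://www.google.com/search?q=parquet",
    b"https://www.google.com/maps/place/berlin",
    b"https://www.example.com/products/12345",
    b"https://www.example.com/products/67890",
]

# Assumed shared dictionary: substrings expected to recur across many values.
SHARED_DICT = (
    b"/products//maps/place//search?q="
    b"https://www.example.com/https://www.google.com/"
)


def compress_alone(value: bytes) -> bytes:
    """Compress one value with no shared context."""
    return zlib.compress(value, 9)


def compress_with_dict(value: bytes) -> bytes:
    """Compress one value against the shared preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=SHARED_DICT)
    return compressor.compress(value) + compressor.flush()


def decompress_with_dict(blob: bytes) -> bytes:
    """Reverse compress_with_dict; the same dictionary must be supplied."""
    decompressor = zlib.decompressobj(zdict=SHARED_DICT)
    return decompressor.decompress(blob) + decompressor.flush()


raw_size = sum(len(u) for u in urls)
alone_size = sum(len(compress_alone(u)) for u in urls)
dict_size = sum(len(compress_with_dict(u)) for u in urls)

print(f"raw bytes:              {raw_size}")
print(f"compressed one by one:  {alone_size}")
print(f"compressed with dict:   {dict_size}")
```

Each value is still compressed and decompressed independently, so single values stay individually readable. The savings come from the context the values share, which is the general idea behind dictionary-based schemes for short strings.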
Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.
You can check the changelog of Databend Nightly for details about our latest developments.
Thanks a lot to the contributors for their excellent work this week.
Connect With Us
We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.
DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.
- Databend Official Website
- GitHub Discussions (Feature requests, bug reports, and contributions)
- Twitter (Stay in the know)
- Slack Channel (Chat with the community)