Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com.

What's Changed

Below is a list of some major changes that we don't want you to miss.

Exciting New Features ✨

RFC

  • Multiple Catalog (#8254)

meta

  • allow to join a cluster if a meta node has no log (#8384)
  • add key space Expire (#8441)

datablock

  • add metainfo in datablock (#8417)

functions

  • improve performance of count distinct (#8317)

parser

  • relax the order requirement about the COPY options (#8341)
  • support SELECT output nothing (#8390)

planner

  • optimize join plan to make filter push down (#8377)

handlers

  • http handler no longer needs to call final_uri explicitly (#8299)

rbac

  • support SET ROLE and current_role() (#8392)

storage

  • Compact Segment (#8261)
  • Add cache operator in common-storage (#8306)
  • introduce data metrics for table (#8363)
  • add gc status to data metrics and show in processlist (#8389)
  • allow loading data from local fs (#8431)
  • new system table: system.catalogs (#8423)

new expression

  • implement array function slice, remove_first, remove_last (#8326)
  • add tuple() and get() for tuple (#8372)

tests

  • add join tests under large dataset (#8351)

Code Refactor 🎉

meta

  • merge two to-meta-server rpc into one (#8308)
  • try best to leave a cluster (#8298)

planner

  • use right mark join as subquery's default join type (#8427)

query

  • optimize get/upsert copied file info (#8282)
  • re-org query crates (#8336)

storage

  • use commit_mutation in segment compaction (#8350)
  • decouple meta readers from TableContext (#8395)

new expression

  • move expression test to function-v2 (#8397)

Build/Testing/CI Infra Changes 🔌

  • replace cargo udeps with cargo machete (#8343)
  • migrate deprecating set-output commands (#8381)

Thoughtful Bug Fix 🔧

config

  • throw errors while loading config failed (#8462)

planner

  • fix aggregation in cluster mode (#8333)
  • left join panic (#8325)
  • remove unnecessary required columns (#8443)

processor

  • try fix data lost when resize multi outputs (#8319)
  • try fix lost last message if finish at same time (#8333)

query

  • StringSearchLike vector_vector can not match '\n' (#8359)
  • optimize upsert table copied file info (#8409)
  • div zero return err (#8464)

storage

  • support mutation during insertion (#8205)
  • add a threshold for compact block (#8322)

News

Let's take a look at what's new at Datafuse Labs & Databend each week.

RFC: Multiple Catalog

Databend supports multiple catalogs now, but only in a static way.

To allow accessing the hive catalog, users need to configure hive inside databend-query.toml in this way:

[catalog]
meta_store_address = "127.0.0.1:9083"
protocol = "binary"

Users can't add/alter/remove the catalogs during runtime.

By allowing users to maintain multiple catalogs in Databend, we can integrate more catalogs, such as Iceberg, more quickly.

Learn More

Jepsen Test for Databend Meta Service

Jepsen is an open source software library for system testing. It is an effort to improve the safety of distributed databases, queues, consensus systems, etc.

Recently, @lichuang has been working on the design and implementation of a Jepsen test solution for the Databend Meta Service.

If you are interested in this test, please check the corresponding GitHub Repo, which contains the steps, scripts and clients for the test.

Learn More

Issues

Meet issues you may be interested in and try to solve them.

New Key-Value services support for OpenDAL

OpenDAL stands for Open Data Access Layer. Its goal is to access data freely, painlessly, and efficiently.

In the past, OpenDAL completed support for different storage backends such as the local file system, AWS S3, and Azure Blob. To support storing volatile data for caching and temporary storage, OpenDAL has designed and implemented Key-Value service support (with kv::Adapter).

The following backends are currently available for the Key-Value service:

  • memory: Service based on BTreeMap.
  • moka: Service based on the high-performance caching library moka.
  • redis: Service based on redis.
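
To give a rough feel for what a Key-Value backend involves, here is a minimal sketch of an in-memory service built on the standard library's BTreeMap. The KvAdapter trait, its method names, and MemoryAdapter are hypothetical and chosen for illustration only; they are not OpenDAL's actual kv::Adapter interface, which you should look up in the OpenDAL documentation.

```rust
use std::collections::BTreeMap;

/// Hypothetical key-value interface, for illustration only.
/// This is NOT OpenDAL's real `kv::Adapter` trait.
trait KvAdapter {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn set(&mut self, key: &str, value: &[u8]);
    fn delete(&mut self, key: &str);
    /// Return all keys starting with `prefix`, in sorted order.
    fn scan(&self, prefix: &str) -> Vec<String>;
}

/// Hypothetical in-memory backend that keeps every entry in a BTreeMap.
#[derive(Default)]
struct MemoryAdapter {
    inner: BTreeMap<String, Vec<u8>>,
}

impl KvAdapter for MemoryAdapter {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.inner.get(key).cloned()
    }

    fn set(&mut self, key: &str, value: &[u8]) {
        self.inner.insert(key.to_string(), value.to_vec());
    }

    fn delete(&mut self, key: &str) {
        self.inner.remove(key);
    }

    fn scan(&self, prefix: &str) -> Vec<String> {
        // Keys are sorted, so start at the prefix and stop at the
        // first key that no longer matches it.
        self.inner
            .range(prefix.to_string()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, _)| k.clone())
            .collect()
    }
}

fn main() {
    let mut kv = MemoryAdapter::default();
    kv.set("cache/a", b"1");
    kv.set("cache/b", b"2");
    kv.set("tmp/x", b"3");

    kv.delete("cache/a");
    println!("{:?}", kv.scan("cache/"));
}
```

A sorted map makes prefix scans cheap, which is one plausible reason to build a memory backend on BTreeMap rather than a hash map.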

The community also plans to add support for the following Key-Value services:

If you find these interesting, try to solve them or participate in discussions and PR reviews. Or you can click on https://link.databend.rs/i-m-feeling-lucky to pick up a good first issue. Good luck!

Changelogs

You can check the changelogs of Databend nightly to learn about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

andylokandy, b41sh, BohuTANG, ClSlaid, dantengsky, drmingdrmer,
everpcpc, flaneur2020, guzzit, leiysky, lichuang, mergify[bot],
miles170, RinChanNOWWW, soyeric128, sundy-li, TCeason, TszKitLo40,
wubx, Xuanwo, xudong963, youngsofun, zhang2014, zhyass

Meet Us

Please join the DatafuseLabs Community if you are interested in Databend.

We are looking forward to seeing you try our code. We have a strong team behind you to ensure a smooth experience in trying our code for your projects. If you are a hacker passionate about database internals, feel free to play with our code.

You can submit issues for any problems you find. We also highly appreciate any of your pull requests.