What's Changed

Below is a list of some major changes that we don't want you to miss.

Exciting New Features ✨


  • Multiple Catalog (#8254)


  • allow to join a cluster if a meta node has no log (#8384)
  • add key space Expire (#8441)


  • add metainfo in datablock (#8417)


  • improve performance of count distinct (#8317)


  • relax the order requirement about the COPY options (#8341)
  • support SELECT output nothing (#8390)


  • optimize join plan to make filter push down (#8377)


  • http handler no longer needs to call final_uri explicitly. (#8299)


  • support SET ROLE and current_role() (#8392)


  • Compact Segment (#8261)
  • Add cache operator in common-storage (#8306)
  • introduce data metrics for table (#8363)
  • add gc status to data metrics and show in processlist (#8389)
  • allow loading data from local fs (#8431)
  • new system table: system.catalogs (#8423)

  • implement array functions slice, remove_first, remove_last (#8326)
  • add tuple() and get() for tuple (#8372)


  • add join tests under large dataset (#8351)

Code Refactor 🎉


  • merge two to-meta-server rpc into one (#8308)
  • try best to leave a cluster (#8298)


  • use right mark join as subquery's default join type (#8427)


  • optimize get/upsert copied file info (#8282)
  • re-org query crates (#8336)


  • use commit_mutation in segment compaction (#8350)
  • decouple meta readers from TableContext (#8395)

  • move expression test to function-v2 (#8397)

Build/Testing/CI Infra Changes 🔌

  • replace cargo udeps with cargo machete (#8343)
  • migrate deprecating set-output commands (#8381)

Thoughtful Bug Fix 🔧


  • throw errors while loading config failed (#8462)


  • fix aggregation in cluster mode (#8333)
  • left join panic (#8325)
  • remove unnecessary required columns (#8443)


  • try fix data lost when resize multi outputs (#8319)
  • try fix lost last message if finish at same time (#8333)


  • StringSearchLike vector_vector can not match '\n' (#8359)
  • optimize upsert table copied file info (#8409)
  • div zero return err (#8464)


  • support mutation during insertion (#8205)
  • add a threshold for compact block (#8322)


RFC: Multiple Catalog

Databend supports multiple catalogs now, but only in a static way.

To allow accessing the hive catalog, users need to configure hive inside databend-query.toml in this way:

meta_store_address = ""
protocol = "binary"

Users can't add, alter, or remove catalogs at runtime.
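To make "static" concrete, here is a minimal sketch of a catalog registry that is built once from configuration and exposes no way to change it afterwards. It is not Databend's implementation: `StaticCatalogs`, `CatalogKind`, the catalog names `default` and `hive`, and the address used in the usage note are all hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical catalog kinds; illustrative names, not Databend types.
#[derive(Debug, Clone, PartialEq)]
pub enum CatalogKind {
    Default,
    Hive { meta_store_address: String },
}

/// Registry filled once at startup. It has no insert/remove methods,
/// which mirrors the static behaviour described above.
pub struct StaticCatalogs {
    catalogs: HashMap<String, CatalogKind>,
}

impl StaticCatalogs {
    /// Build the registry from values read out of the config file.
    /// `hive_address` is None when no hive catalog is configured.
    pub fn from_config(hive_address: Option<&str>) -> Self {
        let mut catalogs = HashMap::new();
        catalogs.insert("default".to_string(), CatalogKind::Default);
        if let Some(addr) = hive_address {
            catalogs.insert(
                "hive".to_string(),
                CatalogKind::Hive {
                    meta_store_address: addr.to_string(),
                },
            );
        }
        Self { catalogs }
    }

    /// Look up a catalog by name; unknown names return None.
    pub fn get(&self, name: &str) -> Option<&CatalogKind> {
        self.catalogs.get(name)
    }
}
```

With this shape, `StaticCatalogs::from_config(Some("127.0.0.1:9083"))` (a placeholder address) yields a registry where `get("hive")` succeeds and `get("iceberg")` returns `None`. Adding a new catalog would require changing the configuration and restarting.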

Letting users maintain multiple catalogs in Databend makes it quicker to integrate more catalogs, such as Iceberg.

Jepsen Test for Databend Meta Service

Jepsen is an open source software library for system testing. It is an effort to improve the safety of distributed databases, queues, consensus systems, etc.

Recently, @lichuang has been working on the design and implementation of a Jepsen test solution for the Databend Meta Service.

If you are interested in this test, please check the corresponding GitHub Repo, which contains the steps, scripts and clients for the test.

New Key-Value services support for OpenDAL

OpenDAL stands for Open Data Access Layer, and its goal is to access data freely, painlessly, and efficiently.

In the past, OpenDAL added support for storage backends such as the local file system, AWS S3, and Azure Blob. To store volatile data for caching and temporary storage, OpenDAL has now designed and implemented Key-Value service support (with kv::Adapter).

The following backends are currently available for the Key-Value service:

  • memory: Service based on BTreeMap.
  • moka: Service based on the high-performance caching library moka.
  • redis: Service based on redis.
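The adapter idea can be sketched roughly as follows: a small key-value trait, an in-memory backend built on `BTreeMap`, and a helper written only against the trait so that other backends could be swapped in. This is an illustration under assumptions, not OpenDAL's actual API: `KvAdapter`, `MemoryKv`, and `write_then_read` are hypothetical names, and the real `kv::Adapter` may differ in its methods and signatures.

```rust
use std::collections::BTreeMap;

/// Hypothetical adapter interface; OpenDAL's real `kv::Adapter` may differ.
pub trait KvAdapter {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn set(&mut self, key: &str, value: &[u8]);
    fn delete(&mut self, key: &str);
}

/// In-memory backend that keeps keys ordered in a BTreeMap.
#[derive(Default)]
pub struct MemoryKv {
    inner: BTreeMap<String, Vec<u8>>,
}

impl KvAdapter for MemoryKv {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.inner.get(key).cloned()
    }

    fn set(&mut self, key: &str, value: &[u8]) {
        self.inner.insert(key.to_string(), value.to_vec());
    }

    fn delete(&mut self, key: &str) {
        self.inner.remove(key);
    }
}

/// Helper written only against the trait, so any backend that
/// implements `KvAdapter` can be used in place of `MemoryKv`.
pub fn write_then_read<A: KvAdapter>(kv: &mut A, path: &str, data: &[u8]) -> Option<Vec<u8>> {
    kv.set(path, data);
    kv.get(path)
}
```

A caller would create `MemoryKv::default()` and pass it to `write_then_read`. A cache-backed or Redis-backed type would only need to implement the same three methods.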

The community also plans to add support for the following Key-Value services:

