Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com.

What's Changed

Below is a list of some major changes that we don't want you to miss.

Exciting New Features ✨

toolchain

  • upgrade to 1.67 nightly (#8631)

multiple catalog

  • multiple catalog create (planner and catalog manager) (#8620)

compact

  • optimize compact for data load (#8644)

planner

  • optimize left/single join (#8583)

query

  • support copy from xml (#8404)
  • add collation (#8610)
  • copy files order by last modified time asc (#8628)
  • improve sort, 10%~50% faster than the old one (#8452)

new expression

  • add to_xxx() cast functions (#8599)
  • add inlist expr in new expression (#8676)

Code Refactor 🎉

format

  • refactor output format with FieldEncoders (#8700)

planner

  • move plan from query/planner to sql/planner (#8660)

query

  • remove sqlparser-rs (#8670)
  • try move list files to read_partitions (#8673)

storage

  • move and group sub-crates in storages (#8613, #8621, #8627, etc.)
  • compact segments, which strictly preserves the order of ingestion (#8590)

new expression

  • migrate deserializations to expression (#8637)
  • use to_xxx() to evaluate CAST(xxx AS xxx) (#8637)

Build/Testing/CI Infra Changes 🔌

  • rust-toolchain nightly 1.67.0 (nightly-2022-11-07) (#8641)

Thoughtful Bug Fix 🔧

compatibility

  • problem when using the Trino MySQL connector (#8668)

meta

  • emit kv change events after committing a transaction (#8674)

query

  • union's pairs are handled incorrectly (#8638)
  • max_threads cannot be determined automatically (#8707)

News

Let's take a look at what's new at Datafuse Labs & Databend each week.

Support Copy from XML

After #8404 was merged, Databend now offers support for loading data from XML formatted files.

As with other formats, you only need to set the format option to XML in the SQL statement. The following example uses the streaming load API:

curl -sH "insert_sql:insert into test_xml format XML" \
-F "upload=@/tmp/simple_v1.xml" \
-u root: -XPUT "http://localhost:${QUERY_HTTP_HANDLER_PORT}/v1/streaming_load"

The content of your XML file needs to match one or more of the following types:

  • Column names as attributes and column values as attribute values:
<row column1="value1" column2="value2" .../>
  • Column names as tags and column values as the content of these tags:
<row>
  <column1>value1</column1>
  <column2>value2</column2>
</row>
  • Column names are the name attributes of tags, and values are the contents of these tags:
<row>
  <field name='column1'>value1</field>
  <field name='column2'>value2</field>
</row>
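
To make the three layouts concrete, here is a minimal Python sketch that maps a row in any of them to the same column/value record. It is only an illustration built on the standard library, not Databend's parser; the `<data>` root element, the sample values, and the helper names `row_to_dict` and `parse_rows` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def row_to_dict(row):
    """Map one <row> element to {column: value}, accepting any of the three layouts."""
    # Layout 1: column names as attributes of <row>.
    record = dict(row.attrib)
    for child in row:
        if child.tag == "field" and "name" in child.attrib:
            # Layout 3: <field name='column'>value</field>
            record[child.attrib["name"]] = child.text
        else:
            # Layout 2: <column>value</column>
            record[child.tag] = child.text
    return record


def parse_rows(xml_text):
    """Return one dict per <row> found in the document."""
    root = ET.fromstring(xml_text)
    return [row_to_dict(row) for row in root.iter("row")]


# Hypothetical sample: the <data> wrapper and the values are made up.
sample = """<data>
  <row column1="a1" column2="a2"/>
  <row><column1>b1</column1><column2>b2</column2></row>
  <row><field name='column1'>c1</field><field name='column2'>c2</field></row>
</data>"""

rows = parse_rows(sample)
for record in rows:
    print(record)
```

All three rows come out with the same keys, `column1` and `column2`, which is why the layouts can be mixed within one file.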

Learn More

Support for Char Collation

After #8610 was merged, Databend supports a collation setting, which selects the encoding used to interpret strings.

By default, collation is set to 'binary', because Databend stores string columns in binary format. You can change it to 'utf8' with a statement like the following:

set collation = 'utf8';

This may help you get the expected results when working with non-English strings. In the test below, substr and length count bytes under 'binary' and characters under 'utf8':

statement query TI
select substr('城区主城区其他', 1, 6), length('我爱中国');

----
城区	12


statement ok
set collation = 'utf8';


statement query TI
select substr('城区主城区其他', 1, 6), length('我爱中国');

----
城区主城区其	4
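
The difference in the results above comes down to counting bytes versus counting characters. The following Python sketch reproduces the same numbers outside Databend, assuming the strings are UTF-8 encoded; the helper names are hypothetical and only model the two behaviors, they are not Databend functions.

```python
def substr_binary(s, start, length):
    """1-based substring over the UTF-8 bytes, as under the 'binary' collation."""
    data = s.encode("utf-8")
    chunk = data[start - 1:start - 1 + length]
    # Drop any trailing partial character so the result is printable.
    return chunk.decode("utf-8", errors="ignore")


def substr_utf8(s, start, length):
    """1-based substring over characters, as under the 'utf8' collation."""
    return s[start - 1:start - 1 + length]


def length_binary(s):
    """Length in bytes."""
    return len(s.encode("utf-8"))


def length_utf8(s):
    """Length in characters."""
    return len(s)


text = "城区主城区其他"
print(substr_binary(text, 1, 6), length_binary("我爱中国"))  # 城区 12
print(substr_utf8(text, 1, 6), length_utf8("我爱中国"))      # 城区主城区其 4
```

Each of these Chinese characters takes three bytes in UTF-8, so six bytes cover two characters and four characters occupy twelve bytes.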

Learn More

Issues

Here are some issues you may find interesting. Feel free to try solving them.

Enable Xor Filter Index for IN

Databend introduced the Xor Filter to replace the Bloom Filter (#7870), which in some scenarios gives about a twofold performance improvement and requires very little data to be scanned.

Initially, we simply added this index for string columns. Then, in #7958, it was enabled for integer columns.

Now, we want to enable the Xor Filter index for IN.

SELECT * FROM t1 where xx IN ('', '')

Issue 8625: performance: enable xor filter index for IN
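
As a rough illustration of the idea, the sketch below shows how a per-block membership filter could prune blocks for an IN predicate: a block has to be scanned only if at least one value in the list may be present. This is not Databend's implementation and not a real xor filter. `FingerprintFilter` is a hypothetical stand-in that stores 16-bit fingerprints in a set, so, like a xor filter, it can report false positives but never false negatives. The block names and values are made up.

```python
import hashlib


class FingerprintFilter:
    """Hypothetical stand-in for a per-block filter index (not a real xor filter)."""

    def __init__(self, values):
        self._fingerprints = {self._fingerprint(v) for v in values}

    @staticmethod
    def _fingerprint(value):
        # 16-bit fingerprint taken from a stable hash of the value.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:2], "big")

    def may_contain(self, value):
        """False means definitely absent; True means possibly present."""
        return self._fingerprint(value) in self._fingerprints


def blocks_to_scan(block_filters, in_list):
    """Keep a block only if at least one IN value may be present in it."""
    return [
        name
        for name, block_filter in block_filters.items()
        if any(block_filter.may_contain(v) for v in in_list)
    ]


# Made-up blocks for illustration.
block_filters = {
    "block_0": FingerprintFilter(["apple", "banana"]),
    "block_1": FingerprintFilter(["cherry", "durian"]),
    "block_2": FingerprintFilter(["fig", "grape"]),
}

candidates = blocks_to_scan(block_filters, ["cherry", "kiwi"])
print(candidates)
```

The block that really holds 'cherry' is always among the candidates. Other blocks are skipped unless a fingerprint happens to collide, which is the usual trade-off of an approximate filter.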

If you find it interesting, try to solve it or participate in discussions and PR reviews. Or you can click on https://link.databend.rs/i-m-feeling-lucky to pick up a good first issue. Good luck!

Changelogs

You can check the changelogs of Databend nightly to learn about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

andylokandy, b41sh, BohuTANG, Chasen-Zhang, ClSlaid, dantengsky,
dependabot[bot], drmingdrmer, eliasyaoyc, lichuang, mergify[bot], RinChanNOWWW,
soyeric128, sundy-li, TCeason, Xuanwo, xudong963, youngsofun,
zhang2014, ZhiHanZ

Meet Us

Please join the DatafuseLabs Community if you are interested in Databend.

We are looking forward to seeing you try our code. We have a strong team behind you to ensure a smooth experience in trying our code for your projects. If you are a hacker passionate about database internals, feel free to play with our code.

You can submit issues for any problems you find. We also highly appreciate any of your pull requests.