Table has more than one bucket keys, but "show create table xxx" only displays one #11090

madeirak · 2024-09-06T09:43:56Z

Apache Iceberg version

1.4.3

Query engine

Spark

Please describe the bug 🐞

The output of "select * from xx.xx.partitions" (above) shows that this table has two bucket keys.
But "show create table xx.xx" (below) displays only one bucket key.

manuzhang · 2024-09-19T06:41:33Z

The table has two partition keys from two partition transforms, one of which is bucket.

madeirak · 2024-09-19T06:49:29Z

> The table has two partition keys from two partition transforms, one of which is bucket.

Are these two partition transforms, name_bucket_10 and id_bucket_10, equivalent?

Are both of them hash-based?
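
For reference, the Iceberg table spec defines the bucket transform as a 32-bit Murmur3 hash of the value, masked to a non-negative number and taken modulo the bucket count. Both fields would therefore be hash-based and differ only in the source column and its type. Below is a minimal sketch under that assumption; `murmur3_32` and `iceberg_bucket` are illustrative names, not Iceberg APIs, and only INT and STRING values are handled.

```python
def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 (x86 variant) over a byte string."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    nblocks = len(data) // 4

    # Body: mix each 4-byte little-endian block into the state.
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF

    # Tail: mix the remaining 1 to 3 bytes, if any.
    tail = data[4 * nblocks:]
    k = 0
    if len(tail) >= 3:
        k ^= tail[2] << 16
    if len(tail) >= 2:
        k ^= tail[1] << 8
    if len(tail) >= 1:
        k ^= tail[0]
        k = (k * c1) & 0xFFFFFFFF
        k = _rotl32(k, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def iceberg_bucket(value, num_buckets: int) -> int:
    """Bucket index for an INT or STRING value (sketch, other types omitted)."""
    if isinstance(value, int):
        # Assumption from the spec: integers are hashed as 8-byte little-endian longs.
        data = value.to_bytes(8, "little", signed=True)
    elif isinstance(value, str):
        # Assumption from the spec: strings are hashed as their UTF-8 bytes.
        data = value.encode("utf-8")
    else:
        raise TypeError("this sketch only handles int and str")
    return (murmur3_32(data) & 0x7FFFFFFF) % num_buckets


# id = 1 and name = '1' are hashed from different byte sequences.
id_bucket = iceberg_bucket(1, 10)
name_bucket = iceberg_bucket("1", 10)
```

Under these assumptions the two fields use the same hashing scheme, but they are not equivalent: each one buckets a different column, so a given row can land in different bucket numbers for name_bucket_10 and id_bucket_10.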

manuzhang · 2024-09-19T07:19:41Z

Sorry, I missed the name_bucket_10 part. How did you create your table, and with which catalog?

madeirak · 2024-09-19T07:41:11Z

> Sorry, I missed the name_bucket_10 part. How did you create your table, and with which catalog?

Similar to the following process:

create table dbxx.tbxx (id INT COMMENT '11', name STRING COMMENT '') USING iceberg PARTITIONED BY (name, bucket(10, name), bucket(10, id));
insert into dbxx.tbxx values (1, '1');
show create table dbxx.tbxx;
select * from dbxx.tbxx.partitions;

madeirak · 2024-09-24T11:14:11Z

> Sorry, I missed the name_bucket_10 part. How did you create your table, and with which catalog?

With HiveCatalog

lurnagao-dahua · 2024-09-25T03:39:49Z

> create table dbxx.tbxx (id INT COMMENT '11', name STRING COMMENT '') USING iceberg PARTITIONED BY (name, bucket(10, name), bucket(10, id));
> insert into dbxx.tbxx values (1, '1');
> show create table dbxx.tbxx;
> select * from dbxx.tbxx.partitions;

I am quite puzzled why name is used as both partition and bucket. In this case, all the data under the name partition is in the same bucket, and the bucketing effect is meaningless.

madeirak · 2024-09-25T03:46:54Z

> > create table dbxx.tbxx (id INT COMMENT '11', name STRING COMMENT '') USING iceberg PARTITIONED BY (name, bucket(10, name), bucket(10, id));
> > insert into dbxx.tbxx values (1, '1');
> > show create table dbxx.tbxx;
> > select * from dbxx.tbxx.partitions;
>
> I am quite puzzled why name is used as both partition and bucket. In this case, all the data under the name partition is in the same bucket, and the bucketing effect is meaningless.

This is just an example, not a real table. The main issue is that when a table has multiple bucket fields, "show create table xxx" displays only one of them.

manuzhang · 2024-09-25T04:24:22Z

The show create table output follows Spark SQL syntax, which supports only one bucket field.
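
To illustrate the effect, here is a rough model of a renderer that has a single slot for a bucket clause, compared with one that prints every transform in the Iceberg `PARTITIONED BY` style. This is a simplified sketch and not Spark's actual code: the `Transform` class and both functions are hypothetical, and which bucket transform survives (the last one here) is an arbitrary choice of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Transform:
    """Hypothetical stand-in for one partition transform."""
    kind: str                          # "identity" or "bucket"
    column: str
    num_buckets: Optional[int] = None  # only set for bucket transforms


def render_single_bucket(transforms: List[Transform]) -> str:
    """Render with one bucket slot, in the Spark SQL CLUSTERED BY shape."""
    partition_cols = []
    bucket = None
    for t in transforms:
        if t.kind == "bucket":
            # Single slot: a later bucket transform overwrites an earlier one
            # (assumption of this sketch).
            bucket = t
        else:
            partition_cols.append(t.column)
    lines = []
    if partition_cols:
        lines.append(f"PARTITIONED BY ({', '.join(partition_cols)})")
    if bucket is not None:
        lines.append(
            f"CLUSTERED BY ({bucket.column}) INTO {bucket.num_buckets} BUCKETS"
        )
    return "\n".join(lines)


def render_iceberg_style(transforms: List[Transform]) -> str:
    """Render every transform inside PARTITIONED BY, as in the Iceberg DDL."""
    parts = [
        t.column if t.kind == "identity" else f"bucket({t.num_buckets}, {t.column})"
        for t in transforms
    ]
    return f"PARTITIONED BY ({', '.join(parts)})"


# Partition spec from the example table in this thread.
spec = [
    Transform("identity", "name"),
    Transform("bucket", "name", 10),
    Transform("bucket", "id", 10),
]

single = render_single_bucket(spec)
full = render_iceberg_style(spec)
```

In this model `single` keeps only one of the two bucket transforms, while `full` reproduces the original `PARTITIONED BY (name, bucket(10, name), bucket(10, id))` clause.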

madeirak · 2024-09-25T06:29:48Z

> The show create table output follows Spark SQL syntax, which supports only one bucket field.

OK, fine. It would be better if the output matched the syntax shown in the Iceberg documentation:
ref: https://iceberg.apache.org/docs/latest/spark-ddl/#partitioned-by

madeirak added the "bug" (Something isn't working) label on Sep 6, 2024
