[doc] COLLECT_LIST / COLLECT_SET examples in 3.x/2.1: undefined tables + outdated array display format #3896

@boluor

Pages

COLLECT_LIST and COLLECT_SET function references — version-3.x and version-2.1, EN + ZH:

  • versioned_docs/version-3.x/sql-manual/sql-functions/aggregate-functions/collect-list.md
  • versioned_docs/version-3.x/.../collect-set.md (+ version-2.1 and the i18n/zh-CN copies)

Problems

1. Undefined tables. The examples query collect_list_test / collect_set_test, but neither page ever creates them, so copying and pasting the examples fails with a "table does not exist" error.

2. Outdated ARRAY display format. Even after defining the tables, the printed outputs use a legacy array rendering that current Doris (verified on 3.1.4 and 2.1.11) no longer produces:

  • String/date elements are shown unquoted, e.g. the docs print [hello], [world], [2023-01-01,2023-01-02], but Doris renders string/date arrays quoted: ["hello"], ["world"], ["2023-01-01", "2023-01-02"].
  • No space after the comma, e.g. docs [1,2,2,3,3,4,4] vs Doris [1, 2, 2, 3, 3, 4, 4].

(Note that the docs are already internally inconsistent here: the sibling ARRAY_AGG page, for example, prints quoted string arrays.)

3. COLLECT_SET extras. The set-ordering example prints [4,3,2,1], but COLLECT_SET returns an unordered set — element order is not deterministic, so the literal output isn't reproducible. The same example's output table also has a malformed bottom border (+----...----+ merged across both columns).
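
A minimal sketch of a reproducible version of the ordering example. The schema is hypothetical: the pages never define collect_set_test, so the column name k1, the bucket count, and the replication_num property are assumptions chosen for a single-node test cluster.

```sql
-- Hypothetical schema: the docs never define collect_set_test.
CREATE TABLE IF NOT EXISTS collect_set_test (
    k1 INT
)
DUPLICATE KEY(k1)
DISTRIBUTED BY HASH(k1) BUCKETS 1
PROPERTIES ("replication_num" = "1");

-- Duplicates are inserted on purpose so that collect_set has something to deduplicate.
INSERT INTO collect_set_test VALUES (1), (2), (2), (3), (3), (4), (4);

-- collect_set alone gives no ordering guarantee;
-- wrapping it in array_sort makes the printed result reproducible.
SELECT array_sort(collect_set(k1)) AS sorted_set
FROM collect_set_test;
```

With the array rendering described in problem 2, this should print [1, 2, 3, 4] regardless of the order in which the rows are aggregated.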

Suggested fix

Add CREATE TABLE + INSERT setup for both pages and refresh the expected outputs to the current array rendering (quoted string elements, space after comma). For the COLLECT_SET ordering example, either sort for determinism (e.g. array_sort(collect_set(...))) or relax the expected output, and fix the table border.
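
As a rough illustration of the setup this fix calls for, here is a sketch for the COLLECT_LIST page. The schema is an assumption: the column names id, word, and dt, the bucket count, and the replication_num property are illustrative and not taken from the existing docs.

```sql
-- Hypothetical schema: the docs never define collect_list_test.
CREATE TABLE IF NOT EXISTS collect_list_test (
    id   INT,
    word VARCHAR(20),
    dt   DATE
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES ("replication_num" = "1");

INSERT INTO collect_list_test VALUES
    (1, 'hello', '2023-01-01'),
    (1, 'world', '2023-01-02'),
    (2, 'hello', '2023-01-01');

-- String and date columns are the ones affected by the quoting change,
-- so the refreshed docs should exercise both.
SELECT id,
       collect_list(word) AS words,
       collect_list(dt)   AS dates
FROM collect_list_test
GROUP BY id
ORDER BY id;
```

The expected output should be captured from a current release rather than written by hand, since the quoting and comma spacing are exactly what the old outputs get wrong. Element order inside each list may also vary between runs, so the documented output should either say so or sort the arrays.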

This issue was left out of the phantom-table-setup PR series because it requires rewriting the documented outputs to match current rendering (an editorial change on these older release lines), not just adding the missing tables.
