GenerateUnihanCollators should not use ICU Unicode properties #1336

@markusicu

GenerateUnihanCollators relies on ICU Unicode properties. As a result, when we generate data for an upcoming Unicode version, and especially when the Unicode Tools --> ICU dependency is outdated, we get property data for an older version of Unicode than the one we are generating for.

Examples:

    private static final UnicodeSet NOT_NFC = new UnicodeSet("[:nfc_qc=no:]").freeze();
    private static final UnicodeSet NOT_NFD = new UnicodeSet("[:nfd_qc=no:]").freeze();
    private static final UnicodeSet NOT_NFKD = new UnicodeSet("[:nfkd_qc=no:]").freeze();
    private static final UnicodeSet UNIHAN_LATEST =
            new UnicodeSet("[[:ideographic:][:script=han:]]").removeAll(NOT_NFC).freeze();

We should use Unicode Tools APIs to get properties.
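
One possible shape for that change, as a rough sketch only: build the sets from an injected, version-specific property source instead of static ICU `UnicodeSet` patterns. The `VersionedProperties` interface, the `UnihanSetsSketch` class, and the property name strings below are hypothetical and used purely for illustration. They are not Unicode Tools or ICU APIs, and plain `java.util` sets stand in for `UnicodeSet` so the example is self-contained.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical abstraction (not a Unicode Tools or ICU API): a source of
// property data pinned to the Unicode version being generated.
interface VersionedProperties {
    /** Returns the code points for which property=value holds in the target version. */
    Set<Integer> getSet(String property, String value);
}

// Hypothetical holder mirroring NOT_NFC and UNIHAN_LATEST from the issue,
// but computed from the injected source rather than from ICU patterns.
final class UnihanSetsSketch {
    final Set<Integer> notNfc;
    final Set<Integer> unihanLatest;

    UnihanSetsSketch(VersionedProperties props) {
        // Equivalent of [:nfc_qc=no:]
        notNfc = Collections.unmodifiableSet(
                new TreeSet<>(props.getSet("NFC_Quick_Check", "No")));

        // Equivalent of [[:ideographic:][:script=han:]].removeAll(NOT_NFC)
        Set<Integer> han = new TreeSet<>(props.getSet("Ideographic", "Yes"));
        han.addAll(props.getSet("Script", "Han"));
        han.removeAll(notNfc);
        unihanLatest = Collections.unmodifiableSet(han);
    }
}
```

With this arrangement, the Unicode version is decided by whoever constructs the property source, so an outdated ICU dependency would no longer affect the generated data. How the source is backed by the actual Unicode Tools property classes is left open here.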

@macchiati FYI
