Make isxidstart() and isxidcontinue() module only functions in unicodedata #143897

@serhiy-storchaka

Description

The unicodedata functions isxidstart() and isxidcontinue() were added in #129117. However, that change also added corresponding methods to unicodedata.ucd_3_2_0, which are identical to the module-level functions except that they return False for code points not assigned in Unicode 3.2.0.
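To illustrate the version dependence described above, here is a minimal sketch using `category()`, which exists on both the module and `unicodedata.ucd_3_2_0` in released Python versions. The choice of U+1E9E (added in Unicode 5.1, so unassigned in 3.2.0) is mine, not from the issue. Because `isxidstart()` is only available in builds that include #129117, the sketch looks it up with `getattr()` rather than assuming it is present.

```python
import unicodedata

old = unicodedata.ucd_3_2_0  # frozen Unicode 3.2.0 view of the database

# U+1E9E LATIN CAPITAL LETTER SHARP S was added in Unicode 5.1,
# so it is unassigned from the point of view of Unicode 3.2.0.
ch = "\u1e9e"

current_category = unicodedata.category(ch)  # "Lu" in the current database
old_category = old.category(ch)              # "Cn" (unassigned) in 3.2.0

print(old.unidata_version, old_category, current_category)

# isxidstart() may not exist in this interpreter; look it up defensively.
module_isxidstart = getattr(unicodedata, "isxidstart", None)
old_isxidstart = getattr(old, "isxidstart", None)

if module_isxidstart is not None:
    print("module level:", module_isxidstart(ch))
if old_isxidstart is not None:
    # Per the issue, this method reports False for code points
    # not assigned in Unicode 3.2.0.
    print("ucd_3_2_0:", old_isxidstart(ch))
```

On an interpreter without these functions, only the first `print()` runs. It still shows how the same code point is classified differently by the two database views.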

unicodedata.ucd_3_2_0 exists solely to implement the obsolete IDNA2003 (RFC 3490 and RFC 3491) in the idna module. isxidstart() and isxidcontinue() are not needed for that. I am not even sure they return correct values for Unicode 3.2.0, because the XID_Start and XID_Continue properties can be added later to code points that were already assigned (at least this has happened with other properties).

So I think that isxidstart() and isxidcontinue() should be exposed only as unicodedata functions, not as unicodedata.ucd_3_2_0 methods. There is already a precedent: the functions related to grapheme cluster breaking are exposed only at the module level. This is because the grapheme cluster break algorithm was completely different in older versions of Unicode, and many of the properties it uses now did not even exist in those versions.


Labels: 3.15 (new features, bugs and security fixes), extension-modules (C modules in the Modules dir), type-feature (a feature request or enhancement)
