traits_inst.lookup_collatename and the regex FSM is underspecified with regards to ClassAtomCollatingElement
Section: 28.6.12 [re.grammar] Status: New Submitter: Hubert Tong Opened: 2017-06-25 Last modified: 2017-07-12
Priority: 3
Discussion:
For a user to implement a regular expression traits class meaningfully, the relationship between the return value of traits_inst.lookup_collatename and the behaviour of the finite state machine corresponding to a regular expression needs to be better specified.
The return value of traits_inst.lookup_collatename only feeds clearly into two operations:
a test of whether the returned string is empty ([re.grammar]/8), and
a test of whether the result of traits_inst.transform_primary, called with the returned string, is empty ([re.grammar]/10).
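The two tests above can be sketched against any traits class that provides the standard member functions. This is only an illustration of how the return value is consumed, not how any particular library implements it; the names CollateChecks and check_collating_name are hypothetical and do not appear in the Standard.

```cpp
#include <regex>
#include <string>

// Hypothetical helper type: outcome of the two emptiness tests.
struct CollateChecks {
    bool name_is_valid;        // lookup_collatename result is non-empty
    bool has_primary_sort_key; // transform_primary of that result is non-empty
};

// Hypothetical helper: applies both tests to a collating element name.
template <class Traits>
CollateChecks check_collating_name(const Traits& traits, const std::string& name) {
    // First test: is the string returned for this name empty?
    const auto element = traits.lookup_collatename(name.begin(), name.end());
    if (element.empty()) {
        return {false, false};
    }
    // Second test: is the primary sort key of the returned string empty?
    const auto key = traits.transform_primary(element.begin(), element.end());
    return {true, !key.empty()};
}
```

Calling check_collating_name(std::regex_traits<char>{}, "a") exercises both tests. Nothing in the sketch inspects the content of the returned string, only whether it is empty, which is the gap this issue describes.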
Note: It is unclear if bullet 14.3 in [re.grammar]/14 refers to the result of traits_inst.lookup_collatename when
it refers to a "collating element"; and if it does, it is unclear what input is to be used.
It is likewise unspecified what behaviour is expected when traits_inst.lookup_collatename substitutes another member of the equivalence class as its output.
For example, when processing "[[.AA.]]" as a pattern under the da_DK.utf8 locale, what difference in behaviour (if any) is expected if traits_inst.lookup_collatename returns "\u00C5" for "AA" (where U+00C5 is A with ring, which sorts the same as "AA")?
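A user-defined traits class that behaves as in this example might look like the following sketch. The class name substituting_traits is hypothetical. It uses wchar_t so that U+00C5 fits in a single character, and it hard-codes the substitution rather than consulting a da_DK locale; both choices are assumptions made only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical traits class: for the name "AA" it returns U+00C5, another
// member of the same equivalence class, instead of "AA" itself. All other
// names are forwarded to the standard traits.
struct substituting_traits : std::regex_traits<wchar_t> {
    template <class FwdIt>
    string_type lookup_collatename(FwdIt first, FwdIt last) const {
        const string_type name(first, last);
        if (name == L"AA") {
            return string_type(1, L'\u00C5'); // the substitution in question
        }
        return std::regex_traits<wchar_t>::lookup_collatename(first, last);
    }
};
```

The sketch stops short of constructing a std::basic_regex<wchar_t, substituting_traits> from L"[[.AA.]]", because what such an object should then match is the question this issue raises.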
[2017-07 Toronto Monday issue prioritization]
Priority 3
Proposed resolution: