COmanage / CO-2632

UTF-8 Oriented Match Rules

Details

    • New Feature
    • Resolution: Unresolved
    • Minor
    • COmanage Match Future
    • COmanage Match 1.1.0 (Crisp Apple)
    • Match
    • None

    Description

      There are a couple of different categories of possible match rules here, e.g.:

      1. Latin character diacritic folding, e.g. Björn, Bjorn, and maybe even Bjoern should all be considered equivalent. This is fairly straightforward in that there are a relatively limited number of mappings (although they vary somewhat by language), and Postgres 12 and later supports nondeterministic ICU collations that might be useful.

      2. Multi-lingual representations, e.g. Yayoi Kusama in English is 草間 彌生 in Japanese, which could also (but less likely) be rendered phonetically as くさま やよい or even クサマ ヤヨイ. In theory this could be handled via the Attribute Mapping capability (the same thing that allows Mike and Michael to be treated as the same value), but it’s not quite the same thing since this is really more of a transliteration than a nickname.
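
      A rough sketch of the diacritic folding category, written in Python purely for illustration. It is not how Match implements (or would implement) the rule. The function names and the small German-style expansion table are hypothetical, and a real rule set would need per-language mappings.

```python
import unicodedata

# Hypothetical, illustrative expansion table (German-style only).
# Real mappings vary by language and would need to be configurable.
EXPANSIONS = {"ä": "ae", "ö": "oe", "ü": "ue"}


def fold_diacritics(name: str) -> str:
    """Decompose to NFD, drop combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def expand_then_fold(name: str) -> str:
    """Apply the expansion table (ö -> oe), then fold whatever remains."""
    composed = unicodedata.normalize("NFC", name).casefold()
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in composed)
    return fold_diacritics(expanded)


def match_keys(name: str) -> set:
    """All comparison keys under which a name may be found."""
    return {fold_diacritics(name), expand_then_fold(name)}


def names_match(a: str, b: str) -> bool:
    """Two names match if they share at least one comparison key."""
    return bool(match_keys(a) & match_keys(b))
```

      With this sketch Björn matches both Bjorn and Bjoern, but Bjorn and Bjoern do not match each other directly, since neither carries the umlaut that links the two spellings. Treating all three as equivalent would need an extra (and more aggressive) rule such as collapsing "oe" to "o".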

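      A rough sketch of the multi-lingual category, again in Python and only for illustration. Katakana and hiragana can be folded together mechanically because the two Unicode blocks are offset by 0x60, but kanji and Latin renderings need a lookup table, similar in spirit to Attribute Mapping. The table contents and function names below are hypothetical and do not reflect the actual Attribute Mapping implementation.

```python
import unicodedata

# Katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana.
KATAKANA_FIRST, KATAKANA_LAST = 0x30A1, 0x30F6
KANA_OFFSET = 0x60


def katakana_to_hiragana(text: str) -> str:
    """Shift katakana code points down to their hiragana counterparts."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST
        else ch
        for ch in text
    )


def normalize(text: str) -> str:
    """NFKC (folds full-width spaces and half-width kana), kana fold, case-fold."""
    return katakana_to_hiragana(unicodedata.normalize("NFKC", text)).casefold()


# Hypothetical mapping table: each known rendering points at one canonical value.
_RAW_TRANSLITERATIONS = {
    "草間 彌生": "yayoi kusama",
    "くさま やよい": "yayoi kusama",
}
TRANSLITERATIONS = {normalize(k): v for k, v in _RAW_TRANSLITERATIONS.items()}


def canonical(name: str) -> str:
    """Return the canonical value for a name, or its normalized form if unmapped."""
    key = normalize(name)
    return TRANSLITERATIONS.get(key, key)
```

      Here the katakana rendering needs no table entry of its own, since it folds to the hiragana entry first. The kanji and hiragana entries still have to be supplied by hand, which is where this differs from simple folding.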
          People

            benn.oshrin@at.internet2.edu Benn Oshrin (internet2.edu)