Locale-Specific Collation

When sorting tables, the default collation rules should suffice for most situations:

  • values are sorted in alpha-numeric order, according to the Latin alphabet
  • numbers are sorted before letters
  • upper-case letters are sorted before lower-case letters
  • numbers are compared numerically (as opposed to lexicographically), so 10 is greater than 2
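
The rules above can be approximated with a small sort key. This is only an illustrative Python sketch, not psv's implementation: `default_key` is a hypothetical helper, and it assumes that "upper-case before lower-case" means all upper-case letters precede all lower-case ones (plain code-point order), which the text does not spell out.

```python
def default_key(value: str):
    """Hypothetical sort key approximating the default rules listed above."""
    if value.isdecimal():
        # Numbers sort before letters and compare numerically (2 < 10).
        return (0, int(value), "")
    # Plain code-point comparison places upper-case ASCII before lower-case.
    return (1, 0, value)


values = ["C", "10", "2", "A", "c", "1", "B", "a", "b"]
print(sorted(values, key=default_key))
# ['1', '2', '10', 'A', 'B', 'C', 'a', 'b', 'c']
```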

By providing the locale {tag} sub-command, psv can also sort a table's rows using the collation rules appropriate for the {tag} locale.

Example: psv sort locale de

Example locale tags:

| Language         | Tag  |
| ---------------- | ---- |
| default          | none |
| English          | en   |
| Deutsch          | de   |
| français         | fr   |
| dansk            | da   |
| svenska          | sv   |
| Schweizerdeutsch | gsw  |
| 日本語           | ja   |

Locale tags must match the BCP 47 format, but a simple two- or three-letter language identifier (such as de or gsw) typically suffices.
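
To get a feel for what such tags look like, here is a rough Python check for the most common shapes of a BCP 47 tag. It is a deliberately simplified sketch: the pattern covers only a language subtag with an optional script and region, the full BCP 47 grammar allows more, and `looks_like_locale_tag` is a hypothetical helper that has nothing to do with how psv validates its input (psv's own `none` value, for example, is not covered).

```python
import re

# Simplified shape: language (2-3 letters), optional script (4 letters),
# optional region (2 letters or 3 digits). Full BCP 47 permits more subtags.
SIMPLE_TAG = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-(?:[A-Za-z]{2}|[0-9]{3}))?")


def looks_like_locale_tag(tag: str) -> bool:
    """Return True if tag has the simplified language[-script][-region] shape."""
    return SIMPLE_TAG.fullmatch(tag) is not None


for tag in ["de", "gsw", "de-CH", "zh-Hant-TW", "de_CH"]:
    print(tag, looks_like_locale_tag(tag))
```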

See Common Language Subtags (Wikipedia) for a more complete list of common language (sub-)tags.

Examples

Default (undefined) locale:

The default collation order should suffice for most Western European languages:

| values |
| ------ |
| C      |
| 10     |
| 2      |
| A      |
| c      |
| 1      |
| B      |
| a      |
| b      |

German:

German special characters are collated exactly like their plain counterparts:

  • ß is exactly the same as ss
  • ä is exactly the same as a
  • ö is exactly the same as o
  • ü is exactly the same as u

| german |
| ------ |
| ßr     |
| ßt     |
| sr     |
| ß      |
| s      |
| ss     |
| ß      |
| st     |
| ss     |
| ß      |
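
The equivalences listed above can be imitated by folding each special character into its counterpart before comparing. The following Python sketch only illustrates that idea; `german_key` is a hypothetical helper, and a real collator (including whatever psv uses internally) may break ties between equivalent strings differently.

```python
# Fold German special characters as described: ß -> ss, ä -> a, ö -> o, ü -> u.
GERMAN_FOLD = str.maketrans({
    "ß": "ss",
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U",
})


def german_key(value: str) -> str:
    """Hypothetical sort key: compare values after folding special characters."""
    return value.translate(GERMAN_FOLD)


values = ["ßt", "st", "ss", "ß", "s"]
# Values with equal keys ("ss" and "ß") keep their input order (stable sort).
print(sorted(values, key=german_key))
# ['s', 'ss', 'ß', 'ßt', 'st']
```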

Swedish

The Scandinavian Å is sorted after z in Swedish:

Sorting "Å" with 'sv' locale:

| swedish |
| ------- |
| a       |
| Aa      |
| Å       |
| å       |
| ab      |
| b       |
| aa      |
| A       |
| z       |

Here z marks the end of the a-z alphabet.
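
As an illustration of the Swedish rule, the sketch below ranks characters by their position in the Swedish alphabet, where å, ä and ö follow z. It is a simplified Python approximation rather than psv's actual collator: `swedish_key` is a hypothetical helper and it ignores case entirely, so values that differ only in case keep their input order.

```python
# Swedish alphabet: å, ä and ö come after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(value: str):
    """Hypothetical sort key: alphabet position of each character, ignoring case."""
    # Characters outside the alphabet sort after all known letters.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_RANK)) for ch in value.lower()]


values = ["a", "Aa", "Å", "å", "ab", "b", "aa", "A", "z"]
print(sorted(values, key=swedish_key))
# ['a', 'A', 'Aa', 'aa', 'ab', 'b', 'z', 'Å', 'å']
```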

Danish

In Danish, the Å character is equivalent to Aa.

Like Swedish, Danish sorts Å after z. In contrast to Swedish, however, Danish also sorts Aa after z!

Danish sorting of "Å" and "AA":

| danish |
| ------ |
| aa     |
| a      |
| Å      |
| Aa     |
| å      |
| b      |
| ab     |
| A      |
| z      |

Here z marks the end of the a-z alphabet.
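
The Danish treatment of Aa can be imitated by first folding the digraph "aa" into "å" and then ranking characters by their position in the Danish alphabet, which ends in æ, ø, å. As before, this Python sketch is only an approximation under those assumptions: `danish_key` is a hypothetical helper, it ignores case, and it is not how psv implements the 'da' locale.

```python
# Danish alphabet: æ, ø and å come after z.
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
DANISH_RANK = {ch: i for i, ch in enumerate(DANISH_ALPHABET)}


def danish_key(value: str):
    """Hypothetical sort key: treat 'aa' as 'å', then rank by alphabet position."""
    folded = value.lower().replace("aa", "å")
    # Characters outside the alphabet sort after all known letters.
    return [DANISH_RANK.get(ch, len(DANISH_RANK)) for ch in folded]


values = ["aa", "a", "Å", "Aa", "å", "b", "ab", "A", "z"]
print(sorted(values, key=danish_key))
# ['a', 'A', 'ab', 'b', 'z', 'aa', 'Å', 'Aa', 'å']
```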