Sorting Tables Numerically
Columns with numerical alignment hints (.) will be sorted according to their numerical values.
To do this, psv parses each item in the column and uses the last numerical value it finds. While admittedly not perfect, this allows values such as 2 @ 1.99 EUR (which sorts as 1.99) to be ordered in a hopefully meaningful manner.
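The "last numerical value" rule can be illustrated with a small Go sketch. This is not psv's actual code: the pattern `numberRE` is deliberately simplified and the helper name `lastNumber` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// numberRE is a simplified pattern for decimal numbers (optional sign,
// digits, optional fraction, optional exponent). It is an assumption made
// for this sketch, not the pattern psv itself uses.
var numberRE = regexp.MustCompile(`[+-]?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?`)

// lastNumber returns the last numerical value found in s.
// The boolean result is false when s contains no number at all.
func lastNumber(s string) (float64, bool) {
	matches := numberRE.FindAllString(s, -1)
	if len(matches) == 0 {
		return 0, false
	}
	v, err := strconv.ParseFloat(matches[len(matches)-1], 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

func main() {
	v, ok := lastNumber("2 @ 1.99 EUR")
	fmt.Println(v, ok) // 1.99 true
}
```

Here the item contains two numbers, 2 and 1.99, and only the last one is used as the sort key.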
Supported Numerical Formats (introductory overview)
| Example | Description |
|---|---|
| 1234 | decimal integers |
| -1234 | signed numbers |
| 12.34 | decimal floating point numbers |
| 12.34e2 | decimal exponentiation |
| +12.34e-2 | signed exponents |
| 0x0a10 | hexadecimal integers |
Number Parsing Details
What does psv understand as being a number?
Many common programming languages use a similar format for specifying literal numbers.
psv follows the specification used by Go.
Digit Grouping
Large numbers, such as the speed of light, 299792458 m/s, tend to be hard for people to read.
Many cultures allow such numbers to be "grouped" to make them easier to read, e.g. 299,792,458 (English), 299.792.458 (German) or 29,97,92,458 (Indian).
Because psv has no way of knowing which format to expect, it only understands the . as the decimal point, and ignores , entirely.
However, psv does allow the _ character to be used within a number to visually group digits. The _ must appear between two digits, and is only valid in the integer part of a number!
The _ character was chosen to avoid confusion between . and , which are used interchangeably in many locales.
The _ character can be used between any two digits in the integer part of a number (not necessarily in groups of 3) and only serves the purpose of improved readability of large numbers.
So you could write the speed of light as 299_792_458, 2_99_79_24_58 or even 2_99_792_4_5_8 and psv would consider all of these to be identical.
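Since psv follows Go's number syntax, Go's standard library can be used to try this out. The sketch below is not psv's implementation; `parseGrouped` is a hypothetical helper that relies on `strconv.ParseInt` with base 0, which accepts `_` separators as defined by Go's integer literal syntax.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseGrouped parses an integer that may contain "_" digit separators.
// With base 0, strconv.ParseInt permits underscores between digits.
func parseGrouped(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	inputs := []string{"299792458", "299_792_458", "2_99_79_24_58", "2_99_792_4_5_8"}
	for _, s := range inputs {
		v, err := parseGrouped(s)
		fmt.Println(s, v, err) // every input yields 299792458 <nil>
	}

	// A comma is not a valid separator, so this fails to parse.
	_, err := parseGrouped("299,792,458")
	fmt.Println(err != nil) // true
}
```

Note that base 0 also treats a leading 0 as an octal prefix, which differs from the octal handling described under Unsupported Number Formats.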
Minus Signs
psv allows the use of the mathematical minus sign, Unicode U+2212. This symbol looks almost exactly the same as the normal "hyphen-minus" character (ASCII 45, U+002D), which makes it especially difficult to spot when a number contains one instead of the other.
Can you see the difference? I certainly cannot!
This is to alleviate the pain of copying and pasting tables of data from sources such as Wikipedia, where this symbol is often used in tables of numerical data.
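One way to handle the mathematical minus sign is to normalize it before parsing. The following Go sketch shows the idea only; `normalizeMinus` is a hypothetical helper, and psv may handle the character differently internally.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// normalizeMinus replaces the Unicode minus sign (U+2212) with the ASCII
// hyphen-minus (U+002D) so that standard parsers accept the value.
func normalizeMinus(s string) string {
	return strings.ReplaceAll(s, "\u2212", "-")
}

func main() {
	raw := "\u2212" + "1234" // as it might be pasted from a web page

	// Without normalization the standard parser rejects the value.
	_, err := strconv.ParseFloat(raw, 64)
	fmt.Println(err != nil) // true

	v, err := strconv.ParseFloat(normalizeMinus(raw), 64)
	fmt.Println(v, err) // -1234 <nil>
}
```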
Decimal Numbers
{sign}
{integer part}
{point}
{fraction part}
{exponentiation e or E}
{exponent sign}
{integer exponent}
[+-]? [1-9][0-9]*(_[0-9]+)* [.] [0-9]+ [eE] [+-]? [0-9]+
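The pattern above can be tried out with Go's `regexp` package. The sketch below removes the spaces and, as an assumption that is not stated above, treats the fraction and the exponent as optional groups. The names `decimalRE` and `isDecimal` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// decimalRE is the decimal pattern with the spaces removed. Making the
// fraction and exponent groups optional is an assumption of this sketch.
var decimalRE = regexp.MustCompile(
	`^[+-]?[1-9][0-9]*(_[0-9]+)*([.][0-9]+)?([eE][+-]?[0-9]+)?$`)

// isDecimal reports whether s as a whole matches the decimal pattern.
func isDecimal(s string) bool {
	return decimalRE.MatchString(s)
}

func main() {
	for _, s := range []string{"1234", "-1234", "12.34", "12.34e2", "+12.34e-2", "299_792_458"} {
		fmt.Println(s, isDecimal(s)) // true for each of these
	}
	for _, s := range []string{"12_", "1,5", "abc"} {
		fmt.Println(s, isDecimal(s)) // false for each of these
	}
}
```

Taken literally, the leading [1-9] also rejects values such as 0.5, so the real parser may well be more lenient than this sketch.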
Examples:
Hexadecimal Numbers
{sign}
{hexadecimal 0x prefix}
{integer part}
{point}
{fraction part}
{exponentiation p or P}
{exponent sign}
{integer exponent}
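Go's `strconv` package understands the same hexadecimal components, so it can be used to experiment with them. This is a sketch of Go's own behaviour, not of psv's parser; `parseNumber` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseNumber first tries to read s as an integer literal (base 0 detects
// the 0x prefix) and falls back to a floating point literal otherwise.
func parseNumber(s string) (float64, error) {
	if i, err := strconv.ParseInt(s, 0, 64); err == nil {
		return float64(i), nil
	}
	return strconv.ParseFloat(s, 64)
}

func main() {
	// Hexadecimal integer: 0x0a10 = 10*256 + 1*16 = 2576.
	fmt.Println(parseNumber("0x0a10")) // 2576 <nil>

	// Hexadecimal float: mantissa 0x1.8 = 1.5, exponent p1 = 2^1.
	fmt.Println(parseNumber("0x1.8p1")) // 3 <nil>

	// Signed value with a signed exponent: -(1 * 2^-2).
	fmt.Println(parseNumber("-0x1p-2")) // -0.25 <nil>

	// In Go, a hexadecimal fraction without a p exponent is rejected.
	_, err := parseNumber("0x1.8")
	fmt.Println(err != nil) // true
}
```

The p exponent is a power of two rather than a power of ten, which is why 0x1.8p1 evaluates to 3.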
Unsupported Number Formats
- any numbers which do not use so-called Arabic numerals, e.g. Roman numerals, Eastern Arabic numerals, Chinese, Japanese, etc.
- octal numbers are not recognised as such; they are simply treated as decimal values, which is probably wrong
- imaginary numbers
- numbers with units are not scaled in any way, e.g. 1kg will be sorted well before 100g, because only the values 1 and 100 are compared
- any kind of mathematical formula
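The effect of the unit limitation in the list above can be reproduced with a short Go sketch. It is only an illustration under simplified assumptions: `numericKey` and `sortNumerically` are hypothetical helpers, and the pattern handles plain unsigned decimals only.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// numberRE matches a plain unsigned decimal number (a simplification).
var numberRE = regexp.MustCompile(`[0-9]+(\.[0-9]+)?`)

// numericKey returns the number found in s, ignoring any unit.
// Items without a number get the key 0.
func numericKey(s string) float64 {
	v, _ := strconv.ParseFloat(numberRE.FindString(s), 64)
	return v
}

// sortNumerically sorts items in place by their numeric key only.
func sortNumerically(items []string) {
	sort.SliceStable(items, func(i, j int) bool {
		return numericKey(items[i]) < numericKey(items[j])
	})
}

func main() {
	weights := []string{"100g", "1kg", "250g"}
	sortNumerically(weights)
	fmt.Println(weights) // [1kg 100g 250g]
}
```

Because the units are never interpreted, 1kg ends up first even though it is the heaviest item.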