-
- sort - sort and/or merge files
-
- sort [ options ] [ file ... ]
-
- sort sorts lines of all the files together and writes the result on the standard output. The file name - means the standard input. If no
files are named, the standard input is sorted.
- The default sort key is an entire line. Default ordering is lexicographic by bytes in machine collating sequence. The ordering is affected globally by the
following options, one or more of which may appear. See recsort(3) for details.
- For backwards compatibility the -o option is allowed in any file operand position when neither the -c nor the -- options are specified.
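- As a minimal sketch of the default behavior described above: the whole line is the key and comparison is by bytes, so upper-case letters sort before lower-case ones. The sample data and variable names are illustrative. LC_ALL=C is an assumption, added so that other sort implementations that honor the locale also compare by bytes.

```shell
# Whole-line key, byte ordering. LC_ALL=C is an assumption for sort
# implementations that would otherwise collate by locale.
sorted=$(printf 'banana\nApple\ncherry\napple\n' | LC_ALL=C sort)

# The file name - names the standard input explicitly.
from_stdin=$(printf 'banana\nApple\ncherry\napple\n' | LC_ALL=C sort -)

printf '%s\n' "$sorted"
```

Both invocations should give the same result, since naming no files and naming - both read the standard input.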
-
- -k, --key=pos1[,pos2]|.reclen|.position.length
- Restrict the sort key to a string beginning at pos1 and ending at pos2.
pos1 and pos2 each have the form m.n, counting from 1, optionally followed by one or more of the flags CMbdfginprZ; m counts
fields from the beginning of the line and n counts characters from the beginning of the field. If any flags are present they override all the global
ordering options for this key. If .n is missing from pos1, it is taken to be 1; if missing from pos2, it is taken to be the end of the field.
If pos2 is missing, it is taken to be the end of the line. The second form specifies a fixed record length reclen, and the last form specifies a fixed field
at byte position position (counting from 1) of length bytes. The obsolescent reclen:fieldlen:offset (byte offset from 0) is also accepted.
- -K, --oldkey=pos
- Specified in pairs: -K pos1 -K pos2, where positions count from 0.
- -R, --record|recfmt=format
- Sets the record format to format; newlines will be treated as normal characters. The formats are:
- d[terminator]
- Variable length with record terminator character, \n by default.
- [f]reclen
- Fixed record length reclen.
- v[op...]
- Variable length. If op is omitted, the default is h4o0z2bi (4-byte IBM V format descriptor). op may be a
combination of:
- hn
- Header size is n bytes (default 4).
- on
- Size offset in header is n bytes (default 0).
- zn
- Size length is n bytes (default min(h-o,2)).
- b
- Size is big-endian (default).
- l
- Size is little-endian (default b).
- i
- Record length includes header (default).
- n
- Record length does not include header (default i).
- %
- If the record format is not otherwise specified, and any input file name, from left to right, ends with %format
or %format.* then the record format is set to format. In addition, the -o path, if specified and if it does not contain %
and if it names a regular file, is renamed to contain the input %format.
- -
- The first block of the first input file is sampled to check for v variable length and f fixed length format records.
Not all formats are detected. sort exits with an error diagnostic if the record format cannot be determined from the sample.
- -b, --ignorespace
- Ignore leading white space (spaces and tabs) in field comparisons.
- -d, --dictionary
- `Phone directory' order: only letters, digits and white space are significant in string comparisons.
- -C, --codeset|convert=codeset|from:to
- Either the field data is encoded in codeset, or it must be converted from the from
codeset to the to codeset. The codesets are:
- ascii
- 8 bit ascii
- ebcdic
- X/Open ebcdic
- o|ebcdic-o
- mvs OpenEdition ebcdic
- h|ebcdic-h
- ibm OS/400 AS/400 ebcdic
- s|ebcdic-s
- siemens posix-bc ebcdic
- i|ebcdic-i
- X/Open ibm ebcdic (not idempotent)
- m|ebcdic-m
- mvs ebcdic
- u|ebcdic-u
- microfocus cobol ebcdic
- native
- native code set
- -f, --fold|ignorecase
- Fold lower case letters onto upper case.
- -i, --ignorecontrol
- Ignore characters outside the ASCII range 040-0176 (octal, space through ~) in string comparisons.
- -J, --shuffle|jumble=seed
- Do a random shuffle of the sort keys. seed specifies a pseudo random number generator seed. A seed
of 0 generates a seed based on time and pid.
- -n, --numeric
- An initial numeric string, consisting of optional white space, optional sign, and a nonempty string of digits with optional
decimal point, is sorted by value.
- -g, --floating
- Numeric, like -n, with e-style exponents allowed.
- -p, --bcd|packed-decimal
- Compare packed decimal (bcd) numbers with trailing sign.
- -M, --months
- Compare as month names. The first three characters after optional white space are folded to lower case and compared. Invalid
fields compare low to jan.
- -r, --reverse|invert
- Reverse the sense of comparisons.
- -t, --tabs=tab-char
- `Tab character' separating fields is tab-char.
- -c, --check
- Check that the single input file is sorted according to the ordering rules; give no output unless the file is out of sort.
- -j, --processes|nproc|jobs=processes
- Use at most the specified number of separate processes to sort the input. The current implementation still uses
one process for the final merge phase; improvements are planned.
- -m, --merge
- Merge; the input files are already sorted.
- -u, --unique
- Unique. Keep only the first of two lines that compare equal on all keys. Implies -s.
- -s, --stable
- Stable sort. When all keys compare equal, preserve input order.
- -S, --unstable
- Unstable sort. When all keys compare equal, break the tie by using the entire record, ignoring all but the -r option.
This is the default.
- -o, --output=output
- Place output in the designated file instead of on the standard output. This file may be the same as one of
the inputs. The file - names the standard output. The option may appear among the file arguments, except after --.
- -l, --library=library[,name=value...]
- Load the external sort discipline library with optional comma separated name=value
arguments. Libraries are loaded, in left to right order, after the sort method has been initialized.
- -T, --tempdir=tempdir
- Put temporary files in tempdir. The default value is /usr/tmp.
- -L, --list
- List the available sort methods. See the -x option.
- -x, --method=method
- Specify the sort method to apply:
- rasp
- Initial radix split into a forest of splay trees.
- radix
- Radix sort.
- splay
- Splay tree sort.
- verify
- Verify that the input is sorted.
- copy
- Copy (no sort).
- The default value is rasp.
- -v, --verbose
- Trace the sort progress on the standard error.
- -Z, --zd|zoned-decimal
- Compare zoned decimal (ZD) numbers with embedded trailing sign.
- -z, --size|zip=type[size]
- Suggest using the specified number of bytes of internal store to tune performance. The type is a single character
and may be one of:
- a
- Buffer alignment.
- b
- Input reserve buffer size.
- c
- Input chunk size; sort chunks of this size and disable merge.
- i
- Input buffer size.
- m
- Maximum number of intermediate merge files.
- p
- Input sort size; sort chunks of this size before merge.
- o
- Output buffer size.
- r
- Maximum record size.
- I
- Decompress the input if it is compressed.
- O
- gzip(1) compress the output.
- -y, --size=size
- Equivalent to -zisize.
- -X, --test=test
- Enables implementation-defined test code. Some or all of these may be disabled.
- dump
- List detailed information on the option settings.
- io
- List io file paths.
- keys
- List the canonical key for each record.
- read
- Force input file read by disabling memory mapping.
- show
- Show setup information and exit before sorting.
- test
- Immediately exit with status 0; used to verify this implementation.
- -D, --debug=level
- Sets the debug trace level. Higher levels produce more output.
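- A hedged illustration of two of the options listed above, -m and -u, using only behavior described in this page. The file names a and b, the sample data, and the use of mktemp(1) for a scratch directory are assumptions made for the example.

```shell
# Scratch directory; mktemp is assumed to be available.
tmpdir=$(mktemp -d)

# Two inputs that are already sorted numerically.
printf '1\n3\n5\n' > "$tmpdir/a"
printf '2\n3\n4\n' > "$tmpdir/b"

# -m merges already-sorted inputs; -n compares by numeric value.
merged=$(sort -m -n "$tmpdir/a" "$tmpdir/b")

# -u keeps only the first of the lines that compare equal on all keys.
unique=$(sort -m -n -u "$tmpdir/a" "$tmpdir/b")

rm -rf "$tmpdir"
printf '%s\n' "$merged"
```

Here merged keeps both copies of 3, while unique keeps one.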
- +pos1 -pos2 is the classical alternative to -k, with counting from 0 instead of 1, and pos2 designating next-after-last instead of last
character of the key. A missing character count in pos2 means 0, which in turn excludes any -t tab character from the end of the key. Thus +1 -1.3 is
the same as -k 2,2.3 and +1r -3 is the same as -k 2r,3.
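- The key -k 2,2.3 from the equivalence above can be sketched as follows. Only the -k form is executed, because some other sort implementations reject the classical +pos1 -pos2 form. The sample data, the choice of : as the field separator, and LC_ALL=C are assumptions made for the example.

```shell
# Key: field 2, from its first character through its third character.
# The classical spelling of the same key would be: +1 -1.3
# The keys here are "abz", "aaa" and "abc"; the rest of each line is ignored.
by_key=$(printf 'x:abz9:1\nz:aaa7:3\ny:abc1:2\n' | LC_ALL=C sort -t: -k 2,2.3)

printf '%s\n' "$by_key"
```

The lines come out ordered by the three-character key (aaa, abc, abz), which here is the reverse of whole-line order.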
- Under option -tx fields are strings separated by x; otherwise fields are non-empty strings separated by white space. White space before a
field is part of the field, except under option -b. A b flag may be attached independently to pos1 and pos2.
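- A small sketch of the field rules above, under assumed sample data and with LC_ALL=C added so that other implementations also compare by bytes. Without -t, the white space before a field belongs to the field, so a key starting at field 2 begins with blanks unless the b flag is attached. With -t, fields may be empty.

```shell
# Without b: the keys are "  zebra" and " apple"; the second byte
# (space versus 'a') puts the zebra line first.
without_b=$(printf 'b  zebra\na apple\n' | LC_ALL=C sort -k 2)

# With the b flag on pos1, leading white space is skipped: "apple" < "zebra".
with_b=$(printf 'b  zebra\na apple\n' | LC_ALL=C sort -k 2b)

# With -t: the second field of the first line is empty and sorts first.
with_t=$(printf 'b::1\na:x:2\n' | LC_ALL=C sort -t: -k 2,2)

printf '%s\n' "$without_b" "$with_b" "$with_t"
```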
- When there are multiple sort keys, later keys are compared only after all earlier keys compare equal. Except under option -s, lines with all keys equal
are ordered with all bytes significant. -S turns off -s; the last occurrence, left to right, takes effect.
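- The tie-breaking rules above can be sketched with a small name,age data set. The data, the variable names, and LC_ALL=C are assumptions made for the example. Three cases are shown: the default tie-break on all bytes, a stable sort with -s, and a second key that is consulted only when the first compares equal.

```shell
input='carol,30
alice,25
bob,30
dave,25'

# One numeric key on field 2; ties are broken by comparing whole lines.
default_ties=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k 2,2n)

# -s: ties keep their input order (carol before bob).
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -t, -k 2,2n)

# Two keys: age ascending, then name in reverse order within equal ages.
two_keys=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k 2,2n -k 1,1r)

printf '%s\n' "$default_ties" "$stable" "$two_keys"
```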
- Sorting is done by a method determined by the -x option. -L lists the available methods. rasp (radix+splay-tree) is the default and
currently the best all-around method.
- Single-letter options may be combined into a single string, such as -cnrt:. The option combination -di and the combination of -n with any
of -diM are improper. POSIX argument conventions are supported.
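- As an illustration of combining single-letter options, the sketch below writes the same request two ways. The sample data and variable names are assumptions. In the combined string the final option, t, takes the rest of the string (:) as its argument.

```shell
# Combined form: numeric, reversed, fields separated by ':'.
combined=$(printf 'b:9\nc:2\na:10\n' | sort -nrt: -k 2,2)

# The same request with each option written separately.
separate=$(printf 'b:9\nc:2\na:10\n' | sort -n -r -t : -k 2,2)

printf '%s\n' "$combined"
```

Both forms should order the lines by the second field, largest value first.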
- Options -b, -c, -d, -f, -i, -k, -m, -n, -o, -r, -t, and -u are in the POSIX
and/or X/Open standards.
-
- sort writes a diagnostic and exits with non-zero status for various trouble conditions and for disorder discovered under option -c.
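- A minimal sketch of using the exit status under -c in a script. The sample data and variable names are assumptions, and the diagnostic for the unsorted input is discarded only to keep the example quiet.

```shell
# Sorted input: -c gives no output and exits with status 0.
if printf '1\n2\n3\n' | sort -c -n; then
    in_order=yes
else
    in_order=no
fi

# Unsorted input: -c exits with non-zero status.
if printf '2\n1\n3\n' | sort -c -n 2>/dev/null; then
    disorder=no
else
    disorder=yes
fi

printf 'in_order=%s disorder=%s\n' "$in_order" "$disorder"
```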
-
- comm(1), join(1), uniq(1),
recsort(3)
-
- The never-documented default pos1=0 for cases such as sort -1 has been abolished. An input file overwritten by -o is not replaced until
the entire output file is generated in the same directory as the input, at which point the input is renamed.
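- Because the input is not replaced until the output is complete, a file can be sorted in place with -o. The sketch below assumes mktemp(1) is available and uses a hypothetical file name, data.

```shell
# Scratch directory; mktemp is assumed to be available.
tmpdir=$(mktemp -d)
printf '3\n1\n2\n' > "$tmpdir/data"

# -o may name one of the inputs. By contrast, a shell redirection such as
#   sort -n data > data
# would truncate the file before sort reads it.
sort -n -o "$tmpdir/data" "$tmpdir/data"

result=$(cat "$tmpdir/data")
rm -rf "$tmpdir"
printf '%s\n' "$result"
```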
-
- version
- sort (AT&T Research) 2007-09-05
- author
- Glenn Fowler <gsf@research.att.com>
- author
- Phong Vo <kpv@research.att.com>
- author
- Doug McIlroy <doug@research.bell-labs.com>
- copyright
- Copyright © 1996-2008 AT&T Intellectual Property
- license
- http://www.opensource.org/licenses/cpl1.0.txt