TCOLS v2.10 - a table column filter |
Documentation revised 21 Oct 00 - Copyright (c) 1996-2000 by Rune Berg. TextTools Freeware. |
Usage |
tcols [log logfile] [options] [from infile] [to outfile] expression [...]
See Understanding The Usage Section for details.
Description |
tcols is a filter for projecting and transforming data columns in text files.
tcols runs from the command line or from batch files.
Input and output data are plain ASCII text lines, each line being treated as whitespace-separated (but see -iC and -csvi options) fields. Files are typically used for input and output data.
For example, consider a text file "data", containing the following table (3 columns, 4 lines):
john 45 tennis
al 31 squash
tom 25 beer
paul 38 women
The command:
tcols from data $3 $2
writes the third and second columns, separated by a tab (but see -o and -csvo options), to the screen:
tennis  45
squash  31
beer    25
women   38
What made tcols write just that was the expressions, $3 and $2, specified on the command line.
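To make the effect of the expressions $3 and $2 concrete, here is a small Python sketch of the default behaviour described above (whitespace-separated input, tab-separated output). The function name project is made up for this illustration. It is not how tcols is implemented, and it ignores quoting and the other parsing rules described later.

```python
# Sketch only: mimics "tcols from data $3 $2" for simple, unquoted input.
def project(line, columns, sep="\t"):
    """Return the given 1-based columns of a whitespace-separated line."""
    fields = line.split()  # one or more blanks or tabs separate the fields
    return sep.join(fields[c - 1] for c in columns)


data = ["john 45 tennis", "al 31 squash", "tom 25 beer", "paul 38 women"]
for row in data:
    print(project(row, [3, 2]))
```

Each printed line holds the third field, a tab, and the second field of the corresponding input line.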
Here's another example, using the same file "data". The command:
tcols from data to results $1 /loves/ $3.upp
writes the following to the file "results":
john    loves   TENNIS
al      loves   SQUASH
tom     loves   BEER
paul    loves   WOMEN
The last example shows a function (upp) being applied to an expression. tcols has functions for string manipulation, formatting, decimal/hex/octal conversions, maths, and a few other things.
The above examples show only a few of tcols's capabilities, so read the next sections for a full description.
Note: All usage examples in this document are for tcols running on a Win95/98/NT/2000 command prompt. Running tcols on a Unix(-like) shell requires quoting appropriate for the particular shell.
Options |
tcols recognizes the following command line options:
Option | Function |
---|---|
-iC | Separate fields in infile by character C (except \). Use \t to form a tab. |
-csvi | Do CSV (comma separated values) style parsing of input fields from infile. Unless the -iC option is given, use a comma as the field separator. |
-oS | Separate output fields by string S, instead of the default tab character. Use \t to form a tab, \\ to form a backslash. -o recognizes no other escaped characters. |
-csvo | Print output fields CSV (comma separated values) style. Unless the -oS option is given, use a comma as the field separator. |
-fppN | Use floating-point precision N (0..15, default 6) decimal digits for comparisons/output. See separate discussion on floating point numbers for more details. |
-fpfF | Use floating point format F for output and internal representation. F must be one of: |
-w | Do not abort on a processing error, just skip the bad line and write a warning to standard error (or logfile, if used). See Errors During Processing section for more details. Note: too long input lines will cause a program exit even if this option is used. |
-r | Print a one-line report to standard error (or logfile, if given) after processing. This option has no effect if processing is aborted due to an error. |
-D | Diagnostic mode. Prints, to standard output, information about parsed command line arguments (file names, options, expressions) and per-input-line processing. Useful for debugging expressions, or to get an idea of how tcols works internally. Note: regular output should be sent to a file to avoid it being mixed up with diagnostics information. |
-v | Print version banner and usage info to standard error (or logfile, if given), then exit. |
-he | Print summary of expression usage to standard error (or logfile, if given), then exit. |
-hfc | Print summary of character functions to standard error (or logfile, if given), then exit. |
-hfs | Print summary of string functions to standard error (or logfile, if given), then exit. |
-hfp | Print summary of pattern functions to standard error (or logfile, if given), then exit. |
-hfv | Print summary of conversion functions to standard error (or logfile, if given), then exit. |
-hfm | Print summary of maths functions to standard error (or logfile, if given), then exit. |
-hfx | Print summary of miscellaneous functions to standard error (or logfile, if given), then exit. |
-hf name | Print summary of named function to standard error (or logfile, if given), then exit. |
-hp | Print summary of pattern usage to standard error (or logfile, if given), then exit. |
See prf and rnd functions for how to override the -fpp and -fpf options on a per expression basis.
Here are some examples of how to use the -o option:
Input Data : Fields and Separators |
The input data to tcols is ordinary ASCII text lines.
tcols sees each line as consisting of zero or more fields, denoted $1,
$2, ...
The following sections describe tcols's ways of breaking an input
line into fields, i.e. tcols's parsing styles.
Input fields separated by whitespace (default) |
(Unless you invoke tcols with the -iC option or the -csvi option, this is the effective style.)
tcols sees each field as separated by at least one tab or blank, e.g.:
john 37 butcher (end-of-line)
<--> <> <----->
$1   $2 $3
Trailing whitespace does not result in a last, empty field. This is somewhat inconsistent with the behaviour when the -iC option is used.
If an input line has no fields (i.e., consists of whitespace only), then tcols will write an empty line to the output, without evaluating the expression(s).
Remarks:
If tcols finds an unmatched quote on an input line, then tcols reads that quote and the rest of the line as one field. For example:
12 5654 'I feel good 8899 (newline)
<> <--> <------------------------>
$1 $2   $3
When tcols reads a quoted field from the input, tcols considers the surrounding quotes part of the field. Use the suqt and duqt functions if you want to get rid of such quotes.
Input fields are comma-separated values (CSV) |
If you invoke tcols with the -csvi option, tcols treats input data as standard comma-separated values (CSV) as exported by e.g. MS Excel.
CSV data obeys the following rules:
As an example:
tcols -csvi
would see the following input line:
Eastwood,"The Good, the Bad, and the Ugly",1969,"""Talk to me, Blondie!""",Western
like this:
$1 = Eastwood
$2 = The Good, the Bad, and the Ugly
$3 = 1969
$4 = "Talk to me, Blondie!"
$5 = Western
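For comparison, the same line can be run through the csv module of Python's standard library, which follows the same common CSV conventions (quoted fields may contain commas, and a doubled double quote inside a quoted field stands for one double quote). This is only a cross-check using a different tool; it says nothing about how tcols parses CSV internally.

```python
import csv

# The example line from the text above.
line = 'Eastwood,"The Good, the Bad, and the Ugly",1969,"""Talk to me, Blondie!""",Western'

# csv.reader accepts any iterable of lines; we feed it a one-element list.
fields = next(csv.reader([line]))

for number, value in enumerate(fields, start=1):
    print(f"${number} = {value}")
```

The loop prints one "$N = value" line per field, in the same layout as the listing above.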
Input fields separated by a user-defined character |
If you invoke tcols with the -iC option (but not also the -csvi option), tcols uses the character C to separate the fields on an input line. In this discussion, we'll consider input fields separated by semicolons.
Note that this style (which you might consider TextTools specific) is not suitable for standard comma-separated values (CSV) as exported by e.g. MS Excel. Handling CSV data is described in the previous section.
As an example, this is how tcols -i; would see the following input line:
Al; 42; shoe salesman;married;2; Dodge (newline)
<> <---> <------------> <-----> - <------>
$1 $2    $3             $4      $5 $6
Any text between the start of the line and the first semicolon, between two semicolons, or between the last semicolon and the end of the line, constitutes a field.
Two semicolons right next to each other constitute an empty field; this is perfectly legal.
If an input line consists of whitespace only, then tcols will write an empty line to the output, without evaluating the expression(s). Otherwise, whitespace has no special significance when you're using the -iC option.
Remarks:
Make sure your input data does not have unwanted spaces at the end of lines.
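The field rules for this style can be approximated in Python with a plain split on the separator. This is a simplified sketch: the name split_fields is invented here, and backslash handling (which the comparison table in the next section lists as significant for this style) is deliberately left out.

```python
# Sketch only: approximates "tcols -i;" field splitting, without backslash escapes.
def split_fields(line, sep=";"):
    """Split a line on every occurrence of sep; whitespace is kept as-is."""
    return line.split(sep)


print(split_fields("Al; 42; shoe salesman;married;2; Dodge"))
print(split_fields("a;;b"))   # two adjacent separators give an empty field
print(split_fields("a;b "))   # a trailing blank becomes part of the last field
```

The last call illustrates the remark above: a space at the end of a line ends up inside the last field.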
(Let us step aside...) |
... one minute to compare and discuss the three field/separator styles discussed in the previous three sections. (And please do remember, this applies to input fields. See separate section on corresponding issues for output fields.)
First of all, let me be the first to admit that the 'User-defined separator' style is poorly thought out w.r.t. the 'whitespace separated style'. There are just too many needless inconsistencies between these two styles. For reasons of backward compatibility, however, I cannot now clean up the behaviour of the 'User-defined separator' style. I recommend that you don't use the 'User-defined separator' style if you can in any way avoid it. Instead, use the CSV style.
That said, the table below should help you on the 'finer points' of input field separation:
Aspect: \ Style: | Whitespace sep'd | CSV | User-def'd sep. |
---|---|---|---|
Separator | One or more whitespace | , (or C given in the -iC option) | C given in the -iC option |
Single quotes significant | YES [1] | no | no |
Double quotes significant | YES [1] | YES [1,5] | no |
Backslash significant | YES [2] | no | YES [3] |
Surrounding quotes automatically stripped | no | YES [4] | no (n/a) |
Escaping backslashes automatically stripped | no | no (n/a) | YES |
Escaping double quote automatically stripped | no (n/a) | YES | no (n/a) |
Input fields start at fixed positions and are of fixed length |
To make tcols deal with fields starting at fixed character positions and being of fixed lengths, use the $r (raw line) expression and the subs function.
For example, the command:
tcols -o
where the file "alpha.txt" contains the line:
abcdefghijklmnopqrstuvwxyz
yields the following output:
abcdefgh
Output Data : Fields and Separators |
The output data from tcols is ordinary ASCII text lines, normally one output line per input line.
The contents of the output lines are determined by the contents of the input lines and the expressions given on the command line when invoking tcols.
The following sections describe tcols's ways of printing output fields (i.e. the results of evaluating the expressions).
Output fields separated by a tab (default) |
(Unless you invoke tcols with the -oS option or the -csvo option, this is the effective style.)
This style really is as simple as the heading says: tcols prints expression results separated by a tab character.
This style provides no 'smartness' (automatic quoting or backslashing) for when the output field values themselves contain tabs. (The resc, sqt, and dqt functions are useful here, though.)
Output fields are comma-separated values (CSV) |
If you invoke tcols with the -csvo option, tcols prints output fields as standard comma-separated values (CSV), described in the Input fields are comma-separated values (CSV) section.
This style provides automatic quoting of fields that contain the separator or quotes.
By default, the CSV output style uses a comma as the output separator, but you can select another one-character output separator using the -oC option.
Output fields separated by a user-defined character(s) |
If you invoke tcols with the -oS option (but not also the -csvo option), tcols uses the string S to separate the output fields.
This style provides no 'smartness' (automatic quoting or backslashing) for when the output field values themselves contain the separator. (The resc, sqt, and dqt functions are useful here, though.) This is not consistent with the corresponding input style, which does provide automatic stripping of escaping backslashes.
This style allows a multi-character or empty output separator.
(Let us step aside...) |
... one minute to compare and discuss the three field/separator styles discussed in the previous three sections. (And please do remember, this applies to output fields. See separate section on corresponding issues for input fields.)
The table below should help you on the 'finer points' of output field separation:
Aspect: \ Style: | Tab sep'd | CSV | User-def'd sep. |
---|---|---|---|
Separator | tab | , (or C given in the -oC option) | C given in the -oC option |
Auto. insertion of surrounding quotes | no | YES [1] | no |
Auto. insertion of escaping backslashes | no | no | no |
Output fields start at fixed positions and are of fixed length |
To make tcols output fields starting at fixed character positions and being of fixed lengths, use an empty output separator and the ljf, rjf, and prf functions.
For example, consider the input file "data.txt":
Linux free 55%
NT $200 20%
Solaris $2 2%
To print this as three fixed-length columns, run the command:
tcols -o from data.txt $1.ljf(10) $2.rjf(5) $3.rjf(4)
To get the following output:
Linux      free 55%
NT         $200 20%
Solaris      $2  2%
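The same column layout can be reproduced in Python with str.ljust and str.rjust, which, like ljf and rjf as described in the Function Library, pad to at least the given width and never truncate. The function name fixed_width is invented for this sketch; it only illustrates the layout arithmetic (10 + 5 + 4 characters, no separator).

```python
# Sketch only: the layout produced by $1.ljf(10) $2.rjf(5) $3.rjf(4) with an empty separator.
def fixed_width(name, price, share):
    """Left-justify name in 10 columns, right-justify price in 5 and share in 4."""
    return name.ljust(10) + price.rjust(5) + share.rjust(4)


rows = [("Linux", "free", "55%"), ("NT", "$200", "20%"), ("Solaris", "$2", "2%")]
for row in rows:
    print(fixed_width(*row))
```

Every output line is 19 characters long, so the columns line up regardless of the length of the individual values.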
Expressions |
Expressions specify how tcols should map input data to output data. tcols applies the expressions to each input line in turn, producing a corresponding output line. The only exception is empty input lines (lines that contain only whitespace); they result in an empty line on the output, without being evaluated.
Syntax errors in expressions will cause tcols to exit with an appropriate error message, before any processing.
An expression should not contain spaces, except in string literals (in which case the whole expression must be surrounded by double quotes, e.g.: "/hi there/".)
The Expression Syntax section describes the exact grammar.
Basic Expressions |
This section describes tcols's basic expressions. They are best illustrated by examples:
Expression | Yields |
---|---|
$3 | the third field on the input line |
$1..4 | the first ... fourth field on the input line |
$l | the last field on the input line |
$2..l | the second ... last field on the input line |
$c | the count of fields on the input line |
$r | the entire input line, whitespace and all |
$n | the current data line number (starting at 1) |
$e(/V/) | the value of environment variable V |
532 | the literal integer 532 |
72.184 | the literal floating point number 72.184 |
/hey/ | the literal string hey |
Some notes on basic expressions:
Example: printing 3rd and 5th fields separated by just a colon:
tcols -o: from myfile $3 $5
Example: swapping the 3rd and 8th columns in a 12-column table:
tcols from myfile $1..2 $8 $4..7 $3 $9..12
Expressions with String Operators |
This section describes how to use tcols's string operator: #.
The concatenation operator # simply tacks the result of one expression onto the result of another, as shown by these examples:
Expression | Yields |
---|---|
$1#$2 | the second field concatenated onto the first field |
$1#$2#$3 | the third field concatenated onto the second field concatenated onto the first field |
$1..3#/G/ | a 'G' concatenated onto the end of the first, second and third fields |
For example, the expressions:
$1#$2 $1#$2#$3 $1..3#/G/
applied to the input line:
a b c
yields the following output:
ab abc aG bG cG
Note that the concatenation operator has higher precedence than the arithmetic operators (described below).
Expressions with Arithmetic |
This section describes how to use tcols's arithmetic operators: + - * / %.
All these operators work on integers. All except % work on floating point numbers. An operation involving only integers gives an integer result. An operation involving a floating point number gives a floating point result.
The use of arithmetic operators is best shown by examples:
Expression | Yields |
---|---|
$1+$2 | the sum of the first and the second fields |
100-$6 | the difference between 100 and the sixth field |
$1*$2 | the product of the first and second fields |
$3/2.5 | the third field divided by 2.5 |
$1%10 | the remainder of (the first field divided by 10) (see note 1) |
-$2 | the second field negated (see note 2) |
1: When invoking tcols from a batch file, you need an extra % to prevent the Win95/98/NT command line interpreter from treating %10 as the 10th command line argument: $1%%10
2: If you invoke tcols to use standard input/output, and the first expression starts with a '-', then put that first expression in brackets, e.g. (-$2), so tcols won't think it's a command line option.
Shortcuts are possible. For example, the expression:
($2,$3,$1)*10
applied to the input line:
1 2.7 3
yields the following output:
27.000000 30 10
(Note default floating point precision of 6 digits.)
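The integer/floating point rule and the default precision can be mimicked in a few lines of Python. This is a sketch under simplifying assumptions: parse_number and show are names invented here, and only the behaviour stated above is modelled (integer operands give an integer result, a floating point operand gives a floating point result printed with 6 decimals).

```python
# Sketch only: ($2,$3,$1)*10 applied to the line "1 2.7 3".
def parse_number(text):
    """Read a field as an integer if possible, otherwise as a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def show(value, precision=6):
    """Print floats with a fixed number of decimals, integers unchanged."""
    if isinstance(value, float):
        return f"{value:.{precision}f}"
    return str(value)


fields = "1 2.7 3".split()
results = [parse_number(fields[i - 1]) * 10 for i in (2, 3, 1)]
print("\t".join(show(r) for r in results))
```

Only the field containing a decimal point turns into a floating point result; the two integer fields stay integers.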
Note that the right hand side of + - * / and % must evaluate to exactly one number.
Unary - (minus) has the highest precedence, so the following are equivalent:
-$2-4
(-$2)-4
* / and % have equal, and next highest precedence. They're evaluated left to right, so the following are equivalent:
$2*$4/$2
($2*$4)/$2
+ and binary - have the lowest precedence, and are evaluated left to right, so the following are equivalent:
$1-$2+$5*7
($1-$2)+($5*7)
Parentheses ( ) can be used to override precedence:
($1+$2)*100
Expressions with Function Calls |
This section describes how to form expressions with function calls.
A function call has one of the forms:
expression.functionname
expression.functionname(arguments)
Here are some example function calls:
Expression | Yields |
---|---|
$1.suqt | first field with surrounding single quotes removed |
$2.clip(3,5) | second field with 3 leftmost and 5 rightmost characters clipped off |
$1..5.rjf(8) | first .. fifth fields right justified in fields (sorry!) of 8 spaces |
As a shortcut, expressions can be grouped with ( ) and then fed to a function:
Expression | Yields |
---|---|
($1,$3,$4,$8).suqt | said fields stripped of surrounding single quotes |
This saves you from writing:
Expression | Yields |
---|---|
$1.suqt $3.suqt $4.suqt $8.suqt | ditto |
Some functions are only meaningful when applied to several expressions:
Expression | Yields |
---|---|
($1,$4,$7).cat | concatenation of the first, fourth, and seventh fields |
Function calls can be chained:
Expression | Yields |
---|---|
$r.subs(1,10).upp | first 10 characters in upper case |
$1..l.sum.rjf(10).padl(0) | sum of all fields, right justified in field of 10 characters padded with 0's |
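To show what a chain such as $1..l.sum.rjf(10).padl(0) amounts to, here is a Python sketch that performs the three steps one after another. The function name is invented for this illustration, and the sketch only handles integer fields.

```python
# Sketch only: the chain $1..l.sum.rjf(10).padl(0), step by step, for integer fields.
def sum_rjf_padl(line, width=10, pad="0"):
    total = sum(int(field) for field in line.split())    # .sum
    text = str(total).rjust(width)                        # .rjf(10)
    stripped = text.lstrip(" ")
    return pad * (len(text) - len(stripped)) + stripped   # .padl(0)


print(sum_rjf_padl("1 2 3"))
print(sum_rjf_padl("100 250 7"))
```

Each function in the chain receives the result of the one before it, which is why the order of the calls matters.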
Any expression can be used as a function argument:
Expression | Yields |
---|---|
$3.rig($1.len) | N rightmost characters of the third field, where N is the length of the first field |
If a function is given the wrong number of arguments, or the wrong type of arguments, tcols will print an error message to standard error (or logfile, if used) and exit. However, if you use the -w command line option, tcols will skip the offending input line, print a warning to standard error (or logfile, if used), and continue processing the next input line; see the Errors During Processing section.
Note: Due to the introduction of floating point support in version 2.00, you can no longer apply a function directly to a literal integer, as in 33.sqt, because tcols will consider the '33.' part a floating point number. Instead, write e.g. (33).sqt or /33/.sqt.
The Function Library section describes all functions and their required arguments.
Errors During Processing |
A processing error occurs if the contents of an input line prevent tcols from evaluating your expressions.
tcols's default error action is to print a relevant error message and exit.
However, if you set the -w command line option, tcols will skip the bad input line and continue processing the next input line. tcols prints a warning anyway.
tcols prints error messages and warnings to standard error (or the logfile, if used).
Here are some typical processing errors:
tcols is rather strict about input data. For example, the sum function will only work on integer and floating point arguments, even though I could have made it ignore non-numeric arguments. My reasoning is: tcols will often be used for processing hand-typed data. Typists sometimes hit the wrong keys. If tcols were lax about bad input data, it might quietly produce bad output data.
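The difference between the default error action and the -w option can be sketched in Python as follows. All names (strict_sum, process, warn_only) are invented for this illustration, and the warning text is not what tcols actually prints.

```python
import sys


# Sketch only: a strict sum that refuses non-numeric fields.
def strict_sum(fields):
    total = 0
    for field in fields:
        try:
            total += int(field)
        except ValueError:
            total += float(field)  # raises ValueError if the field is not a number
    return total


def process(lines, warn_only=False, log=sys.stderr):
    """Sum each line; abort on bad data, or skip the line when warn_only is set."""
    totals = []
    for number, line in enumerate(lines, start=1):
        try:
            totals.append(strict_sum(line.split()))
        except ValueError:
            if not warn_only:
                raise                      # default: stop at the first bad line
            print(f"warning: line {number} skipped", file=log)
    return totals


print(process(["1 2 3", "4 o 5", "6 7.5"], warn_only=True))
```

With warn_only set, the line containing the mistyped letter "o" is reported and skipped instead of silently contributing a wrong value.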
More Examples |
This section gives more examples of complete tcols commands.
These examples start with the file "books" which contains:
Poe 'Edgar Allen' "Selected Stories" 1879 horror
Thompson Jim "The Killer Inside Me" 1950 crime
Lem Stanislaw "Return From the Stars" 1961 sf
Crumley James "Dancing Bear" 1983 crime
'Le Carre' John "Smiley's People" 1972 spy
Now, this file looks a bit messy. You want to reformat it to look cleaner, with first names and surnames together, no single quotes around the names, and no year of publication. The command:
tcols -o from books to books2 "($1.suqt,/, /,$2.suqt).cat.ljf(20)" $3.ljf(25) $5
prints the following to "books2":
Poe, Edgar Allen    "Selected Stories"       horror
Thompson, Jim       "The Killer Inside Me"   crime
Lem, Stanislaw      "Return From the Stars"  sf
Crumley, James      "Dancing Bear"           crime
Le Carre, John      "Smiley's People"        spy
All right. To ease future processing, you want your book list in a field-oriented format. The command:
tcols -o from books2 to books3 $r.subs(1,16).trt.dqt.ljf(20) $r.subs(21,43).trt.ljf(25)
prints the following to "books3":
"Poe, Edgar Allen"  "Selected Stories"       horror
"Thompson, Jim"     "The Killer Inside Me"   crime
"Lem, Stanislaw"    "Return From the Stars"  sf
"Crumley, James"    "Dancing Bear"           crime
"Le Carre, John"    "Smiley's People"        spy
Now, you can use another TextTools program, trows, to print all your crime books. The command:
trows from books3 $3=/crime/
prints to the screen:
"Thompson, Jim"     "The Killer Inside Me"   crime
"Crumley, James"    "Dancing Bear"           crime
Or, you can sort your books on author name, using yet another TextTools program: tsort. The command:
tsort from books3 $1
prints to the screen:
"Crumley, James"    "Dancing Bear"           crime
"Le Carre, John"    "Smiley's People"        spy
"Lem, Stanislaw"    "Return From the Stars"  sf
"Poe, Edgar Allen"  "Selected Stories"       horror
"Thompson, Jim"     "The Killer Inside Me"   crime
Expression Syntax |
This section defines the exact tcols expression syntax rules.
Note: The spaces used in these rules are for clarity; spaces are not allowed in actual expressions (except to denote a space in a literal string).
expr           ::= list
list           ::= arit , list
                 | arit
arit           ::= arit + term
                 | arit - term
                 | term
term           ::= term * neg
                 | term / neg
                 | neg
neg            ::= - neg
                 | concat
concat         ::= concat # call
                 | call
call           ::= call . funcname ( list )
                 | call . funcname
                 | simple
simple         ::= $ integer               ; integer must be >= 1
                 | $ integer .. integer    ; integers must be >= 1, first <= second
                 | $ integer .. l          ; integer must be >= 1
                 | $ l
                 | $ c
                 | $ r
                 | $ n
                 | $ e ( list )            ; list should eval. to one string
                 | integer
                 | floating-point
                 | / string /
                 | ( list )
integer        ::= [+|-][0-9]+
floating-point ::= [+|-][0-9]*.[0-9]*
                 | [+|-][0-9]+.[0-9]*e[+|-][0-9]*    ; scientific notation
                 | [+|-][0-9]*.[0-9]+e[+|-][0-9]*    ; scientific notation
string         ::= one or more printable characters, but use \/ for forward-slash, \\ for backslash
Function Library |
This section describes all tcols's functions.
E, E1, etc., in this discussion denote expressions as far as syntax is concerned, and the results of evaluating those expressions as far as evaluation is concerned.
Character Functions |
E.cc(s,t) yields E with any characters in s changed into the single or (by relative position) corresponding character in t.
s must be at least one character long.
t must be exactly one character long or the same length as s.
For example:
tcols $1.cc(/def/,/DEF/)
applied to the input lines:
abcdefghi
aaeejjff
yields the following output lines:
abcDEFghi
aaEEjjFF
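The mapping rule for cc (a single replacement character, or one replacement per character of s by position) corresponds closely to Python's str.translate. The following sketch assumes nothing beyond the description above; the Python function is merely named after the tcols function it imitates.

```python
# Sketch only: E.cc(s,t) expressed with str.maketrans / str.translate.
def cc(text, s, t):
    if not s:
        raise ValueError("s must be at least one character long")
    if len(t) == 1:
        t = t * len(s)          # one target character serves every character in s
    if len(t) != len(s):
        raise ValueError("t must be one character long or as long as s")
    return text.translate(str.maketrans(s, t))


print(cc("abcdefghi", "def", "DEF"))
print(cc("aaeejjff", "def", "DEF"))
```

With a one-character t, for example cc("a-b_c", "-_", " "), every character of s is mapped to that single character.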
E.ccl(s,t) yields E with leading characters in s changed into the single or (by relative position) corresponding character in t.
s must be at least one character long.
t must be exactly one character long or the same length as s.
For example:
tcols $1.ccl(/*/,/-/)
applied to the input line:
****::::****::**
yields the following output line:
----::::****::**
E.cco(s,t,m,n) yields E with the mth..nth occurrences of the character(s) in s changed into the single or (by relative position) corresponding character in t.
s must be at least one character long.
t must be exactly one character long or the same length as s.
m and n must be integers >= 1, with m <= n.
For example:
tcols $1.cco(/xy/,/./,3,6)
applied to the input lines:
nnxxxxxxxxxxxxx
nxxxxyyyy
yields the following output lines:
nnxx....xxxxxxx
nxx....yy
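The second example line shows that occurrences are counted across all characters of s together (the x's and y's share one counter). A Python sketch of that counting rule, written only from the description and the two examples above:

```python
# Sketch only: E.cco(s,t,m,n), counting occurrences of any character in s jointly.
def cco(text, s, t, m, n):
    if len(t) == 1:
        t = t * len(s)
    mapping = dict(zip(s, t))
    seen = 0
    out = []
    for ch in text:
        if ch in mapping:
            seen += 1
            if m <= seen <= n:      # only the mth..nth occurrences are changed
                ch = mapping[ch]
        out.append(ch)
    return "".join(out)


print(cco("nnxxxxxxxxxxxxx", "xy", ".", 3, 6))
print(cco("nxxxxyyyy", "xy", ".", 3, 6))
```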
E.ccp(s,t,m,n) yields E with characters at positions m..n equal to the character(s) in s changed into the single or (by relative position) corresponding character in t.
s must be at least one character long.
t must be exactly one character long or the same length as s.
m and n must be integers >= 1, with m <= n.
For example:
tcols $1.ccp(/*/,/-/,2,10)
applied to the input line:
****::::****::**
yields the following output line:
*---::::--**::**
E.cct(s,t) yields E with trailing characters in s changed into the single or (by relative position) corresponding character in t.
s must be at least one character long.
t must be exactly one character long or the same length as s.
For example:
tcols $1.cct(/*/,/-/)
applied to the input line:
****::::****::**
yields the following output line:
****::::****::--
E.ccto(s,t,n) yields E with (up to) n trailing characters in s changed into the single or (by relative position) corresponding character in t.
s must be at least one character long.
t must be exactly one character long or the same length as s.
n must be >= 0.
For example:
tcols $1.ccto(/*/,/-/,3)
applied to the input lines:
****::::****::*****
****::::****::***
****::::****::**
yields the following output lines:
****::::****::**---
****::::****::---
****::::****::--
E.dc(s) yields E with any characters in s deleted.
s must be at least one character long.
For example:
tcols $1.dc(/,/)
applied to the input lines:
9,200
1,000,000
678
yields the following output lines:
9200
1000000
678
E.dcl(s) yields E with leading occurrences of characters in s deleted.
s must be at least one character long.
For example:
tcols $1.dcl(/$/)
applied to the input lines:
7$6
$1000
$$1000$
yields the following output lines:
7$6
1000
1000$
E.dco(s,m,n) yields E with the mth..nth occurrences of any characters in s deleted.
s must be at least one character long.
m and n must be integers >= 1, with m <= n.
For example:
tcols $1.dco(/_/,1,2)
applied to the input lines:
red_brick_house
narrow_valley
bottom_of_deep_well
yields the following output lines:
redbrickhouse
narrowvalley
bottomofdeep_well
E.dcp(s,m,n) yields E with characters at positions m..n equal to the character(s) in s deleted.
s must be at least one character long.
m and n must be integers >= 1, with m <= n.
For example:
tcols $1.dcp(/./,6,10)
applied to the input lines:
aaaaa.....aaaaaa
aaaaab..bbaaaaaa
aaaaa
aaaaabbb
aaaaa.
yields the following output lines:
aaaaaaaaaaa
aaaaabbbaaaaaa
aaaaa
aaaaabbb
aaaaa
E.dct(s) yields E with trailing occurrences of any characters in s deleted.
s must be at least one character long.
For example:
tcols $1.dct(/0123456789/)
applied to the input lines:
A1000
A1000c
B975
yields the following output lines:
A
A1000c
B
E.dcto(s,n) yields E with (up to) n trailing occurrences of characters in s deleted.
s must be at least one character long.
n must be >= 0.
For example:
tcols $1.dcto(/./,3)
applied to the input lines:
xxx.
xxx...
xxx.....
yields the following output lines:
xxx
xxx
xxx..
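The "up to n" rule of dcto can be written as a short loop in Python. This sketch follows the description and the three examples above and makes no claim about the real implementation.

```python
# Sketch only: E.dcto(s,n), deleting at most n trailing characters that occur in s.
def dcto(text, s, n):
    if not s or n < 0:
        raise ValueError("s must be non-empty and n must be >= 0")
    end = len(text)
    while n > 0 and end > 0 and text[end - 1] in s:
        end -= 1
        n -= 1
    return text[:end]


for line in ["xxx.", "xxx...", "xxx....."]:
    print(dcto(line, ".", 3))
```

Dropping the limit n gives the behaviour of dct, which in Python is simply text.rstrip(s).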
E.desc yields E with every:
\' changed to '
\" changed to "
\\ changed to \
\t changed to tab
\n changed to newline
desc changes every \xHH (where HH is exactly two hexadecimal digits) to the corresponding ASCII character.
desc changes every \O (where O is one, two, or three octal digits) to the corresponding ASCII character.
desc makes no other changes. For example, \z is not changed to z.
E.padl(s) yields E with leading blanks replaced by the first character of s.
s must be exactly one character long.
For example:
"/  55/.padl(/0/)" yields: 0055
E.padt(s) yields E with trailing blanks replaced by the first character of s.
s must be exactly one character long.
For example:
"/ok   /.padt(/./)" yields: ok...
E.resc yields E with every:
' changed to \'
" changed to \"
\ changed to \\
tab changed to \t
newline changed to \n
For example, resc applied to:
'ok'   yields: \'ok\'
a"b'   yields: a\"b\'
kh\k   yields: kh\\k
\'\"   yields: \\\'\\\"
(Newlines can only occur as the result of desc applied to a string that contains \n)
E.tr yields E without leading or trailing whitespace.
For example:
"/ aa a /.tr.sqt" yields: 'aa a'
E.trl yields E without leading whitespace.
For example:
"/ aaa/.trl.sqt" yields: 'aaa'
E.trt yields E without trailing whitespace.
For example:
"/aaa /.trt.sqt" yields: 'aaa'
String Functions |
(E1,E2,...).app(s) yields s appended to E1, E2, ...
Useful for appending the same string to several expressions.
For example:
(4,5,6).app(/.00/) yields: 4.00 5.00 6.00
(E1,E2,...).cat yields the concatenation of E1, E2,...
For example:
($2,$3,$1).cat
applied to the input line:
56 john zap
yields:
johnzap56
E.clip(i,j) yields E with the i leftmost and j rightmost characters clipped off.
i and j must be integers greater than or equal to 0.
If the length of E is less than or equal to i + j, then E.clip(i,j) yields the empty string.
For example:
/abcdefg/.clip(2,3) yields: cd
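In Python terms, clip is a slice that drops i characters on the left and j on the right, with the empty string as the result when nothing is left. A minimal sketch based only on the rules above:

```python
# Sketch only: E.clip(i,j) as a Python slice.
def clip(text, i, j):
    if i < 0 or j < 0:
        raise ValueError("i and j must be >= 0")
    if len(text) <= i + j:
        return ""               # everything is clipped away
    return text[i:len(text) - j]


print(clip("abcdefg", 2, 3))
```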
dqt works exactly like sqt, but handles double quotes (").
duqt works exactly like suqt, but handles double quotes (").
E.if(f,g) yields: g if E is equal to f; E if E is not equal to f.
If E and f are both integers, they are compared numerically; otherwise they are compared ASCII-wise.
For example:
$1.if(20,/TWENTY/)
applied to the input lines:
20
67
4
0020
yields the following output lines:
TWENTY
67
4
TWENTY
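The comparison rule (numeric if both sides are integers, character by character otherwise) explains why 0020 matches 20. A Python sketch of that rule follows; the names is_integer and if_ are invented here, and the integer pattern is taken from the Expression Syntax section (an optional sign followed by digits).

```python
import re


# Sketch only: the comparison rule of E.if(f,g).
def is_integer(text):
    """True for an optional sign followed by one or more digits."""
    return re.fullmatch(r"[+-]?[0-9]+", text) is not None


def if_(e, f, g):
    if is_integer(e) and is_integer(f):
        equal = int(e) == int(f)    # numeric comparison: 0020 equals 20
    else:
        equal = e == f              # plain string comparison
    return g if equal else e


for field in ["20", "67", "4", "0020"]:
    print(if_(field, "20", "TWENTY"))
```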
E.ifel(f,g,h) yields: g if E is equal to f; h if E is not equal to f.
If E and f are both integers, they are compared numerically, otherwise they are compared ASCII-wise.
For example:
$1.ifel(20,/TWENTY/,/other/)
applied to the input lines:
20
67
4
+0020
yields the following output lines:
TWENTY
other
other
TWENTY
E.len yields the number of characters in E.
For example:
/mama/.len yields: 4
E.ljf(w) yields E left justified in a field of at least w spaces.
w must be an integer in the range 1..1024.
For example:
(45).ljf(7).sqt yields: '45     '
(45).ljf(2).sqt yields: '45'
(45).ljf(1).sqt yields: '45'
E.low yields E with all letters in lower case.
low does not touch non-letters.
E.nl appends a string containing just a newline to E.
For example, the command:
tcols -o, from myfile $1 $2.nl $3 $4
applied to the file "myfile" containing:
this is line 1
this is line 2
prints the following to the screen:
this,is
line,1
this,is
line,2
Note that tcols does not write the output separator after the newlines.
(E1,E2,...).pre(s) yields s prepended to E1, E2, ...
Useful for prepending the same string to several expressions.
For example:
(2,3,4).pre(/#/) yields: #2 #3 #4
(E1,E2,...).prf(s) yields (the format string) s expanded according to format specifications and Es.
A format specification has the general form ([] denotes an optional item):
#[flags][width][.prec]format
flags are one or more of:
Flag | Effect |
---|---|
- | Print left justified. (Default is to print right justified.) |
blank | Always print leading '-' for negative numbers, and leading ' ' (blank) for positive numbers. Only relevant for use with d, f, g and e formats (see below). |
+ | Always print leading '-' for negative numbers, and leading '+' for positive numbers. Only relevant for use with d, f, g and e formats (see below). |
width is an integer in the range 1..1024; a leading 0 (as in #04) means left-pad an integer or floating point number with 0s.
prec is an integer in the range 0..15 (default 6). Only relevant for use with f, g and e formats (see below).
format is one of:
Format | Effect |
---|---|
s | Print as string. |
d | Print as decimal integer. |
f | Print as floating point number with always prec digits after '.' If prec is 0, neither '.' nor following digits are printed. |
g | Print as floating point number using up to prec digits after '.'. Non-significant zeros are replaced by blanks. If prec is not 0, at least one zero is printed after '.'. If prec is 0, neither '.' nor following digits are printed. |
e | Print as scientific floating point number. |
Notes:
Also, prf replaces every ## in s by #, every \t by a real tab, and every \n by a real newline.
For example:
(/abc/,55,-123).prf(/#-5s:###05d:#7dX/)
yields:
abc  :#00055:   -123X
.....  ..... .......
  5      5      7
There must be enough Es for the format specifiers. Extra Es are ignored.
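For readers who know printf-style formatting, a prf format string with the s, d, f and e formats can be translated to Python's % operator by turning # into % and ## into a literal #. The sketch below does exactly that and nothing more: it does not model the g format as described above (Python's %g behaves differently), the \t and \n replacements, or the rule that extra Es are ignored (Python raises an error instead).

```python
# Sketch only: translate a prf format string to Python %-formatting.
def prf(fmt, *values):
    marker = "\0"                       # temporary stand-in for a literal '#'
    translated = (
        fmt.replace("%", "%%")          # protect any literal percent signs
           .replace("##", marker)       # '##' means a literal '#'
           .replace("#", "%")           # '#' introduces a format specification
           .replace(marker, "#")
    )
    return translated % values


print(prf("#-5s:###05d:#7dX", "abc", 55, -123))
print(prf("#08.3f", 55.6777))
```

The first call corresponds to the example (/abc/,55,-123).prf(/#-5s:###05d:#7dX/) above.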
Here's an example of formatting floating point values. The command:
tcols "-o " $1.prf(/#08.3f/) $2.prf(/#+6.2g/) $3.prf(/#15.4e/)
or, simpler:
tcols "$1..3.prf(/#08.3f #+6.2g #15.4e/)"
applied to the following input data:
1.1 2.20 3.0034
55.6777 -0.0345 0.01
gives the following output:
0001.100  +2.2       3.0034e+00
0055.678  -0.03      1.0000e-02
........ ...... ...............
   8       6          15
Notice especially how the second column is aligned along '.', and only has significant digits.
Bug note: Due to an error in my C compiler's I/O library, combining the blank or + flag, a width indicating leading 0s, and one of the f/g/e formats will sometimes produce an output field that has one leading 0 too many. Currently, I have no solution for this problem. As a workaround, you can reduce the width by one.
E.rep(i) yields E repeated i times into one string.
i must be an integer greater than or equal to 0.
For example:
/x/.rep(5) yields: xxxxx
(/x/,/y/,/ab/).rep(5) yields: xxxxx yyyyy ababababab
E.rev yields E reversed.
For example:
/istanbul/.rev yields: lubnatsi
Note that rev changes \' to '\, etc.
E.rig(i) yields the i last characters of E.
i must be an integer greater than or equal to 0.
If E has fewer than i characters, E.rig(i) yields E.
E.rjf(w) yields E right justified in a field of at least w spaces.
w must be an integer in the range 1..1024.
For example:
45.rjf(7).sqt yields: '     45'
45.rjf(2).sqt yields: '45'
45.rjf(1).sqt yields: '45'
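The string functions rep, rev, rig and rjf described above can be modelled in a few lines of Python. This is a sketch of the documented behaviour only; the Python function names simply mirror the tcols names, and the error handling is an assumption.

```python
def rep(e, i):
    """E repeated i times into one string (i >= 0)."""
    if i < 0:
        raise ValueError("i must be >= 0")
    return e * i


def rev(e):
    """E reversed, character by character."""
    return e[::-1]


def rig(e, i):
    """The i last characters of E, or all of E if it is shorter than i."""
    if i < 0:
        raise ValueError("i must be >= 0")
    return e[len(e) - i:] if i < len(e) else e


def rjf(e, w):
    """E right justified in a field of at least w characters."""
    if not 1 <= w <= 1024:
        raise ValueError("w must be in the range 1..1024")
    return e.rjust(w)


print(repr(rjf("45", 7)))
```

Note that rjf never truncates: a width smaller than the length of E leaves E unchanged, as in the 45.rjf(1) example.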
E.sqt yields E surrounded by single quotes (').
For example, sqt applied to:
hey yields: 'hey'
'hey yields: 'hey'
hey' yields: 'hey'
'hey' yields: 'hey'
' yields: ''
'' yields: ''
hey\' yields: 'hey\''
sqt applied to the empty string yields: ''
E.subs(i,j) yields the i'th ... j'th characters of E.
i and j must be integers greater than or equal to 1.
j must be greater than or equal to i.
If i is greater than the length of E, E.subs(i,j)
yields the empty string.
If j is greater than the length of E, E.subs(i,j)
yields characters i .. length-of-E of E.
For example:
/abcdefgh/.subs(3,6) yields: cdef
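The 1-based, inclusive indexing of subs and its two out-of-range rules can be sketched in Python as follows. The function name mirrors the tcols name; raising ValueError for bad arguments is an assumption of this sketch.

```python
def subs(e, i, j):
    """Characters i..j of E, counting from 1 and including both ends."""
    if i < 1 or j < i:
        raise ValueError("need 1 <= i <= j")
    # Python slices clamp at the end of the string, which gives both
    # documented cases: i past the end yields "", j past the end is cut off.
    return e[i - 1:j]


print(subs("abcdefgh", 3, 6))
```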
E.suqt yields E without surrounding single quotes (').
For example, suqt applied to:
'hey' yields: hey
'hey yields: hey
hey' yields: hey
'' yields the empty string
' yields the empty string
hey\' yields: hey\'
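The sqt and suqt examples are consistent with one simple rule: strip one leading quote and one trailing unescaped quote if present, then (for sqt) wrap the result in quotes. The Python sketch below encodes that rule. The rule is inferred from the examples in this document, not taken from tcols's source, so treat it as an assumption.

```python
def suqt(e):
    """Remove one leading ' and one trailing unescaped ' if present."""
    if e.startswith("'"):
        e = e[1:]
    if e.endswith("'") and not e.endswith("\\'"):
        e = e[:-1]
    return e


def sqt(e):
    """Surround E with single quotes without doubling existing ones."""
    return "'" + suqt(e) + "'"


for text in ["hey", "'hey", "hey'", "'hey'", "'", "''", "hey\\'"]:
    print(text, "->", sqt(text), "/", suqt(text))
```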
E.tag(t) yields <t>E</t>. This allows for a basic way of turning table data into HTML fragments.
Only the initial part of t - up until the first whitespace (if any) - gets written in the closing tag.
For example:
tcols -o from myfile "$1..l.tag(/TD align=center/)"
applied to the file "myfile" containing:
Jan 1999 75000
prints the following to the screen:
<TD align=center>Jan</TD><TD align=center>1999</TD><TD align=center>75000</TD>
Another example, creating an entire table row:
tcols -o from myfile $1..l.tag(/TD/).cat.tag(/TR/)
applied to the file "myfile" containing:
Jan 1999 75000
prints the following to the screen:
<TR><TD>Jan</TD><TD>1999</TD><TD>75000</TD></TR>
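The rule that only the first word of t appears in the closing tag can be modelled like this in Python. The sketch assumes that joining the tagged fields with an empty string corresponds to what -o and cat do in the two commands above.

```python
def tag(e, t):
    """Wrap E as <t>E</name>, where name is t up to the first whitespace."""
    parts = t.split()
    closing = parts[0] if parts else t
    return "<" + t + ">" + e + "</" + closing + ">"


fields = "Jan 1999 75000".split()

# First example: one cell per field, attributes kept only in the opening tag.
print("".join(tag(f, "TD align=center") for f in fields))

# Second example: cells joined into one string, then wrapped in a row.
print(tag("".join(tag(f, "TD") for f in fields), "TR"))
```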
E.upp yields E with all letters in upper case.
upp does not touch non-letters.
Pattern Functions |
See the trows documentation for a description of pattern matching.
Note that all pattern matching works according to the "left-most, maximum munch" principle.
E.cp(p,s) yields E with all occurrences of pattern p changed into string s.
For example, the command:
tcols $r.cp(/[0-9]+/,/NUMBER/)
applied to the input lines:
27 trucks to E23
My age is 31, soon 32
yields the following output lines:
NUMBER trucks to ENUMBER
My age is NUMBER, soon NUMBER
E.cpl(p,s,n) yields E with the n last occurrences of pattern p changed into string s.
n must be an integer >= 0.
For example, the command:
tcols $r.cpl(/[0-9]+/,/NUMBER/,2)
applied to the input lines:
4 77 23 101 56 3
677 2 4
yields the following output lines:
4 77 23 101 NUMBER NUMBER
677 NUMBER NUMBER
E.cpo(p,s,m,n) yields E with the mth..nth occurrences of pattern p changed into string s.
m and n must be integers >= 1, with m <= n.
For example, the command:
tcols $r.cpo(/[0-9]+/,/NUMBER/,2,4)
applied to the input line:
4 77 23 101 56 3
yields the following output line:
4 NUMBER NUMBER NUMBER 56 3
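The three change-pattern functions differ only in which occurrences they touch. The Python sketch below models them with the standard re module. It assumes that the simple patterns used in the examples mean the same in Python's regular expression syntax as in tcols; the full tcols pattern syntax is described in the trows documentation and may differ.

```python
import re


def cpo(e, p, s, m, n):
    """Change the mth..nth occurrences of pattern p in E into s."""
    seen = 0

    def replace(match):
        nonlocal seen
        seen += 1
        return s if m <= seen <= n else match.group(0)

    return re.sub(p, replace, e)


def cp(e, p, s):
    """Change all occurrences of pattern p in E into s."""
    return re.sub(p, lambda _match: s, e)


def cpl(e, p, s, n):
    """Change the n last occurrences of pattern p in E into s."""
    total = sum(1 for _ in re.finditer(p, e))
    # With n == 0 the range is empty, so nothing is changed.
    return cpo(e, p, s, total - n + 1, total)


print(cp("27 trucks to E23", r"[0-9]+", "NUMBER"))
print(cpl("4 77 23 101 56 3", r"[0-9]+", "NUMBER", 2))
print(cpo("4 77 23 101 56 3", r"[0-9]+", "NUMBER", 2, 4))
```

Occurrences are counted from the left after matching each one as long as possible, which is why E23 becomes ENUMBER rather than ENUMBERNUMBER.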
Conversion Functions |
E.d2h yields E in hexadecimal form.
E must be an integer in decimal form.
For example:
256.d2h yields: 100
If E is negative, the number of hexadecimal digits in the result depends on the type of CPU tcols is run on. (tcols uses the C 'long integer' type for internal number representation.)
E.d2o yields E in octal form.
E must be an integer in decimal form.
If E is negative, the number of octal digits in the result depends on the type of CPU tcols is run on. (tcols uses the C 'long integer' type for internal number representation.)
E.h2d yields E in decimal form, possibly preceded by a minus sign.
E must contain only hexadecimal digits (0..9 a..f A..F).
E.o2d yields E in decimal form, possibly preceded by a minus sign.
E must contain only octal digits (0..7).
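The conversions and their dependence on the width of the C long type can be illustrated with the Python sketch below. Several things here are assumptions: a 32-bit long (the constant BITS), two's complement wrap-around for negative values, lower-case hexadecimal digits, and the name d2o for the decimal-to-octal function. On a platform with a 64-bit long the negative cases would produce more digits.

```python
BITS = 32  # assumed width of the C long; the real width depends on the platform
MASK = (1 << BITS) - 1


def _signed(n):
    """Interpret an unsigned BITS-wide value as a signed long."""
    return n - (1 << BITS) if n >= 1 << (BITS - 1) else n


def d2h(e):
    """Decimal text to hexadecimal text."""
    return format(int(e) & MASK, "x")


def d2o(e):
    """Decimal text to octal text."""
    return format(int(e) & MASK, "o")


def h2d(e):
    """Hexadecimal text to decimal text, possibly with a minus sign."""
    return str(_signed(int(e, 16)))


def o2d(e):
    """Octal text to decimal text, possibly with a minus sign."""
    return str(_signed(int(e, 8)))


print(d2h("256"), d2h("-1"), h2d("ffffffff"))
```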
Maths Functions |
E.abs yields the absolute value of E.
E must be an integer or floating point number.
(E1,E2,...).nmax yields the numerically greatest of E1, E2, ..., which must all be integer or floating point numbers.
Note: When used on a mix of integers and floating point numbers, this function will always yield a floating point result.
(E1,E2,...).nmin yields the numerically smallest of E1, E2, ..., which must all be integer or floating point numbers.
Note: When used on a mix of integers and floating point numbers, this function will always yield a floating point result.
E.rnd(n) yields E rounded to n decimal places, in a format obeying the -fpfF option.
E must be a floating point number.
n must be an integer in the range 0..15.
(E1,E2,...).add yields: E1+E2+...
E1, E2, ... must all be integer or floating point numbers.
Note: When used on a mix of integers and floating point numbers, this function will always yield a floating point result.
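The note about mixed integer and floating point arguments amounts to a promotion rule: if any argument is a floating point number, all are treated as such. A minimal Python sketch of that rule follows. The helper names are made up for this example, and the way results are finally printed (which in tcols depends on options such as -fpfF) is not modelled.

```python
def _num(text):
    """Parse text as an integer if possible, otherwise as a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def _promote(values):
    """Convert all values to float if at least one of them is a float."""
    nums = [_num(v) for v in values]
    if any(isinstance(n, float) for n in nums):
        nums = [float(n) for n in nums]
    return nums


def nmax(*es):
    return max(_promote(es))


def nmin(*es):
    return min(_promote(es))


def add(*es):
    return sum(_promote(es))


print(nmax("3", "2.5"), nmin("3", "7"), add("1", "2.5"))
```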
Miscellaneous Functions |
(E1,E2,...).amax yields the greatest of E1, E2, ... when compared as ASCII strings.
For example:
($1,$2,$3).amax
applied to the input line:
lemonade gin port
yields:
port
(E1,E2,...).amin yields the smallest of E1, E2, ... when compared as ASCII strings.
E.dup(i) yields: E i times
i must be an integer greater than or equal to 1.
For example:
$1.dup(3)
applied to the input line:
56
yields:
56 56 56
Another example:
$1..3.dup(2)
applied to the input line:
a b
yields:
a b a b
(E1,E2,...).rng(i,j) yields: Ei ... Ej
i and j must be integers greater than or equal to 1.
i must be within the count of E1,E2,...
j must be greater than or equal to i.
For example:
$1..l.rng(2,4)
applied to the input line:
56 4 11 899 66
yields:
4 11 899
(E1,E2,...).turn yields ... E2 E1
For example:
$1..l.turn
applied to the input line:
56 4 11 899 66
yields:
66 899 11 4 56
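The functions in this group operate on the list of expressions rather than on a single string. The Python sketch below models each one on a list of field strings. It illustrates the documented results only; how rng treats a j beyond the number of expressions is not stated above, so the clamping done by Python slices is an assumption.

```python
def amax(*es):
    """Greatest of the arguments when compared as ASCII strings."""
    return max(es)


def amin(*es):
    """Smallest of the arguments when compared as ASCII strings."""
    return min(es)


def dup(es, i):
    """The whole list of expressions, i times over (i >= 1)."""
    if i < 1:
        raise ValueError("i must be >= 1")
    return list(es) * i


def rng(es, i, j):
    """Expressions i..j, counting from 1 and including both ends."""
    if not 1 <= i <= len(es) or j < i:
        raise ValueError("need 1 <= i <= count and i <= j")
    return list(es[i - 1:j])


def turn(es):
    """The expressions in reverse order."""
    return list(reversed(es))


line = "56 4 11 899 66".split()
print(rng(line, 2, 4), turn(line))
```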
Limitations |
This section describes tcols's limitations. Normally these limitations won't bother you, but anyway, here they are:
tcols will print an error message to standard error (or logfile, if used), if any of the above error situations occurs.
Return Codes |
tcols returns with one of the following codes ("error levels"):
Code | Meaning |
---|---|
0 | Success |
1 | Skipped bad input data (tcols was invoked with -w option) |
101 | Out of memory |
102 | Incorrect/missing command line arguments |
104 | Error opening file |
105 | I/O Error |
106 | Capacity overrun |
107 | File name clash |
109 | Bad input data |
For more details, see TextTools General Features.
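A script that runs tcols may want to turn the return code into a readable message. The Python sketch below does that with a lookup table copied from the list above. The names RETURN_CODES and describe are made up for this example.

```python
# Return codes ("error levels") as listed in the table above.
RETURN_CODES = {
    0: "Success",
    1: "Skipped bad input data (tcols was invoked with -w option)",
    101: "Out of memory",
    102: "Incorrect/missing command line arguments",
    104: "Error opening file",
    105: "I/O Error",
    106: "Capacity overrun",
    107: "File name clash",
    109: "Bad input data",
}


def describe(code):
    """Return the documented meaning of a tcols return code."""
    return RETURN_CODES.get(code, "Unknown return code %d" % code)


print(describe(104))
```

In a real script the code would typically come from the returncode attribute of the result of subprocess.run, assuming tcols is installed and on the path.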
Version History |
These are the released versions of tcols:
Version | Date | Changes |
---|---|---|
1.10 | 25-Feb-96 | n/a |
1.20 | 13-May-96 | |
1.30 | 24-Sep-96 | |
1.31 | 8-Apr-97 | |
1.50 | 21-Jun-97 | |
2.00 | 2-Jan-99 | |
2.10 | 21-Oct-00 | |
End of document |