GREP 7.2 — Reference Manual
Find Regular Expressions in Files
Program Dated 13 Jan 2003 / Document Dated 13 Jan 2003
Copyright © 1986-2003 Stan Brown, Oak Road Systems
Summary: This reference manual gives complete details on treatment of input files, GREP's command-line options, basic and extended regexes, and GREP error and warning messages. Please read the user guide first for an overview of how GREP works.
See also: GREP revision history
PCRE reference for extended regexes
As mentioned in the user guide, you can specify named input files or have GREP read the standard input (possibly with redirection or piping). Redirection and piping are provided by your operating system; this section tells you how to specify named input files.
You can specify named input files two ways: on the
command line and in a list file
referenced with the /@
option. You
can also exclude files or groups of files by using the
/X
option.
Input filespecs and /X
exclusion filespecs use normal DOS
conventions augmented by some features from
UNIX-style filename globbing.
These rules apply to all filespecs, whether or not they contain wildcards:
The case of letters is not significant.
Either slash (\ or /) can be used to separate directories in a path. For example:
grep regex ..\anstia.c *.h d:/dir1/dir2/orich.htm
Anything that starts with a forward slash is
interpreted as an option. If you want file EABC in the root directory,
either use a backslash (\EABC
) or specify the drive
letter (D:/EABC
or D:\EABC
). GREP would read
plain /EABC
as setting the E, A, B, and C options.
Spaces are allowed in file and path names, but then you must enclose the filespec in double quotes. This is a DOS restriction, not a feature only of GREP. For instance,
grep regex c:\Program Files\My Office\*
contains three filespecs, namely c:\Program, Files\My, and Office\*. That's probably not what you meant. Double quotes preserve your intended meaning:
grep regex "c:\Program Files\My Office\*"
Hyphens (-
) are allowed in file and
path names. If a filespec begins with a hyphen, GREP will read it as
an option. To prevent this, use the standard DOS syntax for "current
directory". For example, to search a file named -omega.txt, type
either of
grep regex ./-omega.txt
grep regex .\-omega.txt
If you want all files in a directory
you must say so using wildcards
(below); you can't just put the directory name on the command
line. If you make this error, GREP will remind you to use
path\*
for your filespec.
(path\*.*
means the same thing.)
You can suppress this warning with
the /Q3
option.
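The option-versus-filespec rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not GREP's actual parser: the function name classify_argument is hypothetical, and the sketch ignores quoting and the regex argument.

```python
def classify_argument(arg: str) -> str:
    """Classify one command-line token as 'option' or 'filespec'.

    Hypothetical helper: it only mirrors the two rules stated above
    (a leading forward slash or a leading hyphen marks an option).
    """
    if arg.startswith("/") or arg.startswith("-"):
        return "option"
    return "filespec"


if __name__ == "__main__":
    for token in ["/EABC", "\\EABC", "D:/EABC", "-omega.txt", "./-omega.txt"]:
        print(token, "->", classify_argument(token))
```

Under these rules a leading backslash, a drive letter, or a leading ./ is enough to keep a token from being read as an option.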
Beginning with release 7.0, GREP16 and GREP32 treat wildcards in filenames identically. The rules are derived from DOS conventions and UNIX "globbing". There are three wildcard characters (* [ ?). Here are the extra rules, in addition to the rules in the previous section for all filespecs.
Only the filename part of a filespec
can be wildcarded. Paths may not contain *, ?, or [. Therefore
c:\invoices\*\a*.doc
is not a legal filespec. You can get
the desired effect with c:\invoices\a*.doc
and the
/S
option.
* in a filename matches any number of characters, including no characters at all. Any of the matched characters may be a dot (.).
There are two special cases: *.
at the end of a
filename matches all character sequences
that don't contain a dot (filenames with no extension),
and *.*
at the end of a
filename matches any character sequence whatever (just like plain
*
), whether it contains a dot or not. These special cases
match traditional DOS wildcard rules.
Examples: abc*xyz
matches files ABCXYZ, Abcdwxyz,
AbC.xyZ, ABCDEF.wxyz, and so on.
stmt*
and stmt*.*
match STMT itself
and any other filename starting with STMT, but stmt*.
matches only filenames starting with STMT that have no extension.
? in a filename matches one
character, which could be a dot (.). The character must be there, even
if the ? comes at the end of the pattern.
Examples: abc?
matches ABCD but not ABC;
abc?ef
matches ABCDEF and ABC.EF but not ABCEF.
[...] (square brackets)
in a filename matches any character within the class.
Example: *[abc]*
matches any filename
that contains a, b, or c, in upper or lower case.
[^...] or [!...] in a filename
matches any character that is not within the class.
Example: [^abc]*
or [!abc]*
matches any
filename that doesn't start with A, B, or C in upper or lower case.
- (hyphen) within square brackets
indicates a character range.
The endpoints must be two capital letters, two lower-case letters,
or two non-letter characters.
Examples: [ag-jz]?
matches two-character filenames
that start with capital or lower-case A, G, H, I, J, or Z.
?[^ag-jz]
or ?[!ag-jz]
matches any
two-character filename that ends with a character other than A, G, H,
I, J, or Z, in upper or lower case.
Caution: Globbing is not regexes. In a filespec,
[0-9]*
means one decimal digit followed by zero or more
characters; it does not mean zero or more digits.
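One way to make the wildcard rules concrete is to translate a filename pattern into a regular expression. The sketch below is an approximation written for this manual's examples, not GREP's own code: the names wildcard_to_regex and wildcard_match are hypothetical, it handles only the filename part of a filespec, and it does not validate malformed character classes.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a filename wildcard pattern into a compiled regex (sketch)."""
    tail = ""
    # Special endings described above: "*.*" behaves like "*",
    # and "*." matches only character sequences without a dot.
    if pattern.endswith("*.*"):
        pattern, tail = pattern[:-3], ".*"
    elif pattern.endswith("*."):
        pattern, tail = pattern[:-2], "[^.]*"
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        # Look for the closing bracket, requiring at least one class member.
        end = pattern.find("]", i + 2) if ch == "[" else -1
        if ch == "*":
            parts.append(".*")          # any number of characters, dots included
        elif ch == "?":
            parts.append(".")           # exactly one character
        elif ch == "[" and end != -1:
            body = pattern[i + 1:end]
            if body[0] in "^!":         # both negation spellings are accepted
                body = "^" + body[1:]
            parts.append("[" + body + "]")
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    # Case of letters is not significant in filespecs.
    return re.compile("".join(parts) + tail, re.IGNORECASE)


def wildcard_match(pattern: str, filename: str) -> bool:
    """True if the whole filename matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(filename) is not None


if __name__ == "__main__":
    print(wildcard_match("abc?ef", "ABC.EF"))
    print(wildcard_match("stmt*.", "stmt.txt"))
    print(wildcard_match("[0-9]*", "1abc"))
```

The last call illustrates the caution above: the translated pattern requires one digit first, then accepts anything.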
Normally, GREP will ignore hidden and system files when
expanding wildcards. If you want to include hidden and system files in
the search, use the /A
option.
If you name a specific file, without wildcards, GREP will try to
open it regardless of the /A
option.
It may happen that you mistype an input filespec on
the command line.
At the end of execution, GREP will warn you about
each input filespec that didn't match any files. That warning is suppressed
like the rest if you specify the
/Q3
option.
GREP will give you a similar warning about filespecs from a list
file (/@
option) that don't match any
actual files. That warning will appear right after GREP reads that
filespec from the list file.
Caution: If you exclude files with the /X
option,
you may cause GREP to bypass existing files. Consider this
example:
grep regex abcde.htm /X*.htm
In this situation, GREP will tell you that no files matched
abcde.htm. This is correct, since /X*.htm
makes GREP
exclude every *.HTM file. GREP reminds you of this possibility when
you have /X
file exclusions.
When in doubt about which files GREP is scanning, you can use the
/B
option to make GREP
tell you the name of every file it examines. If you want to know why
GREP is bypassing certain files, use the
/D
option for full debugging display.
Four sections below describe the options in detail, by functional groups: input file options, pattern-matching options, output options, and general options.
Want a quick overview? See the one-sentence summary of every option in the user guide.
/@- or /@file — Take Input Filespecs from Keyboard or File
If you have too many input filespecs to put on the command line, you can put them in a list file for GREP to read. This can also be useful when some program generates a list of files and you want to have GREP examine every file in the list; see an example below.
file must follow the @
with no intervening
space, and ends at the next space; it may not contain wildcards.
If you use a minus
sign for the file (the /@-
option), GREP will accept
filespecs from standard input.
Standard input is the keyboard, unless you
redirect it from a file with the
< character or pipe it from
another command with the | character.
In the list file, filespecs must appear one per line.
They may contain wildcards.
Spaces are legal within a filename; don't put quotes around a filename
that contains spaces.
Leading and trailing spaces will be automatically removed; if you
actually want a space at the start or end of the filespec you can
specify it as [ ]
.
Interactions:
If you specify the /S option, it will also apply to filespecs in the list file.
If a filespec in the list file matches an exclusion you specified with the /X option, GREP will not read that file. For example, if you specified /X*.exe to exclude all .EXE files, and your list file contains ABC*, GREP will process all files starting with ABC except for ABC*.EXE.
Example: Suppose you want a list of files that contain both "this"
and "that", but not necessarily on the same line. You can GREP once
for "this" and produce a file list with the
/L
option, then GREP a second time
for "that", using just the files that contain "this":
grep this * /L | grep that /@- /L
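The two-pass idea in this example can be modeled in Python as a filter applied twice, where the second pass looks only at the files the first pass named. This is a sketch of the logic only: the function files_containing and the in-memory files dictionary are hypothetical stand-ins for GREP's /L output and the /@- list input.

```python
def files_containing(text, files, candidates=None):
    """Return names of files whose contents include text.

    files maps a filename to its contents; candidates, if given,
    restricts the search to those names (like a list file).
    """
    names = list(files) if candidates is None else candidates
    return [name for name in names if text in files[name]]


if __name__ == "__main__":
    files = {
        "a.txt": "this line\nthat line",
        "b.txt": "only this",
        "c.txt": "only that",
    }
    first_pass = files_containing("this", files)
    second_pass = files_containing("that", files, first_pass)
    print(second_pass)
```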
/A — Include Hidden and System Files
Include hidden and system files when expanding wildcards (*, ?, [) in filespecs. Without this option, GREP will ignore hidden and system files while searching for files that match a wildcard. However, if you explicitly specify a file on the command line, GREP will always read it even if it's a hidden or system file.
The /A
option also modifies the action of the
/S
option (if present),
determining whether subdirectories marked hidden or system will be
searched.
The /A
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /A+
.
/Rn — Read and Display Input Files as Binary or Text
Process named input files as text or binary. (Please see Binary Files and Text Files in the user guide for detailed information about the differences.) You can choose from these modes:
/R0 | Read all input files as text. (This is the default.)
/R1 | (reserved for future use)
/R2 | Read all input files as record-oriented binary. The fixed record length is given by the /W option.
/R3 | Read all input files as free-format binary. The /W option gives the buffer size. (To find all matches, make sure your buffer size is at least twice the longest string you expect to find.)
/R-1 and /R-2 (registered version only) | Examine each input file to decide whether to read and display it as free-format binary (like /R3) or text (like /R0); display "binary" or "text" with the filespec in the header.
If you gave two numbers with the /W option,
the first number is used as line width for text files
and the second as buffer size for binary files.
How does GREP infer the file type? It reads until it finds a binary character, namely any of the characters ASCII 0-6 or 14-26. The file is binary if it contains any of those characters; otherwise it's treated as text. The difference between the two options is that /R-1 reads only the first 256 bytes and /R-2 reads the whole file, stopping early if it finds a binary character.
Caution: After GREP decides whether the file is text or binary, it either rewinds the file (if it's binary) or closes and reopens it (if it's text). Ordinarily that's not a problem, but if you specify a pseudo-file like COM1 or CON, the bytes that were used to decide whether it's a text file will be discarded. Use /R-1 or /R-2 only with real files.
Should you use /R-1 or /R-2 ?
Experiments show that 256 bytes is plenty for a correct decision
for most file types, including picture files, executable programs,
and MS Office files of all types. Adobe Acrobat PDF files are an
exception, in that the first binary byte shows up well after byte
256; but the displayed text is encrypted in those files so you
can't search for text in them anyway. (If anyone knows of another
file type where binary bytes show up only after byte 256, I'd be
grateful for information.)
Thus /R-2 is theoretically safer than /R-1 ,
but by the same
token /R-2 will be slower on a big file that is
actually text. The difference may or may not be noticeable,
depending on how fast your disk and your CPU are and how your
operating system buffers file reads.
So which one should you use? My own choice is to put /R-1 in the environment
variable. That way I am confident that GREP will correctly
sense the type of non-PDF binary files, yet not take a long time
to decide that a big text file is actually text.
Setting the /R
option correctly lets
you search for regexes in .EXE and .DLL files,
word-processing files, and so forth. /R-1
or /R-2
can be
particularly useful when you don't know whether files are text or
binary. (For instance, Microsoft Word writes some .DOC files in
a binary format and some .DOC files in a text format. Or you might
have some source files and some object files and want to search them
all in one go.)
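The text-versus-binary test described for /R-1 and /R-2 can be sketched as follows. The byte ranges and the 256-byte limit come from the description above; the function name looks_binary and its limit parameter are hypothetical, and the sketch works on bytes already in memory rather than on an open file.

```python
# Characters the manual calls "binary": ASCII 0-6 and 14-26.
BINARY_BYTES = frozenset(range(0, 7)) | frozenset(range(14, 27))


def looks_binary(data: bytes, limit=256) -> bool:
    """Return True if a binary character appears in the inspected bytes.

    limit=256 models /R-1 (inspect only the first 256 bytes);
    limit=None models /R-2 (inspect everything).
    """
    sample = data if limit is None else data[:limit]
    return any(byte in BINARY_BYTES for byte in sample)


if __name__ == "__main__":
    late_binary = b"a" * 300 + b"\x01"
    print(looks_binary(late_binary))              # /R-1 style check
    print(looks_binary(late_binary, limit=None))  # /R-2 style check
```

The sample input has its first binary byte after position 256, which is the situation described for PDF files: the short check calls it text and the full check calls it binary.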
Only named input files can be read in binary mode. Regardless of the
/R
option value, when you use the
/F
option to read
regexes from a file, that file is read in normal
text mode.
Also, if you don't specify any input files, GREP always scans the
standard input in text mode.
/S — Scan Subdirectories
Please see the section on subdirectory searches in the user guide.
The /S
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /S+
.
/Wwidth or /Wtxwid,bnwid — Specify Line Width or Binary Block Length
Expect text lines up to txwid characters long, or process binary files in records or buffers of bnwid bytes. (If you specify only one number, it's used for both txwid and bnwid.)
txwid and bnwid default to 4096 in GREP32, and you can specify
anything from 2 to 2147483645; the default for GREP16 is 256
and you can specify 2 to 32765.
(The widths are also limited to available memory, which will
depend on your system configuration, what other programs you have
running at the time, and what you specify with the
/P
option. With GREP32, available memory
includes Windows virtual memory.)
(For full details of binary and text file modes, please see that section in the user guide.)
(/W option without /R or with /R0)
The CR/LF (ASCII 13 or 10 or both) line terminator doesn't count against the specified txwid. If GREP reads a long line from the input, it will break it after txwid+1 characters and treat the remainder as a separate line. The whole line gets scanned, but any match that starts before the break and ends after the break will be missed. Therefore, if possible you should set txwid large enough to hold the longest line in the file.
If GREP does find any lines longer than the specified or
default txwid, it will display a warning message at the end of
execution, telling you the length of the longest line.
(This warning is
suppressed by the /Q3
option.) GREP will also
log every such file in the
debug output; look for "exceeds txwid".
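A short Python sketch shows why a match that straddles the break is missed. It assumes, as described above, that a long line is cut after txwid+1 characters and the remainder is handled as a separate line (and cut again if still too long); the helper name split_long_line is hypothetical.

```python
def split_long_line(line: str, txwid: int) -> list:
    """Cut a line into pieces of at most txwid+1 characters each."""
    step = txwid + 1
    return [line[i:i + step] for i in range(0, len(line), step)] or [""]


if __name__ == "__main__":
    line = "x" * 8 + "needle" + "y" * 4
    # With txwid=9 the break falls inside "needle", so no piece contains it.
    print([("needle" in piece) for piece in split_long_line(line, 9)])
    # With a txwid at least as long as the line, the match is found.
    print([("needle" in piece) for piece in split_long_line(line, 20)])
```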
(/W option with /R2)
Files are read in records of bnwid bytes. Make sure that you set bnwid to the exact length of the records in the binary file.
(/W option with /R3)
Files are read in buffers of bnwid bytes.
bnwid must be an even number.
The recommended value of bnwid is at least twice the longest
string you expect to find. For instance, if you're searching for a
regex that might match up to 40 characters, you want to specify
/R3 /W80
, since 2×40=80.
If you're not sure just how long a string in the file will match your
regex, it's better to overestimate a bit than to underestimate.
An internal procedure ensures that if a match exists in the file it will be found, provided the match is not longer than half the buffer. (As always, if one buffer contains multiple matches only the first match in that buffer will be counted.)
(/W option with /R-1 or /R-2)
txwid is used as a line width for any file that is treated
as a text file, and bnwid is used as buffer width for
any file that is treated as free-form binary.
bnwid must be an even number.
If you specify only one number with /W
, it is used for
both purposes and must be an even number.
When you use the /R-1
or /R-2
option, I recommend
that you specify two numbers with the
/W
option. The first number, text
line width, should be rather large so that every line is kept as a
unit. The second number, binary buffer width, should be smaller (see
"Free-form binary mode", just above) so that you don't get whacking
great swathes of binary displayed.
/Xpattern — Exclude Matching Files from Scan
Out of the named input files, don't scan any that match the pattern.
The pattern may contain the same wildcards
as an input filespec, but no drive or path information.
pattern must follow the X
with no intervening
space, and ends at the next space.
If a filespec on the command line or in the list file
(/@
option) matches an exclusion you
specified with the /X
option, GREP will not read that
file. For example, if you specified /X*.exe
to exclude
all .EXE files, and your list file contains ABC*
, GREP
will process all files starting with ABC except for ABC*.EXE.
It is legal to specify multiple
/X
options, but only one pattern may be specified with
each option. Example: Suppose you want to exclude MS-Word documents, Excel
spreadsheets, and ABC.DEF from the search. You can type something like this:
grep regex /x*.doc /x*.xls /xabc.def *
or:
grep regex * /x*.doc /x*.xls /xabc.def
Like the other options, the /X
option is
scanned and interpreted before any input filespecs and before any list
file (/@
option) is read. The two
commands in the above example mean exactly the same thing.
You can store one or more /X
options
permanently in the environment variable.
Any /X
exclusions on the command line will be equally
effective with those in the environment variable. The special case
/X*
tells GREP to disregard all previous exclusions
specified with /X
.
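The way /X options accumulate, and the way /X* resets them, can be sketched in Python. This is an illustration under assumptions: the function names are hypothetical, and Python's fnmatch module stands in for GREP's wildcard matching even though its rules are not identical (for example, it has no special case for a trailing "*.").

```python
from fnmatch import fnmatchcase


def collect_exclusions(options):
    """Build the list of exclusion patterns from /X options, in order."""
    patterns = []
    for opt in options:
        if opt[:2].lower() != "/x":
            continue                    # not an exclusion option
        pattern = opt[2:]
        if pattern == "*":
            patterns.clear()            # /X* discards all previous exclusions
        else:
            patterns.append(pattern.lower())
    return patterns


def is_excluded(filename, patterns):
    """True if the filename (without path) matches any exclusion pattern."""
    name = filename.lower()             # case of letters is not significant
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    patterns = collect_exclusions(["/x*.doc", "/x*.xls", "/xabc.def"])
    for name in ["REPORT.DOC", "abc.def", "notes.txt"]:
        print(name, is_excluded(name, patterns))
```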
/Eregex_level — Select Extended Regexes or Strings
This option tells GREP how to interpret the regex(es) you enter on the command line, from keyboard, or in a file.
Basic and extended regexes are fully explained under Regular Expressions, later. An extended regex supports all the features of a basic regex plus the quantifiers ? and {...}, alternatives |, subexpressions (...), some special constructs with the backslash \, and more.
/E0 | Don't use regular expressions at all. Treat the regex(es) as simple literal strings and search files for exact match with no special treatment of any characters.
/E1 | Treat regexes as basic regexes. This is how GREP always worked before release 6.0, and it is still the default.
/E2 | (GREP32 only) Treat regexes as extended regexes.
/E4 | (GREP32 only) Treat regexes as stand-alone words. For example, if you specify the regex other, GREP will find all occurrences of "other" but will ignore it where it occurs as "others", "mother", "brothers", and so on.
By default, a "word" is any group of letters, digits, and underscores bounded by start or end of line and/or by any other characters. For instance, if you're searching for other with the
/E4 option, then "other55" would not be found because
the 5s are part of the "word".
If this is a problem, you can redefine a "word" to be any sequence
of non-blanks, or any sequence of letters. Please see the
/M option for details.
When you use /E4 you probably won't put special
characters in your search regex. But if you do, it will be treated
as an extended regex. In fact, the
/E4 option is the same as /E2 except that
GREP slaps a \b (assert word boundary) at the
beginning and end of your regex.
/E0\ /E1\ /E2\ /E4\ | These are the same as /E0 /E1 /E2 /E4 except that they turn on the (deprecated) Special Rules for the Command Line, which are described later in this reference manual. The Special Rules are the old way to have a regex contain characters like < and | that have special meanings to DOS. The better way to bypass DOS command-line restrictions is to use the /F option and enter your regex.
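The description of /E4 as /E2 plus a word-boundary assertion at each end can be tried out with Python's re module, whose \b behaves much like the default "word" definition given above (letters, digits, and underscores). This is a sketch only: the function name word_regex is hypothetical, and Python's regex engine is not the PCRE engine GREP uses.

```python
import re


def word_regex(regex: str) -> re.Pattern:
    """Wrap a regex in \\b assertions, as the manual describes for /E4."""
    # No extra grouping is added; the assertions are simply placed
    # at the beginning and end of the regex text.
    return re.compile(r"\b" + regex + r"\b")


if __name__ == "__main__":
    pattern = word_regex("other")
    for text in ["the other day", "mother", "others", "other55"]:
        print(repr(text), bool(pattern.search(text)))
```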
If you never specify the /E
option at all, the effect is
the same as /E1\
, which is basic regexes with the
Special Rules for the Command Line
enabled; this default was chosen to match GREP's behavior before
release 6.0. /E
with no number is the same as
/E1
, which specifies basic regexes without the Special Rules.
/F- or /Ffile — Read Regexes from Keyboard or File
GREP reads one or more regexes from file instead of taking a single regex from the command line, and reports lines from the input file(s) that match any of the regexes read from file. You must enter the regexes one per line in the file; don't put quotes around them.
file must follow the F
with no intervening
space, and ends at the next space; it may not be wildcarded.
If you use a minus sign for the file (/F-
option),
GREP will accept regexes from standard input.
Standard input is the keyboard, unless you
redirect it from a file with the
< character or pipe it from
another command with the | character.
When you supply two or more regexes, GREP normally reports each line
from the input file that matches any
(at least one) of the regexes.
If you set the /V
option or
/Y
option or both, you
modify that behavior according to the rules of logic.
Specifically:
With /Y but not /V, GREP reports only the lines that match all of the regexes in any order.
With /V but not /Y, GREP reports only the lines that match none of the regexes. (If the input line matches one or more of the regexes, GREP doesn't report it.)
With both /V and /Y, GREP reports every line that matches fewer than all of the regexes, i.e. every line that matches 0 to N-1 of your N regexes. (If the input line matches all the regexes, GREP doesn't report it; if it matches some of the regexes but not all, or none of the regexes, GREP reports it.)
(The /F
file option is active only in the
registered version.
/F-
works in all versions.)
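The four combinations of /V and /Y reduce to a small truth table, which the following Python sketch spells out. The function name line_is_reported and its v and y parameters are hypothetical, and Python's re module stands in for GREP's regex matching.

```python
import re


def line_is_reported(line, regexes, v=False, y=False):
    """Decide whether a line is reported, given the /V and /Y settings."""
    hits = sum(1 for regex in regexes if re.search(regex, line))
    if y:
        matched = hits == len(regexes)  # /Y: every regex must match (AND)
    else:
        matched = hits > 0              # default: any regex may match (OR)
    return not matched if v else matched  # /V inverts the result


if __name__ == "__main__":
    regexes = ["brown", "fox"]
    line = "I see a brown smudge"
    for v in (False, True):
        for y in (False, True):
            print("v=%s y=%s ->" % (v, y), line_is_reported(line, regexes, v, y))
```

With both options set, the inversion applies to the AND test, which is why a line matching only some of the regexes is reported.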
/I — Ignore Case in Matching
Ignore case, treating capitals and lower case as matching each other.
Caution: By default, the /I
option does not apply to 8-bit
characters (characters 128-255). You can turn on 8-bit character
support in GREP32 with the /M
option.
In GREP16, the /I
option does not apply to 8-bit
characters (characters 128-255) because Microsoft C 16-bit code does
not support setting the locale.
Therefore, if you want case-blind comparisons in GREP16, you must explicitly
code any 8-bit upper and lower case in your regex.
For instance, to search for the French word "thé" in upper or
lower case, code it as th[éÉE]
since
é can be upper-cased as É or as plain E. The "th", being
7-bit ASCII characters, will be found as upper or lower case by the
/I
option. (You may need to code 8-bit characters like
éÉ
in a special way if you enter them on
the command line; either use the /F
option
or see Special Rules for the
Command Line below.)
/Mloc or /Mloc,word — Specify Character Mapping and Define "Word"
Set the character mapping or locale. This option is available only in GREP32, because Microsoft 16-bit C does not support setting the locale. There are four issues with locale: binary output, case-blind matching, the definition of a "word", and character classes in general. Details about all four are given below, after the list of mappings.
While many locales (character mappings) are supported in GREP32, most are duplicates. The six unique locales are:
/Mfr | code page 1252, valid for most European languages including Danish, Dutch, English, Finnish, French, German, Icelandic, Italian, Norwegian (both), Portuguese, Spanish, and Swedish; this also matches the MS-Windows U.S.A. character set
/Mcsy | code page 1250, valid for Czech, Hungarian, Polish, and Slovak
/Mell | code page 1253, valid for Greek
/Mrus | code page 1251, valid for Russian
/Mtrk | code page 1254, valid for Turkish
/Mc | the default C locale, in which none of characters 128-255 are considered letters, digits, punctuation, space, or printing characters
The recommended strategy is to put an /M
option
in your environment variable with the
appropriate locale and then forget about it. The locale affects the
following issues:
displaying hits in
binary mode: GREP displays
each non-printing character as the four-byte sequence
<nn>, where nn is the hexadecimal value of the
character. GREP32 uses the current /M
mapping to decide
what is and is not a printing character.
case-blind matching
(/I
option):
The default C locale knows only the English alphabet A-Z and
a-z, with no accent marks.
If you're doing case-blind matching and your input files may contain
accented characters like é (character 233) or É (201),
or non-English letters like Å (197) or å (229), you should
use the /M
option with the appropriate mapping from the
above list.
definition of a "word":
This matters in extended regexes that use
\w
and \W
as character types,
\b
and
\B
as word-boundary sentinels, or
[:word:]
for a "word"
character in a character class.
A "word" is any group of "word" characters bounded by non-"word" characters and/or start and end of line. By default, a "word" character is any letter, any digit, or the underscore character (_). But you might prefer to define a "word" as a sequence of letters, or a sequence of non-blank characters.
The optional second argument, word, is used to
redefine what are "word" characters in the input files.
That second argument, if present, must be one of
the three special symbols alpha
,
alnum
, or graph
to define a "word"
character as any letter, any letter or digit, or any printing
character. What is a "letter" or a "digit" or a "printing character"
depends on the locale selected with the /M
option; see
the next bullet for details.
For example, specify /Mfr,graph
to use the
ISO-8859-1 character set and define any non-blank character as a
"word" character, or /MC,alpha
to use the default
locale and define only letters as "word" characters.
You can use the supplied TEST255 file to check the definition of "word" characters in any locale, like this:
grep /e2r2w21 ^[[:word:]] test255
If you specify a locale with /M
and omit the second argument, a
"word" will be any sequence of letters, digits, and underscores.
character
types and character class names
in extended regular expressions (/E2
option):
The question of what is a letter, or a punctuation mark, or a word
boundary, will be different in different locales. If you're using
extended regexes with character types or named character classes, and
your input files may contain accented letters or non-English letters,
you need the /M
option.
You can use the supplied TEST255 file to check the definition of any character class or type. For instance, either of these commands will list all the digit characters in your locale:
grep /e2r2w21 ^[[:digit:]] test255
grep /e2r2w21 ^\d test255
The /M
mapping affects how GREP interprets each
character. But it does not affect the appearance of characters on your
screen; that is controlled by DOS commands like CHCP.
/V — Display Lines That Don't Contain a Match
Show or count the lines that don't match the regex
instead of those that do. (For the effect of the /V
option
with two or more regexes, see the /F
option.)
The /V
option is not allowed with the
/J
option: it doesn't make any sense
to display only non-matches but display the part of each line that was
a match.
The /V
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /V+
.
/Y — Multiple Regexes Must All Match
When multiple regexes are given
(/F
option), GREP normally reports a
hit if the line, record, or buffer contains a match for any of the
regexes. If you also set the /Y
option, GREP reports a
hit only if the line, record, or buffer matches every regex, though
not necessarily in order. The normal test is an OR; the test with the
/Y
option is an AND.
For example, if you use the /F
option and enter the two
regexes brown
and fox
, then all of these
lines will match:
The quick brown fox
I see a brown smudge
Crazy like a fox
The fox's tail is brown
But if you also use the /Y
option, then GREP will match
only lines that contain both the regular expressions, namely the first
and fourth lines in the example.
As you see from the example, with the /Y
option, input
lines must match all the regexes, but in any order. If you want to
match all regexes in a specific order, specify them as a single regex
connected with ".*". For instance, to match lines that contain "brown"
somewhere before "fox", use the regex brown.*fox
.
While not actually forbidden, the /Y
option usually
doesn't give useful results with the
/R3
option.
For the effect of the /V
option
together with /Y
and /F
, see the
/F
option.
The /Y
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /Y+
.
Before going through the output options, let's take a moment to look at some of the possible output formats. By default, GREP's output is similar to that of DOS FIND:
---------- GREP.C
op_showhead = ShowNoHeads;
else if (op_showhead == ShowNoHeads)
op_showhead = ShowNoHeads;
---------- GREP_MAT.C
op_showhead == ShowNoHeads)
However, the /U
option
produces UNIX grep-style output like this:
GREP.C: op_showhead = ShowNoHeads;
GREP.C: else if (op_showhead == ShowNoHeads)
GREP.C: op_showhead = ShowNoHeads;
GREP_MAT.C: op_showhead == ShowNoHeads)
You can see the main difference: DOS-style output has the filename as a header above the group of hits from that file, and UNIX-style output has the filename on the same line with each hit.
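The difference between the two layouts can be shown with a pair of small formatting functions. This is a sketch of the layouts as they appear in the samples above, not GREP's output code: the function names are hypothetical, and the exact spacing after the colon in UNIX style is an assumption.

```python
def format_dos(hits):
    """DOS style: one header line per file, then that file's hits."""
    lines = []
    current = None
    for filespec, text in hits:
        if filespec != current:
            lines.append("---------- " + filespec)
            current = filespec
        lines.append(text)
    return lines


def format_unix(hits):
    """UNIX style: the filespec is repeated at the start of every hit."""
    return ["%s: %s" % (filespec, text) for filespec, text in hits]


if __name__ == "__main__":
    hits = [
        ("GREP.C", "op_showhead = ShowNoHeads;"),
        ("GREP.C", "else if (op_showhead == ShowNoHeads)"),
        ("GREP_MAT.C", "op_showhead == ShowNoHeads)"),
    ]
    print("\n".join(format_dos(hits)))
    print("\n".join(format_unix(hits)))
```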
The output options give you a lot of control over what GREP produces, but they can be confusing. Here's the executive summary:
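What to display: matching lines with context (/P option), just the matching lines (default), just the matching portions of lines (/J option), just a count of hits by file (/C option), or just the names of files that contain matches (/L option).
Which filespec headers to display: a header for every file scanned (/B option), only the filespecs of files that contain matches (default), or no filespec headers at all (/H option).
What to show with each hit: the filespec (/U option) and/or the line number (/N option).
How many hits to report: only the first few hits per file (/K option).
Binary files: with the /R2 or /R3 option GREP reads files in binary mode, and that has a side effect on the output format.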
Now, in alphabetical order, here are the options that control what GREP outputs and how it is formatted.
/B — Display a Header for Every File Scanned
Display a header for every file examined, even if the file contains
no matches. (This option is meaningful only with DOS-style output, when the
/U
option is not set.)
The /B
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /B+
.
/C — Display the Hit Count, Not the Actual Hits
Display only a count of the hits in each file, instead of the hits themselves.
Lines, records, or buffers are counted, not matches.
If several regexes match the same line,
or a match occurs several times on a line,
the line is counted only once. You cannot use the /C
option to get a full count of the number of matches in the file,
unless you know that the match doesn't occur more than once on any
line.
(For free-form binary, the /R3
option,
the buffer size may affect how many matching buffers are found, since
multiple occurrences in one buffer are counted only once.)
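The distinction between counting matching lines and counting matches is easy to see in a few lines of Python. This is an illustration only: the function names are hypothetical, and Python's re module stands in for GREP's regex matching.

```python
import re


def count_hits(lines, regex):
    """Count lines that contain at least one match (what /C reports)."""
    return sum(1 for line in lines if re.search(regex, line))


def count_matches(lines, regex):
    """Count every non-overlapping match (not available through /C)."""
    return sum(len(re.findall(regex, line)) for line in lines)


if __name__ == "__main__":
    lines = ["fox fox", "no match here", "one fox"]
    print(count_hits(lines, "fox"), count_matches(lines, "fox"))
```

The two numbers agree only when no line contains more than one match.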
The /C
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /C+
.
/H — Don't Display Filespecs in Output
Don't display any filespecs as headers.
The /H
option is most appropriate when you're using
GREP as a filter to extract lines from one or more named files for
processing by another program, like this:
grep /H "Directory" inputfilespecs | other program
If you want to keep the filename with each extracted line, use the
/U
option instead of the
/H
option.
The /H
option is not needed and has no effect with
redirected input, such as
grep /H "Directory" <inputfile
or:
other program | grep /H "Directory"
GREP never displays a filespec header for redirected input.
The /H
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /H+
.
/J
— Display Just the Part of Each Line That Matches
Display just the portion of each line that matches the
input regex, not the whole line. If a given line contains multiple
occurrences of the regex, or matches for more than one regex (entered
with the /F
option), only the first
occurrence will be displayed.
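As a rough illustration of "only the first occurrence", here is a small Python sketch for the single-regex case. Python's re module is an assumption standing in for GREP's matcher, the function name is hypothetical, and the multiple-regex behavior is deliberately not modeled.

```python
import re

def first_occurrence(line, pattern):
    """Return the first matching portion of the line, or None if no match."""
    found = re.search(pattern, line)
    return found.group(0) if found else None

line = "error 12 and error 345"
print(first_occurrence(line, r"error [0-9]+"))
print(first_occurrence(line, r"warning"))
```

Later occurrences on the same line are simply never reported, which is why the sketch uses a single search instead of collecting every match.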
If you specify multiple regexes
with the /F
option, and also set the
/Y
option (all regexes must match),
then GREP displays the part of the line/record/buffer that matches the
last regex.
The /J
option behaves similarly for binary files
(/R2
or /R3
option): it
displays only the portion of each binary record or buffer that matches
the regex. If more than one match occurs in the record or buffer, GREP
displays only the first.
The /J
option is not allowed with the
/V
option, because it doesn't make any sense
to display only non-matches but display the part of each line that was
a match.
The /J
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /J+
.
/K
count — Report Only the First Few Hits Per File
Stop reading each file and move on to the next after
reporting the first count hits. count may be any number from 0
to 9999. /K0
means to report all matches, and it is the
default.
No special message is displayed in the output when GREP stops
reading a file early because of the /K
option. However,
the event is noted in the debug output (/D
option).
The /K
option displays up to the indicated
number of matches per file. There is no option in GREP to stop
after displaying a certain number of matches total. But you can always
redirect GREP output (>reportfile or |more
) and
then just look at the beginning of the output.
If you also use the /P
option to
report context lines before and after matches, you may see more
matches than requested. For example, suppose you specify
/K2P5,5
to get the first two hits per file, with five
lines of context before and after each one. Five lines will be
reported after the second and last requested hit, naturally. Those
five context lines might contain additional hits, which will be
shown, but the context will not be extended past the five lines that
follow the second hit, the last one you actually requested.
The /K
option and /V
option
together will report the first count lines that
don't contain a match. The /K
option is ignored when you
also specify the /C
option or the
/L
option.
/L
— List Files That Contain Hits, Not the Actual Hits
Display only a bare list of the filespecs of files that contain matches, not the actual lines that match.
The /L
option and
/V
option together will
display the filespecs of files that don't contain any matches.
The /L
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /L+
.
/N
— Show Line Numbers with Hits
Show the line or record number before each matching
line. DOS-style output with the /N
option looks like
this:
---------- GREP.C
[ 144] op_showhead = ShowNoHeads;
[ 178] else if (op_showhead == ShowNoHeads)
[ 366] op_showhead = ShowNoHeads;
---------- GREP_MAT.C
[ 98] op_showhead == ShowNoHeads)
With /N
and
the /U
option
used together, the UNIX-style output looks like this:
GREP.C:144: op_showhead = ShowNoHeads;
GREP.C:178: else if (op_showhead == ShowNoHeads)
GREP.C:366: op_showhead = ShowNoHeads;
GREP_MAT.C:98: op_showhead == ShowNoHeads)
UNIX-style output is suitable for use with the excellent freeware editor Vim.
When displaying a buffer from a
free-format binary file —
either
under the /R3
option or because you
specified the /R-1
or /R-2
option and GREP
sensed that the file was binary — the line number is replaced by
a byte number, in hex, with a leading "b" for "byte". The first byte
in the file is numbered 0.
If a text file contains lines longer than the limit given with the
/W
option, each chunk of the line counts
separately. For example, if you specified /W256
but the file
contained a line of 612 characters, it will be counted as three lines and
subsequent line numbers will be increased by 2. GREP warns you at the end of
execution and suggests a /W
value to remedy this problem.
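The arithmetic in that example can be checked with a few lines of Python. This is only a sketch: the function names are invented, and it assumes that a line exactly as long as the width counts as a single chunk, which this manual does not spell out.

```python
import math

def chunks_for_line(length, width):
    """Number of pieces a line of `length` characters is split into."""
    return max(1, math.ceil(length / width))

def line_number_shift(length, width):
    """How far later line numbers move because of one over-long line."""
    return chunks_for_line(length, width) - 1

print(chunks_for_line(612, 256))    # chunks for the 612-character line
print(line_number_shift(612, 256))  # increase in subsequent line numbers
```

Raising the width to at least the longest line length brings the shift back to zero, which is the remedy described above.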
The /N
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /N+
.
/P
before,after — Show Context Lines around Matching Lines
Show context lines before and after each match. If you omit
after, GREP will show the same number of lines after each match
as before. Plain /P
is the same as
/P2,2
.
Either number can be 0. For instance, use /P0,4
if you
want to show every match and the four lines that follow it.
/P0
or /P0,0
tells GREP to show only the
matching lines with no context lines, and is the default.
If you use the /P
option, you probably want to use the
/N
option as well,
to display line numbers. In that case,
the punctuation of the line numbers will distinguish which lines are
actual matches and which are displayed for context. Here is some
DOS-style output from a run with the options /P1,1N
set:
---------- GREP.C
143 if (opcount >= argc)
[ 144] op_showhead = ShowNoHeads;
145
177 PRTDBG "with each matching line");
[ 178] else if (op_showhead == ShowNoHeads)
179 PRTDBG "NO");
365 if (myToggle('L') || myToggle('U'))
[ 366] op_showhead = ShowNoHeads;
367 else if (myToggle('B'))
---------- GREP_MAT.C
97 op_showwhat == ShowMatchCount ||
[ 98] op_showhead == ShowNoHeads)
99 headered = TRUE;
As you can see, the actual matches have square brackets around the
line numbers, and the context lines do not. (In UNIX format, with the
/U
option in addition to
/N
and /P
, GREP
displays colons around the numbers of matching lines and spaces
around the numbers of context lines.)
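The distinction between hits and context lines can be sketched in Python as follows. This is an illustration under assumptions, not GREP's implementation: the function names are hypothetical, Python's re module stands in for GREP's matcher, and the column widths of real GREP output are not reproduced.

```python
import re

def with_context(lines, pattern, before, after):
    """Return (line_number, is_match, text) for hits and their context."""
    hits = [i for i, line in enumerate(lines) if re.search(pattern, line)]
    keep = set()
    for i in hits:
        first = max(0, i - before)
        last = min(len(lines), i + after + 1)
        keep.update(range(first, last))
    hit_set = set(hits)
    return [(i + 1, i in hit_set, lines[i]) for i in sorted(keep)]

def render(rows):
    """Bracket the numbers of matching lines; leave context numbers bare."""
    return [f"[{n}] {t}" if m else f" {n}  {t}" for n, m, t in rows]

lines = ["a", "b", "hit one", "c", "d", "e", "hit two", "f"]
rows = with_context(lines, "hit", 1, 1)
print("\n".join(render(rows)))
```

Passing 0 for both counts reduces the output to the matching lines alone, in the same way as /P0,0.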
Interactions between the /P
option and the
/R
option:
With the /R2 option, GREP will display the indicated numbers of binary records before and after any record that contains a match.
With the /R3 option, the /P option is not allowed.
With the /R-1 or /R-2 option, GREP will honor the /P option when reading text files but ignore it when reading binary files.
GREP16 has to allocate space for the preview lines within the
same 64 K data segment as all other data. Consequently, if you
specify a moderately large value, particularly with a large line
width (/W
option), you may get a
message that GREP can't allocate space for the lines. To resolve this,
use GREP32 if possible; otherwise reduce either the line width
or the first number after /P
(the before number);
the second number, after, has no effect on memory use.
/U
— UNIX-style Output: Show Filespec with Each Hit
Show the filespec on the line with each hit, instead of just once in a separate header. This UNIX-style output is useful with editors like Vim that can automatically jump to the file that contains a match. Some examples of UNIX-style output were given at the beginning of "Output Options".
There's one small difference from UNIX grep output: UNIX grep
suppresses the filespec when there is only one input file, but GREP
assumes that if you didn't want the filespec you wouldn't have
specified the /U
option. Neither GREP nor UNIX grep
displays a filespec if input comes from a file via <
redirection.
The /U
option is a toggle. If you specify it twice
(including the environment variable and the
command line), the second occurrence will cancel the first. If you
don't know what's in the environment variable and definitely want to
turn this option on, use /U+
.
/D
file or /D-
or /D
— Display Debugging Output
Debugging information includes whether you're running GREP16 or GREP32, whether the program is registered, the contents of the environment variable, the values of all options specified or implied, the input files specified, the raw and interpreted values of the regex(es), details of every file scanned, execution timings, and more. This information is normally suppressed, but you may find it helpful if GREP seems to behave in a way you don't expect or if you have a bug report.
Since the debugging information can be voluminous, if you want to see it at all you will usually want to specify an output file:
/D file |
Write all debug information to the given
filespec. file must follow the D with no
intervening space, and ends at the next space; it may not be
wildcarded. GREP will append to the file if it already exists.
|
/D- |
Send debugging information to the standard output, which you
can redirect (>) or pipe (|). This intersperses debug
information with the normal output of GREP.
|
/D |
Send debugging information to the standard error output
(normally the screen). Be careful not to specify any other options
between /D and the next space, or they'll be taken as
a filespec.
|
You can weed through the debugging output to some extent. GREP writes the following unique strings on most lines of output, so you can send debug output to a file and then grep the file for
grep GC:
parsing the command line
grep GM:
matching regexes
against inputs
grep GR:
parsing and interpreting the
regexes
grep GX:
expanding
input filespecs, including subdirectories
/Qlevel
— Suppress the Logo and Unwanted Warnings
Registered users can set the quietness level to suppress messages you may not want to see:
| /Q0 | (default) Show all messages. |
| /Q1 | Suppress the program logo; all warnings will still appear. |
| /Q2 | Suppress the program logo, as well as warnings about invalid combinations of options. Warnings about missing files will still appear, as will the warning about lines that were broken in the middle, possibly missing matches (see the /W option). |
| /Q3 | Suppress the program logo and all warnings. This level is not recommended unless you definitely know what you're doing, because you might miss important error messages about your input files. |
Messages that force GREP to stop execution
will always be displayed. Debug output will also be displayed, if you set the
/D
option, regardless of the
/Q
setting.
All messages are listed later in this reference manual.
For compatibility with earlier releases of GREP, you can still specify
a plain /Q
option with no level number, and it means
/Q3
(suppress all warnings), just as in earlier releases. A
plain /Q
after an earlier /Q
or
/Qlevel
re-enables all messages.
/Z
— Reset All Options
Reset all options to their default values.
If you use the /Z
option on the command line, any
options in the environment variable will be
disregarded, and so will any preceding options on the command line.
I recommend using /Z
as the first option on every GREP
command in a batch file. This will make sure that GREP behaves as
expected, uninfluenced by any settings in the environment
variable.
The /Z
option is the only single-letter option whose
effect can't be reversed. If you use /Z
more than once,
GREP disregards the environment variable and all command-line options
up through the last /Z
.
/0
or /1
— Set ERRORLEVEL
to Show Whether Matches Were Found
These options control the values that GREP returns in the
DOS error level. /0
returns 0 if there are
matches or 1 if there are no matches; /1
returns
1 for matches or 0 for no matches. For more details, see
Return Values in the user guide.
/?
— Display Help
Display a help message and summary of
input filespecs,
options, and regex forms,
then exit with no further processing. The help message is more than
100 lines long, so you probably want to pipe it through
more
or a similar filter, like this:
grep /? | more
You can also redirect this information. For instance,
grep /? >prn
will send the help text to the printer.
Registered users who
use certain options frequently can put them in
the ORS_GREP
environment variable. You have the same
freedom as on the command line: leading slashes or hyphens, space
separation or options run together, caps or lower case.
Only options can be put in the environment variable. If you want to
store a regex, put it in a file and put
/F
file
in the environment variable; if you want to store a list of
input filespecs, put them in a file and
put /@
file in the
environment variable.
If you have some options in the ORS_GREP
environment
variable but you don't want one of them for a particular run of GREP,
you don't have to edit the environment variable. You can make most
changes on the command line, like this:
The /Z
option on the
command line makes GREP disregard the environment variable (as well as
any preceding options on the command line).
The numeric options
/0
and /1
,
which set return values from GREP, override each other. The latest one
specified in the environment variable or on the command line will be
effective.
/D
,
/E
,
/F
,
/K
,
/M
,
/P
,
/Q
,
/R
,
/W
, and
/@
in the environment variable can be overridden by
different settings on the command line. (If /D
and
/F
and /@
are set in the environment
variable, you can specify different files for them [including
-
] on the command line, but to clear them completely you
must use the /Z
option on the command
line.)
The /X
option
can be used multiple times, and therefore if you
have /X
in the environment variable and on the command
line, all listed groups of files will be excluded. You can put
/X*
on the command line to clear all previous
/X
options, or of course /Z
to clear all
previous options.
The other single-letter
options — namely,
/A
,
/B
,
/C
,
/H
,
/I
,
/J
,
/L
,
/N
,
/S
,
/U
,
/V
, and
/Y
—
function as toggles, but a "+
" suffix will turn them
definitely on.
Extended example: Suppose you have set the environment variable as
set ORS_GREP=/UNI
because you usually run GREP
with UNIX-style output (/U
option)
with line numbers (/N
option),
ignoring case of letters (/I
option).
If you want to run case sensitive for one particular run of GREP,
simply put the /I
option on the command line to reverse
the setting from the environment variable.
If you don't know what's in
the environment variable, perhaps because you're on an unfamiliar
machine, either put the /Z
option
on the command line followed by the options you want, or set them
positively by specifying for instance /L+
.
Finally, if you want to turn an option definitely off, without
regard to the environment variable, turn it on and then toggle it. To
turn off line numbers, /N+N
will always work, whether
N
was set in the environment variable or not.
(/N-
might be more logical, but for historical reasons
options with leading minus signs are allowed to run together, and such
a usage would conflict.)
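If the toggle rules seem slippery, this small Python model may help. It is a simplified sketch, not GREP's parser: the function name is hypothetical, and it handles only the single-letter toggles and the "+" suffix, ignoring /Z, the numeric options, and options that take arguments.

```python
def apply_options(env, cmdline, toggles="ABCHIJLNSUVY"):
    """Return the set of toggle options that end up switched on.

    Each letter flips its option; a letter followed by '+' forces it on.
    The environment variable is processed before the command line.
    """
    on = set()
    text = (env + cmdline).upper().replace("/", "").replace(" ", "")
    i = 0
    while i < len(text):
        letter = text[i]
        if letter in toggles:
            if i + 1 < len(text) and text[i + 1] == "+":
                on.add(letter)   # "+" suffix: definitely on
                i += 1           # skip the "+"
            else:
                on ^= {letter}   # plain letter: flip the current state
        i += 1
    return on

print(sorted(apply_options("/UNI", "/I")))    # case sensitivity restored
print(sorted(apply_options("/UNI", "/N+N")))  # line numbers forced off
print(sorted(apply_options("", "/N+N")))      # still off with empty variable
```

The last two calls show why /N+N is safe in either situation: the "+" forces the option on first, so the following plain N always turns it off.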
If you're ever in doubt about the interaction of options between
the command line and the environment variable, simply add
"/d- | more
" to the end of your command line
and GREP will tell you all the option settings in effect and how it
interprets your regex.
A regular expression or regex is a pattern of characters that will be compared to one or more input files. A line/record/buffer from an input file is a hit if all or part of it agrees with the pattern in the regex. You've already met some examples in the user guide.
A regex can be a simple text string, like mother
, or
it can include a bunch of special characters to express possibilities
like "repeated" and "any of these characters or substrings".
(If you want to search only for simple
strings, use the /E0
option and
ignore all this regex stuff.)
Regexes come in two flavors, basic and extended regexes. If you're new to regexes, you might want to ignore extended regexes while you get comfortable with basic regexes. Use the following Overview to help you find the particular feature you need. On the other hand, if you're already comfortable with regexes, you'll find additional material and tips in Mastering Regular Expressions by Jeffrey Friedl (O'Reilly & Associates).
A regex is a mix of normal characters and special characters. Here's an overview of the special characters, with hyperlinks to the places in this reference manual where they are discussed in detail.
The following characters are special if they occur outside of square brackets:
\
backslash (treat special character as normal)
\
backslash (character
types, simple assertions,
back references,
character encoding, extended regex only)
.
period (matches any character)
*
asterisk (0 or more occurrences)
+
plus sign (1 or more occurrences)
?
question mark (0 or 1 occurrence, extended regex only)
{
left brace (repetition count,
extended regex only)
[
left square bracket (start
character class)
^
caret (match start of line)
$
dollar sign (match end of line)
|
vertical bar (alternatives,
extended regex only)
(...)
parentheses or round brackets
(subexpressions, extended regex only)
The following characters are special if they occur within square brackets:
]
right square bracket (end
character class)
^
caret (negate
the character class)
\
backslash (treat special character as normal)
\
backslash (character encoding, extended regex only)
-
minus sign or hyphen
(character range)
[:
left square bracket followed by colon (introduce a
named character class, extended regex only)
Otherwise, every character is a normal character. Any of the above characters also becomes a normal character if preceded by a backslash, as will be shown below.
GREP offers two levels of regular expressions. This manual will mark certain features as "extended regex"; all others are common to basic and extended regexes.
Basic regexes offer a "core subset" of the regex capabilities. By default, GREP treats your regexes as basic, since that's the only kind there was before release 6.0. Special characters marked as "extended regex" are treated as normal characters in basic regexes.
Extended regexes can do much more than basic, including
| alternatives,
? optional match,
{ } quantifiers, and
( ) subexpressions.
If you want to use extended regexes, specify the
/E2
option, available only in GREP32.
Acknowledgement: Extended regexes were added to GREP in release 6.0, using the open-source PCRE library package, release 3.5, copyright by the University of Cambridge, England. Thanks are due to Philip Hazel for making this available, and in that spirit extended regexes were added to GREP with no increase in price. The primary download site for PCRE is <ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/>.
This GREP reference manual covers most of the features of extended regexes, but you might want to know about two additional references. For your convenience, the GREP download files include an abridged copy of Philip Hazel's PCRE man page, PCRE.HTM, with just the information relevant to GREP users. His original man page at <http://www.pcre.org/man.txt> also contains considerable information about incorporating PCRE in programs.
Different utilities define regexes differently; the following sections tell you how this GREP defines them. You can find fascinating tables of different interpretations in Jeffrey Friedl's book Mastering Regular Expressions (pages 63 and 182-183 of the 1997 edition).
A note to UNIX or Vim veterans:
This GREP follows the Perl or egrep scheme, which uses |
not
\|
for alternatives, ( )
not
\( \)
for subexpressions, \b
not \<
\>
for word boundaries. Be
alert to differences from the scheme you may know.
Any normal character matches itself. Example: the regex
abc
matches input lines that contain the three
consecutive characters a, b, and c.
You can use any character from space through character 255. When using 8-bit characters or certain special characters on the command line, see Special Rules for the Command Line below.
If you specify the /I
option, any
letter in your regex will match both the upper and lower case
of that letter. (By default, only unaccented English letters A-Z and a-z are
affected by the /I
option. In GREP32, you can use the
/M
option to select a mapping that
includes all letters.)
If you want to match a special character, you must precede it with
a backslash \
in your
regex. Example: to search for the string "^abc\def", you must put
backslashes before the two special characters
( \^abc\\def
).
That makes GREP treat them
as normal characters and not give them special meanings.
The Overview lists all the special
characters.
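You can try the backslash rule outside GREP with a short Python sketch. Python's re module is used here only as a stand-in whose escaping rule agrees with GREP's for this example; re.escape is a Python convenience, not a GREP feature.

```python
import re

pattern = r"\^abc\\def"          # backslash before ^ and before \
text = r"literal ^abc\def here"  # the string we want to find

found = re.search(pattern, text) is not None
print(found)

# Python can add the backslashes for you; GREP has no such helper.
escaped = re.escape(r"^abc\def")
print(escaped)
```

On the command line you would type the escaped form yourself, exactly as shown in the example above.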
The period (full stop or dot) in a regex normally matches
any character. Example: o.e
matches lines that contain
"ode", "one", "ope", "ore", and "owe". Of course it also matches lines
that contain "oae", "o e", "o$e", "o´e", and so on.
If you want to match a literal period, for instance to search for
"3.50", you need a backslash before the
period in your regex to turn it into a normal character
( 3\.50
).
In binary mode, the period matches any character including Ctrl-Z, carriage return, and line feed. In text mode, Ctrl-Z is end of file, and carriage return or line feed marks a line break.
A period between square brackets is just a normal character. For example, [.?!] matches any of the characters that end an unquoted sentence.
A plus sign (+
) after a character,
character class,
subexpression, or
back reference
matches one or more occurrences; an asterisk
(*
) matches zero or more occurrences.
In other words, the plus sign means "one or more" and the
asterisk means "any number, including none at all".
(The note on greediness below applies
to *
and +
in extended regexes.)
Example: Big.*night
matches lines that contain
"Big" followed by any number of any character followed by "night".
Since "any number" includes "zero", that regex also matches lines
that contain "Bignight".
Examples: snor+ing
matches lines that contain
"snoring", "snorring", "snorrring", and so on, but not "snoing".
snor*ing
matches those and also "snoing".
Used with a character class or
character type, the plus
sign and asterisk match any multiple characters in the class, not only
multiple occurrences of the same character. For instance,
sno[rw]+ing
matches lines that contain "snowing",
"snorwing", "snowrring", and so on.
Obligatory example: [A-Za-z_][A-Za-z0-9_]*
matches a C
or C++ identifier, which is an English letter or underscore, possibly
followed by any number of letters, digits, and underscores.
(The square brackets enclose character
classes.)
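The plus-sign and asterisk examples above can be tried in Python, whose re module treats these two characters the same way. This is a sketch for experimenting, with a made-up helper name and word list; it is not how GREP itself is implemented.

```python
import re

def hits(pattern, candidates):
    """Return the candidates that contain a match for the pattern."""
    return [c for c in candidates if re.search(pattern, c)]

words = ["snoing", "snoring", "snorring", "snowing", "snorwing"]
print(hits(r"snor+ing", words))     # one or more r
print(hits(r"snor*ing", words))     # zero or more r
print(hits(r"sno[rw]+ing", words))  # any mix of r and w

identifier = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
print(identifier.findall("x1 = _tmp + 42"))
```

Notice that the number 42 is not reported as an identifier, because the first character class does not include digits.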
Anything followed by *
will always match. For example,
the regex .*
would match any number of characters
including none, meaning that empty and non-empty lines would match.
.*
is more useful as part of a regex.
Between square brackets, + and * are
normal characters. For instance, the regex
2[*+]2
will match lines containing "2+2" and "2*2".
In an extended regex only, a question mark
after a character,
character class,
subexpression, or
back reference
indicates that the construct is optional.
For example, the extended regex move?able
matches
lines containing "moveable" and "movable", but not "moveeable";
labou?r
matches lines containing "labour" or "labor".
(The note on greediness below applies
to ?
in extended regexes.)
Anything followed by ?
will always match. For example,
the extended regex .?
would match one character or none.
Since every line contains a string of no characters (whether or not
there are some additional characters on the line), every line would be
a match.
?
is a normal character when it occurs within
square brackets in an extended regex; it's
always a normal character in a basic regex.
In an extended regex only, you can use braces (also called
curly braces) after a character,
character class,
subexpression, or
back reference
to specify repetition. The general form is
{
minimum,maximum}
where both
numbers are in the range 0 to 65535 and minimum is less than or equal to
maximum. Here are the three variations:
Specify a minimum and maximum number of repetitions:
Aa{1,5}
matches "Aa", "Aaa", "Aaaa", "Aaaaa", or
"Aaaaaa".
Specify an exact number of repetitions:
[0-9]{4}
matches four consecutive digits (not necessarily the
same digit four times).
Specify a minimum number of repetitions:
^.{5,}$
matches lines that contain at least five
characters.
Three special cases of quantifiers have already been discussed.
The asterisk *
is equivalent
to {0,}
;
the plus sign +
is
equivalent to {1,}
; and
the question mark ?
is
equivalent to {0,1}
.
The braces are normal characters in other
contexts. For instance, {,3}
is just four normal
characters because it doesn't match any of the three variations listed
above. The braces are always normal characters inside
square brackets, and the right brace on its
own is always a normal character.
Both braces are normal characters anywhere in a basic regex.
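The three variations can be checked with Python's re module, which uses the same brace syntax as extended regexes. Treat this as a sketch for experimenting: the helper name is invented, and Python is an assumption standing in for GREP32 with the /E2 option.

```python
import re

def whole(pattern, text):
    """True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

five = "A" + "a" * 5   # capital A followed by five a's
six = "A" + "a" * 6    # one a too many for {1,5}
print(whole(r"Aa{1,5}", five), whole(r"Aa{1,5}", six))

year = re.search(r"[0-9]{4}", "dated 13 Jan 2003").group(0)
print(year)

print(whole(r".{5,}", "abcd"), whole(r".{5,}", "abcde"))

# The shorthand quantifiers behave like their brace equivalents.
print(whole(r"ab{0,}", "a") == whole(r"ab*", "a"))
```

The fullmatch call plays the role of wrapping the regex in ^ and $, so each test is about the whole string and not a substring.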
(This is an advanced topic, probably best skipped on the first few readings of this reference manual.)
The quantifiers {...}
, ?
, *
,
and +
can be "greedy" or "ungreedy". A greedy quantifier
consumes as many characters as possible without causing the overall
extended regex to fail; an ungreedy quantifier consumes as few as
possible without causing the overall extended regex to fail.
Because both greedy and ungreedy quantifiers still let the overall
regex succeed if possible, you don't need to worry about the
distinction unless you're using capturing
subexpressions and back
references.
In an extended regex, all quantifiers are greedy by default. You can
make a particular quantifier ungreedy by putting a question mark after
it: {...}?
, ??
, *?
, or
+?
.
For details and examples, please see the "Repetition" section of the included file PCRE.HTM.
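Here is a brief Python sketch of the difference. Python's re module follows the same convention of a trailing question mark for ungreedy quantifiers; the sample text is invented, and the sketch shows only what portion is consumed, not anything specific to GREP.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

greedy = re.search(r"<.*>", text).group(0)     # as much as possible
ungreedy = re.search(r"<.*?>", text).group(0)  # as little as possible

print(greedy)
print(ungreedy)
```

Both searches succeed, so the line is a hit either way; the two forms differ only in which part of the line is treated as the match.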
To match any one of a group of characters, enclose them in square
brackets [ ]
.
Examples: [aA]
matches a capital or lower-case letter A;
sno[wr]ing
matches lines that contain "snowing" or
"snoring".
Immediately after the opening [ or [^,
a right square bracket is just a normal character: []abc]
matches the character ], a, b, or c. A right square bracket after a
left square bracket and at least one other character ends the
character class, though as always you can use a
backslash to make it normal:
[abc\]]
is the same character class as the preceding.
Finally, a right square bracket with no preceding left square bracket
is a normal character.
In an extended regex, certain abbreviations and class names are available for commonly used classes.
You can indicate a character range with the minus sign
or hyphen (-
, ASCII 45).
Examples:
[0-9]
will match any single digit, and
[a-zA-Z]
will match any English letter.
A character class can contain both ranges and single characters,
mixed any way as long as each range within the class is written
low-high: T-f
is fine since they are ASCII
84 and 102, but f-T
is invalid.
There is no difference to GREP between writing
out all the characters in a range and using the minus sign to
abbreviate a range: [pqrsty]
and [ytsrpq]
and
[yp-t]
and [yq-stp]
are just some of the ways
to write the same class.
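A quick way to convince yourself that those four spellings are the same class is to enumerate what each one matches. The sketch below does that in Python; the helper names are hypothetical, and Python's re module is assumed to agree with GREP on range syntax for these simple cases.

```python
import re
import string

def members(char_class):
    """Return the lower-case ASCII letters matched by a bracketed class."""
    return {c for c in string.ascii_lowercase if re.fullmatch(char_class, c)}

def is_valid(char_class):
    """True if Python accepts the class; a high-low range is rejected."""
    try:
        re.compile(char_class)
        return True
    except re.error:
        return False

forms = ["[pqrsty]", "[ytsrpq]", "[yp-t]", "[yq-stp]"]
sets = [members(f) for f in forms]
print(all(s == sets[0] for s in sets))
print(is_valid("[T-f]"), is_valid("[f-T]"))
```

Python happens to reject the backwards range f-T as well, which matches the rule stated above that each range must be written low-high.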
The minus sign is a normal character outside
square brackets.
It is also a normal character if it occurs at the beginning or end of a
class (immediately after the opening [
or [^
or immediately before the closing ]
character).
Here's one final example: To match any Western European letter (under most recent versions of Windows, in North America and Western Europe), a basic regex is
[a-zA-ZÀ-ÖØ-öø-ÿ]
(Note 1. That regex will work fine on the command line with GREP16 or in a file
[/F
option]
with either GREP. But to enter it on the command line with
GREP32, you must use numeric sequences for the 8-bit characters; see Special Rules for the Command Line
below.)
(Note 2. In GREP32, you can avoid the above mess.
Set an appropriate character mapping with the
/M
option and use
the extended regex [[:alpha:]]
.
(The /E2
option selects extended
regexes, and
named character classes
are discussed below.))
To match any character that is not in a class, use square
brackets with a caret or circumflex (^
,
ASCII 94).
Examples: [^0-9 ]
matches any character except a
digit or a space, and the[^a-z]
matches "the" followed by
anything except a lower-case letter.
The negative character class matches any character not within
the square brackets, but it does match a character. It might help to
read it as "a character other than ..." rather than just "not ...".
For instance,
the[^a-z]
matches "the" followed by a character other than a
lower-case letter, but it does not match "the" at the end of a line
because then "the" is not followed by any characters. For further
explanation, please see "Finding a Word"
under the rules for ^ and $, below.
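The end-of-line case is easy to demonstrate in Python. This is a sketch with invented sample strings; each string represents one line without its line terminator, and Python's re module stands in for GREP's matcher.

```python
import re

pattern = re.compile(r"the[^a-z]")

followed_by_space = pattern.search("the end") is not None
followed_by_letter = pattern.search("then") is not None
at_end_of_line = pattern.search("over the") is not None

print(followed_by_space, followed_by_letter, at_end_of_line)
```

The third result is the surprising one: there is no character after "the" for the negative class to match, so the line is not a hit.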
The caret has a different meaning when it occurs outside square brackets. And when it occurs within square brackets but not immediately after the opening left square bracket, the caret is a normal character.
If you use the /I
option to
specify case-blind matching, then the character class
[abc]
matches an upper-case or lower-case a, b, or c.
With the /I
option in effect, [^abc]
matches
any character except A, a, B, b, C, or c.
Extended regexes support POSIX character class names, such as
[:lower:]
for any lower-case letter and
[:^lower:]
for any character except a lower-case
letter. Notice that you can negate a character class name by putting a
caret after the first colon.
These are not character classes, but special names that you can
insert within square brackets as (part of) a character class. For
instance, the extended regex [AB[:^alpha:]]
matches
a capital A or B or any non-alphabetic character.
Here is the complete list of POSIX character class names. Remember
that they occur inside the normal square brackets for a
character class. Also remember that they
must be surrounded by [: :]
, or
[:^ :]
for negation.
| word | any "word" character (letters, digits and underscore, same as \w and can be redefined with the /M option) |
| alnum | any letter or digit |
| alpha | any letter |
| lower | any lower case letter |
| upper | any upper case letter |
| digit | any decimal digit (same as \d) |
| xdigit | any hexadecimal digit, decimal digits plus A-F and a-f |
| space | any white space character (same as \s) |
| graph | any printing character, excluding space |
| print | any printing character, including space |
| punct | any printing character, excluding letters and digits and the space character |
| ascii | any ASCII character (see note below) |
| cntrl | any control character |
The exact definitions of the above classes will depend on the
character mapping in effect. In the default C locale, the above
classes match only 7-bit characters (character positions 0-127); in
other mappings, 8-bit characters also match.
You can set the character mapping with the /M
option
.
Use the supplied file TEST255
to test the meaning of any
character class in your selected locale; see
examples in the supplied TOUR.BAT file.
A caret or circumflex (^
, ASCII 94)
at the start of a regex
means that the regex starts at the beginning of a line in
the file(s) being searched. A dollar sign ($
,
ASCII 36) at the end of a regex means that the regex
ends at the end of a line in the file(s) being searched.
The caret and dollar are sometimes called anchors because they anchor a regex to the start or end of a line (or both). They're also the two best-known examples of assertions, constructs that match a condition rather than a character.
Examples:
^[wW]hereas
matches the word "Whereas" or
"whereas" at the start of a line, but not in the middle of a line.
Blanks are not ignored, so if you want to find that word whenever it's
the first word of the line, you need to use a pattern like
^ *[wW]hereas
to allow for indention.
^$
will match only lines that contain no characters at
all.
^ *$
will match lines that contain no characters
or contain only spaces.
^ +$
will match lines that contain only spaces,
but not empty lines.
^[A-Za-z]+$
will find every line that contains
nothing but one or more English letters.
^ *[a-z]+ *$
will find every line that contains
exactly one lower-case English word, possibly preceded or followed by
blanks.
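The anchor examples above can be run against a handful of sample lines in Python. This is a sketch for experimenting: the sample lines and helper name are made up, each string stands for one line without its terminator, and Python's re module is assumed to treat ^ and $ the same way for such strings.

```python
import re

lines = ["", "   ", "Whereas the", "  whereas", "say whereas", "word", "two words"]

def hits(pattern):
    """Return the sample lines that contain a match for the pattern."""
    return [line for line in lines if re.search(pattern, line)]

print(hits(r"^$"))            # empty lines only
print(hits(r"^ *$"))          # empty or all spaces
print(hits(r"^ +$"))          # all spaces, not empty
print(hits(r"^[wW]hereas"))   # word at the very start
print(hits(r"^ *[wW]hereas")) # word after optional indentation
print(hits(r"^[A-Za-z]+$"))   # nothing but letters
print(hits(r"^ *[a-z]+ *$"))  # exactly one lower-case word
```

Comparing the fourth and fifth results shows why blanks matter: the indented "whereas" is found only when the pattern allows leading spaces.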
You should probably use ^
and $
only in
text mode or
record-oriented binary mode.
Also, they make sense only at the beginning and end of your regex. For
those who prefer to live life on the edge, here are the full
rules:
File mode | Basic regex | Extended regex |
---|---|---|
With line-oriented text or record-oriented binary (/R0 or /R2) | ^ at the start of a basic regex matches the start of a line or record; everywhere else (except just after a left square bracket) it's a normal character. $ at the end of a basic regex matches the end of a line or record; everywhere else it's a normal character. | ^ and $ outside square brackets always mean start and end of a line or record. If you misplace them, your extended regex won't match anything. |
With free-form binary (/R3) | In a basic regex, ^ and $ outside square brackets don't match anything useful. | In an extended regex, ^ and $ outside square brackets match a newline (ASCII 10). |
When GREP senses file format (/R-1 or /R-2) | Don't use ^ and $ in a regex with the /R-1 or /R-2 option. If you do use them, they work correctly in text files, but in binary files they match the start and end of every buffer, arbitrary file positions that are not likely to be useful. | (same as for a basic regex) |
It's a historical artifact that the rules for basic and extended regexes are not quite the same.
Suppose you want to find the
word "the" in a file, whether in caps or lower case. You can use the
/I
option
to make the search case blind, and concentrate
on constructing the regexes.
This section shows progressive refinements of the search technique.
If using GREP32, you might want to skip it and just use the
/E4
option.
At first glance,
[^a-z]the[^a-z]
seems adequate: anything other than a
letter, followed by "the", followed by anything but a letter. That
lets in "the" and rules out "then" and "mother". But it also rules
out "the" at the beginning or end of a line. (Remember that a negative
character class does insist on matching some character. Read it as
"any character other than ..." rather than as simply "not...".) The
solution with basic regexes requires four of them, for "the" at the
beginning, middle, or end of a line, or on a line by itself:
^the[^a-z]
[^a-z]the[^a-z]
[^a-z]the$
^the$
To search for just the occurrences of the word "the", put those four
lines in a file and then use the /F
option on GREP.
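The four-pattern approach can be sketched as follows. Python's `re` module stands in for GREP's basic regexes here (an assumption), the helper name `is_hit` is made up for this example, and lower-casing each line stands in for a case-blind search.

```python
import re

# The four patterns from the text, one per line as they would appear
# in a regex file; a line is a hit if any one of them matches.
patterns = [r"^the[^a-z]", r"[^a-z]the[^a-z]", r"[^a-z]the$", r"^the$"]

def is_hit(line):
    # Lower-casing the line plays the role of a case-blind search.
    lowered = line.lower()
    return any(re.search(p, lowered) for p in patterns)

for line in ["the cat", "see the cat", "that is the", "The", "then", "mother"]:
    print(f"{line!r}: {is_hit(line)}")
```

Each of the first four sample lines is caught by a different one of the four patterns, which is why all four are needed.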
But this becomes much easier if you use the power of extended
regular expressions (/E2
option,
GREP32 only).
You can search for the word "the", not embedded in larger words, with
one extended regex:
grep /e2 \bthe\b
Read this as "a word boundary, followed by t-h-e, followed by a
word boundary." As you would expect, start and end of line count as
word boundaries. Even easier, the /E4
option
will supply the \b
sequences for you:
grep /e4 the
There might be one problem with the above regular
expression: it would not match "the6" or "the_" since the underscore
and the digits are considered
"word" characters.
(This is how the -w
option works in most UNIX greps,
too.)
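A quick way to see the word-boundary behavior, including the digit and underscore cases, is the sketch below. It assumes Python's `re` module, whose `\b` treats letters, digits, and the underscore as "word" characters, much as described above; GREP itself is not involved.

```python
import re

word_the = re.compile(r"\bthe\b", re.IGNORECASE)

def has_word_the(line):
    """True if the line contains "the" delimited by word boundaries."""
    return word_the.search(line) is not None

for line in ["the", "The end", "mother", "then", "the6", "the_", "the-end"]:
    print(f"{line!r}: {has_word_the(line)}")
```

The hyphen is not a "word" character, so "the-end" counts as a hit while "the6" and "the_" do not.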
It's not likely you'd get such sequences in a text file, but if you
want to be absolutely precise you should use
an option like /Mfr,alpha
to define
"word" characters as just letters.
In an extended regex only, the vertical bar (|
,
ASCII 124) separates two or more alternatives. The extended regex will match
lines that contain any of the alternatives. It is legal for an
alternative to be empty, and this can be useful in
subexpressions.
Example: the extended regex cat|dog
will match any
input line that contains the string "cat" or "dog".
If you want alternatives for part of an extended regex, use parentheses or round brackets to form a subexpression. See the examples in the section on subexpressions.
If you are matching alternatives that must occur at the start or
end of a line, the anchor needs to be in each alternative. Example: to
match lines that start with "cat" or "dog", use ^cat|^dog
as your extended regex. Another way to express that is with a
subexpression, ^(cat|dog)
.
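To see why the anchor has to appear in each alternative, consider this sketch. It uses Python's `re` module in place of GREP's extended regexes (an assumption), and the sample lines are invented.

```python
import re

lines = ["cat nap", "dog days", "hot dog", "concatenate"]

def hits(pattern):
    """Return the sample lines that contain a match for the pattern."""
    return [line for line in lines if re.search(pattern, line)]

anchored_each = hits(r"^cat|^dog")        # anchor repeated in each alternative
anchored_group = hits(r"^(cat|dog)")      # anchor applied to a subexpression
anchored_first_only = hits(r"^cat|dog")   # anchor belongs to "cat" only

print(anchored_each)
print(anchored_group)
print(anchored_first_only)
```

The third pattern also reports "hot dog", because its second alternative is not anchored.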
Efficiency note: Alternatives can be slower than character classes.
The extended regexes bar|bat
and ba(r|t)
are logically equivalent to the basic regex
ba[rt]
, but the latter will generally execute faster
(even as an extended regex).
You may or may not notice any time difference, depending on the speed
of your computer and the size of the files that you're searching.
Caution: The vertical bar |
has special meaning on the
DOS command line. If your operating system doesn't let you override
that meaning, use the
/F-
option to enter your regex from the
keyboard, or see Backslash for
Character Encoding below.
In an extended regex only, the parentheses or round brackets have several uses, but only two will be discussed in this reference manual.
The first use is straightforward: to set up alternatives as part of an extended regex. For example, the extended regex
the quick (brown fox|white rabbit)
matches lines containing either "the quick brown fox" or "the quick white rabbit". Here's another example, adapted from the PCRE manual page:
cat(aract|erpillar|)s
matches lines containing "cataracts", "caterpillars", or "cats".
The second use of parentheses is to set up a "capturing subpattern", which can be referred to with a "back reference"; see "Backslash for Back References", below.
Parentheses are not special inside square brackets, or anywhere in a basic regex.
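The two subexpression examples above can be exercised with a small sketch. Python's `re` module is assumed as a stand-in for GREP's extended regexes, and the helper name `which_alternative` is made up for this example.

```python
import re

animals = re.compile(r"the quick (brown fox|white rabbit)")
cats = re.compile(r"cat(aract|erpillar|)s")

def which_alternative(pattern, line):
    """Return the text matched by the parenthesized part, or None if no match."""
    m = pattern.search(line)
    return m.group(1) if m else None

print(which_alternative(animals, "see the quick white rabbit run"))
for word in ["cataracts", "caterpillars", "cats", "catfish"]:
    print(word, repr(which_alternative(cats, word)))
```

For "cats" the parenthesized part matches the empty alternative, which is what makes an empty alternative useful in a subexpression.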
The parentheses or round brackets have several other meanings in an extended regex. To save space in this reference manual, they are not documented here but you can read about them in the accompanying PCRE.HTM file.
The backslash (\) has quite a number of uses.
First and simplest, when the backslash precedes any
special character it makes that character
normal. For example, the regex 2+2
normally matches a
string of two or more 2s. (The 2+
construct means
"one or more occurrences of the character
2".) If you want to match that middle character as an actual plus
sign, you must "escape" it with a backslash: 2\+2
.
If you want to match a backslash itself, you escape it in the same
way. For example, the regex ^c:\\
matches every line that begins
with "c:\".
The backslash functions as an escape both inside and outside of
square brackets. If you are not sure when
a non-alphabetic character like ]
or $
is
special and when it is not, just precede it with a backslash and it
will be a normal character, even if it would have been normal
anyway.
Example: To match any of the four signs of arithmetic, you might write
the regex [+-*/]
. But that minus sign has a
special meaning inside square brackets.
To treat it as a normal character you must escape it with the
backslash, like this: [+\-*/]
.
This is the only use of the backslash in basic regexes; the rest all relate to extended regexes.
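The escaping examples above behave the same way in most regex engines. The sketch below assumes Python's `re` module rather than GREP, and the helper name `found` is made up for this example.

```python
import re

def found(pattern, text):
    """True if the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None

print(found(r"2+2", "222"))         # unescaped +: one or more 2s, then a 2
print(found(r"2+2", "2+2"))         # the literal text 2+2 has no adjacent 2s
print(found(r"2\+2", "2+2"))        # escaped +: the plus sign itself
print(found(r"^c:\\", "c:\\temp"))  # escaped backslash at start of line

signs = re.compile(r"[+\-*/]")      # escaped minus is an ordinary character
print(signs.findall("a+b-c*d/e"))

try:
    re.compile(r"[+-*/]")           # unescaped minus forms a range from + to *
    unescaped_ok = True
except re.error:
    unescaped_ok = False
print(unescaped_ok)
```

In Python the unescaped class is rejected outright, because the range from "+" (ASCII 43) to "*" (ASCII 42) runs backward.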
Many regexes involve a type of character: digit (or not), blank (or not), and so forth. While you can always use ordinary character classes, in an extended regex you can also use these shortcuts on their own or as part of a character class:
\w | any "word" character, meaning any letter or decimal digit or an underscore (can be redefined with the /M option) |
\W | any character except a "word" character |
\d | any of the decimal digits |
\D | any character except a decimal digit |
\s | any whitespace character: tab, space, and so on |
\S | any character except a whitespace character |
The exact definitions of the above types will depend on the
character mapping in effect. In the default C locale, no 8-bit
characters (characters 128-255) are considered as possible
"word" characters, digits, or whitespace; in
other mappings, some 8-bit characters also match.
You can set the character mapping with the /M
option
.
Use the supplied file TEST255
to test the meaning of any
character type in your selected locale; see
examples in the supplied TOUR.BAT file.
Example: To scan a file for four-digit numbers, your regex
could repeat the \d
four times or use
curly braces: \d\d\d\d
or
\d{4}
.
Did you spot the problem with the preceding example? Yes, either of those extended regexes matches lines containing four-digit numbers. But it also matches lines containing five-digit numbers, since a five-digit number contains four consecutive digits. One way to match numbers of exactly four digits is to mark them as being preceded by start of line or a non-digit, and followed by end of line or a non-digit:
(^|\D)\d{4}($|\D)
Of course, if you know something about the files you're scanning you may not need to get so elaborate.
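The difference between the loose and the exact four-digit patterns shows up clearly on a few invented sample lines. As before, this is a sketch that assumes Python's `re` module in place of GREP.

```python
import re

loose = re.compile(r"\d{4}")
exact = re.compile(r"(^|\D)\d{4}($|\D)")

samples = ["year 2003", "2003", "zip 12345", "x1234y", "123"]

loose_hits = [s for s in samples if loose.search(s)]
exact_hits = [s for s in samples if exact.search(s)]

print(loose_hits)
print(exact_hits)
```

Only the exact pattern rejects "zip 12345", because every run of four digits in it has another digit on one side.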
Example: To scan for four hexadecimal digits, use the extended regex
[\da-fA-F]{4}
(This one has the same problem as the previous example: it also matches five or more hex digits. Fixing it is left as an exercise for the reader!)
The assertions in this section look like the above
character types, but there's an
important difference. The difference is that while a character type
matches a character of specified type, an assertion matches a position
in the line and doesn't "consume" a character. (You already know two
examples of assertions, namely the anchors
^
and $
.)
\b | word boundary, namely the transition between a word and a non-word character or vice versa, or the beginning or end of line if the adjacent character is a word character |
\B | not a word boundary |
\A | similar to ^ but matches start of buffer even in free-form binary mode (/R3 option) |
\Z | similar to $ but matches end of buffer even in free-form binary mode (/R3 option) |
These assertions are not valid inside square
brackets, and in fact \b
has a different meaning
inside a character class; see Backslash
for Character Encoding, below.
Outside square brackets, a backslash
followed by a digit other than 0 is interpreted as a back reference to
a capturing subpattern in the regex. For
example, \6
refers to the sixth capturing subpattern in
the extended regex.
Example (from the PCRE man page): the extended regex
(sens|respons)e and \1ibility
matches "sense and sensibility" or "response and responsibility" but not "sense and responsibility". A back reference always refers to the actual matching subpattern in this particular instance, not to just any alternative.
Example: U.S. toll-free area codes are 800, 888, 877, 866 (and soon
855). The regex 8[08765]{2} would be wrong because it would match
strings like "867" and "808". You need a back reference to ensure that
the third digit is the same as the second: 8([08765])\1
is your regex. That says you must have an 8, followed by 0, 8, 7, 6,
or 5, followed by a second occurrence of the same digit.
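Both back-reference examples can be checked with the sketch below. It assumes Python's `re` module as a stand-in for GREP's extended regexes, and it writes the area-code pattern in PCRE-style syntax, where capturing parentheses are unescaped.

```python
import re

austen = re.compile(r"(sens|respons)e and \1ibility")
loose_code = re.compile(r"8[08765]{2}")
toll_free = re.compile(r"8([08765])\1")

for phrase in ["sense and sensibility", "response and responsibility",
               "sense and responsibility"]:
    print(phrase, austen.search(phrase) is not None)

for code in ["800", "888", "877", "866", "855", "867", "808"]:
    print(code, loose_code.fullmatch(code) is not None,
          toll_free.fullmatch(code) is not None)
```

The loose pattern accepts "867" and "808"; the back-reference version accepts only codes whose second and third digits are identical.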
A "back reference" can actually be a forward reference: any of
\1
through \9
refers to the first through
ninth capturing subpattern in the extended regex, even if that
subpattern comes after the "back reference" in the regex. But
\10
and greater can refer only to subpatterns that
precede the back reference. If something looks like a back reference
but the number is greater than 9 and greater than the number of
capturing subexpressions to the left of it, it is read as
an encoded character in octal.
The last use of backslash in extended regexes is also
the ugliest. You can use a backslash to
encode certain characters, either non-printing characters or those
that DOS doesn't allow on the command line.
Note that you may not need these rules. If you use the
/F
option to enter
a regex from the keyboard or in a file, you can include any character
in it except NUL (ASCII 0), CR (ASCII 13), LF (ASCII 10), and Ctrl-Z
(ASCII 26).
(Please note that these rules for extended regexes are quite different from the Special Rules for the Command Line. It's an unfortunate incompatibility, but neither can be changed because PCRE is a supplied library for extended regexes and users rely on existing behavior of basic regexes.)
Except as noted, each of these sequences has the indicated meaning anywhere in an extended regex:
\a | "alarm", the BEL character, ASCII 7 |
\b | the backspace character, ASCII 8, but only inside square brackets. Outside square brackets it is an assertion. |
\cx | a control character. If x is a letter, it's straightforward: \cb and \cB are both Control-B, ASCII 2. If x is not a letter, it is XORed with 64 (hex 40). |
\e | escape, ASCII 27 |
\f | form feed, ASCII 12 |
\n | "newline", line feed, ASCII 10. This character will never be seen in a text file, since it marks a line break. It can occur in a binary file. |
\r | carriage return, ASCII 13. This character will never be seen in a text file, since it marks a line break. It can occur in a binary file. |
\t | tab, ASCII 9 |
\xhh | character with the given hex code hh (zero, one, or two digits). Examples: \x7c or \x7C is hex 7C (ASCII 124), the vertical bar character. \x or \x0 or \x00 is the NUL character, ASCII 0. |
\0dd | octal number of one to three digits. \032 is Control-Z, ASCII 26. |
\ddd | This sequence, one to three digits where the first one is not zero, is complicated. Outside square brackets, it's read as a decimal number and is interpreted as a back reference (above) if possible. Otherwise, or always inside square brackets, it's read as an octal number and the least significant 8 bits are taken as its value. Examples: \7 is a back reference. \11 is a back reference if there have already been eleven capturing subpatterns; otherwise it's octal 11, ASCII 9, the tab character. |
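A few of these encodings can be illustrated with a sketch. The helper `control_char` is a hypothetical function that only mirrors the \cx rule as described above; it is not GREP's or PCRE's code. The regex checks assume Python's `re` module, which shares some of these escapes but differs in others: for instance, it does not accept \cx or \e, and it requires exactly two hex digits after \x.

```python
import re

def control_char(x):
    # Sketch of the \cx rule as described above: letters are treated
    # case-insensitively, then the character code is XORed with 64.
    return chr(ord(x.upper()) ^ 64)

print(ord(control_char("b")), ord(control_char("B")))  # both Control-B

# Escapes that Python's re happens to share with the table above.
shared = {
    r"\x7c": "|",       # hex code
    r"\032": "\x1a",    # octal code, Control-Z
    r"\t": "\t",        # tab
    r"[\b]": "\x08",    # backspace, only inside square brackets
}
for pattern, char in shared.items():
    print(pattern, re.fullmatch(pattern, char) is not None)
```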
GREP defines some special sequences starting with a
backslash \
to let you get problem characters into your
regex.
These rules date back to a much earlier release of GREP.
Better ways are available now (see the /F
option),
but the special rules are maintained for upward
compatibility.
The special rules are in effect by default, but you
can turn them on or off with the
/E
option. The special rules never
apply when regexes are read from file or keyboard
(/F
option).
When the special rules are in effect, you can find out how GREP
applied them by using the /D
option
and looking for the "massaged" string or regex.
You need them only when you enter a regex or search
string on the command line (no /F
option),
and any of these is true:
Your regex or search string contains DOS-reserved characters like
<
, |
, >
, space, and
semicolon, or
Your regex starts with a - or /, which normally introduce GREP options, or
Your regex contains 8-bit characters (ASCII 128 and above) and you're running GREP32.
When you select extended regexes (/E2
option),
you probably don't want the special rules given
below. Extended regexes come with
their own ways of using a backslash
for character encoding.
Special "escape sequences" give you a way to enter special characters in a regex on the command line, as follows:
instead of | you can use any of |
---|---|
< (less) | \l \60 \0x3C \074 |
> (greater) | \g \62 \0x3E \076 |
| (vertical bar) | \v \124 \0x7C \0174 |
" (double quote) | \" \34 \0x22 \042 |
, (comma) | \c \44 \0x2C \054 |
; (semicolon) | \i \59 \0x3B \073 |
= (equal) | \q \61 \0x3D \075 |
(the space character) | \s \32 \0x20 \040 |
(tab) | \t \9 \0x09 \011 |
(escape) | \e \27 \0x1B \033 |
You can enter any character as a numeric sequence, not just the
special characters in the above list. Use
decimal, hex (leading 0x
), or
octal (leading zero). Example: capital A would be
\65
, \0x41
, or \0101
.
\0
is not allowed; either code something like
[^\1-\255]
("any character except ASCII 1 to 255") in
your basic regex, or use an
extended regex.
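The numeric forms in the table follow directly from each character's code, as this sketch shows. The function name `numeric_escapes` is made up for this example; it simply formats a character code in decimal, hex (leading 0x), and octal (leading zero), and it rejects code 0 because \0 is not allowed.

```python
def numeric_escapes(ch):
    """Return the decimal, hex, and octal spellings of one character."""
    code = ord(ch)
    if not 1 <= code <= 255:
        # \0 is not allowed, and codes above 255 are outside the 8-bit range.
        raise ValueError("character code must be in the range 1 to 255")
    return [f"\\{code}", f"\\0x{code:02X}", f"\\0{code:o}"]

for ch in ["A", "<", "|", "\t"]:
    print(repr(ch), " ".join(numeric_escapes(ch)))
```

For capital A this yields \65, \0x41, and \0101, matching the example in the text.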
This section lists the error and warning messages and
prompts produced by GREP, with explanations for most of them. Only
debug messages (/D
option) are
omitted.
Any message that begins "GREP failure" indicates that GREP failed.
While this might be a problem in your operating system, it could also
be a problem in the code of GREP itself.
If you suspect the latter, please send full details to
<support@oakroadsystems.com>.
If possible, first re-run the program with the
/D-
option and redirect output with
>file; then send that output file with your trouble
report.
With most of these errors, GREP returns 128 in
ERRORLEVEL
. Exceptions
are noted in the description.
[...]
character
class expands into too many characters for a basic regex. You
can probably complete your task by using the
/E2
option to specify an extended
regex. Please consult the description of
differences between basic and extended
regexes.
[...]
in a
basic regex. Please report this problem to the address
above. You can probably complete your task
by using the /E2
option to
specify an extended regex. Please consult the description of
differences between basic and extended
regexes.
ERRORLEVEL
.
You might try one or more of these general suggestions:
Run GREP32 instead of GREP16. If you're already running GREP32, try closing some memory-hungry Windows or DOS programs.
Reduce the values given with the
/P
option and the
/W
option.
GREP needs about this many bytes for the context buffer:
(9 plus the /W
value) times (1 plus the before
value from /P
).
If your extended regex
(/E2
option) is very complex, try
to simplify it.
If you set the
/S
option and you have many, many
levels of subdirectories, try running GREP in stages at different
levels. (The number of subdirectories at each level doesn't
matter; what counts is the number of levels.) Alternatively, use the
"dir /s/b
" command to create a file list, and pass
that list to GREP with the /@
option
but without the /S
option.
If you have many, many regexes
(/F
option), try running GREP
successively with fewer regexes at a time.
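The context-buffer estimate given in the suggestions above is easy to evaluate before choosing /W and /P values. The sketch below just restates that approximate formula; the function name and the sample values are hypothetical.

```python
def context_buffer_bytes(width, before):
    """Approximate context-buffer size: (9 + /W value) * (1 + before value from /P)."""
    return (9 + width) * (1 + before)

# Hypothetical settings, chosen only to show how the estimate grows.
for width, before in [(255, 2), (4096, 10)]:
    print(width, before, context_buffer_bytes(width, before))
```

Because the two values are multiplied, reducing either one shrinks the buffer proportionally.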
These messages all indicate some condition caused by the user that
prevents GREP from finishing its task. (Most programs would call them
"fatal errors.") Except as noted, GREP returns 255 in
ERRORLEVEL
if any of
these occur.
/F
option and
/@
option respectively.)
/D
option,
but it can't be opened for output. Check whether your
disk is full or write protected, or the file is in use by another
process. GREP returns 254 in
ERRORLEVEL
.
/@
option either doesn't exist or
can't be opened for reading. GREP returns 254 in ERRORLEVEL.
/F
option either doesn't exist or
can't be opened for reading. GREP returns 254 in ERRORLEVEL.
[a-Z]
is an error because Z (ASCII 90)
precedes a (ASCII 97) in the collating sequence.
\[
).
[]
or
[^]
. Please check the description of
character classes, or if you actually want
the right square bracket to be part of the class then precede it
with a backslash ([\]
or [^\]
).
/F-
option), but if you want to
read regexes from a named file (/F
file option)
you need the registered version.
/E2
option specifies
extended regexes, but these are not
supported in GREP16. Either remove the /E2
(or
/E3
) option, or use GREP32.
/@
option) is limited to
the longest path and filename allowed by the Microsoft run-time code. The
limit is 128 characters in GREP16 and 260 in GREP32.
/F
option)
is limited to 127 characters, even if you are reading extended regexes.
This limit could be increased in a future release if it is a burden.
/X
option.
/@
option to list
input filespecs in a file, but that file was empty.
/F
option
at all, or
/F-
option but pressed Control-Z
before entering any regexes, or
/F
file option but the file
was empty or contained only blank lines.
/E2
option to specify
an extended regex. Please consult the description of
differences between basic and extended
regexes.
/R-1
and /R-2
options tell GREP to sense the type of each file
automatically. This is an added benefit in the
registered version.
ERRORLEVEL
.
/E2
option).
/E0
option, your
search string must be 511 characters or less. You may be able to
complete your task by using the /E2
option
to specify an extended regex, but you may need to put
backslashes before certain characters. Please consult the
description of extended regexes.
/M
option
allows only specified strings.
/@
option.
/F
option.
/J
option displays only
matches, not the full line, record, or buffer containing them. The
/V
option displays lines,
records, or buffers that don't contain a match. Remove
one of the options and run GREP again.
/X
option.
/X
option.
/M
option
for the supported locales. Some additional locales are
supported, but if you look at Microsoft's documentation you'll see
some locales listed that are not actually supported in the
run-time library.
/R3
option) or let GREP sense the
type of each file (/R-1
or /R-2
option),
GREP internally splits your stated buffer size in half. The width
you specify for binary file buffers with the
/W
option must be divisible by 2.
Registered users can
suppress most of these warnings with the /Q
option as shown.
The /R3
option reads input
files as free-form binary, and the beginning and end of a binary
buffer don't have any special meaning in terms of the file data.
Therefore the anchors ^ and $ for start
and end of line or record don't mean anything. In an extended
regex (/E2
option), those anchors
are reinterpreted to mean the newline character, but that doesn't
happen in a basic regex. On the other hand, with record-oriented
binary (/R2
option), ^ and $ make
sense as the start or end of the record, both in a basic and in an
extended regex.
The /R-1
and /R-2
options check each input file and decide whether to read it
as text or free-form
binary. If the file is binary, you'll have the problem
mentioned in the preceding paragraph.
(warning suppressed by /Q3
)
If you specify the /S
option
(search subdirectories) in the
unregistered version, this
warning appears before GREP starts reading directories and files.
You cannot specify a directory as a named input file to GREP. The message suggests that you specify directory\* (directory\*.* would also work).
(warning suppressed by /Q3
)
Blank lines are ignored when reading input filespecs from file
(/@
option).
(warning suppressed by /Q2
or higher)
Blank lines are ignored when reading regexes from file
(/F
option).
(warning suppressed by /Q2
or higher)
At the end of execution, GREP checks whether it opened at
least one file for each filespec on the command line. It displays
this warning for each filespec that didn't match any actual files.
If you have the /S
option in
effect for subdirectory searches, this warning appears for each
filespec when not even one directory contains a file that matches
the filespec.
(GREP performs a similar diagnosis for each filespec
while reading a list file; see the /@
option.)
If you used the /X
option,
GREP will add the reminder "Maybe your /X exclusions
ruled out matching files?" See Missing
Files, near the start of this document.
(warning suppressed by /Q3
)
GREP displays the preceding warning for each input filespec
that doesn't lead to opening any files.
(No warning is displayed for files that exist but contain no
hits.)
If you have multiple filespecs on the command line or you used
the /@
option, and if
none of your input filespecs actually led to opening any
files, GREP displays this additional warning.
If no files are found for any of your input
filespecs, and there are no more serious problems, GREP will
return 4 in ERRORLEVEL
whether or not this warning is displayed.
(warning suppressed by /Q2
or higher)
You specified some input files on the command line, but you
also redirected input and you weren't using the redirected input
for a list of regexes with the /F-
option
or a list of input files with the /@
option.
(warning suppressed by /Q3
)
When reading text files, GREP keeps track of every
line that is longer than your stated maximum. (See the
/W
option.) At the end of the run,
it gives you this warning and suggests the value needed with
/W
to solve the problem. You should re-run GREP with the
suggested /W
option value to make sure you don't
miss any matches. If you want to know which files have the
oversize lines, use the /D
option.
(warning suppressed by /Q3
)
The Special Rules are a set of
hacks to let you get various special characters, reserved by DOS,
into a regex. (The Special Rules are turned on or off with the
/E
option.) When you use the
/F
option to enter regexes from file
or keyboard, there is no need for those hacks and they are not
applied.
(warning suppressed by /Q2
or higher)
The long form of the /M
option lets
you redefine a "word" character for purposes
of extended regexes (/E2
option).
That has no effect with simple searches or basic regexes
(/E0
or /E1
option).
(warning suppressed by /Q2
or higher)
In the unregistered version
of GREP, the /S
option searches
down only two levels of subdirectories. GREP then displays this
warning for each subdirectory further down the tree.
The /A
option says to include
hidden and system files when expanding
wildcard filespecs, but that
doesn't make any sense when no input files were named. If you
didn't specify any input files on the command line or via the
/@
option, the /A
option is ignored.
(warning suppressed by /Q2
or higher)
The /B
option shows the name
of every file read, whether or not it contains any hits. But the
/L
option shows only the names of
files that contain hits. If you specify both options, the
/L
option is honored.
(warning suppressed by /Q2
or higher)
The /B
option shows the filespec
of every file read, whether or not it contains any hits, on a
separate header line. But the /U
option
shows hits in UNIX style, with the filespec on every line. If you
specify both options, the /U
option is honored.
(warning suppressed by /Q2
or higher)
GREP release 6.0 added the ability to display matches without
the lines, records, or buffers that contained them (like the present
/J
option), but only when you
specified extended regexes (like the present
/E2
option). That combination was
specified by /E3
. In the next release the
/J
option was added, independent of your choice of
basic or extended regex, and /E3
became obsolete. It
is still honored for users who may have embedded it in batch files
or their environment variable.
(warning suppressed by /Q2
or higher)
The /B
option shows a file
header for every file examined, whether or not it contains any
hits. The /H
option suppresses
all file headers. If you specify both options, the /B
option is honored.
(warning suppressed by /Q2
or higher)
The /C
option shows the
count of hits with every file header (and doesn't show the
actual hits), but the /H
option
suppresses all filespec headers. If you specify both
options, the /C
option is honored.
(warning suppressed by /Q2
or higher)
The /L
option shows the names
of files that contain hits (and doesn't show the actual
hits), but the /H
option
suppresses all filespec headers. If you specify both options, the
/L
option is honored.
(warning suppressed by /Q2
or higher)
The /U
option shows matches
in UNIX style, putting the filename on every line instead of
displaying filename headers. The /H
option
suppresses filename headers, and therefore it is
included in the action of the /U
option.
(warning suppressed by /Q2
or higher)
The /C
option shows the
count of hits with every file header (and doesn't show the
actual hits). The /J
option
shows actual matches (though not the lines, records, or buffers
that contain them). If you specify both options, the
/C
option is honored.
(warning suppressed by /Q2
or higher)
The /L
option shows only the
filespecs of files that contain matches, but the
/J
option shows actual matches. If
you specify both options, the /L
option is honored.
(warning suppressed by /Q2
or higher)
The /K
option displays a set
maximum number of hits per file. The /C
option
and /L
option display
abbreviated information instead of displaying matching lines. If
you specify /C
or /L
, that option is
honored and /K
is ignored.
(warning suppressed by /Q2
or higher)
The /C
option shows the
filespecs of files that contain hits, with the count of hits in
each file, but the /L
option
shows only the filespecs without the count of hits. If you specify
both options, the /C
option is honored.
(warning suppressed by /Q2
or higher)
Microsoft's 16-bit run-time code doesn't support locale
settings, which are required to implement the
/M
option.
(warning suppressed by /Q3
)
The /C
option shows the
count of hits with every file header (and doesn't show the
actual hits), but the /N
option
shows line numbers with the hits. If you specify
both options, the /C
option is honored.
(warning suppressed by /Q2
or higher)
The /L
option shows the
filespecs of files that contain hits (not the actual hits), but the
/N
option shows line numbers with
the hits. If you specify both options, the /L
option is honored.
(warning suppressed by /Q2
or higher)
The /P
option displays
context lines or records around every line or record that contains
a match. The /C
option,
/J
option, and
/L
option all display abbreviated
information instead of the actual lines or records that contain
matches. If you specify the /P
option with any of the
others, the other option is honored.
(warning suppressed by /Q2
or higher)
The /R3
option tells GREP to
read files in free-form binary. There are no lines or records, and
so the /P
option (display context
lines or records) doesn't make sense. If you specify both options,
the /R3
option is honored.
(warning suppressed by /Q2
or higher)
You're running the unregistered
version, and you specified the /Q
option.
You didn't specify any filespecs on the command line or via
the /@
option, but used the
/R
option to specify some file
format other than text. GREP always reads redirected input files
(<file) and keyboard input as text.
(warning suppressed by /Q3
)
The /S
option says to search
subdirectories for the named files, but that can't be done when
GREP is reading only standard input because no input files were
named on the command line or via the
/@
option.
(warning suppressed by /Q2
or higher)
The /L
option shows the names
of files that contain hits (not the actual hits), but the
/U
option shows hits in UNIX
format (with the filespec on each line). If you specify both
options, the /L
option is honored.
(warning suppressed by /Q2
or higher)
You didn't specify any filespecs on the command line or via
the /@
option, but used the
/X
option on the command line
to exclude certain filespecs. When input is from standard
input, the /X
option is ignored. It is ignored
silently if the /X
options are in the
environment variable but not on the command line.
(warning suppressed by /Q2
or higher)
The /Y
option says that a
hit must match all of the (multiple) regexes. But you can specify
only one regex on the command line.
(warning suppressed by /Q2
or higher)
The /Y
option says that a
hit must match
all of the regexes. It has no effect if you enter only
one regex via the keyboard or regex file.
(warning suppressed by /Q2
or higher)
You're running the unregistered version, and you have something in the environment variable.
The /R3
option reads files as
free-form binary, and the /V
option
displays buffers that don't contain matches. Probably
what you want to know is which files
don't contain matches. To do this, run GREP again and add the
/L
option on the command line.
(warning suppressed by /Q3
)
/Q1
, /Q2
, or /Q3
)
/@-
option
(without redirection)
to read input filespecs from the keyboard. GREP is ready for you
to type the next one. If you have no more filespecs to enter,
press Control-Z immediately after this prompt. (With GREP16, you
need to press Enter after Control-Z.)
You didn't specify any input
filespecs, and you didn't redirect input from a file
(<file) or pipe it from another command
(other-command | grep
). This can be a good
way to explore the effects of certain regexes. After parsing the
command line, GREP takes input lines from the keyboard and tests
them against the regex(es). Only lines that contain a match will
be echoed to the output. (If you set the
/V
option, only lines that
don't contain a match will be echoed.)
Press Control-Z immediately after this prompt to end the GREP run. (With GREP16, you need to press Enter after Control-Z.)
/F-
option
(without redirection)
to read regular expressions from the keyboard. GREP
is ready for you to type the next one.
If you have no more regexes to enter, press Control-Z
immediately after this prompt.
(With GREP16, you need to press Enter after Control-Z.)
/@-
option
(without redirection)
to read input filespecs from the keyboard. GREP has finished
parsing the command line and is ready for you to type them in.
If you simply forgot to specify inputs or redirection, type Control-Z right away. Otherwise please see "line to test:" above.
With the /V
option,
the prompt changes to "... lines that don't contain a match."
/F-
option
(without redirection)
to read regular expressions from the keyboard. GREP has finished
parsing the command line and is ready for you to type them in.