Regular expressions are useful for searching in strings, when the substring being sought is not a constant, but some sort of patten, such as a number, followed by a period, followed by a sequence of one or more letters, all at the start of a line.
Creating A Regular Expression
The regularExpresson module can be imported using import "regularExpression" as re
, for any identifier re
of your choice.
The object re
will then respond to the following requests, both of
which create a regular expression.
fromString(regEx:String)
// returns a regular expression defined by regEx. Most characters in
// regEx will match themsleves, but certain characters have special meaning,
// as described by the table below. The regex will perform a non-global,
// case-sensitive match; to modify this, use the following request.
fromString(regEx:String) modifiers(modifiers:String)
// returns a regular expression defined by regEx with modifiers, as defiend below.
Modifiers
By defualt, searches stop after the first match, and are case-sensitive. One, two or three of the following modifier characters can be used to define regular expressions that perform modified searches.
Modifier | Description |
---|---|
g | Perform a global match (find all matches rather than stopping after the first match) |
i | Perform case-insensitive matching |
m | Perform “multiline” matching: ^ and $ match at the beginning at end of each line |
Brackets
Brackets are used to find a range of characters; parentheses are used for grouping.
Expression | Description |
---|---|
[abc] | Matches any character between the brackets |
[^abc] | Matches any character not between the brackets |
[0-9] | Matches any character in the range 0–9 (any digit) |
[^0-9] | Matches any character not in the range 0–9 (any non-digit) |
(x|y) | Matches anything matching the regualr expressions x or y |
Metacharacters
Metacharacters are characters with a special meaning. Many of them are introduced by a
slash (\
) character. Since \
in a quoted string introduces a string escape,
it is suggested that patterns containing these metacharacters be written as
uninterpreted strings between ‹
and ›
.
If you write strings between "
and "
, then \
must be written as \\
.
Metacharacter | Description |
---|---|
. | Matches any single character except newline or line terminator |
\w | Matches a word character: a–z, A–Z, 0–9, or _ (underscore) |
\W | Matches a non-word character |
\d | Matches a digit |
\D | Matches a non-digit character |
\s | Matches a whitespace character: space, tab, CR, LF, VT or FF |
\S | Matches a non-whitespace character |
\b | Matches a match at the start or end of a word, so \bHI matches words beginning with HI, and HI\b matches words ending with HI |
\B | Matches a match, but not at the beginning/end of a word |
\0 | Matches a NUL character |
\n | Matches a new line (LF) character |
\f | Matches a form feed (FF) character |
\r | Matches a carriage return (CR) character |
\t | Matches a tab character |
\v | Matches a vertical tab (VT) character |
\ddd | Matches the character specified by the octal number ddd |
\xdd | Matches the character specified by the hexadecimal number dd |
\udddd | Matches the Unicode character with the hexadecimal codepoint dddd |
Quantifiers
Quantifier | Description |
---|---|
n+ | Matches any string that contains at least one n |
n* | Matches any string that contains zero or more occurrences of n |
n? | Matches any string that contains zero or one occurrences of n |
n{X} | Matches any string that contains a sequence of X n’s |
n{X,Y} | Matches any string that contains a sequence of X to Y n’s |
n{X,} | Matches any string that contains a sequence of at least X n’s |
n$ | Matches any string with n at the end of it |
^n | Matches any string with n at the beginning of it |
?=n | Matches any string that is followed by n |
?!n | Matches any string that is not followed by n |
Using Regular Expressions
Once a regular expression has been created, the following requests can be made on it.
matches(text:String) → Boolean
// answers true if the receiver matches text, and false if it does not
firstMatchingPosition⟦T⟧(text:String) ifNone(noMatchBlock:Function0⟦T⟧) → Number | T
// answers the index of the first substring of text that matches the receiver
firstMatchingString⟦T⟧(text:String) ifNone(noMatchBlock:Function0⟦T⟧) → String | T
// answers the first substring of text that matches the receiver
allMatches(text:String) → Collection ⟦MatchResult⟧
// answers a collection containing all the substrings of text that match the receiver.
// Each element is a MatchResult object that describes one match
type MatchResult = interface {
position → Number // the index at which the matching text starts
group(i:Number) → String // i is an integer; returns the text matching the
// i_th parenthesized matching group,
// or raises BoundsError if there is no such group.
whole → String // returns the whole of the matching text
}
Acknowlegements
The regular expression facility in Grace is implemented using the JavaScript Regular Expression system. This documentation page is based on the w3schools.com documentation page for JavaScript regular expressions.