The Regular Expression Module | Grace Documentation

Regular expressions are useful for searching in strings when the substring being sought is not a constant but some sort of pattern, such as a number, followed by a period, followed by a sequence of one or more letters, all at the start of a line.

Creating A Regular Expression

The regularExpression module can be imported using import "regularExpression" as re, for any identifier re of your choice. The object re will then respond to the following requests, both of which create a regular expression.

fromString(regEx:String)
    // returns a regular expression defined by regEx.  Most characters in
    // regEx will match themselves, but certain characters have special meaning,
    // as described by the tables below. The regex will perform a non-global,
    // case-sensitive match; to modify this, use the following request.
    
fromString(regEx:String) modifiers(modifiers:String)
    // returns a regular expression defined by regEx with modifiers, as defined below.
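
As a minimal sketch of both requests, the following assumes the module has been imported under the name re; the variable names are illustrative only. The pattern is the one described in the introduction: a number, a period, then one or more letters, all at the start of a line.

```grace
import "regularExpression" as re

// Written as an uninterpreted string (between ‹ and ›) so that \d and \.
// reach the regular-expression engine unchanged.
def numbered = re.fromString(‹^\d+\.[a-zA-Z]+›)

// The same pattern with modifiers: "i" ignores case, so [a-z] suffices,
// and "m" lets ^ match at the start of every line.
def numberedOnAnyLine = re.fromString(‹^\d+\.[a-z]+›) modifiers("im")
```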

Modifiers

By default, searches stop after the first match, and are case-sensitive. One, two or three of the following modifier characters can be used to define regular expressions that perform modified searches.

Modifier	Description
g	Perform a global match (find all matches rather than stopping after the first match)
i	Perform case-insensitive matching
m	Perform “multiline” matching: ^ and $ match at the beginning and end of each line
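
The effect of a modifier can be sketched as follows. The sample text and variable names are illustrative, and firstMatchingString(_)ifNone(_) is one of the requests described under Using Regular Expressions.

```grace
import "regularExpression" as re

def text = "Cat cat CAT"   // illustrative sample text

def exact = re.fromString("cat")                     // default: case-sensitive
def anyCase = re.fromString("cat") modifiers("i")    // also matches "Cat" and "CAT"

// With the default settings, the first match should be the lower-case word;
// with "i", the search can stop at the capitalised word that starts the text.
print(exact.firstMatchingString(text) ifNone { "no match" })
print(anyCase.firstMatchingString(text) ifNone { "no match" })
```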

Brackets

Brackets are used to find a range of characters; parentheses are used for grouping.

Expression	Description
[abc]	Matches any character between the brackets
[^abc]	Matches any character not between the brackets
[0-9]	Matches any character in the range 0–9 (any digit)
[^0-9]	Matches any character not in the range 0–9 (any non-digit)
(x|y)	Matches anything matching the regular expressions x or y
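
A short sketch of bracket and grouping expressions, with illustrative names and sample text. None of these patterns contains a backslash or a brace, so ordinary quoted strings are used.

```grace
import "regularExpression" as re

def vowel = re.fromString("[aeiou]")       // any one of a, e, i, o, u
def nonDigit = re.fromString("[^0-9]")     // any one character that is not a digit
def pets = re.fromString("(cat|dog)s")     // "cats" or "dogs"

print(pets.firstMatchingString("raining cats and dogs") ifNone { "no pets" })
```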

Metacharacters

Metacharacters are characters with a special meaning. Many of them are introduced by a backslash (\) character. Since \ in a quoted string introduces a string escape, it is suggested that patterns containing these metacharacters be written as uninterpreted strings between ‹ and ›. If you write patterns between " and ", then each \ must be written as \\.

Metacharacter	Description
.	Matches any single character except newline or line terminator
\w	Matches a word character: a–z, A–Z, 0–9, or _ (underscore)
\W	Matches a non-word character
\d	Matches a digit
\D	Matches a non-digit character
\s	Matches a whitespace character: space, tab, CR, LF, VT or FF
\S	Matches a non-whitespace character
\b	Matches at the start or end of a word (a word boundary), so \bHI matches words beginning with HI, and HI\b matches words ending with HI
\B	Matches at a position that is not the start or end of a word
\0	Matches a NUL character
\n	Matches a new line (LF) character
\f	Matches a form feed (FF) character
\r	Matches a carriage return (CR) character
\t	Matches a tab character
\v	Matches a vertical tab (VT) character
\ddd	Matches the character specified by the octal number ddd
\xdd	Matches the character specified by the hexadecimal number dd
\udddd	Matches the Unicode character with the hexadecimal codepoint dddd
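
The two ways of writing a pattern that contains backslashes can be sketched as below. Both definitions are intended to denote the same pattern (a word boundary, one or more digits, and another word boundary); the names are illustrative.

```grace
import "regularExpression" as re

// Uninterpreted string: each backslash is passed through as written.
def wholeNumber = re.fromString(‹\b\d+\b›)

// Quoted string: each backslash has to be doubled.
def wholeNumberEscaped = re.fromString("\\b\\d+\\b")
```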

Quantifiers

Quantifier	Description
n+	Matches any string that contains at least one n
n*	Matches any string that contains zero or more occurrences of n
n?	Matches any string that contains zero or one occurrences of n
n{X}	Matches any string that contains a sequence of X n’s
n{X,Y}	Matches any string that contains a sequence of X to Y n’s
n{X,}	Matches any string that contains a sequence of at least X n’s
n$	Matches any string with n at the end of it
^n	Matches any string with n at the beginning of it
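?=n	Matches any string that is followed by n (a lookahead, written in parentheses, as in x(?=n))
?!n	Matches any string that is not followed by n (a negative lookahead, written in parentheses, as in x(?!n))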
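
A few quantifiers in use, as a sketch with illustrative names. The patterns are written as uninterpreted strings because they contain backslashes and braces. The lookahead is written in parentheses, following JavaScript syntax.

```grace
import "regularExpression" as re

def phone = re.fromString(‹\d{3}-\d{4}›)       // three digits, a hyphen, four digits
def signed = re.fromString(‹[+-]?\d+›)         // an optional sign, then one or more digits
def pixels = re.fromString(‹\d+(?=px)›)        // digits, but only when followed by "px"
def lastWord = re.fromString(‹\w+$›)           // word characters at the end of the text

print(pixels.firstMatchingString("width: 120px") ifNone { "no match" })
```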

Using Regular Expressions

Once a regular expression has been created, the following requests can be made on it.

matches(text:String) → Boolean
    // answers true if the receiver matches text, and false if it does not
        
firstMatchingPosition⟦T⟧(text:String) ifNone(noMatchBlock:Function0⟦T⟧) → Number | T
    // answers the index of the first substring of text that matches the receiver,
    // or the result of applying noMatchBlock if there is no match
    
firstMatchingString⟦T⟧(text:String) ifNone(noMatchBlock:Function0⟦T⟧) → String | T
    // answers the first substring of text that matches the receiver,
    // or the result of applying noMatchBlock if there is no match
    
allMatches(text:String) → Collection ⟦MatchResult⟧
    // answers a collection containing all the substrings of text that match the receiver.
    // Each element is a MatchResult object that describes one match
    
type MatchResult = interface {
    position → Number        // the index at which the matching text starts
    group(i:Number) → String // i is an integer; returns the text matching the
                             // i-th parenthesized matching group,
                             // or raises BoundsError if there is no such group.
    whole → String           // returns the whole of the matching text
}
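
The requests above might be combined as in the following sketch. The sample text and names are illustrative. Two assumptions are made: that groups are numbered from 1, as in JavaScript, and that a regular expression created with the g modifier is the appropriate receiver for allMatches.

```grace
import "regularExpression" as re

def text = "10-20 and 30-40"   // illustrative sample text

// Two groups of digits separated by a hyphen.
def firstRange = re.fromString(‹(\d+)-(\d+)›)
def everyRange = re.fromString(‹(\d+)-(\d+)›) modifiers("g")

print(firstRange.matches(text))
print(firstRange.firstMatchingPosition(text) ifNone { 0 })
print(firstRange.firstMatchingString(text) ifNone { "no match" })

// Assumption: group(1) is the first parenthesized group.
for (everyRange.allMatches(text)) do { m ->
    print "{m.whole} at {m.position}: {m.group(1)} to {m.group(2)}"
}
```

The ifNone blocks supply a fallback value of the caller's choosing, which is why the result types are Number | T and String | T.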

Acknowledgements

The regular expression facility in Grace is implemented using the JavaScript Regular Expression system. This documentation page is based on the w3schools.com documentation page for JavaScript regular expressions.