PATTERNS(7) | Miscellaneous Information Manual | PATTERNS(7) |

`patterns` - Lua's pattern matching rules

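A character class is used to represent a set of characters. The following combinations are allowed in describing a character class:

- `x`
- (where `x` is not one of the magic characters ‘^$()%.[]*+-?’) represents the character `x` itself.
- .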
- (a dot) represents all characters.
- %a
- represents all letters.
- %c
- represents all control characters.
- %d
- represents all digits.
- %g
- represents all printable characters except space.
- %l
- represents all lowercase letters.
- %p
- represents all punctuation characters.
- %s
- represents all space characters.
- %u
- represents all uppercase letters.
- %w
- represents all alphanumeric characters.
- %x
- represents all hexadecimal digits.
- %`x`
- (where `x` is any non-alphanumeric character) represents the character `x`. This is the standard way to escape the magic characters. Any non-alphanumeric character (including all punctuation characters, even the non-magical) can be preceded by a ‘%’ when used to represent itself in a pattern.
- [`set`]
- represents the class which is the union of all characters in `set`. A range of characters can be specified by separating the end characters of the range, in ascending order, with a ‘-’. All classes ‘%`x`’ described above can also be used as components in `set`. All other characters in `set` represent themselves. For example, ‘[%w_]’ (or ‘[_%w]’) represents all alphanumeric characters plus the underscore, ‘[0-7]’ represents the octal digits, and ‘[0-7%l%-]’ represents the octal digits plus the lowercase letters plus the ‘-’ character. The interaction between ranges and classes is not defined. Therefore, patterns like ‘[%a-z]’ or ‘[a-%%]’ have no meaning.

- [^`set`]
- represents the complement of `set`, where `set` is interpreted as above.
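As a rough illustration of the classes and sets listed above, the following sketch tries a few of them with stock Lua's `string.match` and `string.gsub`, which follow the same rules. The subject strings are invented for the example, and the results assume the default C locale.

```lua
-- Character classes and sets, tried with stock Lua's string library.
-- All subject strings below are made up for illustration.

-- '%d' is any digit; the trailing '+' repeats the class.
local digits = string.match("port 8080 open", "%d+")     --> "8080"

-- '[%w_]' is every alphanumeric character plus the underscore.
local ident = string.match("  my_var1 = 3", "[%w_]+")    --> "my_var1"

-- '[0-7]' is the octal digits; '9' falls outside the range.
local octal = string.match("0759", "[0-7]+")             --> "075"

-- '%.' escapes the magic dot so that it matches a literal '.'.
local escaped = string.gsub("a.b.c", "%.", "/")          --> "a/b/c"

-- An unescaped '.' matches every character.
local any = string.gsub("a.b.c", ".", "x")               --> "xxxxx"

-- '[^%s]' is the complement of the space class.
local word = string.match("  hello world", "[^%s]+")     --> "hello"

print(digits, ident, octal, escaped, any, word)
```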

For all classes represented by single letters (‘%a’, ‘%c’, etc.), the corresponding uppercase letter represents the complement of the class. For instance, ‘%S’ represents all non-space characters.
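A small sketch of the uppercase complements, again using stock Lua's string library with made-up subject strings:

```lua
-- Uppercase class letters denote the complement of the lowercase class.

-- '%S' matches any non-space character, so '%S+' is a run of non-spaces.
local first_word = string.match("  hello world", "%S+")   --> "hello"

-- '%D' matches any non-digit; removing those leaves only the digits.
local only_digits = string.gsub("a1b22c", "%D", "")       --> "122"

-- '%S' and '[^%s]' describe the same class.
local same = string.match("x y", "%S") == string.match("x y", "[^%s]")

print(first_word, only_digits, same)
```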

The definitions of letter, space, and other character groups depend on the current locale. In particular, the class ‘[a-z]’ may not be equivalent to ‘%l’.

A pattern item can be:

- a single character class, which matches any single character in the class;
- a single character class followed by ‘*’, which matches zero or more repetitions of characters in the class. These repetition items will always match the longest possible sequence;
- a single character class followed by ‘+’, which matches one or more repetitions of characters in the class. These repetition items will always match the longest possible sequence;
- a single character class followed by ‘-’, which also matches zero or more repetitions of characters in the class. Unlike ‘*’, these repetition items will always match the shortest possible sequence;
- a single character class followed by ‘?’, which matches zero or one occurrence of a character in the class. It always matches one occurrence if possible;
- ‘%`n`’, for `n` between 1 and 9; such an item matches a substring equal to the `n`-th captured string (see below);
- ‘%b`xy`’, where `x` and `y` are two distinct characters; such an item matches strings that start with `x`, end with `y`, and where the `x` and `y` are *balanced*. This means that if one reads the string from left to right, counting *+1* for an `x` and *-1* for a `y`, the ending `y` is the first `y` where the count reaches 0. For instance, the item ‘%b()’ matches expressions with balanced parentheses.
- ‘%f[`set`]’, a *frontier pattern*; such an item matches an empty string at any position such that the next character belongs to `set` and the previous character does not belong to `set`. The set `set` is interpreted as previously described. The beginning and the end of the subject are handled as if they were the character ‘\0’.
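The pattern items above can be tried out with stock Lua's string library. The following is only a sketch with invented subject strings; it contrasts the greedy and lazy repetitions and shows one use each of ‘?’, ‘%1’, ‘%b’ and ‘%f’.

```lua
-- Pattern items, tried with stock Lua's string library.
-- All subject strings below are made up for illustration.
local s = "<b>bold</b> and <i>italic</i>"

-- '*' is greedy: '.*' runs up to the last '>' in the subject.
local greedy = string.match(s, "<.*>")   --> the whole of s

-- '-' is lazy: '.-' stops at the first '>' that lets the match succeed.
local lazy = string.match(s, "<.->")     --> "<b>"

-- '?' makes the preceding class optional ('%-' is a literal '-' in a set).
local signed = string.match("+42", "[+%-]?%d+")   --> "+42"
local bare = string.match("42", "[+%-]?%d+")      --> "42"

-- '%1' matches the same text as the first capture, so the closing
-- quote must be the same character as the opening quote.
local quote, inner = string.match([[say "it's" now]], "([\"'])(.-)%1")

-- '%b()' matches from '(' to its balancing ')'.
local balanced = string.match("f(a(b)c) + g(d)", "%b()")   --> "(a(b)c)"

-- '%f[%a]' and '%f[%A]' anchor the match at word boundaries, so the
-- "cat" inside "concatenates" is left alone.
local replaced, count =
    string.gsub("the cat concatenates", "%f[%a]cat%f[%A]", "dog")

print(greedy, lazy, signed, bare, quote, inner, balanced, replaced, count)
```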

As a special case, the empty capture ‘()’ captures the current string position (a number). For instance, if we apply the pattern “()aa()” on the string “flaaap”, there will be two captures: 2 and 4.
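The sketch below tries the same pattern in stock Lua. Note that stock Lua counts string positions from 1, so it reports 3 and 5 for this subject; these are the same two positions as 2 and 4 when counted from 0.

```lua
-- Position captures in stock Lua, where string positions start at 1.
local before, after = string.match("flaaap", "()aa()")
print(before, after)   --> 3   5

-- The two positions delimit the matched text: 'before' is where the
-- match starts and 'after' is the position just past its end.
local matched = string.sub("flaaap", before, after - 1)
print(matched)         --> aa
```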

Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes, Patterns, Lua 5.3 Reference Manual, http://www.lua.org/manual/5.3/manual.html#6.4.1, Lua.org, PUC-Rio, June 2015.

June 10, 2017 | OpenBSD-current |