This library provides generic functions for string manipulation, such as finding and extracting substrings and pattern matching. When indexing a string, the first character has position 1. See Page for an explanation about patterns, and Section 8.3 for some examples of string manipulation in Lua.
This function looks for the first match of pattern in str. If it finds one, it returns the indices in str where this occurrence starts and ends; otherwise, it returns nil. If the pattern specifies captures, the captured strings are returned as extra results. A third optional numerical argument specifies where to start the search; its default value is 1. A value of 1 as a fourth optional argument turns off the pattern matching facilities, so the function does a plain ``find substring'' operation.
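The four argument forms described above can be sketched as follows. This is a minimal illustration, assuming a Lua 5.x interpreter, where the search function is available as string.find; the local alias find and the sample strings are illustrative only.

```lua
-- Assumption: Lua 5.x, where the search function is available as string.find.
local find = string.find

local s = "key = value"

-- No captures: the start and end indices of the first match.
local i, j = find(s, "value")
print(i, j)                      -- 7  11

-- One capture: the captured text follows the two indices.
local a, b, word = find(s, "(%w%w*) =")
print(a, b, word)                -- 1  5  key

-- Third argument: begin the search at position 5, skipping the "e" in "key".
local k = find(s, "e", 5)
print(k)                         -- 11

-- Fourth argument: search for the literal text "." rather than a pattern.
local p = find("a.b", ".", 1, 1)
print(p)                         -- 2

-- No match: the result is nil.
local none = find(s, "missing")
print(none)                      -- nil
```

Without the fourth argument, the last search for "." would match at position 1, because a dot in a pattern matches any character.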
format('%q', 'a string with "quotes" and \n new line') will produce the string:
"a string with \"quotes\" and \
 new line"
The options c, d, E, e, f, g, i, o, u, X, and x all expect a number as argument, whereas q and s expect a string.
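A few of these options in use, as a small sketch. It assumes a Lua 5.x interpreter, where format is available as string.format; the values are arbitrary.

```lua
-- Assumption: Lua 5.x, where the function is available as string.format.
local format = string.format

-- Numeric options: d (decimal integer) and f (fixed-point, width 5, 2 decimals).
local line = format("%d items at %5.2f each", 3, 1.5)
print(line)      -- 3 items at  1.50 each

-- x, X and o print a number in lower-case hex, upper-case hex and octal.
local bases = format("%x %X %o", 255, 255, 8)
print(bases)     -- ff FF 10

-- c turns a character code into the corresponding character.
local word = format("%c%c%c", 76, 117, 97)
print(word)      -- Lua

-- String options: s copies the string, q adds quotes and escapes.
local pair = format("%s = %q", "name", 'say "hi"')
print(pair)      -- name = "say \"hi\""
```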
If repl is a string, its value is used for replacement. Any sequence in repl of the form %n with n between 1 and 9 stands for the value of the n-th captured substring.
If repl is a function, this function is called every time a match occurs, with all captured substrings as parameters. If the value returned by this function is a string, it is used as the replacement string; otherwise, the replacement string is the empty string.
An optional parameter n limits the maximum number of substitutions to occur. For instance, when n is 1 only the first occurrence of pat is replaced.
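The string form of repl and the limit n can be sketched as follows, assuming a Lua 5.x interpreter, where gsub is available as string.gsub. Only the first returned value (the resulting string) is used here.

```lua
-- Assumption: Lua 5.x, where the function is available as string.gsub.
local gsub = string.gsub

-- %1 and %2 in repl stand for the first and second captured substrings.
local swapped = gsub("hello world", "(%w%w*) (%w%w*)", "%2 %1")
print(swapped)   -- world hello

-- Without n, every occurrence is replaced.
local all = gsub("a,b,c", ",", ";")
print(all)       -- a;b;c

-- With n = 1, only the first occurrence is replaced.
local first = gsub("a,b,c", ",", ";", 1)
print(first)     -- a;b,c
```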
As an example, in the following expression each occurrence of the form $name$ calls the function getenv, passing name as argument (because only this part of the pattern is captured). The value returned by getenv will replace the pattern. Therefore, the whole expression:
gsub("home = $HOME$, user = $USER$", "$(%w%w*)$", getenv)
may return the string:
home = /home/roberto, user = roberto
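The same substitution can be reproduced without depending on the real environment. The sketch below assumes a Lua 5.x interpreter (string.gsub) and replaces getenv with a hypothetical lookup function backed by a table; the table contents simply mirror the result shown above. In Lua 5.x the dollar signs are escaped as %$ so that they are taken literally.

```lua
-- Assumption: Lua 5.x, where the function is available as string.gsub.
local gsub = string.gsub

-- Hypothetical stand-in for the environment, so the result is reproducible.
local env = { HOME = "/home/roberto", USER = "roberto" }

-- Called once per match with the captured name; returns the replacement text.
-- Unknown names yield the empty string.
local function lookup(name)
  return env[name] or ""
end

local result = gsub("home = $HOME$, user = $USER$", "%$(%w%w*)%$", lookup)
print(result)    -- home = /home/roberto, user = roberto
```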
A character class is used to represent a set of characters. The following combinations are allowed in describing a character class:
A pattern item may be a single character class, or a character class followed by * or by ?. A single character class matches any single character in the class. A character class followed by * matches 0 or more repetitions of characters in the class. A character class followed by ? matches 0 or 1 occurrences of a character in the class. A pattern item may also have the form %n, for n between 1 and 9; such an item matches a substring equal to the n-th captured string.
A pattern is a sequence of pattern items. Any repetition item (*) inside a pattern will always match the longest possible sequence. A ^ at the beginning of a pattern anchors the match at the beginning of the subject string. A $ at the end of a pattern anchors the match at the end of the subject string.
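The repetition and anchoring rules can be observed with a few searches. This is a sketch assuming a Lua 5.x interpreter, where the search function is available as string.find; the subject strings are arbitrary.

```lua
-- Assumption: Lua 5.x, where the search function is available as string.find.
local find = string.find

-- '*' takes the longest possible run: all three a's are matched.
local s1, e1 = find("aaab", "a*")
print(s1, e1)    -- 1  3

-- '?' makes the preceding item optional, so both spellings match in full.
local s2, e2 = find("color", "colou?r")
local s3, e3 = find("colour", "colou?r")
print(s2, e2)    -- 1  5
print(s3, e3)    -- 1  6

-- '^' anchors at the start: "hello" is present, but not at position 1.
local s4 = find("say hello", "^hello")
print(s4)        -- nil

-- '$' anchors at the end of the subject string.
local s5, e5 = find("say hello", "hello$")
print(s5, e5)    -- 5  9
```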
A pattern may contain sub-patterns enclosed in parentheses that describe captures. When a match succeeds, the substrings of the subject string that match captures are captured for future use. Captures are numbered according to their left parentheses.
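Capture numbering and the %n item can be sketched as follows, again assuming a Lua 5.x interpreter (string.find). The class %a (letters) is used here as it exists in Lua 5.x.

```lua
-- Assumption: Lua 5.x, where the search function is available as string.find.
local find = string.find

-- Captures are numbered by their left parentheses: the outer pair opens
-- first, so it is capture 1; the two inner pairs are captures 2 and 3.
local s, e, c1, c2, c3 = find("ab", "((%a)(%a))")
print(s, e, c1, c2, c3)   -- 1  2  ab  a  b

-- %1 matches a copy of whatever capture 1 matched, so "(%a)%1" finds the
-- first doubled letter.
local ds, de, letter = find("hello", "(%a)%1")
print(ds, de, letter)     -- 3  4  l
```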