Function Reference


StringRegExp

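Checks whether a string matches a given regular expression pattern.

StringRegExp ( "string", "pattern" [, flag [, offset]] )

Parameters

string The subject string to check.
pattern The regular expression to match.
flag [optional] A number that indicates how the function behaves. See the table below for details. Default is 0.
offset [optional] The string position at which to start the match (starting at 1). Default is 1.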


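Flag
0 Returns 1 (match) or 0 (no match).
1 Returns an array of matches.
2 Returns an array of matches including the full match (Perl / PHP style).
3 Returns an array of global matches.
4 Returns an array of arrays containing the global matches, each including the full match (Perl / PHP style).

Return Value

Flag = 0:
@Error Meaning
2 Bad pattern. @Extended = offset of the error in the pattern.


Flag = 1 or 2:
@Error Meaning
0 Array is valid. Check @Extended for the next offset.
1 Array is invalid. No matches.
2 Bad pattern, array is invalid. @Extended = offset of the error in the pattern.


Flag = 3 or 4:
@Error Meaning
0 Array is valid.
1 Array is invalid. No matches.
2 Bad pattern, array is invalid. @Extended = offset of the error in the pattern.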
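
The @Error and @Extended values listed above can be checked right after the call. The following is a minimal sketch, not a definitive implementation: the helper name _TryMatchAll is hypothetical (it is not part of AutoIt), and flag 3 is used only as an example.

```autoit
; _TryMatchAll is a hypothetical helper for illustration, not a built-in function.
Func _TryMatchAll($sSubject, $sPattern)
    Local $aMatches = StringRegExp($sSubject, $sPattern, 3)
    ; Copy both macros immediately: the next function call resets them.
    Local $iError = @error, $iExtended = @extended
    Switch $iError
        Case 1
            ConsoleWrite("No matches for: " & $sPattern & @CRLF)
        Case 2
            ConsoleWrite("Bad pattern, error at offset " & $iExtended & @CRLF)
    EndSwitch
    ; Pass the status on to the caller; the array is only valid when @error = 0.
    Return SetError($iError, $iExtended, $aMatches)
EndFunc   ;==>_TryMatchAll

Local $aDigits = _TryMatchAll("a1b22c333", "\d+")
If Not @error Then
    For $i = 0 To UBound($aDigits) - 1
        ConsoleWrite("match " & $i & ": " & $aDigits[$i] & @CRLF)
    Next
EndIf

_TryMatchAll("abc", "\d+") ; no digits in the subject
_TryMatchAll("abc", "(\d+") ; unbalanced parenthesis
```

Copying @error and @extended into local variables first is a defensive choice, since any later function call overwrites them.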

Remarks

The flag parameter can have one of five values (0 through 4). Flag 0 returns true (1) or false (0) to indicate whether the pattern was found. Flags 1 and 2 find the first match and return it in an array. Flags 3 and 4 find all matches and fill the array with all the matching text. Flags 2 and 4 include the full matching text as the first record, not just the capturing groups, which is all you get with flags 1 and 3.
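
To see the difference between the flags side by side, the sketch below runs one subject and one pattern through flags 0 to 3 and prints what comes back. The subject string and pattern are made up for illustration. Flag 4 is left out because it returns an array of arrays and needs a nested loop.

```autoit
; Illustrative subject and pattern; both are assumptions, not taken from the reference.
Local $sSubject = "key1=val1; key2=val2"
Local $sPattern = "(\w+)=(\w+)"

; Flag 0: only reports whether the pattern matched.
ConsoleWrite("flag 0: " & StringRegExp($sSubject, $sPattern, 0) & @CRLF)

; Flags 1 to 3 return arrays; print every element with its index.
For $iFlag = 1 To 3
    Local $aResult = StringRegExp($sSubject, $sPattern, $iFlag)
    For $i = 0 To UBound($aResult) - 1
        ConsoleWrite("flag " & $iFlag & " [" & $i & "]: " & $aResult[$i] & @CRLF)
    Next
Next
```

Going by the flag descriptions, flag 1 should list key1 and val1, flag 2 should put key1=val1 in front of them, and flag 3 should list the groups of both matches.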

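Regular expression notation is a compact way of specifying a pattern for strings that can be searched. In a regular expression, plain text characters indicate text that must exist in the target string, while certain characters have special meanings that indicate what variability is allowed in the target string. AutoIt regular expressions are normally case-sensitive.

Regular expressions are constructed from one or more of the simple specifiers listed below. If a character is not in the following tables, it matches only itself.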

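Repeating characters (*, +, ?, {...} ) will try to match the largest possible amount of text that still allows the following characters to match as well, unless followed by a question mark; then they will find the smallest pattern that allows the following characters to match as well.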
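
A short sketch of the greedy and lazy behaviour described above. The subject string is an assumption chosen so that the two forms give different results.

```autoit
; The subject contains two tagged sections, so greedy and lazy matching differ.
Local $sSubject = "<b>one</b> <b>two</b>"

; Greedy: .* takes as much as possible while still letting </b> match.
Local $aGreedy = StringRegExp($sSubject, "<b>(.*)</b>", 1)
ConsoleWrite("greedy: " & $aGreedy[0] & @CRLF) ; prints: greedy: one</b> <b>two

; Lazy: .*? takes as little as possible while still letting </b> match.
Local $aLazy = StringRegExp($sSubject, "<b>(.*?)</b>", 1)
ConsoleWrite("lazy:   " & $aLazy[0] & @CRLF) ; prints: lazy:   one
```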

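Nested groups are allowed, but keep in mind that every group except non-capturing groups assigns to the returned array. Groups are numbered by the position of their opening parenthesis, so an outer group is stored before the groups nested inside it.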
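
When it is unclear which array element a nested group ends up in, printing the array with its indexes settles the question. The sketch below does that for a made-up date string; the pattern mixes nested capturing groups with one non-capturing group, which should not appear in the output at all.

```autoit
; Outer group wraps an inner capturing group and a non-capturing group;
; a further capturing group follows.
Local $aGroups = StringRegExp("2010-03-27", "((\d{4})-(?:\d{2}))-(\d{2})", 1)
If @error = 0 Then
    ; Print each captured value with its index to inspect the ordering.
    For $i = 0 To UBound($aGroups) - 1
        ConsoleWrite("[" & $i & "] " & $aGroups[$i] & @CRLF)
    Next
EndIf
```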

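A complete description can be found here

Caution: a bad regular expression can produce a quasi-infinite loop that hogs the CPU, and can even cause a crash.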

Matching Characters

[ ... ] Match any character in the set. e.g. [aeiou] matches any lower-case vowel. A contiguous set can be defined using a dash between the starting and ending characters. e.g. [a-z] matches any lower-case character. To include a dash (-) in a set, use it as the first or last character of the set. To include a closing bracket in a set, use it as the first character of the set. e.g. [][] will match either [ or ]. Note that special characters do not retain their special meanings inside a set, except that \\, \^, \-, \[ and \] match the escaped character inside a set.
[^ ... ] Match any character not in the set. e.g. [^0-9] matches any non-digit. To include a caret (^) in a set, put it after the beginning of the set or escape it (\^).
[:class:] Match a character in the given class of characters. Valid classes are: alpha (any alphabetic character), alnum (any alphanumeric character), lower (any lower-case letter), upper (any upper-case letter), digit (any decimal digit 0-9), xdigit (any hexadecimal digit, 0-9, A-F, a-f), space (any whitespace character), blank (only a space or tab), print (any printable character), graph (any printable character except spaces), cntrl (any control character [ascii 127 or <32]) or punct (any punctuation character). So [0-9] is equivalent to [[:digit:]].
[^:class:] Match any character not in the class, but only if the first character.
( ... ) Group. The elements in the group are treated in order and can be repeated together. e.g. (ab)+ will match "ab" or "abab", but not "aba". A group will also store the text matched for use in back-references and in the array returned by the function, depending on flag value.
(?#....) comment (not nestable).
(?i) Case-insensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-insensitive matching from that point on.
(?-i) (default) Case-sensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-sensitive matching from that point on.
(?: ... ) Non-capturing group. Behaves just like a normal group, but does not record the matching characters in the array nor can the matched text be used for back-referencing.
(?i: ... ) Case-insensitive non-capturing group. Behaves just like a non-capturing group, but performs case-insensitive matches within the group.
(?-i: ... ) Case-sensitive non-capturing group. Behaves just like a non-capturing group, but performs case-sensitive matches within the group.
(?J) allow duplicate names.
(?m) ^ and $ match newlines within data.
(?s) . matches any character, including newline. (By default "." does not match newline.)
(?U) Invert greediness of quantifiers.
(?x) Ignore whitespace and # comments.
(?-...) unset option(s).
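. Match any single character (except newline).
| Or. Matches either the expression before the | or the expression after it.
\ Escape a special character (make it match the actual character) or introduce a special character type (see below).
\\ Match an actual backslash (\).
\a Alarm, that is, the BEL character (chr(7)).
\A Match only at the beginning of the string.
\b Match at a word boundary.
\B Match when not at a word boundary.
\c Match a control character, based on the next character. For example, \cM matches ctrl-M.
\d Match any digit (0-9).
\D Match any non-digit.
\e Match an escape character (chr(27)).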
\E end case modification.
\f Match a form feed character (chr(12)).
\G first matching position in subject.
\h Any horizontal whitespace character.
\H Any character that is not a horizontal whitespace character.
\n Match a linefeed (@LF, chr(10)).
\K reset start of match.
\N a character that is not a newline
\Q quote (disable) pattern metacharacters till \E.
\r Match a carriage return (@CR, chr(13)).
\R a newline sequence.
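\s Match any whitespace character: Chr(9) through Chr(13) (horizontal tab, line feed, vertical tab, form feed and carriage return) and the standard space ( Chr(32) ).
\S Match any non-whitespace character.
\t Match a tab character (chr(9)).
\v Any vertical whitespace character.
\V Any character that is not a vertical whitespace character.
\w Match any "word" character: a-z, A-Z, 0-9 or underscore (_).
\W Match any non-"word" character.
\ddd Match the character with octal code ddd, or a back-reference if such a group exists. A back-reference matches exactly the text captured by the prior group with the given number. For example, ([[:alpha:]])\1 would match a double letter.
\xhh Match the character with hexadecimal code hh (two hex digits).
\x{hhh..} Match the character with hexadecimal code hhh.., written inside braces.
\z Match only at the end of the string.
\Z Match only at the end of the string, or before a newline at the end.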
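
A few of the specifiers from the table above, tried on small made-up strings. This is a sketch for experimentation rather than a complete tour; each call uses flag 0 or flag 1 as documented earlier.

```autoit
; (?i) switches to case-insensitive matching from that point on.
ConsoleWrite(StringRegExp("Hello World", "hello") & @CRLF) ; prints 0
ConsoleWrite(StringRegExp("Hello World", "(?i)hello") & @CRLF) ; prints 1

; \b only matches at a word boundary, so "cat" inside a longer word is skipped.
ConsoleWrite(StringRegExp("concatenate", "\bcat\b") & @CRLF) ; prints 0
ConsoleWrite(StringRegExp("a cat sat", "\bcat\b") & @CRLF) ; prints 1

; \Q ... \E quotes metacharacters, so the parentheses are matched literally.
ConsoleWrite(StringRegExp("price: 3.50 (USD)", "\Q(USD)\E") & @CRLF) ; prints 1

; \1 is a back-reference to the first group: find a doubled word character.
Local $aDouble = StringRegExp("letter", "(\w)\1", 1)
If @error = 0 Then ConsoleWrite($aDouble[0] & @CRLF) ; prints t
```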

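Repeating Characters

{x} Repeat the previous character, set or group exactly x times.
{x,} Repeat the previous character, set or group at least x times.
{0,x} Repeat the previous character, set or group at most x times.
{x,y} Repeat the previous character, set or group between x and y times, inclusive.
* Repeat the previous character, set or group 0 or more times. Equivalent to {0,}
+ Repeat the previous character, set or group 1 or more times. Equivalent to {1,}
? The previous character, set or group may or may not appear. Equivalent to {0,1}
? (after a repeating character) Find the smallest match instead of the largest.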
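
The bounds of a {x,y} repeater are easiest to check with anchors on both ends, as in this small sketch. The test strings are assumptions picked to sit on either side of the limits.

```autoit
; ^ and $ force the whole string to consist of two or three "a" characters.
Local $aTests[4] = ["a", "aa", "aaa", "aaaa"]
For $sTest In $aTests
    ConsoleWrite($sTest & " -> " & StringRegExp($sTest, "^a{2,3}$") & @CRLF)
Next
; prints:
; a -> 0
; aa -> 1
; aaa -> 1
; aaaa -> 0
```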

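Character Classes

[:alnum:] letters and digits
[:alpha:] letters
[:ascii:] character codes 0 - 127
[:blank:] space or tab only
[:cntrl:] control characters
[:digit:] decimal digits (same as \d)
[:graph:] printing characters, excluding space
[:lower:] lower-case letters
[:print:] printing characters, including space
[:punct:] printing characters, excluding letters and digits
[:space:] white space (not quite the same as \s, it also includes VT: chr(11))
[:upper:] upper-case letters
[:word:] "word" characters (same as \w)
[:xdigit:] hexadecimal digits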
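
Class names are written inside a set, so a class on its own takes double brackets, as in [[:digit:]]. A minimal sketch with a made-up subject string:

```autoit
Local $sSubject = "Room 42B, Floor 3"

; All runs of decimal digits; for this subject that should be 42 and 3.
Local $aNumbers = StringRegExp($sSubject, "[[:digit:]]+", 3)
For $i = 0 To UBound($aNumbers) - 1
    ConsoleWrite("digits: " & $aNumbers[$i] & @CRLF)
Next

; All upper-case letters; for this subject that should be R, B and F.
Local $aUpper = StringRegExp($sSubject, "[[:upper:]]", 3)
For $i = 0 To UBound($aUpper) - 1
    ConsoleWrite("upper:  " & $aUpper[$i] & @CRLF)
Next
```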

General comments about UTF-8 mode (used internally by AutoIt to translate the pattern):

    1. An unbraced hexadecimal escape sequence (such as \xb3) matches a two-byte UTF-8 character if the value is greater than 127.

    2. Octal numbers up to \777 are recognized, and match two-byte UTF-8 characters for values greater than \177.

    3. Repeat quantifiers apply to complete UTF-8 characters, not to individual bytes, for example: \x{100}{3}.

    4. The dot metacharacter matches one UTF-8 character instead of a single byte.

    5. The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly test characters of any code value, but the characters that PCRE recognizes as digits, spaces, or word characters remain the same set as before, all with values less than 256. Note that this also applies to \b, because it is defined in terms of \w and \W.

    6. Similarly, characters that match the POSIX named character classes are all low-valued characters.

    7. However, the Perl 5.10 horizontal and vertical whitespace matching escapes (\h, \H, \v, and \V) do match all the appropriate Unicode characters.

    8. Case-insensitive matching applies only to characters whose values are less than 128. PCRE supports case-insensitive matching only when there is a one-to-one mapping between a letter's cases. There are a small number of many-to-one mappings in Unicode; these are not supported by PCRE.
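
Notes 1 and 4 above can be probed with a single non-ASCII character. The sketch below builds the subject with ChrW so that it does not depend on the encoding of the script file. If the notes hold for the installed AutoIt version, each call should report a match, but the output is not asserted here.

```autoit
; "caf" followed by e-acute (code point 0xE9), built without a literal non-ASCII character.
Local $sSubject = "caf" & ChrW(0xE9)

; Note 4: the dot should consume the whole character, not a single byte.
ConsoleWrite("dot:    " & StringRegExp($sSubject, "^caf.$") & @CRLF)

; Note 1: an unbraced hexadecimal escape above 127 should match the character.
ConsoleWrite("\xE9:   " & StringRegExp($sSubject, "^caf\xE9$") & @CRLF)

; The braced form of the same escape.
ConsoleWrite("\x{E9}: " & StringRegExp($sSubject, "^caf\x{E9}$") & @CRLF)
```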

Related

StringInStr, StringRegExpReplace

Example


;=============================================================
;Official examples
;=============================================================
;Example 1: return an array of matches, using the offset
Local $nOffset = 1

Local $array
While 1
    $array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 1, $nOffset)

    If @error = 0 Then
        $nOffset = @extended
    Else
        ExitLoop
    EndIf
    For $i = 0 To UBound($array) - 1
        MsgBox(4096, "RegExp Test with Option 1 - " & $i, $array[$i])
    Next
WEnd


;Example 2: return an array that includes the full match (Perl / PHP style).
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 2)
For $i = 0 To UBound($array) - 1
    MsgBox(4096, "RegExp Test with Option 2 - " & $i, $array[$i])
Next


;Example 3: return an array of global matches.
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 3)

For $i = 0 To UBound($array) - 1
    MsgBox(4096, "RegExp Test with Option 3 - " & $i, $array[$i])
Next


;Example 4: return an array of arrays containing the global matches, each including the full match (Perl / PHP style).
$array = StringRegExp('F1oF2oF3o', '(F.o)*?', 4)

For $i = 0 To UBound($array) - 1

    Local $match = $array[$i]
    For $j = 0 To UBound($match) - 1
        MsgBox(4096, "RegExp Test with Option 4 - " & $i & ',' & $j, $match[$j])
    Next
Next

;=============================================================
;Tip from kodin: I strongly recommend using a regex testing tool while learning.
;Online regex testing tool: http://www.gskinner.com/RegExr/
;=============================================================

;Example 1: match an email address
$Email = '131sg31gsg autoit@acn.com  313sfsg31sg'
$array = StringRegExp($Email, '\b[\w\.-]+@[\w\.-]+\.\w{2,4}\b', 2)
;With flag 2, element 0 holds the full match.
MsgBox(4096, "RegExp Test", $array[0])

;Example 2: match a date and time (yyyy-mm-dd hh:mm:ss)
$data = 'data 2010-03-27 12:30:10'
$array = StringRegExp($data, '(19[0-9]{2}|2[0-9]{3})-(0[1-9]|1[012])-([123]0|[012][1-9]|31) ([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])', 2)
;Element 0 is the full match; elements 1 to 6 hold the captured fields.
MsgBox(4096, "RegExp Test", $array[0])