A small change switches the quantifier to lazy mode, which stops as soon as a match is found. The modified code:
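$str = "<h2>hello</h2><h2>world</h2>";
// The ? after .* makes the quantifier lazy; the / in the closing tag is escaped because / is the delimiter
$preg = '/<h2>.*?<\/h2>/is';
preg_match($preg, $str, $arr);
print_r($arr); // $arr[0] is <h2>hello</h2>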
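To make the difference concrete, here is a minimal sketch that runs the greedy and the lazy pattern side by side on the same string. I use # as the delimiter here (my own choice, not required) so the / in the closing tag needs no escaping.

```php
<?php
$str = "<h2>hello</h2><h2>world</h2>";

// Greedy: .* grabs as much as possible, so the match spans both headings.
preg_match('#<h2>.*</h2>#is', $str, $greedy);

// Lazy: .*? stops at the first closing tag it can reach.
preg_match('#<h2>.*?</h2>#is', $str, $lazy);

echo $greedy[0], "\n"; // <h2>hello</h2><h2>world</h2>
echo $lazy[0], "\n";   // <h2>hello</h2>
```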
A closer look at preg_match_all()
The flags argument of this function (its fourth parameter) controls the shape of the returned array:
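// Assumed layout: one record per line (the m modifier and the ^ ... $ anchors suggest the line breaks were lost)
$str = "jiangsu (nanjing) nantong\nguangdong (guangzhou) zhuhai\nbeijing (tongzhou) haidian";
$preg = '/^\s*+([^(]+?)\s\(([^)]+)\)\s+(.*)$/m';
// PREG_PATTERN_ORDER: $arr[0] holds all full matches, $arr[1] all first groups, and so on
preg_match_all($preg, $str, $arr, PREG_PATTERN_ORDER);
print_r($arr);
// PREG_SET_ORDER: $arr[0] holds the first match with its groups, $arr[1] the second, and so on
preg_match_all($preg, $str, $arr, PREG_SET_ORDER);
print_r($arr);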
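The two array shapes are easier to compare on a tiny input. A minimal sketch with a made-up string and pattern (both are my own illustration, not from the example above):

```php
<?php
$str = "a=1 b=2";
$preg = '/(\w)=(\d)/';

// Grouped by capture group: one sub-array per group.
preg_match_all($preg, $str, $byGroup, PREG_PATTERN_ORDER);
// $byGroup[0] = ['a=1', 'b=2'], $byGroup[1] = ['a', 'b'], $byGroup[2] = ['1', '2']

// Grouped by match: one sub-array per match.
preg_match_all($preg, $str, $byMatch, PREG_SET_ORDER);
// $byMatch[0] = ['a=1', 'a', '1'], $byMatch[1] = ['b=2', 'b', '2']

print_r($byGroup);
print_r($byMatch);
```

PREG_SET_ORDER is usually the more convenient shape when you loop over matches one at a time; PREG_PATTERN_ORDER suits cases where you want a whole column of values at once.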
Powerful regex replacement callbacks
preg_replace() handles most replacements, but when you want finer control you can use a callback. An example says it best:
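$str = "china hello world";
// (\w+) captures everything but the last letter of a word, (\w) captures the last letter
$preg = '/\b(\w+)(\w)\b/';
function fun($m){
    // $m[1] and $m[2] hold the two captured groups
    return $m[1].strtoupper($m[2]);
}
echo preg_replace_callback($preg,"fun",$str); // chinA hellO worlD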
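The callback does not have to be a named function. A minimal sketch of the same replacement (upper-casing the last letter of each word) with an anonymous function passed inline; the type declarations are my own addition and assume PHP 7 or later:

```php
<?php
$str = "china hello world";

// Anonymous function: receives the match array, returns the replacement text.
$result = preg_replace_callback(
    '/\b(\w+)(\w)\b/',
    function (array $m): string {
        return $m[1] . strtoupper($m[2]);
    },
    $str
);

echo $result, "\n"; // chinA hellO worlD
```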
Python offers something similar: re.sub() accepts a function as its replacement argument, which I covered in an earlier article.
preg_quote()
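This function is similar to re.escape() in Python. When some metacharacters in a pattern should stand for the literal characters, you can escape them by hand, but too many escapes make the pattern hard to read. preg_quote() escapes them all in one go: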
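$str = '*china*world';
$preg = "*china";
// Escape regex metacharacters; the second argument also escapes the / delimiter
$preg = preg_quote($preg, '/');
echo $preg; // \*china
preg_match("/{$preg}/is", $str, $arr);
print_r($arr); // $arr[0] is *china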
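One detail worth a sketch: preg_quote() does not escape the pattern delimiter unless you pass it as the second argument. The input string below is made up for illustration.

```php
<?php
$literal = 'a/b*c';

// The second argument names the delimiter so that it gets escaped too.
$quoted = preg_quote($literal, '/');
echo $quoted, "\n"; // a\/b\*c

// The quoted text can now be embedded safely between / delimiters.
$found = preg_match("/{$quoted}/", 'x a/b*c y');
var_dump($found); // int(1)
```

Without the second argument the / would stay unescaped and end the pattern early.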
Handy uses of the lookahead ?=
The following explanation puts it well:
The "?=" combination means "the next text must be like this". This construct doesn't capture the text.
(1) This example extracts the protocol part of a URL, such as https or ftp. Note that the text matched after ?= is not included in the result.
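// $str = "http://www.google.com";
$str = "https://www.google.com";
// [a-z]+ must be followed by a colon, but the colon is not part of the match
$preg = '/[a-z]+(?=:)/';
preg_match($preg, $str, $arr);
print_r($arr); // $arr[0] is https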
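To see what the lookahead changes, here is a minimal sketch that compares it with a pattern that simply matches the colon. The URL is a made-up example.

```php
<?php
$url = "ftp://example.com/file.txt";

preg_match('/[a-z]+:/', $url, $with);        // the colon is consumed
preg_match('/[a-z]+(?=:)/', $url, $without); // the colon is only asserted

echo $with[0], "\n";    // ftp:
echo $without[0], "\n"; // ftp
```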
(2) "Invisible" delimiters
Also called "zero-width" delimiters; see the example below:
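$str = "chinaWorldHello";
// The lookahead matches the empty position in front of each upper-case letter
$preg = "/(?=[A-Z])/";
$arr = preg_split($preg, $str);
print_r($arr); // china, World, Hello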
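One caveat with zero-width splitting: if the string itself starts with an upper-case letter, the lookahead also matches at offset 0 and leaves an empty piece in front. A minimal sketch of one way around it, using the PREG_SPLIT_NO_EMPTY flag (the input string is my own variation):

```php
<?php
$str = "ChinaWorldHello";

// Without the flag, the match at offset 0 leaves an empty string in front.
$raw = preg_split('/(?=[A-Z])/', $str);

// PREG_SPLIT_NO_EMPTY drops empty pieces; -1 means "no limit".
$clean = preg_split('/(?=[A-Z])/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($clean); // China, World, Hello
```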
(3) Matching strong passwords
Instead of specifying the order in which things should appear, the pattern says that each of them must appear, in any order.
The first grouping is (?=.{8,}). This checks that the string has at least 8 characters. The next grouping, (?=.*[0-9]), means "any character can occur zero or more times, followed by a digit", so it checks that the string contains at least one digit. Because a lookahead does not consume any text, that digit can appear anywhere in the string. The next groupings, (?=.*[a-z]) and (?=.*[A-Z]), look for a lower-case and an upper-case letter, respectively, anywhere in the string.
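The full pattern is not shown here, so the following is only a sketch that assembles the groupings described above, assuming they are meant as (?=.{8,}), (?=.*[0-9]), (?=.*[a-z]) and (?=.*[A-Z]). The anchors, the trailing .* and the helper name is_strong_password() are my own assumptions.

```php
<?php
// Hypothetical helper: all four lookaheads start at the beginning of the
// string, so each condition is checked independently of the others.
function is_strong_password(string $password): bool
{
    $preg = '/^(?=.{8,})(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).*$/';
    return preg_match($preg, $password) === 1;
}

var_dump(is_strong_password('abcDEF123')); // bool(true)
var_dump(is_strong_password('abcdefgh1')); // bool(false): no upper-case letter
var_dump(is_strong_password('aB3'));       // bool(false): shorter than 8 characters
```

Since none of the lookaheads consumes text, reordering them does not change the result.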







