正则表达式 #

一、正则表达式基础 #

1.1 什么是正则表达式 #

正则表达式（Regular Expression）是一种用于匹配字符串模式的工具。

1.2 基本语法 #

元字符	说明
.	匹配任意单个字符
^	匹配字符串开头
$	匹配字符串结尾
*	匹配0次或多次
+	匹配1次或多次
?	匹配0次或1次
	匹配n次
	匹配n次或更多
	匹配n到m次
[]	字符类
()	分组
\|	或

1.3 预定义字符类 #

字符类	说明
\d	数字 [0-9]
\D	非数字
\w	单词字符 [a-zA-Z0-9_]
\W	非单词字符
\s	空白字符
\S	非空白字符

二、preg_match() #

2.1 基本用法 #

php

<?php
$str = "Hello World";

if (preg_match('/World/', $str)) {
    echo "Found";
}

2.2 获取匹配结果 #

php

<?php
$str = "Hello World";

if (preg_match('/(\w+) (\w+)/', $str, $matches)) {
    print_r($matches);
}

2.3 使用标志 #

php

<?php
$str = "Hello World";

preg_match('/world/i', $str, $matches);
print_r($matches);

$str = "Hello\nWorld";
preg_match('/^World/m', $str, $matches, PREG_OFFSET_CAPTURE);
print_r($matches);

2.4 验证邮箱 #

php

<?php
function isValidEmail(string $email): bool
{
    return preg_match('/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/', $email);
}

var_dump(isValidEmail("user@example.com"));
var_dump(isValidEmail("invalid-email"));

2.5 验证手机号 #

php

<?php
function isValidPhone(string $phone): bool
{
    return preg_match('/^1[3-9]\d{9}$/', $phone);
}

var_dump(isValidPhone("13812345678"));
var_dump(isValidPhone("12345678901"));

三、preg_match_all() #

3.1 基本用法 #

php

<?php
$str = "Hello World, Hello PHP";

preg_match_all('/Hello/', $str, $matches);
print_r($matches);

3.2 提取所有链接 #

php

<?php
$html = '<a href="https://example.com">Link1</a>
         <a href="https://test.com">Link2</a>';

preg_match_all('/<a href="([^"]+)">([^<]+)<\/a>/', $html, $matches);
print_r($matches);

3.3 提取所有数字 #

php

<?php
$str = "a1b2c3d4e5";

preg_match_all('/\d/', $str, $matches);
print_r($matches[0]);

preg_match_all('/\d+/', $str, $matches);
print_r($matches[0]);

四、preg_replace() #

4.1 基本用法 #

php

<?php
$str = "Hello World";

echo preg_replace('/World/', 'PHP', $str);

4.2 使用回调函数 #

php

<?php
$str = "The price is $100 and $200";

$result = preg_replace_callback('/\$(\d+)/', function($matches) {
    return '¥' . ($matches[1] * 7);
}, $str);

echo $result;

4.3 使用数组替换 #

php

<?php
$str = "Hello World, Hello PHP";

echo preg_replace(
    ['/World/', '/PHP/'],
    ['JavaScript', 'Python'],
    $str
);

4.4 限制替换次数 #

php

<?php
$str = "aaa bbb aaa bbb aaa";

echo preg_replace('/aaa/', 'XXX', $str, 2);

4.5 敏感词过滤 #

php

<?php
function filterWords(string $text, array $words): string
{
    $pattern = '/(' . implode('|', $words) . ')/i';
    return preg_replace($pattern, '***', $text);
}

$text = "This is a bad word and another bad word.";
$badWords = ['bad', 'word'];

echo filterWords($text, $badWords);

五、preg_split() #

5.1 基本用法 #

php

<?php
$str = "apple, banana, cherry";

$arr = preg_split('/,\s*/', $str);
print_r($arr);

5.2 分割单词 #

php

<?php
$str = "Hello World! This is PHP.";

$words = preg_split('/\s+/', $str);
print_r($words);

$words = preg_split('/[^\w]+/', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($words);

5.3 限制分割数量 #

php

<?php
$str = "a-b-c-d-e";

$arr = preg_split('/-/', $str, 3);
print_r($arr);

六、preg_grep() #

6.1 基本用法 #

php

<?php
$arr = ['apple', 'banana', 'cherry', 'date'];

$result = preg_grep('/^a/', $arr);
print_r($result);

6.2 过滤数字 #

php

<?php
$arr = ['abc', '123', 'def', '456'];

$numbers = preg_grep('/^\d+$/', $arr);
print_r($numbers);

$nonNumbers = preg_grep('/^\d+$/', $arr, PREG_GREP_INVERT);
print_r($nonNumbers);

七、常用正则模式 #

7.1 邮箱 #

php

<?php
$pattern = '/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/';

7.2 手机号（中国） #

php

<?php
$pattern = '/^1[3-9]\d{9}$/';

7.3 身份证号 #

php

<?php
$pattern = '/^\d{17}[\dXx]$/';

7.4 URL #

php

<?php
$pattern = '/^https?:\/\/[^\s]+$/';

7.5 IP地址 #

php

<?php
$pattern = '/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/';

7.6 日期 #

php

<?php
$pattern = '/^\d{4}-\d{2}-\d{2}$/';

7.7 用户名 #

php

<?php
$pattern = '/^[a-zA-Z][a-zA-Z0-9_]{2,15}$/';

7.8 密码强度 #

php

<?php
function checkPasswordStrength(string $password): array
{
    $result = [
        'length' => strlen($password) >= 8,
        'upper' => preg_match('/[A-Z]/', $password),
        'lower' => preg_match('/[a-z]/', $password),
        'number' => preg_match('/\d/', $password),
        'special' => preg_match('/[!@#$%^&*]/', $password)
    ];
    
    $result['score'] = array_sum($result);
    $result['strong'] = $result['score'] >= 4;
    
    return $result;
}

八、实际应用 #

8.1 提取HTML标签内容 #

php

<?php
function extractTagContent(string $html, string $tag): array
{
    $pattern = "/<$tag[^>]*>(.*?)<\/$tag>/is";
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

$html = '<p>First</p><p>Second</p>';
print_r(extractTagContent($html, 'p'));

8.2 清理HTML标签 #

php

<?php
function stripTags(string $html, array $allowed = []): string
{
    if (empty($allowed)) {
        return strip_tags($html);
    }
    
    $allowedStr = '<' . implode('><', $allowed) . '>';
    return strip_tags($html, $allowedStr);
}

8.3 生成URL Slug #

php

<?php
function slugify(string $text): string
{
    $text = preg_replace('/[^\p{L}\p{N}\s-]/u', '', $text);
    $text = preg_replace('/[\s-]+/', '-', $text);
    $text = trim($text, '-');
    return strtolower($text);
}

echo slugify("Hello World!");
echo slugify("PHP 是最好的语言");

8.4 高亮搜索关键词 #

php

<?php
function highlightKeywords(string $text, string $keyword): string
{
    return preg_replace(
        '/(' . preg_quote($keyword, '/') . ')/i',
        '<mark>$1</mark>',
        $text
    );
}

echo highlightKeywords("Hello World, hello PHP", "hello");

九、性能优化 #

9.1 使用非捕获分组 #

php

<?php
preg_match('/(abc)/', $str, $matches);
preg_match('/(?:abc)/', $str, $matches);

9.2 避免贪婪匹配 #

php

<?php
$html = '<div>content1</div><div>content2</div>';

preg_match_all('/<div>.*<\/div>/', $html, $matches);
preg_match_all('/<div>.*?<\/div>/', $html, $matches);

9.3 使用原子分组 #

php

<?php
preg_match('/(a+)+b/', 'aaaaaaab');
preg_match('/(?>a+)+b/', 'aaaaaaab');

十、总结 #

本章学习了：

正则表达式基础语法
preg_match() 匹配
preg_match_all() 全局匹配
preg_replace() 替换
preg_split() 分割
preg_grep() 过滤
常用正则模式
实际应用

下一章将学习PHP面向对象。