10 Most Useful Regex Patterns for Everyday Development


"Every time I need to validate an email, extract URLs, or match phone numbers, I have to write regex from scratch or search Stack Overflow for outdated answers. Is there a collection of common patterns I can just copy and use?"

This article collects 10 of the regex patterns that come up most often in everyday development, each with an explanation, typical use cases, and notes on its limits. Paste them into Suried Regex Tester to check how they behave on your own data before using them.

01 1. Email Address Validation

Pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This pattern matches standard email address formats. The username part allows letters, digits, dots, underscores, percent signs, plus signs, and hyphens. The domain part allows letters, digits, dots, and hyphens, with a top-level domain of at least 2 letters.

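Note: A perfect email regex doesn't exist (the RFC 5322 spec is extremely complex). This pattern covers the great majority of addresses seen in practice and is a reasonable practical balance, but it is deliberately loose: it accepts some invalid forms, such as consecutive dots (a..b@example.com), and rejects some valid ones, such as quoted local parts. To confirm that an address really exists, send a verification email.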
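As a rough illustration of the format check described above, here is a minimal Python sketch. The pattern is the one from this section; the names EMAIL_RE and is_plausible_email are illustrative, not part of any library.

```python
import re

# Pattern copied from the section above.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def is_plausible_email(address: str) -> bool:
    """Return True if the address passes the basic format check.

    fullmatch is used because, in Python, `$` alone would also accept
    a trailing newline at the end of the input.
    """
    return EMAIL_RE.fullmatch(address) is not None


samples = [
    "alice@example.com",
    "bob.smith+tag@mail.example.co",
    "no-at-sign.example.com",
    "carol@localhost",  # no dot-separated TLD, so the pattern rejects it
]
for sample in samples:
    print(f"{sample!r}: {is_plausible_email(sample)}")
```

A passing result only means the string looks like an address; whether the mailbox exists still has to be confirmed by a verification email.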

02 2. URL Matching

Pattern: https?:\/\/[\w.-]+(?:\.[a-zA-Z]{2,})(?:\/[\w./?#&=%-]*)*

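Matches URLs starting with http:// or https://. Supports subdomains, paths, query parameters, and anchors, which makes it well suited to extracting links from text. It does not handle port numbers (example.com:8080), and a query string or anchor is only picked up when it follows a path slash.

To also match URLs without a protocol (like www.example.com), change the start to (?:https?:\/\/)?(?:www\.)?. Because both prefixes are then optional, the pattern will also match bare dotted names such as file.txt, so expect more false positives.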
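The link-extraction use case can be sketched in Python as follows. The pattern is the http/https version from this section (the escaped slashes are written as plain / because Python does not need them escaped); the sample text and the name URL_RE are made up for illustration.

```python
import re

# Same pattern as above; "\/" and "/" are equivalent in Python.
URL_RE = re.compile(r"https?://[\w.-]+(?:\.[a-zA-Z]{2,})(?:/[\w./?#&=%-]*)*")

text = (
    "Docs at https://docs.example.com/guide/intro?lang=en#setup "
    "and http://example.org, see also ftp://files.example.com."
)

# The pattern has no capturing groups, so findall returns whole matches.
urls = URL_RE.findall(text)
print(urls)
# ['https://docs.example.com/guide/intro?lang=en#setup', 'http://example.org']
```

One thing to watch when extracting from prose: the path character class includes the dot, so a URL with a path that ends a sentence (for example http://example.org/a/b.) is returned with the trailing period attached. Stripping trailing punctuation afterwards is usually enough.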

03 3. Phone Number Matching

Chinese mobile pattern: ^1[3-9]\d{9}$

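Chinese mobile numbers start with 1, the second digit is 3–9, followed by 9 more digits (11 total). This concise pattern covers all current carrier number ranges. It checks only the shape of the number, so it may also accept prefixes that are not assigned to any carrier.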

International phone numbers are more complex: ^\+?[1-9]\d{1,14}$ (E.164 standard, up to 15 digits including country code). For formats with hyphens or spaces like +1-234-567-8900, adjust to ^\+?\d{1,3}[-.\s]?\d{1,4}[-.\s]?\d{1,4}[-.\s]?\d{1,9}$.
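The three phone patterns above can be tried side by side with a short Python sketch. The constant names are illustrative. The re.ASCII flag is an addition not mentioned in the article: in Python, \d otherwise also matches non-ASCII digits, which is rarely wanted for phone numbers.

```python
import re

# Patterns copied from the section above; re.ASCII limits \d to 0-9.
CN_MOBILE_RE = re.compile(r"^1[3-9]\d{9}$", re.ASCII)
E164_RE = re.compile(r"^\+?[1-9]\d{1,14}$", re.ASCII)
FORMATTED_RE = re.compile(
    r"^\+?\d{1,3}[-.\s]?\d{1,4}[-.\s]?\d{1,4}[-.\s]?\d{1,9}$", re.ASCII
)

checks = [
    (CN_MOBILE_RE, "13812345678"),     # 11 digits, second digit in 3-9
    (CN_MOBILE_RE, "12812345678"),     # second digit 2 is rejected
    (E164_RE, "+8613812345678"),       # country code plus subscriber number
    (E164_RE, "+1-234-567-8900"),      # separators are not allowed in E.164
    (FORMATTED_RE, "+1-234-567-8900"), # the looser pattern accepts separators
]
for pattern, number in checks:
    print(number, bool(pattern.fullmatch(number)))
```

The formatted pattern is intentionally permissive, so it is better suited to accepting user input than to proving that a number is dialable.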

04 4-5. Date Formats & IP Addresses

Date format YYYY-MM-DD: ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$. Months are restricted to 01-12, days to 01-31. Note: this pattern doesn't validate month-day logic (e.g., Feb 31 passes); complete validation requires additional code logic.
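The "additional code logic" for month-day validation can be as small as handing the matched parts to a real date type. The sketch below assumes one change to the pattern: the year is wrapped in a capture group so all three parts can be read back. The function name parse_iso_date is hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Pattern from above, with a capture group added around the year.
DATE_RE = re.compile(
    r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$", re.ASCII
)


def parse_iso_date(text: str) -> Optional[date]:
    """Return a date if text is a real YYYY-MM-DD calendar date, else None."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None  # wrong shape, e.g. "2023-2-1" or "2023-13-01"
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() raises ValueError for impossible dates such as Feb 31.
        return date(year, month, day)
    except ValueError:
        return None


print(parse_iso_date("2024-02-29"))  # 2024-02-29 (leap year)
print(parse_iso_date("2023-02-31"))  # None, although the regex alone accepts it
```

The regex filters out malformed input cheaply, and the date constructor handles month lengths and leap years.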

IPv4 address: ^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$. Each octet is limited to 0-255, which is more precise than a simple \d{1,3}\. and rejects invalid addresses like 999.999.999.999. Note that it still accepts octets written with leading zeros, such as 01 or 001.

IPv6 can also be matched with regex, but the pattern is extremely complex (handling compressed forms), so using a dedicated IP parsing library is usually recommended.
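To make the contrast concrete, the sketch below checks IPv4 with the regex above and leaves IPv6 to a parsing library. Python's standard ipaddress module is used here as one example of such a library; the helper name is_ip is illustrative.

```python
import ipaddress
import re

# IPv4 pattern copied from the section above.
IPV4_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$",
    re.ASCII,
)


def is_ip(text: str, version: int) -> bool:
    """Return True if text parses as an IP address of the given version (4 or 6)."""
    try:
        return ipaddress.ip_address(text).version == version
    except ValueError:
        return False


print(bool(IPV4_RE.fullmatch("192.168.1.254")))    # True
print(bool(IPV4_RE.fullmatch("999.999.999.999")))  # False
print(is_ip("2001:db8::1", 6))                     # True, compressed form
print(is_ip("2001:db8:::1", 6))                    # False, malformed
```

The library call handles compressed IPv6 forms without any pattern at all, which is the main reason to prefer it there.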

05 6-7. Chinese Characters & HTML Tags

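Chinese character matching: /\p{Script=Han}/u (recommended) or [\u4e00-\u9fa5] (the older, range-based form). The /.../u literal is JavaScript syntax, and the u flag is required for \p{...} to work. Unicode property matching is more accurate because it also covers the CJK extension blocks, which the fixed range misses. To also match Chinese punctuation, extend with [\u3000-\u303f\uff00-\uffef].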
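Python's built-in re module does not support \p{Script=Han}, so the sketch below uses the range-based form and the punctuation range from this section. It also shows the limitation mentioned above: a character from a CJK extension block falls outside the fixed range. The sample text and constant names are made up for illustration.

```python
import re

# Range-based patterns from the section above.
HAN_RE = re.compile(r"[\u4e00-\u9fa5]+")
CJK_PUNCT_RE = re.compile(r"[\u3000-\u303f\uff00-\uffef]")

text = "Hello，世界！regex 正则表达式 101"

runs = HAN_RE.findall(text)
print(runs)                        # ['世界', '正则表达式']
print(CJK_PUNCT_RE.findall(text))  # ['，', '！']

# U+20000 is in CJK Extension B, so the fixed range does not match it.
print(HAN_RE.search("\U00020000"))  # None
```

If extension-block coverage matters in Python, a regex engine with Unicode script support is needed instead of the fixed range.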

HTML tag matching: <\/?[a-zA-Z][a-zA-Z0-9]*(?:\s[^>]*)?\/?>. Matches opening, closing, and self-closing tags. But note that parsing HTML with regex is not a good idea: HTML has nesting, comments, CDATA, and other structures that a regular expression cannot track, and this pattern breaks as soon as an attribute value contains a > character. Regex is only suitable for simple tag extraction or cleanup; real HTML parsing should use a DOM parser.

A famous Stack Overflow answer makes the same point: don't parse HTML with regex. Still, for quick search-and-replace or log analysis in non-critical scenarios, a tag-matching regex is often the most convenient tool.
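A quick cleanup of the kind described above might look like this in Python. The helper name strip_tags is hypothetical, and the second call shows the sort of input where the regex approach gives a wrong result.

```python
import re

# Pattern from above; "\/" is written as "/" because Python needs no escape.
TAG_RE = re.compile(r"</?[a-zA-Z][a-zA-Z0-9]*(?:\s[^>]*)?/?>")


def strip_tags(fragment: str) -> str:
    """Remove anything that looks like an HTML tag from a small fragment."""
    return TAG_RE.sub("", fragment)


print(strip_tags('<p class="note">Hello <br/>world <b>again</b></p>'))
# Hello world again

# A ">" inside an attribute value ends the match too early,
# leaving part of the tag behind.
print(repr(strip_tags('<a title="x > y">link</a>')))
# ' y">link'
```

For one-off log cleanup the first behavior is usually good enough; the second is why anything user-facing should go through a real parser.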

06 8-10. File Extensions, Password Strength & Whitespace Cleanup

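File extension extraction: \.([a-zA-Z0-9]+)$. Extracts the extension from the end of a filename; capture group 1 is the extension without the dot. For multi-part names such as archive.tar.gz it captures only the last part (gz). To match specific extensions only, use \.(jpg|jpeg|png|gif|webp)$, which covers common image formats; add the case-insensitive flag if names like photo.JPG should match too.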
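Both extension patterns can be exercised with a few lines of Python. The names below are illustrative, and the re.IGNORECASE flag on the image pattern is an assumption added here so that upper-case extensions are accepted.

```python
import re
from typing import Optional

EXT_RE = re.compile(r"\.([a-zA-Z0-9]+)$")
# IGNORECASE is an addition; without it "photo.JPG" would not match.
IMAGE_RE = re.compile(r"\.(jpg|jpeg|png|gif|webp)$", re.IGNORECASE)


def extension_of(filename: str) -> Optional[str]:
    """Return the text after the last dot, or None if there is no extension."""
    match = EXT_RE.search(filename)
    return match.group(1) if match else None


print(extension_of("report.pdf"))      # pdf
print(extension_of("archive.tar.gz"))  # gz
print(extension_of("README"))          # None
print(extension_of(".gitignore"))      # gitignore (dotfiles look like extensions)
print(bool(IMAGE_RE.search("photo.JPG")))  # True
```

The dotfile case is worth knowing about: the pattern has no way to tell a hidden file's name from an extension.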

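Password strength validation: ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$. Uses four lookahead assertions to require a lowercase letter, an uppercase letter, a digit, and a special character, with a minimum length of 8. Each assertion checks one condition independently from the start of the string, which is an elegant way to express "must satisfy all conditions simultaneously". Note that the final character class also restricts which characters are allowed at all: a password containing anything outside letters, digits, and @$!%*?& (a space or #, for example) is rejected.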
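The effect of each lookahead, and of the final character class, is easier to see on concrete inputs. This is a small Python sketch using the pattern above; PASSWORD_RE is an illustrative name and the sample passwords are made up.

```python
import re

# Pattern copied from the section above; re.ASCII limits \d to 0-9.
PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$",
    re.ASCII,
)

candidates = [
    "Passw0rd!",   # meets every condition
    "password1!",  # no uppercase letter
    "Pw0rd!",      # shorter than 8 characters
    "Passw0rd#",   # "#" is not in the special-character set
    "Passw0rd!#",  # has "!", but "#" is outside the allowed characters
]
for candidate in candidates:
    print(candidate, bool(PASSWORD_RE.fullmatch(candidate)))
```

Only the first candidate passes. The last one is the surprising case: all four lookaheads succeed, yet the match fails because the closing character class does not allow #.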

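Whitespace cleanup uses three patterns. Trim leading/trailing whitespace: ^\s+|\s+$ (replace with an empty string, with global replacement enabled so both ends are handled). Collapse multiple spaces: \s{2,} (replace with a single space); because \s also matches tabs and line breaks, use [ \t]{2,} if line breaks must survive. Remove blank lines: ^\s*$\n (replace with an empty string, with the multiline flag enabled so that ^ and $ match at line boundaries). All three come up constantly in text processing and data cleaning.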
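The three cleanup patterns map directly onto re.sub calls in Python, where replacement is global by default. The function names below are illustrative; the blank-line helper needs the re.MULTILINE flag so that ^ and $ match at every line boundary.

```python
import re


def trim(text: str) -> str:
    """Remove leading and trailing whitespace."""
    return re.sub(r"^\s+|\s+$", "", text)


def collapse_spaces(text: str) -> str:
    """Replace each run of two or more whitespace characters with one space."""
    return re.sub(r"\s{2,}", " ", text)


def drop_blank_lines(text: str) -> str:
    """Delete lines that are empty or contain only whitespace."""
    return re.sub(r"^\s*$\n", "", text, flags=re.MULTILINE)


print(repr(trim("  hello world \n")))              # 'hello world'
print(repr(collapse_spaces("a  b \t c")))          # 'a b c'
print(repr(collapse_spaces("line1\n\nline2")))     # 'line1 line2'
print(repr(drop_blank_lines("a\n\n  \nb\n")))      # 'a\nb\n'
```

The third print shows that \s{2,} also swallows consecutive line breaks, which may or may not be what the data needs.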

FAQ

Can I use these regex patterns directly in my projects?

Yes, but we recommend testing with your actual data in Suried Regex Tester first to ensure matches meet expectations. Different projects may have data format variations requiring pattern adjustments. These are validated starting points, not universal final solutions.

Why can't email regex be 100% accurate?

Because RFC 5322 allows many unexpected email formats like "quoted string"@example.com and postmaster@[123.123.123.123]. A fully RFC-compliant regex would be thousands of characters long and completely impractical. In real projects, the best practice is a concise regex for basic format checking, followed by a verification email.

Is the password strength regex sufficient for security validation?

Regex can only validate password "format complexity" (whether it contains various character types), not whether the password is actually secure. "Password1!" meets all format requirements but is extremely weak. Real password security requires checking common password databases (like HaveIBeenPwned) and implementing minimum entropy requirements.

How to match both Chinese and English in regex?

Use [\p{Script=Han}a-zA-Z] with the u flag to match both Chinese and English characters. To include digits: [\p{Script=Han}\w] (note that \w also matches the underscore). For complete mixed Chinese-English text paragraphs: [\p{Script=Han}\w\s.,!?]+.

Are 127.0.0.1 and 0.0.0.0 valid when matching IP addresses with regex?

Structurally, they are valid IPv4 addresses and will be matched by the regex. But in real networking, 127.0.0.1 is the loopback address (localhost), and 0.0.0.0 is the unspecified address, commonly used as a wildcard meaning "all interfaces". If you need to exclude these special addresses, filter them via code logic after the regex match.
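One way to do that filtering is sketched below with Python's standard ipaddress module, which exposes flags for these special addresses. The function name is_usable_host is hypothetical, and which categories to exclude is an assumption that depends on the application.

```python
import ipaddress


def is_usable_host(text: str) -> bool:
    """Return True for a valid IPv4 address that is neither loopback nor unspecified."""
    try:
        address = ipaddress.IPv4Address(text)
    except ValueError:
        return False  # not a valid IPv4 address at all
    return not (address.is_loopback or address.is_unspecified)


for candidate in ["8.8.8.8", "127.0.0.1", "0.0.0.0", "999.1.1.1"]:
    print(candidate, is_usable_host(candidate))
```

The same object also offers flags such as is_private and is_multicast if other ranges need to be excluded.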

TOOLS.SURIED.COM