Regular expressions look intimidating — a jumble of dots, slashes, and brackets that seems designed to confuse. But once you understand a handful of building blocks, regex becomes one of the most powerful tools in your programming toolkit. Let's break it down.
The Basic Syntax Table
Before diving into patterns, memorize these fundamentals:
| Symbol | Meaning | Example | Matches |
|---|---|---|---|
| `.` | Any single character (except a newline, by default) | `a.c` | abc, a1c, a-c |
| `*` | Zero or more of previous | `ab*c` | ac, abc, abbc |
| `+` | One or more of previous | `ab+c` | abc, abbc (not ac) |
| `?` | Zero or one of previous | `colou?r` | color, colour |
| `\d` | Any digit (0-9) | `\d{3}` | 123, 456 |
| `\w` | Word character (a-z, A-Z, 0-9, _) | `\w+` | hello, user_1 |
| `\s` | Whitespace (space, tab, newline) | `a\sb` | a b |
| `^` | Start of string | `^Hello` | Hello world |
| `$` | End of string | `world$` | Hello world |
| `[]` | Character set | `[aeiou]` | any vowel |
| `()` | Capture group | `(ab)+` | ab, abab |
Master these 11 symbols and you can read most of the regex patterns you'll encounter in the wild.
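To try the table's examples yourself, here is a minimal sketch in Python. The choice of Python's built-in `re` module is an assumption of this example, and the names `TABLE_EXAMPLES` and `check_examples` are made up for illustration. The two anchor rows (`^Hello`, `world$`) are left out because they match only part of a string.

```python
import re

# (pattern, sample strings that should match in full), copied from the table.
TABLE_EXAMPLES = [
    (r"a.c", ["abc", "a1c", "a-c"]),
    (r"ab*c", ["ac", "abc", "abbc"]),
    (r"ab+c", ["abc", "abbc"]),
    (r"colou?r", ["color", "colour"]),
    (r"\d{3}", ["123", "456"]),
    (r"\w+", ["hello", "user_1"]),
    (r"a\sb", ["a b"]),
    (r"[aeiou]", ["a", "e"]),
    (r"(ab)+", ["ab", "abab"]),
]


def check_examples(examples):
    """Return True if every sample string fully matches its pattern."""
    return all(
        re.fullmatch(pattern, sample) is not None
        for pattern, samples in examples
        for sample in samples
    )
```

Calling `check_examples(TABLE_EXAMPLES)` should return `True`, while `re.fullmatch(r"ab+c", "ac")` returns `None` because `+` needs at least one `b`.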
10 Practical Patterns
1. Email Validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
- `[a-zA-Z0-9._%+-]+`: one or more valid characters before the @
- `@`: literal @ symbol
- `[a-zA-Z0-9.-]+`: domain name
- `\.[a-zA-Z]{2,}`: dot followed by 2+ letter TLD
Matches: user@example.com, first.last@mail.example.org
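As a rough sketch of how this pattern might be wired into code (Python assumed; `EMAIL_RE` and `is_valid_email` are hypothetical names, and the addresses are placeholders on the reserved `example` domains), note that it is a deliberate simplification and not a full implementation of the email address grammar:

```python
import re

# The simplified email pattern from this section, compiled once for reuse.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def is_valid_email(address: str) -> bool:
    """Return True if the address fits the simplified pattern."""
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return EMAIL_RE.fullmatch(address) is not None
```

The pattern is permissive in places: `is_valid_email("a..b@example.com")` is `True` even though consecutive dots are unusual, so treat it as a first-pass filter and confirm real addresses by sending a message.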
2. Phone Number (Korean Format)
^01[016789]-?\d{3,4}-?\d{4}$
- `01[016789]`: Korean mobile prefixes (010, 011, 016, etc.)
- `-?`: optional hyphen
- `\d{3,4}`: 3 or 4 digits for the middle group
Matches: 010-1234-5678, 01012345678, 011-123-4567
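Because the hyphens are optional, the same number can arrive in several shapes. Here is a small sketch, assuming Python, that validates a number and returns a digits-only form for storage. The helper name `normalize_mobile` is invented for this example.

```python
import re
from typing import Optional

KOREAN_MOBILE_RE = re.compile(r"^01[016789]-?\d{3,4}-?\d{4}$")


def normalize_mobile(number: str) -> Optional[str]:
    """Return the digits-only form of a matching number, or None."""
    if KOREAN_MOBILE_RE.fullmatch(number) is None:
        return None
    # Hyphens are optional in the pattern, so strip them for a canonical form.
    return number.replace("-", "")
```

With this, `normalize_mobile("010-1234-5678")` and `normalize_mobile("01012345678")` both give `"01012345678"`, and a prefix outside the set, such as `012`, gives `None`.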
3. Password Strength (Min 8 chars, upper + lower + digit + special)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$
- `(?=.*[a-z])`: lookahead, must contain a lowercase letter
- `(?=.*[A-Z])`: must contain an uppercase letter
- `(?=.*\d)`: must contain a digit
- `(?=.*[!@#$%^&*])`: must contain a special character
- `.{8,}`: at least 8 characters total
Matches: MyP@ss1word, Str0ng!Pass
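A single pattern only answers yes or no. If you want to tell the user which rule failed, one option is to check each condition separately. The sketch below assumes Python, and `RULES` and `missing_rules` are illustrative names. It applies roughly the same rules as the combined pattern.

```python
import re
from typing import List

# The combined pattern from this section.
PASSWORD_RE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$")

# The same conditions split apart, one small pattern per rule.
RULES = {
    "lowercase letter": r"[a-z]",
    "uppercase letter": r"[A-Z]",
    "digit": r"\d",
    "special character": r"[!@#$%^&*]",
    "at least 8 characters": r".{8,}",
}


def missing_rules(password: str) -> List[str]:
    """Return the names of the rules the password does not satisfy."""
    return [
        name for name, pattern in RULES.items()
        if re.search(pattern, password) is None
    ]
```

For example, `missing_rules("password")` reports the uppercase, digit, and special-character rules, while `missing_rules("MyP@ss1word")` returns an empty list.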
4. URL
^https?:\/\/([\w.-]+)(:\d+)?(\/[\w./?&=%+-]*)?$
- `https?`: http or https
- `[\w.-]+`: domain name
- `(:\d+)?`: optional port number
- `(\/[\w./?&=%+-]*)?`: optional path and query string
Matches: https://example.com, http://localhost:3000/api/users?id=1
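The three capture groups make it easy to pull a URL apart. Below is a minimal sketch, assuming Python, where `/` needs no escaping so the `\/` sequences are written as plain `/`. The function name `split_url` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Same pattern as above; Python does not require "/" to be escaped.
URL_RE = re.compile(r"^https?://([\w.-]+)(:\d+)?(/[\w./?&=%+-]*)?$")


def split_url(url: str) -> Optional[Tuple[str, Optional[int], str]]:
    """Return (host, port, path) for a matching URL, or None."""
    match = URL_RE.fullmatch(url)
    if match is None:
        return None
    host, port, path = match.groups()
    port_number = int(port[1:]) if port else None  # drop the leading ":"
    return host, port_number, path or ""
```

`split_url("http://localhost:3000/api/users?id=1")` yields `("localhost", 3000, "/api/users?id=1")`. For production parsing, Python's standard `urllib.parse.urlsplit` handles many cases this pattern does not, such as fragments and credentials.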
5. HTML Tag Extraction
<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>
- `([a-z][a-z0-9]*)`: captures the tag name
- `[^>]*`: any attributes inside the tag
- `(.*?)`: non-greedy match of inner content
- `<\/\1>`: closing tag matching the captured name
Matches: <div>content</div>, <p class="text">hello</p>
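Here is a short sketch of the backreference at work, assuming Python (`extract_tags` is an illustrative name). It returns each tag name with its inner content.

```python
import re

# \1 refers back to whatever the first group (the tag name) captured.
TAG_RE = re.compile(r"<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>")


def extract_tags(html: str):
    """Return a list of (tag_name, inner_content) pairs."""
    return [(m.group(1), m.group(2)) for m in TAG_RE.finditer(html)]
```

On `'<div>content</div> <p class="text">hello</p>'` this gives `[("div", "content"), ("p", "hello")]`. The pattern does not understand nesting: for `<div><div>a</div></div>` the lazy group stops at the first closing tag and returns `("div", "<div>a")`. For anything beyond simple, flat snippets, a real parser such as Python's `html.parser` module is the safer choice.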
6. Korean Characters Only
^[가-힣]+$
- `[가-힣]`: the full range of composed Korean syllables in Unicode
Matches: 안녕하세요, 프로그래밍
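A quick sketch in Python (an assumed language; `is_hangul_only` is a made-up helper name) shows what the range does and does not cover. The range runs from U+AC00 (가) to U+D7A3 (힣), so standalone consonants and vowels such as ㅋ fall outside it.

```python
import re

# 가 is U+AC00 and 힣 is U+D7A3: the block of precomposed Hangul syllables.
HANGUL_RE = re.compile(r"^[가-힣]+$")


def is_hangul_only(text: str) -> bool:
    """Return True if the text consists solely of composed Hangul syllables."""
    return HANGUL_RE.fullmatch(text) is not None
```

`is_hangul_only("안녕하세요")` is `True`, while `is_hangul_only("안녕 hello")` and `is_hangul_only("ㅋㅋㅋ")` are both `False`. If your input may contain spaces or lone jamo, the character set needs to be widened accordingly.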
7. IP Address (IPv4)
^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$
- Each octet: 0–255
- Separated by dots
- Exactly 4 groups
Matches: 192.168.0.1, 10.0.0.255, 0.0.0.0
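The pattern repeats the same octet expression twice, which makes it hard to read. One way to keep it manageable, sketched here in Python with illustrative names (`OCTET`, `IPV4_RE`, `is_ipv4`), is to build it from a named piece:

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
OCTET = r"(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)"

# Three "octet + dot" groups followed by a final octet.
# The doubled braces produce a literal {3} inside the f-string.
IPV4_RE = re.compile(rf"^({OCTET}\.){{3}}{OCTET}$")


def is_ipv4(text: str) -> bool:
    """Return True if the text is a dotted-quad IPv4 address."""
    return IPV4_RE.fullmatch(text) is not None
```

The assembled pattern is character-for-character the one shown above. It rejects `256.1.1.1`, `1.2.3`, and octets with leading zeros such as `01.2.3.4`. If you only need validation in Python, the standard `ipaddress` module is an alternative to a hand-written regex.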
8. Date Format (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
- `\d{4}`: 4-digit year
- `(0[1-9]|1[0-2])`: month 01–12
- `(0[1-9]|[12]\d|3[01])`: day 01–31
Matches: 2026-03-22, 1999-12-31
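The pattern checks the shape of a date, not whether the day exists: `2026-02-31` passes. A common approach is to let the regex filter the format and a date library confirm the calendar. Here is a minimal sketch, assuming Python. `parse_date` is a hypothetical helper, and a capture group has been added around the year so all three parts can be read out.

```python
import re
from datetime import date
from typing import Optional

# Same pattern as above, with the year wrapped in a group for extraction.
DATE_RE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")


def parse_date(text: str) -> Optional[date]:
    """Return a date for well-formed, real calendar dates; otherwise None."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)  # raises ValueError for e.g. Feb 31
    except ValueError:
        return None
```

With this, `parse_date("2026-03-22")` returns a `date` object, while `parse_date("2026-02-31")` returns `None` even though the regex alone accepts it.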
9. Currency Format (with commas)
^\$?\d{1,3}(,\d{3})*(\.\d{2})?$
- `\$?`: optional dollar sign
- `\d{1,3}`: 1 to 3 leading digits
- `(,\d{3})*`: comma-separated groups of 3 digits
- `(\.\d{2})?`: optional cents
Matches: $1,234,567.89, 100,000, 99.99
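Once a string passes the pattern, it is safe to strip the formatting and convert it to a number. The sketch below assumes Python and uses `Decimal` to avoid floating-point rounding on money. `parse_amount` is an invented name.

```python
import re
from decimal import Decimal
from typing import Optional

CURRENCY_RE = re.compile(r"^\$?\d{1,3}(,\d{3})*(\.\d{2})?$")


def parse_amount(text: str) -> Optional[Decimal]:
    """Return the numeric value of a well-formed amount, or None."""
    if CURRENCY_RE.fullmatch(text) is None:
        return None
    # Remove the optional dollar sign and the thousands separators.
    return Decimal(text.lstrip("$").replace(",", ""))
```

One consequence of the pattern worth knowing: because the first group allows at most three digits, an ungrouped value such as `1234` is rejected. Only `1,234` is accepted.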
10. Duplicate Whitespace Removal
\s{2,}
Replace with a single space to clean up messy text input.
- `\s{2,}`: two or more consecutive whitespace characters
Use case: Cleaning user-submitted form data, normalizing log files
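In code, this is a one-line substitution. Here is a minimal Python sketch (`collapse_whitespace` is an illustrative name):

```python
import re

MULTI_SPACE_RE = re.compile(r"\s{2,}")


def collapse_whitespace(text: str) -> str:
    """Replace every run of two or more whitespace characters with one space."""
    return MULTI_SPACE_RE.sub(" ", text)
```

`collapse_whitespace("hello    world")` returns `"hello world"`. Note that a single tab or newline is left untouched because the pattern requires at least two whitespace characters. If you want every whitespace run normalized to a space, `\s+` is the pattern to use instead.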
3 Common Mistakes to Avoid
Mistake 1: Greedy Matching
❌ <.*> → matches "<div>hello</div>" as ONE match
✅ <.*?> → matches "<div>" and "</div>" separately
By default, * and + are greedy — they grab as much text as possible. Add ? to make them lazy (match as little as possible).
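You can see the difference directly with `re.findall` in Python (assumed here for illustration). The third variant, a negated character set, is a common alternative to the lazy quantifier:

```python
import re

html = "<div>hello</div>"

# Greedy: .* runs to the last ">" in the string.
greedy = re.findall(r"<.*>", html)

# Lazy: .*? stops at the first ">" it can.
lazy = re.findall(r"<.*?>", html)

# Negated set: [^>]* can never run past a ">".
negated = re.findall(r"<[^>]*>", html)
```

Here `greedy` is `["<div>hello</div>"]`, while `lazy` and `negated` are both `["<div>", "</div>"]`.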
Mistake 2: Forgetting to Escape Special Characters
❌ price: $10.99 → $ means "end of string", . means "any character"
✅ price: \$10\.99 → matches the literal text
Characters like . $ ^ * + ? ( ) [ { | \ all need a backslash when you want to match them literally.
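If the literal text comes from a variable or user input, escaping by hand is error-prone. In Python (assumed for this sketch), `re.escape` adds the backslashes for you:

```python
import re

text = "price: $10.99"

# Unescaped: "$" asserts end-of-string mid-pattern, so this never matches.
unescaped = re.search(r"price: $10.99", text)

# Escaped by hand, as in the example above.
escaped = re.search(r"price: \$10\.99", text)

# Escaped automatically from the literal string.
automatic = re.search(re.escape("price: $10.99"), text)

# An unescaped dot silently accepts any character in its place.
false_positive = re.search(r"10.99", "10x99")
```

`unescaped` is `None`, `escaped` and `automatic` both match the full text, and `false_positive` is a match object, which is exactly the kind of bug an unescaped `.` produces.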
Mistake 3: Missing Anchors
❌ \d{3} → matches "123" inside "abc12345xyz"
✅ ^\d{3}$ → only matches strings that are exactly 3 digits
Without ^ and $ anchors, your pattern will match substrings within larger text, leading to false positives.
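The sketch below, assuming Python, contrasts the unanchored and anchored patterns. It also shows one Python-specific detail: `$` will match just before a trailing newline, so `re.fullmatch` is the stricter choice when validating whole strings.

```python
import re

text = "abc12345xyz"

# Unanchored: finds the first three digits anywhere in the text.
unanchored = re.search(r"\d{3}", text)

# Anchored: the whole string must be exactly three digits.
anchored = re.search(r"^\d{3}$", text)

# fullmatch applies the "whole string" rule without explicit anchors.
exact = re.fullmatch(r"\d{3}", "123")

# "$" tolerates one trailing newline; fullmatch does not.
trailing_newline = re.search(r"^\d{3}$", "123\n")
strict = re.fullmatch(r"\d{3}", "123\n")
```

`unanchored.group()` is `"123"` and `anchored` is `None`. `trailing_newline` is a match while `strict` is `None`.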
Quick Reference Cheat Sheet
| Task | Pattern |
|---|---|
| Only digits | `^\d+$` |
| Only letters | `^[a-zA-Z]+$` |
| No special chars | `^[\w\s]+$` |
| Starts with uppercase | `^[A-Z]` |
| Contains a word | `\bword\b` |
| Either/or | `cat\|dog` |
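To experiment with the cheat sheet, you could load it into a small lookup and see which entries a given string satisfies. This is only a sketch, assuming Python. `CHEAT_SHEET` and `matching_tasks` are made-up names, and `word` stands in for whatever word you are searching for.

```python
import re
from typing import List

CHEAT_SHEET = {
    "only digits": r"^\d+$",
    "only letters": r"^[a-zA-Z]+$",
    "no special chars": r"^[\w\s]+$",
    "starts with uppercase": r"^[A-Z]",
    "contains a word": r"\bword\b",
    "either/or": r"cat|dog",
}


def matching_tasks(text: str) -> List[str]:
    """Return the cheat-sheet tasks whose pattern is found in the text."""
    return [
        task for task, pattern in CHEAT_SHEET.items()
        if re.search(pattern, text)
    ]
```

For instance, `matching_tasks("password")` returns `["only letters", "no special chars"]` but not "contains a word", because `\b` requires a word boundary on both sides of `word`.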
Conclusion
Regular expressions are a skill that compounds over time. You don't need to memorize every pattern — just understand the building blocks and know where to look things up. The 10 patterns above cover the most common real-world scenarios you'll encounter.
Try the Regex Tester to experiment with these patterns in real time and build your regex muscle memory!
