Regex Tester

Test regular expressions with live matching

Understanding Regular Expressions
TL;DR

Regular expressions (regex) are patterns that match text. They are used for input validation, search and replace, log parsing, and many similar tasks. The syntax is dense but learnable.

What is Regex?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Regex patterns describe what text looks like rather than what it means — they match character sequences, not semantics.

Regular expressions are among the most widely available tools in programming. They are supported in most programming languages (JavaScript, Python, Java, Go, Rust), in text editors (VS Code, Vim, Sublime), in command-line tools (grep, sed, awk), and in most databases (PostgreSQL, MySQL). Learning regex is a force multiplier: the core syntax carries over almost everywhere, although dialects differ in the details. For example, Go's regexp package and Rust's regex crate do not support lookarounds or backreferences.

The trade-off is readability. Regex patterns are dense and can look cryptic at first glance. A pattern like ^(?=.*[A-Z])(?=.*\d)[A-Za-z\d@$!%*?&]{8,}$ is perfectly logical once you know the syntax, but impenetrable if you don’t. The key is learning the building blocks one at a time.
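One way to make a dense pattern easier to read is to spread it over several lines with comments. The sketch below does this for the password pattern above using Python's re.VERBOSE flag. The names PASSWORD and is_valid_password are made up for illustration.

```python
import re

# The same pattern, split into commented parts. In VERBOSE mode, whitespace
# and "#" comments outside character classes are ignored by the engine.
PASSWORD = re.compile(
    r"""
    ^                       # start of string
    (?=.*[A-Z])             # lookahead: an uppercase letter appears somewhere
    (?=.*\d)                # lookahead: a digit appears somewhere
    [A-Za-z\d@$!%*?&]{8,}   # 8 or more characters from the allowed set
    $                       # end of string
    """,
    re.VERBOSE,
)


def is_valid_password(text: str) -> bool:
    """Return True if text satisfies every part of the pattern (hypothetical helper)."""
    return PASSWORD.search(text) is not None


print(is_valid_password("Passw0rd!"))   # True
print(is_valid_password("password1"))   # False: no uppercase letter
print(is_valid_password("Short1A"))     # False: only 7 characters
```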

Core Concepts

Character Classes

A character class matches any single character from a defined set:

  • [abc] — Matches a, b, or c
  • [a-z] — Matches any lowercase letter
  • [^abc] — Matches any character except a, b, or c
  • . — Matches any character except newline
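The four forms above can be tried with Python's re.findall, which returns every non-overlapping match. This is a minimal sketch; the sample strings are arbitrary.

```python
import re

text = "cab-xyz 42"

in_set = re.findall(r"[abc]", text)        # characters that are a, b, or c
lowercase = re.findall(r"[a-z]", text)     # any lowercase ASCII letter
not_in_set = re.findall(r"[^abc]", text)   # any character except a, b, or c
dot_matches = re.findall(r".", "a\nb")     # the dot skips the newline by default

print(in_set)        # ['c', 'a', 'b']
print(lowercase)     # ['c', 'a', 'b', 'x', 'y', 'z']
print(not_in_set)    # ['-', 'x', 'y', 'z', ' ', '4', '2']
print(dot_matches)   # ['a', 'b']
```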

Shorthand Classes

Shorthand   Equivalent        Meaning
\d          [0-9]             Any digit
\D          [^0-9]            Any non-digit
\w          [a-zA-Z0-9_]      Any word character
\W          [^a-zA-Z0-9_]     Any non-word character
\s          [ \t\n\r\f\v]     Any whitespace
\S          [^ \t\n\r\f\v]    Any non-whitespace
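A short sketch of the shorthand classes in Python. One assumption to note: Python's shorthands are Unicode-aware by default for text strings, so the re.ASCII flag is passed here to get exactly the equivalents listed in the table.

```python
import re

sample = "id_42 = 3.14\tok"

# re.ASCII restricts \d, \w and \s to the ASCII sets shown in the table.
digits = re.findall(r"\d", sample, flags=re.ASCII)
words = re.findall(r"\w+", sample, flags=re.ASCII)
spaces = re.findall(r"\s", sample, flags=re.ASCII)
non_space = re.findall(r"\S+", sample, flags=re.ASCII)

print(digits)      # ['4', '2', '3', '1', '4']
print(words)       # ['id_42', '3', '14', 'ok']
print(spaces)      # [' ', ' ', '\t']
print(non_space)   # ['id_42', '=', '3.14', 'ok']

# Under re.ASCII the shorthand and its bracket form find the same characters.
print(digits == re.findall(r"[0-9]", sample))   # True
```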

Quantifiers

Quantifiers specify how many times a pattern should repeat:

Quantifier  Meaning          Example
*           0 or more        a* matches "", "a", "aaa"
+           1 or more        a+ matches "a", "aaa" but not ""
?           0 or 1           a? matches "" or "a"
{n}         Exactly n        a{3} matches "aaa"
{n,m}       Between n and m  a{2,4} matches "aa", "aaa", "aaaa"
{n,}        n or more        a{2,} matches "aa", "aaa", …
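The quantifier examples can be checked with re.fullmatch, which succeeds only if the whole string fits the pattern. This is a sketch; matches_whole is a made-up helper name.

```python
import re


def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the entire text matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


print(matches_whole(r"a*", ""))           # True: zero repetitions are allowed
print(matches_whole(r"a+", ""))           # False: at least one "a" is required
print(matches_whole(r"a?", "aa"))         # False: at most one "a"
print(matches_whole(r"a{3}", "aaa"))      # True
print(matches_whole(r"a{2,4}", "aaaaa"))  # False: five is more than four
print(matches_whole(r"a{2,}", "aaaaaaa")) # True

# Quantifiers are greedy: given the choice, they take as much as they can.
greedy = re.search(r"a{2,4}", "aaaaaa").group()
print(greedy)                             # aaaa
```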

Anchors

Anchors match positions, not characters:

  • ^ — Start of string (or line in multiline mode)
  • $ — End of string (or line in multiline mode)
  • \b — Word boundary (between \w and \W)
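The effect of multiline mode and word boundaries is easiest to see side by side. A minimal Python sketch follows; the sample log text is invented for illustration.

```python
import re

log = "error: disk full\nwarning: low memory\nerror: timeout"

# Without MULTILINE, ^ and $ only anchor at the ends of the whole string.
first_only = re.findall(r"^error", log)
last_word = re.findall(r"\w+$", log)

# With MULTILINE, ^ and $ also anchor at the start and end of each line.
every_line = re.findall(r"^error", log, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", log, flags=re.MULTILINE)

# \b keeps "cat" from matching inside "concat" or "cats".
whole_word = re.findall(r"\bcat\b", "cat concat cats cat.")

print(first_only)   # ['error']
print(last_word)    # ['timeout']
print(every_line)   # ['error', 'error']
print(line_ends)    # ['full', 'memory', 'timeout']
print(whole_word)   # ['cat', 'cat']
```

Note that \b also matches at the very start or end of the string when the adjacent character is a word character, which is why the first "cat" is found.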

Groups and Alternation

  • (abc) — Capturing group: matches abc and captures it for backreferences
  • (?:abc) — Non-capturing group: matches abc without capturing
  • a|b — Alternation: matches a or b
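The difference between the group types shows up in what a match object reports. A short Python sketch with arbitrary sample strings:

```python
import re

# Capturing groups: each (...) becomes a numbered group on the match.
date_match = re.search(r"(\d{4})-(\d{2})", "released 2024-03")
year, month = date_match.groups()
print(year, month)                 # 2024 03

# Non-capturing group: (?:ab)+ repeats "ab" but is not reported as a group.
nc_match = re.search(r"(?:ab)+(c)", "xababc")
print(nc_match.group(0))           # ababc
print(nc_match.groups())           # ('c',)

# Alternation: either side of | may match.
animals = re.findall(r"cat|dog", "cat, dog, bird")
print(animals)                     # ['cat', 'dog']

# Backreference: \1 must repeat whatever group 1 captured.
doubled = re.search(r"\b(\w+) \1\b", "it was the the best").group(1)
print(doubled)                     # the
```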

Regex Cheat Sheet

Syntax  Name        Description
.       Dot         Any character (except newline)
\d      Digit       [0-9]
\w      Word        [a-zA-Z0-9_]
\s      Whitespace  Space, tab, newline
*       Star        0 or more (greedy)
+       Plus        1 or more (greedy)
?       Question    0 or 1 (optional)
{n,m}   Range       Between n and m repetitions
()      Group       Capture group
[]      Class       Character class
^       Caret       Start of string/line
$       Dollar      End of string/line
|       Pipe        Alternation (OR)
(?=)    Lookahead   Assert what follows
(?<=)   Lookbehind  Assert what precedes

Lookahead and Lookbehind

Lookahead and lookbehind are zero-width assertions — they check if a pattern exists before or after the current position without consuming characters.

Figure: Anatomy of an Email Regex. The pattern ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ decomposed into labeled segments:

  • ^ : Start of string
  • [a-zA-Z0-9._%+-]+ : Local part: letters, digits, dots, specials (1 or more)
  • @ : Literal @ symbol
  • [a-zA-Z0-9.-]+ : Domain: letters, digits, dots, hyphens (1 or more)
  • \. : Escaped literal dot
  • [a-zA-Z]{2,} : TLD: letters only (2 or more chars)
  • $ : End of string

Match: user.name+tag@sub.example.com (user.name+tag = local, sub.example = domain, com = TLD)
No match: user@@example..com (the double @ violates the pattern; the consecutive dots alone would still be accepted by the domain segment)

Positive lookahead (?=...) asserts that what follows matches a pattern. For example, \d+(?= USD) matches digits only if followed by USD — it matches 100 in “100 USD” but not in “100 EUR”.

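Negative lookahead (?!...) asserts that what follows does not match. \d+(?! USD) matches digits not followed by USD. Be aware of backtracking: in “100 USD” this pattern still matches 10, because 10 is followed by 0 rather than by USD. Adding a word boundary, as in \d+\b(?! USD), forces the whole number to be considered.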

Positive lookbehind (?<=...) asserts that what precedes matches. (?<=\$)\d+ matches digits preceded by $ — it matches 50 in “$50” but not in “50”.

Negative lookbehind (?<!...) asserts that what precedes does not match.

Because lookarounds are zero-width, the matched text never includes the lookaround pattern itself.
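The four lookaround forms can be compared on small strings. This is a Python sketch with arbitrary sample inputs; it also shows how backtracking interacts with a negative lookahead.

```python
import re

prices = "100 USD, 200 EUR"

# Positive lookahead: digits that are followed by " USD".
followed_by_usd = re.findall(r"\d+(?= USD)", prices)
print(followed_by_usd)    # ['100']

# Negative lookahead, naive form: the engine backtracks from "100" to "10",
# which is followed by "0" rather than " USD", so "10" is reported.
naive_not_usd = re.findall(r"\d+(?! USD)", prices)
print(naive_not_usd)      # ['10', '200']

# A word boundary forces the whole number to be matched before the check.
not_usd = re.findall(r"\d+\b(?! USD)", prices)
print(not_usd)            # ['200']

# Positive lookbehind: digits preceded by a literal "$".
after_dollar = re.findall(r"(?<=\$)\d+", "$50 and 50")
print(after_dollar)       # ['50']  (only the first 50)

# Negative lookbehind: whole numbers not preceded by "$".
no_dollar = re.findall(r"(?<!\$)\b\d+", "$50 and 75")
print(no_dollar)          # ['75']
```

In every case the result contains only the digits. The USD text and the $ sign are checked but not included in the match.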

Common Patterns

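Here are commonly used patterns for frequent validation tasks. They check format only, not validity: the date pattern accepts 2024-02-31, and the IPv4 pattern accepts octets with a leading zero such as 01.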

  • Email (basic): ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
  • URL: https?://[^\s/$.?#].[^\s]*
  • IPv4: ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
  • Date (YYYY-MM-DD): ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
  • Phone (international): ^\+?[1-9]\d{1,14}$
  • Hex color: ^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$
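A sketch of how these patterns might be wired into a small validator in Python. The PATTERNS dictionary and the matches helper are illustrative names, and re.fullmatch is an assumption made here so that a trailing newline cannot slip past the $ anchor.

```python
import re

# Patterns copied from the list above; the dictionary keys are made up.
PATTERNS = {
    "email": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
    "ipv4": r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$",
    "date": r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$",
    "phone": r"^\+?[1-9]\d{1,14}$",
    "hex_color": r"^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$",
}


def matches(name: str, text: str) -> bool:
    """Return True if text fits the named pattern (hypothetical helper)."""
    # re.ASCII keeps \d limited to 0-9, as the patterns intend.
    return re.fullmatch(PATTERNS[name], text, flags=re.ASCII) is not None


print(matches("ipv4", "192.168.1.1"))    # True
print(matches("ipv4", "256.1.1.1"))      # False: 256 is out of range
print(matches("date", "2024-13-01"))     # False: there is no month 13
print(matches("date", "2024-02-31"))     # True: format only, not a real date
print(matches("hex_color", "#ffff"))     # False: must be 3 or 6 hex digits
print(matches("phone", "+14155552671"))  # True
```

For dates and similar values, a regex is best treated as a first filter, with a real parser deciding whether the value is valid.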

Common Use Cases

  • Input validation: Checking that form fields contain valid emails, phone numbers, postal codes, or IDs before submission
  • Search and replace: Finding and transforming patterns across codebases — renaming variables, updating imports, reformatting data
  • Log parsing: Extracting timestamps, IP addresses, error codes, and stack traces from unstructured log files
  • Data extraction: Scraping structured data from semi-structured text like emails, PDFs, or HTML
  • URL routing: Web frameworks use regex patterns to map URLs to handler functions
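As an illustration of the log parsing use case, the sketch below pulls fields out of a log line with named groups. The log format (timestamp, IP, level, message separated by spaces) is an assumption invented for this example, as are the names LINE and parse_line.

```python
import re
from typing import Optional

# Hypothetical format: "YYYY-MM-DD HH:MM:SS <ip> <LEVEL> <message>"
LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the named fields of a log line, or None if it does not fit."""
    match = LINE.match(line)
    return match.groupdict() if match else None


record = parse_line("2024-05-01 12:30:45 10.0.0.7 ERROR disk full")
print(record["level"], record["message"])   # ERROR disk full
print(parse_line("not a log line"))         # None
```

Named groups keep the extraction code readable: fields are looked up by name rather than by position in the pattern.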

Try These Examples

Email Regex Match (Valid)

This pattern matches standard email addresses like user@example.com. It checks for a local part (letters, digits, and the characters . _ % + -), an @ symbol, a domain name, and a TLD of at least 2 letters.

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

Invalid Email (No Match)

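This string has a double @ and double dots, and the email regex rejects it because of the double @: neither character class in the pattern includes @, so exactly one @ can appear. The consecutive dots are not what the pattern catches. The domain class [a-zA-Z0-9.-]+ accepts them, so user@example..com would still match.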

user@@example..com
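To see which defect in the test string the pattern reacts to, each defect can be tried on its own. This is a small Python sketch; EMAIL and is_email are illustrative names.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def is_email(text: str) -> bool:
    """Return True if text fits the basic email pattern (hypothetical helper)."""
    return EMAIL.fullmatch(text) is not None


print(is_email("user@example.com"))     # True
print(is_email("user@@example..com"))   # False: both defects present
print(is_email("user@@example.com"))    # False: the double @ alone is rejected
print(is_email("user@example..com"))    # True: consecutive dots alone pass
```

If consecutive dots should be rejected as well, one option is to add a negative lookahead such as (?!.*\.\.) right after the ^ anchor.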