JavaScript RegExp Tutorial

A cheat sheet for the regex syntax in JavaScript.

MDN RegExp Guide

TL;DR

/^$/.test(''); // boolean
'ok'.replace(/(o)(k)/g, '$2$1');
// Other replacements:
// $$ (literal), $& (all), $` (before), $' (after), $<name>

const matchG = 'aaa'.match(/a/g);
// matchG --> ['a', 'a', 'a']

const matchNoG = 'str'.match(/(st)r/);
// matchNoG --> Object.assign(['str', 'st'], {groups: undefined, index: 0, input: 'str'})
// /(st)r/.exec('str') returns the same result

Common Usage

Define yourself a regex:

// Preferred way
const expressionLiteral = /\w+/;

// Only use this for dynamic patterns
const ctor = new RegExp('\\w+');
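
When the pattern comes from a variable, characters with a special meaning have to be escaped before they reach the constructor. A minimal sketch, assuming a hypothetical helper called escapeRegExp (the name and the sample input are made up for illustration):

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = '1+1=2';

// Unescaped, the + acts as a quantifier, so the literal text is not found.
const unescapedMatches = new RegExp(userInput).test('1+1=2'); // false

// Escaped, the pattern matches the literal text.
const dynamic = new RegExp(escapeRegExp(userInput), 'g');
const escapedMatches = dynamic.test('1+1=2'); // true
```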

RegExp.prototype.test

Returns true or false.

const didMatch = /a/.test('abc');

String.prototype.replace

Without the g flag, only the first match is replaced; the example below needs it because the input contains two double spaces.
ECMAScript 2021 added replaceAll: you can now replace all occurrences of a string without using a regex.

'too  many  spaces'.replace(/  /g, ' ');
'too  many  spaces'.replaceAll('  ', ' ');

Special patterns in the replacement string:

Replacement  Description
$$           A literal $
$`           Portion before the match
$'           Portion after the match
$&           The whole matched string
$n           With n < 100: the nth captured group (!! 1 indexed !!)
$<Name>      Named capturing group
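
Each pattern from the table can be tried in isolation. The inputs below are made-up sample strings, so treat this as a sketch and not as a complete reference:

```javascript
// $n: swap two captured groups
const swapped = 'John Smith'.replace(/(\w+) (\w+)/, '$2, $1');
// --> 'Smith, John'

// $<Name>: refer to named groups
const named = '2023-10-06'.replace(
  /(?<year>\d+)-(?<month>\d+)-(?<day>\d+)/,
  '$<day>/$<month>/$<year>'
);
// --> '06/10/2023'

// $` (before), $& (whole match) and $' (after)
const wrapped = 'a-b'.replace(/-/, '[$`|$&|$\']');
// --> 'a[a|-|b]b'

// $$ inserts a literal dollar sign in front of the match
const price = 'cost: 5'.replace(/\d+/, '$$$&');
// --> 'cost: $5'
```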

String.prototype.match

match() with the g flag:

Returns null or all matches as a string[]. There is no captured group info.

const match = 'a_ab_a'.match(/a(b?)/g);
expect(match).toEqual(['a', 'ab', 'a']);
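
Because the result is null when nothing matches, counting matches usually needs a fallback. A small sketch with made-up inputs, assuming an engine that supports the ?? operator (ES2020):

```javascript
// Fall back to an empty array so that .length is always safe to read.
const vowels = 'regular expression'.match(/[aeiou]/g) ?? [];
const vowelCount = vowels.length; // 7

const none = 'rhythm'.match(/[aeiou]/g); // null, not an empty array
const safeCount = (none ?? []).length;   // 0
```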

match() without the g flag:
Returns null, or the first match together with its capturing groups.
The result is an array with additional properties (groups, index and input), the same shape that exec returns.

(Array literals cannot carry named properties, hence the Object.assign wrapper in the code below.)

const match = '0abaa'.match(/a(?<theB>b?)/);
expect(match).toEqual(Object.assign(
    ['ab', 'b'], // [0]: entire matched string, [1]: first captured group
    {
        groups: {theB: 'b'}, // Results of named groups (ES2018)
        index: 1,
        input: '0abaa'
    }
));

// Without the g flag, match behaves exactly like exec:
const exec = /a(?<theB>b?)/.exec('0abaa');
expect(exec).toEqual(match);
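
In everyday code the named groups and the index are usually read straight off the result. A sketch with a hypothetical pattern and input, using optional chaining (ES2020) to cope with the null case:

```javascript
// Hypothetical pattern and input, for illustration only.
const datePattern = /(?<year>\d{4})-(?<month>\d{2})/;

const found = 'released 2018-06'.match(datePattern);
const year = found?.groups.year;   // '2018'
const month = found?.groups.month; // '06'
const position = found?.index;     // 9

// No match: the result is null, so the optional chain yields undefined.
const missing = 'no date here'.match(datePattern);
const missingYear = missing?.groups.year; // undefined
```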

RegExp.prototype.exec

When you need the capturing groups of all matches, it’s exec to the rescue.

const globbing = /(a)(b2?)/g;
const input = 'ab_ab2';
// Indexes     0  3

// Each result also carries input and groups, omitted here for brevity.
globbing.exec(input);
// --> Object.assign(['ab', 'a', 'b'], {index: 0})

globbing.exec(input);
// --> Object.assign(['ab2', 'a', 'b2'], {index: 3})

globbing.exec(input);
// --> null
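
The repeated calls above are normally wrapped in a loop. A minimal sketch of that loop; the variable names and the shape of the collected objects are just one possible choice:

```javascript
const pattern = /(a)(b2?)/g;
const text = 'ab_ab2';
const collected = [];

let current;
// exec returns null once there are no more matches, which ends the loop.
while ((current = pattern.exec(text)) !== null) {
  collected.push({ index: current.index, groups: current.slice(1) });
}
// collected --> [{index: 0, groups: ['a', 'b']}, {index: 3, groups: ['a', 'b2']}]
```

The g flag is essential here: without it every call starts at index 0 again and the loop never ends.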

String.prototype.matchAll (ES2020)

Returns an iterator over all matches, each with its capturing groups. This avoids the while loop that the “old” exec approach requires.

const regex = /(a)(b2?)/g; // TypeError when not g(lobal)!
const result = 'ab_ab2'.matchAll(regex);

for (const match of result) {
  // First match: ['ab', 'a', 'b', index: 0, input: 'ab_ab2', groups: undefined]
  // Second match: ['ab2', 'a', 'b2', index: 3, input: 'ab_ab2', groups: undefined]
}
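
Since the result is an iterator, it can also be spread into an array and processed with the usual array methods. A sketch using the same pattern; the summary shape is made up for illustration:

```javascript
const pairs = [...'ab_ab2'.matchAll(/(a)(b2?)/g)];
const summary = pairs.map((m) => ({ index: m.index, second: m[2] }));
// summary --> [{index: 0, second: 'b'}, {index: 3, second: 'b2'}]

// Without the g flag, the call itself throws a TypeError.
let threw = false;
try {
  'ab_ab2'.matchAll(/(a)(b2?)/);
} catch (error) {
  threw = error instanceof TypeError;
}
```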

Flags

const rl = /ab+c/i;
const rc = new RegExp('ab+c', 'i');

Available flags:

Flag  Property     Remarks
i     .ignoreCase  Case insensitive
g     .global      Do not stop at the first match but find all of them
m     .multiline   ^ and $ match the beginning/end of each line (otherwise of the entire string)
s     .dotAll      . also matches newlines (ES2018)
u     .unicode     Treats the pattern as a sequence of Unicode code points, e.g. /^.$/u.test('😀') is true
y     .sticky      Matches only at the index given by .lastIndex (overrides the g flag)
d     .hasIndices  Adds an indices property to the match object with the start/end indices of each capture group (ES2022)
      .flags       Returns a string with the active flags
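
The effect of the less obvious flags is easiest to see side by side. A sketch with made-up inputs; the d flag part assumes an engine with ES2022 support:

```javascript
const lines = 'one\ntwo';

// m: ^ and $ also match at line boundaries
const withoutM = /^two$/.test(lines); // false
const withM = /^two$/m.test(lines);   // true

// s: . also matches the newline
const withoutS = /one.two/.test(lines); // false
const withS = /one.two/s.test(lines);   // true

// u: the emoji is one code point but two UTF-16 code units
const withoutU = /^.$/.test('😀'); // false
const withU = /^.$/u.test('😀');   // true

// y: match only at exactly lastIndex
const sticky = /b/y;
sticky.lastIndex = 1;
const atIndex1 = sticky.test('abc'); // true, 'b' sits at index 1
sticky.lastIndex = 0;
const atIndex0 = sticky.test('abc'); // false, index 0 holds 'a'

// d: start/end indices for the match and for each group
const withD = /a(b)/d.exec('xab');
// withD.indices[0] --> [1, 3], withD.indices[1] --> [2, 3]
```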

Less Common

const flags: string = /a/ig.flags; // "gi"
const src: string = /a/.source; // "a"

String.prototype.search

Like indexOf, but takes a regex instead of a string. Returns the index of the first match, or -1 if there is none.

const index: number = 'a'.search(/a/); // 0
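
A short comparison with indexOf, using a made-up sentence. It also shows that search always starts at the beginning of the string, whatever the g flag or lastIndex say:

```javascript
const sentence = 'Order 66 shipped';

const literalIndex = sentence.indexOf('66'); // 6
const digitIndex = sentence.search(/\d/);    // 6, the first digit
const noMatch = sentence.search(/z/);        // -1

// search ignores lastIndex and leaves it untouched.
const globalDigits = /\d/g;
globalDigits.lastIndex = 10;
const stillSix = sentence.search(globalDigits); // 6
```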

String.prototype.split

Usually called with a string separator, as in 'a,b,c'.split(','), but it can also split on regex matches.

"a,b;c".split(/,|;/);
// --> ['a', 'b', 'c']

// Wrap in parentheses to include the separator.
"a,b;c".split(/(,|;)/);
// --> ['a', ',', 'b', ';', 'c']
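
A regex separator is handy when the whitespace around the delimiters is inconsistent. A sketch with a made-up input, plus the optional second argument that limits the number of results:

```javascript
// Swallow optional whitespace on both sides of , or ;
const fields = 'a , b;c'.split(/\s*[,;]\s*/);
// fields --> ['a', 'b', 'c']

// The second argument caps the number of returned items.
const firstTwo = 'a,b;c'.split(/,|;/, 2);
// firstTwo --> ['a', 'b']
```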

RegExp.prototype.lastIndex

exec and test read and update lastIndex when the regex has the global (g) or sticky (y) flag. With the sticky flag alone, match and replace start at lastIndex as well.

const input = 'aab';
const regex = /a/g;
regex.test(input); // Returns true. lastIndex is now 1
regex.test(input); // Returns true. lastIndex is now 2
regex.test(input); // Returns false. lastIndex is reset to 0
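
This statefulness is a classic source of bugs when a global regex is stored in a variable and reused. A minimal sketch of the pitfall and one possible way around it:

```javascript
const shared = /a/g;

const first = shared.test('a');  // true, lastIndex is now 1
const second = shared.test('a'); // false, the search started at index 1

// Resetting lastIndex by hand restores the expected behaviour.
shared.lastIndex = 0;
const third = shared.test('a');  // true
```

Dropping the g flag when only test is needed avoids the problem altogether.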

Updates
  • 6 October 2023 : Added ES2020 & ES2021 RegExp enhancements
  • 22 May 2023 : String.prototype.matchAll was added in ES2020
Tags: cheat-sheet regex