JavaScript RegExp Tutorial

posted in javascript on

A cheat sheet for the regex syntax in JavaScript.

MDN RegExp Guide

TL;DR

/^$/.test(''); // test() is a RegExp method, not a String method
'ok'.replace(/(.)(.)/g, '$2$1'); // $$, $&, $`, $'

const matchG = 'aaa'.match(/a/g);
// matchG deep-equals ['a', 'a', 'a']

const matchNoG = 'str'.match(/(st)r/);
// matchNoG deep-equals Object.assign(['str', 'st'], {groups: undefined, index: 0, input: 'str'})
// matchNoG also deep-equals /(st)r/.exec('str')

Common Usage

Define a regex:

// Preferred way
const expressionLiteral = /\w+/;

// Only use this for dynamic patterns
const ctor = new RegExp('\\w+');
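
A dynamic pattern usually contains text that should be matched literally, so special characters need escaping before they reach the constructor. The following is a minimal sketch under that assumption; `escapeRegExp` and `userInput` are hypothetical names for illustration, not built-ins.

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Assumed to come from outside the program, e.g. a search box.
const userInput = 'a.b';
const dynamic = new RegExp(escapeRegExp(userInput), 'g');

// Without escaping, the dot would also match the 'x' in 'axb'.
const hits = 'a.b axb a.b'.match(dynamic);
console.log(hits);
```

Here only the two literal occurrences of a.b are matched.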

RegExp.prototype.test

Returns true or false.

const didMatch = /a/.test('abc');

String.prototype.replace

Without the g flag, only the first match is replaced (which makes no difference in this example, because the pattern matches only once):

'dog The'.replace(/(dog) (The)/g, '$2 $1'); // 'The dog'

| Replacement | Description |
| --- | --- |
| $$ | A literal $ |
| $& | The matched string |
| $` | Portion before the match |
| $' | Portion after the match |
| $n | With n < 100: the nth captured group (1-indexed!) |
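
To make the replacement patterns concrete, here is a small sketch that applies each one to the same input. The input string and the variable names are chosen for illustration only.

```javascript
const source = 'a-b-c';

// $$ inserts a literal dollar sign.
const dollar = source.replace(/b/, '$$');      // 'a-$-c'

// $& inserts the matched string.
const matched = source.replace(/b/, '[$&]');   // 'a-[b]-c'

// $` inserts the portion before the match ('a-').
const before = source.replace(/b/, '$`');      // 'a-a--c'

// $' inserts the portion after the match ('-c').
const after = source.replace(/b/, "$'");       // 'a--c-c'

// $1 refers to the first captured group.
const swapped = source.replace(/(a)-(b)/, '$2-$1'); // 'b-a-c'

console.log(dollar, matched, before, after, swapped);
```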

String.prototype.match

match() with the g flag:

Returns null or all matches as a string[]. There is no captured group info.

const match = 'a_ab_a'.match(/a(b?)/g);
expect(match).toEqual(['a', 'ab', 'a']);

match() without the g flag:
Returns null or the first match and its capturing groups.
The result is an array with additional fields (groups, index and input), exactly what exec returns.

(Hence the weird toEqual array syntax in the code below…)

const match = '0abaa'.match(/a(?<theB>b?)/);
expect(match).toEqual(Object.assign(
    ['ab', 'b'], // Entire matched string, then the first captured group
    {
        groups: {theB: 'b'}, // Results of named groups (ES2018)
        index: 1,
        input: '0abaa'
    }
));

// Without the g flag, match behaves exactly like exec:
const exec = /a(?<theB>b?)/.exec('0abaa');
expect(exec).toEqual(match);
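
Named groups pay off once a pattern has several captures. A minimal sketch, assuming a date in YYYY-MM-DD form; the pattern, input string and variable names are made up for illustration.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const result = 'Released on 2018-06-26.'.match(datePattern);

// match() returns null when nothing matches, so guard before destructuring.
const { year, month, day } = result ? result.groups : {};

console.log(year, month, day);
```

Destructuring result.groups avoids having to count parentheses to find the right numeric index.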

String.prototype.matchAll

Returns an iterator over all matches, including their capturing groups. It was standardized in ES2020 (formerly proposal-string-matchAll) and throws a TypeError when the regex lacks the g flag.
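
A minimal sketch of String.prototype.matchAll, assuming an ES2020-capable runtime. It reuses the pattern and input of the exec example in this cheat sheet; the variable names are illustrative.

```javascript
// matchAll returns an iterator, so spread it into an array first.
const pairs = [...'ab_ab2'.matchAll(/(a)(b2?)/g)];

// Each entry has the same shape as an exec() result.
const summary = pairs.map((m) => ({
    match: m[0],          // Entire matched string
    groups: m.slice(1),   // Captured groups
    index: m.index        // Position of the match in the input
}));

console.log(summary);
```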

RegExp.prototype.exec

When you need the capturing groups of all matches, it’s exec to the rescue.

const globbing = /(a)(b2?)/g;
const input = 'ab_ab2';
// Indexes     0  3

globbing.exec(input);
// --> Object.assign(['ab', 'a', 'b'], {index: 0})

globbing.exec(input);
// --> Object.assign(['ab2', 'a', 'b2'], {index: 3})

globbing.exec(input);
// --> null
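
In practice the repeated exec calls are usually wrapped in a loop that stops on null. The sketch below shows one way to do that; `collectMatches` is a hypothetical helper name, and the two guards are defensive choices rather than requirements.

```javascript
// Hypothetical helper: gather every match with its groups and index.
function collectMatches(regex, text) {
    // Without the g flag, exec() never advances and the loop would not end.
    if (!regex.global) {
        throw new TypeError('collectMatches expects a regex with the g flag');
    }
    regex.lastIndex = 0; // Start from the beginning, whatever happened before
    const found = [];
    let m;
    while ((m = regex.exec(text)) !== null) {
        found.push({ match: m[0], groups: m.slice(1), index: m.index });
        // A zero-length match does not advance lastIndex, so nudge it along.
        if (m[0] === '') {
            regex.lastIndex++;
        }
    }
    return found;
}

const allMatches = collectMatches(/(a)(b2?)/g, 'ab_ab2');
console.log(allMatches);
```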

Flags

const rl = /ab+c/i;
const rc = new RegExp('ab+c', 'i');

Available flags:

| Flag | Property | Remarks |
| --- | --- | --- |
| i | .ignoreCase | Case insensitive |
| g | .global | Do not stop at the first match but find all of them |
| m | .multiline | ^ and $ match the beginning/end of each line (otherwise of the entire string) |
| s | .dotAll | . also matches newlines (ES2018) |
| u | .unicode | Treats the pattern as a sequence of Unicode code points: /^.$/u.test('😀') is true |
| y | .sticky | Matches only at the index given by .lastIndex (overrides the g flag) |
|   | .flags | Returns a string with the active flags |
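
The effect of the less obvious flags is easiest to see side by side. A short sketch with a made-up two-line input; all variable names are illustrative.

```javascript
const text = 'one\nTwo';

// m: ^ and $ also match at line breaks.
const withM = /^two$/im.test(text);       // true
const withoutM = /^two$/i.test(text);     // false

// s: the dot also matches the newline.
const withS = /one.Two/s.test(text);      // true
const withoutS = /one.Two/.test(text);    // false

// u: the dot matches a whole code point instead of one UTF-16 code unit.
const withU = /^.$/u.test('😀');          // true
const withoutU = /^.$/.test('😀');        // false

// y: the match must start exactly at lastIndex.
const sticky = /Two/y;
const stickyAtZero = sticky.test(text);   // false, 'Two' does not start at index 0
sticky.lastIndex = 4;
const stickyAtFour = sticky.test(text);   // true, 'Two' starts at index 4

console.log(withM, withS, withU, stickyAtFour);
```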

Less Common

const flags: string = /a/ig.flags; // "gi"
const src: string = /a/.source; // "a"

String.prototype.search

A more powerful version of indexOf. Returns -1 if no match.

const index: number = 'a'.search(/a/); // 0. search() is a String method
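
A brief sketch of how search compares to indexOf, using a made-up sentence; the variable names are illustrative.

```javascript
const sentence = 'Order 66 shipped';

// indexOf can only look for a fixed substring.
const fixed = sentence.indexOf('66');       // 6

// search accepts a pattern, here "the first digit".
const firstDigit = sentence.search(/\d/);   // 6

// No match yields -1, just like indexOf.
const missing = sentence.search(/z/);       // -1

console.log(fixed, firstDigit, missing);
```

Note that search always starts at the beginning of the string, so the g flag and lastIndex have no effect on its result.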

String.prototype.split

Usually called with a string separator, as in 'a,b,c'.split(','), but it can also split on regex matches.

"a,b;c".split(/,|;/);
// --> ['a', 'b', 'c']

// Wrap in parentheses to include the separator.
"a,b;c".split(/(,|;)/);
// --> ['a', ',', 'b', ';', 'c']
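
Two more variations that come up often, sketched with made-up inputs: letting the separator swallow surrounding whitespace, and passing the optional limit argument.

```javascript
// The separator pattern consumes optional whitespace around , or ;
const fields = 'a , b;c ;  d'.split(/\s*[,;]\s*/);
// --> ['a', 'b', 'c', 'd']

// The second argument caps the number of returned items.
const firstTwo = 'a,b;c'.split(/[,;]/, 2);
// --> ['a', 'b']

console.log(fields, firstTwo);
```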

RegExp.prototype.lastIndex

The index at which the next match attempt starts. It is read and updated by exec, test, … when the global (g) flag is set, and by any of the functions above when the sticky (y) flag is set.

const input = 'aab';
const regex = /a/g;
regex.test(input); // Returns true. lastIndex is now 1
regex.test(input); // Returns true. lastIndex is now 2
regex.test(input); // Returns false. lastIndex is reset to 0
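
Because lastIndex lives on the regex object, reusing one global regex across calls can give surprising results. A minimal sketch of that pitfall and two possible ways around it; the variable names are illustrative.

```javascript
const shared = /a/g;

const first = shared.test('a');    // true, lastIndex is now 1
const second = shared.test('a');   // false, the search started at index 1

// Option 1: reset lastIndex before reusing the regex.
shared.lastIndex = 0;
const afterReset = shared.test('a'); // true

// Option 2: drop the g flag when only a yes/no answer is needed.
const stateless = /a/;
const again = stateless.test('a') && stateless.test('a'); // true

console.log(first, second, afterReset, again);
```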
