Split a String keeping the Whitespace in JavaScript

Borislav Hadzhiev

Last updated: Nov 18, 2021

Split a String keeping the Whitespace

To split a string keeping the whitespace, call the split() method passing it the following regular expression - /(\s+)/. The regular expression uses a capturing group to preserve the whitespace when splitting the string.

index.js
const str = 'apple banana kiwi';

const results = str.split(/(\s+)/);

// 👇️ ['apple', ' ', 'banana', ' ', 'kiwi']
console.log(results);

If you're looking to avoid regular expressions, scroll down to the next section.

The only parameter we passed to the String.split method is a regular expression.

The forward slashes / / mark the beginning and end of the regular expression.

The \s special character matches whitespace (spaces, tabs, newlines).

The plus + matches the preceding item (whitespace) one or more times, so a run of consecutive whitespace characters is treated as a single separator and kept as one element in the results.

The parentheses () are called a capturing group and allow us to match the whitespace and still include it in the results.
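To make the behavior of \s+ concrete, here is a minimal sketch using made-up sample strings (the variable names are only for illustration). It shows that tabs and newlines are matched too, that a run of whitespace stays together as one element, and that leading or trailing whitespace leaves empty strings at the edges of the array.

```javascript
// Hypothetical sample input: two spaces, a tab, and a newline between words.
const mixed = 'apple  banana\tkiwi\nmango';

// Each run of whitespace is kept as one element, exactly as it appeared.
const parts = mixed.split(/(\s+)/);
console.log(parts);
// ['apple', '  ', 'banana', '\t', 'kiwi', '\n', 'mango']

// Leading or trailing whitespace yields empty strings at the edges.
const padded = ' apple '.split(/(\s+)/);
console.log(padded);
// ['', ' ', 'apple', ' ', '']
```

If the empty strings are unwanted, one option is to call trim() on the string before splitting.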

Here's an easy way to visualize how capturing groups work.

index.js
console.log('abc'.split(/b/)); // 👉️ ['a', 'c']

console.log('abc'.split(/(b)/)); // 👉️ ['a', 'b', 'c']

The second example uses a capturing group () to match the b character, but still include it in the results.
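One way to check that nothing is lost is to join the pieces back together. The sketch below, with an assumed sample string, compares a capturing group against a non-capturing group (?:...), which groups the pattern without adding the match to the results.

```javascript
// Hypothetical sample input containing a space and a tab.
const original = 'apple banana\tkiwi';

// Capturing group: separators are kept, so joining restores the input.
const kept = original.split(/(\s+)/);
console.log(kept.join('') === original); // true

// Non-capturing group: the whitespace is matched but dropped.
const dropped = original.split(/(?:\s+)/);
console.log(dropped); // ['apple', 'banana', 'kiwi']
```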

If you're looking to avoid using regular expressions, you can chain calls to the split() and join() methods.

This approach only works for spaces, not for tabs or newlines.

index.js
const str = 'apple banana kiwi';

const result = str.split(' ').join('# #').split('#');

console.log(result); // 👉️ ['apple', ' ', 'banana', ' ', 'kiwi']

We first split the string on each space, to get an array containing the words in the string.

index.js
const str = 'apple banana kiwi';

// 👇️ ['apple', 'banana', 'kiwi']
console.log(str.split(' '));

The next step is to convert the array to a string, by using the join() method.

index.js
const str = 'apple banana kiwi';

// 👇️ "apple# #banana# #kiwi"
console.log(str.split(' ').join('# #'));

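We used a hash #, but any character that does not already appear in the string would work, as long as there is a space in the middle. If the string itself contains the chosen character, the final split would break on it as well.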

The last step is to split the string on each hash.

index.js
const str = 'apple banana kiwi';

const result = str.split(' ').join('# #').split('#');

console.log(result); // 👉️ ['apple', ' ', 'banana', ' ', 'kiwi']
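The split() and join() chain has a few limits worth seeing in action. The sketch below wraps the same chain in a small arrow function (the name chain and the sample inputs are assumptions made for illustration) and runs it on inputs with a tab, two consecutive spaces, and a hash already in the text.

```javascript
// Hypothetical wrapper around the split/join chain shown above.
const chain = (s) => s.split(' ').join('# #').split('#');

// A tab is not a space, so it stays glued to the words around it.
const withTab = chain('apple\tbanana kiwi');
console.log(withTab);
// ['apple\tbanana', ' ', 'kiwi']

// Two consecutive spaces produce an empty string between them.
const withDoubleSpace = chain('apple  banana');
console.log(withDoubleSpace);
// ['apple', ' ', '', ' ', 'banana']

// A '#' already in the input is consumed by the final split.
const withHash = chain('item #1 ready');
console.log(withHash);
// ['item', ' ', '', '1', ' ', 'ready']
```

For the double-space input, the regular expression version would return ['apple', '  ', 'banana'] instead.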
Which approach you pick is a matter of personal preference. I'd go with the regular expression in this scenario because I find it more direct and intuitive. I would also add comments regarding how capturing groups work to improve readability.
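As a rough sketch of what a commented version of the regular expression approach might look like, the function below wraps the split() call. The name splitKeepingWhitespace is hypothetical and not part of any library.

```javascript
// Hypothetical helper name, used here only for illustration.
function splitKeepingWhitespace(text) {
  // \s+   matches one or more whitespace characters (spaces, tabs, newlines).
  // (...) is a capturing group: split() includes whatever the group matched
  //       in the returned array instead of discarding it.
  return text.split(/(\s+)/);
}

const pieces = splitKeepingWhitespace('apple banana kiwi');
console.log(pieces);
// ['apple', ' ', 'banana', ' ', 'kiwi']
```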