Basics of JavaScript · String · split() (method)

This article is a transcript of my free YouTube series about the basics of web development. If you prefer watching over reading, feel free to visit my channel “Dev Newbs”.
Hi to all of the developers watching this episode! Today’s method is one of my favourite ones. Don’t ask me why I have favourite String methods, please. I guess I am just weird. But now seriously — we will be splitting strings like crazy. Brace yourselves!
The split() method splits a string into an array of substrings and returns the new array. It does not change the original string. The method has two parameters, both optional.
The first one specifies the string or the regular expression to use for splitting. If it is omitted, the entire string is returned as an array with only one item. On the other hand, if an empty string (“”) is used as the separator, the string is split between each character.
The second parameter is an integer that limits the number of items in the resulting array. Items beyond the limit are not included.
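To make the limit a bit more concrete, here is a minimal sketch. The string and the variable names are made up for illustration. Note that the rest of the string is simply dropped; it is not appended to the last item.

```javascript
const letters = "a,b,c,d";

// at most 2 items; the rest of the string is dropped, not appended
const firstTwo = letters.split(",", 2);
console.log(firstTwo); // ["a", "b"]

// a limit of 0 gives an empty array
const none = letters.split(",", 0);
console.log(none); // []

// a limit bigger than the number of items changes nothing
const all = letters.split(",", 10);
console.log(all); // ["a", "b", "c", "d"]
```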
Let’s see how it works in example 1.
const str = "Hello Dev Newbs! 😃";

// separator is not provided
str.split() // ["Hello Dev Newbs! 😃"]

// separator is empty string
str.split("")
// ["H", "e", "l", "l", "o", " ", "D", "e", "v", " ", "N", "e", "w",
//  "b", "s", "!", " ", "\ud83d", "\ude03"]

// separator is empty space
str.split(" ") // ["Hello", "Dev", "Newbs!", "😃"]

// separator is RegExp specifying (optional) empty space + capital letter
str.split(/[\s]*[A-Z]/) // ["", "ello", "ev", "ewbs! 😃"]

// separator is empty string & limit is 5 first letters
str.split("", 5) // ["H", "e", "l", "l", "o"]
As we can see in the first example, omitting the separator results in an array with only one element consisting of the entire string.
An empty string separator splits the string into individual UTF-16 “characters”. I want to point out that characters that consist of more than one code unit will be split into code units, not code points. Therefore our emoji symbol is split into its two code unit values, as we see in the output.
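If you would rather keep the emoji in one piece, one option is to lean on the string iterator, which walks code points instead of code units. This is only a small sketch with made-up variable names. Keep in mind that symbols built from several code points (some flags or combined emoji, for example) would still be taken apart this way.

```javascript
const greeting = "Hi 😃";

// split("") works on UTF-16 code units, so the emoji falls apart
const codeUnits = greeting.split("");
console.log(codeUnits.length); // 5

// the spread syntax uses the string iterator, which walks code points
const codePoints = [...greeting];
console.log(codePoints); // ["H", "i", " ", "😃"]

// Array.from() uses the same iterator and gives the same result
const viaArrayFrom = Array.from(greeting);
console.log(viaArrayFrom.length); // 4
```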
If we use a string or a regular expression as the separator, a match has to be found in the string; otherwise the result is an array with one element consisting of the entire string.
If the separator matches at the beginning or at the end of the string, the split still happens and we get an empty string element at the beginning or at the end of the array, respectively. You can see that in the case with the regular expression separator, where we are searching for a capital letter optionally preceded by whitespace. The capital H at the beginning of the string fits this condition, so the split creates an empty string as the first element of the resulting array.
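The two cases above, a separator at the edges and a separator that never matches, can be sketched on a tiny made-up string. The filter() call at the end is just one possible way to get rid of the empty strings if you do not want them.

```javascript
const csv = ",a,b,";

// separator at both ends: empty strings at the start and at the end
const parts = csv.split(",");
console.log(parts); // ["", "a", "b", ""]

// separator never matches: one element with the whole string
const noMatch = csv.split(";");
console.log(noMatch); // [",a,b,"]

// drop the empty strings if they are not wanted
const nonEmpty = parts.filter((part) => part !== "");
console.log(nonEmpty); // ["a", "b"]
```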
There is some counterintuitive behavior when it comes to the regular expressions that I should mention. If separator is a regular expression with capturing parentheses, then each time separator matches, the results (including any undefined results) of the capturing parentheses are spliced into the output array.
Let’s see the behavior in the second example.
const months = 'January,February ,March,April, May, June , July,August, ...';

const re1 = /\s*,\s*/;
const re2 = /\s*(,)\s*/;

months.split(re1)
// ["January", "February", "March", "April", "May", "June", "July",
//  "August", "..."]

months.split(re2)
// ["January", ",", "February", ",", "March", ",", "April", ",",
//  "May", ",", "June", ",", "July", ",", "August", ",", "..."]
The first regular expression does not use parentheses and we get what we would expect. However, in the second case the parentheses contain the comma, so each matched comma is also added to the resulting array.
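If you need parentheses only for grouping, a non-capturing group (?:...) keeps the extra items out of the result. The sketch below uses made-up strings and also shows the undefined case mentioned earlier: an optional capturing group that does not take part in the match still adds an item to the array.

```javascript
const list = "a, b,c";

// capturing group: the matched comma is added to the result
const withCapture = list.split(/\s*(,)\s*/);
console.log(withCapture); // ["a", ",", "b", ",", "c"]

// non-capturing group (?:...): grouping without the extra items
const withoutCapture = list.split(/\s*(?:,|;)\s*/);
console.log(withoutCapture); // ["a", "b", "c"]

// optional group that does not take part in the match: undefined is added
const withUndefined = "a1b".split(/(x)?1/);
console.log(withUndefined); // ["a", undefined, "b"]
```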
There is another interesting aspect to this example: the expression “\s*” matches zero or more whitespace characters around the comma, so they are removed together with it. This can come in handy when we try to split strings that do not have an exactly defined structure.
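One thing to watch out for: “\s*” only takes care of whitespace that sits right next to a separator. Whitespace at the very beginning or end of the string stays in the first and last item. A minimal sketch with a made-up string, where trim() is one way to handle it:

```javascript
const raw = "  January , February,March  ";

// \s* only removes whitespace right next to a comma
const untrimmed = raw.split(/\s*,\s*/);
console.log(untrimmed); // ["  January", "February", "March  "]

// calling trim() first takes care of the outer whitespace
const trimmed = raw.trim().split(/\s*,\s*/);
console.log(trimmed); // ["January", "February", "March"]
```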
Let’s see one more example to actually show a use case, when this behavior makes sense.
const numbers = 'Uno Dos Three 4 Five 6 Sieben Osem 9 zehn';

const re = /\s*(\d+)\s*/;

numbers.split(re)
// ["Uno Dos Three", "4", "Five", "6", "Sieben Osem", "9", "zehn"]
We are trying to split a string containing different representations of numbers. Our numbers can be written as words in different languages or as digits. We first want to separate the numbers written as words from the numbers written as digits. We can do that using the provided regular expression. Then we can process each group of word numbers as we see fit.
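As a rough idea of what that processing could look like, here is a small sketch. The classify() helper and the shape of the objects it returns are made up for this illustration; your own processing will probably look different.

```javascript
const numbers = "Uno Dos Three 4 Five 6 Sieben Osem 9 zehn";
const re = /\s*(\d+)\s*/;

// hypothetical helper: label each part as digits or as a group of words
function classify(part) {
  if (/^\d+$/.test(part)) {
    return { kind: "digits", value: Number(part) };
  }
  return { kind: "words", words: part.split(" ") };
}

const classified = numbers.split(re).map(classify);
console.log(classified[0]); // { kind: "words", words: ["Uno", "Dos", "Three"] }
console.log(classified[1]); // { kind: "digits", value: 4 }
```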
Of course, there are multiple other solutions, some are more effective than others, but still… It’s nice to also have this option.
Okay, that was the String method split() covered. I hope you enjoyed this episode.
As always, thanks for your time. I appreciate it a lot. See you in the next article.