Another "How to split a string on | (pipe) in Java" post? NO!!! Please, read the API reference!!!!!

This is not about Java, nor is it about split or strings. This entry is about the need to always check the API reference.

Let me illustrate this need with a short tale.


A frequently asked question among novice Java programmers is "How to split a string on | (pipe)".

It would be naive to think that this is going to work:

tokensArray = content.split("|");
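To see what the naive call actually does, here is a minimal sketch. The class and method names (NaiveSplitDemo, naiveSplit) are made up for illustration and are not part of the sample used later. Read as a regex, "|" is an alternation between two empty patterns, so it matches the empty string at every position, and the input gets cut between every character instead of at the pipes.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original sample.
class NaiveSplitDemo {

    // Splits on the unescaped pipe, exactly like the naive call above.
    static String[] naiveSplit(String content) {
        return content.split("|");
    }

    public static void main(String[] args) {
        // On Java 8 or later this prints [a, |, b]:
        // the pipe survives as a token of its own.
        System.out.println(Arrays.toString(naiveSplit("a|b")));
    }
}
```

No exception, no warning: the call compiles and runs, it just does not split where you wanted.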

Well, you are (supposed to be) a neat programmer, and you (think you) do your work. You can analyze the first few results after googling the question: this one, this one, just another one, or the last one, for example.

And after a brief "research", you (are proud of yourself because you) have the "right" answer:

tokensArray = content.split("\\|");

Yes, the argument is a regular expression, and in regex | is a metacharacter representing the OR operator. You need to escape that character using \, blah, blah, blah.
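The escaping can be written in more than one way. The sketch below compares the backslash escape with java.util.regex.Pattern.quote and with a one-character class; all three treat the pipe as a literal. The class and method names (PipeEscapeDemo and friends) are hypothetical and only serve the comparison.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class comparing three ways to match a literal pipe.
class PipeEscapeDemo {

    // Backslash escape: the Java literal "\\|" is the regex \|
    static String[] withBackslash(String content) {
        return content.split("\\|");
    }

    // Pattern.quote wraps the text so it is matched literally.
    static String[] withQuote(String content) {
        return content.split(Pattern.quote("|"));
    }

    // Inside a character class the pipe has no special meaning.
    static String[] withCharClass(String content) {
        return content.split("[|]");
    }

    public static void main(String[] args) {
        String line = "Terry|Pratchett|writer|1948|2015";
        System.out.println(Arrays.toString(withBackslash(line)));
        System.out.println(Arrays.toString(withQuote(line)));
        System.out.println(Arrays.toString(withCharClass(line)));
    }
}
```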

Let's try it on a sample. We are parsing a set of lines (records), each one with a bunch of strings (fields) delimited by | (pipe). The structure is:

FIRST_NAME | LAST_NAME | PROFESSION | BIRTH_DATE | DATE_OF_DEATH

With this fragment of code:

package eu.albertomorales.mite.helloWorld;

public class SplitSample {

public static void main(String[] args) {
    SplitSample sample = new SplitSample();
    sample.doIt();
}

private void doIt() {
    /*
     * First line
     */
    String firstLine = "Terry|Pratchett|writer|1948|2015";
    splitAndPrint(firstLine);
}

private void splitAndPrint(String content) {
    String[] tokensArray;
    String firstName, familyName, profession, birthDate, dateOfDeath;
    tokensArray = content.split("\\|");
    firstName = tokensArray[0];
    familyName = tokensArray[1];
    profession = tokensArray[2];
    birthDate = tokensArray[3];
    dateOfDeath = tokensArray[4];
    System.out.println("First Name: "+firstName);
    System.out.println("Family Name: "+familyName);
    System.out.println("Profession: "+profession);
    System.out.println("Birth Date: "+birthDate);
    System.out.println("Date of Death: "+dateOfDeath);
}

}

you get this output:

First Name: Terry
Family Name: Pratchett
Profession: writer
Birth Date: 1948
Date of Death: 2015

But now let's run the sample again with a living writer (one whose date of death is empty).

...

    /*
     * Second line
     */
    String secondLine = "Joanne|Rowling|writer|1965|";
    splitAndPrint(secondLine);

...

You get a gorgeous exception (ArrayIndexOutOfBoundsException). Do you know why? Because you haven't read the API reference, and the default behaviour of the split method is NOT what you expected. Simple.
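For the impatient, here is a sketch of what the reference says. String.split(regex) behaves like split(regex, 0), and a limit of zero discards trailing empty strings, so the empty date of death never reaches the array. A negative limit keeps them. The class and method names below (SplitLimitDemo, splitDefault, splitKeepingTrailing) are hypothetical; only the two split overloads come from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class contrasting the two String.split overloads.
class SplitLimitDemo {

    // Same as split(regex, 0): trailing empty strings are dropped.
    static String[] splitDefault(String content) {
        return content.split("\\|");
    }

    // A negative limit keeps every field, including trailing empty ones.
    static String[] splitKeepingTrailing(String content) {
        return content.split("\\|", -1);
    }

    public static void main(String[] args) {
        String secondLine = "Joanne|Rowling|writer|1965|";

        // Four tokens: index 4 does not exist, hence the exception.
        System.out.println(splitDefault(secondLine).length);

        // Five tokens: the last one is the empty date of death.
        System.out.println(Arrays.toString(splitKeepingTrailing(secondLine)));
    }
}
```

Whether an empty string is the right representation for a missing date is a separate design decision; the point is only that the field count now matches the record structure.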


And we will code and sing and dance and we will live happily ever after.