It is very easy to split a string in Java. Before JDK 1.4, the StringTokenizer class was the usual way to split a string on a delimiter; from JDK 1.4 onward it is considered a legacy class. This is what the JDK 1.4 documentation says about it:
"StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead."
JDK 1.4 introduced two split methods in the String class that make splitting a string on a delimiter very easy. These methods are:
public String[] split(String regex)
public String[] split(String regex, int limit)
- Parameters:
regex - the delimiting regular expression
limit - the result threshold
- Returns:
the array of strings computed by splitting this string around matches of the given regular expression
- Throws:
PatternSyntaxException - if the regular expression's syntax is invalid
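The rule of thumb: split treats its argument as a regular expression, so if the delimiter character has a special meaning in regular expressions (for example | . * + ? ( ) [ ] { } ^ $ \), it must be preceded by a regex escape, which is written as a double backslash ('\\') in a Java string literal. Otherwise you can use the delimiter character directly.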
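To make the rule concrete, here is a minimal sketch that checks whether split accepts a delimiter as written. The class name EscapeRuleDemo and the helper isUsableDelimiter are made up for this illustration and are not part of the JDK.

```java
import java.util.Arrays;
import java.util.regex.PatternSyntaxException;

class EscapeRuleDemo {
    // Hypothetical helper: returns true when split accepts the given
    // delimiter regex, false when it throws PatternSyntaxException.
    static boolean isUsableDelimiter(String regex) {
        try {
            "a(b(c".split(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsableDelimiter(";"));    // plain character, accepted
        System.out.println(isUsableDelimiter("("));    // unescaped metacharacter, rejected
        System.out.println(isUsableDelimiter("\\("));  // escaped metacharacter, accepted
        System.out.println(Arrays.toString("a(b(c".split("\\("))); // [a, b, c]
    }
}
```

Not every unescaped metacharacter throws an exception. Some, such as the pipe, are valid regular expressions on their own and simply split in the wrong places, so a missing escape can go unnoticed.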
1.) How do you split a string using semicolon ( ; ) delimiter?
String str = "a;b;c;d";
Answer: str.split(";"); The semicolon ( ; ) has no special meaning in regular expressions, so you can use it directly to split a string.
2.) How do you split a string using pipe ( | ) delimiter?
String str = "a|b|c|d";
Answer: str.split("\\|"); The pipe character ( | ) is the alternation (OR) operator in regular expressions, so it cannot be used directly as a delimiter. You have to put a double backslash ('\\') before the pipe so that the regular expression matches a literal | character.
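As a small sketch of the pipe case, the example below splits the same string with a hand-escaped pattern and with Pattern.quote, which builds a literal pattern from any delimiter string. The class and method names (PipeSplitDemo, splitEscaped, splitQuoted) are invented for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Escapes the pipe by hand so the regex engine treats it as a literal character.
    static String[] splitEscaped(String input) {
        return input.split("\\|");
    }

    // Lets Pattern.quote build the literal pattern; useful when the
    // delimiter is not known in advance.
    static String[] splitQuoted(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String str = "a|b|c|d";
        System.out.println(Arrays.toString(splitEscaped(str)));      // [a, b, c, d]
        System.out.println(Arrays.toString(splitQuoted(str, "|")));  // [a, b, c, d]
        // Unescaped, "|" is an alternation of two empty patterns, so the
        // result is not the four tokens above.
        System.out.println(Arrays.toString(str.split("|")));
    }
}
```

Pattern.quote is the safer choice when the delimiter comes from configuration or user input, because the caller does not need to know which characters are regex metacharacters.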
3.) How do you split a string using space ( ) delimiter?
String str = "a b c d";
Answer: str.split("\\s"); This one is interesting: the regular expression "\\s" matches a single whitespace character (a space, tab, or line break). Without the escape, the pattern "s" would match the letter 's' instead. This expression breaks the string wherever it finds a whitespace character.
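One detail worth seeing in code: "\\s" matches exactly one whitespace character, so two spaces in a row produce an empty token between them. The sketch below compares it with "\\s+", which treats a run of whitespace as a single delimiter. The class and method names are assumptions made for this illustration.

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Splits at every single whitespace character.
    static String[] splitOnEachWhitespace(String input) {
        return input.split("\\s");
    }

    // Splits at every run of one or more whitespace characters.
    static String[] splitOnWhitespaceRuns(String input) {
        return input.split("\\s+");
    }

    public static void main(String[] args) {
        String str = "a  b\tc d"; // two spaces, then a tab, then one space
        System.out.println(Arrays.toString(splitOnEachWhitespace(str))); // [a, , b, c, d]
        System.out.println(Arrays.toString(splitOnWhitespaceRuns(str))); // [a, b, c, d]
    }
}
```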
4.) How do you split a string using new line?
Answer: str.split("\\n"); Similarly, the regular expression "\\n" matches a newline character. This expression breaks a multi-line string at every newline.
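Text that comes from Windows tools often ends lines with "\r\n" rather than "\n". The sketch below, which assumes such mixed input, shows that "\\n" leaves the carriage return attached to the token, while "\\r?\\n" handles both line endings. The class and method names are hypothetical.

```java
import java.util.Arrays;

class NewlineSplitDemo {
    // Splits on the newline character only.
    static String[] splitOnNewline(String input) {
        return input.split("\\n");
    }

    // Splits on a newline optionally preceded by a carriage return.
    static String[] splitOnAnyLineEnding(String input) {
        return input.split("\\r?\\n");
    }

    public static void main(String[] args) {
        String text = "line1\nline2\r\nline3";
        // The second token keeps its trailing '\r' here.
        System.out.println(splitOnNewline(text)[1].length());             // 6
        System.out.println(Arrays.toString(splitOnAnyLineEnding(text)));  // [line1, line2, line3]
    }
}
```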
public class SplitTest {
    public static void main(String[] args) {
        SplitTest st = new SplitTest();
        st.mySplit();
    }

    // Splits a semicolon-delimited string and prints each token with its index.
    private void mySplit() {
        String str = "a;b;c;d";
        String[] tokens = str.split(";");
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("i" + i + ":" + tokens[i]);
        }
    }
}
Result:
i0:a
i1:b
i2:c
i3:d
That is the expected result, but a problem arises if the string contains empty tokens.
public class SplitTest {
    public static void main(String[] args) {
        SplitTest st = new SplitTest();
        st.mySplit();
    }

    // Splits a string that has empty tokens at the start, in the middle, and at the end.
    private void mySplit() {
        String str = ";;a;b;;;c;d;;;;";
        String[] tokens = str.split(";");
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("i" + i + ":" + tokens[i]);
        }
    }
}
Result:
i0:
i1:
i2:a
i3:b
i4:
i5:
i6:c
i7:d
This result may be surprising: the empty tokens at the end of the string are missing, because split(String regex) discards trailing empty strings. How do we keep them? This is where the second split method comes into the picture.
public String[] split(String regex, int limit)
The second parameter limit controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
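The three cases described above (positive, zero, and negative limit) can be compared side by side. This is a minimal sketch; the class name LimitDemo, the helper splitWithLimit, and the sample string are chosen only for this illustration.

```java
import java.util.Arrays;

class LimitDemo {
    // Thin wrapper around String.split(regex, limit) with a fixed ';' delimiter.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(";", limit);
    }

    public static void main(String[] args) {
        String str = "a;b;c;;";
        // Positive limit: at most 2 entries, the last one holds the rest of the input.
        System.out.println(Arrays.toString(splitWithLimit(str, 2)));   // [a, b;c;;]
        // Zero: same as split(";"), trailing empty strings are discarded.
        System.out.println(Arrays.toString(splitWithLimit(str, 0)));   // [a, b, c]
        // Negative: trailing empty strings are kept.
        System.out.println(Arrays.toString(splitWithLimit(str, -1)));  // [a, b, c, , ]
    }
}
```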
To solve our problem, pass -1 as the second parameter (limit) to the split method.
public class SplitTest {
    public static void main(String[] args) {
        SplitTest st = new SplitTest();
        st.mySplit();
    }

    // Splits with a negative limit so that trailing empty tokens are kept.
    private void mySplit() {
        String str = ";;a;b;;;c;d;;;;";
        String[] tokens = str.split(";", -1);
        for (int i = 0; i < tokens.length; i++) {
            System.out.println("i" + i + ":" + tokens[i]);
        }
    }
}
Result:
i0:
i1:
i2:a
i3:b
i4:
i5:
i6:c
i7:d
i8:
i9:
i10:
i11:
This is the complete result, including the empty tokens at the end. You can try this program with other limit values and see the results for yourself.
Source:
http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html