Java Strings: Comparing Strings
In Java, strings are objects that contain a sequence of characters. Strings are used constantly, and being able to compare them is essential for sorting, matching, and many other tasks your program may need. There are five ways to compare entire strings in Java:
- The equals( ) method, which compares the content of two strings and is case sensitive.
- The equalsIgnoreCase( ) method, which compares the content of two strings and is not case sensitive.
- The == operator, which compares two object references and returns true only if both references point to the same object.
- The compareTo( ) method, which compares two strings lexicographically and is case sensitive.
- The compareToIgnoreCase( ) method, which compares two strings lexicographically and is not case sensitive.
The equals() Method
When comparing strings we cannot rely on the equality operator == to compare their content, because it only compares object references. To compare string content we use the equals( ) method, which compares two strings and returns a boolean (true or false). The equals( ) method is case sensitive and will only return true if both strings are identical. The general syntax is:
string1.equals(string2);
Let’s see this in action by creating three string variables, each with a different value: “Learning Glue”, “learningglue”, and “learning glue”. You’ll notice that these strings contain the same letters in the same order, but they differ in case or spacing, so they are not identical.
String str1 = "Learning Glue";
String str2 = "learningglue";
String str3 = "learning glue";
System.out.println("str1: " + str1 + ", str2: " + str2 + ", str3: " + str3);
System.out.println("str1 equals to str2? " + str1.equals(str2));
System.out.println("str1 equals to str3? " + str1.equals(str3));
Since the equals( ) method is case sensitive, both strings must be identical in structure (characters, spaces, etc.) and in case before it will return true. Since no two of these strings are identical, we get false for both comparisons:
run:
str1: Learning Glue, str2: learningglue, str3: learning glue
str1 equals to str2? false
str1 equals to str3? false
BUILD SUCCESSFUL (total time: 0 seconds)
The equalsIgnoreCase() Method
Like the equals( ) method, this method compares two strings and returns true if the strings are equal, and false if not. But unlike the equals( ) method, it ignores differences between upper and lower case. The general syntax is:
string1.equalsIgnoreCase(string2);
Let’s add this to our code:
System.out.println("str1 equalsIgnoreCase to str2? " + str1.equalsIgnoreCase(str2));
System.out.println("str1 equalsIgnoreCase to str3? " + str1.equalsIgnoreCase(str3));
Now the output identifies str1 (“Learning Glue”) and str3 (“learning glue”) as equal, because the only difference between them is their case. str1 and str2 are still not equal, because str2 is also missing the space:
run:
str1: Learning Glue, str2: learningglue, str3: learning glue
str1 equals to str2? false
str1 equals to str3? false
str1 equalsIgnoreCase to str2? false
str1 equalsIgnoreCase to str3? true
BUILD SUCCESSFUL (total time: 0 seconds)
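As a minimal sketch of where equalsIgnoreCase( ) is typically useful, the example below checks whether some user input means "yes". The class name YesNoDemo and the helper isYes are hypothetical and not part of the lesson. Calling the method on the literal rather than on the input is a common habit, since equalsIgnoreCase(null) simply returns false instead of throwing an exception.

```java
// Hypothetical example: YesNoDemo and isYes are illustrative names only.
class YesNoDemo {

    // Returns true when the input is "yes" in any mix of upper and lower case.
    // Calling the method on the literal keeps this safe when input is null.
    static boolean isYes(String input) {
        return "yes".equalsIgnoreCase(input);
    }

    public static void main(String[] args) {
        System.out.println(isYes("YES"));        // true
        System.out.println(isYes("Yes"));        // true
        System.out.println(isYes("yes "));       // false: the trailing space is a content difference
        System.out.println(isYes(null));         // false
        System.out.println("yes".equals("YES")); // false: equals( ) is case sensitive
    }
}
```

Note that ignoring case does not ignore spaces or other characters, which is why "yes " with a trailing space is still rejected.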
The == Operator
We have used the == (equality) operator before. When used with primitives like int, double or boolean, it returns true if both values are equal. For example:
public static void main(String[] args){
    int studentGrade = 86;
    int averageGrade = 86;
    System.out.println(studentGrade == averageGrade);
}
Since both grades are identical, using the == operator to compare them should output true:
run:
true
BUILD SUCCESSFUL (total time: 0 seconds)
However, when used with objects it only returns true if both operands refer to the same location in memory, that is, if they are the same object. Since strings are objects, comparing two strings that are different objects with == returns false, even when their content matches. But how can two string variables refer to the same object? If you refer back to the first lesson in this section you will see that this happens when the string literal method is used to create a string. Here is the general syntax:
String variableName = "String Content";
If the string being created has content that already exists in the string constant pool, Java will not create another object for it; it will simply return a reference to the existing one. And if two string variables reference the same object, then using the == operator to compare them will return true.
Let’s see an example of how this works. Write a program that contains three strings, two of which have the same content, and then use the == operator to compare pairs of these variables:
public static void main(String[] args){
    String student01Name = "James";
    String student02Name = "Bob";
    String student03Name = "James";
    System.out.println(student01Name == student02Name);
    System.out.println(student01Name == student03Name);
    System.out.println(student02Name == student03Name);
}
In this example, student01Name and student03Name have the same content, “James”. When these are compared with the == operator the output should be true if both of these strings point to the same object. Here is the output from this example:
run:
false
true
false
BUILD SUCCESSFUL (total time: 0 seconds)
The first output is false because the strings “James” and “Bob” are compared. This is correct since the two variables refer to different objects.
The second output is true when the strings “James” and “James” are compared. This is correct because they are in fact the same object, even though they have different variable names. Here’s why:
- When Java was asked to create the variable student03Name with the value “James”, it looked in the string constant pool for an existing string object with that value
- It found the one that the variable student01Name was pointing to
- Rather than create a duplicate object, it simply pointed student03Name to the same object
The third output is false as it compares the strings “Bob” and “James”, which are different objects.
One thing to watch out for is strings created using the new keyword. Here is a reminder of the syntax:
String variableName = new String("String Content");
Using this method, a new object is always created for every string, so comparing such a string with == returns false even when another string has the same content.
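The effect of the new keyword can be seen in a short sketch. The class name StringIdentityDemo and the helper sameObject are hypothetical. The call to intern( ), which returns the pooled copy of a string, is an addition that the lesson itself does not cover.

```java
// Hypothetical example: StringIdentityDemo and sameObject are illustrative names only.
class StringIdentityDemo {

    // True only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "James";             // taken from the string constant pool
        String created = new String("James"); // always a brand new object

        System.out.println(sameObject(literal, created));          // false: different objects
        System.out.println(literal.equals(created));               // true: same content
        System.out.println(sameObject(literal, created.intern())); // true: intern( ) returns the pooled object
    }
}
```

This is why equals( ) is the safer choice whenever the content of two strings is what matters.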
The compareTo() Method
The compareTo( ) method compares two strings lexicographically, that is, character by character in dictionary-like order based on each character's Unicode value, and returns an integer indicating their relative position. Two strings are lexicographically equal when they have the same length and contain the same characters in the same positions; in that case compareTo( ) returns 0. It returns a positive value when the first string is lexicographically greater than the second string, and a negative value when the first string is lexicographically lower than the second string.
Relation | string1.compareTo(string2) returns
---|---
string1 less than string2 | Negative integer
string1 equal to string2 | Zero
string1 greater than string2 | Positive integer
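The size of the returned value is the difference between the character values at the first position where the strings differ or, if one string is the beginning of the other, the difference between their lengths. The general syntax is: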
string1.compareTo(string2);
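In most programs only the sign of the result matters. The sketch below, with a hypothetical class PrefixCompareDemo and helper order, turns that sign into a word. It also shows a case where one string is the beginning of the other, for which compareTo( ) returns the difference between the two lengths.

```java
// Hypothetical example: PrefixCompareDemo and order are illustrative names only.
class PrefixCompareDemo {

    // Translates the sign of compareTo( ) into "before", "same" or "after".
    static String order(String a, String b) {
        int result = a.compareTo(b);
        if (result < 0) {
            return "before";
        }
        if (result > 0) {
            return "after";
        }
        return "same";
    }

    public static void main(String[] args) {
        System.out.println("Jam".compareTo("James"));   // -2: lengths 3 and 5, no differing character
        System.out.println("James".compareTo("Jam"));   // 2
        System.out.println("James".compareTo("James")); // 0
        System.out.println(order("Jam", "James"));      // before
    }
}
```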
Let’s see how this works in practice. We’ll create four string variables with the values “James”, “Bob”, “Jane” and “bob” (case is important here). We will then use the compareTo( ) method to compare their lexicographic positions:
public static void main(String[] args){
    String str1 = "James";
    String str2 = "Bob";
    String str3 = "Jane";
    String str4 = "bob";
    System.out.println(str1.compareTo(str2));
    System.out.println(str1.compareTo(str3));
    System.out.println(str2.compareTo(str4));
}
First we compared “James” and “Bob”. There are a number of differences between these two strings (length and characters), but the only one that the compareTo( ) method reports is the difference between the first letters: J and B.
- J is 8 positions after B in the alphabet (counting back from J: I-H-G-F-E-D-C-B), and this number is returned to us in the output below.
- Note that this is a positive number. Why is that?
- The compareTo( ) method returns a positive value when the first string is lexicographically greater than the second string.
- The first string is “James” and the second string is “Bob”. J is indeed greater than B, so a positive number is returned.
- In this case a return value of positive 8 tells us that “James” sorts after “Bob” and that their first differing characters are 8 positions apart.
What if the first letters are identical but there are differences elsewhere? In that case Java moves on to the second letters of each string and compares them. This process continues until a difference is found and the appropriate integer is returned. We can see this in the second pair of strings compared, “James” and “Jane”:
- Both of these strings have identical first and second letters: J and a.
- The third letters are different: m and n. This is where the compareTo( ) method stops and reports.
- We know that the compareTo( ) method will return a negative value when the first string is lexicographically lower than the second string.
- Since m is lower than n by one position, the output is -1.
What about case? The compareTo( ) method is case sensitive, so there is a difference between B and b. This is demonstrated in the last pair of strings compared: “Bob” and “bob”:
- In Java, strings are compared by character value, and all uppercase letters come before all lowercase letters
- So B is lower than b and the comparison returns a negative value
- The ASCII values are 65 to 90 for A – Z and 97 to 122 for a – z, so the difference between B (66) and b (98) is 32
- Therefore the output for "Bob".compareTo("bob") will be -32
Here is the full output:
run:
8
-1
-32
BUILD SUCCESSFUL (total time: 0 seconds)
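The arithmetic behind these three results can be reproduced by hand. The sketch below uses a hypothetical helper, firstDifference, that mirrors the documented behaviour of compareTo( ). It is an illustration of the rule, not the JDK's actual source.

```java
// Hypothetical example: FirstDifferenceDemo and firstDifference are illustrative names only.
class FirstDifferenceDemo {

    // Walks both strings until the characters differ and returns the
    // difference between those two character values. If no position
    // differs, returns the difference between the lengths.
    static int firstDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            char ca = a.charAt(i);
            char cb = b.charAt(i);
            if (ca != cb) {
                return ca - cb;
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        System.out.println(firstDifference("James", "Bob"));  // 8   ('J' is 74, 'B' is 66)
        System.out.println(firstDifference("James", "Jane")); // -1  ('m' is 109, 'n' is 110)
        System.out.println(firstDifference("Bob", "bob"));    // -32 ('B' is 66, 'b' is 98)
    }
}
```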
The compareToIgnoreCase() Method
The compareToIgnoreCase( ) method behaves in a similar way to the compareTo( ) method in that it reports the lexicographic difference between two strings, but it ignores any difference between upper and lower case. The general syntax is:
string1.compareToIgnoreCase(string2);
In this example we have three string variables, each with the value James but with varying cases. We then apply a compareToIgnoreCase( ) method to compare all strings with each other:
public static void main(String[] args){
    String str1 = "JAMES";
    String str2 = "james";
    String str3 = "James";
    System.out.println(str1.compareToIgnoreCase(str2));
    System.out.println(str2.compareToIgnoreCase(str3));
    System.out.println(str1.compareToIgnoreCase(str3));
}
The output of each call is 0 because the compareToIgnoreCase( ) method is only concerned with the characters and has completely ignored any case differences:
run:
0
0
0
BUILD SUCCESSFUL (total time: 0 seconds)
The compareTo( ) and compareToIgnoreCase( ) methods are useful in many different applications, and they are particularly useful when sorting string arrays and lists.
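As a rough sketch of that sorting use case, the example below sorts the same array of names twice with Arrays.sort: once in natural order, which relies on compareTo( ), and once with compareToIgnoreCase( ) passed as the comparator. The class name SortNamesDemo and its two helper methods are hypothetical.

```java
import java.util.Arrays;

// Hypothetical example: SortNamesDemo and its helpers are illustrative names only.
class SortNamesDemo {

    // Sorts a copy of the array in natural order, which uses compareTo( ).
    static String[] sortedCaseSensitive(String[] names) {
        String[] copy = names.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy of the array using compareToIgnoreCase( ) as the comparator.
    static String[] sortedIgnoringCase(String[] names) {
        String[] copy = names.clone();
        Arrays.sort(copy, String::compareToIgnoreCase);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"bob", "James", "Jane", "Bob"};

        // Uppercase letters sort before lowercase ones, so "bob" ends up last.
        System.out.println(Arrays.toString(sortedCaseSensitive(names))); // [Bob, James, Jane, bob]

        // "bob" and "Bob" compare as equal here and keep their original relative order.
        System.out.println(Arrays.toString(sortedIgnoringCase(names)));  // [bob, Bob, James, Jane]
    }
}
```

The case-sensitive sort is rarely what a reader expects for a list of names, which is one reason the ignore-case variant exists.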