String handling in Java

The basic aim of the String Handling concept is storing the string data in the main memory (RAM), manipulating the data of the String, retrieving the part of the String, etc. String Handling provides a lot of concepts that can be performed on a string such as concatenation of string, comparison of string, finding the substring, etc.

Implementing strings as built-in objects allows Java to provide a full complement of features that make string handling convenient. For example, Java has methods to compare two strings, search for a substring, concatenate two strings, and change the case of letters within a string. Also, String objects can be constructed in a number of ways, making it easy to obtain a string when needed.

Somewhat unexpectedly, when you create a String object, you are creating a string that cannot be changed. That is, once a String object has been created, you cannot change the characters that comprise that string. At first, this may seem to be a serious restriction. However, such is not the case. You can still perform all types of string operations. The difference is that each time you need an altered version of an existing string, a new String object is created that contains the modifications.

The original string is left unchanged. This approach is used because fixed, immutable strings can be implemented more efficiently than changeable ones. For those cases in which a modifiable string is desired, Java provides two options: StringBuffer and StringBuilder. Both hold strings that can be modified after they are created. The String, StringBuffer, and StringBuilder classes are defined in java.lang. Thus, they are available to all programs automatically. All are declared final, which means that none of these classes may be subclassed. This allows certain optimizations that increase performance to take place on common string operations. All three implement the CharSequence interface.

The String Constructors

The String class supports several constructors. To create an empty String, you call the default constructor. For example:

String s = new String ();

will create an instance of String with no characters in it

To create a String initialized by an array of characters, use the constructor shown here:

String(char chars[ ])

Here is an example:

char chars[] = { ‘a’, ‘b’, ‘c’ }; String s = new String(chars);

You can specify a subrange of a character array as an initializer using the following constructor:

String(char chars[ ], int startIndex, int numChars)

String Constructors Added by J2SE 5

J2SE 5 added two constructors to the String. The first supports the extended Unicode character set and is shown here:

String(int codePoints[ ], int startIndex, int numChars)

The second new constructor supports the new StringBuilder class. It is shown here:

 String(StringBuilder strBuildObj)

This constructs a String from the StringBuilder passed in strBuildObj.

String Length

The length of a string is the number of characters that it contains. To obtain this value, call the length( ) method, shown here:

int length( )

The following fragment prints “3”, since there are three characters in the string s:

Special String Operations

Because strings are a common and important part of programming, Java has added special support for several string operations within the syntax of the language. These operations include the automatic creation of new String instances from string literals, concatenation of multiple String objects by use of the + operator, and the conversion of other data types to a string representation. There are explicit methods available to perform all of these functions, but Java does them automatically as a convenience for the programmer and to add clarity

String Literals

The earlier examples showed how to explicitly create a String instance from an array of characters by using the new operator. However, there is an easier way to do this by using a string literal. For each string literal in your program, Java automatically constructs a String object. Thus, you can use a string literal to initialize a String object.

For example, the following code fragment creates two equivalent strings:

String Concatenation

In general, Java does not allow operators to be applied to String objects. The one exception to this rule is the + operator, which concatenates two strings, producing a String object as the result. This allows you to chain together a series of + operations. For example, the following fragment concatenates three strings:

String Concatenation with Other Data Types

You can concatenate strings with other types of data.

For example, consider this slightly different version of the earlier example:

String Conversion and toString( )

When Java converts data into its string representation during concatenation, it does so by calling one of the overloaded versions of the string conversion method valueOf( ) defined by String. valueOf( ) is overloaded for all the simple types and for type Object. For the simple types, valueOf( ) returns a string that contains the human-readable equivalent of the value with which it is called. For objects, valueOf( ) calls the toString( ) method on the object. We will look more closely at valueOf( ) later in this chapter. Here, let’s examine the toString( ) method because it is the means by which you can determine the string representation for objects of classes that you create.

Character Extraction

The String class provides a number of ways in which characters can be extracted from a String object. Each is examined here. Although the characters that comprise a string within a String object cannot be indexed as if they were a character array, many of the String methods employ an index (or offset) into the string for their operation. Like arrays, the string indexes begin at zero.

charAt( )

To extract a single character from a String, you can refer directly to an individual character via the charAt( ) method. It has this general form:

getChars( )

If you need to extract more than one character at a time, you can use the getChars( ) method. It has this general form:

getBytes( )

There is an alternative to getChars( ) that stores the characters in an array of bytes. This method is called getBytes( ), and it uses the default character-to-byte conversions provided by the platform. Here is its simplest form:

byte[ ] getBytes( )

Other forms of getBytes( ) are also available. getBytes( ) is most useful when you are exporting a String value into an environment that does not support 16-bit Unicode characters. For example, most Internet protocols and text file formats use 8-bit ASCII for all text interchange

toCharArray( )

If you want to convert all the characters in a String object into a character array, the easiest way is to call toCharArray( ). It returns an array of characters for the entire string. It has this general form:

char[ ] toCharArray( )

This function is provided as a convenience since it is possible to use getChars( ) to achieve the same result.

String Comparison

The String class includes several methods that compare strings or substrings within strings. Each is examined here.

equals( ) and equalsIgnoreCase( )

To compare two strings for equality, use equals( ). It has this general form:

boolean equals(Object str)

Here, str is the String object being compared with the invoking String object. It returns true if the strings contain the same characters in the same order, and false otherwise. The comparison is case-sensitive. To perform a comparison that ignores case differences, call equalsIgnoreCase( ). When it compares two strings, it considers A-Z to be the same as a-z. It has this general form:

boolean equalsIgnoreCase(String str)

// Demonstrate equals() and equalsIgnoreCase().

The output from the program is shown here:

startsWith( ) and endsWith( )

String defines two routines that are, more or less, specialized forms of regionMatches( ). The startsWith( ) method determines whether a given String begins with a specified string. Conversely, endsWith( ) determines whether the String in question ends with a specified string. They have the following general forms:

boolean startsWith(String str)

boolean endsWith(String str)

Here, str is the String being tested. If the string matches, true is returned. Otherwise, false is returned.

equals( ) Versus ==

It is important to understand that the equals( ) method and the == operator perform two different operations. As just explained, the equals( ) method compares the characters inside a String object. The == operator compares two object references to see whether they refer to the same instance. The following program shows how two different String objects can contain the same characters, but references to these objects will not compare as equal:

The variable s1 refers to the String instance created by “Hello”. The object referred to by s2 is created with s1 as an initializer. Thus, the contents of the two String objects are identical, but they are distinct objects. This means that s1 and s2 do not refer to the same objects and are, therefore, not ==, as is shown here by the output of the preceding example

Hello equals Hello -> true

Hello == Hello -> false

compareTo( )

Often, it is not enough to simply know whether two strings are identical. For sorting applications, you need to know which is less than, equal to, or greater than the next. A string is less than another if it comes before the other in dictionary order. A string is greater than another if it comes after the other in dictionary order. The String method compareTo( ) serves this purpose. It has this general form:

int compareTo(String str)

Searching Strings

The String class provides two methods that allow you to search a string for a specified character or substring:

  • indexOf( ) Searches for the first occurrence of a character or substring.
  • lastIndexOf( ) Searches for the last occurrence of a character or substring

To search for the first occurrence of a character, use

int indexOf(int ch)

To search for the last occurrence of a character, use

   int lastIndexOf(int ch)

Modifying a String

You can extract a substring using substring( ). It has two forms. The first is

String substring(int startIndex)

Here, startIndex specifies the index at which the substring will begin. This form returns a copy of the substring that begins at startIndex and runs to the end of the invoking string

The second form of substring( ) allows you to specify both the beginning and ending index of the substring:

   String substring(int startIndex, int endIndex)

Here, startIndex specifies the beginning index, and endIndex specifies the stopping point. The string returned contains all the characters from the beginning index, up to, but not including, the ending index.

concat( )

String concat(String str)

This method creates a new object that contains the invoking string with the contents of str appended to the end. concat( ) performs the same function as +. For example,

replace( )

The replace( ) method has two forms. The first replaces all occurrences of one character in the invoking string with another character. It has the following general form:

String replace(char original, char replacement)

Here, the original specifies the character to be replaced by the character specified by replacement. The resulting string is returned. For example,

String s = “Hello”.replace(‘l’, ‘w’);

trim( )

The trim( ) method returns a copy of the invoking string from which any leading and trailing whitespace has been removed. It has this general form:

Changing the Case of Characters Within a String

The method toLowerCase( ) converts all the characters in a string from uppercase to lowercase. The toUpperCase( ) method converts all the characters in a string from lowercase to uppercase. Nonalphabetical characters, such as digits, are unaffected. Here are the general forms of these methods: String toLowerCase( )

The output produced by the program is shown here:

Original: This is a test.

Uppercase: THIS IS A TEST.

Lowercase: this is a test.

StringBuffer

StringBuffer is a peer class of String that provides much of the functionality of strings. As you know, String represents fixed-length, immutable character sequences. In contrast, StringBuffer represents growable and writeable character sequences. StringBuffer may have characters and substrings inserted in the middle or appended to the end. StringBuffer will automatically grow to make room for such additions and often has more characters preallocated than are actually needed, to allow room for growth. Java uses both classes heavily, but many programmers deal only with String and let Java manipulate StringBuffer behind the scenes by using the overloaded + operator.

StringBuffer Constructors

StringBuffer defines these four constructors:

The default constructor (the one with no parameters) reserves room for 16 characters without reallocation. The second version accepts an integer argument that explicitly sets the size of the buffer. The third version accepts a String argument that sets the initial contents of the StringBuffer object and reserves room for 16 more characters without reallocation. StringBuffer allocates room for 16 additional characters when no specific buffer length is requested because reallocation is a costly process in terms of time. Also, frequent reallocations can fragment memory. By allocating room for a few extra characters, StringBuffer reduces the number of reallocations that take place. The fourth constructor creates an object that contains the character sequence contained in chars.

length( ) and capacity( )

The current length of a StringBuffer can be found via the length( ) method, while the total allocated capacity can be found through the capacity( ) method.

They have the following general forms:

Here is the output of this program, which shows how StringBuffer reserves extra space for additional manipulations:

charAt( ) and setCharAt( )

The value of a single character can be obtained from a StringBuffer via the charAt( ) method. You can set the value of a character within a StringBuffer using setCharAt( ). Their general forms are shown here:

getChars( )

To copy a substring of a StringBuffer into an array, use the getChars( ) method. It has this general form:

 

Here, sourceStart specifies the index of the beginning of the substring, and sourceEnd specifies an index that is one past the end of the desired substring. This means that the substring contains the characters from sourceStart through sourceEnd–1. The array that will receive the characters is specified by the target. The index within the target at which the substring will be copied is passed in targetStart. Care must be taken to assure that the target array is large enough to hold the number of characters in the specified substring

append( )

The append( ) method concatenates the string representation of any other type of data to the end of the invoking StringBuffer object. It has several overloaded versions. Here are a few of its forms:

 

String.valueOf( ) is called for each parameter to obtain its string representation. The result is appended to the current StringBuffer object. The buffer itself is returned by each version of append( ). This allows subsequent calls to be chained together, as shown in the following example:

The output of this example is shown here:

 a = 42!

The append( ) method is most often called when the + operator is used on String objects. Java automatically changes modifications to a String instance into similar operations on a StringBuffer instance. Thus, a concatenation invokes append( ) on a StringBuffer object. After the concatenation has been performed, the compiler inserts a call to toString( ) to turn the modifiable StringBuffer back into a constant String. All of this may seem unreasonably complicated. Why not just have one string class and have it behave more or less like StringBuffer? The answer is performance. There are many optimizations that the Java run time can make knowing that String objects are immutable. Thankfully, Java hides most of the complexity of conversion between Strings and StringBuffer. Actually, many programmers will never feel the need to use StringBuffer directly and will be able to express most operations in terms of the + operator on String variables.

substring( )

You can obtain a portion of a StringBuffer by calling substring( ). It has the following two forms:

StringBuilder

J2SE 5 adds a new string class to Java’s already powerful string handling capabilities. This new class is called StringBuilder. It is identical to StringBuffer except for one important difference: it is not synchronized, which means that it is not thread-safe. The advantage of StringBuilder is faster performance. However, in cases in which you are using multithreading, you must use StringBuffer rather than StringBuilder.

Leave a Reply