Among other changes, JDK 11 introduces six new methods for the java.lang.String class:
repeat(int)- Repeats the String as many times as provided by the int parameter
lines()- Uses a Spliterator to lazily provide lines from the source string
isBlank()- Indicates if the String is empty or contains only white space characters
stripLeading()- Removes the white space from the beginning
stripTrailing()- Removes the white space from the end
strip()- Removes the white space from both the beginning and the end of the string
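For reference, the six methods listed above can be exercised with a short sketch like the following. The class name NewStringMethodsDemo and the helper splitLines are made up for illustration; only the String methods themselves come from JDK 11.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; requires JDK 11 or later.
class NewStringMethodsDemo {

    // lines() returns a lazy Stream<String>; collect it so the result can be inspected.
    static List<String> splitLines(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String padded = "  hello  ";

        System.out.println("ab".repeat(3));                      // ababab
        System.out.println(splitLines("one\ntwo\r\nthree"));     // [one, two, three]
        System.out.println("   ".isBlank());                     // true
        System.out.println("[" + padded.stripLeading() + "]");   // [hello  ]
        System.out.println("[" + padded.stripTrailing() + "]");  // [  hello]
        System.out.println("[" + padded.strip() + "]");          // [hello]
    }
}
```

With plain ASCII spaces, as here, strip() and trim() give the same result, which is why the two look interchangeable at first glance.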
strip() looks very similar to trim(). According to this article, the strip*() methods work as follows:
The String.strip(), String.stripLeading(), and String.stripTrailing() methods trim white space [as determined by Character.isWhiteSpace()] off either the front, back, or both front and back of the targeted String.
String.trim() JavaDoc states:
/**
 * Returns a string whose value is this string, with any leading and trailing
 * whitespace removed.
 * ...
 */
This is almost identical to the quote above.
What exactly is the difference between String.trim() and String.strip() since Java 11?
strip() is the "Unicode-aware" evolution of trim().
String::trim has existed from early days of Java when Unicode had not fully evolved to the standard we widely use today.
The definition of space used by String::trim is any code point less than or equal to the space code point (\u0020), commonly referred to as ASCII or ISO control characters.
Unicode-aware trimming routines should use Character::isWhitespace(int).
Additionally, developers have not been able to specifically remove indentation white space or to specifically remove trailing white space.
Introduce trimming methods that are Unicode white space aware and provide additional control of leading only or trailing only.
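A minimal sketch of where the two methods diverge, assuming JDK 11 or later. The class name TrimVersusStripDemo and the two sample characters (U+2003 EM SPACE and U+0001 START OF HEADING) are chosen for illustration only; any Unicode white space above U+0020, or any non-whitespace control character below it, should behave the same way.

```java
// Hypothetical demo class; requires JDK 11 or later.
class TrimVersusStripDemo {

    // U+2003 EM SPACE: Unicode white space, but its code point is above U+0020.
    static final String EM_SPACE = "\u2003";

    // U+0001 START OF HEADING: a control character below U+0020
    // for which Character.isWhitespace returns false.
    static final String SOH = "\u0001";

    public static void main(String[] args) {
        String unicodePadded = EM_SPACE + "text" + EM_SPACE;
        String controlPadded = SOH + "text" + SOH;

        // trim() only looks at code points <= U+0020, so the em spaces survive.
        System.out.println(unicodePadded.trim().length());   // 6
        // strip() asks Character.isWhitespace, so the em spaces are removed.
        System.out.println(unicodePadded.strip().length());  // 4

        // The reverse holds for the control character.
        System.out.println(controlPadded.trim().length());   // 4
        System.out.println(controlPadded.strip().length());  // 6
    }
}
```

So strip() is not a strict superset of trim(): each removes some characters the other leaves in place.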
A common characteristic of these new methods is that they use a different (newer) definition of "whitespace" than did old methods such as String.trim(). Bug JDK-8200373 states:
The current JavaDoc for String::trim does not make it clear which definition of "space" is being used in the code. With additional trimming methods coming in the near future that use a different definition of space, clarification is imperative. String::trim uses the definition of space as any codepoint that is less than or equal to the space character codepoint (\u0020). Newer trimming methods will use the definition of (white) space as any codepoint that returns true when passed to the Character::isWhitespace predicate.
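The two definitions of "space" quoted above can be written as a pair of predicates and compared code point by code point. This is only a sketch: isTrimSpace and isStripSpace are hypothetical helper names, not JDK methods, and the sample code points are an arbitrary selection.

```java
// Hypothetical demo class comparing the two definitions of "space".
class SpaceDefinitionsDemo {

    // Definition used by trim(): any code point up to and including U+0020.
    static boolean isTrimSpace(int codePoint) {
        return codePoint <= ' ';
    }

    // Definition used by the strip methods: the Character::isWhitespace predicate.
    static boolean isStripSpace(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    public static void main(String[] args) {
        // NUL, TAB, SPACE, NO-BREAK SPACE, EM SPACE, IDEOGRAPHIC SPACE
        int[] samples = {0x0000, 0x0009, 0x0020, 0x00A0, 0x2003, 0x3000};
        for (int cp : samples) {
            System.out.printf("U+%04X trim=%b strip=%b%n",
                    cp, isTrimSpace(cp), isStripSpace(cp));
        }
    }
}
```

Note that U+00A0 NO-BREAK SPACE satisfies neither predicate, because Character.isWhitespace deliberately excludes the non-breaking space characters.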
isWhitespace(char) was added to Character with JDK 1.1, but the method isWhitespace(int) was not introduced to the Character class until JDK 1.5. The latter method (the one accepting a parameter of type int) was added to support supplementary characters. The Javadoc comments for the Character class distinguish supplementary characters (typically modeled with an int-based "code point") from BMP characters (typically modeled with a single char):
The set of characters from U+0000 to U+FFFF is sometimes referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater than U+FFFF are called supplementary characters. The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values ... A char value, therefore, represents Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. ... The methods that only accept a char value cannot support supplementary characters. ... The methods that accept an int value support all Unicode characters, including supplementary characters.
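The char versus int distinction described in that Javadoc can be seen with a supplementary code point. The sketch below uses U+1D400 (MATHEMATICAL BOLD CAPITAL A) as an arbitrary example and Character.isLetter as a stand-in for any char/int overload pair, since it gives a visible difference; the class name SupplementaryCharDemo is made up.

```java
// Hypothetical demo class showing why int-based overloads are needed.
class SupplementaryCharDemo {

    // U+1D400 MATHEMATICAL BOLD CAPITAL A lies outside the BMP.
    static final int BOLD_A = 0x1D400;

    public static void main(String[] args) {
        // In UTF-16 the code point is stored as a surrogate pair of two chars.
        String s = new String(Character.toChars(BOLD_A));

        System.out.println(s.length());                                  // 2
        System.out.println(s.codePointCount(0, s.length()));             // 1
        System.out.println(Character.isSupplementaryCodePoint(BOLD_A));  // true

        // The char overload only sees the high surrogate, which is not a letter.
        System.out.println(Character.isLetter(s.charAt(0)));             // false
        // The int overload sees the whole code point.
        System.out.println(Character.isLetter(s.codePointAt(0)));        // true
    }
}
```

The same reasoning is why the Unicode-aware strip methods are specified in terms of the int-based Character.isWhitespace rather than the char-based overload.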