String

Overview

The String object is one of JavaScript's three native wrapper objects. It is used to create string objects.

var s1 = "abc";
var s2 = new String("abc");

typeof s1; // "string"
typeof s2; // "object"

s2.valueOf(); // "abc"

In the above code, s1 is a primitive string and s2 is an object. Because s2 is a string object, s2.valueOf() returns the primitive string that it wraps.

A string object is an array-like object (much like an array, but not an array).

new String("abc");
// String {0: "a", 1: "b", 2: "c", length: 3}

(new String("abc"))[1]; // "b"

In the above code, the string object corresponding to the string abc has numeric keys (0, 1, 2) and the length property, so the value can be taken like an array.
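
As a brief sketch of what "array-like" means in practice: a string object can be indexed and has a length, but it does not inherit array methods such as push. The variable names below are illustrative only.

```javascript
var strObj = new String("abc");

// Indexed access and length work as they do on an array.
var first = strObj[0]; // "a"
var len = strObj.length; // 3

// Array methods are not available directly on a string object...
var hasPush = typeof strObj.push; // "undefined"

// ...but they can be borrowed, because the object has numeric keys and a length.
var chars = Array.prototype.slice.call(strObj); // ["a", "b", "c"]
```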

In addition to being used as a constructor, String can also be called as an ordinary function to convert a value of any type into a string.

String(true); // "true"
String(5); // "5"

The above code converts the boolean value true and the value 5 into strings respectively.
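
A few more conversions may help show how String() treats other kinds of values. This is only a small sketch; the variable names are illustrative.

```javascript
var fromNull = String(null); // "null"
var fromUndefined = String(undefined); // "undefined"
var fromArray = String([1, 2, 3]); // "1,2,3"
var fromObject = String({}); // "[object Object]"

// Called without new, String returns a primitive string, not an object.
var resultType = typeof String(5); // "string"
```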

Static method

String.fromCharCode()

The main static method of the String object (that is, a method defined on the object itself rather than on its instances) is String.fromCharCode(). It takes one or more numbers representing Unicode code points (more precisely, 16-bit UTF-16 code units) and returns the string composed of them.

String.fromCharCode(); // ""
String.fromCharCode(97); // "a"
String.fromCharCode(104, 101, 108, 108, 111);
// "hello"

In the above code, if the parameter of the String.fromCharCode method is empty, an empty string will be returned; otherwise, the Unicode string corresponding to the parameter will be returned.

Note that this method does not support characters whose Unicode code point is greater than 0xFFFF; that is, no argument should be greater than 0xFFFF (65535 in decimal).

String.fromCharCode(0x20bb7);
// "ஷ"
String.fromCharCode(0x20bb7) === String.fromCharCode(0x0bb7);
// true

In the above code, the argument 0x20BB7 is greater than 0xFFFF, so the returned result is wrong. The character corresponding to 0x20BB7 is the Chinese character 𠮷, but a different character (code point 0x0BB7) is returned. This is because String.fromCharCode keeps only the low 16 bits of an argument greater than 0xFFFF, so the leading 2 in 0x20BB7 is discarded.

The root cause is that characters with a code point greater than 0xFFFF occupy four bytes, whereas JavaScript handles strings in two-byte units. In this case, 0x20BB7 must be split into two two-byte values.

String.fromCharCode(0xd842, 0xdfb7);
// "𠮷"

In the above code, 0x20BB7 is split into the two values 0xD842 and 0xDFB7 (that is, two two-byte units combine into one four-byte character), which gives the correct result. This four-byte representation of characters with code points greater than 0xFFFF is defined by the UTF-16 encoding.
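
The split into 0xD842 and 0xDFB7 can be computed rather than looked up. The following is a minimal sketch using the standard UTF-16 surrogate-pair arithmetic; the helper name fromCodePointSketch is hypothetical, and the function assumes a valid code point no greater than 0x10FFFF.

```javascript
// Hypothetical helper: build a string from one code point, splitting it
// into a UTF-16 surrogate pair when it is greater than 0xFFFF.
// Assumes codePoint is a valid code point (0 to 0x10FFFF); no validation is done.
function fromCodePointSketch(codePoint) {
  if (codePoint <= 0xffff) {
    return String.fromCharCode(codePoint);
  }
  var offset = codePoint - 0x10000;
  var high = 0xd800 + (offset >> 10); // upper 10 bits of the offset
  var low = 0xdc00 + (offset & 0x3ff); // lower 10 bits of the offset
  return String.fromCharCode(high, low);
}

var ji = fromCodePointSketch(0x20bb7); // "𠮷"
var samePair = ji === String.fromCharCode(0xd842, 0xdfb7); // true
```

ES2015 and later also provide String.fromCodePoint, which handles code points greater than 0xFFFF directly.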

Instance properties

String.prototype.length

The length property of a string instance returns the length of the string.

"abc".length; // 3

Instance methods

String.prototype.charAt()

The charAt method returns the character at the specified position. Its parameter is the position, counted from 0.

var s = new String("abc");

s.charAt(1); // "b"
s.charAt(s.length - 1); // "c"

For positions inside the string, bracket notation with an index gives the same result.

"abc".charAt(1); // "b"
"abc"[1]; // "b"

If the argument is negative or greater than or equal to the length of the string, charAt returns an empty string (bracket notation returns undefined in that case).

"abc".charAt(-1); // ""
"abc".charAt(3); // ""

String.prototype.charCodeAt()

The charCodeAt() method returns the Unicode code point (as a decimal number) of the character at the specified position. It is the inverse operation of String.fromCharCode().

"abc".charCodeAt(1); // 98

In the above code, the character at position 1 of abc is b, and its Unicode code point is 98.

Without any parameters, charCodeAt returns the Unicode code point of the first character.

"abc".charCodeAt(); // 97

If the argument is negative or greater than or equal to the length of the string, charCodeAt returns NaN.

"abc".charCodeAt(-1); // NaN
"abc".charCodeAt(4); // NaN

Note that the value returned by the charCodeAt method is never greater than 65535 (0xFFFF); that is, it only returns the code of a two-byte unit. For a character whose code point is greater than 65535 (a four-byte character), you must call charCodeAt twice in succession, reading both charCodeAt(i) and charCodeAt(i + 1), and combine the two values to obtain the actual code point.
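
Combining the two values can be sketched as follows. The helper name codePointAtSketch is hypothetical; it applies the standard UTF-16 surrogate-pair arithmetic and falls back to the single charCodeAt value when no valid pair starts at the given position.

```javascript
// Hypothetical helper: return the full code point that starts at index i.
function codePointAtSketch(str, i) {
  var first = str.charCodeAt(i);
  // A high surrogate (0xD800 to 0xDBFF) followed by a low surrogate
  // (0xDC00 to 0xDFFF) together encode one four-byte character.
  if (first >= 0xd800 && first <= 0xdbff && i + 1 < str.length) {
    var second = str.charCodeAt(i + 1);
    if (second >= 0xdc00 && second <= 0xdfff) {
      return (first - 0xd800) * 0x400 + (second - 0xdc00) + 0x10000;
    }
  }
  return first;
}

var fourByte = String.fromCharCode(0xd842, 0xdfb7); // "𠮷"
var unit0 = fourByte.charCodeAt(0); // 55362 (0xD842)
var unit1 = fourByte.charCodeAt(1); // 57271 (0xDFB7)
var full = codePointAtSketch(fourByte, 0); // 134071 (0x20BB7)
```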

String.prototype.concat()

The concat method is used to concatenate two strings and return a new string without changing the original string.

var s1 = "abc";
var s2 = "def";

s1.concat(s2); // "abcdef"
s1; // "abc"

The method can accept multiple parameters.

"a".concat("b", "c"); // "abc"

If the parameter is not a string, the concat method will convert it to a string before concatenating.

var one = 1;
var two = 2;
var three = "3";

"".concat(one, two, three); // "123"
one + two + three; // "33"

In the above code, the concat method converts every argument into a string and then concatenates them, so it returns a three-character string. In contrast, the plus operator does not convert types when both operands are numbers: it first adds 1 and 2 to get 3, and only then concatenates the result with the string "3", so it returns a two-character string.

String.prototype.slice()

The slice() method is used to take out the substring from the original string and return it without changing the original string. Its first parameter is the starting position of the substring, and the second parameter is the ending position of the substring (excluding this position).

"JavaScript".slice(0, 4); // "Java"

If the second parameter is omitted, it means that the substring continues to the end of the original string.

"JavaScript".slice(4); // "Script"

If a parameter is negative, it is counted backward from the end of the string; that is, the effective position is the negative value plus the length of the string.

"JavaScript".slice(-6); // "Script"
"JavaScript".slice(0, -6); // "Java"
"JavaScript".slice(-2, -1); // "p"

If the first parameter is greater than the second parameter (with both being positive), the slice() method returns an empty string.

"JavaScript".slice(2, 1); // ""

String.prototype.substring()

The substring method takes a substring from the original string and returns it without changing the original string, much like the slice method. Its first parameter is the start position of the substring, and the second parameter is the end position (the returned result does not include this position).

"JavaScript".substring(0, 4); // "Java"

If the second parameter is omitted, it means that the substring continues to the end of the original string.

"JavaScript".substring(4); // "Script"

If the first parameter is greater than the second parameter, the substring method automatically swaps the two parameters.

"JavaScript".substring(10, 4); // "Script"
// Equivalent to
"JavaScript".substring(4, 10); // "Script"

In the above code, swapping the two parameters of the substring method will get the same result.

If the argument is a negative number, the substring method will automatically convert the negative number to 0.

"JavaScript".substring(-3); // "JavaScript"
"JavaScript".substring(4, -3); // "Java"

In the above code, the parameter -3 in the second example automatically becomes 0, which is equivalent to "JavaScript".substring(4, 0). Since the second parameter is then smaller than the first, the two are swapped automatically, so "Java" is returned.

Since these rules are counterintuitive, the substring method is not recommended, and slice should be used in preference.

String.prototype.substr()

The substr method takes a substring from the original string and returns it without changing the original string. It serves the same purpose as the slice and substring methods.

The first parameter of the substr method is the starting position of the substring (counting from 0), and the second parameter is the length of the substring.

"JavaScript".substr(4, 6); // "Script"

If the second parameter is omitted, it means that the substring continues to the end of the original string.

"JavaScript".substr(4); // "Script"

If the first parameter is a negative number, it indicates a position counted backward from the end of the string. If the second parameter is a negative number, it is automatically converted to 0, so an empty string is returned.

"JavaScript".substr(-6); // "Script"
"JavaScript".substr(4, -1); // ""

In the above code, the parameter -1 in the second example is automatically converted to 0, indicating that the length of the substring is 0, so an empty string is returned.
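
Because the three methods treat their arguments differently, it can help to see them side by side. The sketch below passes similar negative arguments to each; the variable names are illustrative.

```javascript
var word = "JavaScript";

// slice: negative positions count from the end (10 - 6 = 4, 10 - 3 = 7).
var bySlice = word.slice(-6, -3); // "Scr"

// substring: negative positions become 0, so this is substring(0, 0).
var bySubstring = word.substring(-6, -3); // ""

// substr: the second argument is a length, not an end position.
var bySubstr = word.substr(-6, 3); // "Scr"
```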

String.prototype.indexOf(), String.prototype.lastIndexOf()

The indexOf method is used to determine the position of the first occurrence of a string in another string, and the return result is the position where the match begins. If it returns -1, it means there is no match.

"hello world".indexOf("o"); // 4
"JavaScript".indexOf("script"); // -1

The indexOf method can also accept a second parameter, which is the position to start searching from, moving toward the end of the string.

"hello world".indexOf("o", 6); // 7

The lastIndexOf method is used in the same way as indexOf. The main difference is that lastIndexOf searches from the end of the string, whereas indexOf searches from the beginning.

"hello world".lastIndexOf("o"); // 7

In addition, the second parameter of lastIndexOf is the position to start searching from, moving toward the beginning of the string.

"hello world".lastIndexOf("o", 6); // 4

String.prototype.trim()

The trim method removes whitespace from both ends of the string and returns a new string without changing the original string.

"  hello world  ".trim();
// "hello world"

This method removes not only spaces, but also tab characters (\t, \v), line feeds (\n), and carriage returns (\r).

"\r\nabc \t".trim(); // "abc"

String.prototype.toLowerCase(), String.prototype.toUpperCase()

The toLowerCase method converts a string to all lowercase, and the toUpperCase method converts it to all uppercase. Both return a new string without changing the original string.

"Hello World".toLowerCase();
// "hello world"

"Hello World".toUpperCase();
// "HELLO WORLD"

String.prototype.match()

The match method determines whether the original string contains a given substring. It returns an array whose only member is the first matching string. If no match is found, null is returned.

"cat, bat, sat, fat".match("at"); // ["at"]
"cat, bat, sat, fat".match("xt"); // null

The returned array also has the index and input properties, which hold the starting position of the match and the original string respectively.

var matches = "cat, bat, sat, fat".match("at");
matches.index; // 1
matches.input; // "cat, bat, sat, fat"
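
Since match returns null when nothing is found, reading index directly from the result can throw an error. A minimal sketch of a guard follows; the helper name firstMatchIndex is hypothetical. Note that a string argument is interpreted as a regular expression pattern, so this sketch assumes the pattern contains no special regular expression characters.

```javascript
// Hypothetical helper: return the position of the first match,
// or -1 when match() returns null.
function firstMatchIndex(text, pattern) {
  var result = text.match(pattern);
  return result === null ? -1 : result.index;
}

var found = firstMatchIndex("cat, bat, sat, fat", "at"); // 1
var missing = firstMatchIndex("cat, bat, sat, fat", "xt"); // -1
```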

The match method can also use regular expressions as parameters, see the chapter "Regular Expressions" for details.

String.prototype.search(), String.prototype.replace()

The usage of the search method is basically the same as match, but the return value is the first position of the match. If no match is found, -1 is returned.

"cat, bat, sat, fat".search("at"); // 1

The search method can also use regular expressions as parameters, see the section "Regular Expressions" for details.

The replace method is used to replace the matched substring. Generally, only the first match is replaced (unless a regular expression with the g modifier is used).

"aaa".replace("a", "b"); // "baa"

The replace method can also use regular expressions as parameters, see the section "Regular Expressions" for details.
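
When every occurrence of a plain substring should be replaced and a regular expression is not wanted, one possible workaround is to combine split with the array join method. This is a sketch, not the only approach; the helper name replaceAllSketch is hypothetical, and it assumes the target is a non-empty string.

```javascript
// Hypothetical helper: replace every occurrence of a non-empty target
// by splitting on it and joining the pieces with the replacement.
function replaceAllSketch(text, target, replacement) {
  return text.split(target).join(replacement);
}

var firstOnly = "aaa".replace("a", "b"); // "baa"
var everyOne = replaceAllSketch("aaa", "a", "b"); // "bbb"
```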

String.prototype.split()

The split method splits the string at each occurrence of the given separator and returns an array of the resulting substrings.

"a|b|c".split("|"); // ["a", "b", "c"]

If the separator is an empty string, the members of the returned array are the individual characters of the original string.

"a|b|c".split(""); // ["a", "|", "b", "|", "c"]

If the parameter is omitted, the only member of the returned array is the original string.

"a|b|c".split(); // ["a|b|c"]

If two separators are next to each other (that is, there are no other characters between them), the returned array contains an empty string at that position.

"a||c".split("|"); // ["a", "", "c"]

If a separator is at the beginning or end of the string (that is, there are no other characters before or after it), the first or last member of the returned array is an empty string.

"|b|c".split("|"); // ["", "b", "c"]
"a|b|".split("|"); // ["a", "b", ""]

The split method can also accept a second parameter to limit the maximum number of members of the returned array.

"a|b|c".split("|", 0); // []
"a|b|c".split("|", 1); // ["a"]
"a|b|c".split("|", 2); // ["a", "b"]
"a|b|c".split("|", 3); // ["a", "b", "c"]
"a|b|c".split("|", 4); // ["a", "b", "c"]

In the above code, the second parameter of the split method sets the maximum number of members in the returned array; any remaining pieces are discarded.

The split method can also use regular expressions as parameters, see the section "Regular Expressions" for details.
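
As an illustration of split in use, the sketch below parses a string of key=value pairs separated by &. The helper name parsePairs and the input format are assumptions made for this example. Because the limit parameter discards extra pieces, a value that itself contains = would be truncated by this sketch.

```javascript
// Hypothetical helper: parse "k1=v1&k2=v2" into a plain object.
function parsePairs(text) {
  var result = {};
  if (text === "") {
    return result; // "".split("&") would yield [""], so stop early
  }
  var parts = text.split("&");
  for (var i = 0; i < parts.length; i++) {
    // Limit to two members: the key and the value.
    var pair = parts[i].split("=", 2);
    result[pair[0]] = pair.length > 1 ? pair[1] : "";
  }
  return result;
}

var parsed = parsePairs("a=1&b=2"); // { a: "1", b: "2" }
```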

String.prototype.localeCompare()

The localeCompare method is used to compare two strings. It returns an integer. If it is less than 0, it means that the first string is less than the second string; if it is equal to 0, it means that the two are equal; if it is greater than 0, it means that the first string is greater than the second string.

"apple".localeCompare("banana"); // -1
"apple".localeCompare("apple"); // 0

The most notable feature of this method is that it takes natural-language ordering into account. For example, in an ordinary comparison, uppercase English letters are smaller than lowercase letters.

"B" > "a"; // false

In the above code, the letter B is smaller than the letter a. Because JavaScript uses Unicode code point comparison, the code point of B is 66, and the code point of a is 97.

However, the localeCompare method follows natural-language ordering and ranks B after a.

"B".localeCompare("a"); // 1

In the above code, the localeCompare method returns the integer 1, which means that B is larger.

localeCompare can also take a second parameter that specifies the language to use (if omitted, the default locale of the runtime environment is used); the comparison then follows the rules of that language.

"ä".localeCompare("z", "de"); // -1
"ä".localeCompare("z", "sv"); // 1

In the above code, de means German, and sv means Swedish. In German, ä is less than z, so -1 is returned; in Swedish, ä is greater than z, so 1 is returned.
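
A common use of localeCompare is as the comparison function for the array sort method. The sketch below contrasts the default sort, which compares code points, with a locale-aware sort. The locale "en" and the variable names are choices made for this example, and the exact locale-aware order depends on the collation data available in the runtime.

```javascript
var letters = ["b", "B", "a"];

// Default sort compares code points, so uppercase "B" (66) comes first.
var byCodePoint = letters.slice().sort(); // ["B", "a", "b"]

// Sorting with localeCompare follows the collation rules of the given locale,
// which place "a" before both "b" and "B".
var byLocale = letters.slice().sort(function (x, y) {
  return x.localeCompare(y, "en");
});
```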