String manipulation

This chapter introduces the syntax of Bash string operations.

String length

The syntax for obtaining the length of a string is as follows.

${#varname}

Below is an example.

$ myPath=/home/cam/book/long.file.name
$ echo ${#myPath}
29

The braces {} are required. Without them, Bash interprets $# as the number of positional parameters passed to the script and treats the variable name as literal text.

$ echo $#myvar
0myvar

In the above example, Bash interpreted $# and myvar separately: $# expanded to 0, and myvar was printed as ordinary text.
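As a small illustration of where ${#varname} is useful, the sketch below checks a minimum length. The function name is_long_enough and the sample value are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the string has at least min_len characters.
is_long_enough() {
  local value=$1 min_len=$2
  (( ${#value} >= min_len ))
}

password="s3cret"   # sample value, 6 characters long
if is_long_enough "$password" 8; then
  echo "ok"
else
  echo "too short: ${#password} characters"
fi
```

Here the else branch runs, because the sample value has only 6 characters.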

Substring

The syntax for extracting a substring is as follows.

${varname:offset:length}

This syntax returns the substring of the variable $varname that starts at position offset (counting from 0) and is length characters long.

$ count=frogfootman
$ echo ${count:4:4}
foot

The above example returns the substring foot of length 4 starting from position 4 of the string frogfootman.

This syntax works only on variables, not on string literals, and it does not modify the original variable. Inside the braces, the variable name is written without a leading dollar sign.

# Error
$ echo ${"hello":2:3}

In the above example, "hello" is not a variable name, causing Bash to report an error.

If length is omitted, the substring runs from position offset to the end of the string.

$ count=frogfootman
$ echo ${count:4}
footman

The above example returns the substring of the variable count from position 4 to the end.

If offset is negative, counting starts from the end of the string. A space must precede the negative number so that the expression is not confused with the default-value syntax ${variable:-word}. A length can still be given, and it may be positive or negative. A negative length (supported since Bash 4.2) is treated as a position counted back from the end of the string rather than as a character count, and that position must not fall before offset.

$ foo="This string is long."
$ echo ${foo: -5}
long.
$ echo ${foo: -5:2}
lo
$ echo ${foo: -5:-2}
lon

In the above example, offset is -5, so extraction starts at the 5th character from the end and long. is returned. With a length of 2, lo is returned. With a length of -2, the last 2 characters of the string are excluded, so lon is returned.
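The offset and length forms can be combined to pick apart fixed-width data. The following is a minimal sketch that assumes a date stamp in YYYYMMDD form; the variable names and the sample value are invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical compact date stamp in YYYYMMDD form.
stamp=20240131

year=${stamp:0:4}     # 4 characters starting at offset 0
month=${stamp:4:2}    # 2 characters starting at offset 4
day=${stamp: -2}      # last 2 characters; note the space before -2

echo "$year-$month-$day"
```

With this sample value the script prints 2024-01-31.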

Search and replace

Bash provides multiple methods for string search and replacement.

(1) Pattern matching at the head of the string.

The following two forms test whether the beginning of a string matches a given pattern. If it does, the matched part is removed and the rest is returned. The original variable is not modified.

# If pattern matches the beginning of variable,
# remove the shortest match (non-greedy) and return the rest
${variable#pattern}

# If pattern matches the beginning of variable,
# remove the longest match (greedy) and return the rest
${variable##pattern}

Both forms remove the matched part from the beginning of the value and return the rest. They differ only in whether the shortest match (non-greedy) or the longest match (greedy) is removed.

The pattern may use wildcards such as *, ?, and [].

$ myPath=/home/cam/book/long.file.name

$ echo ${myPath#/*/}
cam/book/long.file.name

$ echo ${myPath##/*/}
long.file.name

In the above example, the matching pattern is /*/, where * can match any number of characters, so the shortest match is /home/, and the longest match is /home/cam/book/.

The following expression removes the directory part of a file path, leaving only the file name.

$ path=/home/cam/book/long.file.name

$ echo ${path##*/}
long.file.name

In the above example, the pattern */ matches the directory part, so only the file name is returned.

Let's look at another example.

$ phone="555-456-1414"
$ echo ${phone#*-}
456-1414
$ echo ${phone##*-}
1414

If the match is unsuccessful, the original string is returned.

$ phone="555-456-1414"
$ echo ${phone#444}
555-456-1414

In the above example, the string does not begin with 444, so it is returned unchanged.

To replace the matched part at the beginning of the string with other content instead of deleting it, use the following syntax.

# The pattern must appear at the beginning of the string
${variable/#pattern/string}

# Example
$ foo=JPG.JPG
$ echo ${foo/#JPG/jpg}
jpg.JPG

In the above example, only a JPG at the beginning of the string is replaced, so jpg.JPG is returned.
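The prefix forms above can be compared side by side on one value. This is a sketch only: the Git-style reference name and the variable names are chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical Git-style reference name.
ref=refs/heads/feature/login

branch=${ref#refs/heads/}     # literal prefix removed
leaf=${ref##*/}               # longest match of */ removed
unchanged=${ref#refs/tags/}   # prefix does not match, value returned as is
upper=${ref/#refs/REFS}       # only a match at the beginning is replaced

echo "$branch"      # feature/login
echo "$leaf"        # login
echo "$unchanged"   # refs/heads/feature/login
echo "$upper"       # REFS/heads/feature/login
```

Note that ${ref##*/} returns an empty string when the value ends with a slash, which differs from what the basename command prints.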

(2) Pattern matching at the end of the string.

The following two forms test whether the end of a string matches a given pattern. If it does, the matched part is removed and the rest is returned. The original variable is not modified.

# If pattern matches the end of variable,
# remove the shortest match (non-greedy) and return the rest
${variable%pattern}

# If pattern matches the end of variable,
# remove the longest match (greedy) and return the rest
${variable%%pattern}

Both forms remove the matched part from the end of the value and return the rest. They differ only in whether the shortest match (non-greedy) or the longest match (greedy) is removed.

$ path=/home/cam/book/long.file.name

$ echo ${path%.*}
/home/cam/book/long.file

$ echo ${path%%.*}
/home/cam/book/long

In the above example, the matching pattern is .*, where * can match any number of characters, so the shortest match is .name and the longest match is .file.name.

The following expression removes the file name part of a path, leaving only the directory part.

$ path=/home/cam/book/long.file.name

$ echo ${path%/*}
/home/cam/book

In the above example, the pattern /* matches the file name part, so only the directory part is returned.

The following expression replaces the file extension.

$ file=foo.png
$ echo ${file%.png}.jpg
foo.jpg

The above example changes the file extension from .png to .jpg.

Let's look at another example.

$ phone="555-456-1414"
$ echo ${phone%-*}
555-456
$ echo ${phone%%-*}
555

If the match is unsuccessful, the original string is returned.

To replace the matched part at the end of the string with other content instead of deleting it, use the following syntax.

# The pattern must appear at the end of the string
${variable/%pattern/string}

# Example
$ foo=JPG.JPG
$ echo ${foo/%JPG/jpg}
JPG.jpg

In the above example, only a JPG at the end of the string is replaced, so JPG.jpg is returned.
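Prefix and suffix removal are often combined to split a path into its parts. The sketch below assumes a sample path with a double extension; all variable names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical path with a two-part extension.
file=/home/cam/book/archive.tar.gz

dir=${file%/*}             # shortest match of /* removed from the end
name=${file##*/}           # longest match of */ removed from the beginning
stem=${name%%.*}           # longest match of .* removed from the end
short=${name%.*}           # shortest match of .* removed from the end
renamed=${file/%.gz/.xz}   # .gz replaced only because it is at the end

echo "$dir"       # /home/cam/book
echo "$name"      # archive.tar.gz
echo "$stem"      # archive
echo "$short"     # archive.tar
echo "$renamed"   # /home/cam/book/archive.tar.xz
```

If the value contains no slash at all, ${file%/*} returns it unchanged, whereas the dirname command would print a dot.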

(3) Pattern matching at any position.

The following two forms test whether any part of a string matches a given pattern. If it does, the matched part is replaced with another string and the result is returned. The original variable is not modified.

# If pattern matches a part of variable,
# replace the longest match (greedy) with string; only the first match is replaced
${variable/pattern/string}

# If pattern matches a part of variable,
# replace the longest match (greedy) with string; all matches are replaced
${variable//pattern/string}

Both forms replace the longest match (greedy). The difference is that the first form replaces only the first match, while the second replaces every match.

$ path=/home/cam/foo/foo.name

$ echo ${path/foo/bar}
/home/cam/bar/foo.name

$ echo ${path//foo/bar}
/home/cam/bar/bar.name

In the above example, the former command only replaces the first foo, and the latter command replaces both foos.

The following example changes the delimiter from : to a newline character.

$ echo -e ${PATH//:/'\n'}
/usr/local/bin
/usr/bin
/bin
...

In the above example, the -e option of the echo command causes each \n in the resulting string to be interpreted as a newline character.

Wildcards can be used in the pattern part.

$ phone="555-456-1414"
$ echo ${phone/5?4/-}
55-56-1414

In the above example, ? matches any single character, so the first match of 5?4 is 5-4, which is replaced with -.

If the string part is omitted, it is equivalent to replacing the matched part with an empty string, that is, deleting the matched part.

$ path=/home/cam/foo/foo.name

$ echo ${path/.*/}
/home/cam/foo/foo

In the above example, the string part after the second slash is omitted, so the part matched by the pattern .*, namely .name, is deleted.

As mentioned earlier, this syntax has two extended forms.

# The pattern must appear at the beginning of the string
${variable/#pattern/string}

# The pattern must appear at the end of the string
${variable/%pattern/string}
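To see the single-slash and double-slash forms together with a wildcard pattern, consider the sketch below. The sample title and the variable names are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical title to be turned into a file-name-friendly string.
title="My Holiday Photos 2024"

first_only=${title/ /_}      # only the first space is replaced
all=${title// /_}            # every space is replaced
no_digits=${title//[0-9]/}   # string part omitted, so every digit is deleted

echo "$first_only"     # My_Holiday Photos 2024
echo "$all"            # My_Holiday_Photos_2024
echo "[$no_digits]"    # [My Holiday Photos ]
```

The brackets in the last echo only make the remaining trailing space visible.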

Change case

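The following syntax converts the value of a variable to uppercase or lowercase. It requires Bash 4.0 or later, and the original variable is not modified.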

# Convert to uppercase
${varname^^}

# Convert to lowercase
${varname,,}

Below is an example.

$ foo=heLLo
$ echo ${foo^^}
HELLO
$ echo ${foo,,}
hello
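A common use of case conversion is comparing input without regard to capitalisation. The following is a minimal sketch with invented variable names. It also uses ${name^}, a related form not covered above that uppercases only the first character.

```shell
#!/usr/bin/env bash
# Hypothetical user reply with inconsistent capitalisation.
answer="YeS"

# Lowercase the reply before comparing it.
if [[ ${answer,,} == yes ]]; then
  result=confirmed
else
  result=declined
fi
echo "$result"    # confirmed

name=alice
echo "${name^}"   # Alice
```

On Bash versions older than 4.0, such as the Bash 3.2 shipped with macOS, these expansions fail with a bad substitution error.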