Bash pattern expansion

Introduction

After the shell receives a command entered by the user, it splits the input into tokens at whitespace. It then expands the special characters in each token, and only after the expansion is complete does it call the corresponding command.

This expansion of special characters is called pattern expansion (globbing). Some forms use wildcards and are also known as wildcard expansion. Bash provides eight kinds of expansion in total.

  • Tilde expansion
  • ? character expansion
  • * character expansion
  • Square bracket expansion
  • Brace expansion
  • Variable expansion
  • Subcommand expansion
  • Arithmetic expansion

This chapter introduces these eight expansions.

Bash expands first and then executes the command. The result of an expansion is therefore Bash's responsibility and has nothing to do with the command being executed. The command itself performs no expansion; it simply receives whatever arguments Bash hands it and runs with them as they are. This is worth remembering.
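The point can be checked with a function that only counts its arguments. This is a minimal sketch; count_args is a hypothetical helper defined here for illustration, not part of Bash.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

# Bash expands {a,b,c} into three words before count_args runs.
expanded=$(count_args {a,b,c})

# Quoting suppresses the expansion, so one literal argument is passed.
quoted=$(count_args '{a,b,c}')

echo "expanded: $expanded, quoted: $quoted"   # expanded: 3, quoted: 1
```

The function never sees the braces in the first call; it only sees the three words Bash produced.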

The English term for pattern expansion is globbing. The word comes from early Unix systems, which had a /etc/glob program that performed the expansion. Later shells, including Bash, built the feature in, but the name was retained.

Pattern expansion can be regarded as a primitive relative of regular expressions. It is less powerful and flexible than regular expressions, but it is simple and convenient.

Bash allows users to turn off file name expansion.

$ set -o noglob
# Or
$ set -f

The following command turns file name expansion back on.

$ set +o noglob
# Or
$ set +f
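The effect of the switch can be observed in a scratch directory. This is a sketch under the assumption that mktemp is available; the file names are made up for the example.

```shell
# Scratch directory, so the result does not depend on existing files.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt"
cd "$tmp" || exit 1

set -f               # turn file name expansion off
off=$(echo *.txt)    # the pattern is passed to echo unchanged
set +f               # turn it back on
on=$(echo *.txt)     # the pattern expands to the matching files

echo "with noglob:    $off"   # *.txt
echo "without noglob: $on"    # a.txt b.txt

cd - >/dev/null && rm -rf "$tmp"
```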

Tilde expansion

The tilde ~ automatically expands to the current user's home directory.

$ echo ~
/home/me

~/dir expands to a subdirectory of the home directory, where dir is the name of that subdirectory.

# Enter /home/me/foo directory
$ cd ~/foo

~user expands to the home directory of the user named user.

$ echo ~foo
/home/foo

$ echo ~root
/root

In the above example, Bash will return the user's home directory based on the user name after the tilde.

If the name in ~user does not belong to an existing user, the tilde expansion does not take place.

$ echo ~nonExistedUser
~nonExistedUser

~+ will expand to the current directory, which is equivalent to the pwd command.

$ cd ~/foo
$ echo ~+
/home/me/foo
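The forms above can be compared side by side. This is a small sketch that assumes the HOME variable is set, which is the usual case; the subdirectory name foo is only an example and does not need to exist.

```shell
home_dir=~          # an unquoted tilde expands to the home directory
sub_dir=~/foo       # ~/foo expands to a path below the home directory
current=~+          # ~+ expands to the current directory
literal='~'         # a quoted tilde is not expanded

echo "$home_dir"
echo "$sub_dir"
echo "$current"
echo "$literal"     # ~
```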

? Character expansion

The ? character represents any single character in the file path; it does not match an empty string. For example, Data??? matches all file names that consist of Data followed by exactly three characters.

# There are files a.txt and b.txt
$ ls ?.txt
a.txt b.txt

In the above command, ? means a single character, so it will match both a.txt and b.txt.

To match several characters, use several ? characters together.

# There are files a.txt, b.txt and ab.txt
$ ls ??.txt
ab.txt

In the above command, ?? matches two characters.

The ? character expansion is a file name expansion, and it occurs only if a matching file exists. If no file matches, the pattern is not expanded.

# The current directory has a.txt file
$ echo ?.txt
a.txt

# The current directory is empty
$ echo ?.txt
?.txt

In the above example, if ?.txt can be expanded into a file name, the echo command will output the expanded result; if it cannot be expanded into a file name, echo will output ?.txt as it is.
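The three cases (one character, two characters, no match) can be reproduced in a scratch directory. This is a sketch assuming default shell options, that is, neither nullglob nor failglob is turned on.

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/ab.txt"
cd "$tmp" || exit 1

one=$(echo ?.txt)      # exactly one character before .txt
two=$(echo ??.txt)     # exactly two characters before .txt
none=$(echo ???.txt)   # nothing matches, so the pattern is left unchanged

echo "$one"    # a.txt b.txt
echo "$two"    # ab.txt
echo "$none"   # ???.txt

cd - >/dev/null && rm -rf "$tmp"
```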

* Character expansion

The * character represents any number of any characters in the file path, including zero characters.

# There are files a.txt, b.txt and ab.txt
$ ls *.txt
a.txt ab.txt b.txt

In the above example, *.txt represents all files with the suffix of .txt.

If you want to output all files in the current directory, just use * directly.

$ ls *

* can also match an empty string, as the following example shows.

# There are files a.txt, b.txt and ab.txt
$ ls a*.txt
a.txt ab.txt

$ ls *b*
ab.txt b.txt

Note that * will not match hidden files (files beginning with .), that is, ls * will not output hidden files.

If you want to match hidden files, you need to write .*.

# Show all hidden files
$ echo .*

If you want to match hidden files, and at the same time exclude the two special hidden files . and .., you can use it in combination with square bracket expansion, written as .[!.]*.

$ echo .[!.]*
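The difference between * and .[!.]* can be seen with one visible and one hidden file. This is a sketch in a scratch directory; the file names are made up.

```shell
tmp=$(mktemp -d)
touch "$tmp/visible.txt" "$tmp/.hidden"
cd "$tmp" || exit 1

plain=$(echo *)          # hidden files are skipped
dotted=$(echo .[!.]*)    # hidden files only, without . and ..

echo "$plain"    # visible.txt
echo "$dotted"   # .hidden

cd - >/dev/null && rm -rf "$tmp"
```

One limitation of .[!.]* is that it also skips names that begin with two dots, such as ..foo, because the second character must not be a dot.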

Note that the * character expansion is a file name expansion, and it occurs only if a matching file exists. If no file matches, the pattern is output as it is.

# There is no file starting with c in the current directory
$ echo c*.txt
c*.txt

In the above example, there is no file starting with c in the current directory, causing c*.txt to be output as it is.

* only matches the current directory, not subdirectories.

# a.txt exists only in a subdirectory
# No match
$ ls *.txt

# Match
$ ls */*.txt

In the above example, the text file is in a subdirectory, so *.txt produces no match and the pattern must be written as */*.txt. With several levels of subdirectories, write one asterisk per level.

Bash 4.0 introduced a parameter globstar, when this parameter is turned on, it allows ** to match zero or more subdirectories. Therefore, **/*.txt can match top-level text files and text files in subdirectories of any depth. For details, please see the introduction of the shopt command later.

Square bracket expansion

Square bracket expansion has the form [...] and matches any one of the characters inside the brackets. For example, [aeiou] matches any of the five vowels. It is expanded only if a matching file exists; otherwise it is output as it is.

# There are files a.txt and b.txt
$ ls [ab].txt
a.txt b.txt

# Only the file a.txt exists
$ ls [ab].txt
a.txt

In the above example, [ab] can match a or b, provided that the corresponding file exists.

Square bracket expansion is a file name expansion; that is, the expanded result must correspond to an existing file path. If nothing matches, the pattern stays as it is and no expansion is performed.

# The files a.txt and b.txt do not exist
$ ls [ab].txt
ls: cannot access '[ab].txt': No such file or directory

In the above example, because no matching file exists, [ab].txt is passed on as it is, and the ls command reports an error.

There are two variants of square bracket expansion: [^...] and [!...]. They mean to match characters that are not in square brackets, and these two ways of writing are equivalent. For example, [^abc] or [!abc] means to match characters other than a, b, and c.

# There are three files aaa, bbb, aba
$ ls ?[!a]?
aba bbb

In the above command, [!a] requires that the second character of the file name is not a, so the two files aba and bbb are returned.

Note that if you need to match [ characters, you can put them in square brackets, such as [[aeiou]. If you need to match the hyphen -, it can only be placed at the beginning or end of the square brackets, such as [-aeiou] or [aeiou-].
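Negation and a literal hyphen can be tried together. This is a sketch in a scratch directory with made-up file names, assuming default shell options.

```shell
tmp=$(mktemp -d)
touch "$tmp/aaa" "$tmp/aba" "$tmp/bbb" "$tmp/x-y.log"
cd "$tmp" || exit 1

not_a=$(echo ?[!a]?)        # three characters, the second is not a
hyphen=$(echo ?[-_]?.log)   # the hyphen comes first, so it is literal

echo "$not_a"    # aba bbb
echo "$hyphen"   # x-y.log

cd - >/dev/null && rm -rf "$tmp"
```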

[start-end] expansion

Square bracket expansion has a shorthand form, [start-end], which matches a continuous range. For example, [a-c] is equivalent to [abc], and [0-9] is equivalent to [0123456789].

# There are files a.txt, b.txt and c.txt
$ ls [a-c].txt
a.txt
b.txt
c.txt

# There are files report1.txt, report2.txt and report3.txt
$ ls report[0-9].txt
report1.txt
report2.txt
report3.txt
...

Here are some examples of common shorthands.

  • [a-z]: all lowercase letters.
  • [a-zA-Z]: all lowercase and uppercase letters.
  • [a-zA-Z0-9]: all lowercase letters, uppercase letters and digits.
  • [abc]*: all file names beginning with one of the characters a, b and c.
  • program.[co]: the files program.c and program.o.
  • BACKUP.[0-9][0-9][0-9]: all file names consisting of BACKUP. followed by three digits.

This shorthand has a negated form, [!start-end], which matches characters that do not belong to the range. For example, [!a-zA-Z] matches any character that is not an English letter.

$ echo report[!1-3].txt
report4.txt report5.txt

In the above code, [!1-3] means to exclude 1, 2, and 3.
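A range and its negation split a set of files into two groups. This is a sketch in a scratch directory; the report files are created only for the example.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch report1.txt report2.txt report3.txt report4.txt report5.txt

in_range=$(echo report[1-3].txt)     # digits 1, 2 and 3
out_range=$(echo report[!1-3].txt)   # any other single character

echo "$in_range"    # report1.txt report2.txt report3.txt
echo "$out_range"   # report4.txt report5.txt

cd - >/dev/null && rm -rf "$tmp"
```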

Brace expansion

Brace expansion {...} expands to every value inside the braces, with the values separated by commas. For example, {1,2,3} expands to 1 2 3.

$ echo {1,2,3}
1 2 3

$ echo d{a,e,i,u,o}g
dag deg dig dug dog

$ echo Front-{A,B,C}-Back
Front-A-Back Front-B-Back Front-C-Back

Note that brace expansion is not a file name expansion. It will expand to all the given values, regardless of whether there is a corresponding file.

$ ls {a,b,c}.txt
ls: cannot access 'a.txt': No such file or directory
ls: cannot access 'b.txt': No such file or directory
ls: cannot access 'c.txt': No such file or directory

In the above example, even if there is no corresponding file, {a,b,c} still expands to three file names, causing the ls command to report three errors.

Also note that there must be no spaces before or after the commas inside the braces. Otherwise, brace expansion fails.

$ echo {1, 2}
{1, 2}

In the above example, there is a space after the comma, so Bash does not treat this as a brace expansion but as two independent arguments, {1, and 2}.

The position before a comma may be left empty, which means that the first item of the expansion is empty.

$ cp a.log{,.bak}

# Equivalent to
# cp a.log a.log.bak
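Putting echo in front of a command is a simple way to preview what a brace expansion produces before anything is copied or renamed. The file names below are examples only and do not need to exist.

```shell
# echo prints the command line after expansion instead of running cp or mv.
backup=$(echo cp a.log{,.bak})
rename=$(echo mv photo.{jpeg,jpg})

echo "$backup"   # cp a.log a.log.bak
echo "$rename"   # mv photo.jpeg photo.jpg
```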

Braces can be nested.

$ echo {j{p,pe}g,png}
jpg jpeg png

$ echo a{A{1,2},B{3,4}}b
aA1b aA2b aB3b aB4b

Braces can also be combined with other patterns, and they are always expanded before them.

$ echo /bin/{cat,b*}
/bin/cat /bin/b2sum /bin/base32 /bin/base64 ... ...

# Basically equivalent to
$ echo /bin/cat;echo /bin/b*

In the above example, brace expansion will be performed first, followed by * expansion, which is equivalent to executing two echo commands.

Braces can contain multi-character patterns, whereas square brackets cannot (they match only a single character).

$ echo {cat,dog}
cat dog

Since the brace expansion {...} is not a file name expansion, it is always expanded. This is completely different from the square bracket expansion [...], which is not expanded if no matching file exists. Take care to distinguish the two.

# a.txt and b.txt do not exist
$ echo [ab].txt
[ab].txt

$ echo {a,b}.txt
a.txt b.txt

In the above example, if a.txt and b.txt do not exist, [ab].txt is treated as an ordinary file name, while {a,b}.txt is expanded as usual.

{start..end} expansion

Brace expansion has a shorthand form, {start..end}, which expands into a continuous sequence. For example, {a..z} expands into the 26 lowercase English letters.

$ echo {a..c}
a b c

$ echo d{a..d}g
dag dbg dcg ddg

$ echo {1..4}
1 2 3 4

$ echo Number_{1..5}
Number_1 Number_2 Number_3 Number_4 Number_5

This abbreviation supports reverse order.

$ echo {c..a}
c b a

$ echo {5..1}
5 4 3 2 1

Note that if Bash cannot interpret the shorthand, the brace pattern is output as it is and is not expanded.

$ echo {a1..3c}
{a1..3c}

This shorthand can be nested to form complex expansions.

$ echo .{mp{3..4},m4{a,b,p,v}}
.mp3 .mp4 .m4a .m4b .m4p .m4v

A common use for brace expansion is to create a series of directories.

$ mkdir {2007..2009}-{01..12}

The above command will create 36 new subdirectories, each of which has the name "year-month".
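The count can be checked without creating anything by collecting the expansion in an array. This sketch assumes Bash 4 or later, where a leading zero in the range pads every item.

```shell
# Collect the expanded names instead of passing them to mkdir.
names=( {2007..2009}-{01..12} )

echo "${#names[@]}"    # 36
echo "${names[0]}"     # 2007-01
echo "${names[35]}"    # 2009-12
```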

Another common use of this wording is to use it directly in a for loop.

for i in {1..4}
do
  echo $i
done

The above example will loop 4 times.
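Because brace expansion happens before variable expansion, a range whose end is a variable is not expanded. The sketch below shows the effect and one common alternative, an arithmetic for loop; the variable names are made up for the example.

```shell
n=3

# Brace expansion runs first and cannot interpret {1..$n}, so it is left
# alone; only afterwards is $n replaced by its value.
literal=$(echo {1..$n})
echo "$literal"          # {1..3}

# An arithmetic for loop works when the bound is held in a variable.
total=0
for ((i = 1; i <= n; i++)); do
  total=$((total + i))
done
echo "$total"            # 6
```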

If there is a leading 0 in front of the integer, each item of the expanded output has a leading 0.

$ echo {01..5}
01 02 03 04 05

$ echo {001..5}
001 002 003 004 005

This shorthand can also take a second double dot, {start..end..step}, to specify the step of the expansion.

$ echo {0..8..2}
0 2 4 6 8

The above code expands 0 to 8, and the length of each increment is 2, so a total of 5 numbers are output.

Using several shorthands together produces every combination, as a nested loop would.

$ echo {a..c}{1..3}
a1 a2 a3 b1 b2 b3 c1 c2 c3

Variable expansion

Bash treats tokens beginning with a dollar sign $ as variables and expands them into the variable's value. For details, see the chapter "Bash Variables".

$ echo $SHELL
/bin/bash

In addition to putting the variable name after the dollar sign, it can also be put inside ${}.

$ echo ${SHELL}
/bin/bash

${!string*} or ${!string@} returns the names of all variables whose names begin with the given string.

$ echo ${!S*}
SECONDS SHELL SHELLOPTS SHLVL SSH_AGENT_PID SSH_AUTH_SOCK

In the above example, ${!S*} expands to all variable names beginning with S.
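The prefix form can be tried with variables of your own. This is a sketch; DEMO_ALPHA and DEMO_BETA are hypothetical names chosen for the example, and it assumes no other variable in the shell starts with DEMO_.

```shell
# Two example variables sharing the prefix DEMO_.
DEMO_ALPHA=1
DEMO_BETA=2

# ${!DEMO_*} expands to the names of the variables, not to their values.
names=$(echo ${!DEMO_*})
echo "$names"    # lists DEMO_ALPHA and DEMO_BETA

# ${SHELL} and $SHELL are two spellings of the same expansion.
same=no
[ "${SHELL}" = "$SHELL" ] && same=yes
echo "$same"     # yes
```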

Subcommand expansion

$(...) expands to the output of another command; everything the command writes to standard output becomes the value.

$ echo $(date)
Tue Jan 28 00:01:13 CST 2020

In the above example, $(date) returns the result of the date command.

There is another older syntax in which subcommands are placed in backticks, which can also be expanded into the result of the command.

$ echo `date`
Tue Jan 28 00:01:13 CST 2020

$(...) can be nested, such as $(ls $(pwd)).
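Nesting is useful when one command needs the result of another. The sketch below uses the standard dirname and basename utilities on a made-up path to extract the name of the parent directory.

```shell
# The path is only an example; the file does not need to exist.
path=/home/me/foo/report.txt

# The inner substitution yields /home/me/foo, the outer one yields foo.
parent=$(basename "$(dirname "$path")")

echo "$parent"   # foo
```

The quotes around the inner $(...) keep the result as a single argument even if the path contains spaces.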

Arithmetic expansion

$((...)) can be expanded to the result of integer arithmetic, see the chapter "Arithmetic Operations in Bash" for details.

$ echo $((2 + 2))
4
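Variables can be used inside $((...)) without a dollar sign, and the arithmetic is integer-only. A short sketch with made-up variable names:

```shell
width=7
height=3

area=$((width * height))        # multiplication
quotient=$((width / height))    # integer division drops the fraction
remainder=$((width % height))   # remainder of the division

echo "$area $quotient $remainder"   # 21 2 1
```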

Character class

[[:class:]] represents a character class and matches any one character belonging to that class. The commonly used character classes are as follows.

  • [[:alnum:]]: any letter or digit.
  • [[:alpha:]]: any letter.
  • [[:blank:]]: space and tab.
  • [[:cntrl:]]: non-printable control characters (ASCII codes 0-31 and 127).
  • [[:digit:]]: any digit 0-9.
  • [[:graph:]]: A-Z, a-z, 0-9 and punctuation marks.
  • [[:lower:]]: any lowercase letter a-z.
  • [[:print:]]: printable characters (ASCII codes 32-126).
  • [[:punct:]]: punctuation marks (printable characters other than the space, A-Z, a-z and 0-9).
  • [[:space:]]: space, tab, LF (10), VT (11), FF (12), CR (13).
  • [[:upper:]]: any uppercase letter A-Z.
  • [[:xdigit:]]: hexadecimal digits (A-F, a-f, 0-9).

Please see the example below.

$ echo [[:upper:]]*

The above command outputs all file names beginning with capital letters.

After the first square bracket of a character class, an exclamation mark ! can be added to indicate negation. For example, [![:digit:]] matches all non-digits.

$ echo [![:digit:]]*

The above command outputs all file names that do not start with a number.

Character classes are also a file name expansion. If there is no matching file name, the pattern is output as it is.

# There is no file starting with a capital letter
$ echo [[:upper:]]*
[[:upper:]]*

In the above example, since there is no matching file, the character class is output as it is.
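The classes can be tried in a scratch directory. This is a sketch with two made-up file names, one starting with a capital letter and one starting with a digit.

```shell
tmp=$(mktemp -d)
touch "$tmp/Readme" "$tmp/2024.log"
cd "$tmp" || exit 1

upper=$(echo [[:upper:]]*)        # names starting with an uppercase letter
digits=$(echo [[:digit:]]*)       # names starting with a digit
non_digit=$(echo [![:digit:]]*)   # names not starting with a digit

echo "$upper"       # Readme
echo "$digits"      # 2024.log
echo "$non_digit"   # Readme

cd - >/dev/null && rm -rf "$tmp"
```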

Usage notes

There are a few points to keep in mind when using wildcards.

(1) Wildcards are expanded first, and then the command is executed.

When Bash receives a command and finds a wildcard in it, it expands the wildcard and then executes the command.

$ ls a*.txt
ab.txt

The execution process of the above command is that Bash first expands a*.txt to ab.txt, and then executes ls ab.txt.

(2) When a file name expansion has no match, it is output as it is.

When there is no matching file, the pattern is left unchanged.

# There is no file name beginning with r
$ echo r*
r*

In the above code, since there is no file name beginning with r, r* will be output as it is.

Here is another example.

$ ls *.csv
ls: *.csv: No such file or directory

In addition, as mentioned earlier, the brace expansion {...} is not a file name expansion.

(3) Wildcards apply to a single path level only.

All file name expansions match a single path level and cannot match across directories; that is, they cannot match files in subdirectories. In other words, wildcards such as ? or * cannot match the path separator (/).

If you want to match files in a subdirectory, you can write it as follows.

$ ls */*.txt

Bash 4.0 adds a new globstar parameter, which allows ** to match zero or more subdirectories. For details, see the introduction of the shopt command later.

(4) Wildcard characters can be used in file names.

Bash allows wildcard characters in file names; that is, a file name may contain special characters. To refer to such a file, put its name in single or double quotation marks.

$ touch 'fo*'
$ ls
fo*

The above code creates a fo* file, and then * is part of the file name.
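Quoting decides whether fo* is a pattern or a literal name. The sketch below creates both a file literally named fo* and a file named foo in a scratch directory, then counts what each form refers to.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch 'fo*' foo

matched=( fo* )     # unquoted: a pattern that matches both files
exact=( 'fo*' )     # quoted: only the literal name fo*

echo "${#matched[@]}"   # 2
echo "${#exact[@]}"     # 1
echo "${exact[0]}"      # fo*

cd - >/dev/null && rm -rf "$tmp"
```

The same care applies when deleting: rm 'fo*' removes the one file, while rm fo* would remove both.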

Quantifiers

Quantifiers control how many times a pattern may match. They are available only when Bash's extglob parameter is turned on. Bash itself leaves extglob off by default, although many interactive setups turn it on (for example through bash-completion), so it is worth checking with the following command.

$ shopt extglob
extglob on

If the extglob parameter is off, you can turn it on with the following command.

$ shopt -s extglob

The quantifiers are as follows.

  • ?(pattern-list): match zero or one occurrence of the patterns.
  • *(pattern-list): match zero or more occurrences of the patterns.
  • +(pattern-list): match one or more occurrences of the patterns.
  • @(pattern-list): match exactly one of the patterns.
  • !(pattern-list): match anything except the given patterns.

$ ls abc?(.)txt
abctxt abc.txt

In the above example, ?(.) matches zero or one dot.

$ ls abc?(def)
abc abcdef

In the above example, ?(def) matches zero or one def.

$ ls abc+(.txt|.php)
abc.php abc.txt

In the above example, +(.txt|.php) matches files that have one or more .txt or .php suffixes.

$ ls abc+(.txt)
abc.txt abc.txt.txt

In the above example, the +(.txt) matching file has one or more .txt suffixes.

$ ls a!(b).txt
a.txt abb.txt ac.txt

In the above example, !(b) means to match any content except a single letter b, so except for ab.txt, other file names can be matched.

Quantifiers are also a file name expansion. If there is no matching file, the pattern is output as it is.

# No file name starting with abc
$ ls abc?(def)
ls: cannot access 'abc?(def)': No such file or directory

In the above example, because there is no matching file, abc?(def) is output as it is, causing the ls command to report an error.
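In a script, extglob has to be turned on explicitly, and on a line of its own before any line that uses a quantifier, because Bash needs the setting while it parses the pattern. The following is a sketch in a scratch directory with made-up file names.

```shell
shopt -s extglob    # must already be on when the patterns below are parsed

tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch abc abcdef abc.txt abc.php

optional=( abc?(def) )         # abc followed by zero or one def
suffixed=( abc+(.txt|.php) )   # abc followed by one or more suffixes
others=( !(*.txt|*.php) )      # everything except .txt and .php files

echo "${optional[*]}"   # abc abcdef
echo "${suffixed[*]}"   # abc.php abc.txt
echo "${others[*]}"     # abc abcdef

cd - >/dev/null && rm -rf "$tmp"
```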

shopt command

The shopt command can adjust the behavior of Bash. It has several parameters related to wildcard expansion.

The usage of the shopt command is as follows.

# Turn a parameter on
$ shopt -s [optionname]

# Turn a parameter off
$ shopt -u [optionname]

# Query whether a parameter is on or off
$ shopt [optionname]

(1) dotglob parameter

The dotglob parameter allows the expansion result to include hidden files (that is, files beginning with a dot).

Under normal circumstances, the expansion result does not include hidden files.

$ ls *
abc.txt

With dotglob turned on, hidden files are included.

$ shopt -s dotglob
$ ls *
abc.txt .config

(2) nullglob parameter

The nullglob parameter makes a wildcard that matches no file name expand to nothing.

By default, a wildcard that matches no file name is left unchanged.

$ rm b*
rm: cannot remove 'b*': No such file or directory

In the above example, since the current directory contains no file name beginning with b, b* is not expanded and remains unchanged, so the rm command reports that there is no file named b*.

With the nullglob parameter turned on, an unmatched wildcard expands to nothing.

$ shopt -s nullglob
$ rm b*
rm: missing operand

In the above example, because there is no file name matching b*, rm b* expands to rm, which causes the error to become "missing operand".
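A common place where nullglob matters is a for loop over a pattern. The sketch below runs the same loop in an empty scratch directory with the parameter off and then on; the counter names are made up for the example.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1   # the directory is empty, so *.log matches nothing

# Default: the unmatched pattern stays as the literal word *.log,
# so the loop body runs once.
default_count=0
for f in *.log; do
  default_count=$((default_count + 1))
done

# With nullglob the pattern expands to nothing and the loop is skipped.
shopt -s nullglob
null_count=0
for f in *.log; do
  null_count=$((null_count + 1))
done
shopt -u nullglob

echo "$default_count $null_count"   # 1 0

cd - >/dev/null && rm -rf "$tmp"
```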

(3) failglob parameter

When the failglob parameter is turned on and a wildcard does not match any file name, Bash reports an error itself instead of leaving it to the command.

$ shopt -s failglob
$ rm b*
bash: no match: b*

In the above example, with failglob turned on, b* does not match any file name, so Bash reports the error directly and the rm command is never run.

(4) extglob parameter

The extglob parameter makes Bash support some extended pattern syntax from ksh. It is often already on in interactive shells, but it is off by default in scripts, so it is worth checking.

$ shopt extglob
extglob on

Its main use is to support the quantifier syntax. If you do not want that syntax, turn it off with the following command.

$ shopt -u extglob

(5) nocaseglob parameter

The nocaseglob parameter allows wildcard expansion to be case-insensitive.

$ shopt -s nocaseglob
$ ls /windows/program*
/windows/ProgramData
/windows/Program Files
/windows/Program Files (x86)

In the above example, with nocaseglob turned on, program* is matched case-insensitively and can match ProgramData and so on.
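The same behavior can be reproduced without a Windows directory. This is a sketch in a scratch directory with one made-up file name, assuming nocaseglob is off at the start.

```shell
tmp=$(mktemp -d)
touch "$tmp/ProgramData"
cd "$tmp" || exit 1

sensitive=$(echo program*)     # no match by default, pattern left as is
shopt -s nocaseglob
insensitive=$(echo program*)   # matches regardless of case
shopt -u nocaseglob

echo "$sensitive"     # program*
echo "$insensitive"   # ProgramData

cd - >/dev/null && rm -rf "$tmp"
```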

(6) globstar parameter

The globstar parameter can make ** match zero or more subdirectories. This parameter is turned off by default.

Assume the following file structure.

a.txt
sub1/b.txt
sub1/sub2/c.txt

In the above file structure, there is a text file in each of the top-level directory, the first-level subdirectory sub1, and the second-level subdirectory sub1/sub2. How can wildcards be used to list them all?

By default, it can only be written as follows.

$ ls *.txt */*.txt */*/*.txt
a.txt sub1/b.txt sub1/sub2/c.txt

This is because * only matches the current directory. If you want to match the subdirectories, you can only write them layer by layer.

After turning on the globstar parameter, ** matches zero or more subdirectories. Therefore, **/*.txt can get the desired result.

$ shopt -s globstar
$ ls **/*.txt
a.txt sub1/b.txt sub1/sub2/c.txt
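The file structure above can be rebuilt in a scratch directory to try the pattern. This is a sketch that assumes Bash 4.0 or later, since older versions have no globstar parameter.

```shell
shopt -s globstar   # needs Bash 4.0 or later

tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p sub1/sub2
touch a.txt sub1/b.txt sub1/sub2/c.txt

# ** matches zero or more directory levels.
all=( **/*.txt )

echo "${#all[@]}"   # 3
echo "${all[*]}"    # a.txt sub1/b.txt sub1/sub2/c.txt

cd - >/dev/null && rm -rf "$tmp"
shopt -u globstar
```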