Linux bash scripts

One way of utilizing the flexibility of Linux is using command scripts. A command script is simply a file that contains a set of normal Linux commands, which the command shell performs automatically in the given order. Compared to real programming languages, like Python, Perl or C, programming with Linux shells (bash, tcsh, csh or sh) is computationally rather inefficient. However, handy Linux scripts can often be constructed in a few minutes. You do not have to know much about command scripting to be able to write simple programs that save a lot of time.

Constructing a script file

A script file is a simple text file that can be constructed with normal text editors like nano, Emacs or vi. To create a new script file, type for example:

nano my_test.script

A script file usually starts with a line which defines the command shell to be used. In this guide, we use the bash shell, which is the default command shell at CSC. The line that selects bash is:

#!/bin/bash

After that you add the Linux commands you wish to perform. In practice, just type in the file the commands that you would normally use to do the task in an interactive command shell. For example, the following script can be used to create a subdirectory mapfiles and copy all .map files there:

#!/bin/bash
mkdir mapfiles
cp *.map mapfiles/

If a line in the script starts with a # character, the line is treated as a comment and is not executed (except for the first line, which starts with #!).

#!/bin/bash
# This is a comment line that is not executed

mkdir mapfiles
cp *.map mapfiles/

After saving the script file and closing the editor, you can perform the commands in the script file by giving the command:

source my_test.script

Optionally, you could give execution permissions for your script file with the command

chmod u+x my_test.script

and then execute the script with the command:

./my_test.script
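
The steps above can be tried from start to finish in a scratch directory. The following is only a minimal sketch: the temporary directory comes from mktemp, and the file name hello.script and the variable names are made up for this illustration.

```shell
#!/bin/bash
# Work in a temporary directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a two-line script file (hypothetical name: hello.script).
cat > hello.script << 'EOF'
#!/bin/bash
echo "Hello from a script"
EOF

# Give the owner execution permission, run the script and keep its output.
chmod u+x hello.script
output=$(./hello.script)
echo "$output"
```

Note that source runs the commands in your current shell, so variables set by the script remain defined afterwards, whereas ./my_test.script starts a new bash process for the script.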

Variables and arrays

You can use variables, loops and conditional statements in the scripts. Variables can be set with syntax:

variable=value

Note that there are no spaces around the equals sign. Variables are recalled with $ sign,

$variable

or

${variable}

For example, the command

echo $variable

writes the value of variable to the output. Note that in bash scripts the variables are considered to be either strings (i.e. text) or integers. This means that decimal numbers can't be used in bash scripts for mathematical operations.

Example of using string variables:

$ name=Veikko
$ familyname=Salo
$ address="CSC Espoo"
$ echo "Person: ${name} ${familyname} works at ${address}."

Person: Veikko Salo works at CSC Espoo.
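
The braces in ${variable} matter when the variable is followed directly by other characters. A small sketch, assuming a hypothetical variable called name and that no variable called name_old has been set:

```shell
#!/bin/bash
name=sample
# Without braces bash looks for a variable called name_old, which is not set,
# so nothing is printed after the colon.
echo "Without braces: $name_old"
# With braces the variable name ends at the closing brace.
echo "With braces: ${name}_old"
new_name="${name}_old"
```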

For integer variables, you can do simple arithmetic with syntax ((expression)). Commonly used arithmetic operations are listed in the table below:

Operator	Function
`+`	addition
`-`	subtraction
`*`	multiplication
`/`	division
`%`	division remainder
`**`	exponentiation

Simple integer arithmetic examples:

$ a=5
$ b=3
$ ((c = a + b))
$ echo $a plus $b is equal to $c
5 plus 3 is equal to 8
$ ((d = a / b))
$ ((e = a % b))
$ echo "$a divided by $b results in $d with remainder $e"
5 divided by 3 results in 1 with remainder 2
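
The result of an arithmetic expression can also be substituted directly into an assignment with the form $((expression)). The sketch below uses made-up variable names and illustrates that division keeps only the integer part of the result:

```shell
#!/bin/bash
a=7
b=2
# $(( )) evaluates the expression and substitutes the result.
sum=$((a + b))
quotient=$((a / b))      # integer division, the fraction is dropped
remainder=$((a % b))
power=$((a ** b))
echo "$a + $b = $sum"
echo "$a / $b = $quotient, remainder $remainder"
echo "$a ** $b = $power"
```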

Bash can also use one-dimensional array variables, i.e. variables that contain a list of items. A specified array item can be recalled by using an index number in brackets with the array variable name (${variable[index]}). For example, we can define a simple three element array with the command:

array=(a b c)

We can now recall either the whole array or just one element from it. The whole array is printed with the command

echo ${array[*]}

This prints out

a b c

while the command

echo ${array[2]}

prints

c

Note that in the array, the indexing starts from 0 and thus the sample command above prints out the third element of the array. You can check the number of items in the array by adding the # sign to the beginning of the variable name. For example, in this case the command

echo ${#array[*]}

prints out the value

3
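
The array operations described above can be combined to go through an array item by item. This is a minimal sketch; the loop variable i and the helper variables count and last are names chosen for the illustration.

```shell
#!/bin/bash
array=(a b c)
# Number of items in the array.
count=${#array[*]}
echo "The array has $count items"
# Loop over the index numbers 0 .. count-1.
for ((i=0; i<count; i++))
do
   echo "Item $i is ${array[i]}"
done
# The last item has the index count-1 because indexing starts from 0.
last=${array[count-1]}
```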

A special case is the set of positional parameters, which hold the command line arguments, i.e. items that you can provide as input parameters for your script. $0 refers to the name of the script itself, $1 refers to the first argument, $2 to the second and so on. $# refers to the number of arguments and $@ to the full argument list. Below is a sample script that illustrates using command line arguments:

#!/bin/bash
from_dir=$1
to_dir=$2
mkdir $to_dir
cp $from_dir/*.map $to_dir

If we now execute this script, named e.g. my_script2.sh, we have to give two arguments for the command. The first argument is in this case used to define a source directory for the copy command while the second argument is the target directory. For example, the command

./my_script2.sh source_data map_files

would copy all the files with extension .map from a directory called source_data to a new directory called map_files.
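
To see how the argument variables behave without writing a separate script file, the arguments can be set by hand. In the sketch below, the command set -- is used only to imitate starting a script with two arguments; the variable arg_list is a made-up helper.

```shell
#!/bin/bash
# "set --" replaces the positional parameters of the current shell,
# which imitates starting a script with these two arguments.
set -- source_data map_files

echo "Number of arguments: $#"
echo "First argument: $1"
echo "Second argument: $2"

# Go through the full argument list one item at a time.
arg_list=""
for arg in "$@"
do
   arg_list="${arg_list}[${arg}]"
done
echo "All arguments: $arg_list"
```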

Quotation marks

Three different quotation marks are used in bash. Quotation marks are frequently needed to define variables and commands to be executed. The following quotation marks can be used:

"" Take text within quotes literally after substituting any variables
'' Take text enclosed within quotes literally
`` Take text enclosed within quotes as a command, execute the command and then replace with output of the command at the location of quotation marks

Below are some examples to illustrate the functional differences of different quotation marks. Quotation marks can be used to operate with variables and arguments. When double or single quotation marks are used, all the text inside the quotation marks is used as one argument. The difference between these two quotation marks is that when double quotation marks are used, variables are substituted by their values, while with single quotation marks all text is used as it is. If you run the commands

variable=sample1
echo "value = $variable"

the result will be

value = sample1

But if you use single quotation marks instead

echo 'value = $variable'

you will get output

value = $variable

In Linux commands and scripts quotation marks are typically used to define arguments that contain spaces or other special characters. Say we would like to use grep to pick all rows that contain a string file size from a file called files.txt. The following command would not work:

grep file size files.txt

If you run the command above, you'll get an error message, as the word size is now interpreted to be the second argument defining the input file. We can fix the situation by using quotation marks:

grep "file size" files.txt

Now, the first argument defining the string to be searched is file size (including the space between the words), and the second argument defining the input file is now files.txt as originally intended.
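
The same applies when the text is stored in a variable: without double quotation marks the value is split at the spaces into several arguments. The sketch below counts the resulting arguments; set -- is used here only as a convenient way to count them, and the variable names are made up.

```shell
#!/bin/bash
pattern="file size"

# Without quotes the value is split into two arguments: file and size.
set -- $pattern
unquoted_count=$#

# With double quotes the value stays together as one argument.
set -- "$pattern"
quoted_count=$#

echo "Without quotes: $unquoted_count arguments"
echo "With quotes: $quoted_count argument"
```

For this reason, a command like grep "$pattern" files.txt is usually the safer form when the variable may contain spaces.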

The third quotation mark type `` has a special meaning. With these quotation marks, you can make one Linux command produce an argument for another Linux command. The basic syntax is:

command1 `command2`

where command1 will use the product of command2 as an argument. In a bash script, the same functionality can be done also with the syntax

$(command)
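
A common use for both forms is to store the output of a command in a variable. The following sketch works in a temporary directory with made-up file names; the exact spacing of the numbers printed by wc may vary between systems.

```shell
#!/bin/bash
workdir=$(mktemp -d)
touch "$workdir/a.map" "$workdir/b.map" "$workdir/notes.txt"

# Backquote syntax: count the lines that ls prints for the directory.
file_count=`ls "$workdir" | wc -l`
# Same idea with $( ), which is easier to read and to nest.
map_count=$(ls "$workdir"/*.map | wc -l)

echo "Directory contains $file_count files, $map_count of them are .map files"
rm -r "$workdir"
```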

Loops and conditional statements

Loops and conditional statements are rarely used in interactive command-line usage. However, they are frequently used in scripts to perform similar commands several times and to control the commands to be executed. Bash provides a wide selection of loops, conditional statements and other control structures. In this section, we show examples of some of the most commonly used control structures.

A for loop performs specified commands iteratively so that on each iteration the loop variable is set to be equal to one of the items in the given element list. In bash, a for loop is made with command structure:

for variable in element_list
do
   commands
done

For example, the loop

for filename in sample1.txt sample2.txt sample3.txt
do
   echo ${filename}
done

would print out

sample1.txt
sample2.txt
sample3.txt

Typically, the element list contains file names to be processed, but it can contain any other items as well. For example, say we have a directory called project_3 that contains nine files called sample1.txt, sample2.txt, ..., sample9.txt. To see the contents of the directory, we may use command ls:

$ ls project_3/
sample1.txt sample3.txt sample5.txt sample7.txt sample9.txt  
sample2.txt sample4.txt sample6.txt sample8.txt

If we would like to rename each of these files so that they have the extension .old, we could run command mv nine times, or we could use a for loop:

for filename in sample1.txt sample2.txt sample3.txt sample4.txt \
sample5.txt sample6.txt sample7.txt sample8.txt sample9.txt
do
   echo "Renaming file: ${filename}"
   mv project_3/${filename} project_3/${filename}.old
done

The for loop above is still quite clumsy as we need to write all the file names to the element list. We can avoid this by substituting the element list with $(ls project_3/). Now, the command ls project_3/ is used to produce a list of file names to be processed:

for filename in $(ls project_3/)
do
   echo "Moving file: $filename"
   mv project_3/$filename project_3/"$filename".old
done
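
A list produced by a command is split at every space, so a file name that contains a space would be handled as two separate items. A different technique, letting bash expand a wildcard pattern directly in the element list, avoids this. The sketch below is an illustration only: it builds its own scratch copy of a project_3 directory with made-up file names.

```shell
#!/bin/bash
# Build a scratch directory for the demonstration (assumed layout).
workdir=$(mktemp -d)
mkdir "$workdir/project_3"
touch "$workdir/project_3/sample1.txt" "$workdir/project_3/sample 2.txt"

# The pattern *.txt is expanded by bash into one item per file,
# so a name containing a space stays in one piece.
for filename in "$workdir"/project_3/*.txt
do
   echo "Moving file: $filename"
   mv "$filename" "$filename.old"
done
ls "$workdir/project_3"
```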

In bash, you can also create a for loop where a numerical index variable is increased automatically by a certain step size in each iteration. In this case, the syntax is:

for ((variable=start; variable<=end; variable++))

Below is a for loop that performs the same renaming operation as above, but using just numbers as elements.

for ((number=1; number<=9; number++))
do
   echo "Moving file: sample${number}.txt"
   mv project_3/sample${number}.txt project_3/sample${number}.txt.old
done
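
The step size does not have to be 1. As a small sketch, the loop below increases the index by 2 on each iteration; the variable odd_numbers is a made-up helper that collects the values.

```shell
#!/bin/bash
# Visit every second number from 1 to 9 by increasing the index by 2.
odd_numbers=""
for ((number=1; number<=9; number+=2))
do
   odd_numbers="${odd_numbers}${number} "
done
echo "Odd numbers: $odd_numbers"
```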

In a while loop, the loop keeps running as long as the defined condition statement is true. In bash, a while loop can be made with the syntax:

while [[ condition ]]
do
   commands
done

The renaming operation, made above with a for loop, could also be done with a while loop:

number=1
while [[ $number -le 9 ]]
do
   echo "Moving file: sample${number}.txt"
   mv project_3/sample${number}.txt project_3/sample${number}.txt.old
   ((number = number + 1))
done

In the example above, a variable called number is first set to have value 1. The value of this variable is then increased by 1 at the end of each iteration cycle. The iterations continue until the variable reaches value 10, at which point the condition is no longer true.

Conditional statements (if) can be made as follows:

if [[ condition ]]
then
   commands
else
   commands
fi

You can use the operators listed in the table below in the conditional statements of if and while commands. Note that bash uses different comparison operators for strings and integers. For example, the equality of strings is tested with == while the equality of integers is tested with -eq. The syntax is also strict about the spaces inside the brackets: the condition [[a == b]] will not work and should be written as [[ a == b ]].

Commonly used string, integer and file operators of if and while statements are listed below.

Statement	Operation
`[[ a == b ]]`	True if strings a and b are equal
`[[ a != b ]]`	True if strings a and b are not equal
`[[ a =~ b ]]`	True if string a matches the regular expression b
`[[ a < b ]]`	True if string a is alphabetically before string b
`[[ a > b ]]`	True if string a is alphabetically after string b
`[[ a -eq b ]]`	True if integers a and b are equal
`[[ a -ne b ]]`	True if integers a and b are not equal
`[[ a -lt b ]]`	True if integer a is less than b
`[[ a -gt b ]]`	True if integer a is greater than b
`[[ a -le b ]]`	True if integer a is less than or equal to b
`[[ a -ge b ]]`	True if integer a is greater than or equal to b
`[[ -e name ]]`	True if a file called name exists
`[[ -n a ]]`	True if string a has non-zero length
`[[ A || B ]]`	True if condition A or condition B is true (logical OR)
`[[ A && B ]]`	True if both condition A and condition B are true (logical AND)
`[[ ! A ]]`	True if condition A is not true

Below are some examples of if command structures.

Check if the integer variable x is greater than 10:

if [[ $x -gt 10 ]]
then
   echo "The value of variable x is more than 10"
fi

Check if the variable x is greater than 10 but smaller than 20:

if [[ $x -gt 10 && $x -lt 20 ]]
then
   echo "The value of variable x is more than 10 but less than 20"
else
   echo "The value of x is out of range"
fi

You can also compare variables containing text (strings):

if [[ $answer == "yes" ]]
then
   echo "Your answer was: yes"
elif [[ $answer == "no" ]]
then
   echo "Your answer was no"
else
   echo "You didn't answer yes or no"
fi
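
String tests can be combined in the same way as integer tests. The sketch below checks that a variable is not empty and that its value matches a regular expression with the =~ operator. The file name and the pattern are made up for the illustration, and the pattern is stored in a variable to keep the quoting simple.

```shell
#!/bin/bash
filename="sample12.txt"
# Regular expression: "sample", one or more digits, then ".txt".
pattern='^sample[0-9]+\.txt$'

if [[ -n $filename && $filename =~ $pattern ]]
then
   result="matches"
else
   result="does not match"
fi
echo "$filename $result the pattern"
```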

When using less than and more than comparisons, you should be careful not to mix string and integer comparisons. For example, the following condition

[[ 123 > 3 ]]

is FALSE because string 123 is alphabetically before string 3. The numerical comparison

[[ 123 -gt 3 ]]

is, however, TRUE.
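
The difference can be checked with a short script. In addition to the two conditions above, the sketch below includes the ((expression)) arithmetic syntax, where > is always an integer comparison. The variable names are made up for the illustration.

```shell
#!/bin/bash
# String comparison: "123" sorts before "3".
if [[ 123 > 3 ]]
then
   string_test=true
else
   string_test=false
fi

# Integer comparison with -gt.
if [[ 123 -gt 3 ]]
then
   integer_test=true
else
   integer_test=false
fi

# Inside (( )) the > sign compares integers.
if (( 123 > 3 ))
then
   arithmetic_test=true
else
   arithmetic_test=false
fi

echo "string: $string_test, integer: $integer_test, arithmetic: $arithmetic_test"
```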

There are a number of operators you can use to test different attributes of a file. The most commonly used operator is -e that checks if a file exists. As an example, let's assume that we have a simple list of file names called checklist.txt. We would like to check which of these files are found in the current directory. We can use a for loop to study all the file names and the if command with -e condition to test if the file exists:

for file_name in $(cat checklist.txt)
do
   if [[ -e $file_name ]]
   then
      echo "File $file_name was found"
   else
      echo "File $file_name was not found"
   fi
done
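
The loop above splits the list at every space, so it assumes that the file names contain no spaces. A different technique, reading the list one line at a time with a while loop and the read command, handles such names too. The following is only a sketch that creates its own test files in a temporary directory; the file names and the counters found and missing are made up.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "my data.txt"
printf "%s\n" "my data.txt" "missing.txt" > checklist.txt

found=0
missing=0
# read takes one full line per iteration; the input comes from checklist.txt.
while IFS= read -r file_name
do
   if [[ -e $file_name ]]
   then
      echo "File $file_name was found"
      ((found = found + 1))
   else
      echo "File $file_name was not found"
      ((missing = missing + 1))
   fi
done < checklist.txt
```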

Printing the output

In the previous examples we have already used the echo command to write text and variables to the standard output (i.e. to the screen or to a file by standard output redirection). For example, the command

echo "Hello world"

prints out

Hello world

echo can be used for printing output in many cases, but it does not provide good tools for creating well formatted output with defined columns. In situations where well-structured text output is needed, printf should be used instead of echo. The syntax of printf is:

printf "format definition" arguments_to_print

The format definition defines what types of output are to be printed. Common types include text (%s), integers (%i), and floating point numbers (%f). The format statements can also define how much space is reserved for each argument and how it is located in the column. Below are some simple examples to illustrate the usage of the printf command:

printf "%i %s %s %f\n" 1 Hello World 23.75

prints out

1 Hello World 23.750000

Here, the format statement defines that the first argument is considered to be an integer, second and third as strings and the fourth argument as a floating point number. Note that by default, printf does not add a newline character to the end of the output. To do that, the format statement should end with definition \n.

In the next example we define how many characters are reserved for each argument. The command

printf "%4i %10s %10s %6.2f\n" 1 Hello World 23.75

prints out

   1      Hello      World  23.75

Here, we reserve four characters for the first integer, then ten characters for each of the strings. The floating point number is presented with six characters, two of which are after the decimal point.

You can also add text and control characters like tabulator (\t) to the format statement. The command

printf "This is my %i:st %s %s\t %6.1f\n" 1 Hello World 23.75

prints out

This is my 1:st Hello World 23.8

In Linux scripts, printf is typically used to print out values stored in variables. For example, the commands

unit=3g
value=5.3
printf "The resulting value from:%4s\t is:\t%6.2f\n" $unit $value

print out

The resulting value from: 3g is: 5.30
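
Combined with a loop, printf can be used to print a simple table with aligned columns. The sketch below uses two made-up arrays, names and sizes. A minus sign in the format, as in %-10s, aligns the text to the left edge of the column instead of the right.

```shell
#!/bin/bash
names=(alpha beta gamma)
sizes=(5 120 42)

# Header row: ten characters for the name, six for the size.
printf "%-10s %6s\n" "Name" "Size"
for ((i=0; i<${#names[*]}; i++))
do
   printf "%-10s %6i\n" "${names[i]}" "${sizes[i]}"
done

# The formatted text can also be stored in a variable with $( ).
row=$(printf "%-10s %6i" "${names[2]}" "${sizes[2]}")
```

Each row is 17 characters wide (10 + 1 + 6), regardless of the length of the individual names and numbers, as long as they fit in the reserved space.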