Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

What this section covers

  • Why error checking is important
  • Error handling best practices
  • Functions to make error checking easy

Overview

One of the most common problems with writing any kind of program is lack of proper error handling.

If a program does not sanity check its operations and data, it can sometimes proceed for many subsequent steps until finally either a bad (or empty) result is generated, or some called program that does sanity check its data notices and terminates execution. It is then challenging to backtrack to where the original, causative error occurred – assuming there are even log files available to examine!

Some languages (Python, Java, R) automatically detect certain types of errors (e.g. file not found) and, by default, stop program execution and may report an execution stack trace that can be displayed to the user. While these stack traces are not usually meaningful on their own, they are better than nothing, and certainly better than allowing the program to blithely continue.

But built-in error runtime error checks are no help in sanity checking program data, because only the programmer knows what the data should look like! Enter user-defined error checking. Again, some languages make this less cumbersome:

  • Python assert statement
  • Java assert keyword
  • R stopifnot statement

Bash does not have built-in error checking, or the equivalent of These languages also have throw/catch exception handling constructs, which lets a program detect and deal with errors – if possible.So .

There are mechanisms in bash that provide some of this functionality (see especially set -e and set -x described in https://www.turnkeylinux.org/blog/shell-error-handling and http://linuxcommand.org/lc3_wss0150.php, and trap-ing signals in http://linuxcommand.org/lc3_wss0150.php).

Here, though, we're going to explore how to add "roll our own error handling mechanisms. This will also force us to better understand shells & sub-shells, exit and result codes, and other communication between execution contexts.

Shells and sub-shells

Every bash program has its own execution environment (sub-shell), which is a child process of its calling parent shell. A new sub-shell is created, runs, and returns when:

  • a built-in bash utility (e.g. ls) is run from the command line (or from within script)
  • a custom script is run from the command line (or from within a script)
  • backtick evaluation is used to execute commands (e.g. echo `date`)
  • any set of commands enclosed in parentheses is run, e.g.
    • ( date )

Parentheses evaluation is similar to backtick evaluation except that the standard output of backtick evaluation is automatically connected to the standard input of the caller. To connect the standard output of parentheses evaluation to the standard input of the caller, the the parentheses expression must be "evaluated" with a dollar sign. Consider:

  • today=`date`
  • today=$(date)
    • the date command is run in a sub-shell, writing its data to its standard output
    • date's standard output stream is connected to the calling shell's standard input
    • the caller's standard input text is stored in the today variable

Here are the main communication methods between shell environments:

  • Input to sub-shells
    • program arguments
    • environment variables
    • file or stream data
  • Output from sub-shells
    • exit code
    • standard output
    • file data

Environment variables

In addition to passing arguments to a program, a caller may set environment variables (normal bash shell variables) that can be read in the called environment. However by default, variables in a parent shell are not copied into sub-shells unless they are exported using the export keyword.

Code Block
languagebash
# a normal bash variable is not visible to sub-shells
foo=abc
( echo $foo )

# exported bash variables are visible to sub-shells
export foo
( echo $foo )

export bar="def"
( echo $bar )

Script exit codes and function return values

Unlike most other programming languages, bash functions and scripts can only return a single integer between 0 and 255. By convention a return value of 0 means success, and any other return value is an error code.

A function can return this value using the return keyword (e.g. return 0); the return value is then stored in the special $? variable, which can be checked by the caller. Function callers are always in the same execution environment (sub-shell) as any bash functions they call. Since this not very much information, function return values are not often used or checked. Instead, as we've seen, functions are often called for their standard output, which serves as a return value proxy.

A script can also return an integer value, called the exit code, using the exit keyword (e.g. exit 255). The exit code is returned to the script caller (in the parent shell) in the $? variable. Note that no further code in the current sub-shell is executed after exit is called.

The main use of exit codes is to check that a called program completed successfully.

Code Block
languagebash
# A successful exit code is 0
ls
echo $?

# Any non-0 exit code is an error (here the code is 2)
ls not_a_file
echo $?

Note that in the non-0 exit code case, the program may also report error information on standard error (e.g. ls: cannot access not_a_file: No such file or directory above).

Tip
titleTip

The $? return code variable must be checked immediately after the called program or sub-shell completes, because any further actions in the caller will change $?. One way to do this is to save off the value $? of in another variable (e.g. res=$?).

exercise 1

On the command line, call exit with various codes in a parentheses sub-shell and check the result in the caller.

...

titleTip

...

" error handling with a set of functions. Our goals are:

  • To stop execution immediately when an error is detected
  • Have the last line of the log file describe the fatal error (easy to tail)
  • Make it easy to frequently check for errors of different sorts in our code
  • Report the check being done (even when successful) in order to leave "bread crumbs" in the log file for troubleshooting.
  • Report all diagnostics to standard error so it will not interfere with a function's standard output that is meant to be captured by the caller
    • So if a function reports error checks to standard error, and also reports information on standard output, only the standard output will be captured by backtick or parentheses evaluation.

The step_04.sh Script

Here's a step_04.sh script that builds on our step_03.sh work, located in ~/workshop/step_04.sh.

Code Block
languagebash
titlestep_04.sh
#!/bin/bash

# Script version global variable. Edit this whenever changes are made.
__ADVANCED_BASH_VERSION__="step_04"

# =======================================================================
#  Helper functions
# =======================================================================

# Shorter format date
date2() { date '+%Y-%m-%d %H:%M:%S'; }

# Echo's its arguments and the date to std error
echo_se() { echo "$@ - `date2`" 1>&2; }
maybe_echo() {
  local do_echo=${ECHO_VERBOSE:-1}
  if [[ "$do_echo" == "1" ]]; then echo_se "$@"; fi
}

# General function that exits after printing its text
# in a standard format which can be easily grep'd.
err() {
  echo_se "** ERROR: $@ ...exiting"; exit 255;
}
# Function to check result code of programs.
# Exits with a standard error message if code is non-zero.
# Otherwise displays a completion message.
#   arg 1 - the return code (usually $?)
#   arg 2 - text describing what ran (optional)
check_res() {
  if [[ "$1" == "0" ]]; then maybe_echo ".. check_res: $2 OK";
  else err "$2 returned non-0 exit code $1"; fi
}
# Function that checks if a directory exists and exits if not.
#   arg 1 - the directory name
#   arg 2 - text describing the directory (optional)
check_dir() {
  if [[ ! -d "$1" ]]; then err "$2 Directory '$1' not found"
  else maybe_echo ".. $2 directory '$1' exists"; fi
}
# Function that checks if a file exists
#   arg 1 - the file name
#   arg 2 - text describing the file (optional)
check_file() {
  if [[ ! -e "$1" ]]; then err "$2 File '$1' not found"
  else maybe_echo ".. $2 file '$1' exists"; fi
}
# Function checks if a file exists & has non-0 length, else exits.
#   arg 1 - the file name
#   arg 2 - text describing the file (optional)
check_file_not_empty() {
  if [[ ! -e "$1" ]]; then err "$2 File '$1' not found"
  elif [[ ! -s "$1" ]]; then err "$2 File '$1' is empty"
  else maybe_echo ".. $2 file '$1' exists and is not empty"; fi
}
# Checks that its 1st argument is not empty
#   arg 1 - the value
#   arg 2 - text desribing what the value is (optional)
check_arg_not_empty() {
  local val="$1"; local info=${2:-'argument'}
  if [[ "$val" == "" ]]; then err "$info value is empty"
  else maybe_echo ".. $info value not empty"
  fi
}
# Function that checks whether the two values supplied are equal (as strings)
#   arg 1 - 1st value
#   arg 2 - 2nd value
#   arg 3 - text describing 1st value (optional)
#   arg 4 - text describing 2nd value (optional)
check_equal() {
  local val1="$1"; local val2="$2"
  local tag1=${3:-"val1"}; local tag2=${4:-"val2"}
  if [[ "$1" == "$2" ]]; then
    maybe_echo ".. check_equal $tag1 '$val1' OK"
  else
    echo_se "check_equal: not equal:
    ${tag1}: '$val1'
    ${tag2}: '$val2'
    "
    err "check_equal $tag1 $tag2"
  fi
}

# Sets up auto-logging to a log file in the current directory
# using the specified logFileTag (arg 1) in the log file name.
auto_log() {
  local logFileTag="$1"
  local logFilePath="./autoLog_${logFileTag}.log"
  check_arg_not_empty "$logFileTag" 'logFileTag'
  exec 1> >(tee "$logFilePath") 2>&1
  check_res $? "Logging to '$logFilePath'"
}

# =======================================================================
#  Command processing functions
# =======================================================================

# function that says "Hello World!" and displays user-specified text.
function helloWorld() {
  local txt1=$1
  local txt2=$2
  shift; shift
  local rest=$@

  echo "Hello World!"
  echo "  text 1: '$txt1'"
  echo "  text 2: '$txt2'"
  echo "  rest:   '$rest'"
}

# function that displays its 1st argument on standard output and
# its 2nd argument on standard error
function stdStreams() {
  local outTxt=${1:-"text for standard output"}
  local errTxt=${2:-"text for standard error"}
  echo    "to standard output: '$outTxt'"
  echo_se "to standard error:  '$errTxt'"
}

# function that illustrates auto-logging and capturing function output
#  arg 1 - (required) tag to identify the logfile
#  arg 2 - (optional) text for standard output
#  arg 3 - (optional) text for standard error
function testAutolog() {
  local logFileTag="$1"
  local outTxt=${2:-"text for standard output"}
  local errTxt=${3:-"text for standard error"}

  auto_log "$logFileTag"

  echo -e "\n1) Call stdStreams with output and error text:"
  stdStreams "$outTxt" "$errTxt"

  echo -e "\n2) Capture echo output in a variable and display it:"
  local output=`echo $outTxt`
  echo -e "   echo output was:\n$output"

  echo -e "\n3) Call echo_se with error text:"
  echo_se "$errTxt"

  echo -e "\n4)Capture echo_se function output in a variable and display it:"
  output=`echo_se "$errTxt"`
  echo -e "echo_se output was: '$output'"
}

# =======================================================================
#  Main script command-line processing
# =======================================================================

function usage() {
  echo "
advanced_bash.sh, version $__ADVANCED_BASH_VERSION__

Usage: advanced_bash.sh <command> [arg1 arg2...]

Commands:
  helloWorld [text to display]
  stdStreams [text for stdout] [text for stderr]
  testAutolog <logFileTag> [text for stdout] [text for stderr]
"
  exit 1
}

CMD=$1    # initially $1 will be the command
shift     # after "shift", $1 will be the 2nd command-line argument; $2 the 3rd, etc.
          # and $@ will be arguments 2, 3, etc.
# Only show usage if there is a command argument,
# making it possible to source this file
if [[ "$CMD" != "" ]]; then
  case "$CMD" in
    helloWorld) helloWorld "$@"
      ;;
    stdStreams) stdStreams "$1" "$2"
      ;;
    testAutolog) testAutolog "$1" "$2" "$3"
      ;;
    *) usage
      ;;
  esac
fi

The Parts

There are a number of error checking functions defined in step_04.sh – and many more that could be added of course – but this is a good start to get the ideas across.

err function

The main error function is just called err, and is called with some descriptive text elsewhere when an error is detected. It outputs that text with some surrounding decoration, and exits with a non-0 error code.

If we do our error checking properly, and an error occurs, the text from err will be the last line in the log file. Or the text "...exiting" can also be grep'd for in the log file to see if something bad happened.

Code Block
languagebash
# General function that exits after printing its text
# in a standard format which can be easily grep'd.
err() {
  echo_se "** ERROR: $@ ...exiting"; exit 255;
}

check_res function

The check_res function checks the result/exit code passed in to it (usually our friend $?).

Note that here and in other error checking functions, maybe_echo is used for diagnostic messages, so that they can be suppressed via the ECHO_VERBOSE environment variable, while text describing real errors is always displayed.

Code Block
languagebash
# Function to check result code of programs.
# Exits with a standard error message if code is non-zero.
# Otherwise displays a completion message.
#   arg 1 - the return code (usually $?)
#   arg 2 - text describing what ran (optional)
check_res() {
  if [[ "$1" == "0" ]]; then maybe_echo ".. check_res: $2 OK";
  else err "$2 returned non-0 exit code $1"; fi
}

Let's test the check_res function in a parentheses sub-shell with one or two arguments.

Code Block
languagebash
tmux new
source workshop/step_04.sh
( check_res 0 )
( check_res 0 testing )
( check_res 4 bad_thing )

exit

file checking functions

Several file/directory checking functions are defined:

  • check_dir
  • check_file
  • check_file_not_empty– handy for checking that a file both exists and is not empty

Code Block
languagebash
# Function that checks if a directory exists and exits if not.
#   arg 1 - the directory name
#   arg 2 - text describing the directory (optional)
check_dir() {
  if [[ ! -d "$1" ]]; then err "$2 Directory '$1' not found"
  else maybe_echo ".. $2 directory '$1' exists"; fi
}
# Function that checks if a file exists
#   arg 1 - the file name
#   arg 2 - text describing the file (optional)
check_file() {
  if [[ ! -e "$1" ]]; then err "$2 File '$1' not found"
  else maybe_echo ".. $2 file '$1' exists"; fi
}
# Function checks if a file exists & has non-0 length, else exits.
#   arg 1 - the file name
#   arg 2 - text describing the file (optional)
check_file_not_empty() {
  if [[ ! -e "$1" ]]; then err "$2 File '$1' not found"
  elif [[ ! -s "$1" ]]; then err "$2 File '$1' is empty"
  else maybe_echo ".. $2 file '$1' exists and is not empty"; fi
}

See https://www.gnu.org/software/bash/manual/html_node/Bash-Conditional-Expressions.html for conditional expressions, and https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html for conditional constructs such as if or case.

exercise 1

Explore these file checking functions in the safety of a sub-shell.

Expand
titleSolution


Code Block
languagebash
tmux new
source step_04.sh

( check_dir ~/ )
( check_dir ~/ 'my home' )
( check_file_not_empty ~/workshop/step_04.sh 'script' )
( check_file not_a_file.txt )

exit


value checking functions

Finally, we have two value checking functions:

  • check_arg_not_empty – handy for checking whether captured output is present or not.
  • check_equal – checks that two values passed in are equal (as strings).

Code Block
languagebash
# Checks that its 1st argument is not empty
#   arg 1 - the value
#   arg 2 - text desribing what the value is (optional)
check_arg_not_empty() {
  local val="$1"; local info=${2:-'argument'}
  if [[ "$val" == "" ]]; then err "$info value is empty"
  else maybe_echo ".. $info value not empty"
  fi
}
# Function that checks whether the two values supplied are equal (as strings)
#   arg 1 - 1st value
#   arg 2 - 2nd value
#   arg 3 - text describing 1st value (optional)
#   arg 4 - text describing 2nd value (optional)
check_equal() {
  local val1="$1"; local val2="$2"
  local tag1=${3:-"val1"}; local tag2=${4:-"val2"}
  if [[ "$1" == "$2" ]]; then
    maybe_echo ".. check_equal $tag1 '$val1' OK"
  else
    echo_se "check_equal: not equal:
    ${tag1}: '$val1'
    ${tag2}: '$val2'
    "
    err "check_equal $tag1 $tag2"
  fi
}

exercise 2

Explore these value checking functions in the safety of a sub-shell.

Expand
titleSolution


Code Block
languagebash
tmux new
source ( exit 0step_04.sh

( check_arg_not_empty 123 'important integer' )
res=$?
echo "exit code: $res" 

( exit 255 )
res=$?
echo "exit code: $res"

x

...

languagebash

...

( check_arg_not_empty ''  'important integer' )
( check_equal abc abc )
( check_equal abc def string1 string2 )

exit


simpler auto_log function

Notice that having these error checking functions at our disposal can lead to simpler, more readable code. For example, the auto_log function can be re-written like this:

Code Block
languagebash
# Sets up auto-logging to a log file in the current directory
# using the specified logFileTag (arg 1) in the log file name.
auto_log() {
  local logFileTag="$1"
  local logFilePath="./autoLog_${logFileTag}.log"
  check_arg_not_empty "$logFileTag" 'logFileTag'
  exec 1> >(tee "$logFilePath") 2>&1
  check_res $? "Logging to '$logFilePath'"
}