Shells and sub-shells

What this section covers

  • Shell and sub-shell environments
  • Program and function exit codes, and checking them
  • Capturing standard output with parentheses evaluation
  • Passing environment variables to a script using export
  • Extending the current shell environment using source
  • Testing source'd script functions with parentheses sub-shells

Overview

Every bash program has its own execution environment (sub-shell), which is a child process of its calling parent shell.

Here are the main communication methods between shell execution environments:

  • Input to sub-shells
    • program arguments
    • environment variables
    • standard input stream
    • file data
  • Output from sub-shells
    • exit code
    • standard output and standard error streams
    • file data
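
As a rough illustration of these channels, the sketch below runs a child bash process, passes it an argument, an environment variable, and standard input, then collects its standard output and exit code. The GREETING variable, the child_name label, and the exit code 3 are made up for this example.

```shell
# Run a child bash process. Data goes in three ways:
#   GREETING  - environment variable set just for this one command
#   arg1      - program argument (child_name only fills the child's $0)
#   stdin     - the text piped in from echo
code=0
out=$( echo "from stdin" | GREETING=hello bash -c '
  read -r line                    # read one line of standard input
  echo "$GREETING / $1 / $line"   # write to standard output
  echo "a warning" 1>&2           # write to standard error
  exit 3                          # exit code
' child_name arg1 2>/dev/null ) || code=$?

echo "captured output: $out"
echo "exit code: $code"
```

The standard error text is discarded here with 2>/dev/null; only standard output ends up in the out variable.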

A new sub-shell (child shell) is created, runs, and returns whenever:

  • an external utility program (e.g. ls) is run from the command line (or from within a script); true bash built-ins such as cd run in the current shell instead
  • a custom script (e.g. step_03.sh) or program is run from the command line (or from within a script)
  • backtick evaluation is used to execute commands (e.g. echo `date`)
    • the date command runs in a sub-shell, and its output is substituted into echo's arguments
  • any set of commands enclosed in parentheses is run, e.g.
    • ( date )

Parentheses sub-shells

A parenthesis sub-shell is a new, child shell, created when a command is enclosed in parentheses ( ). Note that any environment variables set in a parenthesis sub-shell are not visible to the parent.

baz=abc
echo $baz

(baz=xya)
echo $baz  # will still be abc
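
The same isolation applies to other shell state, such as the current directory. A minimal sketch (the variable names start_dir and end_dir are made up for the example):

```shell
start_dir=$PWD

# The cd happens only inside the parentheses sub-shell
( cd / && echo "inside the sub-shell: $PWD" )

end_dir=$PWD
echo "back in the parent:   $end_dir"
```

This makes parentheses a handy way to run a few commands in another directory without having to cd back afterwards.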

Parentheses evaluation

Parentheses evaluation is similar to backtick evaluation, except that special syntax is needed to capture the standard output of the parentheses sub-shell in the caller.

To capture the standard output of parentheses evaluation, the parentheses expression can be "evaluated" with a dollar sign ($). Consider:

  • today=`date`
  • today=$(date)
    • because it is enclosed in parentheses, the date command is run in a sub-shell, writing its data to its standard output
    • the dollar sign ($) before the opening parenthesis tells the calling shell to capture date's standard output and substitute it, as text, in place of the $( ) expression
  • In both cases the captured text is stored in the today variable
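
One practical difference, shown here as a small sketch: $( ) expressions nest without extra escaping, while nested backticks must be backslash-escaped. The path below is only an example string and does not need to exist.

```shell
path="/home/user/workshop/step_03.sh"   # hypothetical path, used as text only

# Nested parentheses evaluation: the inner dirname runs first
dir_name=$( basename "$( dirname "$path" )" )

# The same thing with backticks needs escaped inner backticks
dir_name_bt=`basename \`dirname "$path"\``

echo "$dir_name"
echo "$dir_name_bt"
```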

Script exit codes and function return values

Unlike most other programming languages, bash functions and scripts can only return a single integer between 0 and 255. By convention a return value of 0 means success (true), and any other return value is an error code (false).

A function can return this value using the return keyword (e.g. return 0). The return value is then stored in the special $? variable, which can be checked by the caller. Since this is not very much information, function return values are not often used or checked. Instead, as we've seen, functions are often called for their standard output, which serves as a return value proxy.
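
Here is a short sketch of both styles. The is_even and double functions are hypothetical helpers written for this example; they are not part of the workshop scripts.

```shell
# is_even (hypothetical): returns 0 (success) for even numbers, 1 otherwise
is_even() {
  local n=$1
  if (( n % 2 == 0 )); then return 0; else return 1; fi
}

# An "if" tests the return value directly, so $? is not needed here
if is_even 4; then result4="even"; else result4="odd"; fi
if is_even 7; then result7="even"; else result7="odd"; fi

# double (hypothetical): uses standard output as its "real" return value
double() { echo $(( $1 * 2 )); }
doubled=$( double 21 )

echo "4 is $result4, 7 is $result7, 21 doubled is $doubled"
```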

A script can also return an integer value, called the exit code, using the exit keyword (e.g. exit 255):

  • No further code in the current sub-shell is executed after exit is called.
  • A program's exit code is returned to the script caller (in the parent shell) in the $? variable.
  • The default exit code for a script is the exit code of the last command executed.
    • this will be the exit keyword's argument if exit is called explicitly

The main use of exit codes is to check that a called program completed successfully.

# A successful exit code is 0
ls
echo $?

# Any non-0 exit code is an error (here the code is 2)
ls not_a_file
echo $?

Note that in the non-0 exit code case, the program may also report error information on standard error (e.g. ls: cannot access not_a_file: No such file or directory above).

Tip

The $? return code variable must be checked immediately after the called program or sub-shell completes, because any further command run in the caller will change $?. One way to do this is to save the value of $? in another variable (e.g. res=$?).
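
A small sketch of why saving matters. The || res=$? form used here is just one way to save the code as part of the same command (an alternative to a separate res=$? line); the value 42 is arbitrary.

```shell
res=0
( exit 42 ) || res=$?      # save the exit code on the same line as the command

echo "some other work"     # a successful echo sets $? back to 0
later=$?

echo "saved exit code: $res; \$? after the echo: $later"
```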

calling exit in a parentheses sub-shell

On the command line, let's call exit with various codes in a parentheses sub-shell and check the result in the caller.

Tip

We will do this in a new tmux or screen session, since accidentally calling exit at top-level (instead of in a sub-shell) will log you off the server!

See this nice tmux cheat sheet: http://atkinsam.com/documents/tmux.pdf

# Invoke tmux from your login command line
tmux new

# Now you're in a tmux. Mine has a green bar at the bottom
( exit 0 )
echo $?

( exit 255 )
res=$?
echo "exit code: $res"

# exit tmux session
exit
# You're back at your login command line now

More on capturing output

We've already seen some examples of capturing output from echo using backtick evaluation. Now let's read the contents of a file into a variable using parentheses evaluation.

echo Some text in a file > dat.txt
cat dat.txt
dat=$( cat dat.txt )    # same as dat=`cat dat.txt`
echo $dat

This is great, but what if the referenced file doesn't exist?

dat=$( cat not_a_file )
echo $?
echo $dat

The dat variable is empty, and the exit code returned was 1.

Here the cat utility was kind enough to return a non-0 exit code; not all programs are as well written! And what if the file existed but was empty?

Rather than checking an exit code, it is often more robust to sanity check the returned output; for example, checking to see if it is empty. If you execute this in your tmux, be sure to enclose it all in parentheses or else your tmux will exit!

dat=$( cat not_a_file )

if [[ "$dat" == "" ]]; then
  echo "ERROR: no data found" 1>&2; exit 255
else
  echo "Data is: '$dat'"
fi

# or using -z to test for an empty string
if [[ -z "$dat" ]]; then echo "ERROR: no data found" 1>&2; exit 255;  
else echo "Data is '$dat'"; fi

See https://www.gnu.org/software/bash/manual/html_node/Bash-Conditional-Expressions.html for conditional expressions, and https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html for conditional constructs such as if or case.
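
The exit code check and the empty-output check can also be combined. The sketch below wraps them in a hypothetical read_data function (not part of the workshop scripts) that uses return rather than exit, so it is safe to try at the top level of a shell. The temporary files and the return codes 1 and 2 are arbitrary choices for illustration.

```shell
tmp_dir=$( mktemp -d )
echo "Some text in a file" > "$tmp_dir/dat.txt"
: > "$tmp_dir/empty.txt"              # create an empty file

# read_data (hypothetical): print a file's contents, or report why it can't
read_data() {
  local file=$1
  local dat                           # declared separately so the code tested is cat's
  if ! dat=$( cat "$file" 2>/dev/null ); then
    echo "ERROR: could not read $file" 1>&2; return 2
  elif [[ -z "$dat" ]]; then
    echo "ERROR: no data found in $file" 1>&2; return 1
  fi
  echo "$dat"
}

good=$( read_data "$tmp_dir/dat.txt" )
empty_code=0;   read_data "$tmp_dir/empty.txt"  || empty_code=$?
missing_code=0; read_data "$tmp_dir/not_a_file" || missing_code=$?

echo "good data: '$good'; empty: $empty_code; missing: $missing_code"
rm -rf "$tmp_dir"
```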

Setting environment variables for a script

In addition to passing arguments to a program, a caller may set environment variables that can be read in the called environment. However, by default, normal bash shell variables in a parent shell are not copied into (most) sub-shells unless they are exported using the export keyword. (The exception is parentheses sub-shells, which inherit all of the parent's variables.)

# a normal bash variable is not visible to sub-shells
foo=abc
tmux new
echo $foo
exit  # exit tmux sub-shell

# exported bash variables are visible to sub-shells
export foo
tmux new
echo $foo
exit  # exit tmux sub-shell
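
The same comparison can be made without tmux. In this sketch a child bash process (started with bash -c) stands in for a called script; the variable names plain and shared are made up for the example.

```shell
unset plain shared
plain=abc            # normal shell variable
export shared=xyz    # exported (environment) variable

# A separate child program only sees the exported variable
child_view=$( bash -c 'echo "plain=[$plain] shared=[$shared]"' )

# A parentheses sub-shell sees both
paren_view=$( ( echo "plain=[$plain] shared=[$shared]" ) )

echo "child program:          $child_view"
echo "parentheses sub-shell:  $paren_view"
```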

Note:  Exported variables are copied into the child shell, so any changes to an environment variable made in a sub-shell are not reflected in the parent.

export bar=abc
( echo $bar; export bar=123; echo $bar )

# bar in the parent shell will still be "abc"
echo $bar

Changing the current environment with source

We just saw how definitions in a sub-shell are not reflected back in the parent. But there is a way to bring all of a script's definitions into the calling shell: the source command, which reads and executes all code in the specified file in the caller's own environment rather than in a child shell.

When a file is source'd:

  • All environment variables and functions defined in the source'd file will be available in the parent shell after source is called.
  • Any top-level code in the source'd file is executed.
  • The exit code returned from source-ing is the exit code of the last executed code in the source'd file.
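
To see the difference between running a file and source'ing it, here is a minimal sketch using a small temporary file. The file contents and the names lib_var and lib_greet are hypothetical.

```shell
unset lib_var
lib=$( mktemp )
cat > "$lib" <<'EOF'
lib_var="set by the library"
lib_greet() { echo "hello from lib_greet"; }
EOF

# Run as a child program: its definitions disappear when it finishes
bash "$lib"
after_run=${lib_var:-"(not set)"}

# source it: the definitions land in the current shell
source "$lib"
after_source=${lib_var:-"(not set)"}
greeting=$( lib_greet )

echo "after running:  $after_run"
echo "after source:   $after_source"
echo "function call:  $greeting"
rm -f "$lib"
```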

The step_03.sh Script

Here's a step_03.sh script that builds on our step_02.sh work, located in ~/workshop/step_03.sh.

#!/bin/bash

# Script version global variable. Edit this whenever changes are made.
__ADVANCED_BASH_VERSION__="step_03"

# =======================================================================
#  Helper functions
# =======================================================================

# Shorter format date
date2() { date '+%Y-%m-%d %H:%M'; }

# Echo's its arguments and the date to std error
echo_se() { echo "$@ - `date2`" 1>&2; }
maybe_echo() {
  local do_echo=${ECHO_VERBOSE:-1}
  if [[ "$do_echo" == "1" ]]; then echo_se "$@"; fi
}

# Sets up auto-logging to a log file in the current directory
# using the specified logFileTag (arg 1) in the log file name.
auto_log() {
  local logFileTag="$1"
  if [[ "$logFileTag" != "" ]]; then
    local logFilePath="./autoLog_${logFileTag}.log"
    maybe_echo ".. logging to $logFilePath"
    exec 1> >(tee "$logFilePath") 2>&1
    res=$?
    if [[ "$res" != "0" ]]; then
      echo_se "** ERROR: auto logging returned non-0 exit code $res"
      exit 255
    fi
  else
    echo_se "** ERROR in auto_log: no logFile argument provided"
    exit 255
  fi
}

# =======================================================================
#  Command processing functions
# =======================================================================

# function that says "Hello World!" and displays user-specified text.
function helloWorld() {
  local txt1=$1
  local txt2=$2
  shift; shift
  local rest=$@
  echo "Hello World!"
  echo "  text 1: '$txt1'"
  echo "  text 2: '$txt2'"
  echo "  rest:   '$rest'"
}

# function that displays its 1st argument on standard output and
# its 2nd argument on standard error
function stdStreams() {
  local outTxt=${1:-"text for standard output"}
  local errTxt=${2:-"text for standard error"}
  echo    "to standard output: '$outTxt'"
  echo_se "to standard error:  '$errTxt'"
}

# function that illustrates auto-logging and capturing function output
#  arg 1 - (required) tag to identify the logfile
#  arg 2 - (optional) text for standard output
#  arg 3 - (optional) text for standard error
function testAutolog() {
  local logFileTag="$1"
  local outTxt=${2:-"text for standard output"}
  local errTxt=${3:-"text for standard error"}

  auto_log "$logFileTag"

  echo -e "\n1) Call stdStreams with output and error text:"
  stdStreams "$outTxt" "$errTxt"

  echo -e "\n2) Capture echo output in a variable and display it:"
  local output=`echo $outTxt`
  echo -e "   echo output was:\n$output"

  echo -e "\n3) Call echo_se with error text:"
  echo_se "$errTxt"

  echo -e "\n4)Capture echo_se function output in a variable and display it:"
  output=`echo_se "$errTxt"`
  echo -e "echo_se output was: '$output'"
}

# =======================================================================
#  Main script command-line processing
# =======================================================================

function usage() {
  echo "
advanced_bash.sh, version $__ADVANCED_BASH_VERSION__

Usage: advanced_bash.sh <command> [arg1 arg2...]

Commands:
  helloWorld [text to display]
  stdStreams [text for stdout] [text for stderr]
  testAutolog <logFileTag> [text for stdout] [text for stderr]
"
  exit 255
}

CMD=$1    # initially $1 will be the command
shift     # after "shift", $1 will be the 2nd command-line argument; $2 the 3rd, etc.
          # and $@ will be arguments 2, 3, etc.
# Only show usage if there is a command argument,
# making it possible to source this file
if [[ "$CMD" != "" ]]; then
  case "$CMD" in
    helloWorld) helloWorld "$@"
      ;;
    stdStreams) stdStreams "$1" "$2"
      ;;
    testAutolog) testAutolog "$1" "$2" "$3"
      ;;
    *) usage
      ;;
  esac
fi

The Parts

modified command argument processing

To allow our script to be source'd, top-level command argument processing has been modified so that the usage function (which calls exit) is only called if there is a command argument provided.

CMD=$1    # initially $1 will be the command
shift     # after "shift", $1 will be the 2nd command-line argument; $2 the 3rd, etc.
          # and $@ will be arguments 2, 3, etc.
# Only show usage if there is a command argument,
# making it possible to source this file
if [[ "$CMD" != "" ]]; then
  case "$CMD" in
    helloWorld) helloWorld "$@"
      ;;
    stdStreams) stdStreams "$1" "$2"
      ;;
    testAutolog) testAutolog "$1" "$2" "$3"
      ;;
    *) usage
      ;;
  esac
fi

So running the script with no arguments does nothing, and we only see usage if we type something after the script name that is not a valid command:

# Does not show usage
~/workshop/step_03.sh

# Shows usage
~/workshop/step_03.sh x
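
Another common bash idiom for the same goal, sketched here as an alternative and not what step_03.sh does, is to compare BASH_SOURCE[0] with $0: the two match when a file is run as a program and differ when it is source'd. The temporary script and its greet function are hypothetical.

```shell
script=$( mktemp )
cat > "$script" <<'EOF'
greet() { echo "hello"; }

# Top-level code runs only when the file is executed, not when source'd
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  echo "running as a program"
fi
EOF

# Executed: the guarded code runs
run_out=$( bash "$script" )

# source'd (inside $( ) so the current shell stays clean): only greet is defined
src_out=$( source "$script"; echo "sourced, greet says: $( greet )" )

echo "$run_out"
echo "$src_out"
rm -f "$script"
```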

date2 and maybe_echo functions

The echo_se function has been modified to call a new date2 function, which calls date specifying a custom, shorter date format.

We've also added a maybe_echo function that calls echo_se only if the user wants verbose messages. Verbose output is the default; it is controlled by the ECHO_VERBOSE environment variable, which the user can change by export'ing a different value to the script.

# Shorter format date
date2() { date '+%Y-%m-%d %H:%M'; }

# Echo's its arguments and the date to std error
echo_se() { echo "$@ - `date2`" 1>&2; }
maybe_echo() {
  local do_echo=${ECHO_VERBOSE:-1}
  if [[ "$do_echo" == "1" ]]; then echo_se "$@"; fi
}
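
The default comes from the ${ECHO_VERBOSE:-1} expansion, which substitutes 1 whenever the variable is unset or empty. A quick sketch of the three cases (the result variable names are made up):

```shell
unset ECHO_VERBOSE
when_unset=${ECHO_VERBOSE:-1}     # variable not set: default is used

ECHO_VERBOSE=0
when_zero=${ECHO_VERBOSE:-1}      # variable set: its own value is used

ECHO_VERBOSE=""
when_empty=${ECHO_VERBOSE:-1}     # set but empty: default is used again

echo "unset: $when_unset, zero: $when_zero, empty: $when_empty"
unset ECHO_VERBOSE
```

Note that only the exact value 1 turns messages on in maybe_echo, so any other non-empty value (not just 0) suppresses them.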

testing source'd functions

Another nice thing about source'ing a file is that it lets us easily test the functions we've defined, for example in a parentheses sub-shell. Let's experiment with this a bit. Again, it is good to do this in a tmux or screen session just in case! For example:

tmux new
source ~/workshop/step_03.sh
( helloWorld My name is Anna )
( stdStreams )

We can also test the new maybe_echo function, with and without verbose output:

tmux new
source ~/workshop/step_03.sh

# Normal verbose output
( maybe_echo "hello world" )

# Suppress verbose output
export ECHO_VERBOSE=0
( maybe_echo "hello world" )

# exit tmux session
exit 

auto_log function changes

There are also a couple of changes to the auto_log function:

  • It calls maybe_echo instead of echo_se to report the log file path, so that output can be suppressed.
    • But echo_se is called if an error is detected, since we never want to suppress actual error information.
  • The exit code returned by the exec 1> >(tee "$logFilePath") 2>&1 line is captured in a res variable, then checked.

# Sets up auto-logging to a log file in the current directory
# using the specified logFileTag (arg 1) in the log file name.
auto_log() {
  local logFileTag="$1"
  if [[ "$logFileTag" != "" ]]; then
    local logFilePath="./autoLog_${logFileTag}.log"
    maybe_echo ".. logging to $logFilePath"
    exec 1> >(tee "$logFilePath") 2>&1
    res=$?
    if [[ "$res" != "0" ]]; then
      echo_se "** ERROR: auto logging returned non-0 exit code $res"
      exit 255
    fi
  else
    echo_se "** ERROR in auto_log: no logFile argument provided"
    exit 255
  fi
}

exercise 1

In a sub-shell, test the auto_log function – with and without a logFileTag argument – and check the exit code.

Solution

tmux new
source ~/workshop/step_03.sh

# Exit code will be 0 on successful execution
( auto_log test2 )
echo $?

# Exit code will be 255 when an error is detected
( auto_log )
echo $?

exit # exit tmux session