Part 3: Writing text

We've looked at some ways of viewing text, so now we'll address how to write it.

echo - the bash print function

Other programming languages usually supply a print statement or function that can direct text to a file or to standard output.

In bash, the print utility directs text to actual printers, so there is a different method of text output: the echo command. Let's give it a try:

echo Hello world!

As you can see, echo just takes all the text following the command name and writes it back out to standard output – the Linux stream that by default is mapped to your Terminal. So the text appears on your Terminal!

With raw text as arguments to echo, extra whitespace (one or more spaces) is removed.

# These commands display the same output
echo Goodbye cruel world!
echo Goodbye     cruel      world!

A couple of useful options:

  • echo -e says to enable interpretation of backslash escapes

    • so, for example, \n is interpreted as a linefeed, and \t as a tab character

  • echo -n says don't output the trailing newline (linefeed) character

See the difference between these two calls to echo:

echo hello      # displays "hello" on the next terminal line,
                # then the prompt on a new line
echo -n hello   # displays "hello" on the next terminal line
                # followed by the prompt (no newline)

Exercise 3-1

What is the difference in character count when you echo hello with and without the -n option?

Use wc -c to count the characters.

echo hello | wc -c      # Reports 6 characters, including the newline
echo -n hello | wc -c   # Reports 5 characters because the linefeed
                        # is not part of the output

The reason for this difference is that by default, echo appends a trailing linefeed to the text it writes. The -n option suppresses that linefeed.
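One way to make the trailing linefeed visible is to dump the bytes that echo writes. The following is a minimal sketch that assumes the standard od utility (not otherwise covered in this part); od -c prints each byte of its input as a character, so a linefeed shows up as the two characters \n.

```shell
# od -c prints every input byte as a character; a linefeed appears as \n
echo hello | od -c       # the dump shows \n after the final "o"
echo -n hello | od -c    # the dump stops at the final "o"
```

The first column of each dump line is a byte offset, so the final offset also tells you how many bytes were written.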

Now see the difference between these two calls to echo:

echo "hello \n goodbye"      # Displays "hello \n goodbye" on one line
echo -e "hello \n goodbye"   # Displays "hello" on one line and " goodbye"
                             # on the next line

The difference is that the \n is interpreted as a newline when the -e option is provided.

To better understand what is meant by "interpretation of backslash escapes" in the -e option description, we need to look at how the shell evaluates other metacharacters, and how that is affected by quoting in the shell. But first we'll explore environment variables, since they use the dollar sign ( $ ) metacharacter for evaluation.

Environment variables

Environment variables are just like variables in a programming language (in fact bash is a complete programming language): they are names that hold a value assigned to them. As with all programming language variables, they have two operations:

  1. variable definition - assign a value to a variable name

  2. variable reference - use the variable name to represent the value it holds

In bash, you define (set/assign) an environment variable like this:

varname=hello   # Assign the environment variable named "varname"
                # the value "hello"

Careful – do not put spaces around the equals sign when assigning environment variable values! The shell is very picky about this.

Also, variable names can only contain alphanumeric (A-Z, a-z, 0-9) and underscore ( _ ) characters, and must begin with a letter or an underscore.

An environment variable can be referenced by putting the dollar sign ( $ ) metacharacter in front of the variable name.

echo The value of variable varname is: $varname

When it sees a dollar sign ( $ ), the shell will evaluate the variable name after it (here varname) and substitute its value before writing output text.

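You can also reference an environment variable using a slightly longer syntax: ${<variable name>}. The braces mark exactly where the variable name ends, so this form is a good idea whenever the reference is directly followed by text that could be read as part of the name (letters, digits or underscores). The longer syntax looks like this: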

my_varname=hello
echo ${my_varname}
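The braces matter most when a reference is directly followed by characters that could belong to a variable name. Here is a minimal sketch using a hypothetical variable named fruit (the other variable names are also just for illustration):

```shell
fruit=apple
without_braces="I like $fruits"    # the shell looks up "fruits", which is not set
with_braces="I like ${fruit}s"     # the braces end the name after "fruit"
echo "$without_braces"             # displays "I like " with nothing after it
echo "$with_braces"                # displays "I like apples"
```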

It's perfectly fine to evaluate an environment variable that has not been assigned a value - it will just be "empty".

echo The value of variable XYZZY is: $XYZZY

There are a number of pre-defined environment variables in the shell, such as USER (your account name), HOME (full pathname of your Home directory) and PATH (more on PATH later). The env command will list them along with their values.

Exercise 3-2

Output a string that includes your account name and your Unix group name using environment variables.

Use env to see your built-in environment variables.
You may want to pipe output to grep -i, or less -I then search for "group" ignoring case (/group in less).

env | grep -i group   # Is there an environment variable with
                      # "group" in its name?

Examining the env output we find that the variable MY_GROUP contains our Unix group.

echo The Unix group for $USER is $MY_GROUP

Quoting in the shell

When the shell processes a command line, it first parses the text into tokens ("words"), which are groups of characters separated by whitespace (one or more space characters). Quoting affects how this parsing happens, including how metacharacters are treated and how text is grouped.

There are three types of quoting in the shell:

  1. single quoting (e.g. 'some text') – this serves two purposes

    • it groups together all text inside the quotes into a single token

    • it tells the shell not to "look inside" the quotes to perform any evaluation

      • all metacharacters inside the single quotes are ignored

      • in particular, any environment variables in single-quoted text are not evaluated

    • single quoting preserves any whitespace present in the text

      • i.e. spaces, newlines (\n) , and tabs (\t)

  2. double quoting (e.g. "some text") – also serves two purposes

    • it groups together all text inside the quotes into a single token

    • it allows environment variable evaluation, but inhibits some metacharacters

      • e.g. asterisk ( * ) pathname globbing (more on globbing later...)

        • and some other metacharacters

    • double quoting also preserves whitespace in the text

      • i.e. spaces, newlines (\n) , and tabs (\t)

  3. backtick quoting (e.g. `date`)

    • evaluates the expression inside the backtick marks ( ` )

    • the standard output of the expression replaces the text inside the backtick marks ( ` )

    • an alternate (and preferred) syntax is $( <command> ), e.g. $( date )

The quote characters themselves ( '  "  ` ) are metacharacters that tell the shell to "start a quoting process" then "end a quoting process" when the matching quote is found. Since they are part of the processing, the enclosing quotes are not included in the output.
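As a quick side-by-side sketch of the three quoting types (the variable names item, single, double and word_count are made up for this illustration):

```shell
item='two words'                      # single quotes: one token, nothing evaluated
single='The item is $item'            # $item stays as literal text
double="The item is $item"            # $item is replaced by its value
word_count=$(echo "$item" | wc -w)    # the command's output replaces $( ... )

echo "$single"                        # displays: The item is $item
echo "$double"                        # displays: The item is two words
echo "Words in item: $word_count"     # wc -w counts 2 words
```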

Let's look at examples of these.

Single and double quotes

The first rule of quoting is: always enclose a command argument in quotes if it contains spaces so that the command sees the quoted text as one item.
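To see the grouping effect directly, you can count the arguments a command receives. This sketch assumes a small helper function named count_args (hypothetical, defined here only for the demonstration) that prints the special variable $#, the number of arguments it was given:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received
count_args() { echo $#; }

count_args Hello world       # 2: two separate arguments
count_args 'Hello world'     # 1: the quotes group the text into one argument
count_args "Hello   world"   # 1: the inner spaces are part of the argument
```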

To see more on how quoting affects text grouping, we'll use quotes to define some multi-word environment variables.

Always use single ( ' ) or double ( " ) quotes when you define an environment variable whose value contains spaces.

See the difference between:

foo='Hello world'   # correct - defines variable "foo" to have
                    # the value "Hello world"
foo=Hello world     # error - Command 'world' not found

The 2nd expression above, without the quotes, produces an error. What's going on? The shell parses the input into two tokens: "foo=Hello" and "world". It treats "foo=Hello" as a variable assignment that applies only to the command that follows, then tries to execute world, which it thinks is a command.
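A name=value word placed directly in front of a command is handed to that command as a temporary variable rather than stored in your current shell. The sketch below assumes a hypothetical variable named greeting, and uses sh -c only as a convenient command that can display its own environment. The ${greeting:-not set} form substitutes the text "not set" when the variable is empty or unset.

```shell
unset greeting
# The assignment is visible only to the command that follows it
greeting=Hello sh -c 'echo "inside the command: $greeting"'   # inside the command: Hello
# Back in the current shell the variable was never set
echo "afterwards: ${greeting:-not set}"                       # afterwards: not set
```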

As for the difference between single quotes and double quotes, these two expressions produce the same output because the assigned text does not contain any special metacharacters:

foo="My name is Anna"; echo $foo
foo='My name is Anna'; echo $foo

But these two expressions are different:

foo="My USER name is $USER"; echo $foo   # The text "$USER" is evaluated
                                         # and its value substituted
foo='My USER name is $USER'; echo $foo   # The text "$USER" is left as-is

Here the single quotes tell the shell to treat the quoted text as a literal, and not to look inside it for metacharacter processing.

So far so good. But what if you want text to include quotes? For example, if you want to output this text:

The value of 'FOO' is "Hello world!"

Two common approaches:

  • Use the backslash ( \ ) character to escape the following character

    • escaping means treat the next character as a literal, even if it is a special metacharacter

  • Use combinations of single and double quotes.

Examples

FOO="Hello world!"
echo "The value of 'FOO' is \"$FOO\""     # Escape the double quotes
                                          # inside double quotes
echo "The value of 'FOO' is" '"'$FOO'"'   # Single-quoted text after
                                          # double-quoted text

If you see the greater than ( > ) character after pressing Enter, it can mean that your quotes are not paired, and the shell is waiting for more input to contain the missing quote (either single or double). Just use Ctrl-c to get back to the prompt.

Exercise 3-3

How would you output this text: Use the backslash character \ for escaping

A couple of possibilities: 

# Single quotes inhibit metacharacter processing
echo 'Use the backslash character \ for escaping'
# Escape the escape character inside double quotes
echo "Use the backslash character \\ for escaping"

Multi-line text

If you want to output multi-line text, you can:

  • Start the text with a single or double quote

    • press Enter when you want to start a new line

    • keep entering text and Enter until you're satisfied

    • enter the matching single or double quote then Enter

  • Use echo -e to enable interpretation of backslash escapes

    • Note that backslash escapes include some that represent non-printable characters

      • e.g. newline/linefeed ( \n ), and tab ( \t )

Exercise 3-4

How would you output this text:

My
name is
Anna

A couple of possibilities: 

echo 'My
name is
Anna'
echo -e "My\nname is\nAnna"

A final quoting subtlety: outside quotes, the shell treats any run of whitespace (spaces, newlines and tabs) as just a separator between words, so the extra whitespace is lost unless the expression is enclosed in quotes.

# These expressions produce the same output because
# the shell removes whitespace:
echo Hello world!
echo Hello      world!

# Echo a multi-line text variable without quotes, and the
# shell removes whitespace (here the newline)
foo='aa
bb'
echo $foo

# But enclose the multi-line variable in double quotes and whitespace
# is preserved
echo "$foo"
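You can confirm what happened to the newline by counting output lines with wc -l. A small sketch, reusing the multi-line foo value from above (the two count variables are just illustrative names):

```shell
foo='aa
bb'
unquoted_lines=$(echo $foo | wc -l)     # newline lost: echo writes "aa bb" on 1 line
quoted_lines=$(echo "$foo" | wc -l)     # newline kept: echo writes 2 lines
echo "unquoted: $unquoted_lines, quoted: $quoted_lines"
```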

Backtick evaluation quoting

Backtick ( ` ) evaluation quoting is one of the underappreciated wonders of Unix.

The shell:

  • evaluates the expression/command inside the backtick marks ( ` )

  • the standard output of the expression replaces the text inside the backtick marks ( ` )

Here are some examples using the date command, which simply writes the current date and time to standard output (and so appears on your Terminal).

date          # Calling the date command just displays
              # date/time information
echo date     # Here "date" is treated as a literal word, and
              # written to output
echo `date`   # The date command is evaluated and its output
              # replaces the command

# Assign a string including today's date to variable "today"
today="Today is: `date`"; echo $today

The alternative parentheses evaluation syntax $(<some command>), is equivalent to `<some command>`, and can be easier to read when the command to be evaluated is complex.

echo "Today is: `date`"
echo "Today is $(date)"
echo "Today is $( date )"
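One place where the parentheses syntax is noticeably easier to read is when one evaluation is nested inside another. The sketch below assumes the standard basename and pwd commands (basename strips the leading directories from a path, pwd prints the current directory); the variable names are only for illustration.

```shell
# Nested $( ) needs no special escaping
nested_parens=$(basename "$(pwd)")
echo "This directory is named $nested_parens"

# The backtick equivalent needs a backslash before each inner backtick
nested_ticks=`basename \`pwd\``
echo "This directory is named $nested_ticks"
```

Both forms should display the same directory name, but the backslashes make the backtick version harder to read and to get right.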

Exercise 3-5

How would you output this text using a command to calculate the number of lines:

The haiku.txt file has 11 lines

Use parentheses evaluation or backtick evaluation and wc -l

Provide the file data to wc -l via cat so that the filename is not displayed:

cat haiku.txt | wc -l

echo "The haiku.txt file has $(cat haiku.txt | wc -l) lines"

Notice that the evaluated expression can be complex!

Redirection

So far, the text we've been working with has gone to standard output, which I keep reminding you is mapped to your Terminal. But you can redirect text elsewhere.

Recall the three standard Unix streams (standard input, standard output and standard error). Each has a number, a name and a redirection syntax; the two output streams work like this:

  • standard output is stream 1

    • redirect standard output to a file with the > or 1> operator

      • a single > or 1> overwrites any existing data in the target file

      • a double >> or 1>> appends to any existing data in the target file

  • standard error is stream 2

    • redirect standard error to a file with the 2> operator

      • a single 2> overwrites any existing data in the target file

      • a double 2>> appends to any existing data in the target file

Some standard output examples:

echo hello > out.txt
cat out.txt                    # Displays "hello"
echo world 1>> out.txt         # Appends "world" to out.txt
cat out.txt                    # Displays "hello" "world" on 2 lines
echo goodbye world 1>out.txt   # Overwrites out.txt file
cat out.txt                    # Displays "goodbye world"

Notice that when using redirection, the output does not appear on the Terminal; it goes only to the specified file.

If you want output to go to both the Terminal and a file, you can use the tee command. You can also specify the tee -a option to append the input text to the file you specify.

echo Hello Anna | tee out.txt        # Displays "Hello Anna"
cat out.txt                          # Also displays "Hello Anna"
echo Goodbye Anna | tee -a out.txt   # Displays "Goodbye Anna"

student01@gsafcomp01:~$ cat out.txt
Hello Anna
Goodbye Anna

Note that the > redirection metacharacter sends its output to a file, not to another program's standard input stream as with the | pipe metacharacter.

The standard error stream

So what's this standard error stream? Recall our discussion of "Command input errors" in Part 1: The Bash shell and commands? Well, error information is written to standard error, not to standard output!

It is easy to not notice the difference between standard output and standard error when you're in an interactive Terminal session – because both outputs are sent to the Terminal. But they are separate streams, with different meanings.

When executing commands you will want to manipulate standard output and standard error appropriately – especially for 3rd party programs.

Let's look at a command that shows the difference between standard error and standard output:

ls haiku.txt xxx.txt

Produces this output in your Terminal:

ls: cannot access 'xxx.txt': No such file or directory
haiku.txt

What is not obvious, since both streams are displayed on the Terminal, is that:

  • the diagnostic text "ls: cannot access 'xxx.txt': No such file or directory" is being written to standard error

  • the listing of the existing file (haiku.txt) is being written to standard output

To see this, redirect standard output and standard error to different files and look at their contents:

ls haiku.txt xxx.txt 1> out.txt 2>err.txt
cat out.txt   # Displays "haiku.txt"
cat err.txt   # Displays "ls: cannot access 'xxx.txt': No such file or directory"

What if you want both standard output and standard error to go to the same file? You use this somewhat odd 2>&1 redirection syntax:

# Redirect both standard output and standard error to the out.txt file
student30@gsafcomp02:~$ ls haiku.txt xxx.txt > out.txt 2>&1

# Display the contents of the out.txt file
student30@gsafcomp02:~$ cat out.txt
ls: cannot access 'xxx.txt': No such file or directory
haiku.txt
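The order of the two redirections matters, because the shell processes them left to right and 2>&1 means "send standard error to wherever standard output is going right now". The sketch below is only an illustration: report is a hypothetical helper function that writes one line to each stream (the >&2 inside it sends that echo's output to standard error), and mktemp -d creates a scratch directory so no existing files are touched.

```shell
# report is a hypothetical helper that writes one line to each stream
report() {
    echo "line for standard output"
    echo "line for standard error" >&2
}

workdir=$(mktemp -d)    # scratch directory for the demonstration files

# File first, then 2>&1: both lines land in the file
report > "$workdir/both.txt" 2>&1

# 2>&1 first: standard error is pointed at the Terminal (where standard
# output was going at that moment), then only standard output moves to the file
report 2>&1 > "$workdir/stdout_only.txt"

both_count=$(cat "$workdir/both.txt" | wc -l)
stdout_only_count=$(cat "$workdir/stdout_only.txt" | wc -l)
echo "lines in both.txt: $both_count"                  # 2 lines
echo "lines in stdout_only.txt: $stdout_only_count"    # 1 line
rm -r "$workdir"
```

So to capture both streams in one file, put the file redirection first and 2>&1 after it.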

One final redirection trick. There is a special Linux file called /dev/null that serves as a "global trash can". That is – it just throws away anything you write to it. So you can direct standard output and/or standard error to /dev/null to ignore it completely.
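As a small sketch of the "and/or" above, here is a hypothetical function named noisy that writes to both streams (its >&2 sends that line to standard error). Capturing its standard output with $( ) shows what survives each kind of redirection; the variable names are made up for this illustration.

```shell
# noisy is a hypothetical helper that writes one line to each stream
noisy() {
    echo "normal output"
    echo "error output" >&2
}

errors_discarded=$(noisy 2>/dev/null)      # only "normal output" is left to capture
all_discarded=$(noisy > /dev/null 2>&1)    # both streams are thrown away

echo "kept: '$errors_discarded'"           # kept: 'normal output'
echo "kept: '$all_discarded'"              # kept: ''
```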

Exercise 3-6

Show the difference between standard output and standard error by redirecting standard error to /dev/null.

This will only display "haiku.txt"

ls haiku.txt xxx.txt 2>/dev/null

Show the difference between standard output and standard error by redirecting standard output to /dev/null.