...

  1. In the 1st, the more command reads some input from the jabberwocky.txt file
    • then writes the output to standard output, which is displayed on your Terminal
    • it pauses at page boundaries (--More--) waiting for input on standard input
    • when it receives a space character on standard input it reads more input from jabberwocky.txt
    • then writes the output to standard output, which is displayed on your Terminal
  2. In the 2nd, the cat command reads its input from the jabberwocky.txt file
    • then writes its output to standard output
    • the pipe operator ( | ) then connects the standard output from cat to standard input of the more command
    • the more command then reads its input from standard input, instead of from a file
      • then writes its output to standard output, which is displayed on your Terminal
      • more continues its processing similar to #1, except reading from its standard input instead of the file
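
The difference between the two cases can also be seen without an interactive pager. The sketch below is only an illustration: it uses head in place of more, and a throwaway file named sample.txt (a made-up name, not part of the course data). The same command produces the same output whether it opens the file itself or reads the data from standard input through a pipe.

```shell
# Create a small sample file in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$tmpdir/sample.txt"

# Case 1: head opens the file itself.
from_file=$(head -n 2 "$tmpdir/sample.txt")

# Case 2: cat writes the file to standard output, and the pipe connects
# that to head's standard input.
from_pipe=$(cat "$tmpdir/sample.txt" | head -n 2)

echo "$from_file"
echo "$from_pipe"

rm -r "$tmpdir"
```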

...

Expand
titleAnswer...

Just entering the cat command with no arguments appears to "hang" – that is, nothing happens and you don't see the command prompt (press Ctrl-c to get it back).

The man page for cat says this:

Code Block
languagebash
NAME
    cat - concatenate files and print on the standard output
SYNOPSIS
    cat [OPTION]... [FILE]...
DESCRIPTION
    Concatenate FILE(s) to standard output.
    With no FILE, or when FILE is -, read standard input.

The SYNOPSIS says that, in addition to one or more optional options ( [OPTION]... ), file arguments to cat are also optional ( [FILE]... ).

Since there was no FILE provided, cat reads from standard input – but there's no data there either, so it just sits and waits for some to appear.
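
To see that cat really is waiting on standard input, you can supply that input yourself. In this small sketch, printf stands in for text typed at the Terminal:

```shell
# With no FILE argument, cat copies standard input to standard output.
no_file=$(printf 'hello\n' | cat)

# A lone dash ( - ) explicitly names standard input.
dash=$(printf 'hello\n' | cat -)

echo "$no_file"   # hello
echo "$dash"      # hello
```

When typing at the Terminal instead, pressing Ctrl-d on an empty line signals end of input, and cat exits normally.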

...

With no options, head shows the first 10 lines of its input and tail shows the last 10 lines. You can use the -n option followed by a number to specify how many lines to view, or just put the number you want after a dash (e.g. -5 for 5 lines or -1 for 1 line).
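
A quick way to try these forms is with generated input. The sketch below assumes the seq command is available and uses its output (the numbers 1 through 20, one per line) as a stand-in for a 20-line file:

```shell
# seq 1 20 prints the numbers 1 through 20, one per line (stand-in input).
first_default=$(seq 1 20 | head)        # lines 1-10
last_default=$(seq 1 20 | tail)         # lines 11-20
first_three=$(seq 1 20 | head -n 3)     # lines 1-3
last_two=$(seq 1 20 | tail -2)          # lines 19-20, short form of -n 2

echo "$first_three"
echo "$last_two"
```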

...

But what if you want to see lines in the middle of a file? Here's where a special feature of tail comes in handy. If you use tail and put a plus sign (+) in front of the number (with or without the -n option), tail will start its output at that line.

Let's pipe line-numbered output from cat to tail to see how this works. Note that we use cat -n to provide input with line numbers because neither head nor tail has a line-numbering option.

Code Block
languagebash
cat -n haiku.txt | tail -n 5   # display the last 5 lines of haiku.txt

cat -n haiku.txt | tail -n +5  # display text in haiku.txt starting at
                               # line 5
cat -n haiku.txt | tail +6     # display text in haiku.txt starting at
                               # line 6

When you use the tail -n +<integer> syntax, tail displays everything from that line to the end of its input. So to view only a few lines starting at a specified line number, pipe the output to head:

Code Block
languagebash
# display 2 lines of haiku.txt starting at line 9
cat -n haiku.txt | tail -n +9 | head -2
cat -n haiku.txt | tail +9 | head -n 2

cat -n haiku.txt | head -10 | tail -2
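
The tail-then-head pattern works for any range of lines. As a sketch, the helper show_lines below is hypothetical (it is not a standard command): it prints COUNT lines starting at line START. It is shown with generated input from seq rather than haiku.txt.

```shell
# show_lines START COUNT: print COUNT lines of standard input,
# beginning at line START. Hypothetical helper for illustration.
show_lines() {
    tail -n +"$1" | head -n "$2"
}

# seq 1 20 stands in for a 20-line file.
middle=$(seq 1 20 | show_lines 12 3)
echo "$middle"    # prints 12, 13 and 14 on separate lines
```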

Exercise 2-4

Use cat, head and tail to display the middle stanza of haiku.txt.

Expand
titleHint...

Use cat -n to see the numbering of haiku.txt lines, then a combination of head/tail or tail/head.

Expand
titleAnswer...

There are three 3-line stanzas in haiku.txt. The middle stanza is lines 5-7.

Code Block
languagebash
cat -n haiku.txt | tail -n +5 | head -n 3
cat -n haiku.txt | tail +5 | head -3

cat -n haiku.txt | head -7 | tail -3


Text lines and the Terminal

Sometimes a line of text is longer than the width of your Terminal. In this case the text is wrapped. It can appear that the output is multiple lines, but it is not. We can see that by looking at the mobydick.txt file, which has some very long lines:

Code Block
languagebash
tail -1 mobydick.txt
cat -n mobydick.txt | more

Note that most Terminals let you increase/decrease the width/height of the Terminal window. But there will always be single lines too long for your Terminal width or too many lines of text for its height.

So how long is a line? And how many lines are there in a file? The wc (word count) command can tell us this.

  • wc -l reports the number of lines in its input
  • wc -c reports the number of characters in its input (including invisible linefeed characters)
  • wc -w reports the number of words in its input (groups of space-separated text characters)

Examples:

Code Block
languagebash
wc -l mobydick.txt            # Reports the number of lines in the
                              # mobydick.txt file
cat mobydick.txt | wc -l      # Reports the number of lines of its input

tail -1 mobydick.txt | wc -c  # Reports the number of characters in
                              # the last mobydick.txt line
head -5 mobydick.txt | wc -c  # Reports the total number of characters in
                              # the first 5 mobydick.txt lines

When you give wc -l multiple files, it reports the line count of each, then a total.

...

Tip

Note the slight difference when you give wc -l a file name versus when you pipe input to it.

  • wc -l <filename> displays the number of lines, then the file name.
  • cat <filename> | wc -l only displays the number of lines in its anonymous input from standard input.
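
The difference is easy to see side by side. This sketch uses a throwaway 3-line file named three.txt (a made-up name) and also shows input redirection with <, which behaves like the pipe because wc never sees a file name. Some wc implementations pad the count with leading spaces.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/three.txt"      # hypothetical 3-line file

with_name=$(wc -l "$tmpdir/three.txt")        # count followed by the file name
from_pipe=$(cat "$tmpdir/three.txt" | wc -l)  # count only
from_redirect=$(wc -l < "$tmpdir/three.txt")  # count only

echo "$with_name"
echo "$from_pipe"
echo "$from_redirect"

rm -r "$tmpdir"
```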

...

We've talked about viewing text using various Unix commands – but what exactly is text? That is, what is stored in files that the shell interprets as text?

Inside of files, text isn't characters at all – it is all numbers, because that's all computers know.

On standard Unix systems, each text character is stored as one byte (eight binary bits) in a format called ASCII (American Standard Code for Information Interchange). Eight bits can store 2^8 = 256 values, numbered 0 - 255.
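
You can look at these numbers directly. The sketch below uses the od (octal dump) command, which may differ from the tool used elsewhere on this page, to print each byte of a short string as a decimal number:

```shell
# od -An -tu1 prints each input byte as an unsigned decimal number
# (-An suppresses the offset column).
codes=$(printf 'Hi!' | od -An -tu1)
echo $codes    # unquoted so the shell collapses od's padding: 72 105 33

# Each character occupies exactly one byte.
nbytes=$(printf 'Hi!' | wc -c)
echo $nbytes   # 3
```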

...

However, not all ASCII "characters" are printable; in fact, the "printable" characters start at ASCII 32 (space).

ASCII values 0 - 31 have special meanings. Many were designed for use in early modem protocols, such as EOT (end of transmission) and ACK (acknowledge), or for printers, such as VT (vertical tab) and FF (form feed).

The non-printable ASCII characters we care most about are:

  • Tab (decimal 9, hexadecimal 0x9, octal 0o011)
    • backslash escape: \t
  • Linefeed/Newline (decimal 10, hexadecimal 0xA, octal 0o012)
    • backslash escape: \n
  • Carriage Return (decimal 13, hexadecimal 0xD, octal 0o015)
    • backslash escape: \r
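
These characters can be made visible with od. In this sketch, printf produces a short string containing a tab, a carriage return and a newline; od then shows them first as backslash escapes and then as hexadecimal values:

```shell
# od -An -c shows printable characters as themselves and common
# control characters as backslash escapes. tr -s squeezes od's padding.
escapes=$(printf 'a\tb\r\n' | od -An -c | tr -s ' ')
printf '%s\n' "$escapes"   # fields: a \t b \r \n

# The same bytes as hexadecimal values.
hexes=$(printf 'a\tb\r\n' | od -An -tx1)
echo $hexes                # 61 09 62 0d 0a
```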

...

  • The numeric offset of the 16-character line, in hexadecimal (base 16)
    • 16 decimal is 0x10 hex
  • The numeric value (ASCII code) for each character, again in hexadecimal
    • each 2-digit hex number represents one 8-bit byte/character
  • The translated text
    • The display character associated with each ASCII code, or a period ( . ) for non-printable characters, written between a greater-than ( > ) and a less-than ( < ) sign

Notice that spaces are ASCII 0x20 (decimal 32), and the newline characters appear as 0x0a (decimal 10).

Why hexadecimal? Programmers like hexadecimal (base 16) because it is easy to translate hex digits to binary, which is how everything is represented in computers. And it can sometimes be important to know which binary bits are 1s and which are 0s. See Decimal and Hexadecimal for more information.
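
The translation is easy because each hex digit corresponds to exactly four binary bits. As a sketch, the helper hex_digit_to_bin below is hypothetical (it is not a standard command): it converts one hex digit at a time using shell arithmetic.

```shell
# hex_digit_to_bin DIGIT: print the 4-bit binary form of one hex digit.
# Hypothetical helper for illustration.
hex_digit_to_bin() {
    value=$((0x$1))          # shell arithmetic understands the 0x prefix
    bits=""
    for weight in 8 4 2 1; do
        if [ "$value" -ge "$weight" ]; then
            bits="${bits}1"
            value=$((value - weight))
        else
            bits="${bits}0"
        fi
    done
    echo "$bits"
}

# 0x4a is the ASCII code for the letter J.
high=$(hex_digit_to_bin 4)   # 0100
low=$(hex_digit_to_bin a)    # 1010
echo "0x4a = $high $low"     # 0x4a = 0100 1010
```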

...