1.1. Sometimes a single keyword has entries in multiple sections. For instance, passwd has entries under both section 1 and section 5. In most cases, man returns the entry in the lowest-numbered section, but you can force the issue by preceding the keyword with the section number. For instance, typing man 5 passwd returns information on the passwd file format rather than the passwd command.
Table 1.1 Manual sections
Some programs have moved away from man pages to info pages. The basic purpose of info pages is the same as that for man pages. However, info pages use a hypertext format so that you can move from section to section of the documentation for a program. Type info info to learn more about this system.
There are also pages specifically for the shell's built-in (internal) commands, called the help pages. To read the help pages for a particular built-in command, type help command. For instance, to get help on the pwd command, type help pwd at the shell prompt. To learn more about how to use the help pages, type help help at the shell prompt.
The man pages, info pages, and help pages are usually written in a terse style. They're intended as reference tools, not tutorials! They frequently assume basic familiarity with the command, or at least with Linux in general. For more tutorial information, you must look elsewhere, such as in books or on the Web.
Using Streams, Redirection, and Pipes
Streams, redirection, and pipes are some of the more powerful command-line tools in Linux. Linux treats the input to and output from programs as a stream, which is a data entity that can be manipulated. Ordinarily, input comes from the keyboard and output goes to the screen. You can redirect these input and output streams to come from or go to other sources, such as files. Similarly, you can pipe the output of one program as input into another program. These facilities can be great tools to tie together multiple programs.
Part of the Unix philosophy to which Linux adheres is, whenever possible, to do complex things by combining multiple simple tools. Redirection and pipes help in this task by enabling simple programs to be combined together in chains, each link feeding off the output of the preceding link.
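As a small illustration of this chaining, the sketch below combines three single-purpose tools into one command. The sample words are made up for the example, and any POSIX-style shell is assumed.

```shell
# Three simple tools chained with pipes:
#   printf emits three lines, sort orders them,
#   and uniq drops the adjacent duplicate line.
printf 'pear\napple\npear\n' | sort | uniq
```

Each program does one small job, and the pipe character (|) hands the output of one link to the next.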
Exploring File Descriptors
To begin understanding redirection and pipes, you must first understand the different file descriptors. Linux handles all objects as files. This includes a program's input and output streams. To identify a particular file object, Linux uses file descriptors:
Standard Input
Programs accept keyboard input via standard input, abbreviated STDIN. Standard input's file descriptor is 0 (zero). In most cases, this is the data that comes into the computer from a keyboard.
Standard Output
Text-mode programs send most data to their users via standard output, abbreviated STDOUT. Standard output is normally displayed on the screen, either in a full-screen text-mode session or in a GUI terminal emulator, such as an xterm. Standard output's file descriptor is 1 (one).
Standard Error
Linux provides a second type of output stream, known as standard error, abbreviated STDERR. Standard error's file descriptor is 2 (two). This output stream is intended to carry high-priority information such as error messages. Ordinarily, standard error is sent to the same output device as standard output, so you can't easily tell them apart. You can redirect one independently of the other, though, which can be handy. For instance, you can redirect standard error to a file while leaving standard output going to the screen. This allows you to view the error messages at a later time.
Internally, programs treat STDIN, STDOUT, and STDERR just like data files – they open them, read from or write to the files, and close them when they're done. This is why the file descriptors are necessary and why they can be used in redirection.
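To see that the two output streams really are separate, the following minimal sketch sends each one to its own file. The function name report and the file names out.txt and err.txt are hypothetical, and the >&2 notation (which sends a command's output to file descriptor 2) is standard shell syntax not otherwise introduced in this text.

```shell
# A hypothetical function that writes one line to each output stream
report() {
    echo "normal output"         # goes to STDOUT (file descriptor 1)
    echo "error output" >&2      # goes to STDERR (file descriptor 2)
}

# Send each stream to its own file by naming its file descriptor
report 1> out.txt 2> err.txt

cat out.txt    # contains only the STDOUT line
cat err.txt    # contains only the STDERR line
```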
Redirecting Input and Output
To redirect input or output, you use operators following the command, including any options it takes. For instance, to redirect the STDOUT of the echo command, you would type something like this:
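A command matching the surrounding description (the echo command, the 1> form of the operator, and path.txt as the target file) would look along these lines. The cat call is included only to display the result, and the quotes around $PATH are a defensive habit rather than a requirement.

```shell
# Redirect STDOUT (file descriptor 1) of echo into path.txt
echo "$PATH" 1> path.txt

# Display what ended up in the file
cat path.txt
```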
The result is that the file path.txt contains the output of the command (in this case, the value of the $PATH environment variable). The operator used to perform this redirection was > and the file descriptor used to redirect STDOUT was 1 (one).
The cat command allows you to display a file's contents to STDOUT. It is described further in the section “Processing Text Using Filters” later in this chapter.
A nice feature of redirecting STDOUT is that you do not have to use its file descriptor, only the operator. Here's an example of leaving out the 1 (one) file descriptor when redirecting STDOUT:
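One way to try this is sketched below. The file name another_path.txt is hypothetical and chosen only for illustration.

```shell
# Same redirection as before, but without the explicit 1;
# the shell assumes STDOUT when > has no file descriptor in front of it
echo "$PATH" > another_path.txt

# Display what ended up in the file
cat another_path.txt
```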
You can see that even without the STDOUT file descriptor, the output was redirected to a file. However, the redirection operator (>) was still needed.
You can also leave out the STDIN file descriptor when using the appropriate redirection operator. Redirection operators exist to achieve several effects, as summarized in Table 1.2.
Table 1.2 Common redirection operators
Most of these redirectors deal with output, both because there are two types of output (standard output and standard error) and because you must be concerned with what to do in case you specify a file that already exists. The most important input redirector is <, which takes the specified file's contents as standard input.
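The existing-file concern and the input redirector can be tried out with a short sketch like the one below. It assumes a POSIX-style shell, uses the standard > (create or overwrite), >> (append), and < (input) operators, and the file name log.txt is hypothetical.

```shell
echo "first"  > log.txt     # > creates log.txt, or overwrites it if it exists
echo "second" > log.txt     # the file is overwritten: "first" is gone
echo "third" >> log.txt     # >> appends instead of overwriting

cat log.txt                 # shows "second" and "third"

# < feeds the file's contents to wc as standard input
wc -l < log.txt             # reports a count of 2 lines
```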
A common trick is to redirect standard output or standard error to /dev/null. This file is a device that's connected to nothing; it's used when you want to get rid of data. For instance, if the whine program is generating too many unimportant error messages, you can type whine 2> /dev/null to run it and discard its error messages.
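Because whine is not a real program, the sketch below stands in for it with a hypothetical shell function of the same name that writes one line to each output stream. The messages are invented for the example.

```shell
# Hypothetical stand-in for the whine program
whine() {
    echo "whine: unimportant warning" >&2   # written to STDERR
    echo "useful result"                    # written to STDOUT
}

# Discard STDERR; only the STDOUT line reaches the screen
whine 2> /dev/null
```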
One redirection operator that requires elaboration is the << operator. This operator implements something called a here document. A here document takes text from subsequent lines as standard input. Chances are you won't use this redirector on the command line, because lines typed there are already standard input, so there's no need to redirect them. Rather, you might use this operator in a script to pass data to an interactive program. Unlike with most redirection operators, the text immediately following the << code isn't a filename; instead, it's a word that's used to mark the end of input. For instance, typing someprog << EOF causes someprog to accept input until it sees a line that contains only the string EOF (without even a space following it).
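A minimal sketch of a here document as it might appear in a script follows. Here sort stands in for someprog, and the input lines and the file name sorted.txt are made up for the example.

```shell
# Everything between "<< EOF" and the line containing only EOF
# is passed to sort as its standard input
sort << EOF > sorted.txt
pear
apple
banana
EOF

cat sorted.txt    # shows apple, banana, pear on separate lines
```

The closing EOF must be alone on its line; if it is indented or followed by a space, the shell does not recognize it as the end marker.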
Some programs that take input from the command line expect you to terminate input by pressing Ctrl+D. This keystroke sends the end-of-transmission (EOT) character of the American Standard Code for Information Interchange (ASCII), which the terminal treats as an end-of-file marker.
Piping Data between Programs
Programs can frequently operate on other programs' outputs. For instance, you might use a text-filtering command (such as the ones described shortly in “Processing