.. _advanced:

Advanced usage
==============

So far, we have only covered the simplest aspects of what the shell can do. But
it can be used for far more than this, and can even serve as a full-blown
scripting language capable of running complex applications. While this level of
mastery is probably unnecessary for most users, a few advanced topics are
useful enough to be worth covering in more detail, even in an introductory
document such as this. The interested reader is referred to more complete
guides, such as `this one `_.

Redirection
-----------

The *standard output* of a command, normally displayed on the terminal, can be
*redirected* to a file using the ``>`` symbol. For example, assuming that:

.. code-block:: console

    $ ls
    docs  html  LICENSE  README.md

The file listing can be *redirected* to a file, called ``listing.txt`` in the
example below:

.. code-block:: console

    $ ls > listing.txt

This creates the file specified (overwriting it if it already exists), and the
output normally shown by ``ls`` is not visible on the terminal. It has however
been stored in the ``listing.txt`` file, as we can verify with ``cat``:

.. code-block:: console

    $ cat listing.txt
    docs
    html
    LICENSE
    listing.txt
    README.md

Note that ``ls`` writes one entry per line when its output is not a terminal,
and that ``listing.txt`` itself appears in the listing, since the shell creates
the file before running ``ls``.

Redirection can also work in *append* mode, using the ``>>`` symbol, where the
output of the command is appended to the file rather than overwriting its
entire contents. For example:

.. code-block:: console

    $ app1 input output -options > log.txt
    $ app2 arg1 arg2 >> log.txt

The first line creates the ``log.txt`` file and records any output from the
``app1`` command. The second line then *appends* the output of ``app2`` to the
same log file.

Likewise, we can redirect the *standard input* to feed in the contents of a
file as input, rather than typing it in, using the ``<`` symbol. For example:

.. code-block:: console

    $ sort < myfile.txt

will feed the contents of ``myfile.txt`` to the ``sort`` command's *standard
input*.
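
The three redirection operators can be tried out safely in a scratch folder.
The following is only a sketch: the folder is created with ``mktemp -d``, and
the file names ``log.txt`` and ``sorted.txt`` are placeholders chosen for
illustration:

.. code-block:: bash

    # Work in a throwaway folder so that no existing file is overwritten.
    cd "$(mktemp -d)"

    # '>' creates log.txt (or empties it first if it already exists).
    echo "pear" > log.txt

    # '>>' appends to the existing contents instead of replacing them.
    echo "apple" >> log.txt

    # '<' feeds log.txt to the standard input of sort, and the sorted
    # result is in turn redirected to sorted.txt.
    sort < log.txt > sorted.txt

    # Display the result: the two lines, now in alphabetical order.
    cat sorted.txt

Input and output redirections can be combined on the same command, as in the
``sort`` line above.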

Pipes
-----

A pipe is a special type of redirection, where the *standard output* of one
command is fed directly into the *standard input* of another, using the ``|``
symbol. Both commands run concurrently, with the second command able to process
the output of the first as soon as it is provided. This is very useful for
building compound commands. For example:

.. code-block:: console

    $ grep ERROR log.txt | sort | uniq
    ERROR: error type one
    ERROR: input file not found
    ERROR: something bad happened

uses the ``grep`` command to find all lines in ``log.txt`` that contain the
character string ``ERROR``, then feeds those lines (which would normally be
displayed on the terminal) via the pipe as input to the ``sort`` command. This
sorts the lines in alphabetical order and feeds its output to the ``uniq``
command, which removes adjacent duplicate lines (which is why the lines are
sorted first). The outcome of the full pipeline is a list of all unique error
messages logged in the ``log.txt`` file.

Another particularly useful example is to capture the output from a command
expected to produce a lot of output, and browse through it at a more suitable
pace rather than seeing it fly past on the terminal. This can be done using the
command ``less`` (a paginator):

.. code-block:: console

    $ complex_process -verbose | less

This ability to quickly implement otherwise non-trivial functionality is one of
the great strengths of the command line. Unix is full of little tools like
``grep``, ``sort`` and ``uniq`` that are designed to operate on text and to be
daisy-chained in this manner.

Conditional execution
---------------------

While Bash provides its own ``if`` statement for more complex situations, it
also offers a simple construct to allow execution of one command based on the
success or failure of another, using the ``&&`` and ``||`` operators
respectively. For example:

.. code-block:: console

    $ myapp args -options || echo "myapp failed to run!" >> log.txt

will append a message to the ``log.txt`` file if the ``myapp`` command fails.
On the other hand:

.. code-block:: console

    $ stage1 -options inputdata/ tmpdata/ && stage2 tmpdata/ outputdata/

will only run the ``stage2`` command if the ``stage1`` executable has completed
successfully (useful if the data produced by the first command is to be
processed by the next one).

Variables
---------

It is often useful to store information in variables. For instance, you might
want to use a long and complicated filename often, and rather than typing it in
every time you need it, you could use a variable. Variables are assigned using
the ``=`` symbol (beware: no spaces around it), and retrieved (dereferenced)
using the ``$`` symbol. For example:

.. code-block:: console

    $ logfile=/some/complicated/location/myapp/logs/run1.txt
    $ myapp input intermediate > $logfile
    $ otherapp intermediate output >> $logfile
    ...

The variable ``logfile`` is set to the filename of the logfile, and the output
of all subsequent commands is then redirected to that file (see above).

Iterating with for loops
------------------------

It is often necessary to run the same command on a number of files. This can be
achieved simply and effectively with a ``for`` loop, like this:

.. code-block:: console

    $ for item in logs/run*.txt; do grep OUTPUT $item; done

This will find all lines that contain the token ``OUTPUT`` in the logfiles
stored in the ``logs/`` folder whose names match the pattern ``run*.txt``, and
print them on the terminal.

What actually happens here is that a variable ``item`` is used to store each
token listed after the ``in`` keyword (until the end of line or ``;`` symbol),
and the command(s) between the ``do`` and ``done`` keywords are then executed
for each token. The current value of the token can be retrieved within the loop
by dereferencing it like any other variable, using the ``$`` symbol.

Note that the above does not need to be all on the same line.
In practice, lines can be broken wherever the ``;`` was used in the example
above:

.. code-block:: console

    $ for item in logs/run*.txt
    > do
    >   grep OUTPUT $item
    > done

Parameter substitution
----------------------

Certain operations can be performed on variables at the point where they are
dereferenced. Of these, the most useful is probably the ability to strip a
prefix or suffix. This is done using a syntax like ``${var#prefix}`` or
``${var%suffix}``. It is most useful in scripts and when combined with ``for``
loops. For example:

.. code-block:: console

    $ for data in *.dat
    > do
    >   process $data ${data%.dat}.out > ${data%.dat}.log
    > done

will run the ``process`` command on all files in the current folder that end
with the ``.dat`` suffix, and pass as second argument the same filename with
the ``.dat`` suffix stripped and replaced with the ``.out`` suffix. The output
of each command will be stored in its own log file, named in the same way but
with the ``.log`` suffix. If the current folder contained the files:

.. code-block:: console

    $ ls
    backup/  final.dat  original.dat  parameters.txt  trial2.dat

Then the commands actually run will be:

.. code-block:: console

    $ process final.dat final.out > final.log
    $ process original.dat original.out > original.log
    $ process trial2.dat trial2.out > trial2.log

Many other types of parameter substitution are possible; see the `relevant
documentation `_ for details.
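
As a minimal sketch of the two forms mentioned in this section, the snippet
below applies both prefix and suffix stripping to a single value. The path
``results/trial2.dat`` and the variable names ``file`` and ``name`` are
hypothetical, chosen purely for illustration; no files are read or written:

.. code-block:: bash

    # Hypothetical path, used only for illustration.
    file=results/trial2.dat

    # '%' strips a matching suffix from the end of the value.
    echo "${file%.dat}"        # prints results/trial2

    # '#' strips a matching prefix from the start of the value.
    echo "${file#results/}"    # prints trial2.dat

    # The two can be chained through an intermediate variable.
    name=${file#results/}
    echo "${name%.dat}.log"    # prints trial2.log

In both forms the variable itself is left unchanged: only the dereferenced
value is modified.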