The document provides an overview of the Unix shell and its basic concepts:
- The shell allows users to interact with the operating system by entering commands
- Common Unix shells include bash, csh, and ksh
- The shell prompt displays information about the user and current directory
- Commands can be run internally by the shell or externally by spawning new processes
- Processes have attributes like the command line, exit status, and environment variables
1. The Unix Shell: Basic and Advanced Concepts
Corrado Santoro
ARSLAB - Autonomous and Robotic Systems Laboratory
Dipartimento di Matematica e Informatica - Università di Catania, Italy
santoro@dmi.unict.it
Master Cloud P.A.
Corrado Santoro The Unix Shell
3. What is a shell
”She sells sea shells
on the sea shore”
The shell is the part of an operating system which allows a
user to interact with the system itself.
It can be:
Textual (terminal)
Graphical
The name “shell” indicates that it is a “wrapper” or
“packaging” around the inner part, which is the operating
system as a whole.
4. The Unix shell
It is the most complete shell since it not only allows the user to
launch programs and operate with the system, but it is also a
programming language.
There are several Unix shells available:
Bourne Shell (sh, bash): the first shell and the most
commonly used.
C Shell (csh): created in Unix-BSD, it has a programming
language different than that of sh.
Tenex C Shell (tcsh): evolution of the csh.
Korn Shell (ksh): a different shell developed by David
Korn (AT&T labs), with its proper programming language.
5. The Unix shell
From the user's point of view, a shell is characterised by a
prompt from which a user can enter a command:
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ command
The prompt usually gives the user useful information, e.g.:
The name of the computer
The current directory
The name of the current user
...
6. The Unix shell
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ command
The command can be:
An internal command: it is interpreted and executed
internally by the shell itself.
An external command: it is the name of an executable
program, which is run by the shell by spawning a new
process; when the program terminates, the control returns
to the shell.
7. Basics of Unix processes
A Unix process is characterised by:
The command line
The exit value
The environment variables
The standard streams (stdin, stdout, stderr)
All of these parts play a fundamental role in the interaction with
the shell.
8. Command Line and Exit Status
/didattica/MasterCloud/shell$ command par1 par2 par3
The executable command is run and all the command line
is passed to the main function as an array of strings.
The return value of the main function is the exit status of
the command.
✞
int main(int argc, char **argv)
✡✝ ✆
For the exit status:
0: the execution succeeds with no errors;
non zero (positive): one or more errors occur (in general,
each specific error has an associated exit status value).
9. An Example
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
The executable file ls is searched for (it is in /bin/ls);
The shell spawns a new (child) process and runs the
executable (fork + exec);
The command line, as the array of strings [ "ls", "/"],
is passed to the main function;
The shell calls the wait() function, waiting for the child to
terminate;
When “ls” terminates, the control is given to the shell
again, which gathers the exit code (return value) of “ls”.
10. Inspecting Exit Status
Using the shell internal command echo $? we know the exit
status of the last process executed:
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ echo $?
0
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
Here the command terminated with success (zero value), but
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio
ls: cannot access /ciccio: No such file or directory
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ echo $?
2
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
here the command terminated with error (non-zero value,
/ciccio dir does not exist).
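The same inspection can be sketched in a small script. This is only an illustration: the variable names are hypothetical, and it relies on the standard commands true and false, which do nothing except set an exit status.

```shell
#!/bin/sh
# "true" and "false" do nothing but terminate with a given exit status.
true
status_ok=$?        # 0: success
false
status_fail=$?      # non-zero: failure
# A child shell can terminate with an arbitrary status through "exit".
sh -c 'exit 3'
status_custom=$?    # 3, the value passed to exit
echo "true=$status_ok false=$status_fail custom=$status_custom"
```

Note that $? must be saved immediately: every new command overwrites it.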
11. Command Line and Exit Status
In summary:
The command line is the interface from the shell to an
external command (executable).
The exit status is the interface from the command to the
shell.
13. Internal Commands and Variables
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio
ls: cannot access /ciccio: No such file or directory
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ echo $?
2
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
The command echo is an internal command: it is executed by
the shell itself. Its meaning is “show on the screen the literal
which follows” (similar to printf()).
What is the literal “$?”?
Each literal starting with $ is expanded by the shell to the value of a
variable (a shell or environment variable), i.e. a named value belonging
to the process, which can be read and written.
$? is a special shell variable which expands to “the exit status
of the last command executed”.
14. Environment Variables
Each process has a set of environment variables (pairs
[name, value]) which can be used for several
functionalities.
Environment variables are inherited (from the parent) by a
child process.
The opposite does not hold: a variable modified by a child
process is not propagated to the parent.
Variables can be read or written in a C program using the
functions getenv and setenv.
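The inheritance rule can be checked directly from the shell. The following is a minimal sketch: GREETING is a hypothetical variable name chosen for the illustration, and $(...) captures the output of the child shell.

```shell
#!/bin/sh
# GREETING is a hypothetical variable used only for this illustration.
GREETING=hello
export GREETING                  # make it part of the environment

# The child shell inherits the value from its parent...
child_view=$(sh -c 'echo "$GREETING"')

# ...but a change made inside the child does not reach the parent.
sh -c 'GREETING=changed'
parent_view=$GREETING

echo "child saw: $child_view, parent still has: $parent_view"
```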
15. The Shell Environment
The external command env and the shell internal command set can be used to
show the environment variables of the shell (set also lists the shell variables that are not exported):
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ env
...
USER=corrado
LC_TELEPHONE=it_IT.UTF-8
INSIDE_NEMO_PYTHON=
SSH_AUTH_SOCK=/run/user/1000/keyring-4hePqy/ssh
SESSION_MANAGER=local/Corrado-1215P:@/tmp/.ICE-unix/1452,unix/Corrado-1215P:/tmp/.ICE-unix/145
USERNAME=corrado
DEFAULTS_PATH=/usr/share/gconf/default.default.path
XDG_CONFIG_DIRS=/etc/xdg/xdg-default:/etc/xdg
PATH=/home/corrado/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
DESKTOP_SESSION=default
PWD=/home/corrado/didattica/MasterCloud/shell
LANG=en_US.utf8
GDMSESSION=default
SHLVL=1
XDG_SEAT=seat0
HOME=/home/corrado
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
...
16. Some Useful Shell Variables
USER: the ID of the current user
PWD: the current working directory
HOME: the home directory of the current user
PATH: a colon-separated list of complete directory names,
which are places in which the shell searches for the
executable of an external command
USER=corrado
PATH=/home/corrado/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:
/bin:/usr/games:/usr/local/games
PWD=/home/corrado/didattica/MasterCloud/shell
18. Towards Shell Programming: Combining several
commands into a single line
The semicolon “;” can be used to separate different commands
typed in a single line.
All the commands are executed in sequence no matter their exit
status.
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / ; echo $?
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
0
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio ; echo $?
ls: cannot access /ciccio: No such file or directory
2
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio ; echo CIAO $USER
ls: cannot access /ciccio: No such file or directory
CIAO corrado
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
19. Combining several commands only on successful
execution
The double ampersand “&&” can be used to separate different
commands typed in a single line, but a command is executed
only if the previous one succeeded, i.e. its exit status is zero.
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / && echo CIAO $USER
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
CIAO corrado
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio && echo CIAO $USER
ls: cannot access /ciccio: No such file or directory
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
This is an initial form of “if-then” conditional expression
20. The complete conditional expression
The double ampersand “&&” is indeed a conditional expression
which includes also the double pipe “||” to indicate the “else”
part.
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / && echo CIAO $USER || echo Errore
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
CIAO corrado
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio && echo CIAO $USER || echo Err
ls: cannot access /ciccio: No such file or directory
Errore
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio || echo Errore
ls: cannot access /ciccio: No such file or directory
Errore
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / || echo Errore
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
The complete syntax is:
command && then-part || else-part
Either the then-part or the else-part can be omitted. Note that the
else-part is also executed when the then-part itself fails.
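The three possible outcomes of this construct can be sketched with the standard commands true and false standing in for real commands (the variable names are hypothetical). The third case shows that the construct is not a strict if-then-else: a failing then-part also triggers the else-part.

```shell
#!/bin/sh
# command succeeds: only the then-part runs
r1=$(true  && echo "then" || echo "else")
# command fails: only the else-part runs
r2=$(false && echo "then" || echo "else")
# command succeeds but the then-part fails: the else-part runs as well
r3=$(true  && false       || echo "else")
echo "$r1 / $r2 / $r3"
```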
22. STDIN, STDOUT, STDERR
Each process is characterised by three files which are automatically
open when the process is started:
STDIN: standard input, the file for data input. Usually the
keyboard.
STDOUT: standard output, the file for data output. Usually the
screen.
STDERR: standard error, the file for error messages output.
Usually the screen.
Their presence is tied to the execution model of the old mainframe
computers ...
They are associated respectively with file descriptors “0”, “1” and “2” of
the process and are propagated from the parent process to the child.
23. Stream redirection
stdin, stdout and stderr can be redirected to other files, in a
shell command, using the characters <, > and 2>.
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / > list.txt
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ cat list.txt
bin
boot
cdrom
dev
etc
home
home2
initrd.img
lib
lost+found
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
vmlinuz
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
24. Stream redirection
stdin, stdout and stderr can be redirected to other files, in a
shell command, using respectively the characters <, > and 2>.
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / 2> list.txt
bin cdrom etc home2 lib media opt root sbin sys usr vmlinuz
boot dev home initrd.img lost+found mnt proc run srv tmp var
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ cat list.txt
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio 2> list.txt
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ cat list.txt
ls: cannot access /ciccio: No such file or directory
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
25. Stream redirection and output suppression
If you want to suppress the output (or the error messages) of a
command, you can perform redirection to the special file
/dev/null. It is a bottomless sink :-)
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls / > /dev/null
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls /ciccio 2> /dev/null
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
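The redirections seen so far can be combined in one script. This is a sketch under some assumptions: emit is a hypothetical helper that writes one line to each output stream (>&2 sends the output of echo to stderr), and mktemp -d, which is widely available but not strictly POSIX, provides a scratch directory.

```shell
#!/bin/sh
# Scratch directory for the output files (mktemp -d is assumed available).
tmp=$(mktemp -d)

# emit is a hypothetical helper writing one line to stdout and one to stderr.
emit() {
    echo "normal output"
    echo "error message" >&2
}

# Send each stream to its own file.
emit > "$tmp/out.txt" 2> "$tmp/err.txt"
out=$(cat "$tmp/out.txt")
err=$(cat "$tmp/err.txt")

# Discard the error messages only: stdout is still captured.
silent=$(emit 2> /dev/null)

echo "stdout file: $out"
echo "stderr file: $err"
rm -rf "$tmp"
```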
26. Exercise
Write a command which searches for a word in some files and
prints “Found” if at least one file contains the given word, “Not
found” otherwise.
Use the command grep which has the following syntax:
grep <word-to-search> <space-separated-list-of-file
The command prints the lines which contain the word (prefixed by the
file name when more than one file is given) and returns:
0, if at least one file contains the word
1, if no files contain the word
27. Stream pipelining
The stdout of a command can “feed” the stdin of another
command in order to perform stream-based processing. This is
performed by concatenating commands using the single pipe
character “|”. The syntax is:
command-1 | command-2
Example: listing the files whose names include the word “Shell”:
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls
list.txt robotEscher_Eura_blur.eps Shell1.dvi Shell1.nav Shell1.ps Shell1.tex Shell1.vr
Makefile Shell1.aux Shell1.log Shell1.out Shell1.snm Shell1.toc
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ls | grep Shell
Shell1.aux
Shell1.dvi
Shell1.log
Shell1.nav
Shell1.out
Shell1.ps
Shell1.snm
Shell1.tex
Shell1.toc
Shell1.vrb
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
29. Scripting
The shell allows the execution of commands written in a text file
as if they were typed at the keyboard.
Let us suppose we write the text file “listshells”:
✞
#!/bin/sh
ls . | grep Shell
✡✝ ✆
We can “execute” it in one of the following ways:
By invoking the shell and passing the file name as
parameter: sh listshells
By marking the file as executable, with chmod +x
listshells, and then executing it with ./listshells.
31. Scripting and Magic Numbers
✞
#!/bin/sh
ls . | grep Shell
✡✝ ✆
What is the meaning of the first line?
In general, the first two (or four) bytes of a file make up the so-called
magic number, which is used to uniquely identify the file type.
The exec system function reads the magic number and, if it knows
the type, executes the file accordingly.
7F 45 4C 46, an ELF (executable and linkable format) file
CA FE BA BE, a Java class file
’M’ ’Z’, a Windows EXE file
’#’, ’!’, an interpreted file, the pathname which follows is the
interpreter
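The magic number of a script can be inspected by reading its first two bytes. A minimal sketch, assuming that head -c and mktemp -d are available (both are common but not strictly POSIX); the file name hello is hypothetical.

```shell
#!/bin/sh
# Scratch directory (mktemp -d is assumed available).
tmp=$(mktemp -d)
# Create a tiny script whose first line names its interpreter.
printf '#!/bin/sh\necho hello\n' > "$tmp/hello"
# The first two bytes are the magic number of an interpreted file.
magic=$(head -c 2 "$tmp/hello")
echo "magic number: $magic"
rm -rf "$tmp"
```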
32. Scripting and Arguments
We want to search for the word not in the current dir but in a
given dir, so let’s “parametrise” our script:
✞
#!/bin/sh
ls $1 | grep Shell
✡✝ ✆
The word is now searched for in the directory given as first parameter
when invoking the script:
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$ ./listshells /home/corrado
Shell.txt
corrado@Corrado-1215P:˜/didattica/MasterCloud/shell$
$1, $2, $3, ... are special variables which expand to the relevant
positional parameter given in the command line.
$0 is the name of the command itself.
They play the same role as argv in a C program.
The variable $# expands to the number of parameters.
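Positional parameters can be tried without creating a script file, because a shell function receives its own $1, $2, ... and $#, just as a script does. In this sketch show_args is a hypothetical function name.

```shell
#!/bin/sh
# show_args is a hypothetical function: inside it, $1, $2 and $#
# refer to the arguments of the function call.
show_args() {
    echo "count=$# first=$1 second=$2"
}

summary=$(show_args alpha beta gamma)
echo "$summary"
```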
33. The “if” conditional expression in script
We have already seen the conditional expression made of double
ampersand “&&” and double pipe “||”. It can be used also in
scripts.
However there is a “more meaningful” conditional expression
based on the construct if-then-else:
if command
then
... then part
else
... else part
fi
34. Scripting and Conditionals
Let us analyse the following script:
✞
#!/bin/sh
if ls $1 | grep Shell > /dev/null
then
echo "Found"
else
echo "Not Found"
fi
✡✝ ✆
35. Conditionals and the Test Program
The if construct takes as argument a “program invocation”
and behaves according to the exit value of the program
itself.
If the program exits with success (exit code = 0), the then
part is executed, otherwise the else part.
But, can we use “classical conditionals”, such as
comparison of variables, etc., in an if construct?
The answer is Yes!, by using the test program.
It is a Unix tool which takes as arguments, by
means of a proper syntax (see man test), the type of test
needed and returns
Success (exit code = 0), if the test succeeds
Failure (exit code != 0), if the test fails
36. Test Example
The following script prints “Starting something” if the literal
“start” is given as the first parameter of the script itself; otherwise it
prints “Invalid parameter”.
✞
#!/bin/sh
if test "$1" = start
then
echo Starting something
else
echo Invalid parameter
fi
✡✝ ✆
The following script prints “File exists” if the file with the path
given as argument is present in the file system; “File DOES NOT
exist”, otherwise.
✞
#!/bin/sh
if test -e "$1"
then
echo File exists
else
echo File DOES NOT exist
fi
✡✝ ✆
37. Conditionals and the Test Program
The test program has an alias which has a strange name:
“[”
The meaning of a “square bracket” is to make the test
statement more similar to a boolean condition of a
classical programming language
If “[” is used instead of “test”, the last argument must be
“]”.
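A short sketch of the equivalence between the two forms, which also shows why it is safer to enclose variables in double quotes inside a test. The variable names are hypothetical.

```shell
#!/bin/sh
word=start
# The two forms are equivalent: both exit with status 0 here.
test "$word" = start
r_test=$?
[ "$word" = start ]
r_bracket=$?

# Pitfall with an empty, unquoted variable: the test becomes "[ -e ]",
# which only checks that the string "-e" is not empty, so it succeeds.
empty=
[ -e $empty ]
r_unquoted=$?
# With quotes, the empty path is really tested and the test fails.
[ -e "$empty" ]
r_quoted=$?

echo "$r_test $r_bracket $r_unquoted $r_quoted"
```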
38. Test Example with [
The following script prints “Starting something” if the literal
“start” is given as the first parameter of the script itself; otherwise it
prints “Invalid parameter”.
✞
#!/bin/sh
if [ "$1" = start ]
then
echo Starting something
else
echo Invalid parameter
fi
✡✝ ✆
The following script prints “File exists” if the file with the path
given as argument is present in the file system; “File DOES NOT
exist”, otherwise.
✞
#!/bin/sh
if [ -e "$1" ]
then
echo File exists
else
echo File DOES NOT exist
fi
✡✝ ✆
39. Iteration: the “for” construct
The shell provides an internal command to perform
iterations
Iteration is always performed through a list of
space-separated literals
As in similar constructs, an iteration variable must be
specified.
The syntax of for is:
for iteration-variable in list of literals
do
... body of the iteration
done
40. For Example
The following script iterates over numbers from 1 to 10 and prints the
iteration value.
✞
#!/bin/sh
for i in 1 2 3 4 5 6 7 8 9 10
do
echo $i
done
✡✝ ✆
The following script iterates over numbers from 1 to 10 and prints
“Hello” for numbers less than or equal to 5 and “World” for numbers
greater than 5.
✞
#!/bin/sh
for i in 1 2 3 4 5 6 7 8 9 10
do
if [ $i -le 5 ]
then
echo Hello
else
echo World
fi
done
✡✝ ✆
41. Variables
A variable can be set using the syntax (with no spaces around “=”):
var-name=literal-value
It can be read and used in other statements with the syntax:
$var-name
There is no algebra in an assignment! The value is a plain string, so
writing an operation in it does not compute anything
✞
#!/bin/sh
for i in 1 2 3 4 5 6 7 8 9 10
do
if [ $i -le 5 ]
then
echo Hello
else
echo World
fi
done
✡✝ ✆
42. Capturing the output (stdout) of a command
Can we capture the output (stdout) of a command in a
variable, in order to use it in a conditional or iteration
expression?
For example, can we capture the output of ls in order to
make operations with the listed files?
We can do that by using the reverse-quote symbol: “`”
The following example gathers the output of ls into variable
FILES and then prints it:
✞
#!/bin/sh
FILES=`ls`
echo $FILES
✡✝ ✆
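POSIX shells also accept the form $(command), which is equivalent to the reverse quotes and easier to nest. A minimal sketch, assuming mktemp -d is available; the file names are hypothetical.

```shell
#!/bin/sh
# Scratch directory with two known files (mktemp -d is assumed available).
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt"

# $(...) captures stdout exactly like the reverse quotes.
FILES=$(ls "$tmp")

# Unquoted expansion: the newlines between the names become spaces.
flat=$(echo $FILES)
echo "$flat"
rm -rf "$tmp"
```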
43. Example
Let us write a script which, given a path name, prints only
the name of directories present in that path
We can use test -d FILE to check if FILE exists and is a
directory
✞
#!/bin/sh
for file in `ls $1`
do
if [ -d $1/$file ]
then
echo File $1/$file is a directory
fi
done
✡✝ ✆
44. Executing Mathematical Operations
But... given that a variable contains a numeric literal, can
we perform some mathematical operations on it?
The let internal command can execute mathematical
operations on variables
It is available in bash (and in some other shells, such as ksh), but
not in plain POSIX shells (sh, dash, etc.)
✞
#!/bin/bash
for i in 1 2 3 4 5 6 7 8 9 10
do
let "result = $i * 2"
echo $result
done
✡✝ ✆
45. Executing Mathematical Operations
In any case, for mathematical and other operations, there
is the expr program
It executes the mathematical operation given as argument
and prints the result
It also can perform operations on strings, such as
computing the length, extracting a substring, etc.
If it is enclosed into a reverse-quoted string, the output can
be gathered into another variable
The following example prints the value of the iteration
variable times 2 (quoting the “*” with backslash is
necessary since the basic meaning of “*” is “all files”):
✞
#!/bin/sh
for i in 1 2 3 4 5 6 7 8 9 10
do
result=`expr $i \* 2`
echo $result
done
✡✝ ✆
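As an alternative to expr, POSIX shells also provide the arithmetic expansion $((expression)), which is evaluated by the shell itself without spawning a process. The following sketch (the variable names are hypothetical) sums the doubles of the numbers from 1 to 5.

```shell
#!/bin/sh
total=0
for i in 1 2 3 4 5
do
    # Inside $(( )) variables can be written without "$" and "*" needs no quoting.
    double=$((i * 2))
    total=$((total + double))
done
echo "$total"
```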
46. Generating Numerical Sequences
The examples given so far use a sequence of numbers in a for
statement, which is given manually
By using the seq program we can automatically generate a
certain numeric sequence
Syntax is:
seq start end
seq start increment end
✞
#!/bin/sh
for i in `seq 1 10`
do
result=`expr $i \* 2`
echo $result
done
✡✝ ✆
47. Iteration: the “while” construct
The shell provides also a while construct
The condition tested has the same semantics as that of if: it
runs a command and tests the exit value
The syntax of while is:
while command
do
... body of the iteration
done
48. While example
Let us write a script which iterates on the arguments given
and prints each argument
We use the internal command shift which performs a
“parameter shifting”, i.e. removes the first parameter from
the list and “shifts left” the others; it also decrements the
variable $# “number of parameters”.
✞
#!/bin/sh
while [ "$#" -gt "0" ]
do
echo $1
shift
done
✡✝ ✆
49. Conditionals: the “case” construct
Like any structured programming language, the shell also
has a case construct
The syntax of case is:
case expr in
first-case)
body of first case
;;
second-case)
body of second case
;;
....
*)
body of default case
;;
esac
50. Case example
✞
#!/bin/sh
case $1 in
start)
echo starting...
;;
stop)
echo stopping...
;;
*)
echo Invalid command
;;
esac
✡✝ ✆
51. Reading stdin
The stdin can be read using the command read, its syntax
is:
read [ varname ]
If varname is specified, the input is stored into it, otherwise
the built-in variable $REPLY is used
✞
#!/bin/bash
# the -n option in echo suppresses the trailing newline
echo -n "Insert a string: "
read
echo $REPLY
✡✝ ✆
52. Exercise
Let us write a script which performs a recursive copy of all
the files of a source dir into a destination dir
For each file:
if it is regular, it has to be copied directly;
if it is a directory, it has to be created (if it does not exist) or
re-created (if it exists)
To write the script, we can use the “find dir” program,
which prints all the files of dir scanning recursively also the
sub-dirs.
53. Exercise 1
✞
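#!/bin/bash
# NOTE: dest-dir must be an absolute path, because the script
# changes directory to source-dir before copying.
if [ "$#" -ne "2" ]
then
    echo usage: $0 source-dir dest-dir
    exit 1
fi
# stop if source-dir cannot be entered
cd "$1" || exit 1
# NOTE: iterating over the output of find breaks on names containing spaces
for file in `find .`
do
    if [ "$file" == "." ]
    then
        continue
    fi
    if [ -d "$file" ]
    then
        # the file is a directory
        if [ -e "$2/$file" ]
        then
            rm -rf "$2/$file"
        fi
        mkdir "$2/$file"
    else
        # the file is a regular file?
        cp "$file" "$2/$file"
    fi
done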
✡✝ ✆
54. Exercise 2
Exercise 1 +
1 check if the dest dir is not empty; in this case do not
perform the copy unless option “-w” is specified
2 check if the dest dir is not empty; in this case, ask to
overwrite the content unless option “-w” is specified which,
instead, overwrites the content by default
55. Exercise 2.1
✞
#!/bin/bash
if [ "$1" == "-w" ]; then
FORCE_OVERWRITE=y
shift
else
FORCE_OVERWRITE=n
fi
if [ "$#" -ne "2" ]; then
echo usage: $0 source-dir dest-dir
exit 1
fi
cd $1
for file in `find .`; do
if [ "$file" == "." ] ; then
continue
fi
if [ -d $file ] ; then
# the file is a directory
if [ -e $2/$file ] ; then
if [ "$FORCE_OVERWRITE" == "n" ] ; then
echo Destination directory $2/$file is not empty
exit 2
else
rm -rf $2/$file
fi
fi
mkdir $2/$file
else
# the file is a regular file?
cp $file $2/$file
fi
done
✡✝ ✆
56. Exercise 2.2
✞
#!/bin/bash
if [ "$1" == "-w" ] ; then
FORCE_OVERWRITE=y
shift
else
FORCE_OVERWRITE=n
fi
if [ "$#" -ne "2" ] ; then
echo usage: $0 source-dir dest-dir
exit 1
fi
cd $1
for file in `find .`; do
if [ "$file" == "." ] ; then
continue
fi
if [ -d $file ] ; then # the file is a directory
if [ -e $2/$file ] ; then
if [ "$FORCE_OVERWRITE" == "n" ] ;then
echo -n "Destination directory $2/$file is not empty, do you want to overwrite
read
if [ "$REPLY" != "y" ] ; then
echo "Copy aborted"
exit 2
fi
fi
rm -rf $2/$file
fi
mkdir $2/$file
else
# the file is a regular file?
cp $file $2/$file
fi
done
✡✝ ✆
57. Using stdout as argument in a pipe
Let us suppose we want to search for a string in the C
source files of our project which is organised with some
directories
We should:
1 recursively scan the main dir
2 filter only “.c” files
3 search for the pattern into such filtered “.c” files
Can we make it using a single-line command with some
pipelines?
58. Using stdout as argument in a pipe
1 recursively scan the main dir
2 filter only “.c” files
✞
$ find . | grep "\.c$"
...
...
...
./inemo/source/sensors.c
./inemo/source/imu_fusion.c
./inemo/source/timers.c
./inemo/source/telemetry.c
./ghettopilot-read-only/src/imu2/twi.c
./ghettopilot-read-only/src/imu2/imu.c
./ghettopilot-read-only/src/imu/nunchuck_test.c
$
✡✝ ✆
Notice: grep '\.c$' means “find strings terminating with .c”
3 search for the pattern in the filtered “.c” files: we
should take the output above and use it as the argument list of
another grep command, e.g.
grep myfunc the output above
59. Using stdout as argument in a pipe
The xargs command reads data from stdin and uses it as
the arguments of another (given) command:
1 recursively scan the main dir
2 filter only “.c” files
3 search for the pattern into such filtered “.c” files
✞
$ find . | grep "\.c$" | xargs grep main
./inemo/source/imu_fusion.c:int main(void)
./ghettopilot-read-only/src/imu2/main.c:int main()
./ghettopilot-read-only/src/imu/nunchuck_test.c:int main(void)
./ghettopilot-read-only/src/imu/main.c:int main(void)
./ghettopilot-read-only/src/imu/integration1.c:int main(void)
$
✡✝ ✆
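The pipeline above splits the file names on white space, so it can fail on names containing spaces. A possible alternative, sketched here on a hypothetical directory tree created just for the test, lets find do both the filtering (-name) and the invocation of grep (-exec ... {} +), passing each name as one argument. The option -l makes grep print only the names of the matching files; mktemp -d is assumed available.

```shell
#!/bin/sh
# Hypothetical project tree, with a directory name containing a space.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/my dir"
printf 'int main(void) { return 0; }\n' > "$tmp/src/my dir/prog.c"
printf 'main is mentioned here too\n' > "$tmp/src/notes.txt"

# Only the ".c" files are passed to grep, each as a single argument.
matches=$(cd "$tmp" && find . -name '*.c' -exec grep -l main {} +)
echo "$matches"
rm -rf "$tmp"
```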
61. What is a regular expression?
A Unix regular expression (regex) is a string which, by
using a well-defined syntax, specifies a pattern to be matched
A regex is represented by a string in which each character
has a well-defined meaning
Regexes are used in many shell utilities (grep, find, sed,
awk, etc.), provided natively in some scripting languages
(perl) and also in libraries for “classical” programming
languages (C, C++, Java, Python, etc.).
62. Regex and POSIX
POSIX defines the “standard syntax” of regex, however some
programs adopt their own dialect.
Therefore, there are many “regex types”:
posix-basic
posix-extended
posix-awk
posix-egrep
emacs
...
63. Some syntactical elements
^ Matches the beginning of a string
$ Matches the end of a string
. Matches any character
* Matches zero or more of the previous specification
+ Matches one or more of the previous specification
? Matches zero or one of the previous specification
\ Quotes the next character
64. An example
Let us find all files terminating with .java in dir
/home/corrado/software:
✞
find /home/corrado/software -regex "^.+\.java$"
✡✝ ✆
We are using the find command with the -regex option.
^ matches the beginning of the string
.+ matches one or more of any character
\. matches the dot “.”
java matches the string java
$ matches the end of the string
65. Begin and End of the string
Are ^ and $ always necessary? Let us try these new
commands:
✞
find /home/corrado/software -regex java
find /home/corrado/software | grep java
✡✝ ✆
Even though grep also uses regexes, the behaviour of the two
commands is quite different:
The former matches only if the whole string is java (there
are implicit “start” and “end” markers)
The latter matches any string which includes the word
java
An excerpt from find man page:
✞
-regextype type
Changes the regular expression syntax understood by -regex and -iregex tests
which occur later on the command line. Currently-implemented types are
emacs (this is the default), posix-awk, posix-basic, posix-egrep and
posix-extended.
✡✝ ✆
66. Regex equivalence in find and grep
Find the file whose pathname is java:
✞
find /home/corrado/software -regex java
find /home/corrado/software | grep "^java$"
✡✝ ✆
Find the files whose pathname includes java:
✞
find /home/corrado/software -regex ".*java.*"
find /home/corrado/software | grep java
✡✝ ✆
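The difference between the two matching rules can be verified on a small tree. This is a sketch under some assumptions: the directory and file names are hypothetical, mktemp -d is available, and find supports the -regex option (as the GNU and BSD versions do).

```shell
#!/bin/sh
# Hypothetical tree: a directory "java" containing "Main.java".
tmp=$(mktemp -d)
mkdir "$tmp/java"
touch "$tmp/java/Main.java"

# -regex must match the whole path: "./java" is not equal to "java".
whole=$(cd "$tmp" && find . -regex 'java')
# Adding .* on both sides makes find behave like an unanchored grep.
anywhere=$(cd "$tmp" && find . -regex '.*java.*' | sort)
grepped=$(cd "$tmp" && find . | grep java | sort)

echo "whole-path match: [$whole]"
echo "$anywhere"
rm -rf "$tmp"
```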
67. Other syntactical elements
[...] Matches one of the enclosed characters:
[abc] Matches “a” or “b” or “c”
[aeiou] Matches a lowercase vowel
[AEIOUaeiou] Matches a vowel no matter its case
[start-end] Matches one of the characters from start to
end:
[a-z] Matches characters from “a” to “z”
[a-z_] Matches characters from “a” to “z”, or the
underscore
[a-zA-Z] Matches characters from “a” to “z” or from “A” to
“Z”
[^start-end] Matches any character NOT in the enclosed
set:
[^ab] Matches all chars except “a” and “b”
68. Examples
Regex for classical C or Java identifiers (variables): they must
begin with a lowercase or uppercase letter or underscore, and
then continue with zero or more letters or numbers or
underscore:
✞
"^[A-Za-z_][A-Za-z0-9_]*$"
✡✝ ✆
Match all chars excluding spaces:
✞
"[^ ]*"
✡✝ ✆
69. Repeated set specification
We can specify that a certain pattern must be present n times,
or from n to m times:
\{n\} The previous pattern must be present exactly n
times
\{n,m\} The previous pattern must be present from n to m
times
This specification belongs to “posix-basic” regex (not
“emacs” regex), so in the find command we must specify
the -regextype option.
Find all files whose names include a double “l”:
✞
find . -regextype posix-basic -regex ".*l\{2\}.*"
✡✝ ✆
Find all files whose names include from 1 to 2 “l”:
✞
find . -regextype posix-basic -regex ".*l\{1,2\}.*"
✡✝ ✆
70. Backreference
If a matched set has to be matched again, the matching
expression can be “grouped” with “\(” and “\)”, and then
“backreferenced” with “\n”, where n is the number of the group.
Find all files whose names start with two identical uppercase characters
(find matches the regex against the whole path, e.g. ./AAfile, hence the
leading “.*/”):
✞
find . -regextype posix-basic -regex ".*/\([A-Z]\)\1[^/]*"
✡✝ ✆
Find all files whose names are five-letter lowercase palindromes:
✞
find . -regextype posix-basic -regex ".*/\([a-z]\)\([a-z]\)[a-z]\2\1"
✡✝ ✆
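Backreferences can also be tried with grep, which uses POSIX-basic syntax by default and matches anywhere in a line (the option -x asks for a match of the whole line). A minimal sketch on a hypothetical list of words:

```shell
#!/bin/sh
# Hypothetical word list, one word per line.
words='hello
world
AAron
level
radar'

# Words containing the same character twice in a row.
doubled=$(printf '%s\n' "$words" | grep '\(.\)\1')
# Five-letter lowercase palindromes (-x: the whole line must match).
palindromes=$(printf '%s\n' "$words" | grep -x '\([a-z]\)\([a-z]\)[a-z]\2\1')

echo "$doubled"
echo "$palindromes"
```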
71. Grouping (POSIX-extended)
In POSIX-extended syntax, grouping can be also used to
specify OR-based pattern matching:
List all files whose names start with ONE or TWO:
✞
ls | egrep "^(ONE|TWO).*"
✡✝ ✆
egrep is the version of grep which uses POSIX-extended
regexs.
In POSIX-extended, grouping is defined with ( and ) (without
backslashes).
73. What is a “sed”?
sed is a Unix utility whose name means “stream editor”
It performs a programmable sequence of
processing actions (a script) on the data coming from
STDIN
Data processed is then copied to STDOUT, so the basic
usage of sed is within a shell “pipeline”
The syntax is:
sed 'script' Read from stdin, apply the script and
write to stdout;
sed 'script' input-file Read from input-file, apply
the script and write to stdout;
sed 'script' input-file > output-file Read
from input-file, apply the script and write to output-file.
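The three forms can be compared side by side. The sketch below works in a scratch directory created with mktemp; the file names input.txt and output.txt are placeholders.

```shell
#!/bin/bash
workdir=$(mktemp -d)                      # scratch directory, removed at the end
printf 'hello world\n' > "$workdir/input.txt"

# 1. Read from stdin, write to stdout.
from_stdin=$(printf 'hello world\n' | sed 's/hello/goodbye/')
# 2. Read from a file, write to stdout.
from_file=$(sed 's/hello/goodbye/' "$workdir/input.txt")
# 3. Read from a file, redirect stdout to another file.
sed 's/hello/goodbye/' "$workdir/input.txt" > "$workdir/output.txt"
to_file=$(cat "$workdir/output.txt")

echo "$from_stdin"
rm -r "$workdir"
```

In all three cases the input file itself is left unchanged.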
74. Basic usage of sed: find and replace
Scripting in sed is performed using commands made
of single characters (this can make a sed script hard
to read).
A basic usage of sed is to find a certain pattern in the
incoming stream and replace it with another text.
The syntax of the script is the following:
s/pattern-to-search/replacement-text/
The pattern to search can also be a regex.
75. Find and replace: Examples
Show file “data.txt” and replace all occurrences of “ONE” with
“TWO” (the g flag replaces every occurrence on a line, not only the
first one):
✞
cat data.txt | sed 's/ONE/TWO/g'
✡✝ ✆
List all file names of the current dir and replace all “.PDF”
extensions with “.pdf”:
✞
ls . | sed 's/\.PDF$/.pdf/'
✡✝ ✆
List all file names of the current dir and remove any 1-to-3 char
extension:
✞
ls . | sed 's/\..\{1,3\}$//'
✡✝ ✆
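Two details of these examples are worth checking on fixed strings: the g flag and the escaped dot. A minimal sketch follows; the input strings are invented.

```shell
#!/bin/bash
line='ONE ONE ONE'

# Without g only the first occurrence on each line is replaced.
first_only=$(echo "$line" | sed 's/ONE/TWO/')
every_one=$(echo "$line" | sed 's/ONE/TWO/g')

# An unescaped dot matches any character, so "XPDF" is rewritten too.
unescaped=$(echo 'REPORTXPDF' | sed 's/.PDF$/.pdf/')
escaped=$(echo 'REPORTXPDF' | sed 's/\.PDF$/.pdf/')

echo "$first_only"
echo "$every_one"
echo "$unescaped"
echo "$escaped"
```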
76. Find and replace: Examples
As for regex, in sed a search pattern can use backreferences.
Given a list of pairs “lastname” “firstname” in file “names.txt” ...
✞
Santoro Corrado
Verdi Giuseppe
Rossi Mario
Bianchi Francesco
✡✝ ✆
... let us generate a new file with “firstname” and “lastname” (-r
option enables posix-extended regex):
✞
cat names.txt | sed -r 's/([A-Za-z]+) ([A-Za-z]+)/\2 \1/' > names_2.txt
✡✝ ✆
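The swap can be tried without creating any file. In this sketch swap_names is a hypothetical helper name, and -E is used as the GNU sed synonym of -r.

```shell
#!/bin/bash
# Hypothetical helper: swaps the first two words of each line read from stdin.
swap_names() {
    # \1 and \2 refer to the first and second parenthesised group.
    sed -E 's/([A-Za-z]+) ([A-Za-z]+)/\2 \1/'
}

swapped=$(printf 'Santoro Corrado\nVerdi Giuseppe\n' | swap_names)
echo "$swapped"
```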
77. Sed: standard syntax of script
A sed script is a list of lines of the form:
line-spec command
where line-spec specifies the lines to which the command
is applied; it may be:
number, apply the command to the specified line number
(1-based);
from,to, apply the command to the lines in the interval between
line “from” and line “to”;
$, apply the command to the last line; it can also be used
in from,$;
/pattern/, apply the command to lines matching the pattern
78. Sed: standard syntax of script
A Sed script is a list of lines of the form:
line-spec command
where command may be:
s/.../.../, find and replace command seen before;
d, delete (suppress) the line
q, quit (end processing)
p, print the line when sed is used with the -n option.
“-n” option: the normal behaviour of sed is to print each line
(both matching and not matching); with this option, printing must
be explicitly requested with the p command.
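Each kind of line-spec can be combined with the commands d, q and p. The following sketch shows one case of each on an invented four-line text (GNU sed assumed).

```shell
#!/bin/bash
# Hypothetical input text.
text='# header
alpha
beta
gamma'

# /pattern/ with d: drop the lines beginning with "#".
no_comments=$(printf '%s\n' "$text" | sed '/^#/ d')
# number with q: stop after line 2 (lines 1 and 2 are still printed).
first_two=$(printf '%s\n' "$text" | sed '2 q')
# from,to with p and -n: print only lines 2 to 3.
only_range=$(printf '%s\n' "$text" | sed -n '2,3 p')
# $ with p and -n: print only the last line.
last_line=$(printf '%s\n' "$text" | sed -n '$ p')

echo "$only_range"
echo "$last_line"
```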
79. Some Examples
Given the file “names.txt” ...
✞
# Here is a list of names
Santoro Corrado
...
Bianchi Francesco
✡✝ ✆
... let us generate a new file with “firstname” and “lastname” but
exclude first line:
✞
cat names.txt | sed -r '2,$ s/([A-Za-z]+) ([A-Za-z]+)/\2 \1/' > names_2.txt
✡✝ ✆
Suppress lines beginning with “#”:
✞
cat names.txt | sed -r '/^#/ d' > names_2.txt
✡✝ ✆
Show includes:
✞
cat mysource.c | sed -n '/^#include/ p'
✡✝ ✆
81. What is “awk”?
awk is a Unix tool which performs stream editing, but in a
more versatile way than grep and sed
Syntax of awk scripts is similar to a programming language
As in other stream editors, data is processed from STDIN
and then copied to STDOUT
Its name comes from its developers: A. Aho, P. Weinberger
and B. W. Kernighan.
The syntax is:
awk 'script' Read from stdin, apply the script and write to
stdout.
Since the script may include several lines, it is typically used
within a shell script.
82. Basic awk script syntax
An awk script is a set of lines:
pattern { action(s) }
pattern matches a certain text line coming from stdin;
action(s) is a set of commands (expressed in an imperative
language) to be executed.
Example (show includes):
✞
cat mysource.c | awk '/^#include/ { print }'
✡✝ ✆
Here the print command outputs the entire line.
83. Token Separation
A basic and built-in functionality of awk is token separation,
i.e. the entire line is automatically split into a list of
“space-separated” tokens which can be accessed using $1, $2,
etc.
Example: show owners and filenames
✞
ls -l | awk '{ print $3, $9 }'
✡✝ ✆
We can also use the keywords BEGIN and END in the pattern
part to specify some commands to be executed at the
beginning and the end of the input stream:
✞
#!/bin/bash
ls -l | awk \
'BEGIN { print "Owner Filename" }
{ print $3, $9 }'
✡✝ ✆
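Since the output of ls -l depends on the machine, token separation is easier to follow on a fixed listing. The two lines below are invented but follow the usual ls -l layout, where the owner is field 3 and the file name is field 9.

```shell
#!/bin/bash
# Hypothetical listing in "ls -l" layout.
listing='-rw-r--r-- 1 corrado users 120 Jan 10 10:00 notes.txt
-rw-r--r-- 1 root root 42 Jan 11 09:30 todo.txt'

owners=$(printf '%s\n' "$listing" | awk \
'BEGIN { print "Owner Filename" }
{ print $3, $9 }')

echo "$owners"
```

Note that a file name containing spaces would be split over several fields, so $9 would hold only its first word.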
84. Variables in awk
Variables may appear in a script. They are untyped (or
runtime-typed) and arithmetic operations are possible.
Example: Show owner and filename and also count files:
✞
#!/bin/bash
ls -l | awk \
' BEGIN {
    nfiles=0;
    print "Owner Filename";
}
{ print $3, $9; nfiles=nfiles + 1; }
END { print "There are", nfiles, "files"; } '
✡✝ ✆
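Variables can also accumulate values taken from the fields. In the sketch below the listing is invented; the variables bytes and nfiles are not initialised, because awk treats an unset variable as 0 in arithmetic.

```shell
#!/bin/bash
# Hypothetical listing in "ls -l" layout: field 5 is the size in bytes.
listing='-rw-r--r-- 1 corrado users 120 Jan 10 10:00 notes.txt
-rw-r--r-- 1 root root 42 Jan 11 09:30 todo.txt'

# Sum the sizes and count the lines; print both at the end of the input.
total=$(printf '%s\n' "$listing" | awk \
'{ bytes += $5; nfiles++ }
END { print nfiles, "files,", bytes, "bytes" }')

echo "$total"
```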
85. Conditionals in awk
if constructs can appear in an awk script. The syntax is C-like:
if (condition) { then part }
else { else part }
Example: Show owner and filename and also count files, but stop after
10 files
✞
#!/bin/bash
ls -l | awk \
' BEGIN {
    nfiles=0;
    print "Owner Filename";
}
{ print $3, $9;
  nfiles=nfiles + 1;
  if (nfiles == 10) {
      exit 1;
  }
}
END { print "There are", nfiles, "files"; } '
✡✝ ✆
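In awk, an exit executed in the main part stops reading the input, but the END block is still run, and the value given to exit becomes the exit status of awk. A small sketch with invented input:

```shell
#!/bin/bash
status=0
# Five input lines, but the script stops at the third one.
result=$(printf '%s\n' a b c d e | awk \
'{ n++;
   if (n == 3) {
       exit 1;      # stop reading; END is still executed
   }
}
END { print "saw", n, "lines" }') || status=$?

echo "$result (exit status $status)"
```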
86. Conditionals in awk
if constructs can use the special operator “~” to match a regular
expression. The regex is expressed in the usual way: /regex/
Example: Show owner and filename, but only for files beginning
with “test”
✞
#!/bin/bash
ls -l | awk \
' BEGIN {
    print "Owner Filename";
}
{
    if ($9 ~ /^test/) {
        print $3, $9;
    }
}'
✡✝ ✆
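The same test can be written in the pattern part instead of inside an if, and “!~” selects the lines that do not match. A sketch on an invented list of file names:

```shell
#!/bin/bash
# Hypothetical file names, one per line.
files='test_a.c
main.c
test_b.c
mytest.c'

# Match inside an if construct.
with_if=$(printf '%s\n' "$files" | awk '{ if ($1 ~ /^test/) { print $1 } }')
# Same match used directly as the pattern.
with_pattern=$(printf '%s\n' "$files" | awk '$1 ~ /^test/ { print $1 }')
# Negated match.
negated=$(printf '%s\n' "$files" | awk '$1 !~ /^test/ { print $1 }')

echo "$with_pattern"
echo "$negated"
```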
87. Other Structured Constructs in awk
In addition to if, awk provides the classical constructs of structured
programming, including:
while
do while
for
switch/case (GNU awk only)
break/continue
The syntax is C-like, e.g.
✞
#!/bin/bash
awk \
' BEGIN { for (i = 0; i < 10; i++) { print i; } }'
✡✝ ✆
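A sketch of while and break used together, printing the powers of two up to 16. The script has only a BEGIN block, so awk does not read any input.

```shell
#!/bin/bash
powers=$(awk \
'BEGIN {
    n = 1;
    while (1) {                 # endless loop, left with break
        if (n > 16) { break; }
        print n;
        n = n * 2;
    }
}')

echo "$powers"
```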
88. Records, fields and built-in variables
Awk, in performing stdin processing, uses the following concepts:
A record corresponds to a “unit of processing”, by default a line coming
from stdin;
A record is composed of one or more fields, which are literals
separated by a given symbol, by default the space.
To help processing, awk provides the following built-in variables which
may be read and written:
NR, number of records read so far, that is, the current line number
NF, number of fields of the current record
FS, the field separator (space by default)
RS, the record separator (newline by default)
These variables may be changed in order to process a file/stream
with a specific format.
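NR and NF can be printed for each record to see how awk splits the input. In this sketch the three input lines are invented; $NF is the last field, because NF holds the number of fields.

```shell
#!/bin/bash
# Hypothetical input with a different number of words per line.
text='one two three
four five
six'

numbered=$(printf '%s\n' "$text" | awk '{ print NR ": " NF " fields, last is " $NF }')

echo "$numbered"
```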
89. Processing a file with specific format in awk
Let us consider the /etc/passwd file:
✞
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
corrado:x:1000:1000:Corrado,,,:/home/corrado:/bin/bash
✡✝ ✆
Here the field separator is “:”, so we can process the file in awk by
changing the FS variable. E.g. let us write a script which prints the
login name and the associated home directory:
✞
#!/bin/bash
cat /etc/passwd | awk ' BEGIN { FS=":" } { print $1, $6; } '
✡✝ ✆
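The field separator can also be given on the command line with the -F option, which has the same effect as setting FS in a BEGIN block. The sketch below uses two lines copied from the sample file instead of the real /etc/passwd.

```shell
#!/bin/bash
# Two lines taken from the sample passwd file shown above.
passwd_sample='root:x:0:0:root:/root:/bin/bash
corrado:x:1000:1000:Corrado,,,:/home/corrado:/bin/bash'

# -F: sets the field separator to ":" before the first record is read.
homes=$(printf '%s\n' "$passwd_sample" | awk -F: '{ print $1, $6 }')

echo "$homes"
```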
90. Arrays in awk
The awk language provides associative arrays, i.e. arrays
whose index can be of any type, e.g.
mydata[1] = 20;
array["Corrado"] = "Santoro";
Arrays can be used in loops with the statement: for (a
in array) { .... }, where a takes the value of each index in turn
(the order of traversal is not specified).
E.g. let us gather usernames and print them at the end of
processing:
✞
#!/bin/bash
cat /etc/passwd | awk \
'BEGIN { i = 0; FS=":"; }
{ user[i] = $1; i++; }
END { for (u in user) { print user[u]; } }'
✡✝ ✆
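A string index makes it easy to count occurrences. This sketch counts how many users have each login shell, using four lines copied from the sample passwd file; the output is piped through sort because the order of for (... in ...) is not specified.

```shell
#!/bin/bash
# Four lines taken from the sample passwd file shown earlier.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
corrado:x:1000:1000:Corrado,,,:/home/corrado:/bin/bash'

# count[] is indexed by the login shell (field 7).
counts=$(printf '%s\n' "$passwd_sample" | awk -F: \
'{ count[$7]++; }
END { for (shell in count) { print count[shell], shell; } }' | LC_ALL=C sort)

echo "$counts"
```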
91. Other useful shell tools (a non-exhaustive list)
92. Other shell tools
sort sorts input lines to output, several sort criteria can be
expressed (e.g. alphanumeric, numeric, date/time, reverse, etc.)
true a “stupid” program which returns “0”; it can be used to
perform infinite loops, e.g.:
✞
while true; do echo ciao; done
✡✝ ✆
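As for sort, the difference between the alphanumeric and the numeric criteria shows up as soon as the numbers have a different number of digits. A minimal sketch with invented values, using the -n (numeric) and -r (reverse) options:

```shell
#!/bin/bash
# Hypothetical values, one per line.
numbers='10
9
100'

alphabetic=$(printf '%s\n' "$numbers" | sort)      # compares character by character
numeric=$(printf '%s\n' "$numbers" | sort -n)      # compares numeric values
reversed=$(printf '%s\n' "$numbers" | sort -n -r)  # numeric, descending

echo "$alphabetic"
echo "$numeric"
echo "$reversed"
```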
94. Exercises (I)
Print the username given the user id.
List all the usernames which have a “login shell” and order
them ascending.
Given a file including a list of names, print only duplicates.
95. Exercises (II)
Given this /etc/network/interfaces file:
auto eth0
iface eth0 inet static
address 192.0.2.7
netmask 255.255.255.0
gateway 192.0.2.254
auto wlan0
iface wlan0 inet static
address 151.97.6.10
netmask 255.255.255.0
gateway 151.97.6.1
print, for each interface, the following information:
interface name
IP address
gateway address