-
Basic Commands
-
Every Unix command comes in the form:
command-name [-switches] <arguments>
It is essential to realize that one of the first things a Unix shell does when evaluating your command is to split the line into words at the white space (space characters, tabs and new lines).
-
Switches:
-
A switch is usually a single letter, with or without a leading hyphen. Several one-letter switches can also be bundled after a single hyphen, each letter turning on a different option of the command you are calling, eg. :
ls -l (gives a "long" listing of the directory contents)
ls -li (gives a "long" listing which includes the inode numbers)
ls -lih (long listing, inodes, and file sizes in "human readable" form: '1K' rather than '1024')
For some commands the order of the switches is important, such as for :
tar xzvf tarfile.tar.gz
The x (extract) conventionally comes first (though it doesn't have to), and the f switch should come last so that it is followed immediately by the filename. In general newer versions of commands are less sensitive to the order of the switches.
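To convince yourself that bundled switches are just shorthand, here is a small sketch that runs ls both ways and compares the results. The scratch directory and the file name notes.txt are made up for the illustration, and -h is assumed to be available (it is in GNU and BSD ls).

```shell
# Scratch directory so the listing is small and predictable (hypothetical names).
scratch=$(mktemp -d)
touch "$scratch/notes.txt"

# The same three switches, first written separately, then bundled.
separate=$(ls -l -i -h "$scratch")
bundled=$(ls -lih "$scratch")

# Both forms should produce identical output.
if [ "$separate" = "$bundled" ]; then
    echo "ls -l -i -h and ls -lih give the same listing"
fi
```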
GNU endorses switches of the form --switch-name but GNU commands often keep the familiar (to people who have been using Unix for years) one-letter switches coupled to the same function. You may have seen the --nodeps or --force switches for rpm, or the --help switch, which many commands have and which should print out basic usage information.
-
Arguments:
-
Some commands take one argument, such as the archive name in tar; others, such as cp and mv, require two (from and to). eg.
cp from_file ../other_directory/to_file (from_file is in this directory)
mv ../other_directory/from_file ./
(moves from_file from the other directory to this directory (./); from_file keeps the same name in this directory.)
Many commands will take a whole list of arguments and act on each of them. This is often what happens "behind the scenes" when you use wild-cards. More on this in a bit. You must remember that Unix, like Linux, is a moving target. Sometimes there are details which don't make perfect sense. Learning Unix or Linux is a bit like learning a foreign language: different people's ways of thinking don't always fit together perfectly logically. Do not despair, however: computer programmers are a fairly logical bunch, and there are far fewer exceptions than in any human language you have ever tried to learn.
-
Some basic commands for negotiating the file-system, and getting your files where you want them to be.
-
-
pwd mnemonic: print working directory (where the h* am I)
- First and foremost you need to know where you are, or where your shell thinks you are. No command in Unix runs outside the file-system. When you are at the command line, you are somewhere in the file-system (this is true of any Unix system, even one you are accessing through ssh or even ftp). When you log into a shell, which is what you do when you log into any Unix system, you are usually put in your home directory. On Linux this is usually /home/the_user_name_you_have_chosen, or in the special case of the root user, /root. (If you still insist on doing everything as root rather than making a "normal" user account for yourself, I guess you'll have to learn the hard way...)
pwd is normally used without any switches.
-
cd (change directory)
- To change your current working directory you type cd /the_directory_where_you_want_to_go. A pwd after a cd should prove to you that you have in fact changed directories. Note: pwd and cd are what are known as shell built-in commands. They are so essential to your shell that they are built in. We will cover what this means in more depth when we get to the section on shells (shell:LINK)
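A short sketch of the cd-then-pwd check described above. The scratch directory is made up for the example; type is the shell's own way of reporting how it will resolve a command name, which is one way to see that cd is built in.

```shell
# Hypothetical scratch directory to move into.
scratch=$(mktemp -d)

cd "$scratch"        # change the working directory
here=$(pwd)          # ask the shell where it thinks we are
echo "now in: $here"

# 'type' reports how the shell will resolve a command name;
# for cd it reports a shell builtin.
kind=$(type cd)
echo "$kind"
```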
-
ls -a -h -i -l (list directory contents)
- We already saw ls in the overview above. The -a switch shows a listing of all the files including the hidden "dot" files. Many configuration files in your home directory have names which begin with a "dot". These dot files are not listed by default so that your listings are less cluttered. There are many more switches to ls of course, which may one day be incredibly handy or may never be of any use to you. This will be the case for most external commands. Every time you look at a man page is an opportunity to learn about another switch you did not know about, or forgot. I just looked at the ls man page to prepare these notes and found the -A and -B switches which I did not know about and smiled; those will surely be useful to me one day. Sometimes however there are way too many options in the man page for any of them to feel useful (we've all been there). Just remember you can search the man pages with /search_string<enter>, then keep hitting /<enter> for the next matches, or more simply n for the next match. More on searching below.
-
cp -r <SOURCE_file> <DESTINATION_file> (copy)
- cp needs two arguments. SOURCE_file and DESTINATION_file can each be given as a relative or an absolute path (meaning ../DESTINATION or /home/you/DESTINATION); see rel_absolute:LINK. The -r switch is important when you want cp to copy a directory and all its contents. It stands for recursive, because in this case cp has to walk through the directory to get the inodes and filenames for all the files and subdirectories in the directory to be copied, and make a new hierarchy with new directories and files at the new location in the file-system. There's a lot of work under the hood. If you are interested, cp also has a backup function, which might be useful if you don't need anything more complicated.
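Here is a minimal sketch of a recursive copy, run in a scratch directory. The names project, docs, readme.txt and notes.txt are invented for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Build a small tree: project/ with a file and a subdirectory.
mkdir -p project/docs
echo "hello" > project/readme.txt
echo "notes" > project/docs/notes.txt

# With -r, cp walks the whole tree and recreates it at the destination.
cp -r project project_copy

# The copy has the same layout, made of new files and directories.
ls -R project_copy
```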
-
mv <SOURCE_file> <DESTINATION_file> (move)
- mv also requires two arguments. If you want to move a file to your current working directory you can use ./ (see rel_absolute:LINK) as the second argument; this will preserve the filename of the SOURCE_file. It is important to realize that mv is actually a different operation depending on what you are asking the command to do. If you are moving a file within a single partition, mv calls a simple rename function which changes the place where the inode address is held in the file-system (see filesys_inode:LINK). As long as you stay within the same partition, mv does not have to create a new inode, or move any data on the disk. It only changes some file-system entries. If however you move a file or a directory across partitions, a new inode must be created (partitions don't share inodes) and all the data must be copied to the new partition, as well as the file-system entries being updated and the old entry in the file-system unlinked.
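You can watch the "rename only" behaviour for yourself with ls -i, which prints the inode number in front of the name. This is a sketch under the assumption that both names live in the same directory (and so on the same partition); old_name and new_name are made-up file names.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "data" > old_name

# ls -i prints the inode number before the file name.
before=$(ls -i old_name | awk '{print $1}')

mv old_name new_name          # same directory, so same file-system

after=$(ls -i new_name | awk '{print $1}')
echo "inode before: $before, after: $after"
```

If the two numbers match, no new inode was created and no data was copied: only the directory entry changed.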
-
ln -s <filename> <linkname> (link)
-
Often it is useful to have several names for the same file in different directories, often as a shortcut. Sometimes a program you cannot modify (or don't wish to modify) is looking for a file in a certain place, and you want to have that file magically appear there without making a copy. Linking is used extensively for the whole Linux source code tree. You can have several different versions of the kernel in /usr/src, such as /usr/src/linux-2.2.19 , /usr/src/linux-2.4.2 and /usr/src/linux-2.4.18 , and make a symbolic link (that's what the -s switch is for) in /usr/src called linux which points to the actual version of the kernel source you are interested in now.
ln -s /usr/src/linux-2.4.18 /usr/src/linux
if you were already in the /usr/src/ directory you could just type :
ln -s linux-2.4.18 linux
This creates a link called linux in the directory /usr/src which points to /usr/src/linux-2.4.18. If you ls /usr/src/linux you will get exactly the contents of /usr/src/linux-2.4.18. Commands which look through the link, such as cd and ls, treat it as though it were the real directory itself; mv and rm act on the link, not on the directory it points to. Symlinks which are created using absolute filenames can be moved. A symlink created with a relative filename will be broken if it is moved to another directory.
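The difference between relative and absolute symlinks is easiest to see by moving one of each. The sketch below does this in a scratch directory rather than in /usr/src; the directory elsewhere and the link name linux-abs are invented for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir linux-2.4.18 elsewhere
echo "kernel source" > linux-2.4.18/README

# Relative link: stores the text "linux-2.4.18", resolved from the link's own directory.
ln -s linux-2.4.18 linux
# Absolute link: stores the full path to the target.
ln -s "$scratch/linux-2.4.18" linux-abs

cat linux/README              # reads through the link

# Move both links (not their target) into another directory.
mv linux linux-abs elsewhere/

# -e follows the link, so it is false for a dangling link.
if [ -e elsewhere/linux-abs ]; then abs_state=ok; else abs_state=broken; fi
if [ -e elsewhere/linux ];     then rel_state=ok; else rel_state=broken; fi
echo "absolute link: $abs_state, relative link: $rel_state"
```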
-
-
Utilities for looking at files:
-
less-
less is what is called a pager. It allows you to read (but not edit) a text-file or a stream of data. less takes control of your terminal screen and breaks the data into "pages", pieces which fit in your terminal window.
You have perhaps noticed that some commands produce so much data that it flies by your terminal too quickly to read. Some terminals allow you to scroll back through this data, but usually it is better to control the presentation of the data using less. (less is the successor of a command named more, which has fewer features.)
- Usage 1):
less <textfile>
Read textfile.
- Usage 2):
dmesg | less
Read from a pipe (see Filters/Pipes). The dmesg command redisplays the messages from your kernel which scroll by during your system's booting up, but dmesg's output scrolls by too quickly to be read. By piping this data to less, less takes care of the paging, allowing you to read the data one screen-full at a time.
Within less you can search with /search_string<enter>, then keep hitting /<enter> for the next matches, or more simply n for the next match. You can also search backwards using ?. You can also use regular expressions in these searches.
You quit less to return to your shell prompt with q.
cat-
cat can print the contents of a file to standard output (see filters:LINK).
cat textfile
will dump the contents of textfile to the screen and return you to a prompt. Unless your file is very short this is often not extremely useful. (/proc/mounts is one file which is quite useful to cat: cat /proc/mounts shows exactly what devices your kernel knows are mounted on your system. We'll talk more about the /proc and /dev subdirectories on the last day of this class.)
What is most important about cat is that it can dump the contents of a file into a pipe and thus into any Unix command. A file with the name of a different file on every line could be turned into the arguments of another Unix command (with the help of xargs). A file with a different command on every line can be piped to a shell which will execute each command one after the other.
The name cat is short for concatenate. This is because cat can join the contents of several files end to end:
cat file1 file2 file3
will print the contents of each of these files to the standard output (the screen).
cat file1 file2 file3 >sum_file
will dump the contents of the first three files into the file sum_file. Hence the first three files will be concatenated into the fourth.
Power usage: In combination with a pipe, cat will allow you to send a list of shell commands stored in a file (one command per line) to a shell, eg.
cat file_of_commands | bash
(most people will use sh for this, unless there are bash specific things you need, like commands which treat your environment directly.)
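Both uses of cat can be tried safely in a scratch directory. In this sketch the file contents and the two echo commands stored in file_of_commands are made up for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "one"   > file1
echo "two"   > file2
echo "three" > file3

# Concatenate three files into a fourth.
cat file1 file2 file3 > sum_file
cat sum_file

# A file holding one command per line, fed to a shell through a pipe.
printf '%s\n' 'echo first' 'echo second' > file_of_commands
result=$(cat file_of_commands | sh)
echo "$result"
```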
-
Wild cards/Globs:
-
Shells use a set of special characters called wild-cards. These are similar to regular expressions (which we will look at shortly) in that they match patterns, but they are more restricted in what they can do. This kind of pattern matching is also sometimes called globbing. The words wild-card and globbing both refer to a similar idea: using one character to represent a set of characters, the same way a joker in cards can replace any of the other cards. The most common wild card is :
-
*
- which represents any character or group of characters, including no characters at all. It matches anything.
eg: C*T matches CT CAT CAAAAAAAAAAAT Cqp4thpeup7(T^UAQTORT -
?
- matches any single character
eg: C?T matches CAT CaT or C8T but not COAT or CONCAT -
[]
- square brackets allow you to specify a group of characters which could be matched, but any pair of brackets will match only one character. eg: C[AO]T will match CAT or COT but not COAT. You can think of each of the characters within the brackets being separated by an or. You can also match a range of characters, eg: instead of typing [0123456789] to match any digit you can type [0-9], or [a-z] for the lowercase alphabetical characters and [A-Za-z] for upper and lowercase. Be careful however: what a range covers depends on how the characters are ordered in the character set. In ASCII, for example, [A-z] also matches the punctuation characters which sit between Z and a. If you study Perl regular expressions later on, you will find that a lot of work has gone into making these kinds of character set matches portable across different languages. But for accelerating your work on a daily basis the shell's wild-cards aren't bad.
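The three wild-cards can be compared side by side in a scratch directory holding a handful of empty files. The file names below are the ones from the examples above plus DOT, which is added so there is something that should never match. Setting LC_ALL=C is only there to make the order of the expansions predictable.

```shell
scratch=$(mktemp -d)
cd "$scratch"
LC_ALL=C; export LC_ALL     # fix the sort order of the expansions
touch CT CAT COT COAT DOT

# The shell replaces each pattern with the matching names before echo runs.
star=$(echo C*T)
question=$(echo C?T)
bracket=$(echo C[AO]T)

echo "C*T    -> $star"
echo "C?T    -> $question"
echo "C[AO]T -> $bracket"
```

Note that echo never sees the wild-card: by the time it runs, the shell has already turned the pattern into a list of arguments.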
-
-
Quoting
- All the shells have special characters. A blank space, as we saw from the very beginning, separates the different elements of a command-line. We have just seen three wild-card characters, and will soon see the characters for pipes and redirection of the standard io. But what happens if you want to use a special character without its special meaning? Say someone has sent you a file named "who is afraid of virginia woolf?": five white spaces and a question mark, all of which mean different things to the shell. You will have to quote the filename just like I did in the last sentence. To quote just a single character you can use a \. So \? is an actual question mark, \* an actual asterisk, and \ followed by a space an actual space. This is called escaping the character (you remove its special meaning). The above filename escaped would be who\ is\ afraid\ of\ virginia\ woolf\? and the bash shell can do this auto-magically for you (if you know how to ask it to). Double quotes don't take away all the special meanings in a string: the bash shell will still evaluate the variables inside a double quoted string. "this is the $PATH" will still evaluate and expand the $PATH variable. Single quotes will stop all expansion.
-
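A small sketch of the quoting rules just described, using the same awkward filename. The variable name and its value marco are made up for the demonstration; everything happens in a scratch directory.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Quoted, the whole name reaches touch as ONE argument.
touch "who is afraid of virginia woolf?"

# Two more ways to say the same thing: backslash escapes and single quotes.
ls who\ is\ afraid\ of\ virginia\ woolf\?
ls 'who is afraid of virginia woolf?'

# Unquoted, the shell splits the same text into separate words.
set -- who is afraid of virginia woolf?
unquoted_words=$#
echo "unquoted, the shell sees $unquoted_words words"

# Double quotes still expand variables; single quotes do not.
name=marco                      # hypothetical value for the demonstration
double="home of $name"
single='home of $name'
echo "$double"
echo "$single"
```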
Filters/Pipes:
-
In this section you will learn how to combine Unix commands. This is one of the most powerful parts of the Unix interface. Most of the commands can be connected together by the shell in useful and powerful ways. The fact that the commands can be linked together also allows each command to be small: it only has to accomplish a small number of tasks well. The value of this type of modularity has been taking over the software world and is one of the founding ideas in object oriented programming.
>short historical aside
One of the reasons why Unix was designed in this modular way has to do with its history. The Unix system was originally developed as an in-house tool at AT&T. One early version was named PWB, for Programmer's Workbench. From its start it was not a commercial venture, and Bell Labs gave or sold inexpensive licenses to many major universities.* Because of its modularity, universities found Unix particularly well suited to their computing needs. They could add the pieces they needed to the elements which already existed, and many would happily share these programs with other institutions, in the way in which scientists share discoveries. The process of officially making large bodies of programs and their code free started at Berkeley, with a version of Unix developed at the University called BSD, and with Richard Stallman, then at MIT, who founded the Free Software Foundation and the GNU project. All of these projects precede, and made possible, the rapid development and deployment of Linux which we can enjoy today.
<Back to pipes... the pipe "|"
In order for multiple commands to be able to work together they must have a standard interface (or standard io, for input/output). The Unix interface divides this io into three parts:
- standard input (denoted by the number 0)
- by default this is read from the keyboard. Note that it is separate from the command line arguments which you have already started to learn
- standard output (1)
- by default this is dumped to the screen
- standard error (2)
- Also dumped to the screen by default. Not just errors, but also notifications or any information which should go to the screen instead of being passed on to another program when the commands are piped together.
-
Examples
There is a small Unix command which allows you to view the messages which your kernel prints to the screen during the bootstrapping process (you know, all that junk which flies by too quickly to be seen). Well, there are times when it is really quite important to be able to see those messages again. The command dmesg dumps that information to the screen (once again it goes by too quickly to be read). Now either you get really angry at the Unix programmers (why can't they even make anything which is easy to use...), or you calm down and think there must be a reason these intelligent people do things this way. I know of another tool, less, whose sole purpose in life is to take a stream of information, like the one dmesg dumps to the screen, and break it into manageable screen-sized pieces. If I connect the output of dmesg to less, then less can deal with the paging (or preparing of screenfuls of information) and dmesg can go on doing what it does well: dumping information to its standard output. The pipe "|" (above the return key) allows just this. It connects the standard output of one command to the standard input of another :
eg: dmesg | less
Which allows you to see the output of dmesg in the pager less. There are so many different combinations possible using pipes that it is impossible to summarize them, but here is a short list of examples:
- locate gtk | grep include | less
- I want to find the location of gtk's include files on my system. locate gtk produces too much output, and I am looking for the includes, so I pipe to grep include, which keeps only the lines of output from the locate command which contain the string "include". Even this is too much output, so I pipe it to less. Each line in the paged output contains both the string "gtk" and the string "include". I can scroll through this output using ctrl^d and ctrl^u and even further refine my search in less using "/"
- rpm -qa |grep gtk
- rpm -qa lists all the packages on my system which have been installed using rpm. I want to find all the packages whose name contains the string "gtk"
- bzcat ~/downloads/gnome-db-0.2.96.tar.bz2 | tar xvf -
- I have downloaded a tarball compressed using bzip2 compression to the downloads directory in my home directory. bzcat will decompress this file to a pipe, and I can pipe the now decompressed tarball to tar xvf - (the - tells tar to read the archive from its standard input), which will un-tar the tarball. Of course I could do this in two steps rather than one, by calling bunzip2 ~/downloads/gnome-db-0.2.96.tar.bz2, which replaces the compressed file with gnome-db-0.2.96.tar in the downloads directory, and then tar xvf ~/downloads/gnome-db-0.2.96.tar, which produces the un-tarred directory gnome-db-0.2.96 with all its sub-folders. By using the pipe the intermediate gnome-db-0.2.96.tar is never created and never takes up space on the disk. The other day when I upgraded X to 4.2 (~300M source code) on a system where the disk space was tight, the pipe was crucial to this operation.
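Since locate and rpm are not installed everywhere, here is a sketch of the same filtering idea that runs on any system. The function list_packages is a made-up stand-in for a command like rpm -qa, and the package names it prints are invented.

```shell
# Stand-in for the output of a command such as rpm -qa (made-up package names).
list_packages() {
    printf '%s\n' gtk+-1.2.10 glibc-2.2.4 gtk+-devel-1.2.10 bash-2.05 pygtk-0.6.8
}

# Keep only the lines containing "gtk".
list_packages | grep gtk

# Pipes chain: filter, count the surviving lines, strip any padding wc adds.
count=$(list_packages | grep gtk | wc -l | tr -d ' ')
echo "packages matching gtk: $count"

# Narrow further by adding another grep to the chain.
devel=$(list_packages | grep gtk | grep devel)
echo "$devel"
```

Each command in the chain only knows about its own standard input and standard output; the shell does the plumbing.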
-
Redirection < << > >>
The pipe allows the output of a command to get sent to another command, but sometimes you want the output to be recorded in a text file. This is quite easy, use:
>filename or >>filename
The first creates the file "filename" if it does not exist, or empties it if it does, and dumps the output of the command into it. The second, the double greater-than, is the same except that if the file already exists the output is appended to the end of the file.
-
Examples
dmesg >my_dmesgs
saves a copy of the kernel boot messages in the file "my_dmesgs" in your current working directory.
find ~/ -name core -type f -print >core_files
saves the names of all the core files in your home directory (and any of its subdirectories), one filename per line, in the file "core_files" in your current working directory.
corollary 2>
By default the ">" channels the standard output to a file. But it is possible to redirect the standard error using "2>" :
make bzImage 2>bzErr
redirects the error messages you might get while making a kernel to a file named "bzErr", which you could then read over at your leisure, were you to be away from your machine while it is working on the compile.
2>/dev/null
It is also possible that the error messages are cluttering your screen. There is a special file on every Unix system called /dev/null which swallows output you don't want. By redirecting the standard error to /dev/null it effectively disappears.
-
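The redirections above can be tried without dmesg or a kernel build. In this sketch the function talk is a made-up stand-in for any command which writes to both output streams, and the file names are invented. The last lines also show <, which appears in the heading of this section: it feeds a file to a command's standard input.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A small function (hypothetical) that writes one line to each output stream.
talk() {
    echo "result"            # standard output (1)
    echo "warning" >&2       # standard error (2)
}

talk >out.txt 2>err.txt      # send each stream to its own file
talk >>out.txt 2>/dev/null   # append stdout, throw stderr away

echo "apple"  >fruit.txt     # > creates or overwrites
echo "banana" >>fruit.txt    # >> appends
sorted=$(sort -r <fruit.txt) # < feeds a file to standard input
echo "$sorted"
```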
Relative/Absolute file-system address:
-
There are two ways of specifying the location of a file on a Unix system. With a relative path or an absolute path. The absolute path is perhaps easier to understand at first. It specifies the whole address of the file.
/home/user_you
/usr/X11R6/include/X11
/var/apache/htdocs/nylxs/classes/unix1
Are all absolute addresses. They can specify the location of a file from anywhere in the file-system because they show exactly where the file is with respect to the root "/".
There is another way of locating a file, with a relative path. Every command is run in some directory on the file-system. Every time you interact with a shell you are in some current working directory. You can always specify the location of a file with respect to the directory where you are right now. For example: if you want to specify a file in your current working directory, you do not have to specify the whole path to that file, but can just use the name of the file. Say you want to open the file unix_1.html which is in the unix_notes directory in your home directory, and you are in this same directory: you can just type vi unix_1.html rather than the absolute filename, vi /home/marco/unix_notes/unix_1.html.
If you want to refer to a file one directory up you type ../filename. Or if you want a file in another subdirectory two directories up, ../../other_subdir/filename (see More on file-systems and inodes).
-
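Here is a sketch showing the same files reached three ways. It builds a tiny tree in a scratch directory instead of your real home directory; the directory and file names are borrowed from the examples above, and the file contents are made up.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/unix_notes" "$scratch/other_subdir"
echo "notes" > "$scratch/unix_notes/unix_1.html"
echo "other" > "$scratch/other_subdir/filename"

cd "$scratch/unix_notes"

by_name=$(cat unix_1.html)                            # relative: current directory
by_absolute=$(cat "$scratch/unix_notes/unix_1.html")  # absolute: spelled out from the root
by_dotdot=$(cat ../other_subdir/filename)             # relative: up one, then down

echo "$by_name / $by_absolute / $by_dotdot"
```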
More on file-systems and inodes
- In each directory there is a file named . and another named .. . The file named "dot" actually holds a listing of all the names and inodes of the files in this directory. This file is read by ls when you ask ls to list the contents of a directory for you. The "dotdot" file holds the inode address of the parent directory. How does pwd figure out where it is? It asks the .. file in each directory, which points to its parent, then asks the parent directory's .. file, which gives that directory's parent, and so on up the file-system tree. How does pwd know when to stop? There is only one directory in the whole file-system which has itself listed as its parent directory: the root, top level directory "/".
-
Environment:
-
Shells
- When you are logged into a unix system, you are never sending commands directly to the system, but to a shell which lies between you and the system. The shell parses your command lines and sends the different elements to the commands you wish to use. The shell also has a number of functions (or shortcuts) which it can perform for you before calling your command. These include a command history, wild-cards, variables, and the redirection of the standard input and outputs. In order to learn unix you actually learn the shell and a bunch of external commands, which you can combine using the shell. In this class we will be concentrating on bash, one of the shells which has the most functionality, and which is the default on linux systems.
Most of what we are learning is directly usable in any shell. The major differences between shells tend to be around the more advanced aspects of the user interface such as the command history and a few different built in shell commands such as those for setting and un-setting environmental variables (see env:LINK). For an in depth analysis of the difference between shells see (FAQ link) and (powertools) -
Important quitting commands
- If a command you run is out of control or you want to stop it early use ctrl^c. A shortcut for logging out is ctrl^d. If you are in multiple nested sessions, say you have changed user with su a few times and you are in an ssh or ftp session, ctrl^d will log you out of the sessions one by one.
-
Finding, searching, replacing
- You must remember that the wild-cards we have learned already allow you to do quite a bit of searching on your system, at least as far as filenames are concerned. But sometimes you need to search by other criteria: multiple directories at the same time, modification times, permissions, owners, or inside the files themselves. In this section we are going to learn about find and grep.
-
find
-
One of the first facts that separates find from most other unix commands is that find will walk down the directory tree. Every time you use the find command you specify which directory you wish to search; find will search that directory and all its subdirectories, and all the subdirectories on down. If you wanted a list of every file and directory on your whole system you could type find / -print. The -print switch tells find what to do with the matches it comes up with (print to the screen). Some versions of find add the -print by default. The only argument find requires is the directory where it is to start its search. All the other arguments you give find will limit the number of matches it will produce.
- -name
- -name is perhaps the tag you will use most often. You can use wild-cards with the -name tag if you protect them from the shell by quoting them. find takes care of expanding the wild-card internally. If you did not put the wild-cards in quotes, they would be expanded by the shell before find ever saw them. The shell's rules for expansion only provide matches in the current working directory. This is probably not what you wanted.
eg: find ~/ -name "*.html" finds all the files in your home directory whose filename ends in .html
- -type
- will take the types f for file, d for directory, and even c for character device and b for block device, but the first two are most useful. Being able to separate out the directory names from a list is useful, especially if you want to use the list from find for further processing such as through a pipe or in command substitution (see Subst:LINK)
- -mtime [+-]n
- modification time-- the time the contents of the file were last modified.
All of the time switches count in 24 hour increments, and find discards any fraction of a day. -mtime 1 will search for files modified between 24 and 48 hours ago (probably not what you wanted). -mtime -1 will search for files modified less than 24 hours ago. -mtime +1 will search for files modified more than one whole day ago, which after the rounding means at least 48 hours ago.
- -ctime [+-]n
- last time the inode was changed. Be careful. Changing the permissions on a file will change the inode, as will moving the file. This does not mean the contents were modified.
- -atime [+-]n
- last access time. Last time the file was read. It is possible to turn off the access time logging for a partition in /etc/fstab (add the switch noatime to the parameters). This is useful to conserve battery power on laptops. Has nothing to do with find, but useful.
- -amin -cmin -mmin [+-]n
- identical to -atime and co. except that the argument n specifies minutes.
- -exec
- it is possible to execute a command on each result of a search, such as
-exec rm '{}' ';'
which would remove every match. The {} will be replaced by the match. The ; tells -exec where its command ends. Both have to be quoted. -exec can be useful, but in general xargs and back-ticks ` are more versatile (see: Command Substitutions)
- -ok
- same as -exec but prompts you for permission. Wise.
- last but perhaps the most important, -print tells find that you want the matches printed to the screen. Now on my recent Redhat 7.2 the find with which I am doing the testing defaults to including this tag. But you can't count on this being the case everywhere, so I would suggest getting in the habit of using it.
eg: find ~/ -name "*willy*" -user marco -print
Will find all the files in your home directory which have willy in the filename _and_ which are owned by user marco.
Be very careful with the spaces when preparing your find requests: each switch or argument needs to be separated from its neighbors with white-space. You can also negate a test by prefacing it with a ! . When you use multiple arguments and combine "ands" and "ors" you should use parentheses \( and \) (they have to be escaped from the shell) in order to group the expressions correctly. An example straight from p.294 of Unix Powertools :
find ~/ -atime +5 \( -name "*.o" -o -name "*.tmp" \) -print
Will find all the files in your home directory which were last accessed more than 5 days ago and end in either .o or .tmp . Without the parentheses the "and" would bind more tightly than the "or", and the command would not do what you meant.
-
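To experiment with these tests without waiting for find to crawl your home directory, you can build a tiny tree and point find at that. Everything named below (site, img, the file names, the date given to touch) is made up for the sketch.

```shell
scratch=$(mktemp -d)
cd "$scratch"
LC_ALL=C; export LC_ALL                  # predictable ordering from sort
mkdir -p site/img
touch site/index.html site/img/logo.png site/build.o site/cache.tmp
touch -t 200001010000 site/old.html      # give one file a very old modification time

# Quote the pattern so find, not the shell, expands it.
html=$(find site -name "*.html" -type f -print | sort)

# Only directories.
dirs=$(find site -type d -print | sort)

# Modified more than a day ago: only the back-dated file qualifies.
stale=$(find site -type f -mtime +1 -print)

# Grouping: (name ends in .o OR name ends in .tmp) AND is a regular file.
junk=$(find site \( -name "*.o" -o -name "*.tmp" \) -type f -print | sort)

echo "$html"; echo "$dirs"; echo "$stale"; echo "$junk"
```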
locate
- find can be quite slow. Every time you call find, you are searching through all the inodes of the directories which you have pointed find at. This can produce a lot of disk activity. A lot of disk activity can really slow a system down. Enter locate.
locate calls on a database to do its searches. This is much faster than find and definitely the way to go if you are trying to find out where your distribution keeps certain files. Searches with locate are much more limited than what you can do with find, but often you are just looking for the name of a file anyway.
There is a caveat to locate. Since it does not actually look at your system, but searches a database, the most recent changes to your file-system will not be in the database. There is a command, updatedb, which updates the locate database. Usually this command is a cron job set up to run daily by your distribution. If every night at midnight there is suddenly a lot of disk activity and top shows you a find job taking up a lot of resources, this is updatedb rebuilding the database for locate. (see man 5 crontab, /etc/crontab and /etc/cron.d or /etc/cron.daily for specifics).
-
grep
-
grep searches through a textfile or from standard input for a line with a match for your search term. eg:
locate XFree86 |grep bin
-i (case insensitivity) -n (line numbers) -l (list only the names of the files which match, not the lines) -v (opposite: show the lines which do not match)
regular expression characters : ^ $ . [...] [^...] r* r? and a delimiter, eg ' or /
-
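The switches and a few of the special characters listed above can be tried on a four-line sample file. The file name sample.txt and its contents are made up for this sketch.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf '%s\n' 'Cat on the mat' 'dog in the fog' 'concatenate' 'cot' > sample.txt

grep -i 'cat' sample.txt      # case-insensitive: matches "Cat on the mat" and "concatenate"
grep -n 'dog' sample.txt      # prefix each match with its line number
grep -l 'cot' sample.txt      # print only the names of files that contain a match
grep -v 'the' sample.txt      # print the lines that do NOT match

# ^ anchors to the start of the line, $ to the end, . is any one character.
anchored=$(grep '^c.t$' sample.txt)
echo "$anchored"
```

The single quotes are the delimiter mentioned above: they keep the shell from touching ^, $ and . before grep sees them.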
regexps
-
The "." is equivalent to the shell's "?" it matches any single character.
From the grep man page:
The fundamental building blocks [of regular expressions] are [those] that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any meta-character with special meaning may be quoted by preceding it with a backslash.
The big difference between shell wild-cards and regular expressions (re's) is that in re's there are characters which specify how many times the previous regular expression may match.
-
adv. vi
- 1,10 s/re//g , g, 1g
-
Command substitution back-ticks ` xargs and $()
-
References
-
- Unix Powertools (2nd edition, O'Reilly, 1997)
- Shell differences FAQ
- Unix is a Four Letter Word
Copyright Marco Scoffier, released under the GFDL