Essential Utilities for Power User
Linux contains powerful utility programs.
Quick Tour of essential utilities
For this tour, create the sname and smark data files as follows
(using the text editor of your choice).
Note: the fields on each line are separated by a TAB character, i.e. while
creating the file, type 11, press the Tab key, and then type Vivek (as shown
in the following files).
sname
Sr.No Name
11 Vivek
12 Renuka
13 Prakash
14 Ashish
15 Rani
smark
Sr.No Mark
11 67
12 55
13 96
14 36
15 67
Now suppose you wish to print the names of the students from the sname file
on-screen. From the shell (your command prompt, i.e. $) issue the command as
follows:
$cut -f2 sname
Vivek
Renuka
Prakash
Ashish
Rani
Here the cut utility cuts out the selected data from the sname file. To select
the Sr.No. field from sname, give the command as follows:
$cut -f1 sname
11
12
13
14
15
Command | Explanation |
cut | Name of cut utility |
-f1 | The -f option specifies the number of the field to extract (in this example 1, i.e. the first field). |
sname | File used as input by the cut utility. |
You can redirect output of cut utility as
follows
$cut -f2 sname > /tmp/sn.tmp.$$
$cut -f2 smark > /tmp/sm.tmp.$$
$cat /tmp/sn.tmp.$$
Vivek
Renuka
Prakash
Ashish
Rani
$cat /tmp/sm.tmp.$$
67
55
96
36
67
General Syntax of cut utility:
cut -f{field number} {file-name}
Use of Cut utility: Selecting portion of a file.
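As a small sketch beyond the tour above: cut also accepts a list of fields after -f, and the -d option selects a delimiter other than TAB. The three-column sample data and the colon-separated lines below are made up for illustration and are not part of the sname file.

```shell
# -f accepts a comma-separated list, so several fields can be picked at once.
# Here fields 1 and 3 are kept and the name in field 2 is dropped.
pairs=$(printf '11\tVivek\t67\n12\tRenuka\t55\n' | cut -f1,3)

# -d changes the field delimiter from TAB to another character (a colon here).
names=$(printf '11:Vivek\n12:Renuka\n' | cut -d: -f2)

printf '%s\n' "$pairs" "$names"
```

The selected fields are always printed in the order in which they occur in the input, whatever order is given after -f.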
Now enter following command
$ paste sname smark
11 Vivek 11 67
12 Renuka 12 55
13 Prakash 13 96
14 Ashish 14 36
15 Rani 15 67
The paste utility joins textual information together, line by line. Now try as
follows
$ paste /tmp/sn.tmp.$$ /tmp/sm.tmp.$$
Vivek 67
Renuka 55
Prakash 96
Ashish 36
Rani 67
The paste utility is useful for putting together textual information located in
various files.
General Syntax of paste utility:
paste {file1} {file2}
Use of paste utility: Putting lines together.
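A minimal sketch of paste on two small files, assuming a scratch directory created with mktemp (the file names names and marks are hypothetical). It also shows the -d option, which replaces the default TAB separator with another character.

```shell
# Hypothetical scratch directory; any writable location would do.
dir=$(mktemp -d)
printf 'Vivek\nRenuka\n' > "$dir/names"
printf '67\n55\n' > "$dir/marks"

# Default behaviour: matching lines are glued together with a TAB.
tabbed=$(paste "$dir/names" "$dir/marks")

# -d sets the separator, here a comma.
comma=$(paste -d, "$dir/names" "$dir/marks")

printf '%s\n' "$tabbed" "$comma"
rm -r "$dir"
```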
:-) Can you note down the basic difference between the cut and paste utilities?
Now enter following command
$join sname smark
11 Vivek 67
12 Renuka 55
13 Prakash 96
14 Ashish 36
15 Rani 67
Here the students' names are matched with their appropriate marks. How? The join utility uses the Sr.No. field to join the two files. Notice that Sr.No. is the first field in both the sname and smark files.
General Syntax of join utility:
join {file1} {file2}
Use of join utility: The join utility joins lines from separate files.
Note that join only works if both files have a common field and the values in that field are identical.
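The following sketch illustrates two points about join that the example above does not show: the input files are expected to be sorted on the join field, and lines without a partner in the other file are dropped. The file names and the deliberately unsorted sample data are assumptions made for this illustration.

```shell
# Hypothetical scratch directory and sample files.
dir=$(mktemp -d)
printf '12\tRenuka\n11\tVivek\n13\tPrakash\n' > "$dir/names"   # not sorted
printf '11\t67\n12\t55\n14\t36\n' > "$dir/marks"               # already sorted

# Sort the unsorted file on its first field before joining.
sort "$dir/names" > "$dir/names.sorted"

# 13 (no mark) and 14 (no name) have no partner, so they do not appear.
joined=$(join "$dir/names.sorted" "$dir/marks")

printf '%s\n' "$joined"
rm -r "$dir"
```

By default join separates the output fields with a single space, even when the input used TABs.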
Now enter the following command
$ tr "h2" "3x" < sname
11 Vivek
1x Renuka
13 Prakas3
14 As3is3
15 Rani
You can clearly see that each occurrence of the character 'h' is replaced with '3' and each '2' with 'x'. The tr utility translates specific characters into other specific characters, or ranges of characters into other ranges.
h -> 3
2 -> x
Now consider the following example
$ tr "[a-z]" "[A-Z]"
hi i am Vivek
HI I AM VIVEK
what a magic
WHAT A MAGIC
{Press CTRL + C to terminate.}
Here tr translates a range of characters (i.e. lowercase a to z) into another range (i.e. capital A to Z).
General Syntax of tr utility:
tr {pattern-1} {pattern-2}
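tr reads only from standard input, so in scripts it is usually fed through a pipe or a redirection instead of typed input. The sketch below repeats the uppercase translation non-interactively and adds the -d option, which deletes the listed characters instead of translating them. The sample strings are invented for illustration.

```shell
# Translate lowercase to uppercase; the input arrives through a pipe.
upper=$(printf 'hi i am Vivek\n' | tr 'a-z' 'A-Z')

# -d deletes every character in the given set (here all digits).
nodigits=$(printf 'a1b2c3\n' | tr -d '0-9')

printf '%s\n' "$upper" "$nodigits"
```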
:-) After typing the following paragraph, I realised my mistake: the entire paragraph must be in lowercase characters. How can this mistake be corrected? (Hint - Use the tr utility)
THIS IS SAMPLE PARAGRAPH
WRITTEN FOR LINUX COMMUNITY,
BY VIVEK G GITE (WHO ELSE?)
OKAY THAT IS OLD STORY.
Now consider the following data file. (In fact, create this file using any text editor.)
inventory
egg order 4
cacke good 10
cheese okay 4
pen good 12
floppy good 5
After creating the file, issue the command
$ awk '/good/ { print $3 }' inventory
10
12
5
Here the awk utility selects each record from the file containing the word
"good" and performs the action of printing the third field
(the quantity of available goods). Now try the following and note down its output.
$ awk '/good/ { print $1 $3 }' inventory
General Syntax of awk utility:
awk 'pattern action' {file-name}
For the $ awk '/good/ { print $3 }' inventory example,
/good/ | The pattern used for selecting lines from the file. |
{print $3} | The action performed if the pattern is found; print is one such action. Here $3 means the third field of the selected record. (What do $1 and $2 mean?) |
inventory | File used as input by the awk utility. |
Use of awk utility: To manipulate data.
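As a sketch of the pattern-action idea, the snippet below recreates the inventory file in a scratch directory (the directory is an assumption for this example) and runs two programs: the one from the text, and a variant that adds up the third field and prints the sum in an END block once all lines have been read.

```shell
# Hypothetical scratch directory holding a copy of the inventory file.
dir=$(mktemp -d)
printf 'egg order 4\ncacke good 10\ncheese okay 4\npen good 12\nfloppy good 5\n' > "$dir/inventory"

# Pattern /good/ selects the lines, the action prints field 3 of each.
qty=$(awk '/good/ { print $3 }' "$dir/inventory")

# Same pattern, but the action accumulates field 3; END runs after the last line.
total=$(awk '/good/ { sum += $3 } END { print sum }' "$dir/inventory")

printf '%s\n' "$qty" "$total"
rm -r "$dir"
```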
For this tour create data file as follows
teaormilk
India's milk is good.
tea Red-Lable is good.
tea is better than the coffee.
After creating the file, give the command
$ sed '/tea/s//milk/g' teaormilk > /tmp/result.tmp.$$
$ cat /tmp/result.tmp.$$
India's milk is good.
milk Red-Lable is good.
milk is better than the coffee.
Here the sed utility is used to find every occurrence of tea and replace it with milk. sed is a stream editor which uses the commands of the 'ex' editor for editing text files without starting ex. (Cool, isn't it? No use of a text editor to edit anything!)
/tea/ | Find the word tea, i.e. select all lines having the word tea |
s//milk/ | Substitute the word milk for tea. The empty pattern // reuses the last pattern, i.e. tea. |
g | Make the changes globally. |
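The same substitution is often written in the shorter form s/tea/milk/g, without the leading address. The sketch below uses invented input lines to show that form and to show what changes when the g flag is left out.

```shell
# With g: every occurrence of tea on each line is replaced.
all=$(printf 'tea is better than the coffee.\ntea or tea?\n' | sed 's/tea/milk/g')

# Without g: only the first occurrence on each line is replaced.
first=$(printf 'tea or tea?\n' | sed 's/tea/milk/')

printf '%s\n' "$all" "$first"
```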
Now create the personame file as follows
personame
Hello I am vivek
12333
12333
welcome
to
sai computer academy, a'bad.
what still I remeber that name.
oaky! how are u luser?
what still I remeber that name.
After that issue following command
$ uniq personame
Hello I am vivek
12333
welcome
to
sai computer academy, a'bad.
what still I remeber that name.
oaky! how are u luser?
what still I remeber that name.
The above command prints those lines which are unique. For example, our
original file contains 12333 twice, so the additional copy of 12333 is deleted.
But if you examine the output of uniq, you will notice that the duplicate 12333
is gone, while "what still I remeber that name." remains as it is. This is
because the uniq utility compares only adjacent lines, so duplicate lines must
be next to each other in the file. To solve this problem you can use the
command as follows
$ sort personame | uniq
General Syntax of uniq utility:
uniq {file-name}
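A short sketch of the adjacent-lines behaviour, using a cut-down version of the personame data (the shortened lines are made up for this example). Without sort, the second "what" survives because it is not next to the first one.

```shell
# Sample input: 12333 is duplicated on adjacent lines, "what" is not adjacent.
input=$(printf '12333\n12333\nwelcome\nwhat\noaky\nwhat\n')

# uniq alone removes only the adjacent duplicate.
adjacent=$(printf '%s\n' "$input" | uniq)

# Sorting first brings all duplicates together, so uniq removes every copy.
all=$(printf '%s\n' "$input" | sort | uniq)

printf '%s\n' "$adjacent" "---" "$all"
```

Note that sorting changes the order of the lines, which may or may not matter for the data at hand.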
For this tour create the data file as follows
demofile
hello world!
cartoons are good
especially toon like tom (cat)
what
the number one song
12221
they love us
I too
After saving file, issue following command,
$ grep "too" demofile
cartoons are good
especially toon like tom (cat)
I too
grep locates all lines matching the "too" pattern and prints all such lines
on-screen. Here grep prints too, as well as cartoons and toon, because grep
treats "too" as an expression. grep reads the expression as the letter t
followed by o and so on. So if this expression is found anywhere on a line,
the line is printed. grep does not understand words.
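If only the whole word is wanted, GNU and BSD grep offer the -w option (an assumption about the grep in use; it is not covered in the text above). A minimal sketch with three of the demofile lines, stored in a hypothetical scratch directory:

```shell
# Hypothetical scratch directory with a shortened demofile.
dir=$(mktemp -d)
printf 'cartoons are good\nespecially toon like tom (cat)\nI too\n' > "$dir/demofile"

# Plain pattern: matches "too" anywhere, including inside cartoons and toon.
anywhere=$(grep 'too' "$dir/demofile")

# -w: the match must be a whole word, so only the last line qualifies.
whole=$(grep -w 'too' "$dir/demofile")

printf '%s\n' "$anywhere" "---" "$whole"
rm -r "$dir"
```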
In the "Quick Tour of essential utilities" you have seen the basic utilities. If you use them with other tools, these utilities are very useful for data processing and other work. In the rest of the tutorial we will learn more about patterns, filters, expressions, and of course sed and awk in depth.
What does "cat" mean to you?
One meaning is the word cat (the second is the animal; I know 'tom' cat). If the same question is asked to a computer (not the computer itself but the grep utility), then grep will try to find all occurrences of "cat" (remember that grep reads the word "cat" as the letter c followed by a and followed by t), including cat, copycat, catalog etc.
A set of characters (which may or may not be words) is called a pattern. For example "dog", "celeron", "mouse", "ship" etc. are all examples of patterns. A pattern can be changed from one to another, e.g. "ship" to "sheep". If patterns are identified using special characters, then such special characters are known as metacharacters. A combination of patterns and metacharacters is known as an expression (regular expression).
Regular expressions are used by different Linux utilities, such as the grep, sed, awk and ex utilities covered in this tutorial.
So you must know how to construct a regular expression. In the next part of LSST you will learn how to construct regular expressions using the ex editor.
For this tutorial create one text file.
How to start ex editor?
To start ex editor type ex at shell prompt.
Syntax:
ex {file-name}
Example:
$ ex demofile
The : (colon) is the ex prompt, where you can type
an ex text editor command or a regular expression.
Getting started with ex
$ ex demofile
"demofile" [noeol] 20L, 387C
Entering Ex mode. Type "visual" to go to Normal mode.
Printing text on-screen
You will see a screen something like the one shown above. Now type
'p' after the : as follows and press Enter
:p
Okay! I will stop.
:
NOTE
By default the p command prints the current line; in our case it is the last
line of the above text file.
Printing lines using range
Now if you want to print the 1st line to the 5th line (i.e. lines 1 to 5), then
give the command
:1,5 p
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
NOTE
Here 1,5 is the address. If a single number is used (e.g. 5 p) it
indicates a line number, and if two numbers are separated by a comma it is a
range of lines.
Printing particular line
To print 2nd line from our file give command
:2 p
This is vivek from Poona.
Printing entire file on-screen
Give command
:1,$ p
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
.....
Okay! I will stop.
NOTE
Here 1 is the 1st line and $ is the special character of
ex which means the last line. So 1,$ p means print from the 1st line to the
last line (i.e. the end of the file). Here p stands for print.
Printing line number with our text
Give command
:set number
:1,3 p
1 Hello World.
2 This is vivek from Poona.
3
NOTE
This command prints a number next to each line. If you don't want the numbers
you can turn them off by issuing the following command
:set nonumber
:1,3 p
Hello World.
This is vivek from Poona.
Deleting lines
Give command
:1, d
I love linux.
NOTE
Here 1 is the 1st line and the d command stands for delete (which deletes the
1st line). You can even delete a range of lines by giving the command as
:1,5 d
Copying lines
Give command as follows
:1,4 co $
:1,$ p
I love linux.
It is different from all other Os
....
.....
. (DOT) is special command of linux.
Okay! I will stop.
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
NOTE
Here 1,4 means lines 1 to 4; the co command stands for copy; $ is the
end of the file. So it means copy the first four lines to the end of the file.
You can delete these lines as follows
:18,21 d
Okay! I will stop.
:1,$ p
I love linux.
It is different from all other Os
My brother Vikrant also loves linux.
He currently lerarns linux.
Linux is cooool.
Linux is now 10 years old.
Next year linux will be 11 year old.
Rani my sister never uses Linux
She only loves to play games and nothing else.
Do you know?
. (DOT) is special command of linux.
Okay! I will stop.
Searching the words
(a) Give following command
:/linux/ p
I love linux.
Note
In ex you can specify an address (line) using a number for various operations.
This is useful if you know the line number, but if you don't know the line
number, then you can use a contextual address to print a line on-screen. In
the above example /linux/ is a contextual address, which is constructed by
surrounding a regular expression with two slashes. And p is the print command
of ex.
:-) Try the following and note down the difference (Hint - Watch that p is
missing)
:/Linux/
(b)Give following command
:g/linux/ p
I love linux.
My brother Vikrant also loves linux.
He currently lerarns linux.
Next year linux will be 11 year old.
. (DOT) is special command of linux.
In the previous example (:/linux/ p) only
one line is printed. If you want to print all occurrences of the word
"linux", then you have to use g, which means global line
address. This instructs ex to find all occurrences of the pattern. Try the following
:1,$ /Linux/ p
Which gives the same result. It means g stands
for 1,$.
Saving the file in ex
Give command
:w
"demofile" 20L, 386C written
w command will save the file.
Quitting ex
Give command
:q
The q command quits ex and you
are returned to the shell prompt.
Note
Use the wq command to save and exit from ex.
Find and Replace (Substituting regular expression)
Give command as follows
:8 p
He currently lerarns linux.
:8 s/lerarns/learn/
:p
He currently learn linux.
Note
Using above command, you are substituting the word "learn" for the
word "lerarns".
Command | Explanation |
8 | Go to line 8 (the address of the line). |
s | Substitute |
/lerarns/ | Target pattern |
learn/ | If the target pattern is found, substitute this expression (i.e. learn) |
Try following command
:1,$ s/Linux/Unix/
Rani my sister never uses Unix
:1,$ p
Hello World.
This is vivek from Poona.
....
..
.....
. (DOT) is special command of linux.
Okay! I will stop.
Note
Using the above command, you are substituting on all lines, i.e. the s command
searches every line in the address range for the pattern "Linux", and where
the pattern "Linux" is found it substitutes "Unix".
Command | Explanation |
:1,$ | Address range: substitute on all lines |
s | Substitute |
/Linux/ | Target pattern |
Unix/ | If the target pattern is found, substitute this expression (i.e. Unix) |
You can even use a contextual
address as follows
:/sister/ p
Rani my sister never uses Unix
:g /sister/ s/never/always/
:p
Rani my sister always uses Unix
The above command first finds the line containing the
pattern "sister"; if found, it substitutes the pattern
"always" for the pattern "never". (It means: find the line containing
the word sister, on that line find the word never, and
replace it with the word always.)
Try the following and watch the output very carefully.
:g /Unix/ s/Unix/Linux
3 substitutions on 3 lines
The above command finds all lines containing the regular expression
"Unix", then substitutes "Linux" for "Unix" on each of those lines.
Note that the above command can also be written as follows
:g /Unix/ s//Linux
Here // is replaced by the last pattern/regular expression, i.e. Unix. It is
a shortcut. Now try the following
:g /Linux/ s//UNIX/
3 substitutions on 3 lines
:g/Linux/p
Linux is cooool.
Linux is now 10 years old.
Rani my sister always uses Linux
:g /Linux/ s//UNIX/
3 substitutions on 3 lines
:g/UNIX/p
UNIX is cooool.
UNIX is now 10 years old.
Rani my sister always uses UNIX
By default the substitute command only substitutes the
first occurrence of a pattern on a line. Let's take an example; give the command
:/brother/p
My brother Vikrant also loves linux who also
loves unix.
In the above line the word "also"
occurs twice. Give the following substitute command
:g/brother/ s/also/XYZ/
:/brother/p
My brother Vikrant XYZ loves linux who also
loves unix.
(Restore the line so that the next example works)
:g/brother/ s/XYZ/also/
Note that "also" is substituted only once. If you want the s
command to work on all occurrences of the pattern within an addressed line,
give the command as follows
:g/brother/ s/also/XYZ/g
:p
My brother Vikrant XYZ loves linux who XYZ
loves unix.
:g/brother/ s/XYZ/also/g
:p
My brother Vikrant also loves linux who also
loves unix.
The g option at the end instructs the s command to perform the replacement on all occurrences of the target pattern within an addressed line.
Replacing word with confirmation from user
Give command as follows
:g/Linux/ s//UNIX/gc
After giving this command ex will ask you question like -
replace with UNIX (y/n/a/q/^E/^Y)?
Finding words
Command like
:g/the/p
It is different from all other Os
My brother Vikrant also loves linux who also
loves unix.
This will find words like theater, the,
brother, other etc. What if you want to find just the word "the"?
To find a whole word (let's say Linux) you can give a command like
:/\<Linux\>
Linux is cooool.
:g/\<Linux\>/p
Linux is cooool.
Linux is now 10 years old.
Rani my sister never uses Linux
The symbols \< and \> match the empty string at the beginning and end of a word, respectively. To find a line
which contains the Linux pattern at the beginning, give the command
:/^Linux
Linux is cooool.
As you know, $ is the end-of-line character; the ^
(caret) matches the beginning of a line. To find all occurrences of the pattern
"Linux" at the beginning of a line, give the command
:g/^Linux
Linux is cooool.
Linux is now 10 years old.
And if you want to find "Linux" at
the end of a line, then give the command
:/Linux$
Rani my sister never uses Linux
The following command will find an empty line
:/^$
To find all blank lines give the command
:g/^$
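The ^ and $ anchors work the same way outside ex. As a sketch, the snippet below applies them with grep to a few sample lines (chosen for this example, with one empty line among them); grep -c prints the number of matching lines instead of the lines themselves.

```shell
# Four sample lines; the second one is empty.
text=$(printf 'Linux is cooool.\n\nRani my sister never uses Linux\nI love linux.\n')

# ^ anchors the pattern at the beginning of the line.
starts=$(printf '%s\n' "$text" | grep '^Linux')

# $ anchors the pattern at the end of the line.
ends=$(printf '%s\n' "$text" | grep 'Linux$')

# ^$ matches a line with nothing between its beginning and its end.
blank=$(printf '%s\n' "$text" | grep -c '^$')

printf '%s\n' "$starts" "$ends" "$blank"
```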
To view the entire file without blank lines you can
use the command as follows
:g/[^/^$]
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
My brother Vikrant also loves linux who also
loves unix.
He currently learn linux.
Linux is cooool.
Linux is now 10 years old.
Next year linux will be 11 year old.
Rani my sister never uses Linux
She only loves to play games and nothing else.
Do you know?
. (DOT) is special command of linux.
Okay! I will stop.
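Command | Explanation |
g | All occurrences |
[^...] | A list starting with ^ means "not": it matches any single character that is not in the list |
/^$ | The characters listed inside the brackets. A line is printed if it contains at least one character other than /, ^ or $, so empty lines are skipped. |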
To delete all blank lines you can give the command
as follows
:g/^$/d
Okay! I will stop.
:1,$ p
Hello World.
This is vivek from Poona.
I love linux.
It is different from all other Os
My brother Vikrant also loves linux who also
loves unix.
He currently learn linux.
Linux is cooool.
Linux is now 10 years old.
Next year linux will be 11 year old.
Rani my sister never uses Linux
She only loves to play games and nothing else.
Do you know?
. (DOT) is special command of linux.
Okay! I will stop.
Try the u command to undo what you
have just done; give it as follows
:u
:1,$ p
Hello World.
This is vivek from Poona.
....
...
....
Okay! I will stop.
Using range of characters in regular expressions
Try the following command
:g/Linux/p
Linux is cooool.
Linux is now 10 years old.
Rani my sister never uses Linux
This will find only "Linux" and not
"linux". To overcome this problem try as follows
:g/[Ll]inux/p
I love linux.
My brother Vikrant also loves linux who also
loves unix.
He currently learn linux.
Linux is cooool.
Linux is now 10 years old.
Next year linux will be 11 year old.
Rani my sister never uses Linux
. (DOT) is special command of linux.
Here a list of characters is
enclosed by [ and ], which matches any single character in that
list. If the first character of the list is ^, then it matches any character
not in the list. In the above example [Ll] will match either L or l,
followed by the rest of the pattern. Let's see another example. Suppose you
want to match any single digit; you can give the command as follows
:/[0123456789]
You can even try it as follows
:g/[0-9]
Linux is now 10 years old.
Next year linux will be 11 year old.
Here the range of digits is specified by giving the first
digit (0, zero) and the last digit (9), separated by a hyphen. You can try [a-z]
for lowercase characters and [A-Z] for uppercase characters. Not just this,
there are certain named classes of characters which are predefined. They are as
follows
Predefined classes of characters | Meaning |
[:alnum:] | Letters and Digits (A to Z or a to z or 0 to 9) |
[:alpha:] | Letters A to Z or a to z |
[:cntrl:] | Delete character or ordinary control character (0x7F or 0x00 to 0x1F) |
[:digit:] | Digit (0 to 9) |
[:graph:] | Printing character, like print, except that a space character is excluded |
[:lower:] | Lowercase letter (a to z) |
[:print:] | Printing character (0x20 to 0x7E) |
[:punct:] | Punctuation character (a printing character that is neither a letter, a digit, nor a space) |
[:space:] | Space, tab, carriage return, new line, vertical tab, or form feed (0x09 to 0x0D, 0x20) |
[:upper:] | Uppercase letter (A to Z) |
[:xdigit:] | Hexadecimal digit (0 to 9, A to F, a to f) |
For example, to find a digit or a letter (upper as
well as lower case) you would write
:/[0-9A-Za-z]
Instead of writing such a command you could easily
use the predefined classes as follows
:/[[:alnum:]]
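The named classes are written inside an outer pair of brackets, which is why [[:alnum:]] has two of each. The sketch below tries two of the classes with grep on three sample lines (made up for this example; the last line contains only dots).

```shell
# Three sample lines; the last one has no letters or digits at all.
text=$(printf 'Linux is now 10 years old.\nDo you know?\n...\n')

# [[:digit:]] is the named equivalent of [0-9].
digits=$(printf '%s\n' "$text" | grep '[[:digit:]]')

# [[:alnum:]] matches any letter or digit; -c counts the matching lines.
alnum=$(printf '%s\n' "$text" | grep -c '[[:alnum:]]')

printf '%s\n' "$digits" "$alnum"
```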
The . (dot) matches any single character.
For example, type the following
:g/\<.o\>
She only loves to play games and nothing else.
Do you know?
This matches two-character words ending in o, such as to and Do.
* matches the preceding character zero or more times
For example
:g/L*
Hello World.
This is vivek from Poona.
....
....
:g/Li*
Linux is cooool.
Linux is now 10 years old.
Rani my sister never uses Linux
:g/c.*and
. (DOT) is special command of linux.
Here the c character is matched first, then any single character (.) repeated any number of times (*; zero, 1 or even 100 times), and finally the match ends with and. This can find different words such as command or catand etc.
In a regular expression, metacharacters such as .
(DOT) or * lose their special meaning if written as \. or \*. The backslash
removes the special meaning of such metacharacters so you can use them as
ordinary characters. For example, if you want to search for the . (DOT)
character at the beginning of a line, then you can't use the command as follows
:g/^.
Hello World.
This is vivek from Poona.
....
..
...
. (DOT) is special command of linux.
Okay! I will stop.
Instead of that use
:g/^\.
. (DOT) is special command of linux.
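The difference between . and \. can also be checked from the shell. In this sketch (two sample lines taken from the file above) the unescaped dot matches the first character of every non-empty line, while the escaped dot matches only a literal dot.

```shell
# Two sample lines; only the second one starts with a literal dot.
text=$(printf 'Hello World.\n. (DOT) is special command of linux.\n')

# ^. means "any one character at the beginning of the line": both lines match.
any=$(printf '%s\n' "$text" | grep -c '^.')

# ^\. means "a literal dot at the beginning of the line": one line matches.
dot=$(printf '%s\n' "$text" | grep '^\.')

printf '%s\n' "$any" "$dot"
```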
Using & as Special replacement character
Try the following
:1,$ s/Linux/&-Unix/p
3 substitutions on 3 lines
Rani my sister never uses Linux-Unix
:g/Linux-Unix/p
Linux-Unix is cooool.
Linux-Unix is now 10 years old.
Rani my sister never uses Linux-Unix
This command replaces the target pattern
"Linux" with "Linux-Unix". The & before -Unix means
"the text matched by the pattern". Here the matched text
is "Linux", which is combined with the given -Unix (finally
constructing "Linux-Unix" as the substitute for "Linux").
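sed understands & in the replacement text in the same way, so the idea can be tried without opening ex. A minimal sketch on two sample lines; the second command, which wraps a number in parentheses, is an extra illustration that does not come from the text above.

```shell
# & stands for whatever the pattern matched, here the word Linux.
joined=$(printf 'Linux is cooool.\n' | sed 's/Linux/&-Unix/')

# & is most useful when the matched text is not known in advance:
# the pattern matches one or more digits and & puts them back inside ().
wrapped=$(printf 'Linux is now 10 years old.\n' | sed 's/[0-9][0-9]*/(&)/')

printf '%s\n' "$joined" "$wrapped"
```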
:-) Can you guess the output of this command
:1,$ s/Linux-Unix/&Linux/p
Converting lowercase characters to uppercase
Try the command
:1,$ s/[a-z]/\u&/g
Command | Explanation |
1,$ | Line address is all lines, i.e. search all lines for the following pattern |
s | Substitute command |
/[a-z]/ | Find all lowercase letters - Target |
\u&/ | Substitute with uppercase. \u& means substitute the last pattern matched (&) with its UPPERCASE replacement (\u). Note: Use \l (small L) for lowercase characters. |
g | Global replacement |
:-) Can you guess the output of following command
:1,$ s/[A-Z]/\l&/g
Congratulations on successfully completing this tutorial on regular expressions.
I hope you have learned a lot from
it. This tutorial is very important for continuing with the rest of the tutorial
and for becoming a power user of Linux. Impress your friends with such
expressions. Can you guess what the last expression does?
:1,$ s/^  *$//
Note: the gap between ^ and * consists of two blank spaces.
Continue tutorial with awk utility