Text processing utilities and backup utilities of Unix

Text processing utilities:


The cat command reads one or more files and prints them to standard output. Combined with the shell redirection operator >, it can merge multiple files into one; with >>, it appends to an existing file.
The syntax for the cat command is:
cat [options] [files]

-e  –  Prints $ at the end of each line. On some systems (Solaris, for example) this option must be used with -v; GNU cat -e already implies -v.

1. $cat file1 // displays the contents of file1
2. $cat file1 file2 > file3  //concatenates file1 and file2, and writes the results in file3
3. $cat file1 >> file2  //appends a copy of file1 to the end of file2
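As a quick check of the three examples above, the following sketch builds two small sample files in a scratch directory and runs the same commands on them. The file contents and the temporary directory are assumptions made for illustration.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf 'alpha\n' > file1
printf 'beta\n'  > file2

cat file1 file2 > file3      # concatenate: file3 now holds both files
cat file1 >> file3           # append a second copy of file1 to file3

combined=$(cat file3)
printf '%s\n' "$combined"
```

Note that > overwrites file3 if it already exists, while >> keeps the existing contents.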


The tail command displays the last ten lines of the file.
The syntax for the tail command is:
tail [options] [file]

-f  –  Follow the file as it grows.
-r  –  Displays the lines in reverse order (BSD and Solaris tail; not available in GNU tail).
-n N  –  Displays the last N lines of the file.
-n +N  –  Displays the file starting at line N, counted from the beginning of the file.

1. By default, tail will print the last 10 lines of its input to the standard output. With command line options the number of lines printed and the printing units (lines, blocks or bytes) may be changed. The following example shows the last 20 lines of filename:

$tail -n 20 filename

2. This example shows all lines of filename from the second line onwards:

$tail -n +2 filename

The head command displays the first ten lines of a file.
The syntax for head command is:

$head [options] <filename>

By default, head will print the first 10 lines of its input to the standard output. The number of lines printed may be changed with a command line option. The following example shows the first 20 lines of filename:

$head -n 20 filename

This displays the first 5 lines of all files starting with foo:

$head -n 5 foo*
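The head and tail options described above can be tried on a generated file. This is a minimal sketch; the 30-line sample file and the variable names are assumptions, not part of the original examples.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Build a 30-line sample file: "line 1" ... "line 30".
i=1
while [ "$i" -le 30 ]; do
    echo "line $i"
    i=$((i + 1))
done > filename

first=$(head -n 2 filename)                 # first 2 lines
last=$(tail -n 2 filename)                  # last 2 lines
rest=$(tail -n +29 filename)                # everything from line 29 onwards
middle=$(head -n 12 filename | tail -n 3)   # lines 10 through 12

printf '%s\n' "$first" "$last" "$rest" "$middle"
```

Piping head into tail, as in the last command, is a common way to pull out a range of lines from the middle of a file.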


Sort: Sorting is the ordering of data in ascending or descending sequence. The sort command orders the lines of a file. By default, sort arranges lines in ASCII collating sequence: white space first, then numerals, uppercase letters and finally lowercase letters.
      1. Sort by lines: The simplest sort arranges data by whole lines. Starting at the beginning of the line, it compares the first character in one line with the first character in another line, and moves on to the next character when they are equal.
                  Syntax: $ sort filename
      2. Sort by fields: In general a field is the smallest unit of data. When a field sort is required, we need to define which fields are to be used for the sort.
                   Syntax: $ sort -t " " -k2 filename
                     Here the -k option sorts on the specified field, and the delimiter option (-t) specifies the field delimiter.
      3. Reverse order: To order the data from largest to smallest, we specify reverse order (-r).
                    Syntax: $ sort -t : -k2.3 -r filename
        Here the data is sorted in descending order, starting from the 3rd character of the second field.
          -tchar        –    Use delimiter char to identify fields
          -k n          –    Sorts on nth field
          -k m,n        –    Starts sort on mth field and ends sort on nth
          -k m.n        –    Starts sort on nth column of mth field
          -u            –    Removes repeated lines
          -n            –    Sorts numerically
          -r            –    Reverses sort order
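The sort options listed above can be compared side by side on a small file. The sketch below assumes a hypothetical colon-delimited file staff.txt (name:department:salary) and fixes the locale to C so that the ordering is the plain ASCII one described earlier.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
LC_ALL=C; export LC_ALL          # plain ASCII ordering, independent of the user's locale

# Hypothetical sample data: name:department:salary
cat > staff.txt <<'EOF'
ravi:sales:300
anil:sales:45
sita:admin:1200
EOF

by_name=$(sort staff.txt)                       # whole-line sort
by_dept=$(sort -t: -k2,2 staff.txt)             # sort on field 2 only
by_pay_desc=$(sort -t: -k3,3 -n -r staff.txt)   # field 3, numeric, largest first

printf '%s\n\n' "$by_name" "$by_dept" "$by_pay_desc"
```

Without -n the third field would be compared as text, so 1200 would sort before 300 and 45.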


nl is a Unix utility for numbering lines, either from a file or from standard input, reproducing output on standard output.


·         -s – separator: the number and the data are separated by the given separator
·         -w – width: the number of columns used for the line number
Eg: $nl filename
$nl -s: file1
$nl -w20 -s: file1
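To see what the -w and -s options of nl do, the following sketch numbers a two-line file. The file contents are an assumption; a narrow width of 3 is used so that the padding is easy to see.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'alpha\nbeta\n' > file1

# Line numbers are right-aligned in 3 columns and followed by ":" instead of a tab.
numbered=$(nl -w3 -s: file1)
printf '%s\n' "$numbered"
```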
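Uniq command:
      This command displays the unique lines of the given file. That is, if successive lines of a file are the same, only one copy of them is kept. This can be used to collapse successive empty lines in a given file. Because only adjacent lines are compared, the input is usually sorted first.
The syntax of this command is:
         uniq [options] [file]
         -c      Prefix lines by the number of occurrences
         -d      Only print repeated lines, one copy of each
         -D      Print all duplicate lines (GNU uniq)
         -u      Only print lines that are not repeated
             $ uniq testfile      //prints one copy of each run of identical adjacent lines
             $ uniq -u testfile   //prints only the lines that are not repeated
             $ sort f1 | uniq     //sorts first, so that all duplicates become adjacent and are removed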
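The difference between plain uniq, -d and -u is easiest to see on the same input. This sketch assumes a small file f1 with hypothetical contents.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
LC_ALL=C; export LC_ALL
printf 'pear\napple\npear\nfig\napple\npear\n' > f1

distinct=$(sort f1 | uniq)       # one copy of every line
repeated=$(sort f1 | uniq -d)    # lines that occur more than once
singles=$(sort f1 | uniq -u)     # lines that occur exactly once

printf '%s\n\n' "$distinct" "$repeated" "$singles"
```

Running uniq on the unsorted file would remove nothing here, because no two identical lines are adjacent.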
Grep: (Global Regular Expression Print)  Searching for a pattern
       UNIX has a special family of commands for handling search requirements, and the principal member of this family is the grep command. grep scans its input for a pattern and displays the lines containing the pattern, the line numbers or the filenames where the pattern occurs.
               Syntax: grep options pattern filename(s)
           grep searches for the pattern in one or more files, or in the standard input if no filename is specified. The first argument is the pattern and the remaining arguments are filenames.
               Eg: $ grep "sales" employee
       It displays the lines containing the string sales from the file employee.
   grep can also be used with multiple filenames. It then displays the filename along with each matching line.
                Eg: $ grep director emp1 emp2
                        $ grep "sales director" emp1 emp2
      In the first command quoting is not necessary. When the pattern contains multiple words, as in the second command, it must be quoted.
    option                   significance
       -i              –          Ignores case for matching
       -v              –          Displays only the lines that do not match the expression
       -n              –          Displays line numbers along with lines
       -c              –          Displays the count of matching lines
       -l              –          Displays only the names of the files that contain a match
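The options in the table above can be tried on a small employee file. The records below are hypothetical sample data; only the file name employee comes from the examples above.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
cat > employee <<'EOF'
101 anil sales director
102 ravi Sales manager
103 sita accounts clerk
EOF

exact=$(grep -c "sales" employee)        # number of matching lines, case-sensitive
anycase=$(grep -ic "sales" employee)     # the same count, ignoring case
others=$(grep -iv "sales" employee)      # lines that do NOT match
numbered=$(grep -n "clerk" employee)     # matching lines prefixed with line numbers

printf '%s\n' "$exact" "$anycase" "$others" "$numbered"
```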
The fgrep and egrep commands:

The fgrep and egrep commands are advanced pattern matching commands. On current systems they are also available as grep -F and grep -E. The fgrep command does not treat any character of its search pattern as a metacharacter; the pattern is a fixed string. The primary advantage of fgrep is that it can search for two or more strings simultaneously. The fgrep command can be used like this:
$fgrep 'good
bad
great' userfile

Here the single quotes mark three strings, one per line, as one argument, so we search for three different strings: good, bad, and great. The egrep command can express this search in a more compact form than fgrep:
               $egrep 'good|bad|great' userfile

egrep uses an or ( | ) operator to achieve this, which makes it more compact and more versatile than fgrep. Spaces around | would become part of the patterns, so they are left out. Another feature of egrep is that we can group patterns. Consider first | used without grouping:
               $egrep 'sunil|rohan gavasker' players

Here sunil is the first pattern and everything to the right of | is considered the second pattern.
If we want to search for both sunil gavasker and rohan gavasker, use this:
               $egrep '(sunil|rohan) gavasker' players
The -f option reads the patterns from a file instead of specifying them directly on the command line:
               $egrep -f pat.lst file1
               $fgrep -f pat.lst file1
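The effect of grouping, and of fixed-string matching, can be checked with the sketch below. It uses grep -E and grep -F, the standard spellings of egrep and fgrep; the contents of the players and dots files are hypothetical sample data.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
cat > players <<'EOF'
sunil gavasker
sunil kumar
rohan gavasker
rohan kumar
EOF

# Without grouping: "sunil" OR "rohan gavasker", so "sunil kumar" matches too.
loose=$(grep -E 'sunil|rohan gavasker' players)
# With grouping: either first name, followed by " gavasker".
grouped=$(grep -E '(sunil|rohan) gavasker' players)

# Fixed strings: "." is a literal dot here, not "any character".
printf 'a.b\naxb\n' > dots
literal=$(grep -F 'a.b' dots)

printf '%s\n\n' "$loose" "$grouped" "$literal"
```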

Cut command:

             By using this command, we can extract the required columns or fields from the file.
This command extracts the fields based on either character position or field delimiter position.
       -c       –     Used to extract columns based on character position.
        -f       –     Used to extract the required fields based on field delimiters.
                          (By default the field delimiter is tab)
        -d       –     Used to define our own delimiter.
cut -f1,3 filename
             This displays the 1st and 3rd fields of each line of the given file. The fields must be
             separated by TAB.
cut -d":" -f1,3 /etc/passwd
              This displays the user name and UID of each user of the machine. Here, with the -d option,
              we are specifying that : is the field separator.
The same result is given by the command cut -d":" -f3,1 /etc/passwd, because cut always prints the selected fields in the order in which they appear in the file.
cut -d":" -f1-3 filename
                This displays the 1st to 3rd fields of each line of the given file.
cut -d":" -f3- filename
                 This displays the 3rd field through the last field of each line of the given file.
cut -d":" -f-3 filename
                 This displays the 1st field through the 3rd field of each line of the given file.
cut -c3-5 filename
                  This displays the 3rd to 5th characters of each line in the given file.
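The field and character selections above can be tried without touching /etc/passwd. This sketch assumes a hypothetical file users.txt with passwd-style records cut down to four fields.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical passwd-style records: name:password:UID:GID
cat > users.txt <<'EOF'
root:x:0:0
daemon:x:1:1
EOF

name_uid=$(cut -d: -f1,3 users.txt)   # fields 1 and 3
first3=$(cut -d: -f1-3 users.txt)     # fields 1 through 3
from3=$(cut -d: -f3- users.txt)       # field 3 to the end of the line
chars=$(cut -c1-4 users.txt)          # characters 1 through 4

printf '%s\n\n' "$name_uid" "$first3" "$from3" "$chars"
```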
Paste command:
           This command is used to create new files by gluing together fields or columns
from two or more files.
   Syntax: paste filename1 filename2
Consider the following example
$cat indo.lst
20032
20034
20121
$cat name.lst
H.D Rao
M.G.V Murthy
P.K.Krishna
$paste indo.lst name.lst >info.lst
The result of the paste command is
$cat info.lst
20032       H.D Rao
20034       M.G.V Murthy
20121       P.K.Krishna
$paste -d: indo.lst name.lst
     This command combines the data in the files by using the : symbol between the fields of the two files.
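The example can be reproduced end to end with the sketch below. The file names follow the example; the contents of both files are taken from the info.lst listing shown above.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '20032\n20034\n20121\n' > indo.lst
printf 'H.D Rao\nM.G.V Murthy\nP.K.Krishna\n' > name.lst

tabbed=$(paste indo.lst name.lst)        # columns joined with a tab (the default)
coloned=$(paste -d: indo.lst name.lst)   # columns joined with ":" instead

printf '%s\n\n' "$tabbed" "$coloned"
```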
Awk command:
                        This command made a late entry into the UNIX system in 1977 to augment the tool kit with suitable report formatting capabilities. The awk name comes from its authors Aho, Weinberger and Kernighan.
                        Syntax: awk options 'selection criteria {action}' file(s)
            $ awk -F" " '$3 > 100 { print }' file1 (or)
            $ awk -F" " '$3 > 100' file1 (or)
$ awk -F" " '$3 > 100 { print $0 }' file1 // displays the lines in file1 whose 3rd field value is greater than 100
$ awk -F" " '$3 > 100 { print $1,$3 }' file1 // displays the 1st and 3rd fields of the lines whose 3rd field value is greater than 100
$ awk -F" " '/mca/ { print }' file1 // displays the lines that contain the data 'mca'
$ awk -F" " 'NR==3,NR==6 { print NR,$2,$3 }' file1 // displays the line number, 2nd and 3rd fields of lines 3 through 6
Printf: for displaying formatted output
$awk -F" " 'NR==3 {
> printf "%3d %20s\n",NR,$1 }' file1 // displays the line number and 1st field value
$awk -F" " 'NR==3 {
> printf "%3d %20s\n",NR,$1 }' file1 > file2 // output is stored in file2
Comparison Operators: <, <=, ==, !=, >=, >, ~ (matches a regular expression), !~ (doesn't match a regular expression).
            $ awk -F" " '$3=="director" || $3=="chairman" { print }' file1
Number Processing: +, -, *, / and %
            $ awk -F" " '$4=="sales" {
            > printf "%20s %10d %8.2f\n",$2,$3,$3/11 }' file1
            $ awk -F" " '$3>100 {
            > count = count + 1
            > printf "%d\n",count }' file1
awk also supports count++, count += 2 and ++count
Reading the Program from a File:
            $ cat > sample.awk
            $2==100 {print $1}
            Press ctrl + d
            $ awk -F" " -f sample.awk file1
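The selection criteria, printf and counter examples above can be run against a small file. In this sketch the records in file1 (id, name, amount, department) are hypothetical, and the default field separator is used, which splits on blanks just like -F" ".

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical records: id name amount department
cat > file1 <<'EOF'
1 anil 250 sales
2 ravi 80 mca
3 sita 120 sales
EOF

big=$(awk '$3 > 100 { print $1, $3 }' file1)                # select by field value
mca=$(awk '/mca/ { print $2 }' file1)                       # select by pattern
fmt=$(awk 'NR == 3 { printf "%3d %6s\n", NR, $2 }' file1)   # formatted output
count=$(awk '$3 > 100 { count++ } END { print count }' file1)   # count matching lines

printf '%s\n' "$big" "$mca" "$fmt" "$count"
```

The END block runs once after the last line, so the counter is printed a single time instead of once per matching line.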
Join command:

The join command joins the lines of two files based on a common field, which you specify with the -1 and -2 options. Both files must already be sorted on that field.

Syntax:  $ join -t':' -1 N -2 N file1 file2
  • -t':' : use : as the field separator
  • -1 N : Nth field in 1st file
  • -2 N : Nth field in 2nd file
  • file1 file2 : files that should be joined

In this example, let us combine employee.txt and bonus.txt files using the common employee number field.

$ cat employee.txt
100     Emma    Thomas
200     Alex    Jason
300     Madison Randy
400     Sanjay  Gupta
500     Nisha   Singh
$ cat bonus.txt
$5,000  100
$5,500  200
$6,000  300
$7,000  400
$9,500  500
$ join  -1 1 -2 2 employee.txt bonus.txt
100 Emma Thomas $5,000
200 Alex Jason $5,500
300 Madison Randy $6,000
400 Sanjay Gupta $7,000
500 Nisha Singh $9,500
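A shortened version of this example can be run as follows. The sketch recreates the first two records of employee.txt and bonus.txt in a scratch directory; everything else is as in the example above.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '100 Emma Thomas\n200 Alex Jason\n' > employee.txt
printf '$5,000 100\n$5,500 200\n' > bonus.txt

# Field 1 of employee.txt is matched against field 2 of bonus.txt.
# Both files are already in ascending order of that field, which join relies on.
joined=$(join -1 1 -2 2 employee.txt bonus.txt)
printf '%s\n' "$joined"
```

The join field is printed first, followed by the remaining fields of the first file and then those of the second file.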

Pg command:

        pg is a terminal pager program on Unix for viewing text files. It can also be used to page through the output of a command via a pipe. pg uses an interface similar to vi.
Syntax: $pg filename
comm command:
       comm compares two sorted files line by line. It is a utility in the Unix family of operating systems that is used to compare two files for common and distinct lines.
comm reads two files as input, regarded as lines of text, and produces a single output that contains three columns. This functionality is similar to diff. Columns are typically distinguished with the <tab> character.

 Syntax: comm [OPTION]… FILE1 FILE2

       With no options, comm produces three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
       -1   –   suppress lines unique to FILE1
       -2   –   suppress lines unique to FILE2
       -3   –   suppress lines that appear in both files
eg:  $comm -1 file1 file2        //It displays only the 2nd and 3rd columns
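Combining the suppression options isolates a single column. This is a minimal sketch with two hypothetical sorted files.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
LC_ALL=C; export LC_ALL
printf 'apple\nbanana\ncherry\n' > file1
printf 'banana\ncherry\ndate\n'  > file2

only1=$(comm -23 file1 file2)    # lines found only in file1
only2=$(comm -13 file1 file2)    # lines found only in file2
both=$(comm -12 file1 file2)     # lines common to both files

printf '%s\n\n' "$only1" "$only2" "$both"
```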
Cmp command:
           cmp is a command line utility for computer systems that use Unix. It compares two files of any type and writes the results to the standard output. By default, cmp is silent if the files are the same; if they differ, the byte and line number at which the first difference occurred is reported.
Syntax :  cmp [-c] [-i N] [-l] [-s] [-v] firstfile secondfile
-c  –  Output differing bytes as characters.
-i N  –  Skip the first N bytes of input.
-l  –  Write the byte number (decimal) and the differing bytes (octal) for each difference.
-s  –  Write nothing for differing files; return exit statuses only.
-v  –  Output version info.

$cmp file1.txt file2.txt

Compares file1.txt to file2.txt and outputs the results. Below is an example of how these results may look.
file1.txt file2.txt differ: char 1011, line 112
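In scripts, cmp is mostly used for its exit status. The sketch below assumes three small hypothetical files and uses only the -s option from the list above; the exact wording of the report line varies between cmp implementations.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'hello\nworld\n' > file1.txt
printf 'hello\nworld\n' > same.txt
printf 'hello\nwarld\n' > file2.txt

# -s prints nothing; only the exit status (0 = identical, 1 = different) is used.
if cmp -s file1.txt same.txt;  then same_result=identical; else same_result=different; fi
if cmp -s file1.txt file2.txt; then diff_result=identical; else diff_result=different; fi

# Without -s, the position of the first difference is reported:
# here the 8th byte, which is on line 2.
report=$(cmp file1.txt file2.txt)

printf '%s\n' "$same_result" "$diff_result" "$report"
```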
diff command:
      This command is used to display file differences. It also tells you which lines in one file have to be changed to make the two files identical.
Syntax:  $ diff file1 file2
         diff uses certain special symbols and instructions to indicate the changes that are required to make two files identical. Each instruction uses an address combined with an action that is applied to the first file.
The instructions:
      1.   7a8 means append a line after line 7 of the first file; it becomes line 8 in the second file.
      2.   3c3 means change line 3, which remains line 3 after the change.
      3.   5,7c5,7 means change 3 lines (lines 5 to 7).
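The c and a instructions can be seen in a small run. This sketch assumes two hypothetical files that differ in one changed line and one added line.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'one\ntwo\nthree\n'     > file1
printf 'one\n2\nthree\nfour\n' > file2

# diff exits with status 1 when the files differ, so the status is ignored here.
changes=$(diff file1 file2) || true
printf '%s\n' "$changes"
```

The output contains the instruction 2c2 (line 2 is changed) followed by 3a4 (a line is appended after line 3 and becomes line 4). Lines from the first file are marked with < and lines from the second file with >.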
tr command: (translate or transliterate)
            When executed, the program reads from the standard input and writes to the standard output. It takes as parameters two sets of characters, and replaces occurrences of the characters in the first set with the corresponding elements from the second set.
Syntax:  tr options expression1 expression2 < inputfile
Eg: 1. $tr 'abcd' 'jkmn'
             maps 'a' to 'j', 'b' to 'k', 'c' to 'm', and 'd' to 'n'.
Sets of characters may be abbreviated by using character ranges. The previous example could be written:
   2. $tr 'a-d' 'jkmn'
   3. $tr 'a-z' 'A-Z' < file1      //converts lowercase letters to uppercase
            -d  –  delete the specified characters
                        $ tr -d 'ad' < file1
            -c  –  complement the set; combined with -d, delete all the characters except the specified characters
                        $ tr -cd 'ad' < file1
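The mappings and the -d and -cd options can be checked on short strings piped into tr. The input strings in this sketch are hypothetical.

```shell
LC_ALL=C; export LC_ALL

upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')   # lowercase to uppercase
mapped=$(printf 'abcd\n' | tr 'a-d' 'jkmn')        # a->j, b->k, c->m, d->n
deleted=$(printf 'a bad day\n' | tr -d 'ad')       # remove every a and d
kept=$(printf 'a bad day\n' | tr -cd 'ad')         # remove everything except a and d

printf '[%s]\n' "$upper" "$mapped" "$deleted" "$kept"
```

Note that -cd also removes spaces and the trailing newline, since they are not in the set.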
Backup utilities:
Tar command:
             In computing, tar (derived from tape archive) is both a file format (in the form of a type of archive bitstream) and the name of a program used to handle such files. It is now commonly used to collect many files into one larger file for distribution or archiving, while preserving file system information such as user and group permissions, dates, and directory structures.
        c   –   Create. Writing begins at the beginning of the tarfile, instead of at the end.
        r   –   Replace. The named files are written at the end of the tarfile.
        t   –   Table of Contents. The names of the specified files are listed each time they occur in the tarfile.
        x   –   Extract or restore. The named files are extracted from the tarfile and written to the directory specified in the tarfile, relative to the current directory.
        f   –   File. Use the tarfile argument as the name of the tarfile.
        v   –   Verbose. Displays the name of each file as it is processed.
          $tar -cvf archive.tar f1 f2 f3   //creates an archive that combines these 3 files
          $tar -xvf archive.tar   //extracts all files from the archive
          $tar -tvf archive.tar  //lists the files in the archive along with their properties
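A full create, list and extract cycle might look like the sketch below. The two sample files and the restore directory are assumptions; the v key is left out so that the commands stay quiet.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'one\n' > f1
printf 'two\n' > f2

tar -cf archive.tar f1 f2            # c: create, f: name of the archive file
listing=$(tar -tf archive.tar)       # t: list the names stored in the archive

mkdir restore
( cd restore && tar -xf ../archive.tar )   # x: extract into ./restore
restored=$(cat restore/f1 restore/f2)

printf '%s\n' "$listing" "$restored"
```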
Tee command:
In computing, tee is a command which displays or pipes the output of a command and copies it into a file or a variable. It is primarily used in conjunction with pipes.
               tee is normally used to split the output of a program so that it can be seen on the display and also be saved in a file. The command can also be used to capture intermediate output before the data is altered by another command or program. The tee command reads standard input, then writes its content to standard output and simultaneously copies it into the specified file(s) or variables. The syntax differs depending on the command's implementation.
 eg: $cat filename | tee filename2      //It copies the output of the cat command into filename2
$cat filename | tee -a filename2      //It appends the output of the cat command to the end of filename2
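The sketch below shows tee capturing intermediate output in the middle of a pipeline. The file name copy.txt and the sample input are assumptions.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# tee passes its input on to wc and saves a copy in copy.txt on the way.
count=$(printf 'a\nb\nc\n' | tee copy.txt | wc -l | tr -d ' ')

# -a appends to copy.txt instead of overwriting it.
printf 'd\n' | tee -a copy.txt > /dev/null
saved=$(cat copy.txt)

printf '%s\n' "$count" "$saved"
```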
Cpio command:
         This command (copy input-output) copies files to and from a backup device. It uses standard input to take the list of file names. It then copies them with their contents and headers to the standard output, which can be redirected to a file or device. This means that cpio can be used with redirection and piping.
           cpio uses two key options, -o (output) and -i (input), one of which (but not both) must be present on the command line. All other options have to be used with one of these key options.
Syntax:  cpio options filename
        1.  $ls | cpio -oF sample //o for creating the archive, F for specifying the name of the archive file
         2. $cpio -iF sample  //i for extracting data from the archive
        3.  $cpio -itF sample  //t for displaying the contents
Simple commands:
                  These consist of a single command at a time.
Example: cp file1 file2
                rm filename
                mkdir dirname etc.
compound commands:
                                These combine multiple commands at a time.
                  $who | tee filename
c program:
To create a C program, use the command:    vi filename.c
To compile the C program, use the command: cc filename.c
To run the C program, use the command:     ./a.out
Shell Scripts:
            A shell script is used to combine a group of commands into a single file. Shell scripts are widely
employed in developing automatic software installation scripts and for fine-tuning software.
To create a shell script, use the command:   vi filename.sh
To run the shell script, use the command:   sh filename.sh
Example program for shell script:
            echo "Enter first number"
             read x
             echo "Enter second number"
             read y

             echo `expr $x + $y`
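The same addition can also be written with the shell's built-in arithmetic expansion instead of expr. In this sketch the function name add is hypothetical, and the numbers are passed as arguments so that the script can run without prompting.

```shell
# Add two numbers; a function stands in for the interactive script so that
# the values can be passed as arguments instead of being typed at a prompt.
add() {
    echo $(( $1 + $2 ))      # POSIX arithmetic expansion, no external expr needed
}

sum=$(add 7 5)
echo "$sum"
```

Arithmetic expansion is handled by the shell itself, so no separate process is started for the calculation.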
