Friday, July 26, 2013

SIMPLE FILTERS

Filters are commands that normally accept data from the standard input, manipulate it, and write the results to the standard output.



•   Simple Filters

•   Filters with Regular Expressions – grep and sed

•   Advanced Filtering using awk



Simple filters include head, tail, tr, sort, uniq, cut, paste, pr, comm, diff, etc.
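Because every filter reads standard input and writes standard output, filters can be chained with pipes. The sketch below is only an illustration: the department names are made-up input, not the contents of emp.lst.

```shell
# Hypothetical input: department names, one per line, with repeats.
top_dept=$(printf 'sales\nadmin\nsales\nhr\nsales\nadmin\n' |
    sort |        # bring identical lines together
    uniq -c |     # count each group of identical lines
    sort -rn |    # largest count first
    head -n 1 |   # keep only the top line
    tr -s ' ')    # squeeze the padding that uniq -c adds
echo "$top_dept"
```

Each stage knows nothing about its neighbours; it only transforms the stream it is given.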



HEAD

$ head emp.lst

Displays the first 10 lines of the file (the default).



$ head -15 emp.lst

Displays the first 15 lines of the file. The portable spelling is head -n 15 emp.lst.
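A small sketch of head in action, assuming a generated 20-line file named demo.lst (a stand-in for emp.lst, whose contents are not shown here):

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d) && cd "$workdir"

# Build a 20-line demo file (made-up data).
i=1
while [ "$i" -le 20 ]; do
    echo "record $i"
    i=$((i + 1))
done > demo.lst

# -n 3 is the portable spelling of the older "-3" form.
first_three=$(head -n 3 demo.lst)
echo "$first_three"

# With no option, head stops after 10 lines.
default_count=$(head demo.lst | wc -l | tr -d ' ')
echo "default line count: $default_count"
```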



TAIL

$ tail emp.lst

Displays the last 10 lines of the file.

$ tail -15 emp.lst

Displays the last 15 lines of the file.

$ tail -n +25 emp.lst

Displays the file from line 25 to the end. (Older systems also accept the form tail +25 emp.lst.)

$ tail -c -512 emp.lst

Displays the last 512 bytes of emp.lst.
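The three ways of counting with tail (lines from the end, lines from the start, bytes from the end) can be sketched as follows. The file demo.lst is generated, made-up data.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Build a 20-line demo file (made-up data).
i=1
while [ "$i" -le 20 ]; do
    echo "record $i"
    i=$((i + 1))
done > demo.lst

# Last two lines, counted from the end.
last_two=$(tail -n 2 demo.lst)

# A leading "+" counts from the start: begin output at line 19.
from_19=$(tail -n +19 demo.lst)

# Last 3 bytes of the file: "20" plus the final newline.
last_bytes=$(tail -c 3 demo.lst)

echo "$last_two"
echo "$from_19"
echo "$last_bytes"
```

On a 20-line file, tail -n 2 and tail -n +19 select the same two lines.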


TR


$ tr 'a' 'A'

Replaces each 'a' with 'A' in the input.

$ tr 'ax' 'by'

Replaces a with b and x with y in the input.

$ ls -l | tr -s ' '

Squeezes each run of multiple spaces into a single space.

$ tr -d "|" < emp.lst

Deletes every '|' character from the contents of emp.lst. The file itself is not modified.
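Since tr reads only from standard input, it is easy to try on short strings. The inputs below are made up for illustration.

```shell
# Character-for-character replacement.
upper=$(echo "banana" | tr 'a' 'A')

# Positional mapping: a becomes b, x becomes y.
mapped=$(echo "a box" | tr 'ax' 'by')

# -s squeezes each run of the listed character into one.
squeezed=$(echo "a    b   c" | tr -s ' ')

# -d deletes every occurrence of the listed character.
deleted=$(echo "2233|shukla|sales" | tr -d '|')

echo "$upper"
echo "$mapped"
echo "$squeezed"
echo "$deleted"
```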





SORT

sort arranges the lines of a file (such as a database file) in ascending or descending order.

$ sort emp.lst

Sorts in the ASCII collating sequence (in the C/POSIX locale): white space first, then numerals, then uppercase letters, and finally lowercase letters.

$ sort -r emp.lst

Sorts in reverse order.

$ sort -n emp.lst

Sorts in numeric order.

$ sort -u emp.lst

Sorts and removes duplicate lines.

$ sort -f emp.lst

Sorts case-insensitively.

$ sort -o emp.lst1 emp.lst

Sorts and stores the output in the file emp.lst1.

$ sort -c emp.lst

Checks whether the file is already sorted.

$ sort -m emp.lst emp.lst1

Merges the two files emp.lst and emp.lst1, both of which must already be sorted.

$ sort -t":" emp.lst

Takes ':' as the delimiter between fields.

$ sort -k 2 emp.lst

Sorts on a key that starts at the second field and runs to the end of the line.

$ sort -k 3,3 -k 2,2 emp.lst

Sorts on the third field, with the second field as the secondary key.

$ sort -k 5.7,5.8 emp.lst

Sorts on the 7th and 8th characters of the 5th field of emp.lst.

Note: By default, sort separates fields at runs of blanks (spaces and tabs). For a file whose fields are separated by ':', combine -t":" with the -k options above.
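The key options are easiest to follow on a few delimited records. The sketch below assumes a made-up file demo.lst with the hypothetical layout id:name:department:salary; the real layout of emp.lst is not shown here.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Made-up records: id:name:department:salary
printf '%s\n' \
    '103:mehta:sales:5000' \
    '101:gupta:admin:7000' \
    '102:arora:sales:6500' > demo.lst

# Numeric sort restricted to field 1.
by_id=$(sort -t: -k1,1n demo.lst)

# Department first, then highest salary first within each department.
by_dept_salary=$(sort -t: -k3,3 -k4,4nr demo.lst)

# -c reports through its exit status whether the input is already in order.
if sort -c -t: -k1,1n demo.lst 2>/dev/null; then
    already_sorted=yes
else
    already_sorted=no
fi

# The ASCII sequence in the C locale: digits, then uppercase, then lowercase.
ascii_order=$(printf '%s\n' 'b' 'A' '1' | LC_ALL=C sort | paste -s -d ' ' -)

echo "$by_id"
echo "$by_dept_salary"
echo "$already_sorted"
echo "$ascii_order"
```

Writing each key as start,end (for example -k3,3) keeps the key from running on to the end of the line, which is what a bare -k 3 would do.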



UNIQ

$ uniq emp.lst

Displays one copy of each line, dropping adjacent duplicates. The file must be sorted first so that duplicates are adjacent.

$ uniq -u emp.lst

Displays only the lines that are not repeated.

$ uniq -d emp.lst

Displays one copy of each line that is repeated.

$ uniq -c emp.lst

Prefixes each line with the number of times it occurs.
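The four forms of uniq can be compared on one small sorted input. The department names below are made up.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Unsorted made-up input; uniq needs duplicates to be adjacent, so sort first.
printf '%s\n' 'sales' 'admin' 'sales' 'sales' 'hr' | sort > depts.sorted

all_once=$(uniq depts.sorted)        # one copy of every line
only_single=$(uniq -u depts.sorted)  # lines that occur exactly once
only_repeated=$(uniq -d depts.sorted) # one copy of each repeated line

# The width of the count column varies between systems, so squeeze the padding.
counts=$(uniq -c depts.sorted | tr -s ' ')

echo "$all_once"
echo "$only_single"
echo "$only_repeated"
echo "$counts"
```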



CUT

$ cut -c1 emp.lst

Cuts the file vertically by character position; here it displays the first character of each line.

$ cut -c1-5,8 emp.lst

Displays characters 1 to 5 and the 8th character of each line.

$ cut -f1 emp.lst

Displays the first field of each line, with tab as the default delimiter.

$ cut -f1,3 emp.lst

Displays the first and third fields.

$ cut -f1-3 emp.lst

Displays the first, second and third fields.

$ cut -d":" -f1 emp.lst

Displays the first field, taking ":" as the delimiter.
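Character cuts and field cuts can be tried on a single record. The record below is made up and uses the hypothetical layout id:name:department:salary.

```shell
record='101:gupta:admin:7000'

# Characters 1 to 3 of the line.
id_chars=$(echo "$record" | cut -c1-3)

# Characters 1 to 5 plus character 8.
mixed_chars=$(echo "$record" | cut -c1-5,8)

# Second field, with ":" as the delimiter.
name=$(echo "$record" | cut -d: -f2)

# First and third fields; the delimiter is kept between them in the output.
id_and_dept=$(echo "$record" | cut -d: -f1,3)

echo "$id_chars"
echo "$mixed_chars"
echo "$name"
echo "$id_and_dept"
```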


PASTE


$ paste emp.lst emp.lst1

Joins the corresponding lines of emp.lst and emp.lst1, with tab as the delimiter.

$ paste -d":" emp.lst emp.lst1

Joins the corresponding lines of emp.lst and emp.lst1, with ":" as the delimiter.

$ paste -s emp.lst

Joins all the lines of the file to form a single line.
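A sketch of side-by-side and serial pasting, using two made-up files (names.lst and depts.lst are hypothetical names):

```shell
workdir=$(mktemp -d) && cd "$workdir"

printf '%s\n' 'gupta' 'arora' > names.lst
printf '%s\n' 'admin' 'sales' > depts.lst

# Line N of the first file is joined to line N of the second.
side_by_side=$(paste -d: names.lst depts.lst)

# -s joins the lines of one file serially; -d picks the joining character.
one_line=$(paste -s -d, names.lst)

echo "$side_by_side"
echo "$one_line"
```

paste is roughly the inverse of cut: cut splits a file into columns, and paste puts columns back together.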



PR

$ pr emp.lst

Prints the file as formatted text with suitable headers and footers. Adds five lines of margin at the top and five at the bottom. The header shows the date and time of the last modification of the file, along with the filename and page number.

$ pr -3 emp.lst

Prints in 3 columns

$ pr -t emp.lst

Suppresses the header and footer.

$ pr -d emp.lst

Displays with double line spacing.

$ pr -n emp.lst

Numbers the lines.

$ pr -o 5 emp.lst

Sets the left margin to 5 characters.

$ pr -h "employee file" emp.lst

Replaces the filename in the header with 'employee file'.

$ pr +10 emp.lst

Prints starting from page 10.

$ pr -l 45 emp.lst

Sets the page length to 45 lines.

$ pr -l45 emp.lst | lp

Formats with a page length of 45 lines and sends the result to the printer.
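Because pr adds a date-stamped header, its full output differs from run to run. The sketch below therefore only checks two stable properties, using a made-up three-line file names.lst.

```shell
workdir=$(mktemp -d) && cd "$workdir"

printf '%s\n' 'gupta' 'arora' 'mehta' > names.lst

# -t drops the header and trailer, so only the three numbered lines remain.
numbered=$(pr -t -n names.lst)
numbered_count=$(printf '%s\n' "$numbered" | wc -l | tr -d ' ')

# -h puts custom text in the page header; count the lines that contain it.
header_hits=$(pr -h "employee file" names.lst | grep -c 'employee file')

echo "$numbered"
echo "lines: $numbered_count, header lines: $header_hits"
```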



DIFF

$ diff emp.lst emp.lst1

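Displays file differences as the changes needed to make the first file identical to the second. Each change is marked with one of three commands:

Append            a

Delete            d

Change            c

$ diff -e emp.lst emp.lst1

Produces only a set of instructions, in the form of an ed script that converts emp.lst into emp.lst1.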
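How the a and c commands read in practice can be seen on two tiny files. The files old.txt and new.txt are made up for this sketch.

```shell
workdir=$(mktemp -d) && cd "$workdir"

printf '%s\n' 'a' 'b' 'c' > old.txt
printf '%s\n' 'a' 'B' 'c' 'd' > new.txt

# diff exits with status 1 when the files differ, hence the "|| true".
changes=$(diff old.txt new.txt || true)
echo "$changes"
# 2c2  line 2 of old.txt must change to match line 2 of new.txt
# 3a4  after line 3 of old.txt, append line 4 of new.txt

# The same differences expressed as an ed script.
ed_script=$(diff -e old.txt new.txt || true)
echo "$ed_script"
```

In the default output, lines prefixed with < come from the first file and lines prefixed with > come from the second.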



COMM

$ comm emp.lst emp.lst1

Both files must be sorted. Shows three-column output: the first column contains the lines found only in the first file, the second column contains the lines found only in the second file, and the third column contains the lines common to both.

$ comm -1 emp.lst emp.lst1

Suppresses first column in the output.

$ comm -12 emp.lst emp.lst1

Suppresses the first and second columns, leaving only the lines common to both files.



$ comm -123 emp.lst emp.lst1

Suppresses all three columns, so nothing is displayed.
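Suppressing columns is the usual way to ask comm a specific question. The sketch below uses two made-up sorted files, first.lst and second.lst.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Both inputs are already in sorted order, as comm requires.
printf '%s\n' 'admin' 'hr' 'sales' > first.lst
printf '%s\n' 'hr' 'ops' 'sales' > second.lst

in_both=$(comm -12 first.lst second.lst)      # keep only column 3
only_first=$(comm -23 first.lst second.lst)   # keep only column 1
only_second=$(comm -13 first.lst second.lst)  # keep only column 2

echo "common: $in_both"
echo "only in first: $only_first"
echo "only in second: $only_second"
```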



