In this post I want to discuss sed (stream editor). Sed is a Unix utility that parses text and transforms it according to a small programming language of its own. It was developed by Lee E. McMahon of Bell Labs in 1974, and it is available today for most operating systems.
Sed has several commands; the most essential is the substitution command, s. The substitution command replaces text that matches a regular expression with a new value. By default it changes only the first match on each line.
A simple example is changing “day” in the “old” file to “night” in the “new” file.
sed 's/day/night/' <old > new
I must emphasize that sed changes exactly what you tell it to. Suppose you ran
echo Sunday | sed 's/day/night/'
This would output the word “Sunnight”, because sed found the string “day” in the input.
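To make the behavior concrete, here is a small sketch you can paste into a shell. The sample sentence is made up for illustration; it shows that a plain substitution touches only the first match on a line and does not care about word boundaries.

```shell
# Only the first "day" on the line is replaced.
first=$(echo "one day or another day" | sed 's/day/night/')
echo "$first"     # one night or another day

# sed matches text, not whole words, so "Sunday" is affected as well.
partial=$(echo "Sunday" | sed 's/day/night/')
echo "$partial"   # Sunnight
```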
There are four parts to this substitute command:
s Substitute command
day Regular expression search pattern
night Replacement string
/ Delimiter (here, the slash)
The character after the s is the delimiter. It is conventionally a slash, but it can be almost any character you want. If you want to change a pathname that contains a slash, say /usr/local/bin to /common/bin, you could use a backslash to quote each slash:
sed 's/\/usr\/local\/bin/\/common\/bin/' <old >new
We can also use an underscore, a colon, or the “|” character instead of a slash as the delimiter:
sed 's_/usr/local/bin_/common/bin_' <old >new
sed 's:/usr/local/bin:/common/bin:' <old >new
sed 's|/usr/local/bin|/common/bin|' <old >new
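As a quick check that the delimiter is only a matter of readability, the sketch below runs all four spellings on the same input. The sample path (with a made-up "tool" at the end) is an assumption for the demo.

```shell
path="/usr/local/bin/tool"   # hypothetical sample input

a=$(echo "$path" | sed 's/\/usr\/local\/bin/\/common\/bin/')
b=$(echo "$path" | sed 's_/usr/local/bin_/common/bin_')
c=$(echo "$path" | sed 's:/usr/local/bin:/common/bin:')
d=$(echo "$path" | sed 's|/usr/local/bin|/common/bin|')

# All four commands produce the same line.
printf '%s\n' "$a" "$b" "$c" "$d"
```

Whichever delimiter you pick, it just must not appear unescaped inside the pattern or the replacement.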
Using & as the matched string
Sometimes you want to search for a pattern and add some characters, such as parentheses, around the text you found. It is easy to do this if you are looking for a particular string:
sed 's/abc/(abc)/' <old >new
This won't work if you don't know exactly what you will find. The solution requires the special character "&", which stands for the text that the pattern matched.
sed 's/[a-z]*/(&)/' <old >new
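Here is a short sketch of "&" in action on a made-up input line, first with the pattern from the text and then with "&" used twice in one replacement.

```shell
# Wrap the first run of lowercase letters in parentheses.
wrapped=$(echo "hello world" | sed 's/[a-z]*/(&)/')
echo "$wrapped"   # (hello) world

# "&" may appear more than once in the replacement.
twice=$(echo "hello world" | sed 's/[a-z]*/&-&/')
echo "$twice"     # hello-hello world
```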
You can have any number of "&" in the replacement string. For example, you could duplicate the first number on a line:
% echo "123 abc" | sed 's/[0-9]*/& &/'
123 123 abc
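Note that "[0-9]*" also matches zero digits, so it matches the empty string at the start of a line that begins with a letter. If the input were "abc 123", the number would not be duplicated; sed would only insert a space before "abc". A better way to duplicate the number is to require at least one digit: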
% echo "123 abc" | sed 's/[0-9][0-9]*/& &/'
123 123 abc
The original sed did not support the "+" metacharacter, which means "one or more matches". GNU sed does when extended regular expressions are enabled with the -E option (older versions spell it -r). So the above could also be written as
% echo "123 abc" | sed -E 's/[0-9]+/& &/'
123 123 abc
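The difference between the three patterns only shows up when the line does not start with a digit. The sketch below, which assumes a sed that accepts -E (GNU sed does), feeds "abc 123" to each of them.

```shell
# "[0-9]*" matches the empty string at the start of the line,
# so "& &" inserts just a single space there.
loose=$(echo "abc 123" | sed 's/[0-9]*/& &/')
echo "[$loose]"      # [ abc 123]

# Requiring at least one digit finds the real number.
strict=$(echo "abc 123" | sed 's/[0-9][0-9]*/& &/')
echo "[$strict]"     # [abc 123 123]

# Same idea using "+" with extended regular expressions.
extended=$(echo "abc 123" | sed -E 's/[0-9]+/& &/')
echo "[$extended]"   # [abc 123 123]
```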
Using \1 to keep part of the pattern
The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern. Sed has up to nine remembered patterns. To remember part of a match, enclose it in the escaped parentheses "\(" and "\)". Suppose you wanted to keep the first word of a line and delete the rest of the line.
Mark the important part with the parentheses: "[a-z]*" matches zero or more lowercase letters, and the ".*" outside the parentheses matches the rest of the line. Therefore if you type
echo abcd123 | sed 's/\([a-z]*\).*/\1/'
This will output "abcd" and delete the numbers.
If you want to switch two words around, you can remember two patterns and change the order around:
sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'
You may want to insist that words have at least one letter by using
sed 's/\([a-z][a-z]*\) \([a-z][a-z]*\)/\2 \1/'
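To see why insisting on at least one letter matters, here is a sketch with made-up input. On a line with leading spaces, the looser pattern matches a lone space between two empty "words" and changes nothing useful, while the stricter one finds the real words.

```shell
# The simple case: both patterns swap the two words.
swapped=$(echo "hello world" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
echo "[$swapped]"   # [world hello]

# With leading spaces, "[a-z]*" matches empty strings around the
# first space, so the line comes out unchanged.
loose=$(echo "  hello world" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
echo "[$loose]"     # [  hello world]

# Requiring one or more letters skips ahead to the actual words.
strict=$(echo "  hello world" | sed 's/\([a-z][a-z]*\) \([a-z][a-z]*\)/\2 \1/')
echo "[$strict]"    # [  world hello]
```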
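If you want to eliminate duplicated words, you can try the following. Each word must have at least one letter; otherwise the pattern can match a lone space between two empty "words" and delete it:
sed 's/\([a-z][a-z]*\) \1/\1/'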
If you want to detect duplicated words, you can use
sed -n '/\([a-z][a-z]*\) \1/p'
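Here is a sketch of both ideas on two made-up lines. It uses the "at least one letter" form of the pattern for the removal as well. Keep in mind that the pattern knows nothing about word boundaries, so it can also fire on a repeated fragment such as the "is" in "this is".

```shell
text=$(printf 'paris in the the spring\nnothing repeated here\n')

# Detect: print only the lines that contain a doubled word.
found=$(printf '%s\n' "$text" | sed -n '/\([a-z][a-z]*\) \1/p')
echo "$found"   # paris in the the spring

# Eliminate: collapse the first doubled word on the line.
fixed=$(echo "paris in the the spring" | sed 's/\([a-z][a-z]*\) \1/\1/')
echo "$fixed"   # paris in the spring
```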
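If you wanted to reverse the first three characters on a line, you could remember each character separately with three "\(.\)" groups anchored at the start of the line, and then write them back in the order "\3\2\1".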
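One way to reverse the first three characters, sketched here on a made-up input, is to capture each character in its own group and emit the groups in reverse order. The command itself is my reconstruction of the idea, not necessarily the only way to write it.

```shell
# "^" anchors at the start of the line; each "\(.\)" remembers one character.
reversed=$(echo "abcdef" | sed 's/^\(.\)\(.\)\(.\)/\3\2\1/')
echo "$reversed"   # cbadef
```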
/g - Global replacement
Let's place parentheses around words on a line. Instead of using a pattern like "[A-Za-z]*", which won't match words like "won't", we will use the pattern "[^ ]*", which matches everything except a space. The following will put parentheses around the first word:
sed 's/[^ ]*/(&)/' <old >new
If you want it to make changes for every word, add a "g" after the last delimiter. Because "[^ ]*" can also match an empty string, use the work-around of requiring at least one non-space character:
sed 's/[^ ][^ ]*/(&)/g' <old >new
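The effect of the "g" flag is easy to see side by side. This sketch uses a made-up sentence that includes an apostrophe.

```shell
line="it won't work"   # sample input

# Without "g": only the first word is wrapped.
first_only=$(echo "$line" | sed 's/[^ ]*/(&)/')
echo "$first_only"   # (it) won't work

# With "g" and at least one non-space character: every word is wrapped.
every=$(echo "$line" | sed 's/[^ ][^ ]*/(&)/g')
echo "$every"        # (it) (won't) (work)
```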
/1, /2, etc. Specifying which occurrence
If you want to modify a particular pattern that is not the first one on the line, you could use "\(" and "\)" to mark each pattern, and use "\1" to put the first pattern back unchanged. This next example keeps the first word on the line but deletes the second:
sed 's/\([a-zA-Z]*\) \([a-zA-Z]*\) /\1 /' <old >new
There is an easier way to do this. You can add a number after the substitution command to indicate you only want to match that particular pattern. Example:
sed 's/[a-zA-Z]* //2' <old >new
GNU sed lets you combine a number with the g (global) flag; other versions of sed may reject the combination. For instance, if you want to leave the first word alone, but change the second, third, etc. to DELETED, use /2g:
sed 's/[a-zA-Z]* /DELETED /2g' <old >new
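The following sketch runs both forms on a made-up line. The second command relies on GNU sed's handling of a number combined with "g", so treat that part as an assumption about your sed.

```shell
line="one two three four"   # sample input

# Delete only the second "word followed by a space".
second_gone=$(echo "$line" | sed 's/[a-zA-Z]* //2')
echo "$second_gone"   # one three four

# GNU sed: replace the second match and every later one.
# "four" has no trailing space, so the pattern does not match it.
later=$(echo "$line" | sed 's/[a-zA-Z]* /DELETED /2g')
echo "$later"         # one DELETED DELETED four
```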
/p - print
If you give sed the optional "-n" argument, it will not print any lines by default. When the "-n" option is used, the "p" flag causes each modified line to be printed. The following acts like grep, printing only the lines that contain the pattern:
sed -n 's/pattern/&/p' <file
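Here is a sketch of "-n" together with the "p" flag on three made-up lines. The first command prints only the line it changed; the second replaces the match with itself, so it prints matching lines untouched, much like grep.

```shell
fruit=$(printf 'apple\nbanana\ncherry\n')   # made-up sample input

# Print only the line where a substitution happened.
changed=$(printf '%s\n' "$fruit" | sed -n 's/an/AN/p')
echo "$changed"   # bANana

# Replace the match with itself: matching lines are printed unchanged.
matched=$(printf '%s\n' "$fruit" | sed -n 's/an/&/p')
echo "$matched"   # banana
```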
Write to a file with /w filename
You can specify a file that will receive the modified data. The following example writes all lines that start with an even number, followed by a space, to the file even:
sed -n 's/^[0-9]*[02468] /&/w even' <file
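A sketch of the "w" flag, run in a scratch directory so it does not leave files behind. The input file name "numbers" and its contents are made up, and the pattern uses "[02468]" just before the space so that only numbers ending in an even digit match.

```shell
workdir=$(mktemp -d)   # scratch directory for the demo
printf '12 twelve\n7 seven\n40 forty\n' > "$workdir/numbers"

# "-n" keeps sed quiet; matching lines go to the file "even" instead.
( cd "$workdir" && sed -n 's/^[0-9]*[02468] /&/w even' < numbers )

even_lines=$(cat "$workdir/even")
echo "$even_lines"     # the "12 twelve" and "40 forty" lines
rm -r "$workdir"
```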
So far, I have used only one substitute command at a time. If you need to make two changes, and you don't want to read the manual, you could pipe together multiple sed commands:
sed 's/BEGIN/begin/' <old | sed 's/END/end/' >new
Multiple commands with -e command
One method of combining multiple commands is to use a -e before each command:
sed -e 's/a/A/' -e 's/b/B/' <old >new
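A short sketch of "-e" on made-up input. The commands run in order on each line, and each one sees the result of the one before it, which the second example makes visible.

```shell
# Two independent changes in one sed invocation.
both=$(echo "abc abc" | sed -e 's/a/A/' -e 's/b/B/')
echo "$both"      # ABc abc

# The second command operates on the output of the first.
chained=$(echo "a" | sed -e 's/a/b/' -e 's/b/c/')
echo "$chained"   # c
```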
sed -f scriptname
If you have a large number of sed commands, you can put them into a file and use
sed -f sedscript <old >new
where sedscript could look like this:
# sed comment - This script changes lower case vowels to upper case
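Only the comment line of the script is shown here. As a sketch, assuming the body holds one global substitution per vowel (which is what the comment describes), the script can be created and run like this. The temporary file stands in for "sedscript", and the sample sentence is made up.

```shell
script=$(mktemp)   # temporary file standing in for "sedscript"
cat > "$script" <<'EOF'
# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
EOF

result=$(echo "a quick brown fox" | sed -f "$script")
echo "$result"   # A qUIck brOwn fOx
rm "$script"
```

In a script file each command sits on its own line, so there is no need for "-e" or for shell quoting.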
sed in shell script
If you have many commands and they won't fit neatly on one line, you can break up the line using a backslash:
sed -e 's/a/A/g' \
-e 's/e/E/g' \
-e 's/i/I/g' \
-e 's/o/O/g' \
-e 's/u/U/g' <old >new