Sed is commonly used to search for and replace substrings in text files, and to delete or extract lines from them.

How does SED work?

SED is a Stream EDitor. It works in "stream" mode: the input is processed line by line. This gives good performance and low memory use, but it means sed never has an overview of the entire file.

SED Processing Steps :

  1. Sed reads the first line from the input stream.
  2. The commands of the script are applied to that line.
  3. Sed writes the resulting line to standard output, unless the -n option is specified.
  4. Sed reads the next line and repeats the previous steps.
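
The cycle above can be observed with a minimal sketch, assuming a POSIX shell and a standard sed. The three-line input and the variable names are made up for illustration. An empty script changes nothing, so the automatic print of step 3 is the only thing that produces output:

```shell
# An empty script: every line is read, left untouched, then printed (step 3).
default_out=$(printf 'one\ntwo\nthree\n' | sed '')

# With -n the automatic print is suppressed, so nothing is written at all.
quiet_out=$(printf 'one\ntwo\nthree\n' | sed -n '')

printf '%s\n' "$default_out"
printf 'with -n: [%s]\n' "$quiet_out"
```

The first command echoes the input unchanged; the second prints empty brackets.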

Note: the sed man page gives the following synopsis:

sed [OPTION]... {script-only-if-no-other-script} [input-file]...

Commands

sed receives a script which contains all actions to be performed on the input stream.

There are two ways to pass this script to sed:
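
As a minimal sketch, the two usual ways to hand a script to sed are inline on the command line with -e, or stored in a file and loaded with -f. The file names input.txt and script.sed are hypothetical and live in a temporary directory:

```shell
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmp/input.txt"

# Way 1: the script is given inline on the command line.
inline_out=$(sed -e '2d' "$tmp/input.txt")

# Way 2: the same script is stored in a file and passed with -f.
printf '2d\n' > "$tmp/script.sed"
file_out=$(sed -f "$tmp/script.sed" "$tmp/input.txt")

printf '%s\n' "$inline_out"
printf '%s\n' "$file_out"
rm -rf "$tmp"
```

Both invocations produce the same output. A script file is mostly worth it once the script grows beyond a couple of commands.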

Output stream

Two choices for SED output stream :

Let's create a simple file to test sed commands and display results in a terminal window :

cat > hello.txt <<- EOF
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

    # This is the last comment starting with blank characters. 
EOF

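This file is saved as hello.txt. Note that the <<- form of the here-document strips leading tab characters (but not spaces) from each line. To display the file, we can use the cat command: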

cat hello.txt 
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

Deleting lines with SED "d command"

The d command deletes the selected lines. Because sed works on a data stream, nothing is removed from the input file: sed simply does not print the line and moves on to the next one.

We may filter lines by requesting a selection on line numbers.

Let's delete line 3 for example :

sed '3d' hello.txt  
This is the first line.
Line 2 ? argh 
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

It worked ! Line 3 was deleted.
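
"Deleted" here only means the line is missing from the output. The sketch below, which uses a hypothetical sample.txt in a temporary directory, checks that the file on disk is untouched after running the d command:

```shell
tmp=$(mktemp -d)
printf 'first\nsecond\nthird\n' > "$tmp/sample.txt"
cp "$tmp/sample.txt" "$tmp/before.txt"

# Line 2 is missing from the output stream...
deleted_out=$(sed '2d' "$tmp/sample.txt")
printf '%s\n' "$deleted_out"

# ...but the file itself is byte-for-byte the same as before.
if cmp -s "$tmp/sample.txt" "$tmp/before.txt"; then
    file_state=unchanged
else
    file_state=modified
fi
printf 'input file is %s\n' "$file_state"
rm -rf "$tmp"
```

To keep the result, redirect the output to another file.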

Line selection can also be given as a range.

Let's delete lines 2 to 3:

sed '2,3d' hello.txt  
This is the first line.
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

The output jumps from line 1 directly to line 4.

We may want to run two commands in the same invocation. The -e option allows several commands to be executed in sequence.

sed -e '1d' -e '3d' hello.txt

is the same as

sed -e '1d; 3d' hello.txt 
Line 2 ? argh 
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

Notice that lines 1 and 3 were deleted.
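
The equivalence of the two forms can be checked with a small sketch. The file four.txt and its contents are invented for the example. Note that line numbers always refer to input lines, so 3 still means the third input line even though line 1 has already been dropped:

```shell
tmp=$(mktemp -d)
printf 'l1\nl2\nl3\nl4\n' > "$tmp/four.txt"

# Two -e options, one command each.
multi_e=$(sed -e '1d' -e '3d' "$tmp/four.txt")

# One script, commands separated by a semicolon.
one_script=$(sed '1d;3d' "$tmp/four.txt")

printf '%s\n' "$multi_e"
printf '%s\n' "$one_script"
rm -rf "$tmp"
```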

Let's now delete the lines that contain the string "comment":

sed '/comment/d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

Let's remove lines from line 4 to the end of file.

sed '4,$ d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
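
In an address, $ stands for the last line of the input. The sketch below uses a made-up four-line input to show it alone and as the end of a range. The script is kept in single quotes so that the shell does not try to expand $d as a variable:

```shell
letters=$(printf 'a\nb\nc\nd\n')

# $ alone addresses the last input line.
without_last=$(printf '%s\n' "$letters" | sed '$d')

# As the end of a range it means "through the end of the input".
first_two=$(printf '%s\n' "$letters" | sed '3,$d')

printf '%s\n' "$without_last"
printf '%s\n' "$first_two"
```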

Negative condition !

Let's delete every line outside a given range, here everything except lines 2 to 4. The symbol ! negates the address:

sed '2,4!d' hello.txt
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
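
A negated range can always be spelled out as the ranges it leaves over, which is a handy way to check what ! does. A small sketch on an invented six-line input:

```shell
rows=$(printf 'r1\nr2\nr3\nr4\nr5\nr6\n')

# Delete everything that is NOT in lines 2 to 4.
negated=$(printf '%s\n' "$rows" | sed '2,4!d')

# The same result without !: delete line 1, then lines 5 to the end.
explicit=$(printf '%s\n' "$rows" | sed -e '1d' -e '5,$d')

# ! also works on a single address: keep only the first line.
first_only=$(printf '%s\n' "$rows" | sed '1!d')

printf '%s\n' "$negated"
printf '%s\n' "$first_only"
```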

Using Regular Expression

In regular expressions, the symbol ^ means the beginning of a line and $ means the end of a line, so the pattern ^$ stands for an empty line. Patterns are written between slash characters.

To delete empty lines:

sed '/^$/d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
    Line 6 # This is a comment.
# This is the last comment starting with blank characters. 
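
The pattern ^$ only matches lines with no characters at all. A line that holds nothing but spaces or tabs looks empty but is not matched. As a sketch, and as a variation that is not part of the original examples, the POSIX character class [[:space:]] can be used to catch those lines too:

```shell
# Input: text, a line holding two spaces, a truly empty line, text.
strict=$(printf 'text\n  \n\nend\n' | sed '/^$/d')
loose=$(printf 'text\n  \n\nend\n' | sed '/^[[:space:]]*$/d')

# strict still contains the whitespace-only line; loose does not.
printf '[%s]\n' "$strict"
printf '[%s]\n' "$loose"
```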

Deleting all lines starting with the string "Line"

sed '/^Line/ d' hello.txt
This is the first line.
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

Deleting all lines ending with some specified characters could also be useful

sed '/[ih]$/d' hello.txt
This is the first line.
Line 2 ? argh 
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

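[ih] matches either 'i' or 'h', so this deletes all lines ending with 'i' or 'h'. Line 2 is kept even though it looks like it ends with 'h', because in hello.txt it ends with a space after "argh".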
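
Trailing whitespace is invisible in a terminal, so a pattern anchored with $ can miss lines it appears to match. A hedged sketch of a more tolerant pattern, using the POSIX class [[:space:]] and an invented three-line input:

```shell
# The second line ends with a space after "argh".
words=$(printf 'hi\nargh \nok\n')

# Strict: the last character of the line must be i or h.
strict=$(printf '%s\n' "$words" | sed '/[ih]$/d')

# Tolerant: i or h, optionally followed by whitespace, then end of line.
tolerant=$(printf '%s\n' "$words" | sed '/[ih][[:space:]]*$/d')

printf '[%s]\n' "$strict"
printf '[%s]\n' "$tolerant"
```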

Regular expressions can also delimit a range. The range is written as two patterns separated by a comma: it starts at the first line matching the first pattern and ends at the next line matching the second one.

sed '/^Line/,/characters\.$/!d' hello.txt  
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
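
Pattern ranges are typically used to cut a block out of a file, or to keep only that block. A minimal sketch, where the BEGIN and END markers and the surrounding lines are made up for the example:

```shell
config=$(printf 'header\nBEGIN\nkey=1\nEND\nfooter\n')

# Keep only the block, markers included.
inside=$(printf '%s\n' "$config" | sed '/^BEGIN$/,/^END$/!d')

# Remove the block and keep the rest.
outside=$(printf '%s\n' "$config" | sed '/^BEGIN$/,/^END$/d')

printf '%s\n' "$inside"
printf '%s\n' "$outside"
```

If no later line matches the second pattern, the range extends to the end of the input.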

Printing with "p command"

First we'll ask sed to apply the command p (print) to the lines of the input file without any filtering.

sed -e 'p' hello.txt
This is the first line.
This is the first line.
Line 2 ? argh 
Line 2 ? argh 
Line 3  ? hi
Line 3  ? hi
    Line 4 starts with blank characters.
    Line 4 starts with blank characters.

    Line 6 # This is a comment.
    Line 6 # This is a comment.


# This is the last comment starting with blank characters. 
# This is the last comment starting with blank characters. 

All lines are duplicated. Here is why: by default sed writes the resulting line to standard output unless it is invoked with the -n option. With the p command, sed is also explicitly asked to print the resulting line, so each line appears twice.

Let's try again with the -n option:

sed -n -e 'p' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

The lines are no longer duplicated.
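
The interaction between p and -n can be checked on a tiny input. This is only a sketch with invented data; the last command also shows that p accepts an address, just like d:

```shell
# Automatic print plus explicit p: every line comes out twice.
doubled=$(printf 'x\ny\n' | sed 'p')

# -n removes the automatic print, so only the explicit p remains.
single=$(printf 'x\ny\n' | sed -n 'p')

# With an address, only the selected line is printed.
second=$(printf 'x\ny\n' | sed -n '2p')

printf '%s\n' "$doubled"
printf '%s\n' "$single"
printf '%s\n' "$second"
```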

The p command can be used as the mirror image of the d command.

For example, to pick out all lines containing a word, let's try again with the word "comment":

sed -n '/comment/p' hello.txt
    Line 6 # This is a comment.
# This is the last comment starting with blank characters. 

The negative form of "d command" will produce the same result :

sed '/comment/!d' hello.txt
    Line 6 # This is a comment.
# This is the last comment starting with blank characters. 
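
Both forms behave like a plain grep. The sketch below compares the three on a hypothetical notes.txt created in a temporary directory:

```shell
tmp=$(mktemp -d)
printf 'plain\n# a comment\nmore\nlast comment here\n' > "$tmp/notes.txt"

with_p=$(sed -n '/comment/p' "$tmp/notes.txt")     # print matching lines only
with_not_d=$(sed '/comment/!d' "$tmp/notes.txt")   # delete non-matching lines
with_grep=$(grep 'comment' "$tmp/notes.txt")       # the usual tool for this job

printf '%s\n' "$with_p"
rm -rf "$tmp"
```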

p is mainly used for two reasons :

"s command" for Substitution

Sed has several commands, but substitute is the most used one, because the d and p commands can be replaced by other tools such as grep, head, tail or tr.

The substitute command replaces a pattern found in the input stream (a text file, for example) with a new value. The pattern may be a regular expression.

s/pattern/replacement/

By default, only the first occurrence of the pattern in each line is replaced, unless the g flag is added at the end of the command:

s/pattern/replacement/g

One can also choose to replace only the third occurrence:

s/pattern/replacement/3

Let's create a test file

cat > testS.txt <<- EOF
It the sky we look upon now now
Should tumble and fall
All of the mountains may crumble May crumble to the sea
EOF

A simple example that replaces the first occurrence of "crumble" in each line with "fall":

sed s/crumble/fall/ testS.txt
It the sky we look upon now now
Should tumble and fall
All of the mountains may fall May crumble to the sea

Let's add the g flag to replace all occurrences in the line:

sed s/crumble/fall/g testS.txt
It the sky we look upon now now
Should tumble and fall
All of the mountains may fall May fall to the sea

And to replace only the second occurrence in the line:

sed s/crumble/fall/2 testS.txt
It the sky we look upon now now
Should tumble and fall
All of the mountains may crumble May fall to the sea
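
The s command accepts the same addresses as d and p, so a substitution can be limited to certain lines. A short sketch on an invented two-line input, combining a line number and a pattern address with the flags shown above:

```shell
text=$(printf 'now now\nnow and now\n')

# Line-number address: replace every "now", but only on line 1.
first_line_only=$(printf '%s\n' "$text" | sed '1s/now/then/g')

# Pattern address: on lines containing "and", replace the second "now".
matching_only=$(printf '%s\n' "$text" | sed '/and/s/now/then/2')

printf '%s\n' "$first_line_only"
printf '%s\n' "$matching_only"
```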

It is often useful to remove blank lines, comments, and tab or space characters at the start or end of a line.

This is our previous text :

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

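We need two commands with regular-expression patterns: s/#.*// removes everything from a # to the end of the line, and /^$/ d deletes the lines that are empty afterwards.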

Let's see the result :

sed -e 's/#.*//' -e '/^$/ d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
    Line 6 

To remove all space and tab characters at the start or end of a line, we need two more substitute commands: one anchored at the start of the line with ^ and one anchored at the end with $.

Inside the brackets you have to type a literal space character followed by a literal tab character (['Space key''Tab key']).

Let's clean our text from comments, blank lines, tab and space characters starting or ending lines :

sed -e 's/#.*//' -e 's/^[   ]*//' -e 's/[   ]*$//' -e '/^$/ d' hello.txt
This is the first line.
Line 2 ? argh
Line 3  ? hi
Line 4 starts with blank characters.
Line 6

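The order of operations is important: the /^$/ d command comes last so that lines emptied by the earlier substitutions (a comment-only line, for example) are deleted as well.

Together, the four commands remove all blank lines, comments, and tabs or spaces at the beginning or the end of a line.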
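
The same clean-up can be sketched with the POSIX class [[:space:]] instead of a literal space and tab, which is easier to type and to read. This is a variation, not the command used above, and the file messy.txt is hypothetical. The second invocation runs the empty-line deletion first, to show why the order matters:

```shell
tmp=$(mktemp -d)
printf '# header comment\nkeep me  \n\n\t indented # trailing comment\n' > "$tmp/messy.txt"

# Strip comments, then leading and trailing whitespace, then drop empty lines.
cleaned=$(sed -e 's/#.*//' \
              -e 's/^[[:space:]]*//' \
              -e 's/[[:space:]]*$//' \
              -e '/^$/d' "$tmp/messy.txt")

# Same commands, but with the deletion first: the comment-only line is not
# empty yet when /^$/d runs, so it survives as an empty line.
wrong_order=$(sed -e '/^$/d' \
                  -e 's/#.*//' \
                  -e 's/^[[:space:]]*//' \
                  -e 's/[[:space:]]*$//' "$tmp/messy.txt")

printf '[%s]\n' "$cleaned"
printf '[%s]\n' "$wrong_order"
rm -rf "$tmp"
```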

Test

sed '/\?$/d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

sed '/?$/d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

sed '/\\\!$/d' hello.txt 
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters. 

sed '/! $/d' hello.txt
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.