Sed is commonly used to search for and replace substrings in text files, and to delete or extract lines from them.

How does SED Work ?

SED is a Stream EDitor. It works in "stream" mode: the input is processed line by line. This gives good performance and low memory use, but it means sed has no overview of the entire file.

SED Processing Steps :

Sed reads the first line from the input stream.
The commands of the script are applied to that line.
Sed prints the resulting line to standard output, unless the -n option is specified.
Sed reads the next line and repeats the previous steps.
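
The cycle above can be observed on a two-line input. This is only a minimal sketch; the sample text and the variable names are made up for illustration.

```shell
# Each input line is read, transformed by the script, then printed automatically.
auto=$(printf 'one\ntwo\n' | sed 's/o/0/')
printf '%s\n' "$auto"

# With -n the automatic print is suppressed, so this script produces no output.
quiet=$(printf 'one\ntwo\n' | sed -n 's/o/0/')
printf '[%s]\n' "$quiet"
```

The second invocation still performs the substitution on every line; nothing is shown only because no command asks sed to print.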

Note: the man page gives the following synopsis:

sed [OPTION]... {script-only-if-no-other-script} [input-file]...

Commands

sed receives a script which contains all actions to be performed on the input stream.

There are two ways to pass this script to sed:

On the sed command line, with the -e option: the actions may be separated by semicolons inside a single quoted script, as in sed -e 'action1; action2; action3', or each action may be given its own -e option, as in sed -e action1 -e action2 -e action3
From an external file (e.g. myscript.sed), with the -f option: sed -f script-file reads the commands from that file. This improves readability for large scripts and allows a script to be reused.
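
As a rough illustration, the three invocations below should be equivalent. The input lines and the temporary script file are assumptions made for this sketch, not part of the original text.

```shell
# Hypothetical four-line input used by all three invocations.
input=$(printf 'alpha\nbeta\ngamma\ndelta\n')

# 1. One -e option per action.
a=$(printf '%s\n' "$input" | sed -e '1d' -e '3d')

# 2. A single script with the actions separated by a semicolon.
b=$(printf '%s\n' "$input" | sed -e '1d; 3d')

# 3. The same actions read from a script file with -f (one command per line).
script=$(mktemp)
printf '1d\n3d\n' > "$script"
c=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$a"
```

Each variant deletes lines 1 and 3 and keeps "beta" and "delta".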

Output stream

There are two ways to handle the output of sed:

The first method applies the command to an input stream and redirects the resulting lines to an output stream. For example, sed may be applied to an input file and its output redirected to another file.
The second method is the "direct" method, with the -i option: sed -i applies the command directly to the input file and modifies it.
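
Both methods can be sketched on a throwaway file. The file names are hypothetical. Note that -i is not part of POSIX sed and its syntax differs between implementations; the -i.bak form used here, which also keeps a backup copy, is assumed to be accepted by both GNU and BSD sed.

```shell
dir=$(mktemp -d)
printf 'foo\nbar\n' > "$dir/in.txt"

# Method 1: read in.txt and write the result to another file; in.txt is unchanged.
sed 's/foo/baz/' "$dir/in.txt" > "$dir/out.txt"

# Method 2: edit in.txt in place; the original content is kept in in.txt.bak.
sed -i.bak 's/bar/qux/' "$dir/in.txt"

out=$(cat "$dir/out.txt")
edited=$(cat "$dir/in.txt")
backup=$(cat "$dir/in.txt.bak")
printf '%s\n' "$edited"
rm -rf "$dir"
```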

Let's create a simple file to test sed commands and display results in a terminal window :

cat > hello.txt <<- EOF
This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

    # This is the last comment starting with blank characters. 
EOF

This file is saved as hello.txt. To display this file, we can use the cat command :

cat hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

Deleting lines with SED "d command"

The d command deletes the selected lines. Because sed works on a data stream, the input file itself is not modified: sed simply does not print the selected line and moves on to the next one.
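
A quick way to check that d only affects what is printed, using a throwaway file (the file content is an assumption made for this sketch):

```shell
file=$(mktemp)
printf 'first\nsecond\nthird\n' > "$file"

shown=$(sed '2d' "$file")   # what sed prints: line 2 is skipped
kept=$(cat "$file")         # what the file still contains: all three lines

printf '%s\n' "$shown"
rm -f "$file"
```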

We may filter lines by requesting a selection on line numbers.

Let's delete line 3 for example :

sed '3d' hello.txt

This is the first line.
Line 2 ? argh 
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

It worked ! Line 3 was deleted.

Line selection can also be done as an interval

Let's delete lines 2 to 3:

sed '2,3d' hello.txt

This is the first line.
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

We jumped after line 1 directly to line 4.

We may want to perform two actions in the same command line. The -e option allows several commands to be executed in sequence.

Suppose the first action deletes line 1 and the second deletes line 3:

sed -e '1d' -e '3d' hello.txt is the same as

sed -e '1d; 3d' hello.txt

Line 2 ? argh 
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

You may notice that lines 1 and 3 were deleted.

Let's now delete lines which contain the string : "comment"

sed '/comment/d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

Let's remove lines from line 4 to the end of file.

sed '4,$ d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi

Negative condition `!`

Let's delete every line outside a specified range, here everything except lines 2 to 4. The symbol ! negates the condition:

sed '2,4!d' hello.txt

Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

Using Regular Expression

In regular expressions, the symbol ^ means the beginning of a line and $ means the end of a line, so the pattern ^$ stands for an empty line. Patterns are written between slash characters.

To delete empty lines:

sed '/^$/d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
    Line 6 # This is a comment.
# This is the last comment starting with blank characters.

Deleting all lines starting with the string "Line"

sed '/^Line/ d' hello.txt

This is the first line.
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

Deleting all lines ending with some specified characters could also be useful

sed '/[ih]$/d' hello.txt

This is the first line.
Line 2 ? argh 
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

[ih] matches either 'i' or 'h', so this deletes all lines ending with 'i' or 'h'. Line 2 is kept because it ends with a space after "argh".
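
Trailing blanks are invisible in normal output, which makes results like this one easy to misread. One way to reveal them is the l command, which prints each line unambiguously and marks its end with a $ sign. The two sample lines below are hypothetical.

```shell
# Two hypothetical lines: the first has a trailing space, the second does not.
sample=$(printf 'argh \nhi\n')

# The l command marks the end of each line with $, exposing the trailing space.
visible=$(printf '%s\n' "$sample" | sed -n 'l')
printf '%s\n' "$visible"

# Only the line that really ends with i or h is deleted.
left=$(printf '%s\n' "$sample" | sed '/[ih]$/d')
```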

We can also use regular expressions to select an interval of lines.

Rules for selecting lines from an interval (two patterns separated by a comma):

sed selects the first line matching the first pattern (before the comma).
It also selects the following lines until another line matches the second pattern (after the comma); that line is included.
It skips the following lines until one matches the first pattern again.
It then selects the following lines until the second pattern is matched (if no line matches, the selection extends to the last line).

sed '/^Line/,/characters\.$/!d' hello.txt

Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
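
The interval rules, including the case where the second pattern is never matched again, can be sketched on a made-up input. The START and END markers are hypothetical and chosen only for readability.

```shell
# Hypothetical input: one complete START..END block, then a START with no END.
ranges=$(printf 'a\nSTART\nb\nEND\nc\nSTART\nd\n' | sed -n '/START/,/END/p')
printf '%s\n' "$ranges"
```

The lines "a" and "c" fall outside any interval and are skipped, while the second interval runs to the last line because no further END is found.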

Printing with "p command"

First we'll ask sed to apply the command p (print) to the lines of the input file without any filtering.

sed -e 'p' hello.txt

This is the first line.
This is the first line.
Line 2 ? argh 
Line 2 ? argh 
Line 3  ? hi
Line 3  ? hi
    Line 4 starts with blank characters.
    Line 4 starts with blank characters.


    Line 6 # This is a comment.
    Line 6 # This is a comment.


# This is the last comment starting with blank characters. 
# This is the last comment starting with blank characters.

Notice that every line is duplicated. Here is why: by default, sed prints the resulting line to standard output unless it is invoked with the -n option. With the p command, sed is also explicitly asked to print the line, so each line appears twice.

Let's try again with the -n option:

sed -n -e 'p' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

The lines are no longer duplicated.

The p command can be used as the counterpart of the d command.

For example, to pick out all lines containing a given word, let's try again with the word "comment":

sed -n '/comment/p' hello.txt

    Line 6 # This is a comment.
# This is the last comment starting with blank characters.

The negative form of "d command" will produce the same result :

sed '/comment/!d' hello.txt

    Line 6 # This is a comment.
# This is the last comment starting with blank characters.

p is mainly used for two purposes:

To display the nth line of a file, or a range of lines selected by number. The head and tail tools can do this as well.
To filter the lines of an interval delimited by regular expressions. Note that to filter lines matching a single pattern, you can use the grep tool instead.
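
The comparison with head, tail and grep can be sketched as follows. The five-line file is an assumption made for this example.

```shell
file=$(mktemp)
printf 'l1\nl2\nl3\nl4\nl5\n' > "$file"

third=$(sed -n '3p' "$file")                 # only line 3
third_ht=$(head -n 3 "$file" | tail -n 1)    # the same line via head and tail

middle=$(sed -n '2,4p' "$file")              # lines 2 to 4

match=$(sed -n '/l[24]/p' "$file")           # lines matching a pattern
match_grep=$(grep 'l[24]' "$file")           # the same selection with grep

printf '%s\n' "$middle"
rm -f "$file"
```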

"s command" for Substitution

Sed has several commands, but substitute is the most widely used, because the d and p commands can often be replaced by other tools such as grep, head, tail or tr.

The substitute command replaces a pattern found in the input stream (a text file, for example) with a new value. The pattern may be a regular expression.

s/pattern/replacement/

By default, only the first occurrence of the pattern in each line is replaced, unless the g flag is added at the end of the command:

s/pattern/replacement/g

One can also choose to replace only the third occurrence:

s/pattern/replacement/3

Let's create a test file

cat > testS.txt <<- EOF
It the sky we look upon now now
Should tumble and fall
All of the mountains may crumble May crumble to the sea
EOF

A simple example that replaces the first occurrence of "crumble" in each line with "fall":

sed 's/crumble/fall/' testS.txt

It the sky we look upon now now
Should tumble and fall
All of the mountains may fall May crumble to the sea

Let's add the g flag to replace all occurrences in the line:

sed 's/crumble/fall/g' testS.txt

It the sky we look upon now now
Should tumble and fall
All of the mountains may fall May fall to the sea

And to replace only the second occurrence in the line:

sed 's/crumble/fall/2' testS.txt

It the sky we look upon now now
Should tumble and fall
All of the mountains may crumble May fall to the sea

Removing blank lines, comments, tab or space characters starting or ending a line may be useful

This is our previous text :

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

We need two commands, with patterns that use regular expressions:

The first is a substitute command, -e 's/#.*//', which replaces any comment with an empty string: it removes everything from the "#" character to the end of the line. The symbol . means any character, and * means zero or more repetitions of the preceding item.
The second is a delete command, -e '/^$/ d', which removes all empty lines.

Let's see the result :

sed -e 's/#.*//' -e '/^$/ d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.
    Line 6

To remove all space and tab characters at the start or end of a line, we need two more substitute commands:

To remove (replace with an empty string) the tab or space characters at the start of the line: -e 's/^[ ]*//'
To remove the tab or space characters at the end of the line: -e 's/[ ]*$//'

You have to type a literal space character and a literal tab character inside the brackets ['Space key''Tab key'].

[ ] matches either a space character or a tab character.
The symbol * placed after the brackets means that the characters inside the brackets may be repeated zero or more times.
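
Typing a literal tab inside the brackets can be awkward in a terminal. As an alternative to the bracket expression described above (a swapped-in technique, not the one used in this text), the POSIX character class [[:blank:]] matches a space or a tab. A minimal sketch on a made-up line:

```shell
# Build a line that starts with a tab and a space and ends with a space and a tab.
padded=$(printf '\t text in the middle \t')

# Strip leading blanks, then trailing blanks.
trimmed=$(printf '%s\n' "$padded" | sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//')
printf '[%s]\n' "$trimmed"
```

The square brackets in the printed result only make the absence of surrounding blanks visible.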

Let's clean our text from comments, blank lines, tab and space characters starting or ending lines :

sed -e 's/#.*//' -e 's/^[   ]*//' -e 's/[   ]*$//' -e '/^$/ d' hello.txt

This is the first line.
Line 2 ? argh
Line 3  ? hi
Line 4 starts with blank characters.
Line 6

The order of the operations is important:

Comments might start in the middle of a line. Therefore comments are removed first, potentially leaving the white space characters that preceded the comment.
The second and third commands remove all leading and trailing blanks, so that lines that are now blank become empty lines.
The last command deletes empty lines.

Together, the four commands remove all blank lines, comments, and tabs or spaces at the beginning or the end of a line.
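
For reuse, the same four commands could be stored in a script file and run with -f. This is only a sketch: the file names cleanup.sed and sample.txt are hypothetical, and [[:blank:]] (space or tab) is used here in place of literal space and tab characters.

```shell
dir=$(mktemp -d)

# Hypothetical script file: one command per line, in the order discussed above.
cat > "$dir/cleanup.sed" << 'EOF'
# 1. remove comments
s/#.*//
# 2. remove leading blanks
s/^[[:blank:]]*//
# 3. remove trailing blanks
s/[[:blank:]]*$//
# 4. delete lines that are now empty
/^$/d
EOF

# Made-up sample: a trailing comment, an empty line, a comment-only line, an indented line.
printf 'keep me   # note\n\n   # only a comment\n  indented\n' > "$dir/sample.txt"
cleaned=$(sed -f "$dir/cleanup.sed" "$dir/sample.txt")
printf '%s\n' "$cleaned"
rm -rf "$dir"
```

Moving the delete command before the substitutions would leave the comment-only line behind as an empty line, which illustrates why the order matters.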

Test

sed '/\?$/d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

sed '/?$/d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

sed '/\\\!$/d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.

sed '/! $/d' hello.txt

This is the first line.
Line 2 ? argh 
Line 3  ? hi
    Line 4 starts with blank characters.

    Line 6 # This is a comment.

# This is the last comment starting with blank characters.
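
None of the lines in hello.txt ends with ? or !, which would explain why the commands above print the file unchanged. A minimal sketch with a made-up input where such lines do exist, relying on the fact that an unescaped ? or ! is an ordinary character in a basic regular expression:

```shell
# Hypothetical input with one line ending in ? and one ending in !
punct=$(printf 'really?\nstop!\nplain\n')

no_question=$(printf '%s\n' "$punct" | sed '/?$/d')   # drops "really?"
no_bang=$(printf '%s\n' "$punct" | sed '/!$/d')       # drops "stop!"

printf '%s\n' "$no_question"
```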