Linux Unix Cut Command

The cut command is used for text processing. It is a command line utility. You can use this command to extract portions of text from a file.

Selection can be done on bytes (-b), characters (-c), or fields (-f).

The cut command selects sections from each line of the input stream and sends the result to the standard output.

Let's create a simple file named 'cut.txt' to test this command:

cat > cut.txt << EOF
Luka:M:14
Mathias:M:11
Jules:M:11
Eloise:F:5
Thibaud:M:3
Nina:F:11
Zoe:F:15
Gaspard:M:6
EOF

Learning from examples

How to cut according to characters

The -c option cuts (selects) specific characters.

To select the second character:

cut -c2 cut.txt
u
a
u
l
h
i
o
a

To select a range of characters, for example characters 2 to 5:

cut -c2-5 cut.txt
uka:
athi
ules
lois
hiba
ina:
oe:F
aspa

To select from the second character to the end of the line:

cut -c2- cut.txt
uka:M:14
athias:M:11
ules:M:11
loise:F:5
hibaud:M:3
ina:F:11
oe:F:15
aspard:M:6

To select from the start of the line up to the eighth character:

cut -c-8 cut.txt
Luka:M:1
Mathias:
Jules:M:
Eloise:F
Thibaud:
Nina:F:1
Zoe:F:15
Gaspard:
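
Single positions and ranges can also be combined in one comma-separated list. The sketch below is a minimal illustration: it inlines two lines in the same format as cut.txt, and the variable names sample and picked are made up for the example.

```shell
# Two sample lines in the same name:gender:age format as cut.txt,
# inlined here so the sketch does not depend on the file existing.
sample='Luka:M:14
Zoe:F:15'

# Select character 1 plus characters 3 to 4 of each line.
picked=$(printf '%s\n' "$sample" | cut -c1,3-4)

printf '%s\n' "$picked"
```

For "Luka:M:14" this keeps L, k and a, so the first line printed is Lka.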

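How to cut according to a delimiter

The -d option sets the field delimiter (a tab by default) and the -f option selects fields. To select the third field: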

cut -d ':' -f 3 cut.txt
14
11
11
5
3
11
15
6
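
One detail of field selection that may be worth illustrating: a line that does not contain the delimiter is printed unchanged, unless the -s option is given. The sketch below assumes a made-up two-line input; the variable names are hypothetical.

```shell
# Hypothetical input: one delimited line and one line without any ':'.
input='Luka:M:14
no delimiter here'

# Default behaviour: the line without a delimiter is passed through as is.
with_all=$(printf '%s\n' "$input" | cut -d ':' -f 3)

# With -s, lines that do not contain the delimiter are suppressed.
only_delimited=$(printf '%s\n' "$input" | cut -s -d ':' -f 3)

printf 'without -s:\n%s\n' "$with_all"
printf 'with -s:\n%s\n' "$only_delimited"
```

Adding -s is a reasonable habit when the input may contain headers or blank lines that should not leak into the output.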

To select more than one column:

cut -d ':' -f 1,3 cut.txt
Luka:14
Mathias:11
Jules:11
Eloise:5
Thibaud:3
Nina:11
Zoe:15
Gaspard:6

To select a range of columns:

cut -d ':' -f 2-3 cut.txt
M:14
M:11
M:11
F:5
M:3
F:11
F:15
M:6

How to modify the output delimiter

The --output-delimiter option specifies the output delimiter.

cut -d ':' -f 1,3 --output-delimiter=' ' cut.txt
Luka 14
Mathias 11
Jules 11
Eloise 5
Thibaud 3
Nina 11
Zoe 15
Gaspard 6
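
As a further sketch of the same option, the colon-separated lines can be turned into comma-separated ones. This assumes GNU cut, since --output-delimiter is a GNU extension; the sample data and variable names are made up for the example.

```shell
# Two sample lines in the same format as cut.txt, inlined for self-containment.
sample='Luka:M:14
Zoe:F:15'

# Keep all three fields but join them with a comma instead of a colon.
# Assumption: GNU cut (the --output-delimiter option is not in POSIX).
csv=$(printf '%s\n' "$sample" | cut -d ':' -f 1-3 --output-delimiter=',')

printf '%s\n' "$csv"
```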

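How to cut the complement characters

The --complement option (a GNU extension) inverts the selection: it outputs everything except the selected characters. To drop the second character: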

cut --complement -c 2 cut.txt
Lka:M:14
Mthias:M:11
Jles:M:11
Eoise:F:5
Tibaud:M:3
Nna:F:11
Ze:F:15
Gspard:M:6
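
The complement can also be taken on fields rather than characters. The following is a minimal sketch assuming GNU cut; the inlined sample and the variable name without_gender are hypothetical.

```shell
# Two sample lines in the same name:gender:age format as cut.txt.
sample='Luka:M:14
Zoe:F:15'

# Output every field except the second one.
# Assumption: GNU cut (the --complement option is not in POSIX).
without_gender=$(printf '%s\n' "$sample" | cut -d ':' --complement -f 2)

printf '%s\n' "$without_gender"
```

This is often shorter than listing every field to keep when only one column has to be dropped.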

How to cut according to byte positions

To cut out a section by specifying byte positions use the -b option.

echo 'great' | cut -b 1-4
grea

echo 'école' | cut -b 1-2
é

The -b option counts bytes, and the number of bytes is not necessarily equal to the number of characters. In UTF-8, accented letters such as 'é' are encoded on two bytes, which explains why the result is a single character in this case.
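
The byte counts can be checked with wc -c. The sketch below assumes UTF-8 and writes 'é' with octal escapes so that the bytes do not depend on how the script itself is encoded; the variable names are made up for the example.

```shell
# "école" in UTF-8: the two bytes \303\251 encode 'é'.
word=$(printf '\303\251cole')

# Select the first two bytes, which together form the single character 'é'.
first_two=$(printf '%s\n' "$word" | cut -b 1-2)

# Count bytes; tr strips the padding some wc implementations add.
byte_count=$(printf '%s' "$first_two" | wc -c | tr -d ' ')
total_bytes=$(printf '%s' "$word" | wc -c | tr -d ' ')

printf 'bytes selected: %s, bytes in word: %s\n' "$byte_count" "$total_bytes"
```

The word has five characters but six bytes, so a byte range that ends in the middle of a multi-byte character would output an incomplete sequence.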