The `cut` command is a command-line utility for text processing. You can use it to extract portions of text from a file.
Selection can be done on:
- characters, with the `-c` option.
- fields, with the `-d` and `-f` options.
- bytes, with the `-b` option.

The `cut` command selects sections from each line of the input stream and sends the result to the standard output.
Let's create a simple file named 'cut.txt' to test this command:
cat > cut.txt << EOF
Luka:M:14
Mathias:M:11
Jules:M:11
Eloise:F:5
Thibaud:M:3
Nina:F:11
Zoe:F:15
Gaspard:M:6
EOF
The `-c` option cuts (selects) specific characters.
To select the second character:
cut -c2 cut.txt
u
a
u
l
h
i
o
a
To select a range of characters, for example `2-5`:
cut -c2-5 cut.txt
uka:
athi
ules
lois
hiba
ina:
oe:F
aspa
To select from the second character to the end of the line (`2-`):
cut -c2- cut.txt
uka:M:14
athias:M:11
ules:M:11
loise:F:5
hibaud:M:3
ina:F:11
oe:F:15
aspard:M:6
To select from the beginning of the line to the eighth character (`-8`):
cut -c-8 cut.txt
Luka:M:1
Mathias:
Jules:M:
Eloise:F
Thibaud:
Nina:F:1
Zoe:F:15
Gaspard:
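Several ranges can also be combined in a single `-c` list, separated by commas. The sketch below is a minimal illustration on one record in the same layout as `cut.txt`; the variable names are only for the example. It also shows that `cut` writes the selected characters in input order, whatever the order of the list.

```shell
# One sample record in the same name:gender:age layout as cut.txt.
line='Luka:M:14'

# Characters 1-4 plus everything from character 8 onward.
name_and_age=$(printf '%s\n' "$line" | cut -c1-4,8-)
echo "$name_and_age"   # Luka14

# The same list written in reverse order selects the same characters,
# and cut still prints them in input order.
reordered=$(printf '%s\n' "$line" | cut -c8-,1-4)
echo "$reordered"      # Luka14
```

Because the order of the list is ignored, `cut` alone cannot be used to swap two parts of a line.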
The `-d` option specifies the delimiter used in the file. The delimiter must be a single character, such as ',', ':' or ' '.
The `-f` option indicates the number of the field(s) to select.
cut -d ':' -f 3 cut.txt
14
11
11
5
3
11
15
6
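To select more than one column, separate the field numbers with commas. Each line here has only three fields, so field 4 does not exist and only the first column is printed: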
cut -d ':' -f 1,4 cut.txt
Luka
Mathias
Jules
Eloise
Thibaud
Nina
Zoe
Gaspard
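Field selection works the same way with other delimiters. The following sketch uses two made-up records, not taken from `cut.txt`: one separated by tabs, which is the delimiter `cut` assumes when `-d` is not given, and one separated by commas.

```shell
# Tab-separated record: TAB is the default delimiter, so -d is omitted.
tab_fields=$(printf 'Luka\tM\t14\n' | cut -f 1,3)
echo "$tab_fields"     # Luka<TAB>14

# Comma-separated record: the delimiter is passed explicitly with -d.
comma_field=$(printf 'Luka,M,14\n' | cut -d ',' -f 2)
echo "$comma_field"    # M
```

When several fields are selected, they are joined in the output with the same delimiter that was used to split the input.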
To select a range of columns:
cut -d ':' -f 2-3 cut.txt
M:14
M:11
M:11
F:5
M:3
F:11
F:15
M:6

## How to modify the output delimiter
The `--output-delimiter` option specifies the output delimiter.
cut -d ':' -f 1,3 --output-delimiter=' ' cut.txt
Luka 14
Mathias 11
Jules 11
Eloise 5
Thibaud 3
Nina 11
Zoe 15
Gaspard 6
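Unlike the input delimiter, the output delimiter may be a string of several characters. This is a small sketch assuming GNU `cut`, since `--output-delimiter` is a GNU option and may be missing from other implementations; the `record` and `spaced` names are only illustrative.

```shell
# One sample record in the same layout as cut.txt.
record='Luka:M:14'

# Split on ':' and join the three fields again with ' | '.
spaced=$(printf '%s\n' "$record" | cut -d ':' -f 1-3 --output-delimiter=' | ')
echo "$spaced"   # Luka | M | 14
```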
The `-c 2` option selects the second character of each line.
The `--complement` option inverts the selection, so `cut` prints everything except the second character:
cut --complement -c 2 cut.txt
Lka:M:14
Mthias:M:11
Jles:M:11
Eoise:F:5
Tibaud:M:3
Nna:F:11
Ze:F:15
Gspard:M:6
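The complement can be taken on fields as well as on characters, which is a convenient way to drop one column and keep the rest. A minimal sketch, again assuming GNU `cut` (where `--complement` is available) and using a single illustrative record:

```shell
# One sample record in the same name:gender:age layout as cut.txt.
record='Luka:M:14'

# Keep every field except the second one.
without_gender=$(printf '%s\n' "$record" | cut -d ':' --complement -f 2)
echo "$without_gender"   # Luka:14
```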
To cut out a section by specifying byte positions, use the `-b` option.
echo 'great' | cut -b 1-4
grea
echo 'école' | cut -b 1-2
é
With `-b`, the `cut` command works on a range of bytes, which is not necessarily equal to the number of characters. Accented letters such as 'é' are encoded on two bytes in UTF-8. This explains why the result is just one character in this case.
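The difference between bytes and characters can be checked directly. This sketch assumes UTF-8 and writes 'é' with octal escapes so that the bytes do not depend on how the script itself is saved; the variable names are illustrative.

```shell
# 'école' in UTF-8: \303\251 are the two bytes that encode 'é'.
word=$(printf '\303\251cole')

# Five characters, but six bytes.
byte_count=$(printf '%s' "$word" | wc -c | tr -d ' ')
echo "$byte_count"   # 6

# Skipping the first two bytes removes exactly the 'é'.
rest=$(printf '%s\n' "$word" | cut -b 3-)
echo "$rest"         # cole
```

Selecting only byte 1 would split the 'é' in half and produce an invalid character, so byte ranges need care with non-ASCII text.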