<!doctypehtml><html class="sidebar-visible no-js light"lang=en><head><meta charset=UTF-8><title>cut - CLI text processing with GNU Coreutils</title><meta content="text/html; charset=utf-8"http-equiv=Content-Type><meta content="Example based guide for specialized text processing with GNU Coreutils"name=description><meta content=width=device-width,initial-scale=1 name=viewport><meta content=#ffffff name=theme-color><meta content="CLI text processing with GNU Coreutils"property=og:title><meta content=website property=og:type><meta content="Example based guide for specialized text processing with GNU Coreutils"property=og:description><meta content=https://learnbyexample.github.io/cli_text_processing_coreutils/ property=og:url><meta content=https://raw.githubusercontent.com/learnbyexample/cli_text_processing_coreutils/main/images/cli_coreutils_ls.png property=og:image><meta content=1280 property=og:image:width><meta content=720 property=og:image:height><meta content=summary_large_image property=twitter:card><meta content=@learn_byexample property=twitter:site><link href=favicon.svg rel=icon><link rel="shortcut icon"href=favicon.png><link href=css/variables.css rel=stylesheet><link href=css/general.css rel=stylesheet><link href=css/chrome.css rel=stylesheet><link href=FontAwesome/css/font-awesome.css rel=stylesheet><link href=fonts/fonts.css rel=stylesheet><link href=highlight.css rel=stylesheet><link href=tomorrow-night.css rel=stylesheet><link href=ayu-highlight.css rel=stylesheet><link href=style.css rel=stylesheet><body><script>var path_to_root = "";
var default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? "navy" : "light";</script><script>try {
var theme = localStorage.getItem('mdbook-theme');
var sidebar = localStorage.getItem('mdbook-sidebar');
if (theme.startsWith('"') && theme.endsWith('"')) {
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
}
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
}
} catch (e) { }</script><script>var theme;
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
if (theme === null || theme === undefined) { theme = default_theme; }
var html = document.querySelector('html');
html.classList.remove('no-js')
html.classList.remove('light')
html.classList.add(theme);
html.classList.add('js');</script><script>var html = document.querySelector('html');
var sidebar = 'hidden';
if (document.body.clientWidth >= 1080) {
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
sidebar = sidebar || 'visible';
}
html.classList.remove('sidebar-visible');
html.classList.add("sidebar-" + sidebar);</script><nav aria-label="Table of contents"class=sidebar id=sidebar><div class=sidebar-scrollbox><ol class=chapter><li class="chapter-item expanded affix"><a href=cover.html>Cover</a><li class="chapter-item expanded affix"><a href=buy.html>Buy PDF/EPUB versions</a><li class="chapter-item expanded affix"><a href=preface.html>Preface</a><li class="chapter-item expanded"><a href=introduction.html><strong aria-hidden=true>1.</strong> Introduction</a><li class="chapter-item expanded"><a href=cat-tac.html><strong aria-hidden=true>2.</strong> cat and tac</a><li class="chapter-item expanded"><a href=head-tail.html><strong aria-hidden=true>3.</strong> head and tail</a><li class="chapter-item expanded"><a href=tr.html><strong aria-hidden=true>4.</strong> tr</a><li class="chapter-item expanded"><a class=active href=cut.html><strong aria-hidden=true>5.</strong> cut</a><li class="chapter-item expanded"><a href=seq.html><strong aria-hidden=true>6.</strong> seq</a><li class="chapter-item expanded"><a href=shuf.html><strong aria-hidden=true>7.</strong> shuf</a><li class="chapter-item expanded"><a href=paste.html><strong aria-hidden=true>8.</strong> paste</a><li class="chapter-item expanded"><a href=pr.html><strong aria-hidden=true>9.</strong> pr</a><li class="chapter-item expanded"><a href=fold-fmt.html><strong aria-hidden=true>10.</strong> fold and fmt</a><li class="chapter-item expanded"><a href=sort.html><strong aria-hidden=true>11.</strong> sort</a><li class="chapter-item expanded"><a href=uniq.html><strong aria-hidden=true>12.</strong> uniq</a><li class="chapter-item expanded"><a href=comm.html><strong aria-hidden=true>13.</strong> comm</a><li class="chapter-item expanded"><a href=join.html><strong aria-hidden=true>14.</strong> join</a><li class="chapter-item expanded"><a href=nl.html><strong aria-hidden=true>15.</strong> nl</a><li class="chapter-item expanded"><a href=wc.html><strong aria-hidden=true>16.</strong> wc</a><li 
class="chapter-item expanded"><a href=split.html><strong aria-hidden=true>17.</strong> split</a><li class="chapter-item expanded"><a href=csplit.html><strong aria-hidden=true>18.</strong> csplit</a><li class="chapter-item expanded"><a href=expand-unexpand.html><strong aria-hidden=true>19.</strong> expand and unexpand</a><li class="chapter-item expanded"><a href=basename-dirname.html><strong aria-hidden=true>20.</strong> basename and dirname</a><li class="chapter-item expanded affix"><a href=what_next.html>What next?</a><li class="chapter-item expanded affix"><a href=Exercise_solutions.html>Exercise solutions</a></li><br><hr><li class="chapter-item expanded"><i class="fa fa-github"id=git-repository-button></i><a href=https://github.com/learnbyexample/cli_text_processing_coreutils> Source code</a><li class="chapter-item expanded"><i class="fa fa-home"id=home-button></i><a href=https://learnbyexample.github.io/> My Blog</a><li class="chapter-item expanded"><i class="fa fa-book"id=book-button></i><a href=https://learnbyexample.github.io/books/> My Books</a><li class="chapter-item expanded"><i class="fa fa-envelope"id=mail-button></i><a href=https://learnbyexample.gumroad.com/l/learnbyexample-weekly> learnbyexample weekly</a><li class="chapter-item expanded"><i class="fa fa-twitter"id=twitter-button></i><a href=https://twitter.com/learn_byexample> Twitter</a></ol></div><div class=sidebar-resize-handle id=sidebar-resize-handle></div></nav><div class=page-wrapper id=page-wrapper><div class=page><div id=menu-bar-hover-placeholder></div><div class="menu-bar sticky bordered"id=menu-bar><div class=left-buttons><button aria-label="Toggle Table of Contents"title="Toggle Table of Contents"aria-controls=sidebar class=icon-button id=sidebar-toggle type=button><i class="fa fa-bars"></i></button><button aria-label="Change theme"title="Change theme"aria-controls=theme-list aria-expanded=false aria-haspopup=true class=icon-button id=theme-toggle type=button><i class="fa 
fa-paint-brush"></i></button><ul aria-label=Themes class=theme-popup id=theme-list role=menu><li role=none><button class=theme id=light role=menuitem>Light (default)</button><li role=none><button class=theme id=rust role=menuitem>Rust</button><li role=none><button class=theme id=coal role=menuitem>Coal</button><li role=none><button class=theme id=navy role=menuitem>Navy</button><li role=none><button class=theme id=ayu role=menuitem>Ayu</button></ul><button aria-label="Toggle Searchbar"title="Search. (Shortkey: s)"aria-controls=searchbar aria-expanded=false aria-keyshortcuts=S class=icon-button id=search-toggle type=button><i class="fa fa-search"></i></button></div><h1 class=menu-title>CLI text processing with GNU Coreutils</h1><div class=right-buttons><a aria-label=Blog href=https://learnbyexample.github.io title=Blog> <i class="fa fa-home"id=home-button></i> </a><a aria-label=Twitter href=https://twitter.com/learn_byexample title=Twitter> <i class="fa fa-twitter"id=twitter-button></i> </a><a aria-label="Git repository"title="Git repository"href=https://github.com/learnbyexample/cli_text_processing_coreutils> <i class="fa fa-github"id=git-repository-button></i> </a></div></div><div class=hidden id=search-wrapper><form class=searchbar-outer id=searchbar-outer><input placeholder="Search this book ..."aria-controls=searchresults-outer aria-describedby=searchresults-header id=searchbar name=searchbar type=search></form><div class="searchresults-outer hidden"id=searchresults-outer><div class=searchresults-header id=searchresults-header></div><ul id=searchresults></ul></div></div><script>document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
});</script><div class=content id=content><main><div class=sidetoc><nav class=pagetoc></nav></div><h1 id=cut><a class=header href=#cut>cut</a></h1><p><code>cut</code> is a handy tool for many field processing use cases. The features are limited compared to <code>awk</code> and <code>perl</code> commands, but the reduced scope also leads to faster processing.<h2 id=individual-field-selections><a class=header href=#individual-field-selections>Individual field selections</a></h2><p>By default, <code>cut</code> splits the input content into fields based on the tab character. You can use the <code>-f</code> option to select a desired field from each input line. To extract multiple fields, specify the selections separated by the comma character.<pre><code class=language-bash># only the second field
$ printf 'apple\tbanana\tcherry\n' | cut -f2
banana
# first and third fields
$ printf 'apple\tbanana\tcherry\n' | cut -f1,3
apple cherry
</code></pre><p><code>cut</code> always displays the selected fields in ascending order, regardless of the order in which you specify them, and a field is never displayed more than once.<pre><code class=language-bash># same as: cut -f1,3
$ printf 'apple\tbanana\tcherry\n' | cut -f3,1
apple cherry
# same as: cut -f1,2
$ printf 'apple\tbanana\tcherry\n' | cut -f1,1,2,1,2,1,1,2
apple banana
</code></pre><p>By default, <code>cut</code> uses the newline character as the line separator. <code>cut</code> will add a newline character to the output even if the last input line doesn't end with a newline.<pre><code class=language-bash>$ printf 'good\tfood\ntip\ttap' | cut -f2
food
tap
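# a quick check, not from the original text: count newlines before and after cut
# (assumes GNU wc, which prints a bare number when reading from stdin)
$ printf 'good\tfood\ntip\ttap' | wc -l
1
$ printf 'good\tfood\ntip\ttap' | cut -f2 | wc -l
2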
</code></pre><h2 id=field-ranges><a class=header href=#field-ranges>Field ranges</a></h2><p>You can use the <code>-</code> character to specify field ranges. You can omit either the start or the end of a range, but not both.<pre><code class=language-bash># 2nd, 3rd and 4th fields
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f2-4
banana cherry fig
# all fields from the start till the 3rd field
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f-3
apple banana cherry
# all fields from the 3rd one till the end
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f3-
cherry fig mango
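# a sketch, not from the original text: ranges and individual fields can be combined
# cat -T (a GNU cat option) is used here only to show tab characters as ^I
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f1,3-4 | cat -T
apple^Icherry^Ifig
$ printf 'apple\tbanana\tcherry\tfig\tmango\n' | cut -f-2,5 | cat -T
apple^Ibanana^Imango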
</code></pre><h2 id=input-field-delimiter><a class=header href=#input-field-delimiter>Input field delimiter</a></h2><p>Use the <code>-d</code> option to change the input delimiter. Only a single-byte character is allowed. By default, the output delimiter will be the same as the input delimiter.<pre><code class=language-bash>$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
$ cut -d, -f2,4 scores.csv
Maths,Chemistry
100,100
97,95
78,80
# use quotes if the delimiter is a shell metacharacter
$ echo 'one;two;three;four' | cut -d; -f3
cut: option requires an argument -- 'd'
Try 'cut --help' for more information.
-f3: command not found
$ echo 'one;two;three;four' | cut -d';' -f3
three
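# a sketch with made-up data, not from the original text: cut splits on every
# occurrence of the delimiter and has no notion of quoted fields, so quoted
# CSV data needs a CSV-aware tool instead
$ echo 'root:x:0:0:root:/root:/bin/bash' | cut -d: -f1,7
root:/bin/bash
$ echo '"Smith,J",42' | cut -d, -f2
J"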
</code></pre><h2 id=output-field-delimiter><a class=header href=#output-field-delimiter>Output field delimiter</a></h2><p>Use the <code>--output-delimiter</code> option to customize the output separator to any string of your choice. The string is treated literally. Depending on your shell, you can use <a href=https://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html>ANSI-C quoting</a> to allow escape sequences.<pre><code class=language-bash># same as: tr '\t' ','
$ printf 'apple\tbanana\tcherry\n' | cut --output-delimiter=, -f1-
apple,banana,cherry
# example for multicharacter output separator
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=' : ' -f1,3-
one : three : four
# ANSI-C quoting example
# depending on your environment, you can also press Ctrl+v and then the Tab key
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=$'\t' -f1,3-
one three four
# newline as the output field separator
$ echo 'one;two;three;four' | cut -d';' --output-delimiter=$'\n' -f2,4
two
four
</code></pre><h2 id=complement><a class=header href=#complement>Complement</a></h2><p>The <code>--complement</code> option allows you to invert the field selections.<pre><code class=language-bash># except the second field
$ printf 'apple ball cat\n1 2 3 4 5' | cut --complement -d' ' -f2
apple cat
1 3 4 5
# except the first and third fields
$ printf 'apple ball cat\n1 2 3 4 5' | cut --complement -d' ' -f1,3
ball
2 4 5
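# a sketch, not from the original text: --complement also works with ranges
$ echo '1 2 3 4 5' | cut --complement -d' ' -f2-4
1 5
$ echo '1 2 3 4 5' | cut --complement -d' ' -f3-
1 2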
</code></pre><h2 id=suppress-lines-without-delimiters><a class=header href=#suppress-lines-without-delimiters>Suppress lines without delimiters</a></h2><p>By default, lines not containing the input delimiter will still be part of the output. You can use the <code>-s</code> option to suppress such lines.<pre><code class=language-bash>$ cat mixed_fields.csv
1,2,3,4
hello
a,b,c
# second line doesn't have the comma separator
# by default, such lines will be part of the output
$ cut -d, -f2 mixed_fields.csv
2
hello
b
# use the -s option to suppress such lines
$ cut -sd, -f2 mixed_fields.csv
2
b
$ cut --complement -sd, -f2 mixed_fields.csv
1,3,4
a,c
</code></pre><blockquote><p><img alt=info src=./images/info.svg> If a line contains the specified delimiter but doesn't have the field number requested, you'll get a blank line. The <code>-s</code> option has no effect on such lines.<pre><code class=language-bash>$ printf 'apple ball cat\n1 2 3 4 5' | cut -d' ' -f4

4
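# a sketch, not from the original text: counting the output lines shows that
# the blank line is still produced when -s is used (assumes GNU wc)
$ printf 'apple ball cat\n1 2 3 4 5' | cut -s -d' ' -f4 | wc -l
2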
</code></pre></blockquote><h2 id=character-selections><a class=header href=#character-selections>Character selections</a></h2><p>You can use the <code>-b</code> or <code>-c</code> options to select specific bytes from each input line. The syntax is the same as for the <code>-f</code> option. The <code>-c</code> option is intended for multibyte character selection, but for now it works exactly like the <code>-b</code> option. Character selection is useful for working with <a href=https://stackoverflow.com/q/7666780/4082052>fixed-width fields</a>.<pre><code class=language-bash>$ printf 'apple\tbanana\tcherry\n' | cut -c2,8,11
pan
$ printf 'apple\tbanana\tcherry\n' | cut -c2,8,11 --output-delimiter=-
p-a-n
$ printf 'apple\tbanana\tcherry\n' | cut -c-5
apple
$ printf 'apple\tbanana\tcherry\n' | cut --complement -c13-
apple banana
$ printf 'cat-bat\ndog:fog\nget;pet' | cut -c5-
bat
fog
pet
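# a sketch with a made-up log line, not from the original text: with fixed-width
# columns, each piece can be selected by its character positions
$ printf '2024-01-15 ERROR disk full\n' | cut -c1-10
2024-01-15
$ printf '2024-01-15 ERROR disk full\n' | cut -c12-16
ERROR
$ printf '2024-01-15 ERROR disk full\n' | cut -c18-
disk full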
</code></pre><h2 id=nul-separator><a class=header href=#nul-separator>NUL separator</a></h2><p>Use the <code>-z</code> option if you want to use the NUL character as the line separator. In this scenario, <code>cut</code> will add a final NUL character to the output even if it is not present in the input.<pre><code class=language-bash>$ printf 'good-food\0tip-tap\0' | cut -zd- -f2 | cat -v
food^@tap^@
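# a sketch, not from the original text: here the input lacks the final NUL character,
# yet the output still ends with one
$ printf 'good-food\0tip-tap' | cut -zd- -f2 | cat -v
food^@tap^@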
</code></pre><h2 id=alternatives><a class=header href=#alternatives>Alternatives</a></h2><p>Here are some alternate commands you can explore if <code>cut</code> isn't enough to solve your task.<ul><li><a href=https://github.com/sstadick/hck>hck</a> — supports regexp delimiters, field reordering, header based selection, etc<li><a href=https://github.com/theryangeary/choose>choose</a> — negative indexing, regexp based delimiters, etc<li><a href=https://github.com/BurntSushi/xsv>xsv</a> — fast CSV command line toolkit<li><a href=https://github.com/learnbyexample/regexp-cut>rcut</a> — my <code>bash+awk</code> script, supports regexp delimiters, field reordering, negative indexing, etc<li><a href=https://github.com/learnbyexample/learn_gnuawk>awk</a> — my ebook on <code>GNU awk</code> one-liners<li><a href=https://github.com/learnbyexample/learn_perl_oneliners>perl</a> — my ebook on Perl one-liners</ul><h2 id=exercises><a class=header href=#exercises>Exercises</a></h2><blockquote><p><img alt=info src=images/info.svg> The <a href=https://github.com/learnbyexample/cli_text_processing_coreutils/tree/main/exercises>exercises</a> directory has all the files used in this section.</blockquote><p><strong>1)</strong> Display only the third field.<pre><code class=language-bash>$ printf 'tea\tcoffee\tchocolate\tfruit\n' | ##### add your solution here
chocolate
</code></pre><p><strong>2)</strong> Display the second and fifth fields. Consider <code>,</code> as the field separator.<pre><code class=language-bash>$ echo 'tea,coffee,chocolate,ice cream,fruit' | ##### add your solution here
coffee,fruit
</code></pre><p><strong>3)</strong> Why does the command below not work as expected? What other tools can you use in such cases?<pre><code class=language-bash># not working as expected
$ echo 'apple,banana,cherry,fig' | cut -d, -f3,1,3
apple,cherry
# expected output
$ echo 'apple,banana,cherry,fig' | ##### add your solution here
cherry,apple,cherry
</code></pre><p><strong>4)</strong> Display all fields except the second, in the format shown below. Can you construct two different solutions?<pre><code class=language-bash># solution 1
$ echo 'apple,banana,cherry,fig' | ##### add your solution here
apple cherry fig
# solution 2
$ echo '2,3,4,5,6,7,8' | ##### add your solution here
2 4 5 6 7 8
</code></pre><p><strong>5)</strong> Extract the first three characters from the input lines as shown below. Can you also use the <code>head</code> command for this purpose? If not, why not?<pre><code class=language-bash>$ printf 'apple\nbanana\ncherry\nfig\n' | ##### add your solution here
app
ban
che
fig
</code></pre><p><strong>6)</strong> Display only the first and third fields of the <code>scores.csv</code> input file, with tab as the output field separator.<pre><code class=language-bash>$ cat scores.csv
Name,Maths,Physics,Chemistry
Ith,100,100,100
Cy,97,98,95
Lin,78,83,80
##### add your solution here
Name Physics
Ith 100
Cy 98
Lin 83
</code></pre><p><strong>7)</strong> The given input data uses one or more <code>:</code> characters as the field separator. Assume that no field content will have the <code>:</code> character. Display all fields except the second, with <code> : </code> (a colon surrounded by spaces, as in the expected output) as the output field separator.<pre><code class=language-bash>$ cat books.txt
Cradle:::Mage Errant::The Weirkey Chronicles
Mother of Learning::Eight:::::Dear Spellbook:Ascendant
Mark of the Fool:Super Powereds:::Ends of Magic
##### add your solution here
Cradle : The Weirkey Chronicles
Mother of Learning : Dear Spellbook : Ascendant
Mark of the Fool : Ends of Magic
</code></pre><p><strong>8)</strong> Which option would you use to not display lines that do not contain the input delimiter character?<p><strong>9)</strong> Modify the command to get the expected output shown below.<pre><code class=language-bash>$ printf 'apple\nbanana\ncherry\n' | cut -c-3 --output-delimiter=:
app
ban
che
$ printf 'apple\nbanana\ncherry\n' | ##### add your solution here
a:p:p
b:a:n
c:h:e
</code></pre><p><strong>10)</strong> Figure out the logic based on the given input and output data.<pre><code class=language-bash>$ printf 'apple\0fig\0carpet\0jeep\0' | ##### add your solution here | cat -v
ple^@g^@rpet^@ep^@
</code></pre></main><nav aria-label="Page navigation"class=nav-wrapper><a aria-label="Previous chapter"class="mobile-nav-chapters previous"title="Previous chapter"aria-keyshortcuts=Left href=tr.html rel=prev> <i class="fa fa-angle-left"></i> </a><a aria-label="Next chapter"class="mobile-nav-chapters next"title="Next chapter"aria-keyshortcuts=Right href=seq.html rel=next> <i class="fa fa-angle-right"></i> </a><div style="clear: both"></div></nav></div></div><nav aria-label="Page navigation"class=nav-wide-wrapper><a aria-label="Previous chapter"class="nav-chapters previous"title="Previous chapter"aria-keyshortcuts=Left href=tr.html rel=prev> <i class="fa fa-angle-left"></i> </a><a aria-label="Next chapter"class="nav-chapters next"title="Next chapter"aria-keyshortcuts=Right href=seq.html rel=next> <i class="fa fa-angle-right"></i> </a></nav></div><script>window.playground_copyable = true;</script><script charset=utf-8 src=elasticlunr.min.js></script><script charset=utf-8 src=mark.min.js></script><script charset=utf-8 src=searcher.js></script><script charset=utf-8 src=clipboard.min.js></script><script charset=utf-8 src=highlight.js></script><script charset=utf-8 src=book.js></script><script src=sidebar.js></script>