<!doctypehtml><html class="sidebar-visible no-js light"lang=en><head><meta charset=UTF-8><title>comm - CLI text processing with GNU Coreutils</title><meta content="text/html; charset=utf-8"http-equiv=Content-Type><meta content="Example based guide for specialized text processing with GNU Coreutils"name=description><meta content=width=device-width,initial-scale=1 name=viewport><meta content=#ffffff name=theme-color><meta content="CLI text processing with GNU Coreutils"property=og:title><meta content=website property=og:type><meta content="Example based guide for specialized text processing with GNU Coreutils"property=og:description><meta content=https://learnbyexample.github.io/cli_text_processing_coreutils/ property=og:url><meta content=https://raw.githubusercontent.com/learnbyexample/cli_text_processing_coreutils/main/images/cli_coreutils_ls.png property=og:image><meta content=1280 property=og:image:width><meta content=720 property=og:image:height><meta content=summary_large_image property=twitter:card><meta content=@learn_byexample property=twitter:site><link href=favicon.svg rel=icon><link rel="shortcut icon"href=favicon.png><link href=css/variables.css rel=stylesheet><link href=css/general.css rel=stylesheet><link href=css/chrome.css rel=stylesheet><link href=FontAwesome/css/font-awesome.css rel=stylesheet><link href=fonts/fonts.css rel=stylesheet><link href=highlight.css rel=stylesheet><link href=tomorrow-night.css rel=stylesheet><link href=ayu-highlight.css rel=stylesheet><link href=style.css rel=stylesheet><body><script>var path_to_root = "";
var default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? "navy" : "light";</script><script>try {
var theme = localStorage.getItem('mdbook-theme');
var sidebar = localStorage.getItem('mdbook-sidebar');
if (theme.startsWith('"') && theme.endsWith('"')) {
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
}
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
}
} catch (e) { }</script><script>var theme;
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
if (theme === null || theme === undefined) { theme = default_theme; }
var html = document.querySelector('html');
html.classList.remove('no-js')
html.classList.remove('light')
html.classList.add(theme);
html.classList.add('js');</script><script>var html = document.querySelector('html');
var sidebar = 'hidden';
if (document.body.clientWidth >= 1080) {
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
sidebar = sidebar || 'visible';
}
html.classList.remove('sidebar-visible');
html.classList.add("sidebar-" + sidebar);</script><nav aria-label="Table of contents"class=sidebar id=sidebar><div class=sidebar-scrollbox><ol class=chapter><li class="chapter-item expanded affix"><a href=cover.html>Cover</a><li class="chapter-item expanded affix"><a href=buy.html>Buy PDF/EPUB versions</a><li class="chapter-item expanded affix"><a href=preface.html>Preface</a><li class="chapter-item expanded"><a href=introduction.html><strong aria-hidden=true>1.</strong> Introduction</a><li class="chapter-item expanded"><a href=cat-tac.html><strong aria-hidden=true>2.</strong> cat and tac</a><li class="chapter-item expanded"><a href=head-tail.html><strong aria-hidden=true>3.</strong> head and tail</a><li class="chapter-item expanded"><a href=tr.html><strong aria-hidden=true>4.</strong> tr</a><li class="chapter-item expanded"><a href=cut.html><strong aria-hidden=true>5.</strong> cut</a><li class="chapter-item expanded"><a href=seq.html><strong aria-hidden=true>6.</strong> seq</a><li class="chapter-item expanded"><a href=shuf.html><strong aria-hidden=true>7.</strong> shuf</a><li class="chapter-item expanded"><a href=paste.html><strong aria-hidden=true>8.</strong> paste</a><li class="chapter-item expanded"><a href=pr.html><strong aria-hidden=true>9.</strong> pr</a><li class="chapter-item expanded"><a href=fold-fmt.html><strong aria-hidden=true>10.</strong> fold and fmt</a><li class="chapter-item expanded"><a href=sort.html><strong aria-hidden=true>11.</strong> sort</a><li class="chapter-item expanded"><a href=uniq.html><strong aria-hidden=true>12.</strong> uniq</a><li class="chapter-item expanded"><a class=active href=comm.html><strong aria-hidden=true>13.</strong> comm</a><li class="chapter-item expanded"><a href=join.html><strong aria-hidden=true>14.</strong> join</a><li class="chapter-item expanded"><a href=nl.html><strong aria-hidden=true>15.</strong> nl</a><li class="chapter-item expanded"><a href=wc.html><strong aria-hidden=true>16.</strong> wc</a><li 
class="chapter-item expanded"><a href=split.html><strong aria-hidden=true>17.</strong> split</a><li class="chapter-item expanded"><a href=csplit.html><strong aria-hidden=true>18.</strong> csplit</a><li class="chapter-item expanded"><a href=expand-unexpand.html><strong aria-hidden=true>19.</strong> expand and unexpand</a><li class="chapter-item expanded"><a href=basename-dirname.html><strong aria-hidden=true>20.</strong> basename and dirname</a><li class="chapter-item expanded affix"><a href=what_next.html>What next?</a><li class="chapter-item expanded affix"><a href=Exercise_solutions.html>Exercise solutions</a></li><br><hr><li class="chapter-item expanded"><i class="fa fa-github"id=git-repository-button></i><a href=https://github.com/learnbyexample/cli_text_processing_coreutils> Source code</a><li class="chapter-item expanded"><i class="fa fa-home"id=home-button></i><a href=https://learnbyexample.github.io/> My Blog</a><li class="chapter-item expanded"><i class="fa fa-book"id=book-button></i><a href=https://learnbyexample.github.io/books/> My Books</a><li class="chapter-item expanded"><i class="fa fa-envelope"id=mail-button></i><a href=https://learnbyexample.gumroad.com/l/learnbyexample-weekly> learnbyexample weekly</a><li class="chapter-item expanded"><i class="fa fa-twitter"id=twitter-button></i><a href=https://twitter.com/learn_byexample> Twitter</a></ol></div><div class=sidebar-resize-handle id=sidebar-resize-handle></div></nav><div class=page-wrapper id=page-wrapper><div class=page><div id=menu-bar-hover-placeholder></div><div class="menu-bar sticky bordered"id=menu-bar><div class=left-buttons><button aria-label="Toggle Table of Contents"title="Toggle Table of Contents"aria-controls=sidebar class=icon-button id=sidebar-toggle type=button><i class="fa fa-bars"></i></button><button aria-label="Change theme"title="Change theme"aria-controls=theme-list aria-expanded=false aria-haspopup=true class=icon-button id=theme-toggle type=button><i class="fa 
fa-paint-brush"></i></button><ul aria-label=Themes class=theme-popup id=theme-list role=menu><li role=none><button class=theme id=light role=menuitem>Light (default)</button><li role=none><button class=theme id=rust role=menuitem>Rust</button><li role=none><button class=theme id=coal role=menuitem>Coal</button><li role=none><button class=theme id=navy role=menuitem>Navy</button><li role=none><button class=theme id=ayu role=menuitem>Ayu</button></ul><button aria-label="Toggle Searchbar"title="Search. (Shortkey: s)"aria-controls=searchbar aria-expanded=false aria-keyshortcuts=S class=icon-button id=search-toggle type=button><i class="fa fa-search"></i></button></div><h1 class=menu-title>CLI text processing with GNU Coreutils</h1><div class=right-buttons><a aria-label=Blog href=https://learnbyexample.github.io title=Blog> <i class="fa fa-home"id=home-button></i> </a><a aria-label=Twitter href=https://twitter.com/learn_byexample title=Twitter> <i class="fa fa-twitter"id=twitter-button></i> </a><a aria-label="Git repository"title="Git repository"href=https://github.com/learnbyexample/cli_text_processing_coreutils> <i class="fa fa-github"id=git-repository-button></i> </a></div></div><div class=hidden id=search-wrapper><form class=searchbar-outer id=searchbar-outer><input placeholder="Search this book ..."aria-controls=searchresults-outer aria-describedby=searchresults-header id=searchbar name=searchbar type=search></form><div class="searchresults-outer hidden"id=searchresults-outer><div class=searchresults-header id=searchresults-header></div><ul id=searchresults></ul></div></div><script>document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
});</script><div class=content id=content><main><div class=sidetoc><nav class=pagetoc></nav></div><h1 id=comm><a class=header href=#comm>comm</a></h1><p>The <code>comm</code> command finds common and unique lines between two sorted files. These results are formatted as a table with three columns and one or more of these columns can be suppressed as required.<h2 id=three-column-output><a class=header href=#three-column-output>Three column output</a></h2><p>Consider the sample input files as shown below:<pre><code class=language-bash># side by side view of the sample files
# note that these files are already sorted
$ paste colors_1.txt colors_2.txt
Blue Black
Brown Blue
Orange Green
Purple Orange
Red Pink
Teal Red
White White
</code></pre><p>By default, <code>comm</code> produces tabular output with three columns:<ul><li>the first column has lines unique to the first file<li>the second column has lines unique to the second file<li>the third column has lines common to both files</ul><p>The columns are separated by a tab character. Here's the output for the above sample files:<pre><code class=language-bash>$ comm colors_1.txt colors_2.txt
	Black
		Blue
Brown
	Green
		Orange
	Pink
Purple
		Red
Teal
		White
</code></pre><p>You can change the column separator to a string of your choice using the <code>--output-delimiter</code> option. Here's an example:<pre><code class=language-bash># note that the input files need not have the same number of lines
$ comm <(seq 3) <(seq 2 5)
1
		2
		3
	4
	5
$ comm --output-delimiter=, <(seq 3) <(seq 2 5)
1
,,2
,,3
,4
,5
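</code></pre><p>Since <code>comm</code> expects sorted input, unsorted data can be sorted on the fly. Here's a minimal sketch using process substitution with <code>sort</code>; the <code>printf</code> data is made up purely for illustration:<pre><code class=language-bash># common lines between two unsorted inputs
$ comm -12 <(printf 'Red\nBlue\nGreen\n' | sort) <(printf 'Green\nPink\nBlue\n' | sort)
Blue
Green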
</code></pre><blockquote><p><img alt=info src=./images/info.svg> The collating order used by <code>comm</code> should be the same as the one used to <code>sort</code> the input files.</blockquote><blockquote><p><img alt=info src=./images/info.svg> The <code>--nocheck-order</code> option can be used for unsorted inputs. However, as per the documentation, this option "is not guaranteed to produce any particular output."</blockquote><h2 id=suppressing-columns><a class=header href=#suppressing-columns>Suppressing columns</a></h2><p>You can use one or more of the following options to suppress columns:<ul><li><code>-1</code> to suppress the lines unique to the first file<li><code>-2</code> to suppress the lines unique to the second file<li><code>-3</code> to suppress the lines common to both files</ul><p>Here's what the output looks like when you suppress one of the columns:<pre><code class=language-bash># suppress lines common to both the files
$ comm -3 colors_1.txt colors_2.txt
	Black
Brown
	Green
	Pink
Purple
Teal
</code></pre><p>Combining two of these options gives three useful solutions. <code>-12</code> will give you only the common lines.<pre><code class=language-bash>$ comm -12 colors_1.txt colors_2.txt
Blue
Orange
Red
White
</code></pre><p><code>-23</code> will give you the lines unique to the first file.<pre><code class=language-bash>$ comm -23 colors_1.txt colors_2.txt
Brown
Purple
Teal
</code></pre><p><code>-13</code> will give you the lines unique to the second file.<pre><code class=language-bash>$ comm -13 colors_1.txt colors_2.txt
Black
Green
Pink
</code></pre><p>You can combine all three options as well. This is useful with the <code>--total</code> option to get only the line counts for each of the three columns.<pre><code class=language-bash>$ comm --total -123 colors_1.txt colors_2.txt
3 3 4 total
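</code></pre><p>The counts are separated by the same delimiter as the columns (tab by default), so a single count can be picked out with a field-based tool. A small sketch, assuming <code>cut</code> is used for the extraction:<pre><code class=language-bash># number of lines common to both the files
$ comm --total -123 colors_1.txt colors_2.txt | cut -f3
4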
</code></pre><h2 id=duplicate-lines><a class=header href=#duplicate-lines>Duplicate lines</a></h2><p>When a line is repeated, the number of copies shown in the common column is the minimum of its occurrence counts in the two files. Any remaining copies are treated as unique to the file that has the excess. Here's an example:<pre><code class=language-bash>$ paste list_1.txt list_2.txt
apple cherry
banana cherry
cherry mango
cherry papaya
cherry
cherry
# 'cherry' occurs only twice in the second file
# rest of the 'cherry' lines will be unique to the first file
$ comm list_1.txt list_2.txt
apple
banana
		cherry
		cherry
cherry
cherry
	mango
	papaya
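</code></pre><p>If repeated lines shouldn't affect the comparison, one possible approach (an assumption here, not something <code>comm</code> does by itself) is to remove duplicates from the inputs first, for example with <code>sort -u</code>:<pre><code class=language-bash># extra 'cherry' lines are reported as unique to the first file
$ comm -23 list_1.txt list_2.txt
apple
banana
cherry
cherry
# deduplicate both inputs before comparing
$ comm -23 <(sort -u list_1.txt) <(sort -u list_2.txt)
apple
banana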
</code></pre><h2 id=nul-separator><a class=header href=#nul-separator>NUL separator</a></h2><p>Use the <code>-z</code> option if you want to use the NUL character as the line separator. In this case, <code>comm</code> adds a final NUL character to the output even if the input doesn't end with one.<pre><code class=language-bash>$ comm -z -12 <(printf 'a\0b\0c') <(printf 'a\0c\0x') | cat -v
a^@c^@
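</code></pre><p>The inputs have to be sorted with NUL as the separator as well. Here's a sketch that assumes <code>sort -z</code> for sorting and <code>tr</code> to make the result readable on a terminal; the sample data is made up for illustration:<pre><code class=language-bash>$ comm -z -12 <(printf 'c\0a\0b' | sort -z) <(printf 'x\0a\0c' | sort -z) | tr '\0' '\n'
a
c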
</code></pre><h2 id=alternatives><a class=header href=#alternatives>Alternatives</a></h2><p>Here are some alternate commands you can explore if <code>comm</code> isn't enough to solve your task. These alternatives do not require the input files to be sorted.<ul><li><a href=https://github.com/yarrow/zet>zet</a> — set operations on one or more input files<li><a href=https://learnbyexample.github.io/learn_gnugrep_ripgrep/frequently-used-options.html#comparing-lines-between-files>Comparing lines between files</a> section from my <strong>GNU grep</strong> ebook<li><a href=https://learnbyexample.github.io/learn_gnuawk/two-file-processing.html>Two file processing</a> chapter from my <strong>GNU awk</strong> ebook, has examples for both line and field based comparisons<li><a href=https://learnbyexample.github.io/learn_perl_oneliners/two-file-processing.html>Two file processing</a> chapter from my <strong>Perl one-liners</strong> ebook, has examples for both line and field based comparisons</ul><h2 id=exercises><a class=header href=#exercises>Exercises</a></h2><blockquote><p><img alt=info src=images/info.svg> The <a href=https://github.com/learnbyexample/cli_text_processing_coreutils/tree/main/exercises>exercises</a> directory has all the files used in this section.</blockquote><p><strong>1)</strong> Get the common lines between the <code>s1.txt</code> and <code>s2.txt</code> files. Assume that their contents are already sorted.<pre><code class=language-bash>$ paste s1.txt s2.txt
apple banana
coffee coffee
fig eclair
honey fig
mango honey
pasta milk
sugar tea
tea yeast
##### add your solution here
coffee
fig
honey
tea
</code></pre><p><strong>2)</strong> Display lines present in <code>s1.txt</code> but not <code>s2.txt</code> and vice versa.<pre><code class=language-bash># lines unique to the first file
##### add your solution here
apple
mango
pasta
sugar
# lines unique to the second file
##### add your solution here
banana
eclair
milk
yeast
</code></pre><p><strong>3)</strong> Display lines unique to the <code>s1.txt</code> file and the common lines when compared to the <code>s2.txt</code> file. Use <code>==></code> to separate the output columns.<pre><code class=language-bash>##### add your solution here
apple
==>coffee
==>fig
==>honey
mango
pasta
sugar
==>tea
</code></pre><p><strong>4)</strong> What does the <code>--total</code> option do?<p><strong>5)</strong> Will the <code>comm</code> command fail if there are repeated lines in the input files? If not, what'd be the expected output for the command shown below?<pre><code class=language-bash>$ cat s3.txt
apple
apple
guava
honey
tea
tea
tea
$ comm -23 s3.txt s1.txt
</code></pre></main><nav aria-label="Page navigation"class=nav-wrapper><a aria-label="Previous chapter"class="mobile-nav-chapters previous"title="Previous chapter"aria-keyshortcuts=Left href=uniq.html rel=prev> <i class="fa fa-angle-left"></i> </a><a aria-label="Next chapter"class="mobile-nav-chapters next"title="Next chapter"aria-keyshortcuts=Right href=join.html rel=next> <i class="fa fa-angle-right"></i> </a><div style="clear: both"></div></nav></div></div><nav aria-label="Page navigation"class=nav-wide-wrapper><a aria-label="Previous chapter"class="nav-chapters previous"title="Previous chapter"aria-keyshortcuts=Left href=uniq.html rel=prev> <i class="fa fa-angle-left"></i> </a><a aria-label="Next chapter"class="nav-chapters next"title="Next chapter"aria-keyshortcuts=Right href=join.html rel=next> <i class="fa fa-angle-right"></i> </a></nav></div><script>window.playground_copyable = true;</script><script charset=utf-8 src=elasticlunr.min.js></script><script charset=utf-8 src=mark.min.js></script><script charset=utf-8 src=searcher.js></script><script charset=utf-8 src=clipboard.min.js></script><script charset=utf-8 src=highlight.js></script><script charset=utf-8 src=book.js></script><script src=sidebar.js></script>