<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<title>WiredTiger: Performance Tuning</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
$(document).ready(initResizable);
$(window).load(resizeHeight);
</script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="wiredtiger.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<td id="projectlogo"><a href="http://wiredtiger.com/"><img alt="WiredTiger" src="LogoFinal-header.png" /></a></td>
<td style="padding-left: 0.5em;">
<div id="projectname">
 <span id="projectnumber">Version 1.5.0</span>
</div>
</td>
</tr>
</tbody>
</table>
</div>
<div class="banner">
<a href="https://github.com/wiredtiger/wiredtiger">Fork me on GitHub</a>
<a class="last" href="http://groups.google.com/group/wiredtiger-users">Join my user group</a>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.3.1 -->
<div id="navrow1" class="tabs">
<ul class="tablist">
<li><a href="index.html"><span>Main Page</span></a></li>
<li class="current"><a href="pages.html"><span>Related Pages</span></a></li>
<li><a href="modules.html"><span>Modules</span></a></li>
<li><a href="examples.html"><span>Examples</span></a></li>
<li><a href="community.html"><span>Community</span></a></li>
<li><a href="license.html"><span>License</span></a></li>
</ul>
</div>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
<div id="nav-tree-contents">
<div id="nav-sync" class="sync"></div>
</div>
</div>
<div id="splitbar" style="-moz-user-select:none;"
class="ui-resizable-handle">
</div>
</div>
<script type="text/javascript">
$(document).ready(function(){initNavTree('tuning.html','');});
</script>
<div id="doc-content">
<div class="header">
<div class="headertitle">
<div class="title">Performance Tuning </div> </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><h1><a class="anchor" id="tuning_cache_size"></a>
Cache size</h1>
<p>The cache size for the database is configurable by setting the <code>cache_size</code> configuration string when calling the <a class="el" href="group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed" title="Open a connection to a database.">wiredtiger_open</a> function.</p>
<p>The effectiveness of the cache can be measured by reviewing the page eviction statistics for the database.</p>
<p>An example of setting a cache size to 500MB:</p>
<div class="fragment"><div class="line"> <span class="keywordflow">if</span> ((ret = <a class="code" href="group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed" title="Open a connection to a database.">wiredtiger_open</a>(home, NULL,</div>
<div class="line"> <span class="stringliteral">"create,cache_size=500M"</span>, &conn)) != 0)</div>
<div class="line"> fprintf(stderr, <span class="stringliteral">"Error connecting to %s: %s\n"</span>,</div>
<div class="line"> home, <a class="code" href="group__wt.html#gac95e70a24d09cf6928398512990e1474" title="Return information about an error as a string; wiredtiger_strerror is a superset of the ISO C99/POSIX...">wiredtiger_strerror</a>(ret));</div>
</div><!-- fragment --> <h1><a class="anchor" id="tuning_memory_allocation"></a>
Memory allocation</h1>
<p>The performance of heavily-threaded WiredTiger applications can be dominated by memory allocation, because the WiredTiger engine has to free and re-allocate memory as part of many queries. Replacing the system's malloc implementation with one that has better threaded performance (for example, Google's <a href="http://goog-perftools.sourceforge.net/doc/tcmalloc.html">tcmalloc</a> or <a href="http://www.canonware.com/jemalloc">jemalloc</a>) can dramatically improve throughput.</p>
<h1><a class="anchor" id="tuning_read_only_objects"></a>
Read-only objects</h1>
<p>Cursors opened on checkpoints (either named, or using the special "last
checkpoint" name "WiredTigerCheckpoint") are read-only objects. Unless memory mapping is configured off (using the "mmap" configuration string to <a class="el" href="group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed" title="Open a connection to a database.">wiredtiger_open</a>), read-only objects are mapped into process memory instead of being read through the WiredTiger cache. Using read-only objects where possible minimizes the amount of buffer cache memory required by WiredTiger applications and the work required for buffer cache management, as well as reducing the number of memory copies from the operating system buffer cache into application memory.</p>
<p>To open a named checkpoint, use the configuration string "checkpoint" to the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#afb5b4a69c2c5cafe411b2b04fdc1c75d" title="Open a new cursor on a data source or duplicate an existing cursor.">WT_SESSION::open_cursor</a> method: </p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#afb5b4a69c2c5cafe411b2b04fdc1c75d" title="Open a new cursor on a data source or duplicate an existing cursor.">open_cursor</a>(session,</div>
<div class="line"> <span class="stringliteral">"table:mytable"</span>, NULL, <span class="stringliteral">"checkpoint=midnight"</span>, &cursor);</div>
</div><!-- fragment --><p> To open the last checkpoint taken in the object, use the configuration string "checkpoint" with the name "WiredTigerCheckpoint" to the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#afb5b4a69c2c5cafe411b2b04fdc1c75d" title="Open a new cursor on a data source or duplicate an existing cursor.">WT_SESSION::open_cursor</a> method: </p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#afb5b4a69c2c5cafe411b2b04fdc1c75d" title="Open a new cursor on a data source or duplicate an existing cursor.">open_cursor</a>(session,</div>
<div class="line"> <span class="stringliteral">"table:mytable"</span>, NULL, <span class="stringliteral">"checkpoint=WiredTigerCheckpoint"</span>, &cursor);</div>
</div><!-- fragment --> <h1><a class="anchor" id="tuning_cache_resident"></a>
Cache resident objects</h1>
<p>Cache resident objects (objects never considered for the purposes of cache eviction) can be configured with the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">WT_SESSION::create</a> "cache_resident" configuration string.</p>
<p>Configuring a cache resident object has two effects. First, once the object's pages have been instantiated in memory, no further I/O cost is ever paid for object access, minimizing potential latency. Second, in-memory objects can be accessed faster than objects tracked for potential eviction, so applications able to guarantee sufficient memory that an object need never be evicted can significantly increase their performance.</p>
<p>An example of configuring a cache-resident object:</p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">create</a>(session,</div>
<div class="line"> <span class="stringliteral">"table:mytable"</span>, <span class="stringliteral">"key_format=r,value_format=S,cache_resident=true"</span>);</div>
</div><!-- fragment --> <h1><a class="anchor" id="tuning_page_size"></a>
Page and overflow sizes</h1>
<p>There are four page and item size configuration values: <code>internal_page_max</code>, <code>internal_item_max</code>, <code>leaf_page_max</code> and <code>leaf_item_max</code>. All four are specified to the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">WT_SESSION::create</a> method; that is, they are configurable on a per-file basis.</p>
<p>The <code>internal_page_max</code> and <code>leaf_page_max</code> configuration values specify the maximum size for Btree internal and leaf pages. That is, when an internal or leaf page reaches the specified size, it splits into two pages. Generally, internal pages should be sized to fit into the system's L1 or L2 caches in order to minimize cache misses when searching the tree, while leaf pages should be sized to maximize I/O performance (if reading from disk is necessary, it is usually desirable to read a large amount of data, assuming some locality of reference in the application's access pattern).</p>
<p>The <code>internal_item_max</code> and <code>leaf_item_max</code> configuration values specify the maximum size at which an item will be stored on-page. Larger items (overflow items) are stored in the file separately from the page where the item logically appears. Referencing overflow items is more expensive than referencing on-page items, requiring additional I/O if the item is not already cached. For this reason, it is important to avoid creating large numbers of overflow items that are repeatedly referenced, and the maximum item size should probably be increased if many overflow items are being created. Because pages must be large enough to store any item that is not an overflow item, increasing the maximum item size may also require increasing the page sizes.</p>
<p>If block compression has been configured, page and item sizes do not necessarily reflect the actual size of the page or item on disk. Block compression in WiredTiger happens within the disk I/O subsystem, so a page might split even if subsequent compression would produce a page small enough to leave as a single page. In other words, page and overflow sizes are based on in-memory sizes, not disk sizes.</p>
<p>There are two other, related configuration values, also settable by the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">WT_SESSION::create</a> method: <code>allocation_size</code> and <code>split_pct</code>.</p>
<p>The <code>allocation_size</code> configuration value is the underlying unit of allocation for the file. As the unit of file allocation, it has two effects: first, it limits the ultimate size of the file, and second, it determines how much space is wasted when storing overflow items.</p>
<p>By limiting the size of the file, the allocation size limits the amount of data that can be stored in a file. For example, if the allocation size is set to the minimum possible (512B), the maximum file size is 2TB, that is, attempts to allocate new file blocks will fail when the file reaches 2TB in size. If the allocation size is set to the maximum possible (512MB), the maximum file size is 2EB.</p>
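<p>The two limits quoted above are consistent with a file being addressed in at most 2<sup>32</sup> allocation units. That block count is an inference from the figures in this section, not a documented constant, and the names <code>EXAMPLE_MAX_BLOCKS</code> and <code>example_max_file_size</code> are hypothetical. A minimal sketch of the arithmetic under that assumption:</p>

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: a file holds at most 2^32 allocation units. This value is
 * inferred from the documented limits (512B -> 2TB, 512MB -> 2EB); it is
 * not taken from the WiredTiger source.
 */
#define EXAMPLE_MAX_BLOCKS (UINT64_C(1) << 32)

/* Return the largest file size, in bytes, for a given allocation size. */
static uint64_t
example_max_file_size(uint64_t allocation_size)
{
    /* The documented range of allocation sizes is 512B to 512MB. */
    assert(allocation_size >= 512);
    assert(allocation_size <= (UINT64_C(512) << 20));

    return (allocation_size * EXAMPLE_MAX_BLOCKS);
}
```

<p>Under this assumption, each doubling of <code>allocation_size</code> doubles the maximum file size, so an application expecting very large files may want to start from the intended file size and work backward to an allocation size.</p>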
<p>The unit of allocation also determines how much space is wasted when storing overflow items. For example, if the allocation size were set to the minimum value of 512B, an overflow item of 1100 bytes would require 3 allocation-sized file units, or 1536 bytes, wasting 436 bytes. For this reason, as the allocation size increases, page sizes and overflow item sizes will likely increase as well, to ensure that significant space isn't wasted by overflow items.</p>
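<p>The rounding described above can be sketched as follows. The helper names are hypothetical, and the sketch assumes that an overflow item occupies a whole number of allocation units with no additional per-block header, which may not match the on-disk format exactly:</p>

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes occupied in the file by an overflow item, rounded up to a whole
 * number of allocation units. Assumption: no per-block header overhead.
 */
static uint64_t
example_overflow_footprint(uint64_t item_size, uint64_t allocation_size)
{
    assert(allocation_size > 0);

    /* Integer ceiling division, then scale back up to bytes. */
    return (((item_size + allocation_size - 1) / allocation_size) *
        allocation_size);
}

/* Bytes allocated to the item but not used by it. */
static uint64_t
example_overflow_waste(uint64_t item_size, uint64_t allocation_size)
{
    return (example_overflow_footprint(item_size, allocation_size) -
        item_size);
}
```

<p>For the 1100-byte item and 512B allocation size used in the text, <code>example_overflow_footprint</code> gives 1536 bytes. Running the same calculation over a sample of real item sizes is one way to judge whether a candidate allocation size wastes too much space.</p>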
<p>The last configuration value is <code>split_pct</code>, which configures the size of a split page. When a page grows sufficiently large that it must be split, the newly split page's size is <code>split_pct</code> percent of the maximum page size. This value should be selected to avoid creating a large number of tiny pages or repeatedly splitting whenever new entries are inserted. For example, if the maximum page size is 1MB, a <code>split_pct</code> value of 10% would potentially result in creating a large number of 100KB pages, which may not be optimal for future I/O. Or, if the maximum page size is 1MB, a <code>split_pct</code> value of 90% would potentially result in repeatedly splitting pages as the split pages grow to 1MB over and over. The default value for <code>split_pct</code> is 75%, intended to keep large pages relatively large, while still giving split pages room to grow.</p>
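<p>To compare candidate <code>split_pct</code> values, the target size of a newly split page can be estimated directly. This is only a sketch of the percentage arithmetic described above; the function name is hypothetical, and any rounding WiredTiger applies (for example, to the allocation size) is not modeled:</p>

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimated size, in bytes, of a newly split page: split_pct percent of
 * the maximum page size, truncated to a whole byte. Assumption: no
 * further rounding is applied.
 */
static uint64_t
example_split_size(uint64_t page_max, unsigned split_pct)
{
    assert(split_pct > 0 && split_pct <= 100);

    return (page_max * split_pct / 100);
}
```

<p>With a 1MB maximum page size, a <code>split_pct</code> of 10 yields pages of roughly 100KB, while the default of 75 yields 768KB pages that still have 256KB of room to grow before splitting again.</p>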
<p>An example of configuring page sizes:</p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">create</a>(session, <span class="stringliteral">"file:example"</span>,</div>
<div class="line"> <span class="stringliteral">"key_format=u,"</span></div>
<div class="line"> <span class="stringliteral">"internal_page_max=32KB,internal_item_max=1KB,"</span></div>
<div class="line"> <span class="stringliteral">"leaf_page_max=1MB,leaf_item_max=32KB"</span>);</div>
</div><!-- fragment --> <h1><a class="anchor" id="tuning_checksums"></a>
Checksums</h1>
<p>WiredTiger checksums file reads and writes, by default. In read-only applications, or when file compression provides any necessary checksum functionality, or when using backing storage systems where blocks require no validation, performance can be increased by turning off checksum support when calling the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">WT_SESSION::create</a> method.</p>
<p>Checksums can be configured to be "off" or "on", as well as "uncompressed". The "uncompressed" configuration checksums blocks not otherwise protected by compression. For example, in a system where the compression algorithms provide sufficient protection against corruption, and when writing a block which is too small to be usefully compressed, setting the checksum configuration value to "uncompressed" causes WiredTiger to checksum the blocks which are not compressed:</p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">create</a>(session, <span class="stringliteral">"table:mytable"</span>,</div>
<div class="line"> <span class="stringliteral">"key_format=S,value_format=S,checksum=uncompressed"</span>);</div>
</div><!-- fragment --> <h1><a class="anchor" id="tuning_direct_io"></a>
Direct I/O</h1>
<p>WiredTiger optionally supports direct I/O, based on the non-standard <code>O_DIRECT</code> flag to the POSIX 1003.1 open system call. Configuring direct I/O may be useful for applications wanting to minimize the operating system cache effects of I/O to and from WiredTiger's buffer cache.</p>
<p>Direct I/O is configured using the "direct_io" configuration string to the <a class="el" href="group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed" title="Open a connection to a database.">wiredtiger_open</a> function. An example of configuring direct I/O for WiredTiger's data files:</p>
<div class="fragment"><div class="line"> ret = <a class="code" href="group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed" title="Open a connection to a database.">wiredtiger_open</a>(home, NULL, <span class="stringliteral">"create,direct_io=[data]"</span>, &conn);</div>
</div><!-- fragment --> <h1><a class="anchor" id="tuning_compression"></a>
Compression</h1>
<p>By default, WiredTiger configures key prefix compression for row-store objects, and dictionary compression for both row-store and column-store objects. These forms of compression minimize in-memory and on-disk space, but at some CPU cost when rows are read and written. Turning these forms of compression off may increase application throughput.</p>
<p>For example, turning off row-store key prefix compression:</p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">create</a>(session, <span class="stringliteral">"table:mytable"</span>,</div>
<div class="line"> <span class="stringliteral">"key_format=S,value_format=S,prefix_compression=false"</span>);</div>
</div><!-- fragment --><p> For example, turning off row-store or column-store dictionary compression:</p>
<div class="fragment"><div class="line"> ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file.">create</a>(session, <span class="stringliteral">"table:mytable"</span>,</div>
<div class="line"> <span class="stringliteral">"key_format=S,value_format=S,dictionary=false"</span>);</div>
</div><!-- fragment --><p> WiredTiger does not configure Huffman encoding or block compression by default, but these forms of compression can also impact overall throughput. See <a class="el" href="file_formats.html#file_formats_compression">File formats and compression</a> for more information.</p>
<h1><a class="anchor" id="tuning_statistics"></a>
Performance monitoring with statistics</h1>
<p>WiredTiger optionally maintains a variety of statistics, when the <code>statistics</code> configuration string is specified to <a class="el" href="group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed" title="Open a connection to a database.">wiredtiger_open</a>; see <a class="el" href="statistics.html">Statistics</a> for general information about statistics, and <a class="el" href="data_sources.html#data_statistics">Statistics Data</a> for information about accessing the statistics.</p>
<p>Note that maintaining run-time statistics involves updating shared-memory data structures and may decrease application performance.</p>
<p>The statistics gathered by WiredTiger can be combined to derive information about the system's behavior. For example, a cursor can be opened on the statistics for a table:</p>
<div class="fragment"><div class="line"> <span class="keywordflow">if</span> ((ret = session-><a class="code" href="struct_w_t___s_e_s_s_i_o_n.html#afb5b4a69c2c5cafe411b2b04fdc1c75d" title="Open a new cursor on a data source or duplicate an existing cursor.">open_cursor</a>(session,</div>
<div class="line"> <span class="stringliteral">"statistics:table:access"</span>, NULL, NULL, &cursor)) != 0)</div>
<div class="line"> <span class="keywordflow">return</span> (ret);</div>
</div><!-- fragment --><p> Then this code calculates the "fragmentation" of a table, defined here as the percentage of the table that is not part of the current checkpoint:</p>
<div class="fragment"><div class="line"> uint64_t ckpt_size, file_size;</div>
<div class="line"> ret = get_stat(cursor, <a class="code" href="group__wt.html#ga3ba4d6c12abe10285dc3b599f082a4e4" title="checkpoint size">WT_STAT_DSRC_BLOCK_CHECKPOINT_SIZE</a>, &ckpt_size);</div>
<div class="line"> ret = get_stat(cursor, <a class="code" href="group__wt.html#ga2af5c768712ce6c6b840badd17dd13b9" title="block manager size">WT_STAT_DSRC_BLOCK_SIZE</a>, &file_size);</div>
<div class="line"></div>
<div class="line"> printf(<span class="stringliteral">"File is %d%% fragmented\n"</span>,</div>
<div class="line"> (<span class="keywordtype">int</span>)(100 * (file_size - ckpt_size) / file_size));</div>
</div><!-- fragment --><p> The following example calculates the "write amplification", defined here as the ratio of bytes written to the filesystem versus the total bytes inserted, updated and removed by the application.</p>
<div class="fragment"><div class="line"> uint64_t app_insert, app_remove, app_update, fs_writes;</div>
<div class="line"></div>
<div class="line"> ret = get_stat(cursor, <a class="code" href="group__wt.html#gaec7145b45392656825b8404ae1fb4f46" title="cursor-insert key and value bytes inserted">WT_STAT_DSRC_CURSOR_INSERT_BYTES</a>, &app_insert);</div>
<div class="line"> ret = get_stat(cursor, <a class="code" href="group__wt.html#gaf9cc721c75f42a110e0f6af104087b4d" title="cursor-remove key bytes removed">WT_STAT_DSRC_CURSOR_REMOVE_BYTES</a>, &app_remove);</div>
<div class="line"> ret = get_stat(cursor, <a class="code" href="group__wt.html#ga21596e9d2f3d0c992bcade65b85cb5eb" title="cursor-update value bytes updated">WT_STAT_DSRC_CURSOR_UPDATE_BYTES</a>, &app_update);</div>
<div class="line"></div>
<div class="line"> ret = get_stat(cursor, <a class="code" href="group__wt.html#ga212a7e9d219e279da8b10adfaaab2afc" title="bytes written from cache">WT_STAT_DSRC_CACHE_BYTES_WRITE</a>, &fs_writes);</div>
<div class="line"></div>
<div class="line"> printf(<span class="stringliteral">"Write amplification is %.2lf\n"</span>,</div>
<div class="line"> (<span class="keywordtype">double</span>)fs_writes / (app_insert + app_remove + app_update));</div>
</div><!-- fragment --><p> Both examples use this helper function to retrieve statistics values from a cursor:</p>
<div class="fragment"><div class="line"><span class="keywordtype">int</span></div>
<div class="line">get_stat(<a class="code" href="struct_w_t___c_u_r_s_o_r.html" title="A WT_CURSOR handle is the interface to a cursor.">WT_CURSOR</a> *cursor, <span class="keywordtype">int</span> stat_field, uint64_t *valuep)</div>
<div class="line">{</div>
<div class="line"> <span class="keyword">const</span> <span class="keywordtype">char</span> *desc, *pvalue;</div>
<div class="line"> <span class="keywordtype">int</span> ret;</div>
<div class="line"></div>
<div class="line"> cursor-><a class="code" href="struct_w_t___c_u_r_s_o_r.html#ad1088d719df40babc1f57d086691ebdc" title="Set the key for the next operation.">set_key</a>(cursor, stat_field);</div>
<div class="line"> <span class="keywordflow">if</span> ((ret = cursor-><a class="code" href="struct_w_t___c_u_r_s_o_r.html#a7e25b2ced2cf3ec68bd5429bf921c79f" title="Move to the record matching the key.">search</a>(cursor)) != 0)</div>
<div class="line"> <span class="keywordflow">return</span> (ret);</div>
<div class="line"></div>
<div class="line"> <span class="keywordflow">return</span> (cursor-><a class="code" href="struct_w_t___c_u_r_s_o_r.html#af85364a5af50b95bbc46c82e72f75c01" title="Get the value for the current record.">get_value</a>(cursor, &desc, &pvalue, valuep));</div>
<div class="line">}</div>
</div><!-- fragment --></div></div><!-- contents -->
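<div class="contents">
<div class="textblock"><p>The fragmentation and write amplification calculations above divide by statistics values that can be zero, for example on a newly created table or before the application has written any data. One possible way to guard the arithmetic is sketched here; the function names are hypothetical, and returning 0 for an empty denominator is a choice made for this sketch rather than WiredTiger behavior:</p>

```c
#include <stdint.h>

/*
 * Percentage of the file that is not part of the current checkpoint.
 * Returns 0 if the file is empty or the inputs are inconsistent.
 */
static int
example_fragmentation_pct(uint64_t file_size, uint64_t ckpt_size)
{
    if (file_size == 0 || ckpt_size > file_size)
        return (0);

    /* Divide first to avoid overflow when multiplying large sizes. */
    return ((int)((file_size - ckpt_size) / ((double)file_size / 100)));
}

/*
 * Ratio of bytes written to the filesystem to bytes changed by the
 * application. Returns 0.0 if the application has changed no bytes.
 */
static double
example_write_amplification(uint64_t fs_writes,
    uint64_t app_insert, uint64_t app_remove, uint64_t app_update)
{
    uint64_t app_bytes = app_insert + app_remove + app_update;

    if (app_bytes == 0)
        return (0.0);

    return ((double)fs_writes / (double)app_bytes);
}
```

<p>These helpers take the values retrieved with <code>get_stat</code> as plain arguments, so they can be exercised without an open connection.</p>
</div></div>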
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
<ul>
<li class="navelem"><a class="el" href="index.html">Reference Guide</a></li><li class="navelem"><a class="el" href="programming.html">Writing WiredTiger applications</a></li>
<li class="footer">Copyright (c) 2008-2013 WiredTiger, Inc. All rights reserved. Contact <a href="mailto:[email protected]">[email protected]</a> for more information.</li>
</ul>
</div>
</body>
</html>