pgbench-tools
Tools
Shell
About pgbench-tools
pgbench-tools automates running PostgreSQL’s built-in pgbench tool in a
useful way. It will run some number of database sizes (the database
scale) and various concurrent client count combinations.
Scale/client runs with some common characteristic–perhaps one
configuration of the postgresql.conf–can be organized into a “set”
of runs. The program graphs transaction rate during each test,
latency, and comparisons between test sets.
pgbench-tools setup
- Install GNUplot.
-
Create databases for your test and for the results::
createdb results createdb pgbench
<ul dir="auto"> <li> Both databases can be the same, but there may be more shared_buffers<br /> cache churn in that case. Some amount of cache disruption<br /> is unavoidable unless the result database is remote, because<br /> of the OS cache. The recommended and default configuration<br /> is to have a pgbench database and a results database. This also<br /> keeps the size of the result dataset from being included in the<br /> total database size figure recorded by the test. </li> </ul>
-
Initialize the results database by executing::
psql -f init/resultdb.sql -d results
<p> Make sure to reference the correct database. </li> <li> You need to create a test set with a descritption::</p> <pre class="notranslate"><code>./newset 'Initial Config'
<p> Running the “newset” utility without any parameters will list all of the<br /> existing test sets. </p> <ul dir="auto"> <li> Allow linux user to fire the tools::</p> <pre class="notranslate"><code>chmod +x benchwarmer
chmod +x cleanup
<h1 dir="auto">
<a rel="nofollow noopener" target="_blank" id="user-content-running-tests" class="anchor" aria-hidden="true" href="#running-tests"></a>Running tests
</h1>
<ul dir="auto">
<li>
Edit the config file to reference the test and results database, as<br /> well as list the test you want to run. The default test is a<br /> SELECT-only one that runs for 60 seconds.
</li>
<li>
Execute::</p> <pre class="notranslate"><code>./runset
<p>
In order to execute all the tests </li> </ul>
<p>
HINT:: change the pg_hba.conf or setup a pgpass file to avoid password prompt for each connection (there’s a lot of them). You can also <code>export</code> the password in your shell session.
</p>
<h1 dir="auto">
<a rel="nofollow noopener" target="_blank" id="user-content-results" class="anchor" aria-hidden="true" href="#results"></a>Results
</h1>
<ul dir="auto">
<li>
You can check results even as the test is running with::</p> <pre class="notranslate"><code>psql -d results -f reports/report.sql
<p>
This is unlikely to disrupt the test results very much unless you’ve<br /> run an enormous number of tests already. There is also a helper<br /> script named summary that shows reports/summary.sql </li>
<li>
A helper script named set-times will show how long past tests have taken to<br /> complete. This can be useful to get an idea how long the currently running<br /> test or test set will actually take to finish.
</li>
<li>
Other useful reports you can run are in the reports/ directory, including:</p> <ul dir="auto">
<li>
<code>fastest.sql</code>
</li>
<li>
<code>summary.sql</code>
</li>
<li>
<code>bufreport.sql</code>
</li>
<li>
<code>bufsummary.sql</code>
</li>
<li>
<code>compromise_params.sql</code> (see below)
</li>
</ul>
</li>
<li>
Once the tests are done, the results/ directory will include<br /> a HTML subdirectory for each test giving its results,<br /> in addition to the summary information in the results database.
</li>
<li>
The results directory will also include its own index HTML file (named<br /> index.html) that shows summary information and plots for all the tests.
</li>
<li>
If you manually adjust the test result database, you can<br /> then manually regenerate the summary graphs by running:</p> <pre class="notranslate"><code>./webreport
<li>
If you want to generate a report with selected testsets only (for example sets 1, 6 and 7):</p> <pre class="notranslate"><code>./limited_webreport 1,6,7
<li>
In case you test for static rates amount (for example sets 2, 8 and 9):</p> <pre class="notranslate"><code>./rates_webreport 2,8,9
<h1 dir="auto">
<a rel="nofollow noopener" target="_blank" id="user-content-test-sets-comparison" class="anchor" aria-hidden="true" href="#test-sets-comparison"></a>Test sets comparison
</h1>
<p>
Runs of pgbench via the runset command are oriented into test sets. Each<br /> test that is run will be put into the same test set until you tell the<br /> program to switch to a new set. Each test set is assigned both a<br /> serial number and a test description.<br /> New test sets are added like this:
</p>
<pre class="notranslate"><code>psql -d results -c "INSERT INTO testset (info) VALUES ('set name')"
<p>
pgbench-tools aims to help compare multiple setups of PostgreSQL. That<br /> might be different configuration parameters, different source code builds, or<br /> even different versions of the database. One reason the results database is<br /> separate from the test database is that you can use a shared results<br /> database across multiple test sets, while connecting to multiple test database<br /> installations.<br /> The graphs generated by the program will generate a seperate graph pair for<br /> each test set, as well as a master graph pair that compares all of them. The<br /> graphs in each pair are graphed with a X axis of client count and database<br /> scale (size) respectively. The idea is that you might see whether an<br /> alternate configuration is better at handling larger data sets, or if it<br /> handles concurrency at high client counts better.<br /> Note that all of the built-in pgbench tests use very simple queries. The<br /> results can be useful for testing read-only SELECT scaling at different<br /> client counts. They can also be useful for seeing how the server handles<br /> heavy write volume. But none of these results will change if you alter<br /> server parameters that adjust query execution, such as work_mem or<br /> effective_cache_size. Many of the useful PostgreSQL parameters to tune<br /> for better query execution on larger servers in particular fall into<br /> this category. You will not always be able to compare configurations<br /> usefully using the built-in pgbench tests. Even for parameters that<br /> should impact results, such as shared_buffers or checkpoint_segments,<br /> making useful comparisons with pgbench is often difficult.<br /> There is more information about what pgbench is useful for, as well as<br /> how to adjust the program to get better results, in the pgbench<br /> documentation: <a rel="nofollow noopener" target="_blank" href="http://www.postgresql.org/docs/current/static/pgbench.html">http://www.postgresql.org/docs/current/static/pgbench.html</a>
</p>
<h1 dir="auto">
<a rel="nofollow noopener" target="_blank" id="user-content-version-compatibility" class="anchor" aria-hidden="true" href="#version-compatibility"></a>Version compatibility
</h1>
<p>
The default configuration now aims to support the pgbench that ships with<br /> PostgreSQL 9.6 and later versions, which uses names such as “pgbench_accounts”<br /> for its tables. There are commented out settings in the config file that<br /> show what changes need to be made in order to make the program compatible<br /> with PostgreSQL 8.3, where the names were like “accounts” instead.<br /> pgbench went through major changes in version 9.6, and the random function<br /> used in most test scripts was renamed. The main tests/ directory has the<br /> current scripts for versions 9.6, 10, 11, 12, and 13 (so far).<br /> To test against versions 8.4 through 9.5, use the tests/tests-9.5 directory<br /> in the confile file.<br /> Support for PostgreSQL versions before 8.3 is not possible, because a<br /> change was made to the pgbench client in that version that is needed<br /> by the program to work properly. It is possible to use the PostgreSQL 8.3<br /> pgbench client against a newer database server, or to copy the pgbench.c<br /> program from 8.3 into a 8.2 source code build and use it instead (with<br /> some fixes–it won’t compile unless you comment out code that refers to<br /> optional newer features added in 8.3).
</p>
<h2 dir="auto">
<a rel="nofollow noopener" target="_blank" id="user-content-multiple-worker-support" class="anchor" aria-hidden="true" href="#multiple-worker-support"></a>Multiple worker support
</h2>
<p>
Starting in PostgreSQL 9.0, pgbench allows splitting up the work pgbench<br /> does into multiple worker threads or processes (which depends on whether<br /> the database client libraries haves been compiled with thread-safe<br /> behavior or not).<br /> This feature is extremely valuable, as it’s likely to give at least<br /> a 15% speedup on common hardware. And it can more than double throughput<br /> on operating systems that are particularly hostile to running the<br /> pgbench client. One known source of this problem is Linux kernels<br /> using the Completely Fair Scheduler introduced in 2.6.23,<br /> which does not schedule the pgbench program very well when it’s connecting<br /> to the database using the default method, Unix-domain sockets.<br /> (Note that pgbench-tools doesn’t suffer greatly from this problem itself, as<br /> it connects over TCP/IP using the “-H” parameter. Manual pgbench runs that<br /> do not specify a host, and therefore connect via a local socket can be<br /> extremely slow on recent Linux kernels.)<br /> Taking advantage of this feature is done in pgbench-tools by increasing the<br /> MAX_WORKERS setting in the configuration file. It takes the value of <code>nproc</code><br /> by default, or where that isn’t available (typically on systems without a<br /> recent version of GNU coreutils), the default can be set to blank, which avoids<br /> using this feature altogether — thereby remaining compatible not only with<br /> systems lacking the nproc program, but also with PostgreSQL/pgbench versions<br /> before this capability was added.<br /> When…
</p>