Skip to main content
  1. All Posts/

pgbench-tools

Tools Shell

About pgbench-tools

pgbench-tools automates running PostgreSQL’s built-in pgbench tool in a
useful way. It will run some number of database sizes (the database
scale) and various concurrent client count combinations.
Scale/client runs with some common characteristic–perhaps one
configuration of the postgresql.conf–can be organized into a “set”
of runs. The program graphs transaction rate during each test,
latency, and comparisons between test sets.

pgbench-tools setup

  • Install GNUplot.
  • Create databases for your test and for the results::

    createdb results
    createdb pgbench
    
    <ul dir="auto">
      <li>
        Both databases can be the same, but there may be more shared_buffers<br /> cache churn in that case. Some amount of cache disruption<br /> is unavoidable unless the result database is remote, because<br /> of the OS cache. The recommended and default configuration<br /> is to have a pgbench database and a results database. This also<br /> keeps the size of the result dataset from being included in the<br /> total database size figure recorded by the test.
      </li>
    </ul>
    
  • Initialize the results database by executing::

    psql -f init/resultdb.sql -d results
    
    <p>
      Make sure to reference the correct database. </li> 
      
      <li>
        You need to create a test set with a descritption::</p> <pre class="notranslate"><code>./newset 'Initial Config'
    

        <p>
          Running the &#8220;newset&#8221; utility without any parameters will list all of the<br /> existing test sets.
        </p>
        
        <ul dir="auto">
          <li>
            Allow linux user to fire the tools::</p> <pre class="notranslate"><code>chmod +x benchwarmer
    

    chmod +x cleanup

  <h1 dir="auto">
    <a rel="nofollow noopener" target="_blank" id="user-content-running-tests" class="anchor" aria-hidden="true" href="#running-tests"></a>Running tests
  </h1>
  
  <ul dir="auto">
    <li>
      Edit the config file to reference the test and results database, as<br /> well as list the test you want to run. The default test is a<br /> SELECT-only one that runs for 60 seconds.
    </li>
    <li>
      Execute::</p> <pre class="notranslate"><code>./runset

      <p>
        In order to execute all the tests </li> </ul> 
        
        <p>
          HINT:: change the pg_hba.conf or setup a pgpass file to avoid password prompt for each connection (there&#8217;s a lot of them). You can also <code>export</code> the password in your shell session.
        </p>
        
        <h1 dir="auto">
          <a rel="nofollow noopener" target="_blank" id="user-content-results" class="anchor" aria-hidden="true" href="#results"></a>Results
        </h1>
        
        <ul dir="auto">
          <li>
            You can check results even as the test is running with::</p> <pre class="notranslate"><code>psql -d results -f reports/report.sql

            <p>
              This is unlikely to disrupt the test results very much unless you&#8217;ve<br /> run an enormous number of tests already. There is also a helper<br /> script named summary that shows reports/summary.sql </li> 
              
              <li>
                A helper script named set-times will show how long past tests have taken to<br /> complete. This can be useful to get an idea how long the currently running<br /> test or test set will actually take to finish.
              </li>
              <li>
                Other useful reports you can run are in the reports/ directory, including:</p> <ul dir="auto">
                  <li>
                    <code>fastest.sql</code>
                  </li>
                  <li>
                    <code>summary.sql</code>
                  </li>
                  <li>
                    <code>bufreport.sql</code>
                  </li>
                  <li>
                    <code>bufsummary.sql</code>
                  </li>
                  <li>
                    <code>compromise_params.sql</code> (see below)
                  </li>
                </ul>
              </li>
              
              <li>
                Once the tests are done, the results/ directory will include<br /> a HTML subdirectory for each test giving its results,<br /> in addition to the summary information in the results database.
              </li>
              <li>
                The results directory will also include its own index HTML file (named<br /> index.html) that shows summary information and plots for all the tests.
              </li>
              <li>
                If you manually adjust the test result database, you can<br /> then manually regenerate the summary graphs by running:</p> <pre class="notranslate"><code>./webreport

              <li>
                If you want to generate a report with selected testsets only (for example sets 1, 6 and 7):</p> <pre class="notranslate"><code>./limited_webreport 1,6,7

              <li>
                In case you test for static rates amount (for example sets 2, 8 and 9):</p> <pre class="notranslate"><code>./rates_webreport 2,8,9

              <h1 dir="auto">
                <a rel="nofollow noopener" target="_blank" id="user-content-test-sets-comparison" class="anchor" aria-hidden="true" href="#test-sets-comparison"></a>Test sets comparison
              </h1>
              
              <p>
                Runs of pgbench via the runset command are oriented into test sets. Each<br /> test that is run will be put into the same test set until you tell the<br /> program to switch to a new set. Each test set is assigned both a<br /> serial number and a test description.<br /> New test sets are added like this:
              </p>
              
              <pre class="notranslate"><code>psql -d results -c "INSERT INTO testset (info) VALUES ('set name')"

              <p>
                pgbench-tools aims to help compare multiple setups of PostgreSQL. That<br /> might be different configuration parameters, different source code builds, or<br /> even different versions of the database. One reason the results database is<br /> separate from the test database is that you can use a shared results<br /> database across multiple test sets, while connecting to multiple test database<br /> installations.<br /> The graphs generated by the program will generate a seperate graph pair for<br /> each test set, as well as a master graph pair that compares all of them. The<br /> graphs in each pair are graphed with a X axis of client count and database<br /> scale (size) respectively. The idea is that you might see whether an<br /> alternate configuration is better at handling larger data sets, or if it<br /> handles concurrency at high client counts better.<br /> Note that all of the built-in pgbench tests use very simple queries. The<br /> results can be useful for testing read-only SELECT scaling at different<br /> client counts. They can also be useful for seeing how the server handles<br /> heavy write volume. But none of these results will change if you alter<br /> server parameters that adjust query execution, such as work_mem or<br /> effective_cache_size. Many of the useful PostgreSQL parameters to tune<br /> for better query execution on larger servers in particular fall into<br /> this category. You will not always be able to compare configurations<br /> usefully using the built-in pgbench tests. Even for parameters that<br /> should impact results, such as shared_buffers or checkpoint_segments,<br /> making useful comparisons with pgbench is often difficult.<br /> There is more information about what pgbench is useful for, as well as<br /> how to adjust the program to get better results, in the pgbench<br /> documentation: <a rel="nofollow noopener" target="_blank" href="http://www.postgresql.org/docs/current/static/pgbench.html">http://www.postgresql.org/docs/current/static/pgbench.html</a>
              </p>
              
              <h1 dir="auto">
                <a rel="nofollow noopener" target="_blank" id="user-content-version-compatibility" class="anchor" aria-hidden="true" href="#version-compatibility"></a>Version compatibility
              </h1>
              
              <p>
                The default configuration now aims to support the pgbench that ships with<br /> PostgreSQL 9.6 and later versions, which uses names such as &#8220;pgbench_accounts&#8221;<br /> for its tables. There are commented out settings in the config file that<br /> show what changes need to be made in order to make the program compatible<br /> with PostgreSQL 8.3, where the names were like &#8220;accounts&#8221; instead.<br /> pgbench went through major changes in version 9.6, and the random function<br /> used in most test scripts was renamed. The main tests/ directory has the<br /> current scripts for versions 9.6, 10, 11, 12, and 13 (so far).<br /> To test against versions 8.4 through 9.5, use the tests/tests-9.5 directory<br /> in the confile file.<br /> Support for PostgreSQL versions before 8.3 is not possible, because a<br /> change was made to the pgbench client in that version that is needed<br /> by the program to work properly. It is possible to use the PostgreSQL 8.3<br /> pgbench client against a newer database server, or to copy the pgbench.c<br /> program from 8.3 into a 8.2 source code build and use it instead (with<br /> some fixes&#8211;it won&#8217;t compile unless you comment out code that refers to<br /> optional newer features added in 8.3).
              </p>
              
              <h2 dir="auto">
                <a rel="nofollow noopener" target="_blank" id="user-content-multiple-worker-support" class="anchor" aria-hidden="true" href="#multiple-worker-support"></a>Multiple worker support
              </h2>
              
              <p>
                Starting in PostgreSQL 9.0, pgbench allows splitting up the work pgbench<br /> does into multiple worker threads or processes (which depends on whether<br /> the database client libraries haves been compiled with thread-safe<br /> behavior or not).<br /> This feature is extremely valuable, as it&#8217;s likely to give at least<br /> a 15% speedup on common hardware. And it can more than double throughput<br /> on operating systems that are particularly hostile to running the<br /> pgbench client. One known source of this problem is Linux kernels<br /> using the Completely Fair Scheduler introduced in 2.6.23,<br /> which does not schedule the pgbench program very well when it&#8217;s connecting<br /> to the database using the default method, Unix-domain sockets.<br /> (Note that pgbench-tools doesn&#8217;t suffer greatly from this problem itself, as<br /> it connects over TCP/IP using the &#8220;-H&#8221; parameter. Manual pgbench runs that<br /> do not specify a host, and therefore connect via a local socket can be<br /> extremely slow on recent Linux kernels.)<br /> Taking advantage of this feature is done in pgbench-tools by increasing the<br /> MAX_WORKERS setting in the configuration file. It takes the value of <code>nproc</code><br /> by default, or where that isn&#8217;t available (typically on systems without a<br /> recent version of GNU coreutils), the default can be set to blank, which avoids<br /> using this feature altogether &#8212; thereby remaining compatible not only with<br /> systems lacking the nproc program, but also with PostgreSQL/pgbench versions<br /> before this capability was added.<br /> When&#8230;
              </p>