By design, PostgreSQL's out-of-the-box configuration is a "jack of all trades, master of none". The defaults are painstakingly chosen to ensure that PostgreSQL will run in every environment in which it is installed, targeting the lowest common denominator of resources across platforms.
Because of this, one of the first actions to perform after completing a PostgreSQL install is to tune and configure a handful of high-level settings.
There are five high-level settings which will be discussed here: shared_buffers, wal_buffers, effective_cache_size, work_mem, and maintenance_work_mem.
Let's begin with shared_buffers.
PostgreSQL uses 'double buffering', meaning that PostgreSQL uses its own internal buffer as well as kernel buffered IO. In short, this means that data is stored in memory twice.
The PostgreSQL buffer is named shared_buffers, and it defines how much dedicated system memory PostgreSQL will use for its cache.
Because of PostgreSQL's design choice to ensure compatibility on all supported machines and operating systems, this value is set conservatively low by default. As such, increasing shared_buffers is one of the changes that will be most effective in improving overall performance on most modern operating systems.
There is no single recommended value for shared_buffers, but the calculation to determine the value for a particular system is not especially difficult.
Generally speaking, the value for shared_buffers should be roughly 25% of the total system RAM on a dedicated database server. shared_buffers should never be set to reserve all of the system RAM for PostgreSQL. A value over 25% of the system RAM can be useful if, for example, it allows the entire working set of the database to fit in cache, as this greatly reduces the amount of time spent reading from disk.
Alternately, while a larger shared_buffers value can increase performance in "read heavy" use cases, a large shared_buffers value can be detrimental in "write heavy" use cases, as the entire contents of shared_buffers must be processed during writes.
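As a sketch, on a hypothetical dedicated server with 32GB of RAM, the 25% guideline works out to 8GB. The value can be set in postgresql.conf or via ALTER SYSTEM:

```sql
-- Hypothetical dedicated DB server with 32GB RAM: 32GB * 0.25 = 8GB
ALTER SYSTEM SET shared_buffers = '8GB';
-- shared_buffers requires a full server restart, not just a reload.
-- Verify the running value after the restart:
SHOW shared_buffers;
```

Note that, unlike many settings discussed below, changing shared_buffers only takes effect after a server restart.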
Write-Ahead Logging (WAL) is a standard method for ensuring data integrity. Much like with shared_buffers, PostgreSQL writes WAL records into buffers, and these buffers are then flushed to disk.
The size of this buffer is set by the wal_buffers setting; by default it is sized automatically at 1/32 of shared_buffers, capped at 16MB. If the system being tuned has a large number of concurrent connections, then a higher value for wal_buffers can provide better performance.
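As a sketch, a write-heavy system with many concurrent connections might raise wal_buffers explicitly; the 64MB figure here is an assumption for illustration, not a universal recommendation:

```sql
-- Assumed example for a write-heavy, high-concurrency workload
ALTER SYSTEM SET wal_buffers = '64MB';
-- wal_buffers also requires a server restart to take effect
SHOW wal_buffers;
```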
effective_cache_size has the reputation of being a confusing PostgreSQL setting, and as such, it is often left at its default value.
The effective_cache_size value provides a rough estimate of how much memory is available for disk caching, both by the operating system and within the database itself, after taking into account what's used by the OS and other applications.
This value is used only by the PostgreSQL query planner to figure out whether plans it's considering would be expected to fit in RAM or not. As such, it's a bit of a fuzzy number to define for general use cases.
A conservative value for effective_cache_size would be 1/2 of the total memory available on the system. Most commonly, the value is set to 75% of the total system memory on a dedicated DB server, but it can vary depending on the discrete needs of a particular server workload.
If the value for effective_cache_size is too low, the query planner may decide not to use some indexes, even when they would greatly increase query speed.
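As a sketch, on a hypothetical dedicated server with 32GB of RAM, the common 75% guideline gives 24GB. Because effective_cache_size is only a planner hint, it does not actually allocate memory, and changing it requires only a configuration reload:

```sql
-- Hypothetical dedicated DB server with 32GB RAM: 32GB * 0.75 = 24GB
ALTER SYSTEM SET effective_cache_size = '24GB';
SELECT pg_reload_conf();  -- no restart needed; this is only a planner hint
SHOW effective_cache_size;
```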
The value of work_mem is used for complex sort operations, and defines the maximum amount of memory to be used for intermediate results, such as hash tables, and for sorting.
When the value for work_mem is properly tuned, the majority of sort operations are performed in much faster memory, rather than being written to and read from disk.
However, it's important to ensure that the work_mem value is not set too high, as it can "bottleneck" the available memory on the system as the application performs sort operations. The system may allocate work_mem several times over for each concurrent sort operation: with 100 concurrent connections each running a query containing two sort steps, for example, a 256MB work_mem could in the worst case request roughly 50GB of memory.
Because of this important caveat, it's ideal to set the global value for work_mem relatively low, and then raise it for the specific queries that need more:
SET LOCAL work_mem = '256MB';
SELECT * FROM db ORDER BY LOWER(name);
While work_mem specifies how much memory is used for complex sort operations, maintenance_work_mem specifies how much memory is used for routine maintenance tasks, such as VACUUM, CREATE INDEX, and similar.
Unlike work_mem, however, only one of these maintenance operations can be executed at a time by a database session, and most systems do not have many of them running concurrently. As a result, it's typically safe to set this value much larger than work_mem; the larger amount of available memory can improve the performance of vacuuming and of restoring database dumps.
The default value for maintenance_work_mem is 64MB.
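As a sketch, maintenance_work_mem can be raised for a single session just before a heavy maintenance task; the table and index names below are hypothetical:

```sql
-- Raise the limit for this session only (assumed 1GB budget)
SET maintenance_work_mem = '1GB';
-- Hypothetical example: the index build can use the larger allowance
CREATE INDEX idx_orders_customer ON orders (customer_id);
-- Return to the server default afterwards
RESET maintenance_work_mem;
```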
Using a tool such as PGTune (https://pgtune.leopard.in.ua/#/) to craft an initial configuration is worthwhile, but the key to getting the absolute best performance is benchmarking your workload and comparing it against a known baseline.
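For establishing that baseline, the pgbench tool that ships with PostgreSQL is one option. As a rough sketch (the database name, scale factor, client count, and duration here are all assumptions to adjust for your environment):

```shell
# Initialize a pgbench test database (hypothetical DB named 'benchdb')
pgbench -i -s 50 benchdb

# Run for 60 seconds with 10 concurrent clients and 2 worker threads;
# repeat before and after each tuning change and compare the reported TPS
pgbench -c 10 -j 2 -T 60 benchdb
```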
It's also important to remember that even the most well tuned database cannot salvage poorly formed queries. Developers creating applications which interface with the database need to be mindful of how queries are written.
If a query performs heavy joins or other expensive aggregate operations, or if a query is performing a full table scan where an index could be used, it will nearly always perform poorly, no matter how well the database settings are tuned.
We hope that the brief explanations above provide enough insight to allow you to go forth and tune your PostgreSQL installs! We're also here to help with PostgreSQL support and to help troubleshoot any PostgreSQL performance issues you may come across.