Best Practices for Tuning Slow Postgres Queries | POSETTE: An Event for Postgres 2025
AI Summary
Luke Suttel, founder and CEO of PG Analyze, presents a talk on optimizing slow queries in Postgres. He covers five key areas: debugging slow queries, benchmarking with EXPLAIN ANALYZE BUFFERS, planner costing, join order and parameterized index scans, and guiding the planner to better plans. The talk provides practical tips and a mental model for interpreting EXPLAIN ANALYZE output and fixing performance issues.
Start by asking if the query is always slow or just sometimes slow. Check for I/O time issues or plan changes due to ANALYZE or statistics updates.
Use EXPLAIN ANALYZE to get actual execution statistics. The BUFFERS option (default in Postgres 18) shows I/O timings and data read, crucial for debugging.
A buffer is an 8KB page. Table bloat from dead rows reduces rows per page, increasing buffer counts and scan costs.
EXPLAIN ANALYZE adds overhead, especially with timing and buffers. Always run the query without EXPLAIN ANALYZE to get true performance.
The planner does not find all possible plans, does not change plans when expectations fail, and does not learn from past bad plans. Garbage in, garbage out applies.
Bad row estimates often come from non-independent clauses. Use CREATE STATISTICS to track functional dependencies between columns.
Join order affects scan methods and performance. Postgres may change join order for the same query based on input parameters.
Parameterized index scans require nested loops and must be on the inner side. They pass values from the outer table to the inner index scan.
Turn off planner features (e.g., enable_seqscan=off) to force a good plan and compare costs to find misestimates.
pg_hint_plan allows specifying join order, index usage, etc., for complex queries. It is a last resort due to overhead.
Query plan optimization requires understanding how Postgres estimates costs and row counts. Experiment with statistics, cost settings, and rewriting queries before resorting to planner hints.
Clickbait Check
90% Legit"Title accurately reflects content: practical best practices for tuning slow Postgres queries."
Mentioned in this Video
Tutorial Checklist
Study Flashcards (10)
What is a buffer in Postgres?
easy
Click to reveal answer
What is a buffer in Postgres?
An 8-kilobyte page used for internal storage.
03:43
What does the BUFFERS option in EXPLAIN ANALYZE show?
easy
Click to reveal answer
What does the BUFFERS option in EXPLAIN ANALYZE show?
I/O timings and how much data was read from disk or cache.
03:31
Why does EXPLAIN ANALYZE add overhead?
medium
Click to reveal answer
Why does EXPLAIN ANALYZE add overhead?
Because it measures each repeated access to the plan node, especially with timing and buffers enabled.
04:51
What is the default sequential page cost in Postgres?
easy
Click to reveal answer
What is the default sequential page cost in Postgres?
1.
08:09
What is the most typical cause of bad row estimates?
hard
Click to reveal answer
What is the most typical cause of bad row estimates?
Clauses are not independent; Postgres assumes independence and underestimates rows when multiple conditions are correlated.
09:54
How can you improve row estimates for correlated columns?
medium
Click to reveal answer
How can you improve row estimates for correlated columns?
Use CREATE STATISTICS to track functional dependencies between columns.
10:33
What is a parameterized index scan?
hard
Click to reveal answer
What is a parameterized index scan?
An index scan that receives values from another table (e.g., in a nested loop join) to locate rows.
14:06
What is the purpose of pg_hint_plan?
medium
Click to reveal answer
What is the purpose of pg_hint_plan?
To specify specific parts of a query plan (e.g., join order, index usage) when other methods fail.
18:14
What is a common technique to fix join selectivity problems?
medium
Click to reveal answer
What is a common technique to fix join selectivity problems?
Increase statistics target to collect more most common values.
11:15
What does the startup cost in a plan node represent?
medium
Click to reveal answer
What does the startup cost in a plan node represent?
The cost to fetch the first row using that method.
07:26
🔥 Best Moments
Plans are fuzzy
Luke quotes Paul Lane: 'Ultimately the plan is fuzzy,' highlighting that many valid plans exist and the planner must balance quality with resource usage.
06:06Planner doesn't learn from past mistakes
Reveals that Postgres will happily keep using a bad plan even if it performed poorly before, a key insight for debugging.
07:00Join order changes for same query
Surprising finding: Postgres may use different join orders for the same query depending on input parameters, affecting performance.
13:10Full Transcript
Download .txt[00:00] I'm Luke Suttel, founder and CEO of PG Analyze. We're going to talk about how to optimize slow
[00:13] queries and Postgres. Specifically, when you have an explain-analyze output and you don't know how to interpret it, I'm going to give you a few tips and tricks and a better mental model for working with slow queries in Postgres and how to fix them.
[00:27] All right, let's dive in. There are five things we're going to talk about today. First, how to debug why a query is slow. Second, how we can benchmark queries to explain and analyze with the buses option.
[00:39] Third, how planner costing works and why it really can never be perfect. Fourth, we talk about drawing order and how parameters index can greatly affect query plans. And last, we'll talk a little about how we can guide the plan to write plans.
[00:52] So generally, you know, I think it's important to have a good sense for how you go about debugging, right? Like what a good debug loop is that you can think of. And usually I start debugging when I'm saying, well, is the query always slow? Is it just sometimes slow?
[01:04] Typically, you know, here we have a query that has 1.4 seconds of average execution time, but the outlier took 14.6 seconds, right? So this is usually a good starting point saying, well, why did it take 14.6 seconds in that one case?
[01:17] Oftentimes, IOTime can be a problem. So IOTime might be in a particular note just taking either loading more data or maybe the disk is overloaded and that greatly influences, of course, the whole runtime of the query.
[01:29] And unfortunately, especially in the cloud, IOT latency can do pretty bad. Postgres 18, which is the upcoming Postgres release, will make some of these things better. But for now, on most production systems, we'll have really inefficient IOT on some of these cloud provider network storage options.
[01:43] Also, sometimes what we see is not necessarily I.O. being the problem, but the plan having changed. The query might have a good query plan and a bad query plan. For example, maybe an analyze ran recently and changed the internal statistics,
[01:55] or other things caused the plan to change. And so really what we're starting with is to say, well, we have a slow query. It's slow. What can we do? So we start a benchmark query with explain-analyze, which is basically put explain-analyze in front of the query.
[02:08] We form a thesis on what to change based on that explain-analyze. Then we try out a change, which might be rewriting a query or other optimizations. And then hopefully, you know, what we're trying to get to is that we actually make it very fast,
[02:21] and then we figure out a way to make that change from that. This is a loop that people go through day-to-day all the time, the slow-scratch, right, actually buying slow queries. So this shouldn't come as a surprise to you if you've had a slow query in your life before.
[02:33] But what I want to give you is basically better tools of working in each of those steps. So just in case you're not familiar, I want to repeat maybe the most important things about how to interpret expand-analyze. So explain-analyze, right, we just, you know, either put explain without the analyze in front of a query, explain-analyze, or then explain-analyze buffers.
[02:50] In the upcoming Postgres 18 release, buffers will be a default, so if you just want to explain-analyze without specifying buffers, it will automatically be set between the buffers off if you want to turn it off. Now, an explain-output without the analyze just gives us the query plan that Postgres wants to use,
[03:05] but it doesn't tell us anything about, you know, the actual execution statistics. For that we have to add a plane analyze, which in this case shows us that the query executed in 307 milliseconds of runtime. It also tells us things like, did a particular plane note perform worse than expected?
[03:19] Did it remove a lot of information after, for example, scanning the table and such. And then we add the buffers option, or again, if you're in Postgres 18, then you'll actually get information like I.O. timings and how much data was read.
[03:31] This is really important because it helps you understand whether the slowness of the The query comes from the amount of data it's working with. It has to load from disk if the data was in the cache and such. There's really important information for debugging.
[03:43] Now, when we see buffers, what we mean with a buffer is an 8-kilobyte buffer page. Basically, the internal storage of Postgres is structured this way. So each time you load data, even if you load just a single row,
[03:55] you'll have to load a full 8-kilobyte of buffer page from the disk or from the memory. Now, the number of how many rows you have per page is greatly influenced by table bloat. So if you have a lot of dead rows and then your table bloat and you have a lot of empty space in the table,
[04:12] that greatly influences how effective your scans are. And so it is important to watch these buffer counts if they're unexpectedly high. Also, the other thing to note is if you see a hit counter in the buffered output, that means it's a cache hit.
[04:24] So it's generally a good thing, but it can be confusing because the problem is if you have a nested loop, the loop loops over the same basically data over and over again. What can happen is that it basically double counts the same buffer block.
[04:37] So in this case, for example, it might have been that there were only 12 buffers uniquely loaded, but 1,200 buffer accesses occurred. And so you've got to interpret these hit counters when you see them. The other thing to know with ExplainAnalyze is that it has overhead.
[04:51] So when you run a query without the explain analyze and it's a query that doesn't work for a lot of rows, usually the measurements are pretty similar. However, once you have a query that's very slow, in this case, the query just meshed from the client side takes 340 milliseconds.
[05:08] But if you measure the query with explain analyze in the picture without timing, which means no per node timing information, it's similar, right? It's 404 milliseconds. So yeah, a little bit of overhead here. But where it gets really bad is if you have timing on and the buffer's option on.
[05:24] Then Postgres basically measures each repeated access to the plan node, and that just takes a lot of time. And so when you benchmark, Explain and Analyze will give you the insights into what's going on, but you should always run it without the Explain and Analyze also to get the true performance,
[05:38] basically without the overhead of measuring Now let talk a little bit about planner costing So planner costing is basically the mechanism the planner uses internally to decide which career plan to use So you know
[05:51] core mechanism here, right, like we looked at, you know, you have a back career plan, we found a plan to analyze, now it's for my thesis and what to change. That thesis has to interact with how the planner, you know, comes up with a plan. And so, as Paul Lane said in the presentation many years ago,
[06:06] ultimately the plan is fuzzy, right? So there can be many valid plans for the same query, but it's not always clear which one is the best query plan. And so basically a planner has to find a good query plan, but also doesn't, shouldn't spend so much
[06:19] time or memory finding it, and it also needs to support the extensible aspect of Postgres, right? Especially that last point means sometimes Postgres has to do more work to figure out which data types, you know, being used and such,
[06:31] which index access methods exist. And so some limitations exist because the planner is expensible, but then also, you know, wants to consume finite CPU of memory. So most important thing to know is the planner does not find all possible plans.
[06:45] It actually descards plans to look back pretty quickly. Also, it doesn't change plans when expectations don't hold true. So if you row count this off, then, you know, it doesn't fix that. It doesn't rerun. And also, it doesn't know if past execution is the same query,
[07:00] had a bad plan that performed badly, it will happily keep doing the same flow query. So really cost estimation is how the planner makes decisions. Ultimately garbage in, garbage out applies here, as Thomas said in the mailing list comment
[07:13] some time ago. So basically if you put bad data into the query planner, as in bad statistics, you'll get a bad query planner. Now, quick information here is we can look at the cost on each of these plan notes.
[07:26] The cost itself is a startup cost component, which is the first row you're fetching, and and the total cost is if you fetched all the rows with that particular method. In this case, index scan, if we basically scan for all those rows. Now here, we estimated that there's 10,000 rows being fetched,
[07:42] which is the output of that plan node, and the planner also estimates how wide each of these rows is. So what is cost, really? Cost is not really a specific unit. It's more a currency that the planner uses when it does its cost-based search.
[07:57] So it doesn't, like, you could argue that maybe what it means is the cost of sequential scans. So one of the core things, if you look at how Postgres internally costs a sequential scan on a table,
[08:09] it basically multiplies the sequential page cost, which is set to 1 by default, with the number of pages you're looking for. So if you have a table that has 8,000 pages, the sequential scan cost will be 8,000 roughly.
[08:21] So you could say that base sequential scan cost basically is what all the other costs are based on. Most importantly, what's the cost of an index scan? Well, the cost of an index scan is dependent on which index access method you're using.
[08:36] So if you use a Betray index, it's going to be different than if you use a hash or print index. And because, again, both this is extensible, this passes down to the particular index access method. For example, in the case of a Betray index, we have the btCostEstimate function.
[08:49] that function ultimately tries to figure out how selective is the part you're looking for in the index with the selectivity calculation, and then it multiplies that with the number of tuples, and then it factors into the regular cost.
[09:03] Now, this function here, this clause list selectivity function, is one of the key facts that will impact your query plan's performance, or the query plan's quality, rather. So this here basically works from limited information where it's like you're looking for this, you know, kind of, you have these filters on this table, and you're trying to say, is this, you know, likely to be found, right?
[09:26] Like, am I looking for something that's very unique, or am I looking for something that's very common in the table? And that then drives how selective your query is, how selective your search is, basically, in the table or the index.
[09:39] Now, this, you know, then drives the cost, right, because the cost basically multiplies with that selectivity number. So, for example, the way we can see this selectivity number is kind of indirect because the total number of rows times the selectivity, right?
[09:54] That's the row number you see in the plan. The most typical issue why you get bad row estimates is because clauses are not independent. What that means is if I'm saying A is 1, B is 1, C is 1, B is 1, and E is 1, then basically if all the A's that are 1 also have B1, or there are no C1s that have B1, right, so basically there's dependencies between those different conditions, Postgres does not know that by default.
[10:20] And so because of that, what Postgres will do is will think if you add more conditions, things become more specific, so it will underestimate the vote. Now, we can improve this by using the create statistics command.
[10:33] The create statistics command basically tracks additional information, specifically the schema dependency between columns, that lets us then encode the situation. Another problem on selectivity though is drawing estimates.
[10:46] So drawing estimates are really complicated and often wrong. I don't have enough time in today's talk to really go through all the details here, but suffice to say if you want to look for this, look at the Postgres source, one of the things to know about drawings is that the most common value lists are being used for that.
[11:02] So CoachCraft, when it does an analyze run, will collect the most common values in the table. And then if those basically align with the two tables being joined, that is used to drive some of the join estimates.
[11:15] But one of the ways you can improve the quality basically of the join. So I think the estimate is you can increase what's called statistics target. That then increases the number of most common values that are collected And that way you might get a better estimate of how many rows get returned by a joint node Now let talk about joins in a little bit more detail
[11:34] So ultimately, you know, when we talk about joins, we have to think about two tables, in this case A and B, being joined. So here we see a sequential scan on A, indexed scan on B, and we join those two together in a nested loop join.
[11:47] Oftentimes joins are more complicated, right? So in this case, maybe we join the result of that A and B join with another table C, and then it becomes, you know, ABC as a joint. This here, you know, is what we call the joint order.
[11:59] So in this case, the joint order is ABC. We can also turn this around, so maybe, you know, just to illustrate the difference here, maybe the joint on C happens earlier and then the joint on B happens last, and so the joint order becomes ACB.
[12:12] This performs differently, right, because you basically filter things out earlier or later, depending on the joint order. This, you know, ultimately drives a couple of things. So, for example, join order can influence which scan methods get used,
[12:26] a couple of parameterized index scans in a moment. Join order itself also can just by virtue of, you know, eliminating information earlier or later, influence the performance of a career plan. And then join methods, like whether you have a nested loop join, a hash join, or a merge join,
[12:42] also, again, you know, are charged the plan makes and are influenced by the join order and the scan methods, right? So all of these are kind of interdependent with each other. One of the things you can do, and we've done some kind of prototyping around this ourselves
[12:56] as we analyze, is you can detect the join order in captured explain. So imagine you use auto explain. You're looking at all the flow execution outliers. You can actually analyze which join order is being used by your queries.
[13:10] And so the one thing that surprised me when we did this is that it actually does change for the same query, right? You might think Postgres chooses one join order and keeps it, but the reality is that depending on the input parameters,
[13:22] Postgres will use different join orders because it thinks that maybe, you know, the data being returned from one part of the join is going to be less in one case. And so, like, nothing else can be useful for, you know, highly frequent, you know, changing queries, basically.
[13:37] The other thing to know about is what we, you know, or think about is how can we filter on particular parts, right? So here in this query, we have a sequential scan, and you can see that, you know, in this case, we have a hash join, and so the hash join is the one that basically does the, you know, merging of the two,
[13:52] and the filtering happens, like, on T2, there's no filtering when we load the data. It's really when we join the data that we filter. Now, the most important thing to think about here is that we might have what we call a parameterized index scan,
[14:06] where basically, here's a nested loop, and parameterized index scans require nested loops. and basically what happens is that you first load the data from the one table and then each of the result outputs from, in this case, T1
[14:19] gets passed to the index scan on the second table. And so this means that, for example, when our query says, you know, load from table 1 and also load from table 2 and then match up, you know, the table 1 ID with the table 2 ID,
[14:33] Postgres cannot, you know, like, doesn't know what the value is it's searching for. And so the choices you have here are basically very limited But if you have an index, in this case on T1 ID, on that second table that you want to use,
[14:46] you have to first know the values you're looking for. And so it's going to get the values first from the first table and then put them into the second table to basically locate the actual rows. Another way of putting this is like, these are basically the same query, just two different plans.
[15:04] Here you have a hash join, and then if you have no index scan that's usable without the value being known, then you basically either have to hash it on the sequential scan on the left side here, or you have a nested loop join with an index scan on the right side.
[15:16] And the index scan uses this parameterized index scan parameterized because it inputs the values from the other table. The parameterized index scan must be on the inner side of the nested loop, so this is where the join order comes back, right?
[15:29] If you have bad join order, we're not going to get to it. Now let's think a little bit more about how we can guide the planner to the right plan. So when you think about, you know, query being slow, right,
[15:41] and we found the thesis, we understood kind of the cost and what's going on, the drawing order, like we looked at all these attributes. Now we're trying to say, what can we try to improve this, right? How can we actually get to a better plan? One of the things I like to do is trying to force a good plan
[15:56] and then seeing why folks just chose a bad plan. So one way you can think about this is you have this query, right, the planner produces, you know, for a particular value, a good plan,
[16:08] but then it's also always a bad plan, ultimately, right? The bad plan has a cost, and what you're hoping for with a cost-based, you know, estimate by a planner is that the estimate models the reality, meaning the bad plan doesn't get chosen because it has a higher cost.
[16:23] Now, what can happen, right, is, like, you might have different input values. So in this case, you might have, you know, different objects that you were looking for. And so the back plan gets chosen because the cost is different, right?
[16:35] And so even though basically it's the back plan, because of the input value having different selectivity, we're getting, you know, the back plan gives us lower cost, right? So again, in this case, back plan costs $300. The good plan costs $500.
[16:48] Postgres doesn't know ahead of time which plan is going to be better, and so chooses the worst plan, basically. So one of the things we can do to improve the situation is we can turn off planner features. So planner features are, for example, sequential scans.
[17:03] So let's say we get a bad plan that's a sequential scan and a good plan that's an index scan. Well we can say setEnabledExScan off which means Postgres basically makes the sequential scan extremely expensive in old versions or in Postgres 18 this is changed with an explicit disabled plan note setting so just visualized a little bit differently
[17:21] But the, you know, the fact remains the same. If the sequential scan is the only viable option, it will still give you a sequential scan, it's just extremely unlikely to do that if there's any other way to basically give you a query plan.
[17:34] Now, in this case, what we see, right, is like we get the good plan, and most importantly, we know that the good plan has a cost of 500. So once we have the right plan, we can then look at the right plan with the wrong costs,
[17:47] and we can find out where our misestimate is happening. For example, another thing you can try is if you've seen your hash or merge join being used, but you wanted to use nested loop, you could turn off merge and hash joins and see what the nested loop plan looks like.
[18:01] Now, sometimes this works really well, and sometimes it doesn't work well. And where it doesn't work well is when you have complex queries where maybe you want to turn off message loops for one part of the query plan, but you want to keep message loops for another part of the query plan.
[18:14] And so for this, we can use pghint plan. So pghint plan basically lets us specify specific parts of the query to be done differently. So here, for example, we have a bad plan with a particular drawing order that we don't think is good.
[18:28] And so we also know that there's another plan that works better. And so what we can do is, when we look at the good plan, basically you come up with an idea of how exactly to, you know, force the order.
[18:40] And in pg-hint plan, look it up. It's going to take a long time to explain all the details. But basically pg-hint plan has things like, for this table, use this index. For join these other tables, use this order, right?
[18:52] It's basically prescriptive about how the planner should do things. So in this case, now we know that, you know, the good plan, for example, had a cost of $1,000, $1.4 million. The bad plan, in this case, is out of the memorize function.
[19:06] had 14 million costs, that actually would have been fine. It would have been worse. But because of the Memoize feature in Postgres, the BAP plan was chosen because the BAP plan had a cost of 900,000. That was, you know, $1.4 billion, right? And so now you know that, okay, what happened was Memoize, you know,
[19:21] made that difference basically between those plans. So to summarize, there's basically six ways to guide the BAP plan that I can think of right now that I would like to give you on your journey.
[19:33] First of all, if you have simple scans for activity issues, right? So if you have a problem where the Postgres planner, you know, gives you the wrong row count, look into using create statistics. Create statistics collects, you know, additional information.
[19:49] This doesn't work for cross-table situations, but it works well if you have these functional dependency issues. If you've joined selectivity problems, increase your statistics target to increase the number of most common values to help the join, you know, improve selectivity.
[20:02] Sometimes with joint selectivity, you can also force materialization, which is another technique I'm thinking about to use. Reviewing cost settings can help, right? So as mentioned, these, you know, social settings, like random page cost, sequential scan cost,
[20:15] they do significantly influence what the planner does. And so, again, one common problem might be that, for example, random page cost is still set to four by default. And so random page cost being high means that you get certain,
[20:31] like indexes being used versus others. Like, there's lots of details to this, but basically reviewing those settings and sometimes lower in random page lists in particular can make sense. Sometimes, and we didn't talk much about this today,
[20:43] but it makes sense to understand what particular biases the planner has. So one common situation I've encountered is order by limit, and so sometimes that causes a bad career plan into creating multi-column indexes in the right way.
[20:57] It's a good thing to think about. As mentioned, materialization is a way to fix things. So if you don't want to use a planner hint, which you shouldn't, if you can avoid it, using the materialized keywords for CTEs lets you control which part they flowed first in Postgres.
[21:13] So that's kind of a neat hack sometimes. And then one trick that I would give on the journey for anybody who runs a multi-tenant app, where you have different customers, but they're all in the same database, right? So you have a shared kind of table for all the customers.
[21:27] it makes sense to basically give more of your tables a custom ID identifier or some thing that basically identifies the tenant, and then adding explicit clauses and duplicative clauses to your query plans.
[21:39] It seems a little bit nonsensical, but the reason you do it is that the planner basically has more choice in terms of query plans because it's not forced to use a parameterized in X-CAN. And as we know, kind of overarching principle, I would say choosing better statistics
[21:54] or rewriting queries is always better than choosing PlannerHints. So PlannerHints really should be a method of last resort. PGNinPlan works. Some people use it in production, but it is not efficient.
[22:07] It causes overhead on each query where you're using it, and so I would only use it if you have no other way of fixing it. And that's basically it, right? So query plan optimization isn't magic necessarily.
[22:21] It is in the sense that the linguistic process that Postgres does, And so I really encourage you to basically experiment, try to get a better model of how Postgres comes up with these query plans. And if you're interested, PG-Nalliance helps you make these things easier,
[22:34] but I do think that there's a lot of things you can do just with a PSQL shell or, you know, PGAdmin and just running explain and having a good mental model of how real estimations work. I'm looking forward to, you know, talking more about this.
[22:47] If you want to reach out to me, feel free to email me, Lucas at PG-Nalliance.com. I'm always happy to talk about how to optimize Postgres queries. Thank you.