2024-12-07 | DjangoCon 2024 | A Guided Tour Through Postgres Internals with Elizabeth Garrett Christensen
探索PostgreSQL内部:从基础命令到性能优化
标签
媒体详情
- 上传日期
- 2025-06-21 17:47
- 来源
- https://www.youtube.com/watch?v=ppg8-iAwd34
- 处理状态
- 已完成
- 转录状态
- 已完成
- Latest LLM Model
- gemini-2.5-pro
转录
speaker 1: Thank you so much for having me. I love jgochan. I do quite a bit of community stuff with postgres, and I copy all of the great ideas that you guys have. And you guys are real leaders in kind of how to build an awesome community. And thank you for inviting me back. I always love speaking here. Postgreinternals is, without a doubt, the most boring topic I could be talking about at the end of this conference. I'm going to try to make this entertaining painless, and if I don't succeed, you can buy me a drink later. And thatbe more entertaining than this. I work at a company called crunchy data. We do manage postgras on cloud. We have a new data warehouse product, and then we also do post grestuff and Kubernetes. This isn't really part of my talk, but postgres 17 is coming out tomorrow. I've been watching and everything's tagged and ready to go. If you are not on version 13 or above, you should be upgrading your postgres. There's not a ton of superherheadline awesome features in postgres 17, but there is some cool performance stuff, especially with betree indexes. I know those are used heavily by the jgo community, so there's some cool stuff in there. If you end up upgrading, I'm going to talk through a bunch of kind of little commands in postgres. I have some queries. Here's a qr code or a link to a just file. Nobody needs these slides because they're not that good. But if you need the sequel, it's in there. All right, so I kind of want to just go through at a high level. I know lots of you have tons of postgreexperience, some of you don't have a ton. So I'm gonna na start at the very beginning and then we'll get into some more complicated stuff. So if you've never gone into postgreor, into psel, there is a command line interface for it comes and installs with postgres itself. You can't really install it as a side piece. You're if you've got a local copy of postgress running, you're fine. You already have it. If you have like you know your company's postgres is running at Amazon rds, you will have to install postgres in order to work with the psql command line. It's not the biggest package in the world. Like it could be smaller, but it's not terrible. If you're like on a mac, you can just brew, install postgraphs. There's lots of other distributions that they provide as downloads that work pretty well. To get into the command line interface for psql, you'll just connect it to your host. If you're connecting to a remote host, like you have a database somewhere else that your company manages, you'll need to create a connection string with, you know, your user password, host port and all that stuff, so you know who you are. Once you're inside psql, you can ask it, you know, who you are and confirm all of that stuff. That's the slash con info command. You can do a slash du and find out who else is in this database. So who else has permissions, what other applications might be in here, all kinds of stuff like that. You can change the formatting with the backslash x auto. So if you've ever been in like in postgreand, you've queried through the command line. The queries just get really, really long, and they go out to the right and they're like impossible to read. So this is a way that itjust batch up all the records for you and it's easier to read. So I know that people hate clis. I hate clis. I had to learn the Amazon cli this year, and I told my boss, I'm not learning any more clis this year. So if you don't like clis, it's fine. You can do most of the stuff that I'm gonna na talk about today in a gui. Lots of people use pg admin. It's very popular. I personally use dbeaver. I think it's just a better interface. And then there's other pythonic, you know data gritools and other stuff like that. So Yeah. So now that we have kind of connected, I'm going to talk through some of the ways that we can kind of see what's going on inside our database. I'm going to start at the very, very beginning. So if you have a postgreinstallation, you may have more than one database inside of it. This is a nice way to experiment. Start test projects, do lots of stuff without Ching touching your other stuff. I like have probably like 100 test databases. This is a small sample so you can find out all the databases that are inside this postgres instance with the backslash l. If you want to change databases, you'll land often in the postgredatabase. But if you find other databases that you want to work inside, you can do the backslash c and you can go into another database. I don't know how many of you know this, but the entire postgress ecosystem runs on Jango. So all of our conference registration memberships, like the whole postgreworld, runs in Jango. So some of the screenshots in here are from the Jango database that comes with the postgreproperties. So backslash d plus will get you like a display of table information and show you you know what your table names are. This is at the point that you're kind of trying to figure out what you've got right? If you're exploring your own database or maybe someone's asking for help, this is a good place to start to see you know what you've got. If you do the backslash d plus thatshow you the size of each of the tables. And if you need to like you know look at storage size, this is like an easy way to do it. Once you're once you know what table you want to look at, you can do a described table and that's the backslash d and you can and then you'll add the table name to it. The backslash d plus is really, really handy, right? Because it's got tons of data about the table. So it's sort of describing for you what columns, data types, if you have primary keys, foreign keys, so if you're you know kind of messing around and you're you know your Jango code and you're trying to like figure out what's happening and what's been mapped to what and how it's displaying in the database, this is a great place to like confirm what's happening. Another just small plug for why I'm showing you this stuff in psql and I'm not showing you the stuff in queries, is that the backslash dt, the described tables for postgres is twelve separate sql queries. This is a small sample of it. So you can do this yourself. Like if you wanna get all of your internals described, you can totally find the queries. They're probably on stack overflow or in the postgres docs, and then you can run those in a gui. But psql does a good job of kind of collapsing all that for you. Another cool thing that you can do with psql and all of the stuff that I'm kind of talking about today is you can do an echo so you can have the commands that you're running be echoed, and then you can get the sel for them. So like let's say that you want na echo, you know, the describe table thing, because you want na change that query a little bit and you want na know specific stuff about tables. You can pull off the sequel and then kind of rearrange it yourself and decide what you want na do with it. And then once you get past the describing the databases, describing the tables, describing what's in the tables, what columns and rows there are, you have to switch to sql to actually see the data. There's no psql for like data. And then you can also, if you're mucking around and messing around in postgres, you can find all your settings. You can they're a little a select query. If if you're an admin, you can just find all of your postgres settings. This is super helpful if you're on a managed platform because you may not have all the bells and whistles to go into your underlying configuration files. And this is nice for people to just take a peek at what their settings are. If you're going through the postgres docs, one of the things I do at crunchy data is write like tutorials. Some of the stuff that I've showed you today is already inside tutorials that we have. Like I have one about basic psql. I have one about the echo stuff I was talking about. These all run in a web browser in like web assembly. So you don't need to install post graphs or do anything with your computer. You can just it even works on a phone. I was like showing somebody postgres business on a phone recently. So that's a learn dot crunchydata dot com if you want to mess around with that. All right. So let's kind of transition here into talking about what's happening in the database. So I know you guys are all really good at using like the Django tools to kind of see what's happening. As it happens, postgres has a lot of the same kinds of things, although they're stored in the database in a slightly different way. If you're using some kind of monitoring, this is not a replacement for that. So I'm going to talk about a couple of different things you could do. This is more just like things that are in the database if you want to dig in a little bit more, not a replacement for you know some kind of full application monitoring. So there's a table inside postgres called pg Stat activity that will tell you everything that's happening inside the database, how long it's been running for, when it started and what the state is. This is super helpful. Like if everything is not working and your database is completely not taking queries or you know things aren't working, you can go in and see what's running and then you can you can stop the pid that's actually stopping the database from working. You can also see in the pg stactivity who is doing work in the database. So you can see if your application has connections, how many connections that application has individual users, lots of stuff like that. So if you are kind of messing around trying to figure out how many connections your application has opened, I know I was talking a couple of people last night about just like the connection management, knowing how many connections are open or how many connections are idle is super helpful when you're trying to figure out what to do with your connection management. Several of the slides I have kind of sprinkled in here are like queries that have you know specific stuff kind of pulled out of them. And the reason for that is that these internal tables that I'm talking about are just huge. There's tons and tons of data in them and it's virtually impossible to kind of talk about if I don't show you a small sample of it. So you're if you're wondering if there's a lot more than there needs to be, this is postgres. So of course there is. Yeah so another cool table for kind of what's happening is the pg Stat database table that will show you all the transactions that are happening in individual databases. This is kind of the way that people measure transaction volume. If you run in this query now and you run it an hour from now, you will know what your transaction volume per hour is. And you could do that for days. But this is super helpful for just knowing how busy you are. Postgres holds a bunch of locks, which I'm sure you guys are super familiar with. You probably run a Jango migration and a locked your database and now you're not allowed to run Jango migrations. Maybe it's just me. Yeah. You know there's a lot of things, a couple things inside postgres that will actually lock postgres tables and those are table alteration commands. Like if you change a column or do something like that, there's a couple other things that will lock your tables a lot of times like I help on our support team and we spend a lot of time dealing with locks. It is kind of one of the unfortunate things about a relational database that's not trying to lose your data is that it needs a minute to like commit its transactions. But Yeah, so you can look at what is locked. Like there's a table where the locks are stored. One of my colleagues wrote this query because what happens with these locks is something will lock and then itbe sort of this you know cascading effect, right? And all these queries will be waiting for it and you're probably trying to find out like why is this query waiting? Like why is it not able to run? And then it's gonna na take you a minute to figure it out. So some one of my colleagues wrote this kind of locks query that just like has a cte at the beginning and then just kind of gives you the original thing that locked everything up. So that's super helpful if you're like, okay, I've got a in the process that's locking like 100 things. Postgres also has views for everything that's have so maintenance tasks that happen inside postgres, right? If you're super familiar with vacuum, I'm sorry, I have also spent a lot of time working on postgress vacuum. Again, it's kind of just an unfortunate remnant of the transaction system that you know needs to kind of keep dead stuff around. But you can check on things like if if you have a copy command, like if you're running some kind of etl process at night and loading a bunch of data, you can just check on the process of those maintenance tasks, which is pretty handy sometimes if you want to dig in more on vacuum, you can query post greand. Kind of ask it, you know how long ago did you vacuum and different things like that. All right. So we kind of covered what's inside of our database, what kind of data we have. We've covered what's happening. Postgres has this whole other world of cumulative stats that are really good. And there's been a lot of development even in the last couple versions of postgres to kind of build out some of this stuff. If you're using a postgres monitor, probably some of the monitoring tools are built off of these pieces, but you know they're kind of fun to just like get in and queer yourself too. So the pg Stat statements, if you're not familiar with it, is kind of the query tracking piece of postgres. Pg Stat statements ships with postgres, but it's not turned on. So you have to go and add it as an extension and then add it to your library. But it is like the most helpful thing you could do as an application developer to kind of find out like what's happening with your queries, how often good queries run. It's definitely the place to start when you're trying to do some query optimization and work, you know kind of if you're working backwards from the database to your application code. So here's a query to just find. Know your slowest ten queries and pg Stat statements surprising no one. The first slowest query in this database is a refreshing materialized view, which if you have any of those, they take like hours to run depending on how big they are. But you can go through it and you know some of the stuff you're obviously not gonna na be able to fix. And then you can just pick the things that you want to work on. There is a pg Stat io user table. There are a ton of io and memory related things inside postgraphs. Now if you're interested in memory usage, if you're concerned about memory usage, or you just love kind of that kind of piece of postgres, some of these internal tables have a ton of data in them. So one of the things that you can get out a pgstuser tables is a cash hit ratio. And so postgresses keeping track of how many things it served from the memory so that like shared buffer memory, and then how many things that it had to read from the underlying disk. So you know if you're familiar with kind of how the ideal postgress world works, is that you want the vast majority of your data in postgressmemory so that everything is super fast. So you're looking for a cash hit ratio in the nineties, but you can find out what it is. There's a pg stuser table that has a bunch of information about what's happening when tables are queried and scanned or indexed. And this will show you kind of this is a little query that somebody wrote that I think is kind of cute because they like did it up so that if there are more sequence scans than there are index scans, you're missing an index. So anyways, this you know this is one way that you can find out kind of what what is happening on tables in terms of like actual query behavior. And then pgstat user index is kind of similar to the one I was just talking about, has a ton of information about the indexes, how often they're used, what's going on with them. And then you can kind of decide like, okay, great. Like that's index is being used. Or you can be like, this index is never used and I should delete it because it's taking up space on my disk. You can reset all these Stat tables that I just talked about, which is super handy. So like if you go and you do a bunch of indexing and you do a bunch of great stuff with your queries and you don't want na like muck up all of your stats tables, you can reset it. I sat down and showed some of my slides last night and showed them the slide and they reset all of their stats, and I don't think they meant to do that. So maybe think about it before you copy something from one of my slides and run it on your database. Yeah. So I'm just going to kind of wrap up by like what I sort of really wanted people to get out of this talk and kind of people to walk away with is sort of you know that there's a bunch of data in your database, right? Like you know that postgrehas all of your users and all the data and all the settings, but there's actually a bunch of other information that is collected in the database you know for all of this activity and transactions, locks, connections and then all of this like kind of performance over time stuff that you have at your fingertips and you can just do whatever you want with I have a postgres meetup that's online called postgres meeup for all. It's pretty new or starting in October, if anybody feels like joining an online group of postgres people that are kind of loud and obnoxious. If you want to find me on social media, I'm sequel Liz, or as some people call it, sequel lies. And this is another chance if you need the just file and I'll take some, I'll take questions, I guess, in the hall so you guys can get your snacks. Ach so.
最新摘要 (详细摘要)
概览/核心摘要 (Executive Summary)
本次演讲由 Crunchy Data 的 Elizabeth Garrett Christensen 在 DjangoCon 2024 上呈现,为 Django 开发者提供了一份深入探索 PostgreSQL 内部机制的实用指南。演讲的核心主张是,开发者应利用 PostgreSQL 自身丰富的内部数据来诊断性能问题、优化查询和理解数据库行为,而不仅仅依赖外部工具。演讲内容涵盖了从基础的 psql 命令行工具使用,到探索数据库结构(如列出数据库、表、查看表结构),再到监控实时活动(如通过 pg_stat_activity 查看当前连接与查询、分析锁问题),最终深入到分析累积性能统计数据。
演讲重点强调了几个关键的内部视图和扩展。pg_stat_activity 被认为是诊断数据库“卡死”和管理连接池的关键。而 pg_stat_statements 扩展则被誉为“应用开发者能做的最有帮助的事”,它能够追踪查询的性能指标,是定位慢查询和进行优化的起点。此外,演讲还介绍了如何通过 pg_stat_user_tables 和 pg_stat_user_indexes 等视图来评估缓存命中率、发现缺失的索引以及清理未使用的索引。整个演讲旨在通过实际的命令和查询示例,赋予开发者直接与数据库“对话”的能力,从而更高效地进行性能调优。
PostgreSQL 17 发布及升级建议
- 发布信息: 演讲者提到,PostgreSQL 17 版本预计在演讲次日(即2024年12月8日)正式发布。
- 升级建议: 强烈建议仍在使用版本13或更早版本的用户进行升级。
- 新特性亮点: 尽管没有“超级英雄级”的头条功能,但版本17在性能方面有显著提升,特别是针对 B-tree 索引的优化,这对 Django 社区尤为重要。
工具与连接:psql 及图形化界面
- psql 命令行工具:
- 是 PostgreSQL 自带的命令行接口,需要本地安装 PostgreSQL 才能使用。
- 连接远程数据库(如 AWS RDS)需要提供包含用户、密码、主机和端口的完整连接字符串。
- 常用命令:
\coninfo: 确认当前连接信息。\du: 查看数据库中的所有用户和角色。\x auto: 切换为扩展显示模式,使宽表查询结果更易于阅读。
- 图形化界面 (GUI) 替代方案:
- 演讲者理解并尊重不喜欢命令行的开发者。
- 推荐工具:
PGAdmin: 非常流行。DBeaver: 演讲者个人偏好,认为其界面更优。- 其他基于 Python 的工具,如
DataGrip。
探索数据库内部结构
- 数据库与表信息:
\l: 列出当前 PostgreSQL 实例中的所有数据库。\c [database_name]: 切换到指定的数据库。\d+: 显示当前数据库中的所有表,并包含其大小信息。\d [table_name]或\d+ [table_name]: 描述特定表的结构,包括列名、数据类型、主键、外键等。这对于验证 Django ORM 模型与数据库表的实际映射关系非常有用。
- psql 命令的优势:
psql的快捷命令背后是复杂的 SQL 查询。演讲者指出:> “\dt这个描述表的命令,背后是十二个独立的 SQL 查询。”- 通过在
psql中开启echo模式,可以查看到这些快捷命令对应的底层 SQL,方便开发者学习和自定义查询。
- 查看数据与配置:
- 查看表中的实际数据必须使用标准的
SELECT语句。 - 可以通过查询
pg_settings表来查看数据库的所有配置项,这在无法直接访问配置文件的托管平台上尤其有用。
- 查看表中的实际数据必须使用标准的
监控实时活动:“数据库正在做什么?”
演讲者明确指出,这些方法适用于即时探查,不能替代专业的全功能监控系统。
pg_stat_activity表:- 功能: 显示数据库中所有正在运行的进程、执行的查询、运行时间、启动时间以及当前状态。
- 核心用途:
- 诊断故障: 当数据库无响应时,可进入此表查看是哪个查询或进程(
pid)导致了阻塞,并可手动终止它。 - 连接管理: 分析应用的连接行为,例如有多少连接处于活动状态,有多少处于空闲状态,为优化连接池提供依据。
- 诊断故障: 当数据库无响应时,可进入此表查看是哪个查询或进程(
pg_stat_database表:- 功能: 跟踪每个数据库的事务统计信息。
- 用途: 通过在不同时间点查询该表,可以计算出事务处理量(如每小时的事务数),以衡量数据库的繁忙程度。
- 锁 (Locks):
- 常见问题: 锁是关系型数据库中的一个常见痛点,尤其是在执行 Django 迁移(
ALTER TABLE命令)时,很容易导致表被锁定。 - 解决方案: 演讲者分享了一个她同事编写的查询,该查询能够穿透连锁的等待效应,直接定位到最初引发所有锁的根源查询,极大地简化了调试过程。
- 常见问题: 锁是关系型数据库中的一个常见痛点,尤其是在执行 Django 迁移(
- 维护任务监控:
- PostgreSQL 提供了专门的视图来监控后台维护任务的进度,例如
VACUUM(垃圾回收)和COPY(数据加载)。
- PostgreSQL 提供了专门的视图来监控后台维护任务的进度,例如
分析累积统计数据:性能优化洞察
pg_stat_statements扩展:- 重要性: 演讲者称之为 > “作为应用开发者,你能做的最有帮助的事情”。
- 功能: 跟踪所有执行过的查询的详细统计信息,包括执行次数、总耗时、平均耗时等。
- 设置: 该扩展随 PostgreSQL 一同发布,但默认关闭,需要手动在配置文件中加载并创建扩展。
- 应用: 是进行查询优化的起点,可用于识别最慢或最频繁的查询。
- I/O 与索引使用情况:
- 缓存命中率 (Cache Hit Ratio):
- 这是一个关键性能指标,反映了有多少数据请求是从内存(共享缓冲区)中服务,而不是从磁盘读取。
- 理想的缓存命中率应在 90% 以上。
- 可以通过查询
pg_stat_user_tables等视图来计算。
pg_stat_user_tables表:- 可用于分析表的查询行为。一个实用的查询示例是比较表的顺序扫描次数与索引扫描次数,如果前者远多于后者,通常意味着缺少必要的索引。
pg_stat_user_indexes表:- 提供每个索引的使用频率信息。
- 可用于识别从未使用过的索引,这些索引可以被安全删除,以节省磁盘空间并减少写操作的开销。
- 缓存命中率 (Cache Hit Ratio):
- 重置统计信息:
- 可以使用
pg_stat_reset()系列函数来清空上述累积的统计数据。 - 严重警告: 演讲者分享了一个趣闻,告诫听众在生产数据库上执行此操作前务必三思,以免意外丢失宝贵的历史性能数据。
- 可以使用
核心结论与资源
- 核心观点: PostgreSQL 不仅仅是一个数据存储容器,它内部收集了大量关于自身活动、事务、锁、连接和性能历史的元数据。开发者应充分利用这些触手可及的信息,进行深入的性能分析和优化。
- 相关资源:
- SQL 查询: 演讲中使用的所有查询都已分享在一个 Gist 文件中。
- 互动教程: Crunchy Data 网站(
learn.crunchydata.com)提供了可在浏览器中直接运行的 PostgreSQL 互动教程。 - 社区: 演讲者正在组织一个名为 “Postgres Meetup for All” 的线上交流小组。
- 社交媒体: 演讲者的社交账号是
sqlliz。