postgresql – 刷新物化视图CONCURRENTLY导致表膨胀

postgresql – 刷新物化视图CONCURRENTLY导致表膨胀,第1张

概述在PostgreSQL 9.5中,我决定创建一个物化视图“效果”并安排每小时并发刷新,因为我希望它始终可用: REFRESH MATERIALIZED VIEW CONCURRENTLY effects; 一开始一切都运行良好,我的物化视图令人耳目一新,磁盘空间使用率或多或少保持不变. 问题 经过一段时间后,磁盘使用率开始呈线性增长. 我得出结论,这种增长的原因是物化视图并从this回答运行查询以 在Postgresql 9.5中,我决定创建一个物化视图“效果”并安排每小时并发刷新,因为我希望它始终可用:

REFRESH MATERIAliZED VIEW CONCURRENTLY effects;

一开始一切都运行良好,我的物化视图令人耳目一新,磁盘空间使用率或多或少保持不变.

问题

经过一段时间后,磁盘使用率开始呈线性增长.

我得出结论,这种增长的原因是物化视图并从this回答运行查询以获得以下结果:

what                |  bytes/ct   | bytes_pretty | bytes_per_row-----------------------------------+-------------+--------------+--------------- core_relation_size                | 32224567296 | 30 GB        |         21140 visibility_map                    |      991232 | 968 kB       |             0 free_space_map                    |     7938048 | 7752 kB      |             5 table_size_incl_toast             | 32233504768 | 30 GB        |         21146 indexes_size                      | 22975922176 | 21 GB        |         15073 total_size_incl_toast_and_indexes | 55209426944 | 51 GB        |         36220 live_rows_in_text_representation  |   316152215 | 302 MB       |           207 ------------------------------    |             |              | row_count                         |     1524278 |              | live_tuples                       |      676439 |              | dead_tuples                       |     1524208 |              |(11 rows)

然后,我发现这个表最后一次自动恢复是两天前,通过运行:

SELECT relname,n_dead_tup,last_vacuum,last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup desc;

我决定手动调用真空(VERBOSE)效果.它运行了大约半小时,产生了以下输出:

vacuum (VERBOSE) effects;INFO:  vacuuming "public.effects"INFO:  scanned index "effects_IDx" to remove 129523454 row versionsDETAIL:  cpu 12.16s/55.76u sec elapsed 119.87 secINFO:  scanned index "effects_campaign_created_IDx" to remove 129523454 row versionsDETAIL:  cpu 19.11s/154.59u sec elapsed 337.91 secINFO:  scanned index "effects_campaign_name_IDx" to remove 129523454 row versionsDETAIL:  cpu 28.51s/151.16u sec elapsed 315.51 secINFO:  scanned index "effects_campaign_event_type_IDx" to remove 129523454 row versionsDETAIL:  cpu 38.60s/373.59u sec elapsed 601.73 secINFO:  "effects": removed 129523454 row versions in 3865537 pagesDETAIL:  cpu 59.02s/36.48u sec elapsed 326.43 secINFO:  index "effects_IDx" Now contains 1524208 row versions in 472258 pagesDETAIL:  113679000 index row versions were removed.463896 index pages have been deleted,60386 are currently reusable.cpu 0.00s/0.00u sec elapsed 0.01 sec.INFO:  index "effects_campaign_created_IDx" Now contains 1524208 row versions in 664910 pagesDETAIL:  121637488 index row versions were removed.41014 index pages have been deleted,0 are currently reusable.cpu 0.00s/0.00u sec elapsed 0.00 sec.INFO:  index "effects_campaign_name_IDx" Now contains 1524208 row versions in 711391 pagesDETAIL:  125650677 index row versions were removed.696221 index pages have been deleted,28150 are currently reusable.cpu 0.00s/0.00u sec elapsed 0.00 sec.INFO:  index "effects_campaign_event_type_IDx" Now contains 1524208 row versions in 956018 pagesDETAIL:  127659042 index row versions were removed.934288 index pages have been deleted,32105 are currently reusable.cpu 0.00s/0.00u sec elapsed 0.00 sec.INFO:  "effects": found 0 removable,493 nonremovable row versions in 3880239 out of 3933663 pagesDETAIL:  0 dead row versions cannot be removed yet.There were 666922 unused item pointers.Skipped 0 pages due to buffer pins.0 pages are entirely empty.cpu 180.49s/788.60u sec elapsed 1799.42 sec.INFO:  vacuuming "pg_toast.pg_toast_1371723"INFO:  index "pg_toast_1371723_index" Now contains 0 row versions in 1 pagesDETAIL:  0 index row versions were removed.0 index pages have been deleted,0 are currently reusable.cpu 0.00s/0.00u sec elapsed 0.00 sec.INFO:  "pg_toast_1371723": found 0 removable,0 nonremovable row versions in 0 out of 0 pagesDETAIL:  0 dead row versions cannot be removed yet.There were 0 unused item pointers.Skipped 0 pages due to buffer pins.0 pages are entirely empty.cpu 0.00s/0.00u sec elapsed 0.00 sec.VACUUM

此时我认为问题已经解决,并开始考虑可能会干扰autovacuum的问题.可以肯定的是,我再次运行查询以查找该表的空间使用情况,令我惊讶的是它没有改变.

只有在我调用REFRESH MATERIAliZED VIEW效果之后;不是同时发生的.只是现在查询表大小的查询输出是:

what                | bytes/ct  | bytes_pretty | bytes_per_row-----------------------------------+-----------+--------------+--------------- core_relation_size                | 374005760 | 357 MB       |           245 visibility_map                    |         0 | 0 bytes      |             0 free_space_map                    |         0 | 0 bytes      |             0 table_size_incl_toast             | 374013952 | 357 MB       |           245 indexes_size                      | 213843968 | 204 MB       |           140 total_size_incl_toast_and_indexes | 587857920 | 561 MB       |           385 live_rows_in_text_representation  | 316175512 | 302 MB       |           207 ------------------------------    |           |              | row_count                         |   1524385 |              | live_tuples                       |    676439 |              | dead_tuples                       |   1524208 |              |(11 rows)

一切都恢复正常……

问题

问题已经解决,但仍然存在相当多的混乱

>有人可以解释一下我遇到的问题吗?
>将来我怎么能避免这种情况?

解决方法 首先,我们来解释这个膨胀

REFRESH MATERIAliZED CONCURRENTLY在src/backend/commands/matview.c实施,评论很有启发性:

/* * refresh_by_match_merge * * Refresh a materialized vIEw with transactional semantics,while allowing * concurrent reads. * * This is called after a new version of the data has been created in a * temporary table.  It performs a full outer join against the old version of * the data,producing "diff" results.  This join cannot work if there are any * duplicated rows in either the old or new versions,in the sense that every * column would compare as equal between the two rows.  It does work correctly * in the face of rows which have at least one NulL value,with all non-NulL * columns equal.  The behavior of NulLs on equality tests and on UNIQUE * indexes turns out to be quite convenIEnt here; the tests we need to make * are consistent with default behavior.  If there is at least one UNIQUE * index on the materialized vIEw,we have exactly the guarantee we need. * * The temporary table used to hold the diff results contains just the TID of * the old record (if matched) and the ROW from the new table as a single * column of complex record type (if matched). * * Once we have the diff table,we perform set-based DELETE and INSERT * operations against the materialized vIEw,and discard both temporary * tables. * * Everything from the generation of the new data to applying the differences * takes place under cover of an ExclusiveLock,since it seems as though we * would want to prohibit not only concurrent REFRESH operations,but also * incremental maintenance.  It also doesn't seem reasonable or safe to allow * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by * this command. */

因此,通过删除行并从临时表中插入新行来刷新物化视图.这当然会导致死元组和表膨胀,这可以通过您的VACUUM(VERBOSE)输出来确认.

在某种程度上,这是你为同时支付的价格.

其次,让我们揭穿VACUUM无法移除死元组的神话

VACUUM将删除死行,但它不能减少膨胀(可以使用VACUUM(FulL)完成,但这会像REFRESH MATERIAliZED VIEW一样锁定视图而不会同时发生).

我怀疑你用来确定死元组数的查询只是一个估计,它会导致死元组的数量错误.

举例说明了这一切

CREATE table tab AS SELECT ID,'row ' || ID AS val FROM generate_serIEs(1,100000) AS ID;-- make sure autovacuum doesn't spoil our demonstrationCREATE MATERIAliZED VIEW tab_v WITH (autovacuum_enabled = off)AS SELECT * FROM tab;-- required for CONCURRENTLYCREATE UNIQUE INDEX ON tab_v (ID);

使用pgstattuple扩展来准确测量表膨胀:

CREATE EXTENSION pgstattuple;SELECT * FROM pgstattuple('tab_v');-[ RECORD 1 ]------+--------table_len          | 4431872tuple_count        | 100000tuple_len          | 3788895tuple_percent      | 85.49dead_tuple_count   | 0dead_tuple_len     | 0dead_tuple_percent | 0free_space         | 16724free_percent       | 0.38

现在让我们删除表中的一些行,再次刷新和测量:

DELETE FROM tab WHERE ID BETWEEN 40001 AND 80000;REFRESH MATERIAliZED VIEW CONCURRENTLY tab_v;SELECT * FROM pgstattuple('tab_v');-[ RECORD 1 ]------+--------table_len          | 4431872tuple_count        | 60000tuple_len          | 2268895tuple_percent      | 51.19dead_tuple_count   | 40000dead_tuple_len     | 1520000dead_tuple_percent | 34.3free_space         | 16724free_percent       | 0.38

很多死元组. VACUUM摆脱了这些:

VACUUM tab_v;SELECT * FROM pgstattuple('tab_v');-[ RECORD 1 ]------+--------table_len          | 4431872tuple_count        | 60000tuple_len          | 2268895tuple_percent      | 51.19dead_tuple_count   | 0dead_tuple_len     | 0dead_tuple_percent | 0free_space         | 1616724free_percent       | 36.48

死去的元组消失了,但现在有很多空的空间.

总结

以上是内存溢出为你收集整理的postgresql – 刷新物化视图CONCURRENTLY导致表膨胀全部内容,希望文章能够帮你解决postgresql – 刷新物化视图CONCURRENTLY导致表膨胀所遇到的程序开发问题。

如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/sjk/1157698.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-06-01
下一篇 2022-06-01

发表评论

登录后才能评论

评论列表(0条)

保存