postgresql - Combining multiple indexes in Postgres

I have the following table and indexes:

CREATE TABLE test (
    coords     geography(Point,4326),
    user_id    varchar(50),
    created_at timestamp
);
CREATE INDEX ix_coords ON test USING GIST (coords);
CREATE INDEX ix_user_id ON test (user_id);
CREATE INDEX ix_created_at ON test (created_at DESC);

This is the query I want to run:

select *
from updates
where ST_DWithin(coords, ST_MakePoint(-126.4, 45.32)::geography, 30000)
  and user_id = '3212312'
order by created_at desc
limit 60

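When I run the query, it only uses the ix_coords index. How can I make sure Postgres also uses the ix_user_id and ix_created_at indexes for this query?

This is a new table into which I bulk-inserted production data. Total rows in the test table: 15,069,489.

I'm running PostgreSQL 9.2.1 (with PostGIS) with effective_cache_size = 2GB. This is on my local OS X machine with 16GB RAM, a Core i7 at 2.5 GHz, and a non-SSD disk.

Adding the EXPLAIN ANALYZE output: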

Limit  (cost=71.64..71.65 rows=1 width=280) (actual time=1278.652..1278.665 rows=60 loops=1)
  ->  Sort  (cost=71.64..71.65 rows=1 width=280) (actual time=1278.651..1278.662 rows=60 loops=1)
        Sort Key: created_at
        Sort Method: top-N heapsort  Memory: 33kB
        ->  Index Scan using ix_coords on test  (cost=0.00..71.63 rows=1 width=280) (actual time=0.198..1278.227 rows=178 loops=1)
              Index Cond: (coords && '0101000020E61000006666666666E63C40C3F5285C8F824440'::geography)
              Filter: (((user_id)::text = '4f1092000b921a000100015c'::text) AND ('0101000020E61000006666666666E63C40C3F5285C8F824440'::geography && _st_expand(coords,30000::double precision)) AND _st_DWithin(coords,'0101000020E61000006666666666E63C40C3F5285C8F824440'::geography,30000::double precision,true))
              Rows Removed by Filter: 3122459
Total runtime: 1278.701 ms

Update:

Based on the suggestion below, I tried a composite index that includes user_id:

CREATE INDEX ix_coords_and_user_id ON updates USING GIST (coords, user_id);

...but got the following error:

ERROR:  data type character varying has no default operator class for access method "gist"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.

Update:

So CREATE EXTENSION btree_gist; solved the btree/GiST composite index problem. My index now looks like this:

CREATE INDEX ix_coords_user_id_created_at ON test USING GIST (coords, user_id, created_at);

Note: btree_gist does not accept DESC / ASC.

New query plan:

Limit  (cost=134.99..135.00 rows=1 width=280) (actual time=273.282..273.292 rows=60 loops=1)
  ->  Sort  (cost=134.99..135.00 rows=1 width=280) (actual time=273.281..273.285 rows=60 loops=1)
        Sort Key: created_at
        Sort Method: quicksort  Memory: 41kB
        ->  Index Scan using ix_updates_coords_user_id_created_at on updates  (cost=0.00..134.98 rows=1 width=280) (actual time=0.406..273.110 rows=115 loops=1)
              Index Cond: ((coords && '0101000020E61000006666666666E63C40C3F5285C8F824440'::geography) AND ((user_id)::text = '4e952bb5b9a77200010019ad'::text))
              Filter: (('0101000020E61000006666666666E63C40C3F5285C8F824440'::geography && _st_expand(coords,true))
              Rows Removed by Filter: 1
Total runtime: 273.331 ms

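The query performs better than before, almost a full second faster, but it is still not great. I guess this is the best I can get? I was hoping for around 60-80 ms. Also, removing order by created_at desc from the query shaves off another 100 ms, which means the sort cannot use the index. Is there any way to fix this?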

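Solution

I don't know whether Pg can combine a GiST index and regular b-tree indexes with a bitmap index scan, but I suspect not. You may already be getting the best result possible without adding the user_id column to the GiST index (which would make the index bigger and slower for other queries that don't use user_id).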
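One way to check whether the planner can combine the two indexes on this table is to discourage plain index scans for a single transaction and look at the resulting plan. The following is only a sketch of that experiment: it assumes the query is run against the test table defined in the question, and it reuses the example point and user_id from the question's query. enable_indexscan is a standard planner setting; the planner is still free to pick a different plan.

```sql
-- Experiment only: discourage plain index scans inside this transaction so
-- the planner considers bitmap scans (and possibly a BitmapAnd of two indexes).
BEGIN;
SET LOCAL enable_indexscan = off;

EXPLAIN ANALYZE
SELECT *
FROM test
WHERE ST_DWithin(coords, ST_MakePoint(-126.4, 45.32)::geography, 30000)
  AND user_id = '3212312'
ORDER BY created_at DESC
LIMIT 60;

ROLLBACK;  -- SET LOCAL is discarded when the transaction ends
```

If the two indexes are being combined, the plan should contain a BitmapAnd node with a Bitmap Index Scan on ix_coords and another on ix_user_id beneath it. Comparing its actual time with the original plan shows whether the combination helps for this data.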

As an experiment, you could try:

CREATE EXTENSION btree_gist;
CREATE INDEX ix_coords_and_user_id ON test USING GIST (coords, user_id);

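This is likely to produce a big index, but it might speed up that query, if it works. Be aware that maintaining such an index will significantly slow down INSERTs and UPDATEs. If you drop the old ix_coords, your queries will use ix_coords_and_user_id even when they don't filter on user_id, but that will be slower than using ix_coords. Keeping both indexes makes the INSERT and UPDATE slowdown even worse.

See the btree_gist documentation.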
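To see how much larger the composite index actually is, the two index sizes can be compared directly. This is a minimal sketch that assumes both ix_coords and ix_coords_and_user_id exist under those names in the current schema; pg_relation_size and pg_size_pretty are built-in functions.

```sql
-- Compare the on-disk size of the plain GiST index and the composite one.
SELECT c.relname                               AS index_name,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_class c
WHERE c.relname IN ('ix_coords', 'ix_coords_and_user_id')
ORDER BY pg_relation_size(c.oid) DESC;
```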

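(The following was made obsolete by an edit that changed the question completely. When it was written the user had a multi-column index, which they have since split into two separate indexes.)

You don't seem to be filtering or sorting on user_id, only on create_date. Pg won't (can't?) use only the second column of a multi-column index such as (user_id, create_date); it needs to use the first column as well.

If you want to index create_date, create a separate index for it. If you use and need the (user_id, create_date) index and don't generally query on user_id alone, see whether you can reverse the column order. Alternatively, create two independent indexes, (user_id) and (create_date). When both columns are needed, Pg can combine the two independent indexes with a bitmap index scan.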
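The two-independent-indexes option can be sketched as follows. The events table and the index names are hypothetical and exist only for illustration; the column names follow the ones used in this answer. Whether the planner actually chooses a BitmapAnd depends on table size and statistics, so read the EXPLAIN output rather than assuming it.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TABLE events (
    user_id     varchar(50),
    create_date timestamp
);

-- Two independent single-column b-tree indexes.
CREATE INDEX ix_events_user_id     ON events (user_id);
CREATE INDEX ix_events_create_date ON events (create_date);

-- A query that filters on both columns; on a large enough table the planner
-- may combine both indexes (look for a BitmapAnd node in the output).
EXPLAIN
SELECT *
FROM events
WHERE user_id = '3212312'
  AND create_date >= now() - interval '7 days';
```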


Source: http://outofmemory.cn/sjk/1161381.html