This is a discussion on Trouble with LEFT JOIN using VIEWS. within the Pgsql Performance forums, part of the PostgreSQL category; --> Hello, I've just hit problem, that is unusual for me. quaker=> \d sites Table "public.sites" Column | Type | ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hello, I've just hit problem, that is unusual for me. quaker=> \d sites Table "public.sites" Column | Type | Modifiers -----------+-------------------+---------------------------------------------------- id | integer | not null default nextval('sites_id_seq'::regclass) site_name | character varying | not null user_id | integer | not null extra | integer | Indexes: "sites_pkey" PRIMARY KEY, btree (id) "sites_site_name_key_unique" UNIQUE, btree (site_name text_pattern_ops) "sites_user_id_key" btree (user_id) quaker=> \d users Table "public.users" Column | Type | Modifiers -----------+-------------------+---------------------------------------------------- id | integer | not null default nextval('users_id_seq'::regclass) user_name | character varying | not null extra | integer | Indexes: "users_pkey" PRIMARY KEY, btree (id) Both tables filled with 100k records of random data. In users id is in range from 1..100k, same in sites. In sites user_id is random, range from 1..150k. I've created views: quaker=> \d users_secure View "public.users_secure" Column | Type | Modifiers -----------+-------------------+----------- id | integer | user_name | character varying | View definition: SELECT users.id, users.user_name FROM users; quaker=> \d users_secure_with_has_extra View "public.users_secure_with_has_extra" Column | Type | Modifiers -----------+-------------------+----------- id | integer | user_name | character varying | has_extra | boolean | View definition: SELECT users.id, users.user_name, users.extra IS NOT NULL AS has_extra FROM users; Now, when I do simple query to find all data for sites matching site_name like 'H3bh%' (there are at least one record in sites matching this condition). quaker=> explain analyze select s.site_name,u.user_name from sites_secure s left join users_secure_with_has_extra u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=3897.02..3897.03 rows=2 width=44) (actual time=430.326..430.331 rows=1 loops=1) -> Sort (cost=3897.02..3897.03 rows=2 width=44) (actual time=430.321..430.323 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop Left Join (cost=0.00..3897.01 rows=2 width=44) (actual time=290.103..430.301 rows=1 loops=1) Join Filter: ("inner".id = "outer".user_id) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.054 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Seq Scan on users (cost=0.00..1641.00 rows=100000 width=20) (actual time=0.007..245.406 rows=100000 loops=1) Total runtime: 430.432 ms (10 rows) When I resign from LEFT JOIN users_secure_with_has_extra, and put JOIN instead I've got: quaker=> explain analyze select s.site_name,u.user_name from sites_secure s join users_secure_with_has_extra u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.112..0.118 rows=1 loops=1) -> Sort (cost=9.05..9.06 rows=1 width=24) (actual time=0.106..0.108 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop (cost=0.00..9.04 rows=1 width=24) (actual time=0.073..0.088 rows=1 loops=1) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.050 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Index Scan using users_pkey on users (cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.023 rows=1 loops=1) Index Cond: (users.id = "outer".user_id) Total runtime: 0.216 ms (10 rows) As explain shows PostgreSQL is using index scan on users, instead of seq scan like in example above. Now. When I use view with no has_extra field (important: field is a simple function on extra field) I get expectable results. Both using indexes. quaker=> explain analyze select s.site_name,u.user_name from sites_secure s left join users_secure u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.111..0.117 rows=1 loops=1) -> Sort (cost=9.05..9.06 rows=1 width=24) (actual time=0.105..0.107 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop Left Join (cost=0.00..9.04 rows=1 width=24) (actual time=0.072..0.087 rows=1 loops=1) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.043..0.049 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Index Scan using users_pkey on users (cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.022 rows=1 loops=1) Index Cond: (users.id = "outer".user_id) Total runtime: 0.216 ms (10 rows) quaker=> explain analyze select s.site_name,u.user_name from sites_secure s join users_secure u on u.id = s.user_id where site_name like 'H3bh%' order by site_name limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.109..0.115 rows=1 loops=1) -> Sort (cost=9.05..9.06 rows=1 width=24) (actual time=0.104..0.106 rows=1 loops=1) Sort Key: sites.site_name -> Nested Loop (cost=0.00..9.04 rows=1 width=24) (actual time=0.071..0.086 rows=1 loops=1) -> Index Scan using sites_site_name_key_unique on sites (cost=0.00..6.01 rows=1 width=16) (actual time=0.042..0.048 rows=1 loops=1) Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) AND ((site_name)::text ~<~ 'H3bi'::text)) Filter: ((site_name)::text ~~ 'H3bh%'::text) -> Index Scan using users_pkey on users (cost=0.00..3.02 rows=1 width=16) (actual time=0.018..0.021 rows=1 loops=1) Index Cond: (users.id = "outer".user_id) Total runtime: 0.214 ms (10 rows) Why? quaker=> select version(); version ----------------------------------------------------------------------------------------------------------------------- PostgreSQL 8.1.10 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.3 20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1) (1 row) ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| ||||
| =?ISO-8859-2?Q?Piotr_Gasid=B3o?= <quaker@barbara.eu.org> writes: > I've just hit problem, that is unusual for me. > View definition: > SELECT users.id, users.user_name, users.extra IS NOT NULL AS has_extra > FROM users; What you've got here is a non-nullable target list, which creates an optimization fence when used in the nullable side of an outer join. The problem is that has_extra should read as NULL in the query output for a sites_secure row that has no match in users_secure_with_has_extra, but making users.extra go to null will not make that happen, so there's a constraint on where the expression can be evaluated. The current planner has no way to deal with that except by restricting the plan structure. We have some ideas about how to fix this, but don't hold your breath .... it's going to take major surgery on the planner, AFAICS. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |