Unix Technical Forum

Trouble with LEFT JOIN using VIEWS.

This is a discussion on Trouble with LEFT JOIN using VIEWS. within the Pgsql Performance forums, part of the PostgreSQL category; --> Hello, I've just hit problem, that is unusual for me. quaker=> \d sites Table "public.sites" Column | Type | ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 11:43 AM
=?ISO-8859-2?Q?Piotr_Gasid=B3o?=
 
Posts: n/a
Default Trouble with LEFT JOIN using VIEWS.

Hello,

I've just hit problem, that is unusual for me.

quaker=> \d sites
Table "public.sites"
Column | Type | Modifiers

-----------+-------------------+----------------------------------------------------
id | integer | not null default
nextval('sites_id_seq'::regclass)
site_name | character varying | not null
user_id | integer | not null
extra | integer |
Indexes:
"sites_pkey" PRIMARY KEY, btree (id)
"sites_site_name_key_unique" UNIQUE, btree (site_name text_pattern_ops)
"sites_user_id_key" btree (user_id)

quaker=> \d users
Table "public.users"
Column | Type | Modifiers

-----------+-------------------+----------------------------------------------------
id | integer | not null default
nextval('users_id_seq'::regclass)
user_name | character varying | not null
extra | integer |
Indexes:
"users_pkey" PRIMARY KEY, btree (id)

Both tables filled with 100k records of random data. In users id is in
range from 1..100k, same in sites. In sites user_id is random, range
from 1..150k.

I've created views:

quaker=> \d users_secure
View "public.users_secure"
Column | Type | Modifiers
-----------+-------------------+-----------
id | integer |
user_name | character varying |
View definition:
SELECT users.id, users.user_name
FROM users;

quaker=> \d users_secure_with_has_extra
View "public.users_secure_with_has_extra"
Column | Type | Modifiers
-----------+-------------------+-----------
id | integer |
user_name | character varying |
has_extra | boolean |
View definition:
SELECT users.id, users.user_name, users.extra IS NOT NULL AS has_extra
FROM users;

Now, when I do simple query to find all data for sites matching
site_name like 'H3bh%' (there are at least one record in sites matching
this condition).

quaker=> explain analyze select s.site_name,u.user_name from
sites_secure s left join users_secure_with_has_extra u on u.id =
s.user_id where site_name like 'H3bh%' order by site_name limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3897.02..3897.03 rows=2 width=44) (actual
time=430.326..430.331 rows=1 loops=1)
-> Sort (cost=3897.02..3897.03 rows=2 width=44) (actual
time=430.321..430.323 rows=1 loops=1)
Sort Key: sites.site_name
-> Nested Loop Left Join (cost=0.00..3897.01 rows=2
width=44) (actual time=290.103..430.301 rows=1 loops=1)
Join Filter: ("inner".id = "outer".user_id)
-> Index Scan using sites_site_name_key_unique on sites
(cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.054 rows=1
loops=1)
Index Cond: (((site_name)::text ~>=~ 'H3bh'::text)
AND ((site_name)::text ~<~ 'H3bi'::text))
Filter: ((site_name)::text ~~ 'H3bh%'::text)
-> Seq Scan on users (cost=0.00..1641.00 rows=100000
width=20) (actual time=0.007..245.406 rows=100000 loops=1)
Total runtime: 430.432 ms
(10 rows)

When I resign from LEFT JOIN users_secure_with_has_extra, and put JOIN
instead I've got:

quaker=> explain analyze select s.site_name,u.user_name from
sites_secure s join users_secure_with_has_extra u on u.id = s.user_id
where site_name like 'H3bh%' order by site_name limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.112..0.118
rows=1 loops=1)
-> Sort (cost=9.05..9.06 rows=1 width=24) (actual
time=0.106..0.108 rows=1 loops=1)
Sort Key: sites.site_name
-> Nested Loop (cost=0.00..9.04 rows=1 width=24) (actual
time=0.073..0.088 rows=1 loops=1)
-> Index Scan using sites_site_name_key_unique on sites
(cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.050 rows=1
loops=1)
Index Cond: (((site_name)::text ~>=~ 'H3bh'::text)
AND ((site_name)::text ~<~ 'H3bi'::text))
Filter: ((site_name)::text ~~ 'H3bh%'::text)
-> Index Scan using users_pkey on users
(cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.023 rows=1 loops=1)
Index Cond: (users.id = "outer".user_id)
Total runtime: 0.216 ms
(10 rows)

As explain shows PostgreSQL is using index scan on users, instead of seq
scan like in example above.

Now. When I use view with no has_extra field (important: field is a
simple function on extra field) I get expectable results. Both using
indexes.

quaker=> explain analyze select s.site_name,u.user_name from
sites_secure s left join users_secure u on u.id = s.user_id where
site_name like 'H3bh%' order by site_name limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.111..0.117
rows=1 loops=1)
-> Sort (cost=9.05..9.06 rows=1 width=24) (actual
time=0.105..0.107 rows=1 loops=1)
Sort Key: sites.site_name
-> Nested Loop Left Join (cost=0.00..9.04 rows=1 width=24)
(actual time=0.072..0.087 rows=1 loops=1)
-> Index Scan using sites_site_name_key_unique on sites
(cost=0.00..6.01 rows=1 width=16) (actual time=0.043..0.049 rows=1
loops=1)
Index Cond: (((site_name)::text ~>=~ 'H3bh'::text)
AND ((site_name)::text ~<~ 'H3bi'::text))
Filter: ((site_name)::text ~~ 'H3bh%'::text)
-> Index Scan using users_pkey on users
(cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.022 rows=1 loops=1)
Index Cond: (users.id = "outer".user_id)
Total runtime: 0.216 ms
(10 rows)

quaker=> explain analyze select s.site_name,u.user_name from
sites_secure s join users_secure u on u.id = s.user_id where site_name
like 'H3bh%' order by site_name limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9.05..9.06 rows=1 width=24) (actual time=0.109..0.115
rows=1 loops=1)
-> Sort (cost=9.05..9.06 rows=1 width=24) (actual
time=0.104..0.106 rows=1 loops=1)
Sort Key: sites.site_name
-> Nested Loop (cost=0.00..9.04 rows=1 width=24) (actual
time=0.071..0.086 rows=1 loops=1)
-> Index Scan using sites_site_name_key_unique on sites
(cost=0.00..6.01 rows=1 width=16) (actual time=0.042..0.048 rows=1
loops=1)
Index Cond: (((site_name)::text ~>=~ 'H3bh'::text)
AND ((site_name)::text ~<~ 'H3bi'::text))
Filter: ((site_name)::text ~~ 'H3bh%'::text)
-> Index Scan using users_pkey on users
(cost=0.00..3.02 rows=1 width=16) (actual time=0.018..0.021 rows=1 loops=1)
Index Cond: (users.id = "outer".user_id)
Total runtime: 0.214 ms
(10 rows)

Why?

quaker=> select version();
version

-----------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.1.10 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.3
20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1)
(1 row)



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 11:43 AM
Tom Lane
 
Posts: n/a
Default Re: Trouble with LEFT JOIN using VIEWS.

=?ISO-8859-2?Q?Piotr_Gasid=B3o?= <quaker@barbara.eu.org> writes:
> I've just hit problem, that is unusual for me.


> View definition:
> SELECT users.id, users.user_name, users.extra IS NOT NULL AS has_extra
> FROM users;


What you've got here is a non-nullable target list, which creates an
optimization fence when used in the nullable side of an outer join.
The problem is that has_extra should read as NULL in the query output
for a sites_secure row that has no match in users_secure_with_has_extra,
but making users.extra go to null will not make that happen, so there's
a constraint on where the expression can be evaluated. The current
planner has no way to deal with that except by restricting the plan
structure.

We have some ideas about how to fix this, but don't hold your breath
.... it's going to take major surgery on the planner, AFAICS.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:32 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com