Unix Technical Forum

locks in CREATE TRIGGER, ADD FK

This is a discussion on locks in CREATE TRIGGER, ADD FK within the pgsql Hackers forums, part of the PostgreSQL category; --> Neil Conway <neilc@samurai.com> writes: > How can we drop the file at commit, given that a serializable > transaction's ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #21 (permalink)  
Old 04-11-2008, 04:11 AM
Tom Lane
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:
> How can we drop the file at commit, given that a serializable
> transaction's snapshot should still be able to see old relfilenode's
> content?


It isn't 100% MVCC, I agree. But it works because system catalog
lookups are SnapshotNow, and so when another session comes and wants to
look at the table it will see the committed new version of the pg_class
row pointing at the new relfilenode file. What you have to prevent is
somebody accessing the table *while* the changeover happens ... and
that's why your lock has to be AccessExclusive.

If you want to complain about MVCC violations in CLUSTER, think about
the fact that it scans the table with SnapshotNow, and therefore loses
rows that are committed-dead but might still be visible to somebody.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #22 (permalink)  
Old 04-11-2008, 04:11 AM
Tom Lane
 
Posts: n/a
Default Re: swapping relfilenodes (was: Re: locks in CREATE TRIGGER,

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> In fact, would a truncate during a backup cause the backup to be
> inconsistent because it wouldn't be a true snapshot of the database at
> backup start time? Seems so.


No, because pg_dump holds AccessShareLock on every table that it intends
to dump, thereby ensuring that TRUNCATE/CLUSTER/etc are held off. The
proposal to weaken the locks that those operations take would in fact
break pg_dump.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #23 (permalink)  
Old 04-11-2008, 04:11 AM
Bruce Momjian
 
Posts: n/a
Default Re: swapping relfilenodes (was: Re: locks in CREATE TRIGGER,

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > In fact, would a truncate during a backup cause the backup to be
> > inconsistent because it wouldn't be a true snapshot of the database at
> > backup start time? Seems so.

>
> No, because pg_dump holds AccessShareLock on every table that it intends
> to dump, thereby ensuring that TRUNCATE/CLUSTER/etc are held off. The
> proposal to weaken the locks that those operations take would in fact
> break pg_dump.


Oh, it pre-locks. I didn't know that.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #24 (permalink)  
Old 04-11-2008, 04:11 AM
Tom Lane
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:
> - if we make these changes, we will need some way to delete a
> no-longer-visible relfilenode.


This is presuming that we abandon the notion that system catalog
access use SnapshotNow. Which opens the question of what they should
use instead ... to which "transaction snapshot" isn't the answer,
because we have to be able to do system catalog accesses before
we've set the snapshot. (Else forget issuing LOCK TABLE before
the snapshot is set.)

I really think that you haven't the faintest idea of the size of
the can of worms you are opening here :-(

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #25 (permalink)  
Old 04-11-2008, 04:11 AM
Neil Conway
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Tom Lane wrote:
> It isn't 100% MVCC, I agree. But it works because system catalog
> lookups are SnapshotNow, and so when another session comes and wants to
> look at the table it will see the committed new version of the pg_class
> row pointing at the new relfilenode file.


If by "works", you mean "provides correct transactional semantics", then
that simply isn't true. Not making CLUSTER and similar DDL commands MVCC
compliant isn't the end of the world, I agree, but that doesn't make it
correct, either.

> If you want to complain about MVCC violations in CLUSTER, think about
> the fact that it scans the table with SnapshotNow, and therefore loses
> rows that are committed-dead but might still be visible to somebody.


This seems like another facet of the same problem (a serializable
transaction's snapshot effectively includes the relfilenodes that were
visible when the snapshot was taken, and swapping in another relfilenode
under its nose is asking for trouble).

We could fix the CLUSTER bug, although not the TRUNCATE bug, by scanning
the old relation with SnapshotAny (or ideally, "the snapshot such that
we can see all tuples visible to any currently running transaction", if
we can produce such a snapshot easily). Not sure if that's worth doing;
it would be nice to solve the root problem (scanning system catalogs
with SnapshotNow, per discussion elsewhere in thread).

-Neil

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #26 (permalink)  
Old 04-11-2008, 04:11 AM
Neil Conway
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Tom Lane wrote:
> This is presuming that we abandon the notion that system catalog
> access use SnapshotNow. Which opens the question of what they should
> use instead ... to which "transaction snapshot" isn't the answer,
> because we have to be able to do system catalog accesses before
> we've set the snapshot.


I wonder if it would be possible to use SnapshotNow before the
transaction's snapshot has been established, and the transaction's
snapshot subsequently. Although it definitely makes me nervous to use
multiple snapshots over the life of a single transaction...

-Neil

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #27 (permalink)  
Old 04-11-2008, 04:11 AM
Neil Conway
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Tom Lane wrote:
> I don't think this has been adequately thought through at all ... but
> at least make it ExclusiveLock. What is the use-case for allowing
> SELECT FOR UPDATE in parallel with this?


Ok, patch applied -- I adjusted it to use ExclusiveLock, and fleshed out
some of the comments.

-Neil


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #28 (permalink)  
Old 04-11-2008, 04:11 AM
Greg Stark
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:

> Tom Lane wrote:
> > It isn't 100% MVCC, I agree. But it works because system catalog
> > lookups are SnapshotNow, and so when another session comes and wants to
> > look at the table it will see the committed new version of the pg_class
> > row pointing at the new relfilenode file.

>
> If by "works", you mean "provides correct transactional semantics", then that
> simply isn't true. Not making CLUSTER and similar DDL commands MVCC compliant
> isn't the end of the world, I agree, but that doesn't make it correct, either.


I think he means it works because it doesn't matter whether the serializable
transaction sees the old table or the new one. As soon as the CLUSTER commits
the serializable transaction can start using the new one since it's
functionally identical to the old one (at least it's supposed to be, Tom
points out it isn't).

> > If you want to complain about MVCC violations in CLUSTER, think about
> > the fact that it scans the table with SnapshotNow, and therefore loses
> > rows that are committed-dead but might still be visible to somebody.


Ouch. That's, er, a problem. I guess currently it's fine for any transaction
using READ COMMITTED but it's already wrong for serializable transactions. And
it'll be wrong for READ COMMITTED if CLUSTER is changed not to take an
exclusive lock.

--
greg


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #29 (permalink)  
Old 04-11-2008, 04:11 AM
Greg Stark
 
Posts: n/a
Default Re: swapping relfilenodes (was: Re: locks in CREATE TRIGGER,


Tom Lane <tgl@sss.pgh.pa.us> writes:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > In fact, would a truncate during a backup cause the backup to be
> > inconsistent because it wouldn't be a true snapshot of the database at
> > backup start time? Seems so.

>
> No, because pg_dump holds AccessShareLock on every table that it intends
> to dump, thereby ensuring that TRUNCATE/CLUSTER/etc are held off. The
> proposal to weaken the locks that those operations take would in fact
> break pg_dump.


It seems like that would be true for TRUNCATE but not CLUSTER.

Though pg_dump works in READ COMMITTED mode doesn't it? So it doesn't really
get a consistent view of the database exactly anyways. If it tried to go in
SERIALIZABLE mode I suspect it would rarely complete though.

--
greg


---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #30 (permalink)  
Old 04-11-2008, 04:11 AM
Tom Lane
 
Posts: n/a
Default Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:
> Tom Lane wrote:
>> I don't think this has been adequately thought through at all ... but
>> at least make it ExclusiveLock. What is the use-case for allowing
>> SELECT FOR UPDATE in parallel with this?


> Ok, patch applied -- I adjusted it to use ExclusiveLock, and fleshed out
> some of the comments.


I think last night's discussion makes it crystal-clear why I felt that
this hasn't been sufficiently thought through. Please revert until the
discussion comes to a conclusion.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:31 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com