Unix Technical Forum

Re: Remove xmin and cmin from frozen tuples

This is a discussion on Re: Remove xmin and cmin from frozen tuples within the pgsql Hackers forums, part of the PostgreSQL category; --> On Thu, Sep 01, 2005 at 10:45:44AM +0900, ITAGAKI Takahiro wrote: Hi, > I think it would be a ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-11-2008, 05:31 AM
Alvaro Herrera
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

On Thu, Sep 01, 2005 at 10:45:44AM +0900, ITAGAKI Takahiro wrote:

Hi,

> I think it would be a waste to retain xmin and cmin for frozen tuples
> because their values represent only 'visible for all transactions'.
> Additionally, most tuples in database can be frozen potentially.


I think this is an interesting idea. I was thinking that when the tuple
needs to be obsoleted it would need to grow to accomodate the Xmax, but
you are not actually proposing to remove that, so it seems sensible. In
fact, it is perfectly reasonable to remove Xmin and Cmin, because after
the tuple is frozen, the Xmin never changes again.

Now, one thing of note is that you need to "compress" the page in order
to actually be able to use the just-freed space. VACUUM could do that,
but maybe it would be better to do it on-line -- the freezing process is
going to have to write the page regardless. I wonder if with your patch
the page is compressed on the same VACUUM execution that freezes the
tuple?

One thing that comes to mind is that this makes somewhat easier to build
a tool to write pre-built tables, for bulk-loading purposes. You just
construct the binary file with the HEAP_FROZEN bit set, and then attach
the file to a dummy table. (Then again, you can do it today, using a
Xmin of FrozenTransactionId. I wonder why the Bizgres people isn't
advocating a tool to do that. It is very hard to do with user-defined
types, but for BI/DW you mostly don't need those, do you?)

--
Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com
"Cuando no hay humildad las personas se degradan" (A. Christie)

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-11-2008, 05:31 AM
Josh Berkus
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

Alvaro,

> One thing that comes to mind is that this makes somewhat easier to build
> a tool to write pre-built tables, for bulk-loading purposes. You just
> construct the binary file with the HEAP_FROZEN bit set, and then attach
> the file to a dummy table. (Then again, you can do it today, using a
> Xmin of FrozenTransactionId. I wonder why the Bizgres people isn't
> advocating a tool to do that. It is very hard to do with user-defined
> types, but for BI/DW you mostly don't need those, do you?)


Hmmm ... can you expand on this a little? We'd discussed "frozen partitions"
but hadn't thought to get around to them for a while, expecting the kind of
issues which Tom just raised.

--
Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-11-2008, 05:31 AM
ITAGAKI Takahiro
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> Now, one thing of note is that you need to "compress" the page in order
> to actually be able to use the just-freed space. VACUUM could do that,
> but maybe it would be better to do it on-line -- the freezing process is
> going to have to write the page regardless.


I agree. I think an good position of freezer is on bgwriter.
My idea is:
1. Just before bgwriter writes an dirty page in LRU order,
2. Freeze tuples in the page and repair fragmentation.
3. (Replace the fsm page that has least freespace.)
4. Flush the page.


> I wonder if with your patch
> the page is compressed on the same VACUUM execution that freezes the tuple?


Yes, defragmentation is performed after freezing, but the page has at least
one dead tuple. In current VACUUM implementation, pages that have no dead
tuples will not be defraged. So you cannot "compress" just after bulk-load.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-11-2008, 05:31 AM
Tom Lane
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

ITAGAKI Takahiro <itagaki.takahiro@lab.ntt.co.jp> writes:
> I agree. I think an good position of freezer is on bgwriter.
> My idea is:
> 1. Just before bgwriter writes an dirty page in LRU order,
> 2. Freeze tuples in the page and repair fragmentation.
> 3. (Replace the fsm page that has least freespace.)
> 4. Flush the page.


This is a bad idea. The bgwriter isn't the place to be doing freezing,
because there is no reasonable way for it to guarantee that all old
tuples in a table (or any larger unit) have been frozen. So you'd still
need VACUUM to ensure no wraparound. Plus, you can't do such changes
without emitting an XLOG record, which is something we don't want
happening in the bgwriter's inner loop. Even more to the point, you
can't do such changes without getting a superexclusive lock on the page
(not only locked, but no one else has it pinned), which is a real
nonstarter for the bgwriter, both for performance and possible deadlock
issues.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-11-2008, 05:31 AM
Alvaro Herrera
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

On Wed, Aug 31, 2005 at 09:14:42PM -0700, Josh Berkus wrote:

> > One thing that comes to mind is that this makes somewhat easier to build
> > a tool to write pre-built tables, for bulk-loading purposes. You just
> > construct the binary file with the HEAP_FROZEN bit set, and then attach
> > the file to a dummy table. (Then again, you can do it today, using a
> > Xmin of FrozenTransactionId. I wonder why the Bizgres people isn't
> > advocating a tool to do that. It is very hard to do with user-defined
> > types, but for BI/DW you mostly don't need those, do you?)

>
> Hmmm ... can you expand on this a little? We'd discussed "frozen partitions"
> but hadn't thought to get around to them for a while, expecting the kind of
> issues which Tom just raised.


What issues did he raise on this?

What I'm saying is that you can write a heap file, on which the tuples
would all have xmin=FrozenTransactionId, xmax=Invalid, and the
corresponding bits set in the infomask. This ensures that no matter the
state of the server, you can plug the file in and all tuples will be
valid.

The "only" problem is figuring out how to lay the data in the tuples
themselves, w.r.t endianness and such. This is platform-dependent, so
you have to write code to do it correctly. In absence of user-defined
types, this should not be _too_ hard to do. Of course, such a program
would in general also be Postgres-version-dependent.

Note that this is a very different business from skipping the Xmin and
Cmin from the tuple header -- in fact, there's no relation to that at
all.

--
Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com
FOO MANE PADME HUM

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-11-2008, 05:31 AM
Tom Lane
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> What I'm saying is that you can write a heap file, on which the tuples
> would all have xmin=FrozenTransactionId, xmax=Invalid, and the
> corresponding bits set in the infomask. This ensures that no matter the
> state of the server, you can plug the file in and all tuples will be
> valid.


> The "only" problem is figuring out how to lay the data in the tuples
> themselves, w.r.t endianness and such. This is platform-dependent, so
> you have to write code to do it correctly. In absence of user-defined
> types, this should not be _too_ hard to do. Of course, such a program
> would in general also be Postgres-version-dependent.


Of course, it's fair to ask whether such a program would be any faster
than binary-mode COPY by the time you got done ... or enough faster to
justify your effort, anyway.

THe only fundamental disadvantage that COPY labors under is having to
write WAL records. It might be interesting to do something similar to
the recent hacks for CREATE TABLE AS, so that a COPY into a table just
created in the current transaction would skip writing WAL and instead
fsync the table at the end.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-11-2008, 05:31 AM
Simon Riggs
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

On Wed, 2005-08-31 at 22:25 -0400, Alvaro Herrera wrote:
> On Thu, Sep 01, 2005 at 10:45:44AM +0900, ITAGAKI Takahiro wrote:
>
> Hi,
>
> > I think it would be a waste to retain xmin and cmin for frozen tuples
> > because their values represent only 'visible for all transactions'.
> > Additionally, most tuples in database can be frozen potentially.

>
> I think this is an interesting idea.


Agreed, especially since it would avoid the need to vacuum altogether.

> I was thinking that when the tuple
> needs to be obsoleted it would need to grow to accomodate the Xmax, but
> you are not actually proposing to remove that, so it seems sensible. In
> fact, it is perfectly reasonable to remove Xmin and Cmin, because after
> the tuple is frozen, the Xmin never changes again.


It's a good idea, but the Xmin is set to FrozenTransactionId, which is
how we know it is frozen, so how can we remove Xmin? The way to do this
is surely by using a row version id that is different for this format.

Getting 8 or 16 bytes per row back would be a very useful gain.

> Now, one thing of note is that you need to "compress" the page in order
> to actually be able to use the just-freed space. VACUUM could do that,
> but maybe it would be better to do it on-line -- the freezing process is
> going to have to write the page regardless. I wonder if with your patch
> the page is compressed on the same VACUUM execution that freezes the
> tuple?


Only if you do a FULL, which is currently incompatible with a FREEZE.

There's no point in compressing a block if you can't also redistribute
rows between blocks to fill up the spaces, so another reason why it has
to be a FULL. Unless you do this at load time, which is why I guess you
mention....

> One thing that comes to mind is that this makes somewhat easier to build
> a tool to write pre-built tables, for bulk-loading purposes. You just
> construct the binary file with the HEAP_FROZEN bit set, and then attach
> the file to a dummy table. (Then again, you can do it today, using a
> Xmin of FrozenTransactionId. I wonder why the Bizgres people isn't
> advocating a tool to do that. It is very hard to do with user-defined
> types, but for BI/DW you mostly don't need those, do you?)


Loading a table using COPY with frozen bits set was suggested in May, so
yeh... it was suggested. At that time it was rejected, since earlier
transactions would then be able to see rows they ought not be able to
see. Thinking some more about this, this is only the inverse situation
of a TRUNCATE. With truncate we remove tuples that ought to still be
visible to pre-existing transactions. So there shouldn't really be an
issue with loading pre-frozen tuples - as long as you accept the
consequences for row visibility.

Externally writing blocks is possible, but it bypasses a lot of other
features. My current preference would be to have bulk_heap_insert()
function to add a whole page at a time rather than inserting rows one at
at a time. The main objective for a load is to make it disk bound; once
we've achieved that by some further tuning, writing an external file
would cost around the same as writing it internally from the DBMS.
Oracle (direct path loader) and Teradata (Fastload) load data in
complete blocks using a reduced code pathway, so I guess I was just
following on, but I'm genuinely open to further persuasion if there is a
better way.

Having a table marked as INSERT ONLY would allow us to save 8 bytes/row,
loading it pre-frozen (in some way) would save another 8 bytes/row and
allow us to permanently avoid VACUUMing the table. That would be even
better when we have per-table XID wrap avoidance.

Best Regards, Simon Riggs


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-11-2008, 05:32 AM
Alvaro Herrera
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

On Thu, Sep 01, 2005 at 11:08:36AM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > What I'm saying is that you can write a heap file, on which the tuples
> > would all have xmin=FrozenTransactionId, xmax=Invalid, and the
> > corresponding bits set in the infomask. This ensures that no matter the
> > state of the server, you can plug the file in and all tuples will be
> > valid.

>
> > The "only" problem is figuring out how to lay the data in the tuples
> > themselves, w.r.t endianness and such. This is platform-dependent, so
> > you have to write code to do it correctly. In absence of user-defined
> > types, this should not be _too_ hard to do. Of course, such a program
> > would in general also be Postgres-version-dependent.

>
> Of course, it's fair to ask whether such a program would be any faster
> than binary-mode COPY by the time you got done ... or enough faster to
> justify your effort, anyway.


It may not be faster generating the data in the first place, but you
don't have to vacuum the table, nor you are subject to hint bits
changing, resulting in more unnecessary I/O.

This can't be avoided with COPY, because there's always the chance that
it will fail partway through, so you can't write frozen tuples. With an
external program, you can just dump the invalid line somewhere else and
continue with the rest.

--
Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com
"Just treat us the way you want to be treated + some extra allowance
for ignorance." (Michael Brusser)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-11-2008, 05:32 AM
Josh Berkus
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

Alvaro,

> What issues did he raise on this?


On having no Xmin.

> What I'm saying is that you can write a heap file, on which the tuples
> would all have xmin=FrozenTransactionId, xmax=Invalid, and the
> corresponding bits set in the infomask. This ensures that no matter the
> state of the server, you can plug the file in and all tuples will be
> valid.
>
> The "only" problem is figuring out how to lay the data in the tuples
> themselves, w.r.t endianness and such. This is platform-dependent, so
> you have to write code to do it correctly. In absence of user-defined
> types, this should not be _too_ hard to do. Of course, such a program
> would in general also be Postgres-version-dependent.


So, bulk loading by file generation? So the idea is that you would generate
a properly formatted PostgreSQL table file, and then in one transaction
create the table and attach it?

Seems like this would have the additional limitation of being useful only for
loading new partitions/new tables. However, it would have some significant
advantages for bulk loading ... chiefly that the data page generation and
associated computations could be done *off* the database server. This might
help considerably in getting around the 100mb/s data computation ceiling
we're hitting ...

--
Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-11-2008, 05:32 AM
Josh Berkus
 
Posts: n/a
Default Re: Remove xmin and cmin from frozen tuples

Tom,

> THe only fundamental disadvantage that COPY labors under is having to
> write WAL records. It might be interesting to do something similar to
> the recent hacks for CREATE TABLE AS, so that a COPY into a table just
> created in the current transaction would skip writing WAL and instead
> fsync the table at the end.


Yes, I thought we discussed doing this for empty tables -- it would be, per
our tests, a +10% to +30% boost to COPY.

But there was some problem the patch?

--
Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 04:47 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com