Unix Technical Forum

adding new pages bulky way

#1
04-11-2008, 05:14 AM
Victor Y. Yegorov
adding new pages bulky way

I need your advice.

For the on-disk bitmap I maintain a list of TIDs.

TIDs are stored in pages as an array; each page's opaque data holds an array
of bits indicating whether the corresponding TID has been deleted and should
be skipped during the scan.

Pages that contain the TID list are organized in extents; each extent has 2^N
pages, where N is the extent's number (e.g. the 2nd extent occupies 4 pages).
Given that I know the number of TIDs that fit into one page and the TID's
sequential number, I can easily calculate:
- the extent number the TID belongs to;
- the page offset inside that extent; and
- the TID's place in the page.
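
A minimal sketch of the three calculations listed above. It assumes that extents are numbered from 1, so that extent N holds 2^N pages (2, 4, 8, ...) and the extents before it hold 2^N - 2 pages in total. The names `TidLocation` and `locate_tid` are hypothetical and do not come from the post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: where the TID with a given sequence number lives. */
typedef struct TidLocation
{
    uint32_t extent;       /* extent number, counted from 1 (assumption) */
    uint32_t page_offset;  /* 0-based page index inside that extent */
    uint32_t slot;         /* 0-based position of the TID inside the page */
} TidLocation;

/*
 * Assumes extent N holds 2^N pages, so the extents before extent N
 * hold 2 + 4 + ... + 2^(N-1) = 2^N - 2 pages in total.
 */
static TidLocation
locate_tid(uint64_t seq, uint32_t tids_per_page)
{
    TidLocation loc;
    uint64_t    page;     /* 0-based page index across all extents */
    uint64_t    shifted;

    assert(tids_per_page > 0);

    page = seq / tids_per_page;
    assert(page <= UINT32_MAX);   /* keep the shifts below well defined */

    /* Extent N covers exactly the pages with shifted in [2^N, 2^(N+1)). */
    shifted = page + 2;

    /* loc.extent = floor(log2(shifted)) */
    loc.extent = 0;
    while ((shifted >> (loc.extent + 1)) != 0)
        loc.extent++;

    loc.page_offset = (uint32_t) (shifted - ((uint64_t) 1 << loc.extent));
    loc.slot = (uint32_t) (seq % tids_per_page);
    return loc;
}
```

With 100 TIDs per page, for instance, sequence number 599 falls on overall page 5, which is the last page (offset 3) of extent 2 under this numbering.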

At the moment, I store the BlockNumber of each extent's first page in the
metapage and allocate all pages that belong to that extent sequentially. I
need to do so to minimize the number of page reads when searching for a TID
in the list; I'll need to read at most 1 page to find the TID at a given
position during the scan. I hope the idea is clear.

This also means that while an extent's pages are being added this way, no
other pages can be added to the index. And the higher the extent's number,
the longer it takes to allocate all of its pages.
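
The metapage bookkeeping described above could be sketched as follows. `MetaSketch`, `reserve_extent` and `block_for` are hypothetical names, the layout is an assumption and not the author's actual metapage, and extents are again assumed to be numbered from 1.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_EXTENTS 24   /* arbitrary cap for this sketch */

/* Hypothetical metapage contents; not the author's actual layout. */
typedef struct MetaSketch
{
    uint32_t first_block[MAX_EXTENTS + 1]; /* indexed by extent number; slot 0 unused */
    uint32_t next_free_block;              /* next block not yet handed out */
} MetaSketch;

/* Reserve 2^extent contiguous blocks; returns how many pages were reserved. */
static uint32_t
reserve_extent(MetaSketch *meta, uint32_t extent)
{
    uint32_t npages;

    assert(extent >= 1 && extent <= MAX_EXTENTS);
    npages = (uint32_t) 1 << extent;
    meta->first_block[extent] = meta->next_free_block;
    meta->next_free_block += npages;
    return npages;
}

/* A single addition resolves (extent, page offset) to a block number. */
static uint32_t
block_for(const MetaSketch *meta, uint32_t extent, uint32_t page_offset)
{
    assert(extent >= 1 && extent <= MAX_EXTENTS);
    assert(page_offset < ((uint32_t) 1 << extent));
    return meta->first_block[extent] + page_offset;
}
```

Because each extent is contiguous, a lookup needs only the metapage entry plus one data page. The value returned by `reserve_extent` doubles with every extent, which is the growing allocation cost mentioned above.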

The question is: allocating pages this way is really ugly, I understand. Is
there some API that would allow allocating N pages in bulk?
Maybe this is a known problem that has already been solved?
Any other ideas?


Thanks in advance!


--

Victor Y. Yegorov

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

#2
04-11-2008, 05:14 AM
Alvaro Herrera
Re: adding new pages bulky way

On Mon, Jun 06, 2005 at 10:59:04PM +0300, Victor Y. Yegorov wrote:

> The question is: allocating pages this way is really ugly, I understand. Is
> there some API that would allow allocating N pages in bulk?
> Maybe this is a known problem that has already been solved?
> Any other ideas?


I don't understand your question. What's the problem with holding the
extension lock for the index relation while you extend it? Certainly you
want only a single process creating a new extent in the index, right?

I guess the question is when are the extents created, and what
concurrency do you expect from that operation.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"La naturaleza, tan frágil, tan expuesta a la muerte... y tan viva"

#3
04-11-2008, 05:15 AM
Tom Lane
Re: adding new pages bulky way

"Victor Y. Yegorov" <viy@mits.lv> writes:
> [ scheme involving a predetermined layout of index pages ]


> The question is: allocating pages this way is really ugly, I understand. Is
> there some API that would allow allocating N pages in bulk?


Why bother? Just write each page when you need to --- there's no law
that says you must use P_NEW. The hash index type does something pretty
similar, IIRC.
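
As a file-level illustration only, the idea of writing a page at an explicit block number, without first appending every block before it, can be sketched with plain POSIX calls. This is an assumption-laden model and not PostgreSQL's buffer manager or storage manager API; `SKETCH_BLCKSZ`, `open_scratch`, `write_block`, `read_block` and `file_nblocks` are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192   /* stand-in for PostgreSQL's BLCKSZ */

/* Open (and truncate) a scratch file standing in for a relation file. */
static int
open_scratch(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/* Write one page at an explicit block number; returns 0 on success. */
static int
write_block(int fd, uint32_t blkno, const char *page)
{
    off_t   off = (off_t) blkno * SKETCH_BLCKSZ;

    assert(page != NULL);
    return (pwrite(fd, page, SKETCH_BLCKSZ, off) == SKETCH_BLCKSZ) ? 0 : -1;
}

/* Read one page at an explicit block number; returns 0 on success. */
static int
read_block(int fd, uint32_t blkno, char *page)
{
    off_t   off = (off_t) blkno * SKETCH_BLCKSZ;

    assert(page != NULL);
    return (pread(fd, page, SKETCH_BLCKSZ, off) == SKETCH_BLCKSZ) ? 0 : -1;
}

/* Number of whole blocks currently in the file, or -1 on error. */
static long
file_nblocks(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long) (st.st_size / SKETCH_BLCKSZ);
}
```

In this model, writing block 5 into an empty file makes the file six blocks long, and the blocks that were skipped read back as zeroes. Whether and how an index access method may do the equivalent through the buffer manager is exactly what the rest of the thread discusses.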

regards, tom lane

#4
04-11-2008, 05:15 AM
Qingqing Zhou
Re: adding new pages bulky way


"Tom Lane" <tgl@sss.pgh.pa.us> writes
>
> Why bother? Just write each page when you need to --- there's no law
> that says you must use P_NEW. The hash index type does something pretty
> similar, IIRC.
>


Would there be any performance benefit if we had an mdextend_several_pages()
function in md.c, so that the relation could be extended in bulk? In my
understanding, if we write

    write(fd, buffer, BLCKSZ * 10);

instead of

    for (i = 0; i < 10; i++)
        write(fd, buffer, BLCKSZ);

this would reduce some of the journaling cost on journaled file systems. Of
course, the price we have to pay is holding the relation extension lock for
longer.
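
The two call patterns being compared can be written out as a small runnable sketch. It only shows that both forms extend the file by the same amount; it makes no claim about which is faster. `SKETCH_BLCKSZ`, `extend_page_by_page` and `extend_in_bulk` are hypothetical names, and note that the single-call form needs a buffer that really is `npages` blocks long.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192   /* stand-in for PostgreSQL's BLCKSZ */

/* Append npages zero-filled pages with one write() call per page. */
static int
extend_page_by_page(int fd, int npages)
{
    char    page[SKETCH_BLCKSZ];

    memset(page, 0, sizeof page);
    for (int i = 0; i < npages; i++)
    {
        if (write(fd, page, sizeof page) != (ssize_t) sizeof page)
            return -1;
    }
    return 0;
}

/* Append npages zero-filled pages with a single write() call. */
static int
extend_in_bulk(int fd, int npages)
{
    size_t  len;
    char   *buf;
    ssize_t written;

    assert(npages > 0);
    len = (size_t) npages * SKETCH_BLCKSZ;
    buf = calloc(1, len);          /* the buffer must cover all pages */
    if (buf == NULL)
        return -1;
    written = write(fd, buf, len);
    free(buf);
    /* A short write is treated as failure to keep the sketch simple. */
    return (written == (ssize_t) len) ? 0 : -1;
}
```

Timing the two functions on a given file system would be the way to check the journaling argument; the sketch itself measures nothing.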

Regards,
Qingqing


#5
04-11-2008, 05:15 AM
Tom Lane
Re: adding new pages bulky way

"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
> Would there be any performance benefit if we had an mdextend_several_pages()
> function in md.c?


I very seriously doubt that there would be *any* win, and I doubt even
more that it could possibly be worth the klugery you'd have to do to
make it happen. Bear in mind that index access methods are two API
layers away from md.c --- how will you translate this into something
that makes sense in the context of bufmgr's API?

regards, tom lane

#6
04-11-2008, 05:15 AM
Qingqing Zhou
Re: adding new pages bulky way


"Tom Lane" <tgl@sss.pgh.pa.us> writes
>
> I very seriously doubt that there would be *any* win, and I doubt even
> more that it could possibly be worth the klugery you'd have to do to
> make it happen. Bear in mind that index access methods are two API
> layers away from md.c --- how will you translate this into something
> that makes sense in the context of bufmgr's API?
>


It doesn't matter whether it is index access or heap access. The imagined
plan is like this:

-- change 1 --
/* md.c */
mdextend()
{
    mdextend_several_pages();
    add_pages_to_FSM();
}

-- change 2 --
/*
 * Any place that takes the relation extension lock
 */
if (needLock)
    LockPage(relation, 0, ExclusiveLock);

/* ADD: check again here; someone else may have extended the relation */
if (InvalidBlockNumber != GetPageWithFreeSpace())
{
    /* a free page has appeared, so use it instead of extending */
    UnlockPage(relation, 0, ExclusiveLock);
}
else
{
    /* I have to do the extension */
    buffer = ReadBuffer(relation, P_NEW);
}

The above code is much like how we currently handle XLogFlush().
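
The "check again after taking the lock" pattern in the plan above can be modelled outside PostgreSQL with two mutexes. This is only a sketch under stated assumptions: `RelSketch`, `rel_init`, `fsm_take`, `get_block` and `BULK_PAGES` are hypothetical, and the free space map is reduced to a single counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define INVALID_BLOCK UINT32_MAX
#define BULK_PAGES    8   /* pages added per extension; arbitrary */

/* Hypothetical in-memory stand-in for a relation and its free space map. */
typedef struct RelSketch
{
    pthread_mutex_t extension_lock; /* serializes relation extension */
    pthread_mutex_t fsm_lock;       /* protects the three counters below */
    uint32_t        nblocks;        /* current length in blocks */
    uint32_t        next_free;      /* first block not yet handed out */
    uint32_t        extensions;     /* how many times we extended */
} RelSketch;

static void
rel_init(RelSketch *rel)
{
    pthread_mutex_init(&rel->extension_lock, NULL);
    pthread_mutex_init(&rel->fsm_lock, NULL);
    rel->nblocks = 0;
    rel->next_free = 0;
    rel->extensions = 0;
}

/* Hand out a free block if one exists, else INVALID_BLOCK. */
static uint32_t
fsm_take(RelSketch *rel)
{
    uint32_t blk = INVALID_BLOCK;

    pthread_mutex_lock(&rel->fsm_lock);
    if (rel->next_free < rel->nblocks)
        blk = rel->next_free++;
    pthread_mutex_unlock(&rel->fsm_lock);
    return blk;
}

static uint32_t
get_block(RelSketch *rel)
{
    uint32_t blk = fsm_take(rel);

    if (blk != INVALID_BLOCK)
        return blk;                 /* fast path: no extension needed */

    pthread_mutex_lock(&rel->extension_lock);
    blk = fsm_take(rel);            /* check again: someone may have extended */
    if (blk == INVALID_BLOCK)
    {
        /* We have to do the extension, BULK_PAGES at a time. */
        pthread_mutex_lock(&rel->fsm_lock);
        rel->nblocks += BULK_PAGES;
        rel->extensions++;
        blk = rel->next_free++;
        pthread_mutex_unlock(&rel->fsm_lock);
    }
    pthread_mutex_unlock(&rel->extension_lock);
    return blk;
}
```

A caller that was waiting on the extension lock while another one extended the relation finds a free block on the second check and releases the lock without extending again, so one bulk extension serves up to `BULK_PAGES` requests.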



#7
04-11-2008, 05:15 AM
Victor Y. Yegorov
Re: adding new pages bulky way

* Tom Lane <tgl@sss.pgh.pa.us> [07.06.2005 07:59]:
> Why bother? Just write each page when you need to --- there's no law
> that says you must use P_NEW.


This means 2 things:
1) I cannot mix P_NEW and exact-number ReadBuffer() calls;
2) thus, I have to track the next block number myself.

Is that so?


BTW, is there any difference in buffer seeking speed when block numbers are
mixed compared to when they're not (i.e. when P_NEW is used)?


--

Victor Y. Yegorov

#8
04-11-2008, 05:15 AM
Alvaro Herrera
Re: adding new pages bulky way

On Tue, Jun 07, 2005 at 07:52:57PM +0300, Victor Y. Yegorov wrote:
> * Tom Lane <tgl@sss.pgh.pa.us> [07.06.2005 07:59]:
> > Why bother? Just write each page when you need to --- there's no law
> > that says you must use P_NEW.

>
> This means 2 things:
> 1) I cannot mix P_NEW and exact-number ReadBuffer() calls;


Huh, why? You need to grab the relation extension lock
(LockRelationForExtension in CVS tip).

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"[PostgreSQL] is a great group; in my opinion it is THE best open source
development communities in existence anywhere." (Lamar Owen)

#9
04-11-2008, 05:15 AM
Victor Y. Yegorov
Re: adding new pages bulky way

* Alvaro Herrera <alvherre@surnet.cl> [08.06.2005 00:39]:
> Huh, why? You need to grab the relation extension lock
> (LockRelationForExtension in CVS tip).


Really? Didn't know that.

Consider:
1) I add 2 pages to the newly-created relation
using P_NEW as BlockNumber;
2) then I do LockRelationForExtension; ReadBuffer(135) and
UnlockRelationForExtension.

What BlockNumber will be assigned to the buffer, if I call
ReadBuffer(P_NEW) now? 136?


BTW, is it OK to say "BlockNumber is assigned to buffer"?


--

Victor Y. Yegorov

#10
04-11-2008, 05:15 AM
Tom Lane
Re: adding new pages bulky way

"Victor Y. Yegorov" <viy@mits.lv> writes:
> * Alvaro Herrera <alvherre@surnet.cl> [08.06.2005 00:39]:
>> Huh, why? You need to grab the relation extension lock
>> (LockRelationForExtension in CVS tip).


> Really? Didn't know that.


> Consider:
> 1) I add 2 pages to the newly-created relation
> using P_NEW as BlockNumber;
> 2) then I do LockRelationForExtension; ReadBuffer(135) and
> UnlockRelationForExtension.


As things are set up at the moment, you really should not use
P_NEW at all unless you hold the relation extension lock.

(At least not for ordinary heap relations. An index access
method could have its own rules about how to add blocks to
the relation --- hash does for instance.)

This is all pretty ugly in my view, and so I would not stand
opposed to ideas about a cleaner design ...

regards, tom lane
