Unix Technical Forum

adding new pages bulky way

This is a discussion on adding new pages bulky way within the pgsql Hackers forums, part of the PostgreSQL category.


#11
04-11-2008, 05:15 AM
Qingqing Zhou
Re: adding new pages bulky way


"Tom Lane" <tgl@sss.pgh.pa.us> writes
>
> I very seriously doubt that there would be *any* win
>


I did a quick proof-of-concept implementation to test non-concurrent batch
insertion. Here are the results:

Environment:
- PostgreSQL 8.0.1
- NTFS / IDE


-- batch 16 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4167.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8111.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16444.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41980.000 ms

-- batch 32 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4086.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 7861.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16403.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41290.000 ms

-- batch 64 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4236.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8202.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17265.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 44063.000 ms

-- batch 128 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4256.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8242.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17375.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 43854.000 ms

-- one page extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4496.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 9013.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 19508.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 49962.000 ms

The benefit is there: approximately a 10% improvement if we select a good
batch size. The explanation is this: if a batch insertion needs 6400 new
pages, the original code does a write() plus a file system log update 6400
times, whereas with 64-page batches it does so only 6400/64 = 100 times
(though each call costs more). Also, since writes of different sizes have
different costs, 32 pages seems to be the best choice on my machine.
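
The arithmetic in that explanation can be checked with a small sketch. This is only an illustration, not code from the patch: the helper names extension_calls() and improvement_pct() are made up for this example, and the model assumes one write() per extension call and nothing else.

```c
#include <assert.h>

/* Hypothetical helper: number of extension calls needed to add `pages`
 * new pages when each call extends the file by `batch` pages
 * (ceiling division). */
long extension_calls(long pages, long batch)
{
    assert(pages >= 0 && batch > 0);
    return (pages + batch - 1) / batch;
}

/* Hypothetical helper: relative improvement, in percent, of a batched
 * timing over a baseline timing (both in milliseconds). */
double improvement_pct(double baseline_ms, double batched_ms)
{
    assert(baseline_ms > 0.0);
    return 100.0 * (baseline_ms - batched_ms) / baseline_ms;
}
```

For example, extension_calls(6400, 64) gives 100 against 6400 for single-page extension. Applied to the timings above, comparing the 32-page runs with the one-page runs, improvement_pct(4496.0, 4086.0) comes to about 9.1 and improvement_pct(49962.0, 41290.0) to about 17.4.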

What I changed:

(1) md.c
Modify function mdextend():
- extend by 64 pages each time;
- after extending, make the FSM aware of the new pages (I changed the FSM
a little so that it can also report free space for an empty page)

(2) bufmgr.c
Make ReadPage(+empty_page) treat empty and non-empty pages differently, to
avoid an unnecessary read for new pages. That is:

    if (!empty_page)
        smgrread(reln->rd_smgr, blockNum, (char *) MAKE_PTR(bufHdr->data));
    else
        /* Only valid for heap pages, and there could be a race here ... */
        PageInit((char *) MAKE_PTR(bufHdr->data), BLCKSZ, 0);

(3) hio.c
RelationGetBufferForTuple():
- pass the correct "empty_page" parameter to ReadPage(), according to the
result of querying the FSM.

Regards,
Qingqing



#12
04-11-2008, 05:16 AM
Tom Lane
Re: adding new pages bulky way

"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
> What I changed:
>
> Make ReadPage(+empty_page) treat empty and non-empty pages differently, to
> avoid an unnecessary read for new pages. That is:


In other words, if FSM is wrong you will overwrite valid data? No
thanks ... this is guaranteed to fail under simple concurrent usage,
let alone any more interesting scenarios like FSM being actually out of
date.

regards, tom lane
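
The failure mode described above can be shown with a toy model. This is a minimal sketch and not PostgreSQL code: the page array, the hint array, and all toy_* names are hypothetical stand-ins, and the "race" is reduced to a hint that was never updated after another backend filled the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BLCKSZ  16   /* toy page size, not PostgreSQL's BLCKSZ */
#define TOY_NBLOCKS 4

/* Hypothetical stand-ins: "disk" holds the real pages; "fsm_says_empty"
 * is a hint that may lag behind what is actually on disk. */
static char disk[TOY_NBLOCKS][TOY_BLCKSZ];
static bool fsm_says_empty[TOY_NBLOCKS];

/* Models the proposed shortcut: trust the hint and skip the read. */
void toy_read_page(int blk, bool empty_page, char *buf)
{
    assert(blk >= 0 && blk < TOY_NBLOCKS);
    if (!empty_page)
        memcpy(buf, disk[blk], TOY_BLCKSZ);  /* stands in for smgrread() */
    else
        memset(buf, 0, TOY_BLCKSZ);          /* stands in for PageInit() */
}

/* Writes the buffer back, as a later flush of the dirty buffer would. */
void toy_write_page(int blk, const char *buf)
{
    assert(blk >= 0 && blk < TOY_NBLOCKS);
    memcpy(disk[blk], buf, TOY_BLCKSZ);
}

/* Returns true if the data stored by "backend A" is still on disk after
 * "backend B" loads the same block using the hint and writes it back. */
bool toy_tuple_survives(bool hint_is_stale)
{
    char buf[TOY_BLCKSZ];
    const int blk = 0;

    memset(disk, 0, sizeof(disk));

    /* Backend A fills the first half of the page. */
    memcpy(disk[blk], "tuple-A", 8);

    /* A stale hint still claims the page is empty. */
    fsm_says_empty[blk] = hint_is_stale;

    /* Backend B trusts the hint, adds its own data, and writes back. */
    toy_read_page(blk, fsm_says_empty[blk], buf);
    memcpy(buf + 8, "tuple-B", 8);
    toy_write_page(blk, buf);

    return memcmp(disk[blk], "tuple-A", 8) == 0;
}
```

In this model toy_tuple_survives(false) returns true and toy_tuple_survives(true) returns false: once the hint is wrong, the page is re-initialized in memory and backend A's data is overwritten when the buffer is written back.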

#13
04-11-2008, 05:16 AM
Qingqing Zhou
Re: adding new pages bulky way


"Tom Lane" <tgl@sss.pgh.pa.us> writes
>
> In other words, if FSM is wrong you will overwrite valid data? No
> thanks ... this is guaranteed to fail under simple concurrent usage,
> let alone any more interesting scenarios like FSM being actually out of
> date.
>


You are welcome ;-). The FSM race/corruption problem is a real concern. Maybe
we could put this on the TODO list and look for a better solution, since we
can see that the performance benefit is there.

Regards,
Qingqing


