Unix Technical Forum

NOLOGGING option, or ?

This is a discussion on NOLOGGING option, or ? within the pgsql Hackers forums, part of the PostgreSQL category; --> On Wed, Jun 01, 2005 at 10:35:30AM -0700, Alon Goldshuv wrote: > 2) A modified command syntax for introducing ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #61 (permalink)  
Old 04-11-2008, 05:11 AM
Alvaro Herrera
 
Posts: n/a
Default Re: NOLOGGING option, or ?

On Wed, Jun 01, 2005 at 10:35:30AM -0700, Alon Goldshuv wrote:

> 2) A modified command syntax for introducing a direct single row error
> handling. By direct I mean - a row that if rejected from within the COPY
> command context does not throw an error and rollsback the whole transaction.
> Instead the error is caught and recorded elsewhere, maybe in some error
> table, with some more information that can later on be retrieved. The
> following rows continue to be processed. This way there is barely any error
> handling overhead.


Is there any idea on exactly how would this be done? Do you plan on
using savepoints to implement it? I fail to see how is this "barely any
overhead". Savepoints are not that expensive but they are not free either.
(No, I haven't measured it.)

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #62 (permalink)  
Old 04-11-2008, 05:11 AM
Luke Lonergan
 
Posts: n/a
Default Re: NOLOGGING option, or ?

Steve,

> I've been following this thread, and I'm a little confused. Could you
> possibly clarify what you mean, by providing a couple of lines of
> input as it would be formatted with escape processing turned off -
> containing a text field with an embedded newline and tab and a null field.


Using an extended command syntax for a "copy-like" command named LOAD:

LOAD [schema.]tablename [(column1,column2,...)]
FROM {'filename' | STDIN}
[ [WITH]
[DELIMITER [AS] 'delimiter']
[NULL [AS] 'null string']
[ESCAPE [AS] Œescape¹] ] ;

If you intend to support embedded newlines (0x0a) in your character data
without escapes, you will need to choose an alternative newline character
for formatting the data. An example that uses '0xaa' as the newline might
be:

Control statement:
LOAD webform (formdata) FROM /home/sample/sample.txt WITH DELIMITER Œ|¹ NULL
Œ¹ NEWLINE '0xaa';

Sample with 2 identical rows (with binary representations depicted between
<>):
Blahblah<0xaa>blahblah<0x09>blahblah<0x00>blahblah <0xaa>Blahblah<0xaa>blahbl
ah<0x09>blahblah<0x00>blahblah<0xaa>

- Luke



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #63 (permalink)  
Old 04-11-2008, 05:11 AM
Luke Lonergan
 
Posts: n/a
Default Re: NOLOGGING option, or ?


Steve,

Oops. Example below should have read differently:

> Sample with 2 identical rows (with binary representations depicted between
> <>):
> Blahblah<0xaa>blahblah<0x09>blahblah<0x00>blahblah <0xaa>Blahblah<0xaa>blahbl
> ah<0x09>blahblah<0x00>blahblah<0xaa>


Blahblah<0x0a>blahblah<0x09>blahblah<0x00>blahblah <0xaa>Blahblah<0x0a>blahbl
ah<0x09>blahblah<0x00>blahblah<0xaa>

This would result in the load of two records each of which would look like
this (when printed on a typical terminal):

Blahblah
blahblah blahblah

Luke



---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #64 (permalink)  
Old 04-11-2008, 05:11 AM
Hannu Krosing
 
Posts: n/a
Default Re: NOLOGGING option, or ?

On K, 2005-06-01 at 11:31 -0400, Tom Lane wrote:
> Hannu Krosing <hannu@skype.net> writes:
> > I think this should be a decision done when creating a table, just like
> > TEMP tables. So you always know if a certain table is or is not
> > safe/replicated/recoverable.
> > This has also the advantage of requiring no changes to actual COPY and
> > INSERT commands.

>
> That doesn't seem right to me; the scenario I envision is that you are
> willing to do the initial data loading over again (since you presumably
> still have the source data available). But once you've got it loaded
> you want full protection.


What I mean, was that as it can't be safely replicated using log-
shipping, It should be visible as such.

> Perhaps it could work to use an ALTER TABLE command to flip the state.


No. It would be the same as flipping a TEMP table to an ordinary table,
which we don't support, and IMHO for a good reason

> But I'm not really seeing the point compared to treating it as a COPY
> option.


The point is having a separate (sub)type of storage - non-WAL/non-
replicated table and its indexes.

> I do not believe that anyone needs this to work on individual
> INSERT commands --- if you are after max speed, why aren't you using
> COPY? And treating it as an ALTER property opens the possibility of
> forgetting to ALTER the table back to normal behavior, which would be
> a foot-gun of large caliber indeed :-(


That's what I'm trying to avoid - If it is obvious, that the whole table
is quasi-stable (in PITR/log-shipping sense) it is more clearly a user
choice what kinds of data can be stored there. Same as TEMP tables
again.

--
Hannu Krosing <hannu@tm.ee>

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #65 (permalink)  
Old 04-11-2008, 05:11 AM
Hannu Krosing
 
Posts: n/a
Default Re: NOLOGGING option, or ?

On K, 2005-06-01 at 18:05 -0400, Bruce Momjian wrote:

> What we could do is to do no-WAL automatically for empty tables (like
> when a database is first loaded),


You forget that some databases use WAL for PITR / replication and doing
it automatically there would surely mess up their replica.

How is index creation handeled if it is not logged in WAL ?
- is it not automatically WAL'ed ?
- Must one recreate indexes after PITR or failover ?

> and use the flag for cases where the
> tables is not zero pages. The fact is that database loads are a prefect
> case for this optimization and old dumps are not going to have that flag
> anyway, and automatic is better if we can do it.


--
Hannu Krosing <hannu@tm.ee>

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #66 (permalink)  
Old 04-11-2008, 05:12 AM
Alon Goldshuv
 
Posts: n/a
Default Re: NOLOGGING option, or ?

>> 2) A modified command syntax for introducing a direct single row error
>> handling. By direct I mean - a row that if rejected from within the COPY
>> command context does not throw an error and rollsback the whole transaction.
>> Instead the error is caught and recorded elsewhere, maybe in some error
>> table, with some more information that can later on be retrieved. The
>> following rows continue to be processed. This way there is barely any error
>> handling overhead.

>
> Is there any idea on exactly how would this be done? Do you plan on
> using savepoints to implement it? I fail to see how is this "barely any
> overhead". Savepoints are not that expensive but they are not free either.
> (No, I haven't measured it.)


Good question, I am not entirely sure if this is possible yet, as I didn't
think it through entirely yet.

I guess data errors could be divided into 2 main categories: mal-formed data
where error is detected even before forming a tuple, and the other is errors
that are caused by some constraint violation, that is, after the tuple is
formed and inserted.

From what I hear and experience the big majority of errors are of the first
type. In that case the error could be caught, the data line + line number +
error description could be inserted into an ERROR table (all TEXT fields),
and then COPY can skip forming a tuple, and move to parsing the next. In
this process there is barely any overhead.

The more difficult part obviously is handling the second error type, which I
haven't looked at yet deeply. Hopefully it is not impossible to do while
keeping transaction integrity (Any ideas anyone?). The overhead for this one
will probably be larger, but again, we expect those to happen less (in most
cases at least). Nevertheless, it is surely much faster than recursively
narrowing down batch sizes.


Alon.



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:11 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com