Unix Technical Forum

PITR Backups

This is a discussion on PITR Backups within the Pgsql Performance forums, part of the PostgreSQL category; --> On Mon, 2007-06-25 at 08:28 -0700, Dan Gorman wrote: > I took several snapshots. In all cases the FS ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #31 (permalink)  
Old 04-19-2008, 11:07 AM
Simon Riggs
 
Posts: n/a
Default Re: PITR Backups

On Mon, 2007-06-25 at 08:28 -0700, Dan Gorman wrote:
> I took several snapshots. In all cases the FS was fine. In one case
> the db looked like on recovery it thought there were outstanding
> pages to be written to disk as seen below and the db wouldn't start.
>
> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21
> 00:39:43 PDTLOG: redo done at 71/99870670
> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21
> 00:39:43 PDTWARNING: page 28905 of relation 1663/16384/76718 was
> uninitialized


OK, please put log_min_messages = DEBUG2 and re-run the recovery please.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #32 (permalink)  
Old 04-19-2008, 11:07 AM
Tom Lane
 
Posts: n/a
Default Re: PITR Backups

Dan Gorman <dgorman@hi5.com> writes:
> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21
> 00:39:43 PDTLOG: redo done at 71/99870670
> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21
> 00:39:43 PDTWARNING: page 28905 of relation 1663/16384/76718 was
> uninitialized
> ... lots of these ...
> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [40-1] 2007-06-21
> 00:39:43 PDTPANIC: WAL contains references to invalid pages


(BTW, you'll find putting a space at the end of log_line_prefix
does wonders for log readability.)

Reformatting and sorting, we have

WARNING: page 3535207 of relation 1663/16384/33190 was uninitialized
WARNING: page 3535208 of relation 1663/16384/33190 did not exist
WARNING: page 3535209 of relation 1663/16384/33190 did not exist

WARNING: page 800226 of relation 1663/16384/33204 did not exist

WARNING: page 17306 of relation 1663/16384/76710 did not exist

WARNING: page 13626 of relation 1663/16384/76716 did not exist

WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
WARNING: page 28902 of relation 1663/16384/76718 was uninitialized
WARNING: page 28903 of relation 1663/16384/76718 was uninitialized
WARNING: page 28904 of relation 1663/16384/76718 was uninitialized
WARNING: page 28905 of relation 1663/16384/76718 was uninitialized
WARNING: page 28906 of relation 1663/16384/76718 was uninitialized
WARNING: page 28907 of relation 1663/16384/76718 was uninitialized
WARNING: page 28908 of relation 1663/16384/76718 was uninitialized
WARNING: page 28909 of relation 1663/16384/76718 was uninitialized
WARNING: page 28910 of relation 1663/16384/76718 was uninitialized
WARNING: page 28911 of relation 1663/16384/76718 was uninitialized
WARNING: page 28912 of relation 1663/16384/76718 was uninitialized
WARNING: page 28913 of relation 1663/16384/76718 was uninitialized
WARNING: page 28914 of relation 1663/16384/76718 was uninitialized
WARNING: page 28915 of relation 1663/16384/76718 was uninitialized
WARNING: page 28916 of relation 1663/16384/76718 was uninitialized
WARNING: page 28917 of relation 1663/16384/76718 was uninitialized
WARNING: page 28918 of relation 1663/16384/76718 was uninitialized

WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
WARNING: page 26708 of relation 1663/16384/76719 was uninitialized
WARNING: page 26710 of relation 1663/16384/76719 was uninitialized
WARNING: page 26711 of relation 1663/16384/76719 was uninitialized
WARNING: page 26712 of relation 1663/16384/76719 was uninitialized
WARNING: page 26713 of relation 1663/16384/76719 was uninitialized

So the problems were pretty localized, probably at the ends of these
files. Can you go back to the source database and check which
tables these are --- match the last number cited in each line
against pg_class.relfilenode? Are they tables or indexes, and
about how big are they?

A possible explanation is we stopped scanning WAL before reaching
records that truncated or dropped these tables. But it's not clear why.
Could we see the last few log lines before the "redo done" one?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #33 (permalink)  
Old 04-19-2008, 11:07 AM
Simon Riggs
 
Posts: n/a
Default Re: PITR Backups

On Mon, 2007-06-25 at 12:34 -0400, Tom Lane wrote:
> Dan Gorman <dgorman@hi5.com> writes:
> > Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21
> > 00:39:43 PDTLOG: redo done at 71/99870670


This is mid-way through an xlog file.

> > Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21
> > 00:39:43 PDTWARNING: page 28905 of relation 1663/16384/76718 was
> > uninitialized
> > ... lots of these ...
> > Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [40-1] 2007-06-21
> > 00:39:43 PDTPANIC: WAL contains references to invalid pages

>
> (BTW, you'll find putting a space at the end of log_line_prefix
> does wonders for log readability.)
>
> Reformatting and sorting, we have
>
> WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28902 of relation 1663/16384/76718 was uninitialized


> WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
> WARNING: page 26708 of relation 1663/16384/76719 was uninitialized


Those two are interesting because we appear to have two valid pages in
the middle of some uninitialized ones. That implies were not looking at
an unapplied truncation.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #34 (permalink)  
Old 04-19-2008, 11:07 AM
Dan Gorman
 
Posts: n/a
Default Re: PITR Backups

Thanks for the pointers to a) make it readable and b) log min messages

I didn't however keep the snapshots around. I could try and re-set
this scenario up. I was in the middle of doing some data migration
with Netapp and wanted to just 'test' it to make sure it was sane.

If you guys would like me to try to 'break' it again and keep the db
around for further testing let me know.

Regards,
Dan Gorman


On Jun 25, 2007, at 9:34 AM, Tom Lane wrote:

> Dan Gorman <dgorman@hi5.com> writes:
>> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21
>> 00:39:43 PDTLOG: redo done at 71/99870670
>> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21
>> 00:39:43 PDTWARNING: page 28905 of relation 1663/16384/76718 was
>> uninitialized
>> ... lots of these ...
>> Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [40-1] 2007-06-21
>> 00:39:43 PDTPANIC: WAL contains references to invalid pages

>
> (BTW, you'll find putting a space at the end of log_line_prefix
> does wonders for log readability.)
>
> Reformatting and sorting, we have
>
> WARNING: page 3535207 of relation 1663/16384/33190 was uninitialized
> WARNING: page 3535208 of relation 1663/16384/33190 did not exist
> WARNING: page 3535209 of relation 1663/16384/33190 did not exist
>
> WARNING: page 800226 of relation 1663/16384/33204 did not exist
>
> WARNING: page 17306 of relation 1663/16384/76710 did not exist
>
> WARNING: page 13626 of relation 1663/16384/76716 did not exist
>
> WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28902 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28903 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28904 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28905 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28906 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28907 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28908 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28909 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28910 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28911 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28912 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28913 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28914 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28915 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28916 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28917 of relation 1663/16384/76718 was uninitialized
> WARNING: page 28918 of relation 1663/16384/76718 was uninitialized
>
> WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
> WARNING: page 26708 of relation 1663/16384/76719 was uninitialized
> WARNING: page 26710 of relation 1663/16384/76719 was uninitialized
> WARNING: page 26711 of relation 1663/16384/76719 was uninitialized
> WARNING: page 26712 of relation 1663/16384/76719 was uninitialized
> WARNING: page 26713 of relation 1663/16384/76719 was uninitialized
>
> So the problems were pretty localized, probably at the ends of these
> files. Can you go back to the source database and check which
> tables these are --- match the last number cited in each line
> against pg_class.relfilenode? Are they tables or indexes, and
> about how big are they?
>
> A possible explanation is we stopped scanning WAL before reaching
> records that truncated or dropped these tables. But it's not clear
> why.
> Could we see the last few log lines before the "redo done" one?
>
> regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq




---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #35 (permalink)  
Old 04-19-2008, 11:07 AM
Gregory Stark
 
Posts: n/a
Default Re: PITR Backups

"Simon Riggs" <simon@2ndquadrant.com> writes:

>> WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
>> WARNING: page 28902 of relation 1663/16384/76718 was uninitialized

>
>> WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
>> WARNING: page 26708 of relation 1663/16384/76719 was uninitialized

>
> Those two are interesting because we appear to have two valid pages in
> the middle of some uninitialized ones. That implies were not looking at
> an unapplied truncation.


You don't have fsync off do you? That could explain missing pages at the end
of a file like this too. And it would explain how you could have two written
in the midst of others that are missing.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #36 (permalink)  
Old 04-19-2008, 11:07 AM
Tom Lane
 
Posts: n/a
Default Re: PITR Backups

"Simon Riggs" <simon@2ndquadrant.com> writes:
>> Reformatting and sorting, we have
>>
>> WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
>> WARNING: page 28902 of relation 1663/16384/76718 was uninitialized


>> WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
>> WARNING: page 26708 of relation 1663/16384/76719 was uninitialized


> Those two are interesting because we appear to have two valid pages in
> the middle of some uninitialized ones. That implies were not looking at
> an unapplied truncation.


Not necessarily --- it's possible the WAL sequence simply didn't touch
those pages.

Your suggestion to rerun the recovery with higher log_min_messages
is a good one, because that way we'd get some detail about what the
WAL records that touched the pages were. I think DEBUG1 would be
sufficient for that, though, and DEBUG2 might be pretty durn verbose.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #37 (permalink)  
Old 04-19-2008, 11:07 AM
Dan Gorman
 
Posts: n/a
Default Re: PITR Backups

No, however, I will attach the postgreql.conf so everyone can look at
other settings just in case.


Regards,
Dan Gorman

On Jun 25, 2007, at 10:07 AM, Gregory Stark wrote:

> "Simon Riggs" <simon@2ndquadrant.com> writes:
>
>>> WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
>>> WARNING: page 28902 of relation 1663/16384/76718 was uninitialized

>>
>>> WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
>>> WARNING: page 26708 of relation 1663/16384/76719 was uninitialized

>>
>> Those two are interesting because we appear to have two valid
>> pages in
>> the middle of some uninitialized ones. That implies were not
>> looking at
>> an unapplied truncation.

>
> You don't have fsync off do you? That could explain missing pages
> at the end
> of a file like this too. And it would explain how you could have
> two written
> in the midst of others that are missing.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #38 (permalink)  
Old 04-19-2008, 11:07 AM
Tom Lane
 
Posts: n/a
Default Re: PITR Backups

Dan Gorman <dgorman@hi5.com> writes:
> I didn't however keep the snapshots around. I could try and re-set
> this scenario up. I was in the middle of doing some data migration
> with Netapp and wanted to just 'test' it to make sure it was sane.


> If you guys would like me to try to 'break' it again and keep the db
> around for further testing let me know.


Yeah, please do. It's not entirely clear whether you've found a bug
or not, and it'd be good to determine that ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:18 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com