This is a discussion on Re: SUSE port (was [ANNOUNCE] PostgreSQL 8.0.0 Release Candidate 5) within the pgsql Hackers forums, part of the PostgreSQL category; --> Simon Riggs wrote: > Not sure what is going on here: why is SUSE not listed on the > ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Simon Riggs wrote: > Not sure what is going on here: why is SUSE not listed on the > supported platforms list? (still) RC5 contains: SUSE Linux x86 8.0.0 Peter Eisentraut (<peter_e@gmx.net>), 2005-01-10 9.1 In the meantime I have received confirmation from Reinhard Max that his test methods match our requirements, so the list will be completed with the other platforms he reported in due time. I'm sorry, but I won't just add "it works" notices if it's not clear what kind of testing was done. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| Greg Stark wrote: >I think part of the problem is that there's a bunch of features related to >these types of queries and the lines between them blur. > >You seem to be talking about putting visibility information inside indexes for >so index-only plans can be performed. But you're also talking about queries >like "select count(*) from foo" with no where clauses. Such a query wouldn't >be helped by index-only scans. > >Perhaps you're thinking about caching the total number of records in a global >piece of state like a materialized view? That would be a nice feature but I >think it should done as a general materialized view implementation, not a >special case solution for just this one query. > >Perhaps you're thinking of the min/max problem of being able to use indexes to >pick out just the tuples satisfying the min/max constraint. That seems to me >to be one of the more tractable problems in this area but it would still >require lots of work. > >I suggest you post a specific query you find is slow. Then discuss how you >think it ought to be executed and why. > > > You are correct, I am proposing to add visibility to the indexes. As for unqualified counts, I believe that they could take advantage of an index-only scan as it requires much less I/O to perform an index scan than a sequential scan on large tables. Min/Max would also take advantage of index only scans but say, for example, that someone has the following: Relation SOME_USERS user_id BIGINT PK user_nm varchar(32) UNIQUE INDEX some_other_attributes... If an application needs the user names, it would run SELECT user_nm FROM SOME_USERS... in the current implementation this would require a sequential scan. On a relation which contains 1M+ tuples, this requires either a lot of I/O or a lot of cache. An index scan would immensely speed up this query. ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html |
| |||
| "Jonah H. Harris" <jharris@tvi.edu> writes: > Tom, Bruce, and others involved in this recurring TODO discussion… > First, let me start by saying that I understand this has been discussed > many times before; however, I’d like to see what the current state of > affairs is regarding the possibility of using a unique index scan to > speed up the COUNT aggregate. It's not happening, because no one has come up with a workable proposal. In particular, we're not willing to slow down every other operation in order to make COUNT-*-with-no-WHERE-clause faster. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html |
| |||
| On Wed, 12 Jan 2005 at 16:29, Peter Eisentraut wrote: > In the meantime I have received confirmation from Reinhard Max that > his test methods match our requirements, so the list will be > completed with the other platforms he reported in due time. Today I've updated the RPMs on the FTP server to RC5, which implicitly means that PostgreSQL passes the regression tests on the provided platforms. ftp://ftp.suse.com/pub/projects/post...resql-8.0.0rc5 I have also successfully compiled and tested RC5 (but not yet created RPMs) on the following platforms: 8.1-i386 8.2-i386 sles8-i386 sles8-ia64 sles8-ppc sles8-ppc64 sles8-s390 sles8-s390x The only failure I have to report is sles8-x86_64, where I am getting segfaults from psql during the regression tests. I am still investigating what's going on there.... cu Reinhard ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On Wed, 12 Jan 2005 at 18:20, Reinhard Max wrote: > I am still not sure whether the kerberos library, glibc, or > PostgreSQL is to blame, or if it's a combination of bugs in these > components that triggers the segfault. The problem is, that the heimdal implementation of kerberos5 used on sles8 needs an extra include statement for com_err.h in src/interfaces/libpq/fe-auth.c to get the prototype for error_message(), while on newer SUSE-releases using the MIT Kerberos5 implementation this prototype is provided by krb5.h itself. So I suspect this bug might hit everyone using heimdal, but it only gets triggered when one of the calls to kerberos in src/interfaces/libpq/fe-auth.c returns with an error. I am still not sure why the crash only happened on x86_64. cu Reinhard ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org |
| |||
| Tom Lane wrote: >The fundamental problem is that you can't do it without adding at least >16 bytes, probably 20, to the size of an index tuple header. That would >double the physical size of an index on a simple column (eg an integer >or timestamp). The extra I/O costs and extra maintenance costs are >unattractive to say the least. And it takes away some of the >justification for the whole thing, which is that reading an index is >much cheaper than reading the main table. That's only true if the index >is much smaller than the main table ... > > regards, tom lane > > I recognize the added cost of implementing index only scans. As storage is relatively cheap these days, everyone I know is more concerned about faster access to data. Similarly, it would still be faster to scan the indexes than to perform a sequential scan over the entire relation for this case. I also acknowledge that it would be a negative impact to indexes where this type of acces isn't required, as you suggested and which is more than likely not the case. I just wonder what more people would be happier with and whether the added 16-20 bytes would be extremely noticable considering most 1-3 year old hardware. ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html |
| |||
| Jonah H. Harris said: > Tom Lane wrote: > >>The fundamental problem is that you can't do it without adding at least >>16 bytes, probably 20, to the size of an index tuple header. That >>would double the physical size of an index on a simple column (eg an >>integer or timestamp). The extra I/O costs and extra maintenance costs >>are unattractive to say the least. And it takes away some of the >>justification for the whole thing, which is that reading an index is >>much cheaper than reading the main table. That's only true if the >>index is much smaller than the main table ... >> > I recognize the added cost of implementing index only scans. As > storage is relatively cheap these days, everyone I know is more > concerned about faster access to data. Similarly, it would still be > faster to scan the indexes than to perform a sequential scan over the > entire relation for this case. I also acknowledge that it would be a > negative impact to indexes where this type of acces isn't required, as > you suggested and which is more than likely not the case. I just > wonder what more people would be happier with and whether the added > 16-20 bytes would be > extremely noticable considering most 1-3 year old hardware. > > Monetary cost is not the issue - cost in time is the issue. cheers andrew ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend |
| |||
| > >The index could then make sensible the reasonably common practice of > >using a covered index - i.e. putting additional columns into the index > >to satisfy the whole query just from the index. > I am willing to take it on and I understand that the workload is mine. > As long as everyone gives me some suggestions, I'm good it being optional. If nobody is working on it, you may find that the below TODO item might accomplish most of what you're looking for as well as generally improving performance. The count(*) on a where clause would result in one index scan and one partial sequential heap scan. Not as fast for the specific examples you've shown, but far better than today and covers many other cases as well. Fetch heap pages matching index entries in sequential order Rather than randomly accessing heap pages based on index entries, mark heap pages needing access in a bitmap and do the lookups in sequential order. Another method would be to sort heap ctids matching the index before accessing the heap rows. ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html |
| |||
| "Jonah H. Harris" <jharris@tvi.edu> writes: > You are correct, I am proposing to add visibility to the indexes. Then I think the only way you'll get any support is if it's an option. Since it would incur a performance penalty on updates and deletes. > As for unqualified counts, I believe that they could take advantage of an > index-only scan as it requires much less I/O to perform an index scan than a > sequential scan on large tables. No, sequential scans require slightly more i/o than index scans. More importantly they require random access i/o instead of sequential i/o which is much slower. Though this depends. If the tuple is very wide then the index might be faster to scan since it would only contain the data from the fields being indexed. This brings to mind another approach. It might be handy to split the heap for a table into multiple heaps. The visibility information would only be in one of the heaps. This would be a big win if many of the fields were rarely used, especially if they're rarely used by sequential scans. > Relation SOME_USERS > user_id BIGINT PK > user_nm varchar(32) UNIQUE INDEX > some_other_attributes... What's with the fetish with unique indexes? None of this is any different for unique indexes versus non-unique indexes. -- greg ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org |
| ||||
| "Jonah H. Harris" <jharris@tvi.edu> writes: > Looking at the message boards, there is significant interest in the COUNT(*) > aspect. However, rather than solely address the COUNT(*) TODO item, why not fix > it and add additional functionality found in commercial databases as well? I > believe Oracle has had this feature since 7.3 and I know people take advantage > of it. I think part of the problem is that there's a bunch of features related to these types of queries and the lines between them blur. You seem to be talking about putting visibility information inside indexes for so index-only plans can be performed. But you're also talking about queries like "select count(*) from foo" with no where clauses. Such a query wouldn't be helped by index-only scans. Perhaps you're thinking about caching the total number of records in a global piece of state like a materialized view? That would be a nice feature but I think it should done as a general materialized view implementation, not a special case solution for just this one query. Perhaps you're thinking of the min/max problem of being able to use indexes to pick out just the tuples satisfying the min/max constraint. That seems to me to be one of the more tractable problems in this area but it would still require lots of work. I suggest you post a specific query you find is slow. Then discuss how you think it ought to be executed and why. -- greg ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org |