This is a discussion on Performance of count(*) within the Pgsql Performance forums, part of the PostgreSQL category; --> "Craig A. James" <cjames@modgraph-usa.com> writes: > Tom Lane wrote: >> You mean >> http://archives.postgresql.org/pgsql...0/msg00283.php >> ? I don't see ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| "Craig A. James" <cjames@modgraph-usa.com> writes: > Tom Lane wrote: >> You mean >> http://archives.postgresql.org/pgsql...0/msg00283.php >> ? I don't see anything there that bears on Steve's suggestion. > Mea culpa, it's October 8, not October 18: > http://archives.postgresql.org/pgsql...0/msg00143.php I still don't see the relevance to Steve's suggestion. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Michael Stone schrieb: > On Thu, Mar 22, 2007 at 02:24:39PM -0400, Merlin Moncure wrote: >> Tino was saying that rather that build a complete indexing storage >> management solution that lives outside the database, it is better to >> do intelligent session management so that you get the simplicity if a >> two tier client server system but the scalability of a web app. > > No, what he was saying was "there's this thing called a cursor". I > thought there was enough information in the original message to indicate > that the author knew about cursors. There are certainly pros and cons > and nuances to different approaches, but Tino's message didn't touch on > anything that specific. Sure, the message thread sometimes loose history so I wasnt 100% sure what the framework really is - although I assumed it could be a web solution. With stand alone applications you usually have a limited number of users connecting and they are connected during the session so you can easily use cursors there. > And even if you do use some kind of "intelligent session management", > how many simultaneous cursors can postgres sanely keep track of? > Thousands? Millions? Tens of Millions? I suspect there's a scalability > limit in there somewhere. Luckily I don't spend much time in the web > application space, so I don't need to know. Depending on the application, you can even simulate above situation with a web framework if you manage session in the web framework with persistent connections for a limited amount of users to work the same time (certainly not feasable for a public web shop but for data management systems for inhouse use). In this case, cursors would be perfect too. In any other case I fail to see the advantage in storing "index data" outside the database with all the roundtripping involved. If the query is complex and rerunning it for every batch is expensive, fetching the whole result to the application in case of users really traversing the complete batch (How often is that really done? I mean, who browses to an end of a huge result set?) is costy as well w/o really benefit. It would be much more easy and clean imho, in this case to really fetch the data to session and batch linked scratch table. If its fast or you can prepare a batch helper table with index, you can just select the batch equival portion of the result. You dont need extensive session management in the web application to scroll thru result sets in this way. This can all be encoded in forms or links. Regards Tino ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Craig A. James schrieb: > Tino Wildenhain wrote: >>> You guys can correct me if I'm wrong, but the key feature that's >>> missing from Postgres's flexible indexing is the ability to maintain >>> state across queries. Something like this: >>> >>> select a, b, my_index_state() from foo where ... >>> offset 100 limit 10 using my_index(prev_my_index_state); >>> >> >> Yes, you are wrong :-) The technique is called "CURSOR" >> if you maintain persistent connection per session >> (e.g. stand allone application or clever pooling webapplication) > > That's my whole point: If relational databases had a simple mechanism > for storing their internal state in an external application, the need > for cursors, connection pools, and all those other tricks would be > eliminated. Well the cursor is exactly the simple handle to the internal state of the relational db you are looking for. Do you really think transferring the whole query-tree, open index and data files to the client over the network would really improve the situation? > As I said earlier, relational technology was invented in an earlier era, > and hasn't caught up with the reality of modern web apps. There is nothing modern with todays web apps. >> If its a naive web application you just store your session >> in tables where you can easily maintain the scroll state >> as well. > > One thing I've learned in 25 years of software development is that > people who use my software have problems I never imagined. I've been > the one who was naive when I said similar things about my customers, and > was later embarrassed to learn that their problems were more complex > than I ever imagined. Sure it really depends on the application how the best solution would look like but I'm quite certain, counterfaiting internal stuff of the underlying relational database in the application makes more problems then it solves. If you can't handle SQL, dont use SQL, you can build web applications w/o any relational database if you want it. Regards Tino Wildenhain ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Michael Stone schrieb: > On Thu, Mar 22, 2007 at 06:27:32PM +0100, Tino Wildenhain wrote: >> Craig A. James schrieb: >>> You guys can correct me if I'm wrong, but the key feature that's >>> missing from Postgres's flexible indexing is the ability to maintain >>> state across queries. Something like this: >>> >>> select a, b, my_index_state() from foo where ... >>> offset 100 limit 10 using my_index(prev_my_index_state); >>> >> >> Yes, you are wrong :-) The technique is called "CURSOR" >> if you maintain persistent connection per session >> (e.g. stand allone application or clever pooling webapplication) > > Did you read the email before correcting it? From the part you trimmed out: > >> The problem is that relational databases were invented before the web >> and its stateless applications. In the "good old days", you could >> connect to a database and work for hours, and in that environment >> cursors and such work well -- the RDBMS maintains the internal state >> of the indexing system. But in a web environment, state information >> is very difficult to maintain. There are all sorts of systems that >> try (Enterprise Java Beans, for example), but they're very complex. Yes, but actually this is not true. They are not so complex in this regard. All you have to do is to look in the pg_cursor view if your cursor is there and if not, create it in your session. All you need to maintain is the cursor name which maps to your session + the special query you run. This should be easy in any web application. > It sounds like they wrote their own middleware to handle the problem, > which is basically what you suggested (a "clever pooling web > application") after saying "wrong". I read about "building index data outside postgres" which still is the wrong approach imho. This discussion is a bit theoretical until we see the actual problem and the proposed solution here. Regards Tino ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On Fri, Mar 23, 2007 at 01:01:02PM +0100, Tino Wildenhain wrote: >This discussion is a bit theoretical until we see the actual problem >and the proposed solution here. It's good to see you back off a bit from your previous stance of assuming that someone doesn't know what they're doing and that their solution is absolutely wrong without actually knowing anything about what they are trying to do. Mike Stone ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| ||||
| Michael Stone schrieb: > On Fri, Mar 23, 2007 at 01:01:02PM +0100, Tino Wildenhain wrote: >> This discussion is a bit theoretical until we see the actual problem >> and the proposed solution here. > > It's good to see you back off a bit from your previous stance of > assuming that someone doesn't know what they're doing and that their > solution is absolutely wrong without actually knowing anything about > what they are trying to do. Well I'm sure its very likely wrong :-) At least the core part of it with the statement of "keeping index data outside postgres". What I meant with my comment about the theoreticalness: we cannot make educated suggestions about alternative solutions to the problem until we know the problem and maybe the current solution in detail. Regards Tino ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |