This is a discussion on IMMUTABLE? within the Pgsql Performance forums, part of the PostgreSQL category; --> Performance Folks, I just had an article[1] published in which I demonstrated recursive PL/pgSQL functions with this function: CREATE ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Performance Folks, I just had an article[1] published in which I demonstrated recursive PL/pgSQL functions with this function: CREATE OR REPLACE FUNCTION fib ( fib_for int ) RETURNS integer AS $$ BEGIN IF fib_for < 2 THEN RETURN fib_for; END IF; RETURN fib(fib_for - 2) + fib(fib_for - 1); END; $$ LANGUAGE plpgsql; Naturally, it's slow: try=# \timing try=# select fib(28); fib -------- 317811 (1 row) Time: 10642.803 ms Now, I mistakenly said in my article that PostgreSQL doesn't have native memoization, and so demonstrated how to use a table for caching to speed up the function. It's pretty fast: try=# select fib_cached(28); fib_cached ------------ 317811 (1 row) Time: 193.316 ms But over the weekend, I was looking at the Pg docs and saw IMMUTABLE, and said, "Oh yeah!". So I recreated the function with IMMUTABLE. But the performance was not much better: try=# select fib(28); fib -------- 317811 (1 row) Time: 8505.668 ms try=# select fib_cached(28); fib_cached ------------ 317811 (1 row) So, what gives? Am I missing something, or not understanding how IMMUTABLE works? Many TIA, David 1. http://www.onlamp.com/pub/a/onlamp/2...l-plpgsql.html ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| David Wheeler <david@kineticode.com> writes: > So, what gives? Am I missing something, or not understanding how > IMMUTABLE works? The latter. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| On May 15, 2006, at 20:21, Tom Lane wrote: >> So, what gives? Am I missing something, or not understanding how >> IMMUTABLE works? > > The latter. Hee-hee! And after all those nice things I wrote about you in a previous email on this list! But seriously, the documentation says (as if I need to tell you, but I was reading it again to make sure that I'm not insane): > IMMUTABLE indicates that the function always returns the same > result when given the same argument values; that is, it does not do > database lookups or otherwise use information not directly present > in its argument list. If this option is given, any call of the > function with all-constant arguments can be immediately replaced > with the function value. So that seems pretty clear to me. Now, granted, the recursive calls to fib() don't pass a constant argument, but I still would think that after the first time I called fib(28), that the next call to fib(28) would be lightening fast, even if fib(27) wasn't. So, uh, would you mind telling me what I'm missing? I'm happy to turn that knowledge into a documentation patch to help future boneheads like myself. :-) Thanks, David ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| David Wheeler <david@kineticode.com> writes: > But seriously, the documentation says (as if I need to tell you, but > I was reading it again to make sure that I'm not insane): >> IMMUTABLE indicates that the function always returns the same >> result when given the same argument values; that is, it does not do >> database lookups or otherwise use information not directly present >> in its argument list. If this option is given, any call of the >> function with all-constant arguments can be immediately replaced >> with the function value. Sure. As I read it, that's talking about a static transformation: planner sees 2 + 2 (or if you prefer, int4pl(2,2)), planner runs the function and replaces the expression with 4. Nothing there about memoization. It's true that the system *could* memoize (or in our more usual parlance, cache function values) given the assumptions embodied in IMMUTABLE. But we don't, and I don't see any statement in the docs that promises that we do. For 99% of the functions that the planner deals with, memoization would be seriously counterproductive because the function evaluation cost is comparable to if not less than the lookup cost in a memo table. (int4pl is a good case in point.) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| On Tue, May 16, 2006 at 12:31:41AM -0400, Tom Lane wrote: > It's true that the system *could* memoize (or in our more usual > parlance, cache function values) given the assumptions embodied in > IMMUTABLE. But we don't, and I don't see any statement in the docs > that promises that we do. For 99% of the functions that the planner > deals with, memoization would be seriously counterproductive because > the function evaluation cost is comparable to if not less than the > lookup cost in a memo table. (int4pl is a good case in point.) This seems to change as soon as one takes into account user functions. While most immutable functions really seem to be small and their execution fast, stable functions often hide complex sql (sometimes combined with if-then-else or other program flow logic). So irrespective of caching to prevent evaluation across statements, within a single statement, is there a strong reason why for example in WHERE col = f(const) with f() declared as immutable or stable and without an index on col, f() still gets called for every row? Or is this optimization just not done yet? Joachim ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Joachim Wieland <joe@mcknight.de> writes: > So irrespective of caching to prevent evaluation across statements, within a > single statement, is there a strong reason why for example in > WHERE col = f(const) with f() declared as immutable or stable and without an > index on col, f() still gets called for every row? Or is this optimization > just not done yet? The above statement is not correct, at least not for immutable functions. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| On Tue, May 16, 2006 at 09:33:14AM -0400, Tom Lane wrote: > Joachim Wieland <joe@mcknight.de> writes: > > So irrespective of caching to prevent evaluation across statements, within a > > single statement, is there a strong reason why for example in > > WHERE col = f(const) with f() declared as immutable or stable and without an > > index on col, f() still gets called for every row? Or is this optimization > > just not done yet? > The above statement is not correct, at least not for immutable functions. So an immutable function gets evaluated once but a stable function still gets called for every row? Wouldn't it make sense to call a stable function only once as well? Joachim ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On May 15, 2006, at 21:31, Tom Lane wrote: > Sure. As I read it, that's talking about a static transformation: > planner sees 2 + 2 (or if you prefer, int4pl(2,2)), planner runs the > function and replaces the expression with 4. Nothing there about > memoization. Oh, I see. So it's more like a constant or C macro. > It's true that the system *could* memoize (or in our more usual > parlance, cache function values) given the assumptions embodied in > IMMUTABLE. But we don't, and I don't see any statement in the docs > that promises that we do. For 99% of the functions that the planner > deals with, memoization would be seriously counterproductive because > the function evaluation cost is comparable to if not less than the > lookup cost in a memo table. (int4pl is a good case in point.) Yes, but there are definitely programming cases where memoization/ caching definitely helps. And it's easy to tell for a given function whether or not it really helps by simply trying it with CACHED and without. Would this be a simple thing to implement? Best, David ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| > Yes, but there are definitely programming cases where > memoization/caching definitely helps. And it's easy to tell for a given > function whether or not it really helps by simply trying it with CACHED > and without. > > Would this be a simple thing to implement? It's called a "table" ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| ||||
| On May 16, 2006, at 18:29, Christopher Kings-Lynne wrote: >> Yes, but there are definitely programming cases where memoization/ >> caching definitely helps. And it's easy to tell for a given >> function whether or not it really helps by simply trying it with >> CACHED and without. >> Would this be a simple thing to implement? > > It's called a "table" http://www.justatheory.com/computers...es/postgresql/ higher_order_plpgsql.html Yes, I know. :-P But it'd be easier to have a CACHED keyword, of course. Best, David ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| Thread Tools | |
| Display Modes | |
|
|