patch: garbage error strings in libpq (Pgsql Patches forum, PostgreSQL category)
| Several libpqxx users have been reporting odd problems with certain error messages generated by libpq. One of them was the inclusion of garbage data.

As it turns out, src/interfaces/libpq/fe-misc.c contains several instances of this construct:

    printfPQExpBuffer(&conn->ErrorMessage,
                      libpq_gettext("error: %s"),
                      SOCK_STRERROR(SOCK_ERRNO, buffer, sizeof(buffer)));

This may occur in other source files as well. On Unix-like systems, SOCK_ERRNO is defined as plain errno, which is likely to be overwritten by the call to libpq_gettext().

I'm attaching a patch that fixes these instances by introducing a named pointer to the SOCK_STRERROR message, initialized before either of the other function calls.

Another approach would have been to make libpq_gettext() preserve errno. It's tempting, but I'm not sure it would be valid from a language-lawyer point of view. There is no sequence point between the evaluations of libpq_gettext() and SOCK_STRERROR(). From what I vaguely remember hearing somewhere in the distant past, that means that theoretically they may be evaluated not just in any order but even in parallel. I guess it may actually happen if both inlining and scheduling are sufficiently aggressive. Even if libpq_gettext() is made to restore errno, it will still have to pollute errno at some points during its execution.

Jeroen

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster |
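The fix described in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions: `clobbering_gettext` and `report_error` are hypothetical names, not libpq functions, and plain `strerror()` stands in for SOCK_STRERROR. The point is just the ordering: copy the error text into a local buffer before anything else gets a chance to touch errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for libpq_gettext(): returns its argument unchanged
 * but clobbers errno, as a real translation lookup might. */
static const char *clobbering_gettext(const char *msgid)
{
    errno = 0;
    return msgid;
}

/* Hypothetical reporting helper: capture the error text first, then
 * translate the format string, then format the message. */
static void report_error(char *out, size_t outlen)
{
    char        errbuf[256];
    const char *fmt;

    /* Snapshot the message for the current errno before any other call. */
    snprintf(errbuf, sizeof(errbuf), "%s", strerror(errno));

    /* Only now call the function that may overwrite errno. */
    fmt = clobbering_gettext("error: %s");
    snprintf(out, outlen, fmt, errbuf);
}
```

With errno set to, say, ENOENT before the call, the output buffer carries the ENOENT message even though the stand-in translator has reset errno in between.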
| |||
| jtv@xs4all.nl writes:
> Another approach would have been to make libpq_gettext() preserve errno.

That seems like a far easier, cleaner, and more robust fix than this. Moreover, I don't believe that this approach works either, as the result of strerror() is not guaranteed to be still usable after another strerror() call (i.e., it can use a static buffer repeatedly), so you'd still have the problem if libpq_gettext invokes strerror. I suppose that a really robust solution would involve libpq_gettext saving errno, restoring errno, and invoking strerror() again ...

regards, tom lane |
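The errno-preserving alternative suggested here might look something like the sketch below. `raw_translate` and `errno_preserving_gettext` are made-up names for illustration; the real libpq_gettext() is not shown in the thread, so this is an assumption about the shape of the fix, not the actual code.

```c
#include <errno.h>

/* Hypothetical translation routine that disturbs errno as a side effect. */
static const char *raw_translate(const char *msgid)
{
    errno = EINVAL;
    return msgid;
}

/* Hypothetical wrapper: save errno on entry and restore it on exit, so the
 * caller sees the same errno before and after the call. */
static const char *errno_preserving_gettext(const char *msgid)
{
    int         save_errno = errno;
    const char *result = raw_translate(msgid);

    errno = save_errno;
    return result;
}
```

Note that errno is still disturbed while the wrapper is running; whether that matters is exactly what the rest of the thread argues about.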
| |||
| Tom Lane wrote:
> jtv@xs4all.nl writes:
>> Another approach would have been to make libpq_gettext() preserve errno.
>
> That seems like a far easier, cleaner, and more robust fix than this.

Provided that at least one of the following holds:

(a) the C standard has added a sequence point between the arguments in a function call, which AFAIK wasn't there before, or the sequence point was there all along (and the compiler implements it);
(b) the compiler is sufficiently naive;
(c) you get lucky with instruction scheduling on your particular architecture.

This is why I called this approach "tempting" but didn't go for it. I felt it was better to really fix the instances I found first, then see what patterns emerge and refactor. Like maybe a wrapper for printfPQExpBuffer() that takes a PGconn *, an untranslated format string, and varargs; this in turn can do the libpq_gettext(). That would cover all uses of printfPQExpBuffer() in libpq, except for one of the out-of-memory errors where no translation is done, which may have been unintentional (and this bug is again duplicated in the code).

> Moreover I don't believe that this approach works either, as the result
> of strerror() is not guaranteed still usable after another strerror call
> (ie, it can use a static buffer repeatedly), so you'd still have the
> problem if libpq_gettext invokes strerror. I suppose that a really
> robust solution would involve libpq_gettext saving errno, restoring
> errno, and invoking strerror() again ...

Check again. The calls to strerror() are routed through pqStrerror(), which copies the error message to the buffer, or in the case of GNU strerror_r(), at least ensures it is in some reusable location.

Jeroen |
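The wrapper idea floated above could be sketched along these lines. Everything here is hypothetical: `stub_gettext` stands in for libpq_gettext(), and `report_translated` writes into a plain char buffer rather than a PGconn's PQExpBuffer, purely to keep the example self-contained. What it tries to show is that the caller's arguments are all evaluated before the wrapper body runs, so the translation call inside can no longer race with them.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for libpq_gettext(): clobbers errno. */
static const char *stub_gettext(const char *msgid)
{
    errno = 0;
    return msgid;
}

/* Hypothetical wrapper: takes the untranslated format string and does the
 * translation internally. The variadic arguments were evaluated by the
 * caller before this body started, so clobbering errno here is harmless
 * to them. */
static void report_translated(char *out, size_t outlen, const char *msgid, ...)
{
    va_list     args;

    va_start(args, msgid);
    vsnprintf(out, outlen, stub_gettext(msgid), args);
    va_end(args);
}
```

A call such as `report_translated(buf, sizeof(buf), "error code: %d", errno)` reads errno at the call site, before the stand-in translator gets to overwrite it.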
| |||
| jtv@xs4all.nl wrote:
> (a) the C standard has added a sequence point between the arguments in a
> function call, which AFAIK wasn't there before, or the sequence point was
> there all along (and the compiler implements it);

Per C99 6.5.2.2.10, a sequence point occurs between the evaluation of the arguments to a function and the call of the function itself. Therefore a sequence point occurs before the call to libpq_gettext(). So ISTM having libpq_gettext() preserve errno should work.

-Neil |
| |||
| Neil Conway wrote:
> Per C99 6.5.2.2.10, a sequence point occurs between the evaluation of
> the arguments to a function and the call of the function itself.
> Therefore a sequence point occurs before the call to libpq_gettext(). So
> ISTM having libpq_gettext() preserve errno should work.

In C99, at least. But that's not the dialect postgres is written in; even gcc 4.0 leaves C99 support turned off by default.

Does anyone know what the situation is in C89, or whatever the applicable standard is?

Jeroen |
| |||
| jtv@xs4all.nl wrote:
> Does anyone know what the situation is in C89, or whatever the applicable
> standard is?

[ *looks* ] The text is the same in both versions:

http://dev.unicals.com/papers/c89-draft.html#3.3.2.2

"The order of evaluation of the function designator, the arguments, and subexpressions within the arguments is unspecified, but there is a sequence point before the actual call."

(On reading this more closely, I suppose you could make the argument that a function call that takes place in the argument list of another function call is a "subexpression within the [outer function's] arguments", so the order of evaluation prior to the call of the outer function would be undefined. But I don't think that's the right reading of the standard.)

-Neil |
| |||
| Neil Conway wrote:
> The text is the same in both versions:
>
> http://dev.unicals.com/papers/c89-draft.html#3.3.2.2
>
> "The order of evaluation of the function designator, the arguments, and
> subexpressions within the arguments is unspecified, but there is a
> sequence point before the actual call."
>
> (On reading this more closely, I suppose you could make the argument
> that a function call that takes place in the argument list of another
> function call is a "subexpression within the [outer function's]
> arguments", so the order of evaluation prior to the call of the outer
> function would be undefined. But I don't think that's the right reading
> of the standard.)

That is pretty much what I remember hearing at the time. To me what this says is only that (the program will behave as if) all arguments shall be evaluated before the function is called, but in an otherwise unspecified order.

What we're currently doing has this basic shape:

    #include <stdio.h>

    int x = 0;
    static int a() { x = 1; return x; }
    static int b() { printf("b sees x=%d\n", x); return x; }
    static int c(int l, int r) { printf("c sees x=%d\n", x); return x; }
    int main() { return c(a(),b()); }

Now, the best we can hope for based on what you quote is that we will see "c sees x=1" but we don't know what we'll see coming out of b(). And the wording makes it equally clear that we would not change any of this by doing c(b(),a()) instead of c(a(),b()).

The "best we can hope for" depends on the definition of "unspecified." This is where it gets really tricky. I see two different possible implications depending on that definition:

(optimistic) The program will execute as if the code said either "t1=a(); t2=b(); c(t1,t2)" or "t1=b(); t2=a(); c(t1,t2)" but we don't know which. I wouldn't bet on this one as a guarantee, although naive compilers will probably behave like this.

(pessimistic) The executions of a() and b() may be interspersed freely, although as a practical matter the compiler will respect the sequence points within each. But that still means there is no sequence point between any one given expression in the execution of a() and any other in the execution of b(), therefore setting a variable in a() and also touching it in b() leaves behaviour undefined. The program may react in any way it likes without violating the standard, including traveling back in time and refusing to start at all (really!), going off to make tea, or standing on its head and donning a tutu.

A well-known way to trigger undefined behaviour is "x++=x++;" because there is no sequence point between the two side effects. Try it: gcc will give you a stern warning. Given that there is no sequence point between argument expressions, as per the paragraph you quote, the same must go for "c(x++,x++)". So then it becomes dubious that there is suddenly a guarantee for "c(a(),b())"!

Jeroen |
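Whichever reading of "unspecified" is right, the ordering question disappears once the calls are pulled out into separate statements, which is essentially what the attached patch does for the error string. A minimal sketch based on the a/b/c example above; `seen_by_b` and `call_in_fixed_order` are names added here for illustration, and c() is changed to return the sum so the result can be checked:

```c
static int x = 0;
static int seen_by_b = -1;      /* records the value of x that b() observed */

static int a(void) { x = 1; return x; }
static int b(void) { seen_by_b = x; return x; }
static int c(int l, int r) { return l + r; }

/* Each full statement ends in a sequence point, so a() is guaranteed to
 * finish before b() starts, and b() before c() is called. */
static int call_in_fixed_order(void)
{
    int         t1 = a();
    int         t2 = b();

    return c(t1, t2);
}
```

Here b() always sees x == 1, with no dependence on how the compiler orders argument evaluation.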
| |||
| jtv@xs4all.nl wrote:
> That is pretty much what I remember hearing at the time.

> A well-known way to trigger undefined behaviour is "x++=x++;" because
> there is no sequence point between the two side effects. Try it: gcc will
> give you a stern warning. Given that there is no sequence point between
> argument expressions, as per the paragraph you quote, the same must go for
> "c(x++,x++)". So then it becomes dubious that there is suddenly a
> guarantee for "c(a(),b())"!

Right; my interpretation is that the "sequence point before function call" rule applies recursively. So in c(a(...), b(...)), there are in fact three sequence points, which precede the calls of a, b, and c. Shouldn't that be sufficient to ensure that the evaluation of libpq_gettext() is not interleaved with the evaluation of the other arguments to the printf()?

-Neil |
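The part that both readings agree on can be checked with a small logging sketch. All names here (`call_log`, `record`, `fa`, `fb`, `fc`, `run_nested_call`) are made up for the example. It does not settle the interleaving question; it only shows that both argument calls complete before the outer function is entered, while the order between the two arguments is left open.

```c
#include <stddef.h>

static char   call_log[4];      /* zero-initialized, so always terminated */
static size_t call_count = 0;

/* Append one character to the log, leaving room for the terminator. */
static void record(char name)
{
    if (call_count < sizeof(call_log) - 1)
        call_log[call_count++] = name;
}

static int fa(void) { record('a'); return 1; }
static int fb(void) { record('b'); return 2; }
static int fc(int l, int r) { record('c'); return l + r; }

/* Nested call of the same shape as c(a(), b()) in the thread. */
static int run_nested_call(void)
{
    return fc(fa(), fb());
}
```

After one call the log reads either "abc" or "bac", never with 'c' anywhere but last. For what it's worth, the later C11 revision of the standard spells out in 6.5.2.2 that function executions do not interleave with each other, which matches the recursive reading proposed above.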
| |||
| Neil Conway <neilc@samurai.com> writes:
> Right; my interpretation is that the "sequence point before function
> call" rule applies recursively. So in c(a(...), b(...)), there are in
> fact three sequence points, which precede the calls of a, b, and c.
> Shouldn't that be sufficient to ensure that the evaluation of
> libpq_gettext() is not interleaved with the evaluation of the other
> arguments to the printf()?

I think this is all irrelevant language-lawyering; jtv spotted the true problem, which is that we do not protect errno during the *first* call of libpq_gettext.

regards, tom lane |
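The "first call" case presumably refers to one-time setup work done inside the translation function. A rough sketch of guarding just that step, assuming a lazily initialized translator; `stub_bind_domain`, `bind_calls`, and `lazy_gettext` are hypothetical names, and the real libpq_gettext() may be organized differently:

```c
#include <errno.h>

static int bind_calls = 0;      /* counts how often the setup step ran */

/* Hypothetical one-time setup (for example, binding a message domain)
 * that disturbs errno as a side effect. */
static void stub_bind_domain(void)
{
    bind_calls++;
    errno = EINVAL;
}

/* Hypothetical translator with lazy initialization. errno is saved and
 * restored around the setup step, so even the first call leaves it intact. */
static const char *lazy_gettext(const char *msgid)
{
    static int  already_bound = 0;

    if (!already_bound)
    {
        int         save_errno = errno;

        already_bound = 1;
        stub_bind_domain();
        errno = save_errno;
    }
    return msgid;
}
```

The setup runs exactly once, and the caller's errno survives both the first call and every later one.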
| ||||
| Tom Lane wrote:
> I think this is all irrelevant language-lawyering; jtv spotted the true
> problem which is that we do not protect errno during the *first* call of
> libpq_gettext.

I think you're missing the point. Obviously the current code is wrong; the debate is over the best way to fix it. Jeroen's interpretation of the spec suggests that merely having libpq_gettext() preserve errno is not sufficient. I'm not convinced that his interpretation is correct, but it is a question worth resolving.

-Neil |