Unix Technical Forum

ISO-8859-1 encoding not enforced?

#1
04-11-2008, 04:23 AM
Christopher Kings-Lynne

Is PostgreSQL supposed to enforce a LATIN1/ISO-8859-1 encoding if that's
the database encoding?

Because people using this database can happily insert any old non-LATIN1
junk into the database, then when I export as XML, all XML validation
fails because the encoding is not correct.

If this is not expected behaviour, I will submit an example script
showing the problem...

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#2
04-11-2008, 04:23 AM
Tom Lane

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Is PostgreSQL supposed to enforce a LATIN1/ISO-8859-1 encoding if that's
> the database encoding?


AFAIK, there are no illegal characters in 8859-1, except \0 which we
do reject.

regards, tom lane
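
To illustrate the point about 8859-1 having no illegal byte values, here is a minimal Python sketch (not PostgreSQL code; the variable names are made up for illustration). It uses Python's built-in `latin-1` codec as a stand-in for the encoding: every one of the 256 byte values decodes, so text that is really in another encoding, such as UTF-8, is not rejected and simply reads as different characters.

```python
# Every byte value 0x00-0xFF maps to a character in ISO-8859-1,
# so decoding can never fail on "invalid" input.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")
assert len(decoded) == 256

# UTF-8 bytes for the euro sign are still acceptable as Latin-1:
# each byte is read as its own, unrelated character.
euro_utf8 = "\u20ac".encode("utf-8")
misread = euro_utf8.decode("latin-1")
print(len(misread), [hex(ord(c)) for c in misread])
```

This is why a byte-level validity check alone cannot catch "non-LATIN1 junk": there is nothing for it to reject apart from \0.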

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

#3
04-11-2008, 04:23 AM
Andrew Dunstan

Tom Lane said:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> Is PostgreSQL supposed to enforce a LATIN1/ISO-8859-1 encoding if
>> that's the database encoding?

>
> AFAIK, there are no illegal characters in 8859-1, except \0 which we do
> reject.
>


Perhaps Chris is confusing ISO/IEC 8859-1 with ISO-8859-1 a.k.a. Latin-1.

According to Wikipedia,

"The IANA has approved ISO-8859-1 (note the extra hyphen), a superset of
ISO/IEC 8859-1, for use on the Internet. This character map, or character
set or code page, supplements the assignments made by ISO/IEC 8859-1,
mapping control characters to code values 00-1F, 7F, and 80-9F. It thus
provides for 256 characters via every possible 8-bit value.
[snip]
The name Latin-1 is an informal alias [for ISO-8859-1] unrecognized by ISO
or the IANA, but is perhaps meaningful in some computer software."

But let's not start accepting \0 ;-)

cheers

andrew
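
The byte ranges in the quoted passage can be tallied with a short Python sketch. It assumes Unicode's general category `Cc` (control) as the classification, which is a convenience of this example and not something the thread relies on; it also relies on the fact that Latin-1 byte N corresponds to Unicode code point N.

```python
import unicodedata

# Classify each of the 256 code values by Unicode general category.
control = [b for b in range(256) if unicodedata.category(chr(b)) == "Cc"]
graphic = [b for b in range(256) if b not in control]

# Control characters sit at 00-1F, 7F and 80-9F, as the quote says.
expected_control = list(range(0x00, 0x20)) + [0x7F] + list(range(0x80, 0xA0))
assert control == expected_control

print(len(control), "control values,", len(graphic), "graphic values")
```

Together the two groups cover every possible 8-bit value, which matches the quote's "256 characters" remark.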





#4
04-11-2008, 04:24 AM
Christopher Kings-Lynne

>>Is PostgreSQL supposed to enforce a LATIN1/ISO-8859-1 encoding if that's
>>the database encoding?

>
> AFAIK, there are no illegal characters in 8859-1, except \0 which we
> do reject.


Hmmm...

It turns out I was confused by the developer who reported this issue.
Basically, their requirement is to accept only the part of LATIN1 that
converts to single-byte UTF-8 (i.e. 7-bit ASCII).

Only about 8 of these high-bit characters existed in our database, so I
replaced them and put a CHECK constraint on a few fields, like this:

CHECK (description = convert(description, 'ISO-8859-1', 'UTF-8'))

Can I put in a request for a '7-bit ASCII' encoding for PostgreSQL?

Chris
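
The idea behind the CHECK constraint above can be sketched outside the database. This Python example is an illustration only, with a hypothetical helper name `is_seven_bit`: a Latin-1 string is pure 7-bit ASCII exactly when recoding it to UTF-8 leaves the bytes unchanged, because every character from 0x80 upwards takes at least two bytes in UTF-8.

```python
def is_seven_bit(text: str) -> bool:
    """Hypothetical helper: True if recoding Latin-1 to UTF-8 changes nothing."""
    try:
        latin1_bytes = text.encode("latin-1")
    except UnicodeEncodeError:
        # Not representable in Latin-1 at all, so certainly not ASCII.
        return False
    return latin1_bytes == text.encode("utf-8")


print(is_seven_bit("plain description"))
print(is_seven_bit("caf\u00e9"))
```

The comparison mirrors the constraint: any high-bit character makes the recoded value differ from the original, so the row would be rejected.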

#5
04-11-2008, 04:24 AM
Alvaro Herrera

On Wed, Apr 13, 2005 at 10:10:32AM +0800, Christopher Kings-Lynne wrote:

> Can I put in a request for a '7 bit ascii' encoding for PostgreSQL


Given all the problems with unwanted recoding I've seen, I think such an
encoding should be the default instead of unchecked-8-bits SQL_ASCII :-(

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Amanece. (Ignacio Reyes)
El Cerro San Cristóbal me mira, cínicamente, con ojos de virgen"

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

#6
04-11-2008, 04:24 AM
Christopher Kings-Lynne

> Given all the problems with unwanted recoding I've seen, I think such an
> encoding should be the default instead of unchecked-8-bits SQL_ASCII :-(


I agree, but that would be a backwards-compatibility nightmare.

Chris

All times are GMT.

