This is a discussion on char or int vs varchar within the pgsql Novice forums, part of the PostgreSQL category; --> Hi All, I am building a new table and I would appreciate your thoughts on a data storage and ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi All, I am building a new table and I would appreciate your thoughts on a data storage and performance question I need to store a (at the moment) tristate variable. Now this could be done as a char "L", "S", "T" as an int2 1, 2, 3 or as a varchar(8) "loose", "standard", "tight" among possibly other ways. For absolute storage minimum char would be the best at 1 byte but the documentation indicates that it is an internal type so I do not know whether or not I can really use it. If not then character(1) would have to be used and it looks like that would use up 5 bytes. For readability varchar(8) obviously wins. The documentation assures me that other than using 12 bytes there is "no performance differences between these three types". Finally there is the smallint at 2 bytes. Almost the smallest in terms of disk usage I wonder how it will affect performance since in most cases I would probably have to use a CASE statement to convert to something a human would intuitively understand. Kind Regards, Keith ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| > -----Original Message----- > From: pgsql-novice-owner@postgresql.org [mailto > owner@postgresql.org] On Behalf Of Keith Worthington > Sent: Friday, April 22, 2005 1:21 PM > To: PostgreSQL Novice > Subject: [NOVICE] char or int vs varchar > I need to store a (at the moment) tristate variable. Now this could be > done > as a char "L", "S", "T" as an int2 1, 2, 3 or as a varchar(8) "loose", > "standard", "tight" among possibly other ways. Indexing is fastest on integer types. Char types are probably stored underneath as integers (guessing.) VARCHARs are variable allocation and when you change the string length it may have to reallocate. Generally integers are fastest, then single characters then CHAR[x] types and then VARCHARS. String compares in your queries are an issue, as well. "... where x = 'loose'" could be an issue if you query it a lot. CHAR[1] would work, or you can use a bit field (true,false,not-set) or if you're creative, 2 bits (01,11,10,11) and you don't use one set. Check out the "bytea" types, but it's got a 4 byte header so you'll end up using 33 bits. :P Personally, I'd use an "char" (note quotes - not a char(1) ) or an unsigned smallint (in that order) so I don't have to worry about a sign-flag, I have extended room for more flags and it indexes quickly. > Finally there is the smallint at 2 bytes. Almost the smallest in terms of > disk usage I wonder how it will affect performance since in most cases I > would > probably have to use a CASE statement to convert to something a human > would > intuitively understand. Using "full strings of varchar(8)" is probably your slowest and least efficient route...but the most readable. I don't think Postgres does enum() fields. That would probably be ideal in this case, since you could use human-readable strings and it'd store/index based on the numeric offset. Oh well. G'luck. -Mike / Szii ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |