This is a discussion on CREATE TABLE with mediumblob, but most of the data will never be that big within the MySQL forums, part of the Database Server Software category; --> Hi, I want to create a table to hold some user comments. Some comments will be very short while ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I want to create a table to hold some user comments. Some comments will be very short while others could be rather long. I want to allow long comments rather than restrict my users. From a purely technical point of view will I be wasting space/resources if I have 'mediumblob' rather than 'blob' or 'tinyblob' given that most of the time I will not have data bigger than a 'blob'/'tinyblob'? Will my script be any slower? I will do no SELECT query in the data itself, (as all the comments are linked to a page ID i will rather have an index on the ID). What do you think Thanks FFMG -- 'webmaster forum' (http://www.httppoint.com) | 'Free Blogs' (http://www.journalhome.com/) | 'webmaster Directory' (http://www.webhostshunter.com/) 'Recreation Vehicle insurance' (http://www.insurance-owl.com/other/car_rec.php) | 'Free URL redirection service' (http://urlkick.com/) ------------------------------------------------------------------------ FFMG's Profile: http://www.httppoint.com/member.php?userid=580 View this thread: http://www.httppoint.com/showthread.php?t=22335 Message Posted via the webmaster forum http://www.httppoint.com, (Ad revenue sharing). |
| |||
| == Quote from FFMG (FFMG.305s57@no-mx.httppoint.com)'s article > Hi, > I want to create a table to hold some user comments. > Some comments will be very short while others could be rather long. > I want to allow long comments rather than restrict my users. > From a purely technical point of view will I be wasting space/resources > if I have 'mediumblob' rather than 'blob' or 'tinyblob' given that most > of the time I will not have data bigger than a 'blob'/'tinyblob'? > Will my script be any slower? > I will do no SELECT query in the data itself, (as all the comments are > linked to a page ID i will rather have an index on the ID). > What do you think > Thanks > FFMG the storage requirements for different data types are well publicized and you can find them in the manual. nobody can tell you how to store your data. you need to know your data and store in the most efficient way allowing for users to save most if not all of thier data all the time. -- POST BY: lark with PHP News Reader ;o) |
| |||
| lark;103938 Wrote: > > the storage requirements for different data types are well publicized > and you can find them in the > manual. nobody can tell you how to store your data. you need to know > your data and store in the most > efficient way allowing for users to save most if not all of thier data > all the time. > I am not sure I understand what you are trying to say. I guess you are telling me to read the manual, but I doubt this will really help me with my question. I know how much data a tinyblob/blob/mediumblob can hold and I know what data I want to store, and how I want to store it. But in the making of my decision I also need to make sure that I will not slow the execution time for the sake of saving large amount of data once in a blue moon. In other words I know that I want a medium blob, but I know that the data in the blob will often be very small. So, is the cost of having a medium blob worth it. I hope that my question was a bit clearer now. Thanks FFMG -- 'webmaster forum' (http://www.httppoint.com) | 'Free Blogs' (http://www.journalhome.com/) | 'webmaster Directory' (http://www.webhostshunter.com/) 'Recreation Vehicle insurance' (http://www.insurance-owl.com/other/car_rec.php) | 'Free URL redirection service' (http://urlkick.com/) ------------------------------------------------------------------------ FFMG's Profile: http://www.httppoint.com/member.php?userid=580 View this thread: http://www.httppoint.com/showthread.php?t=22335 Message Posted via the webmaster forum http://www.httppoint.com, (Ad revenue sharing). |
| |||
| FFMG wrote: > lark;103938 Wrote: >> >> the storage requirements for different data types are well publicized >> and you can find them in the >> manual. nobody can tell you how to store your data. you need to know >> your data and store in the most >> efficient way allowing for users to save most if not all of thier >> data all the time. >> > > I am not sure I understand what you are trying to say. I guess you are > telling me to read the manual, but I doubt this will really help me > with my question. > > I know how much data a tinyblob/blob/mediumblob can hold and I know > what data I want to store, and how I want to store it. > > But in the making of my decision I also need to make sure that I will > not slow the execution time for the sake of saving large amount of > data once in a blue moon. > > In other words I know that I want a medium blob, but I know that the > data in the blob will often be very small. So, is the cost of having a > medium blob worth it. > > I hope that my question was a bit clearer now. > > Thanks > > FFMG The manual will tell you the "cost" of a medium blob field. |
| |||
| > > I hope that my question was a bit clearer now. > Per the manual, the storage is: BLOB: L + 2 bytes MEDIUMBLOB: L + 3 bytes LONGBLOB: L + 4 bytes The extra bytes are used to store the length of the data in the field. This means that regardless of which storage you use, the length of the data is known. The only performance loss over using a LONGBLOB that's entirely full as opposed to something like a VARCHAR(25) is the time it takes for the disk to seek in a full table scan. That being said, if you're using indexes properly, then there's no performance loss. If you plan on doing full table scans often, you should use a fixed row format and create another table to hold the variable entries with an index/primary key. When you need the value, do a join. If you use variable column types and do a full table scan, MySQL has to check the length of every variable field so it knows how far to see to get to the next row of data. As you can imagine, this is painfully slow. Regards, Michael Martinek |
| |||
| Paul Lautman;103969 Wrote: > > The manual will tell you the "cost" of a medium blob field. Thanks for all the help guys. I have been around forums long enough to know that my question was not as straight forward as 'read the manual'. But thanks anyway. FFMG -- 'webmaster forum' (http://www.httppoint.com) | 'Free Blogs' (http://www.journalhome.com/) | 'webmaster Directory' (http://www.webhostshunter.com/) 'Recreation Vehicle insurance' (http://www.insurance-owl.com/other/car_rec.php) | 'Free URL redirection service' (http://urlkick.com/) ------------------------------------------------------------------------ FFMG's Profile: http://www.httppoint.com/member.php?userid=580 View this thread: http://www.httppoint.com/showthread.php?t=22335 Message Posted via the webmaster forum http://www.httppoint.com, (Ad revenue sharing). |
| |||
| Michael Martinek;103982 Wrote: > > The extra bytes are used to store the length of the data in the field. > This means that regardless of which storage you use, the length of the > data is known. The only performance loss over using a LONGBLOB that's > entirely full as opposed to something like a VARCHAR(25) is the time > it takes for the disk to seek in a full table scan. > But I think I will be ok in my case as most of the data will be 'short'. So the seek time should not be significant. Michael Martinek;103982 Wrote: > > If you plan on doing full table scans often, you > should use a fixed row format and create another table to hold the > variable entries with an index/primary key. Thanks, I am not planning on doing any queries on the data itself so I should be ok I think. All my queries will be on a 'page_id', (SELECT * FROM 'table' where page_id='x'). So like you said the only data lost will be seek time and I don't expect the data to be 100% full all the time. Thanks for the info. FFMG -- 'webmaster forum' (http://www.httppoint.com) | 'Free Blogs' (http://www.journalhome.com/) | 'webmaster Directory' (http://www.webhostshunter.com/) 'Recreation Vehicle insurance' (http://www.insurance-owl.com/other/car_rec.php) | 'Free URL redirection service' (http://urlkick.com/) ------------------------------------------------------------------------ FFMG's Profile: http://www.httppoint.com/member.php?userid=580 View this thread: http://www.httppoint.com/showthread.php?t=22335 Message Posted via the webmaster forum http://www.httppoint.com, (Ad revenue sharing). |
| |||
| > > The extra bytes are used to store the length of the data in the field. > > This means that regardless of which storage you use, the length of the > > data is known. The only performance loss over using a LONGBLOB that's > > entirely full as opposed to something like a VARCHAR(25) is the time > > it takes for the disk to seek in a full table scan. > > But I think I will be ok in my case as most of the data will be > 'short'. So the seek time should not be significant. > You'll be fine. See below. > > > If you plan on doing full table scans often, you > > should use a fixed row format and create another table to hold the > > variable entries with an index/primary key. > > Thanks, I am not planning on doing any queries on the data itself so I > should be ok I think. > All my queries will be on a 'page_id', (SELECT * FROM 'table' where > page_id='x'). > If you're indexing on page_id, then you're not going to see any problems on seek times (reference above). Since the index will show the offset of the row, you'll be safe since MySQL will know how to get to that row immediately rather than processing all the existing entries (full table scan). Just be sure to keep fixed columns sizes before variable data.. so you won't need to have MySQL reading column lengths to get to a fixed length column. I'm not sure of your MySQL experience, so please don't take this as an insult if you already know this. But for example, if you have an int unsigned named page_views and you stick it after 20 variable length columns, MySQL will need to check the length of those 20 columns before it can get page_views.. whereas if you put all fixed columns first, MySQL can just increment the offset by the size of that column type. > So like you said the only data lost will be seek time and I don't > expect the data to be 100% full all the time. > That seek time loss that you'd be seeing is if you're doing full table scans. If you're using an index, like on page_id.. then you'll be fine. MySQL will have the offset to where the row is, and even if your table is several gigs in size because of that data.. as long as page_id is indexed, you'll get results back immediately. > Thanks for the info. No problem! When in doubt, use EXPLAIN <query>. For example: EXPLAIN SELECT * FROM 'table' where page_id = '25'; It'll show you if it's using an index or not. What type of optimizations it'll be employing, etc. Really nifty. Again, if you knew all this-- great. If not, then others who come across it in the future may learn something (we were all novice at one time).. hence the basics and examples. Regards, Michael Martinek |
| ||||
| Michael Martinek;105331 Wrote: > > If you're indexing on page_id, then you're not going to see any > problems on seek times (reference above). Since the index will show > the offset of the row, you'll be safe since MySQL will know how to get > to that row immediately rather than processing all the existing > entries (full table scan). Just be sure to keep fixed columns sizes > before variable data.. so you won't need to have MySQL reading column > lengths to get to a fixed length column. I'm not sure of your MySQL > experience, so please don't take this as an insult if you already know > this. But for example, if you have an int unsigned named page_views > and you stick it after 20 variable length columns, MySQL will need to > check the length of those 20 columns before it can get page_views.. > whereas if you put all fixed columns first, MySQL can just increment > the offset by the size of that column type. > Thanks, I did know that, but I totally forgot about it I'll check my table structures just to make sure. Michael Martinek;105331 Wrote: > > > So like you said the only data lost will be seek time and I don't > > expect the data to be 100% full all the time. > > > That seek time loss that you'd be seeing is if you're doing full table > scans. If you're using an index, like on page_id.. then you'll be > fine. MySQL will have the offset to where the row is, and even if your > table is several gigs in size because of that data.. as long as > page_id is indexed, you'll get results back immediately. > > > Thanks for the info. > No problem! When in doubt, use EXPLAIN <query>. For example: > EXPLAIN SELECT * FROM 'table' where page_id = '25'; >[/color] EXPLAIN was telling me that I was doing the right thing. But I wanted to make sure that I was not going to loose a large amount of seek time simply for the sake of one or two record been very large. Because I will never SELECT any of the blob data I will probably not see any data loss. Thanks again, FFMG -- 'webmaster forum' (http://www.httppoint.com) | 'Free Blogs' (http://www.journalhome.com/) | 'webmaster Directory' (http://www.webhostshunter.com/) 'Recreation Vehicle insurance' (http://www.insurance-owl.com/other/car_rec.php) | 'Free URL redirection service' (http://urlkick.com/) ------------------------------------------------------------------------ FFMG's Profile: http://www.httppoint.com/member.php?userid=580 View this thread: http://www.httppoint.com/showthread.php?t=22335 Message Posted via the webmaster forum http://www.httppoint.com, (Ad revenue sharing). |