Text fields in the DB (was Re: DB design document)

Amitha Perera perera@cs.rpi.edu
Thu, 14 Dec 2000 17:39:46 -0500


As far as I know:
- The only way to avoid a maximum length on a (text) field is to use a
binary object. These are called BLOBs, CLOBs, binary, and other things
by different database systems. Even these are often limited by the
implementation, but the limit is often large enough (typically >2K) to
be considered infinite for memo field purposes.

- varchar(n) is similar to char(n), except that a varchar(n) field stores
the length of the string along with the string, while char(n) field
assumes the string has length n. Thus, if you write "X" to a
varchar(5) field and to a char(5) field, the DB would store ("X    ",1)
and ("X    ") respectively. If you retrieve the string, you'd get "X"
and "X    " respectively. Both varchar(n) and char(n) fields occupy on the
order of n bytes in the database, and is located near the other fields
in the same row.

I think Patrick's comments on performance hits when using binary
objects is correct. If your binary objects are, say, pictures, then
the cost of reading them may outweigh the cost of finding it on
disk. If your binary objects are about 50 bytes, then it is much, much
better to have it near the rest of the stuff your are reading.

Spatio-temporal locality "laws" are quite valid, and should not be
discarded lightly. I would strengthen Patrick's statement to "if you
plan on having frequently accessed data".

Amitha.

On Thu, 14 Dec 2000 12:51:44 -0500, David Merrill wrote:
> On Thu, Dec 14, 2000 at 09:54:44AM -0600, Patrick Spinler wrote:
> > David Merrill wrote:
> > > 
> > > You can't avoid having a limit on text fields, but you can make them
> > > very large.
> > 
> > In short, if you plan on having updatable data, use varchar() columns,
> > for which most databases preallocate space in the record.  This means
> > some wastage of data storage, but you get a lot of positive performance
> > boosts because of it.
> 
> I think you have this backwards. varchar means "variable length char",
> while 'char' or sometimes 'text' is fixed length. By variable vs fixed
> I mean how it is stored on disk, not the constraints on valid entries.