DB design document

David Merrill dmerrill@lupercalia.net
Sun, 17 Dec 2000 21:47:35 -0500


On Sun, Dec 17, 2000 at 05:21:27PM -0600, Rob Browning wrote:
> David Merrill <dmerrill@lupercalia.net> writes:
> 
> > > We need to be more specific about what we need in terms of query
> > > efficiency.  It may make sense to use one storage/indexing strategy
> > > for non-text fields and a completely different one for text fields.
> > > It's my opinion that having arbitrarily long text fields is a
> > > prerequisite that other factors need to work around.  
> 
> [...]
> 
> > What thoughts do you have on how the data should be "searched and
> > processed"?
> 
> I think what Bill's getting at is that these text fields, let's take
> the "memo" field for example, may very rarely be the subject of
> search, or any other operation except setting/getting, and all of us
> feel pretty strongly that for a field like this, arbitrary length may
> very well be more important than efficiency.
> 
> For example, if we did decide that setting/getting for the memo
> fields was the only thing that mattered 99.9% of the time, it might
> make sense to store memos in some alternate way in the DB (other than
> just in a giant fixed-length column which would be mostly empty) so
> that their highly variable lengths don't screw up the efficiency of
> operations on the more heavily indexable/computable numeric fields,
> and don't waste so much space.
> 
> Now I'm not proposing any solution here, and even my example above
> might be bogus, but I can imagine a DB that allows an arbitrary length
> field (say a blob) that wouldn't be very efficient if stored in a
> table with a bunch of other fixed-length fields.  In that situation,
> perhaps we make a table of all the fixed-length (the more
> computational/indexable) fields, and another table of blobs that are
> keyed to the first table by GUID.
> 
> I think this is the kind of question Bill's getting at, and regardless
> it's certainly one I'd like to see discussed.

I think it's been addressed in the current version of the design doc.
Nearly all text fields other than GUIDs are variable length, and
essentially unlimited in length.
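
For anyone who wants to see the two-table layout Rob describes in
concrete terms, here is a minimal sketch. It is not taken from the
design doc: the table names (split_fixed, split_memo), the columns, and
the helper functions are all hypothetical, and SQLite is used only
because it ships with Python. The fixed-size, indexable fields live in
one table; the arbitrary-length memo lives in a second table keyed by
the same GUID.

```python
import sqlite3
import uuid


def make_db():
    """Create an in-memory database with the hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE split_fixed (
            guid   TEXT PRIMARY KEY,    -- 32 hex characters
            amount INTEGER NOT NULL,    -- e.g. cents
            posted TEXT NOT NULL        -- ISO date string
        );
        CREATE TABLE split_memo (
            guid TEXT PRIMARY KEY REFERENCES split_fixed(guid),
            memo TEXT NOT NULL          -- arbitrary length
        );
    """)
    return conn


def add_split(conn, amount, posted, memo=None):
    """Insert one record; a memo row is written only when a memo exists."""
    guid = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO split_fixed VALUES (?, ?, ?)", (guid, amount, posted)
        )
        if memo is not None:
            conn.execute("INSERT INTO split_memo VALUES (?, ?)", (guid, memo))
    return guid


def get_memo(conn, guid):
    """Plain get by GUID; returns None when no memo was stored."""
    row = conn.execute(
        "SELECT memo FROM split_memo WHERE guid = ?", (guid,)
    ).fetchone()
    return row[0] if row else None


def total_amount(conn):
    """A numeric query that never touches the memo table."""
    return conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM split_fixed"
    ).fetchone()[0]
```

Records without a memo cost nothing in the memo table, and the numeric
query reads only the fixed-width table. Whether a split like this is
worth it depends on the backend; many databases already store
variable-length text out of line, in which case a single table with a
variable-length column may do just as well.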

-- 
Dr. David C. Merrill                     http://www.lupercalia.net
Linux Documentation Project                dmerrill@lupercalia.net
Collection Editor & Coordinator            http://www.linuxdoc.org
                                       Finger me for my public key

We are the flow, and we are the ebb.
We are the weavers, we are the web.