Thoughts about the stock quote database

Jean-David Beyer jdbeyer@exit109.com
Sun, 17 Dec 2000 21:39:37 -0500


Bill Gribble wrote:
> 
> On Sun, Dec 17, 2000 at 08:38:36PM -0500, Jean-David Beyer wrote:
> > Do gnc_commodities have unique keys that identify them? E.g., does the
> > commodity
> >
> > {stock, company_name McAfee Associates, ticker MCAF, other_data}
> >
> > have as one of its keys a unique identifier that is the same as the
> > unique identifier of the commodity
> >
> > {stock, company_name Network Associates, ticker NETA, other data} ?
> 
> These are two different questions, IMO.  Do gnc_commodities have
> unique keys?  Right now, they are required to have a unique key that's
> a combination of their exchange and ticker symbol, so yes.  Given your
> examples about recycling ticker symbols over time, it seems that this
> isn't good enough, but I think that's a somewhat minor problem.
> Accounts have pointers to gnc_commodities, not just names.  If the
> problem is picking the right stock for the account, there are plenty
> of UI ways to make sure you get the right one.
> 
> Your second question: what do we do about stocks that change names
> and/or ticker symbols?  This is an issue we haven't really addressed.
> There are several possible approaches: store "alias" information with
> the structure (possibly dated); require a new gnc_commodity to be used
> and allow for a transfer of value from one to the other; others.  This
> is a significant problem, no doubt.  Right now the user could manually
> do the second of these, and we could easily add support for the first
> as we redo the quote system this time around.
> 
> > Because they are the same company, but at different times.
> 
> I'm not so sure about that.  It may be the same company, but it's not
> the same stock.  I would guess that NETA and MCAF have different CUSIP
> ID's; they are fundamentally different securities.

Actually, I should have picked a different company for the example,
because NETA is the result of a merger between MCAF and something else
that I have forgotten, and assumed a new name around the very end of
1997 and the beginning of 1998. I hate mergers because sometimes it is
considered that the resulting company is the same as one of the earlier
companys, and sometimes it is considered that the merged company is a
new company, at least judging by the CRSP PERMNOs (the unique
identifiers used by these guys:
http://gsbwww.uchicago.edu/research/crsp/ .

I am pretty sure that the company that was MCAF was the surviving
corporation that picked the new name and ticker. So if I guess right
(and there is no reason to assume so, but even if it is not right in
this case, it is in other cases), NETA would have the same CUSIP number
as the old MCAF did, and the new MCAF (see immediately below) would have
a new CUSIP number. I have access to some CRSP data, but it is too old
to know about the switch from MCAF to NETA. But to complicate matters,
NETA spun off MCAFEE.COM that is now using the MCAF ticker around
1999-12-31. I think I have owned MCAF and NETA in the past, but I cannot
recall if I had any when the merger took place, so I do not know if
shareowners got new certificates or not.

What I do for a database that includes stock prices for another
application is that I assign each company a unique company id, and there
is a relation that just has this ID as a primary key and another field
with one of the names of that company. The name is not really needed for
anything, but it sure improves the human factors when examining the
database. I cannot use the CRSP PERMNO or the CUSIP because I cannot
find either number for all stocks, so I just generate sequential unique
numbers. This is satisfactory for what I am doing. (Actually, I have the
dbms do the generation so that if multiple applications were making new
entries at the same time, the transaction management of the dbms would
avoid concurrent update problems with duplicate supposedly unique
identifiers.)

I have another relation with the ticker symbol, the unique company id,
and the beginning date, and the ending date when that ticker was used
for that company. For a company with different classes of stock, each
class is assumed to belong to a different company. The unique company id
in this relation is a foreign key that refers to the relation that
contains the unique company id for the company.

There is a similar relation with the names of the companies, since the
names do not always switch at the same time as the ticker. 

(The relations of interest are prices, dividends, splits, etc., with a
primary key that is {unique company id, date} )

Furthermore, there is a constraint that a company must have exactly one
ticker symbol at a time, but it can have multiple names at the same
time. This is useful for me, but probably unnecessary for GnuCash,
because I get data from many sources, and they all use different names
for the same company at the same time. (In fact, some use the wrong
ticker symbols because they do not change them when they should, or they
do change them, but then change them back, and then change them forward
again. Very vexing.)

Another vexation is when the same company trades on more than one
exchange. It will have different ticker, such as XXX on the NYSE and
XXX.T or XXX.TO on the Toronto exchange (quoted in different currencies,
I suppose). Are these really different companies? I am forced by my
design to assume so. I do now fully understand the implications of this,
so I am ignoring it for now.
> 
> > So if someone who owns 100 shares of NETA wants to know what he paid
> > for it, but he bought 50 shares of MCAF at an earlier time, and it
> > split 2:1 somewhere in the meantime, would he have an easy way of
> > finding it all out?
> 
> I would hope so :)
> 
> Bill Gribble

-- 
 .~.  Jean-David Beyer           Registered Linux User 85642.
 /V\                             Registered Machine    73926.
/( )\ Shrewsbury, New Jersey
^^-^^ 9:05pm up 13 days, 5:53, 2 users, load average: 3.91, 3.76, 3.61