Image enabling gnucash?

Fri, 23 Feb 2001 11:27:23 -0500

Christopher Browne wrote:

 > On Thu, 22 Feb 2001 19:33:32 EST, the world broke into rejoicing as

 > Imaging doesn't typically have too much to do with "screen scraping"
 >  from what I've seen.  [For those not familiar with the term, it's
 > usually referring to the situation where you treat a mainframe
 > "host" as a captive application.  Your program grabs the fields
 > written to the screen, and writes data back as needed to control
 > the system.  The typical vendor for this is AttachMate.  In
 > UnixLand, the comparable thing is to write Expect scripts.  More
 > recently, in "WindowsLand," the analogous sort of thing is to
 > capture Win32 events, with the big vendor being Mercury Interactive...
 >   But I digress...]

Maybe I should have made myself clearer. Screen scraping won't make sense
in the case of gnucash. The technique is used to image-enable an
accounting app, say one running on an AS/400, while the imaging engine runs
on Oracle on HP-UX. The index data passed by the terminal program is
then used to fetch the images from the imaging engine. I used to do this
with WANG Open/IMAGE, which has since been sold to Eastman Software.

 > -> There needs to be a significant amount of "metadata" _TIGHTLY_ attached
 >  to each such document so that it can be readily cross-indexed. Preferably
 >  that data should be inside the document so that it all remains
 > atomic if I copy files from one place to another.

Most imaging vendors make the image directories off-limits to the average
user and provide utilities or HSM to manage migration to high-capacity
storage. At the low end, many vendors use CD-R for permanent storage, and
the migration is done with utilities which update the database
tables. If you need to keep the CD-R online, you can put it in a CD
tower and map a drive letter (sigh) to the server...all this is very
Windows-centric these days :(

 > It is unacceptable if moving a document from one place to another makes
 >  it "lost" from the system...

This is definitely a headache unless you use BLOBs. But BLOBs bring
their own set of problems too...difficulty in migrating to high-capacity
storage such as jukeboxes, for instance.

 > -> There needs to be an organized way of collecting the documents together
 >  so that saving to CD or moving to a new HD is straightforward and
 > safe.

Agreed. This would have to be built in. It would be like HTML directories
on web sites: you have to set up appropriate permissions.

 > -> It needs to be downright _easy_ to attach metadata to documents,
 > including dates, descriptions, and arbitrary additional sets of
 > tags that _I_ may need to define.

One way is to keep all this in a db table indexed by say 'transaction 
id' or some unique ID. (gnucash has something like that I presume)

 > -> There needs to be a database that collects the metadata and allows
 >    efficient lookups of documents.  "I want all the NationsBank and Bank
 >    of America statements."

The scan or attach button in register windows could bring up a dialog
which would allow you to enter values for user-defined keys, in addition
to the ones the program automatically indexes the documents by.
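
To make the idea a bit more concrete, here is a rough sketch of such an
index: one table keyed by a transaction ID, plus a table of user-defined
key/value tags, and a lookup like the "all the NationsBank and Bank of
America statements" query. Everything in it is an assumption for
illustration only: the table and column names, the function names, and the
use of SQLite (chosen only so the sketch is self-contained; the back end
could just as well be mysql).

```python
import sqlite3

def open_index(path=":memory:"):
    """Create the (hypothetical) index tables if they do not exist yet."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS document (
            doc_id      INTEGER PRIMARY KEY,
            txn_id      TEXT NOT NULL,   -- key supplied by the accounting app
            filename    TEXT NOT NULL,   -- where the image file lives
            doc_date    TEXT,
            description TEXT
        );
        CREATE INDEX IF NOT EXISTS document_txn ON document (txn_id);
        CREATE TABLE IF NOT EXISTS tag (
            doc_id INTEGER NOT NULL REFERENCES document (doc_id),
            key    TEXT NOT NULL,        -- user-defined key
            value  TEXT NOT NULL
        );
        CREATE INDEX IF NOT EXISTS tag_kv ON tag (key, value);
    """)
    return db

def attach(db, txn_id, filename, doc_date=None, description=None, **tags):
    """Record one document and any user-defined tags; return its doc_id."""
    cur = db.execute(
        "INSERT INTO document (txn_id, filename, doc_date, description) "
        "VALUES (?, ?, ?, ?)",
        (txn_id, filename, doc_date, description),
    )
    doc_id = cur.lastrowid
    db.executemany(
        "INSERT INTO tag (doc_id, key, value) VALUES (?, ?, ?)",
        [(doc_id, k, str(v)) for k, v in tags.items()],
    )
    db.commit()
    return doc_id

def find_by_tag(db, key, values):
    """Return filenames whose tag `key` matches any of `values`."""
    marks = ",".join("?" for _ in values)
    rows = db.execute(
        "SELECT DISTINCT d.filename FROM document d "
        "JOIN tag t ON t.doc_id = d.doc_id "
        f"WHERE t.key = ? AND t.value IN ({marks}) "
        "ORDER BY d.filename",
        [key, *values],
    )
    return [row[0] for row in rows]
```

A call such as find_by_tag(db, "bank", ["NationsBank", "Bank of America"])
would then return the matching statement files.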

 > -> There needs to be a way of associating images together.  My credit
 >    union statement often consists of four pages, for instance.

Multipage TIFF can work here. You can create those while scanning.

 > -> It would be a slick idea to allow a variety of kinds of images.
 >    Postscript, TIFF, JPEG, all would be valuable.

For sure. I would even venture further and allow any type of file and, on
retrieval, launch the app using MIME types.
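
As a rough sketch of that retrieval step: guess the MIME type from the
filename and pick a viewer from a table. The viewer names below are
placeholders I made up, not real programs, and viewer_command is a
hypothetical helper; only the mimetypes module is standard Python.

```python
import mimetypes

# Hypothetical viewer table; the program names are placeholders only.
VIEWERS = {
    "image/tiff": "tiff-viewer",
    "image/jpeg": "image-viewer",
    "application/postscript": "ps-viewer",
}

def viewer_command(filename, viewers=VIEWERS, default="generic-viewer"):
    """Return the command (as an argument list) used to open `filename`."""
    # guess_type looks only at the file extension, not the file contents.
    mime_type, _encoding = mimetypes.guess_type(filename)
    # Unknown or unmapped types fall back to the default viewer.
    program = viewers.get(mime_type, default)
    return [program, filename]
```

The returned list could be handed to something like subprocess.run to
actually launch the viewer.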

 > -> Note that if Postscript is on the list, that means that I could
 >    set up a print queue that grabs documents from the web and
 >    pushes them in. Thus, I could preserve Citibank "electronic bank
 >    statements" by printing from Netscrape to print queue -Parchive...
 >
 > It could well be that all this stuff gets tossed into a set of directories
 > where each file is named based on [say] its MD5 checksum to maximize
 > uniqueness of names.

Many of the imaging systems I have seen for medium-sized businesses (100K
docs / yr) use some filename generation / subdirectory creation /
nesting algorithm which can be tuned to the FS.
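
A minimal sketch of how the two ideas could combine: name each file by the
MD5 checksum of its contents, and nest it under subdirectories taken from
the leading characters of that checksum. The function name and the
depth/width knobs are my own assumptions, not taken from any particular
product; they stand in for the "tuning to the FS" part.

```python
import hashlib
from pathlib import PurePosixPath

def storage_path(data, suffix="", depth=2, width=2):
    """Build a nested relative path for a document from its MD5 checksum.

    `depth` is the number of subdirectory levels and `width` the number of
    hex characters per level; both are hypothetical tuning knobs.
    """
    digest = hashlib.md5(data).hexdigest()
    # Successive slices of the digest become the directory levels.
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return PurePosixPath(*parts, digest + suffix)
```

With the defaults, 100K documents a year spread over up to 65536 leaf
directories, so no single directory gets large. Identical files also map to
the same name, which may or may not be what you want.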

 > If you try to make it have anything to do with "screen scraping," a big
 > punch in the nose is likely mandated.  :-)

:) No such thing will be necessary...besides I need that nose to smell 
out all those businesses which need imaging.

 > But aside from that, there would be _tremendous_ value to connecting in
 > a sort of "poor man's imaging system."
 >
 > Indeed, that is something that is _well_ worthy of its own project in its
 > own right.  I would suggest that you _not_ make it GnuCash-specific, but
 > rather try to define some clean interfaces for connecting them together.
 > As you suggest, a few hooks are all that are needed to make it link
 > together.
 > --

My thoughts precisely. I think I did touch upon that in my original posting:
"Back end could be mysql and the imaging engine can have its own API so
that other apps can share the functionality."

That's sort of where I'd like to head.

bakki
-- 
   .-.    | Bakki Kudva__________________Open Source EDMS______
   oo|    | Navaco                       ph:  814-833-2592
  /`'\    | 420 Pasadena Drive           fax: 603-947-5747
(\_;/)   | Erie, PA 16505               http://www.navaco.com/