Testing reports

Tue Apr 10 13:56:14 EDT 2012

John Ralls <jralls at ceridwen.us> writes:

> On Apr 9, 2012, at 10:53 PM, Colin Scott wrote:
>
>> 
>> Yes, I accept that one might argue about whether the output is the HTML or the report.  I would incline to the former, on the grounds that if the HTML doesn't change then neither does the report, and if the HTML does then the report might (or might not).  I guess it depends, as you suggest, on how you create the HTML - and in general I don't disagree with anything you say.  I am sure you similarly understand where I am coming from ...  :-)
>> 
>> If my approach seems simplistic, then it is deliberately so.  I found over many years that it is unhelpful to introduce needless complexities, and that it is always easier to start simple and add the complexities one finds to be necessary, rather than starting complex and then finding that some of the initial complexity is superfluous.  This approach generally gave me more robust results, and got them quicker ...
>> 
>> Colin
>> 
>> -------- Original Message --------
>> 
>> *Subject:* spam,Re: spam,Re: Testing reports
>> *From:* Colin Law <clanlaw at googlemail.com>
>> *To:* gnucash at double-bars.net, gnucash-user at gnucash.org
>> *Date:* Tue, 10 Apr 2012 13:26:49 +0100
>> 
>> On 10 April 2012 04:32, Colin Scott <gnucash at double-bars.net> wrote:
>>> 
>>>> Can one not find an html parser that will read the html into a DOM
>>>> tree?  Then one could walk the tree comparing the tags, attributes,
>>>> and contents with a DOM from a reference page.  That way minor
>>>> changes in the html that do not change the resulting displayed page
>>>> will be ignored.
>>> 
>>> I expect you are right, but I've never worked with HTML at that sort of level, so please pardon my ignorance!
>>> 
>>> Besides, I was making two points.  The first was that one needs to weigh the cost of what one is doing against the benefits.  If it's easy, then maybe it's worth doing - but maybe not, read on!  :-)
>>> 
>>> The second was that *ANY* change to the HTML output needs to be flagged and checked - if you have something that works, it shouldn't be changed at all unless there is a very good reason for making the change.  If the output changes, and the changed HTML is determined to be correct, then it should become the reference against which future test output is compared.  IMHO one should *NEVER* automatically accept a change in output without it having been thoroughly scrutinised and checked by a human!
>> 
>> It is a bit of a philosophical point really, but one could argue that
>> the output from the software is a report rather than a chunk of html.
>> What matters is what the report looks like, and if the Document Object
>> Model has not changed then it will look the same whatever the details
>> of the underlying html.  Using the DOM allows the test to relate to
>> the end result, with tests such as "there should be a paragraph of a
>> particular class containing particular text", rather than "there
>> should be a string of the form <p class =......".
>> I don't know how the html is generated in the code, if it uses an html
>> library (or if at some point in the future it used an html library)
>> then another issue could be that a newer version of the library might,
>> for example, change the order of attributes in a tag.  A DOM based
>> test would not care about this but for a text based one it would be a
>> disaster.
>
> You guys are forgetting about CSS's effect on the way a particular DOM
> is rendered. That can make a huge difference in appearance. Play
> around a bit with http://www.csszengarden.com/ to see what I mean (or
> just for fun, it's an awesome demo!)

Personally I'm ignoring CSS, because I don't care about testing the
layout.

> An easy way to test the HTML without worrying about tag case or
> extraneous whitespace is to use HTML Diff [1] against a canned
> comparison document. That will get us close enough for a first order
> test, I think. Anyone got time to code something up?

Is there some way to automate the validation of HTML Diff output?  When
I took a cursory look at the HTML Diff options, they all marked up the
HTML with diff-tags, whereas what we want is essentially a boolean
output saying whether there are any differences or not.
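
To make "boolean output" concrete, here is a rough sketch of the kind of
check I have in mind.  It is not an existing HTML Diff tool; it assumes
Python's standard html.parser module, and the names normalized_events
and reports_match are made up for illustration.  It reduces each
document to a list of tags, attributes, and text, ignoring tag case,
attribute order, and extra whitespace, then compares the two lists:

```python
from html.parser import HTMLParser


class _Normalizer(HTMLParser):
    """Reduce an HTML document to a normalized list of parse events."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us;
        # sorting by name makes attribute order irrelevant.
        ordered = tuple(sorted(attrs, key=lambda pair: pair[0]))
        self.events.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Collapse runs of whitespace and drop whitespace-only text.
        text = " ".join(data.split())
        if text:
            self.events.append(("text", text))


def normalized_events(html_text):
    """Hypothetical helper: parse html_text and return its event list."""
    parser = _Normalizer()
    parser.feed(html_text)
    parser.close()
    return parser.events


def reports_match(actual_html, reference_html):
    """Return True only if both documents normalize to the same events."""
    return normalized_events(actual_html) == normalized_events(reference_html)
```

A test would then simply assert reports_match(output, reference).  Any
change in text content still makes it fail, so a human still has to
review and approve a new reference document.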

> The talk about DOM gave me another idea: Since the html generator
> seems to be outputting XHTML transitional (at least it *says* it is),
> we can write a strict schema and then validate the output report
> against the schema. If it validates, it passes.

Except that we don't care about the HTML formatting as much as we care
about the contents of the report (see my response from a minute ago).
Schema validation isn't going to tell you that the report output $10.01
when it should have output $100.01.
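
As an illustration of checking contents rather than structure, here is
a minimal sketch.  It assumes the report puts its values in table cells
(td/th), which I have not verified against the actual generator, and
the name report_cells is hypothetical.  It pulls out the cell text so
that it can be compared against known-good values:

```python
from html.parser import HTMLParser


class _CellCollector(HTMLParser):
    """Collect the text of every <td> and <th> cell, in document order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.cells = []
        self._buffer = None  # None while we are outside a cell

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._buffer is not None:
            # Join the pieces and collapse whitespace inside the cell.
            self.cells.append(" ".join("".join(self._buffer).split()))
            self._buffer = None


def report_cells(html_text):
    """Hypothetical helper: return the list of cell texts in html_text."""
    collector = _CellCollector()
    collector.feed(html_text)
    collector.close()
    return collector.cells
```

Both a report showing $10.01 and one showing $100.01 would be perfectly
valid XHTML, but comparing report_cells(output) against the expected
list of values catches the difference.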

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord at MIT.EDU                        PGP key available