Is there anything *enjoyable* about our development process?

Thomas Bushnell BSG tb at becket.net
Mon Oct 17 00:11:39 EDT 2005


"Stuart D. Gathman" <stuart at bmsi.com> writes:

> If you define "token" as whatever the language lexical structure
> defines as a "token" (and in Python, the INDENT and DEDENT tokens
> are composed of white space characters), then you can add whitespace
> between "tokens" wherever you please, since whitespace outside of
> tokens is ignored.

As I said, the problem is that the definition of a token is
*context-sensitive* in Python.
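To make this concrete, here is a sketch using the standard-library
tokenize module (a modern CPython illustration, not part of the
original exchange): the same four leading spaces lex as an INDENT
token in one context and as ignorable whitespace in another.

```python
import io
import tokenize

def token_names(src):
    """Return the token type names tokenize produces for src."""
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(io.StringIO(src).readline)]

# The same four leading spaces...
block = "if x:\n    y = 1\n"   # ...open a block here (INDENT/DEDENT emitted),
cont  = "x = (1 +\n    2)\n"   # ...but are ignored inside parentheses.

print("INDENT" in token_names(block))  # True
print("INDENT" in token_names(cont))   # False
```

The scanner must consult a stack of enclosing indentation levels and a
parenthesis-nesting count before it can decide what the whitespace
means; that stack is exactly the context-sensitivity at issue.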

> And remember C also has whitespace in some of its tokens - string literals.
> (So perhaps you want to define "token" as "whatever languages I am familiar
> with define as a token".)

Yes, but C tokens are not context-sensitive.  (It happens that the C
*grammar* is context-sensitive, because of the way typedef works.
This is generally regarded as a wart on the language for just that
reason.)

This is the best characterization, because it is mathematically
precise.  The "add whitespace between tokens" rule is vague for
exactly this reason: it presupposes that you can identify the tokens
without context.

> Here is a suggestion.  

Look, I really do understand Python syntax.  I wrote a Python parser.
(Have you?)

> The Python *grammar* is context free in terms of the tokens
> involved.  I know you hate the lexical representation of the INDENT
> and DEDENT tokens, but it is only a *lexical* issue, and easily
> adapted to your taste.  

I said exactly this: the lexical definition is context-sensitive.

> And the standard lexical representation
> happens to match what C, Pascal, or any other block structured
> language looks like when presented for human consumption.  

That's exactly the point, and exactly the problem.  A C program is
structured in such a way that there is a machine-readable
specification, which can (if you want) be formatted to look pretty for
people.

A Python program does *not* have that.  The logic that in C lives in
the pretty-printer, which is to say in the human interface agent
(Emacs!), must in Python live in every agent that wants to insert code
into a Python program.
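A toy illustration of that burden (a pure sketch; the function name
splice_statement is made up for this example): any generator that
appends a statement to a Python function body must itself emit the
correct leading whitespace, or it changes or breaks the program's
meaning.

```python
import textwrap

def splice_statement(block, statement, indent="    "):
    # In C, a generator could emit the raw statement and let a
    # pretty-printer fix the layout afterwards.  In Python the
    # generator itself must supply the right leading whitespace,
    # because the whitespace *is* the block structure.
    return block + textwrap.indent(statement, indent) + "\n"

body = "def f(x):\n    y = x + 1\n"
good = splice_statement(body, "return y")
compile(good, "<generated>", "exec")          # fine: indentation matches

try:
    compile(body + "return y\n", "<generated>", "exec")
except SyntaxError:
    print("unindented splice is a syntax error")
```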

> Even a LISP tokenizer has context.  

Yes, but LISP tokens are scanned context-free.  You may not be
familiar with the meaning of that phrase; I am certain that the
inventor of Python's syntax was entirely unaware of *why* context-free
scanning and parsing were such a tremendous advance.

> It is only after the characters have been converted to the beautiful
> internal list structures that LISP is context free.  Internally, the
> Python compiler has INDENT and DEDENT tokens to delimit blocks just
> like any other language with blocks.

No, the LISP lexical representation is context-free.
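For contrast, a sketch of how little machinery an s-expression scanner
needs (assuming a toy dialect of atoms, quote marks, and parentheses,
without strings or comments): one stateless regular expression, with
no indentation stack and no nesting count.

```python
import re

# One regular expression recognizes every token from the characters
# alone: a paren, a quote mark, or a run of atom characters.  The
# scanner keeps no state between tokens -- that is what it means for
# the lexical representation to be context-free.
TOKEN = re.compile(r"[()']|[^\s()']+")

def lex(src):
    return TOKEN.findall(src)

print(lex("(define (sq x) (* x x))"))
# ['(', 'define', '(', 'sq', 'x', ')', '(', '*', 'x', 'x', ')', ')']
```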

> I am often told that the standard prefix notation in LISP is not a problem,
> because it is trivial to add an infix front end.  

Huh?  Quite the contrary, prefix notation is fine because it's
*already* fine.  No need for a front-end.  

> So everyone in a LISP project has to use the same syntax - which is
> almost always prefix - making the possibility of using infix instead
> a red herring.

You're right; I wonder why I ever brought up such a red herring.  Oh
wait, I didn't: you did.

Are you familiar with LISP macros?  Even a little?  I suspect not. 

> Agreed, but the code generator argument is a red herring for an extension
> language.  

What?  Really?  You do understand that you are only proving that
programmers who have never learned what LISP is about are deficient in
their training.

> Unless the algorithm is truly self modifying (like, say, "genetic"
> algorithms for super-optimizers), code generators are a hack to work
> around weaknesses in a language (e.g. bitblt mini compilers for
> graphics drawing primitives).

This is only a demonstration of your ignorance of LISP and what macros
are really used for.  You are thinking of C.

> Do we agree on these requirements for the high level language?
>
> Must be dynamic (late binding).
> Must have automatic memory management.
> Compilation must be "invisible" to the script programmer (gives
>   the illusion of a pure interpreter even if some kind of byte code
>   or machine code is cached).
> Must be a standard package on most distros.
> Must be currently actively supported.
> Must provide a clean C language extension API to interface with
>   the core GnuCash C code.
>   o Must support callbacks into C from high level language.
>   o Must support callbacks into high level language from C.
> Must provide a robust module system able to encapsulate the GnuCash interface
>   in a module.

Must offer some hope of not locking the user in to only one extension
language.

> Robust native support for "object oriented" paradigm (LISP could argue
>   that closures are more general).

Scheme already has it; LISP, which lacks the necessary language
feature, has it by virtue of CLOS.

> Block structure (BEGIN/END,INDENT/DEDENT,{/},[/],yada/yoda).

Scheme has that.

> Scripts should be understandable by humans with minimal 
>   or no introduction to the language.  This may leave out otherwise
>   very elegant languages like Prolog and pure functional languages.
>   o This may also leave out languages with postfix or prefix based syntax.
>   o Infix languages with no operator precedence are probably also out
>     (e.g. Smalltalk, APL).

What does that mean?  Humans who have programming experience or humans
who don't?



More information about the gnucash-devel mailing list