Is there anything *enjoyable* about our development process?
Thomas Bushnell BSG
tb at becket.net
Mon Oct 17 00:11:39 EDT 2005
"Stuart D. Gathman" <stuart at bmsi.com> writes:
> If you define "token" as whatever the language lexical structure
> defines as a "token" (and in Python, the INDENT and DEDENT tokens
> are composed of white space characters), then you can add whitespace
> between "tokens" wherever you please, since whitespace outside of
> tokens is ignored.
As I said, the problem is that the definition of a token is
*context-sensitive* in Python.
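The point can be seen directly with Python's own `tokenize` module (a minimal sketch of my own, not code from this thread): the very same run of spaces is emitted as an INDENT token, a DEDENT token, or no token at all, depending on what the scanner has already seen.

```python
import io
import tokenize

src = (
    "if x:\n"
    "    y = 1\n"
    "z = 2\n"
)

# Names of the tokens the scanner emits for this three-line program.
toks = [tokenize.tok_name[t.type]
        for t in tokenize.generate_tokens(io.StringIO(src).readline)]

# The leading spaces on line 2 become an INDENT token; the *absence*
# of leading spaces on line 3 becomes a DEDENT token.  Whether
# whitespace is a token at all depends on the surrounding context.
print(toks)
```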
> And remember C also has whitespace in some of its tokens - string literals.
> (So perhaps you want to define "token" as "whatever languages I am familiar
> with define as a token".)
Yes, but C tokens are not context sensitive. (It happens that C
*grammar* is context-sensitive, because of the way typedef works.
This is generally regarded as a wart on the language for just that
reason.)
This is the best characterization; it is mathematically precise. The
informal rule that you can "add whitespace between tokens" is vague
for precisely this reason.
> Here is a suggestion.
Look, I really do understand Python syntax. I wrote a Python parser.
(Have you?)
> The Python *grammar* is context free in terms of the tokens
> involved. I know you hate the lexical representation of the INDENT
> and DEDENT tokens, but it is only a *lexical* issue, and easily
> adapted to your taste.
I said exactly this: the lexical definition is context-sensitive.
> And the standard lexical representation
> happens to match what C, Pascal, or any other block structured
> language looks like when presented for human consumption.
That's exactly the point, and the problem. A C program is structured
in such a way that there is a machine-readable specification, which
can be (if you want) formatted to look pretty for people.
A Python program does *not* have that. The logic that in C lives in
the pretty-printer, which is to say, in the human interface agent
(Emacs!), must in Python live in every agent that wants to insert
code into Python programs.
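To make that concrete, here is a small illustration (my own sketch, not code from GnuCash): an agent that splices a statement into a Python function must itself carry the indentation logic, or the result does not even compile.

```python
import textwrap

program = (
    "def f(x):\n"
    "    y = x + 1\n"
    "    return y\n"
)

new_stmt = "print(y)"  # a statement some tool wants to insert before the return

# Naive textual insertion ignores block structure and breaks the program:
naive = program.replace("    return y", new_stmt + "\n    return y")
try:
    compile(naive, "<naive>", "exec")
    naive_ok = True
except SyntaxError:  # IndentationError is a subclass
    naive_ok = False

# The inserting agent must reproduce the pretty-printer's logic itself:
aware = program.replace(
    "    return y",
    textwrap.indent(new_stmt, "    ") + "\n    return y")
compile(aware, "<aware>", "exec")  # compiles cleanly
```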
> Even a LISP tokenizer has context.
Yes, but LISP tokens are scanned context-free. You may not be
familiar with the meaning of that phrase; I am certain that the
inventor of Python's syntax was entirely unaware of *why* context-free
scanning and parsing were such a tremendous advance.
> It is only after the characters have been converted to the beautiful
> internal list structures that LISP is context free. Internally, the
> Python compiler has INDENT and DEDENT tokens to delimit blocks just
> like any other language with blocks.
No, the LISP lexical representation is context-free.
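What "scanned context-free" means here can be shown with a toy scanner (my own illustration, not from the thread): every LISP token is recognized by a fixed pattern, with no dependence on what has come before, no indentation stack, no counters.

```python
import re

# Each alternative is a fixed pattern; the scanner keeps no state
# between tokens beyond its position in the input.
TOKEN = re.compile(r"\s*(?:(?P<open>\()|(?P<close>\))|(?P<atom>[^\s()]+))")

def scan(src):
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            break  # nothing but trailing whitespace left
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

print(scan("(define (sq x) (* x x))"))
```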
> I am often told that the standard prefix notation in LISP is not a problem,
> because it is trivial to add an infix front end.
Huh? Quite the contrary, prefix notation is fine because it's
*already* fine. No need for a front-end.
> So everyone in a LISP project has to use the same syntax - which is
> almost always prefix - making the possibility of using infix instead
> a red herring.
You're right; I wonder why I ever brought up such a red herring. Oh
wait, I didn't: you did.
Are you familiar with LISP macros? Even a little? I suspect not.
> Agreed, but the code generator argument is a red herring for an extension
> language.
What? Really? You do understand that you are only proving that
programmers who have never learned what LISP is about are deficient in
their training.
> Unless the algorithm is truly self modifying (like, say, "genetic"
> algorithms for super-optimizers), code generators are a hack to work
> around weaknesses in a language (e.g. bitblt mini compilers for
> graphics drawing primitives).
This is only a demonstration of your ignorance of LISP and what macros
are really used for. You are thinking of C.
> Do we agree on these requirements for the high level language?
>
> Must be dynamic (late binding).
> Must have automatic memory management.
> Compilation must be "invisible" to the script programmer (gives
> the illusion of a pure interpreter even if some kind of byte code
> or machine code is cached).
> Must be a standard package on most distros.
> Must be currently actively supported.
> Must provide a clean C language extension API to interface with
> the core GnuCash C code.
> o Must support callbacks into C from high level language.
> o Must support callbacks into high level language from C.
> Must provide a robust module system able to encapsulate the GnuCash interface
> in a module.
Must offer some hope of not locking the user in to only one extension
language.
> Robust native support for "object oriented" paradigm (LISP could argue
> that closures are more general).
Scheme has it already; LISP, which lacks the necessary language
feature, has it by virtue of CLOS.
> Block structure (BEGIN/END,INDENT/DEDENT,{/},[/],yada/yoda).
Scheme has that.
> Scripts should be understandable by humans with minimal
> or no introduction to the language. This may leave out otherwise
> very elegant languages like Prolog and pure functional languages.
> o This may also leave out languages with postfix or prefix based syntax.
> o Infix languages with no operator precedence are probably also out
> (e.g. Smalltalk, APL).
What does that mean? Humans who have programming experience or humans
who don't?
More information about the gnucash-devel mailing list