Taking the defs file issue

Rob Browning rlb@cs.utexas.edu
12 Nov 2000 15:41:41 -0600

(I've included gnucash-devel at this point because several of the
 people there should be seeing this discussion, and until I get the
 g-wrap list set up, which should be in about a week and a half,
 gnucash-devel is the de-facto g-wrap devel list.  I'll also CC your
 previous mail there.)

Ariel Rios <ariel@arcavia.com> writes:

> (module module-name
>   (submodule-of module-name)) ;; submodule is optional
> Ex: (module Gtk)
> Ex: (module Rgb
>       (submodule-of Gdk))
> modules are later referred to with a list of module names, like 
> (Gdk Rgb) or (Gtk)
> Object and boxed type definitions automatically create a submodule.
> For example, GtkCList creates the module (module CList (submodule-of
> (Gtk))) which is referred to as module (Gtk CList).

OK, so this is a bit more complex than what I'd gathered from just
looking at the current .defs files.

> (type
>  (alias some-unique-identifier)
>  (in-module module-name)   ;; optional, gchar* is not in a module
>  (gtk-type-id gtk-type-system-id) ;; optional, absent if this is not
>                                   ;; in the type system
>  (is-parametric boolean)          ;; optional default to #f
>  (in-c-name name-of-symbol-in-C)
>  (out-c-name name-of-symbol-in-C)
>  (inout-c-name name-of-symbol-in-C))

What exactly does "parametric" mean in this context?

> Ex: (type
>      (alias string)
>      (gtk-type-id GTK_TYPE_STRING)

Hmm.  gtk-type-id doesn't sound like something that g-wrap would
normally care about, at least not directly since it's very GTK/GNOME
specific.  Is this just used to refer to the macro that should be used
for coercions?

>      (in-c-name "const gchar*")
>      (out-c-name "gchar**")      ;; actually I'm not sure how strings work out/inout
>      (inout-c-name "gchar*"))

How are these used?

After looking at the rest of this document (admittedly without having
enough time to understand everything *carefully*, though I think I
have a good overall idea of what's going on), it looks like your .defs
file is trying to capture a lot more information than g-wrap does (at
least currently).

Since g-wrap's approach is (I think?) much simpler (and more limited),
perhaps it would be easier for me to summarize g-wrap's approach and
then you (collectively) can tell me whether or not g-wrap's likely to
be of use to you -- either by being enhanced to cover what you need,
or perhaps by being used as a lower-level, more primitive backend that
you generate output for.

Here goes:

From the user's perspective g-wrap knows about three "types of type"
for a wrapped function's arguments and return values (actually four if
you include the enum support I'm about to add): simple-types,
complex-types and pointer-tokens.

Simple types are those for which it's sufficient to just tell g-wrap
how to identify and convert instances of the type.  Usually this means
only types where memory allocation semantics are irrelevant: numeric
types, characters, const-strings, etc.  As an example of a
simple-type, here's how we just added support for "long long" to
g-wrap:

  (add-type 'long-long "long long"  
            ;; fn-convert-to-scm 
            (lambda (x) (list "gh_longlong2scm(" x ")"))
            ;; fn-convert-from-scm 
            (lambda (x) (list "gh_scm2longlong(" x ")"))
            ;; fn-scm-is-a
            (lambda (x) (list "gh_exact_p(" x ")")))

These add-type calls can be added to g-wrap's built in type spec, or
the user can put them in their own files.
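
To make the three roles concrete, here is a rough, self-contained C
sketch of the shape a wrapper for a simple type takes: check the
incoming value, convert it, call the C function, convert the result
back.  The "value" struct is only a stand-in for Guile's SCM, and
twice() and every other name here is made up for illustration; none
of this is actual g-wrap output.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a Scheme value; real wrapper code would use Guile's SCM. */
typedef enum { VAL_INT, VAL_STRING } val_kind;
typedef struct { val_kind kind; long long i; const char *s; } value;

/* Plays the role of fn-scm-is-a. */
bool value_is_long_long(value v) { return v.kind == VAL_INT; }

/* Plays the role of fn-convert-from-scm. */
long long value_to_long_long(value v) { return v.i; }

/* Plays the role of fn-convert-to-scm. */
value value_from_long_long(long long x)
{
    value v = { VAL_INT, x, NULL };
    return v;
}

/* Hypothetical C function being wrapped. */
long long twice(long long x) { return 2 * x; }

/* Wrapper shape: type check, convert in, call, convert out.
   Returns false (and leaves *result alone) on a type mismatch. */
bool wrapped_twice(value arg, value *result)
{
    if (!value_is_long_long(arg))
        return false;
    long long c_arg = value_to_long_long(arg);
    long long c_ret = twice(c_arg);
    *result = value_from_long_long(c_ret);
    return true;
}
```

Nothing here allocates, which is exactly why a type like this can be
"simple": there is nothing to clean up after the call.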

complex-types are those that can still be converted to/from a guile
type, but for which memory allocation issues are important.  In those
cases, you have to tell g-wrap, in addition to the above information,
both how to clean up (deallocate) an instance of the type and whether
or not cleaning up those values should be the default for instances of
the type that are passed as arguments, and for instances returned as
return values.  The default can also be overridden on a
per-wrapped-function signature basis.

As an example, here's how you could define a hypothetical type
representing "strings that should be considered to be owned by the
caller", meaning that g-wrap should, by default, "clean up" the C-side
temporary that it creates to use as a parameter to a C-side function
call, or the C-side "temporary" that's received the return value from
the C-side function call.  Also, as you can see from the C-side
conversion code given in the definition below, the new type,
caller-owned-string, also accepts and returns #f as a representation
of the NULL string.

    "const char*"
    (lambda (x) (list "((" x ") ? gh_str02scm(" x ") : SCM_BOOL_F)"))
    (lambda (x)
      (list "(((" x ") == SCM_BOOL_F) ? NULL : gh_scm2newstr(" x ", NULL))"))
    (lambda (x)
      (list "((" x " == SCM_BOOL_F) || "
            "(SCM_NIMP(" x ") && SCM_STRINGP(" x ")))"))
    ;; c-cleanup-arg-default?
    ;; c-cleanup-ret-default?
    ;; fn-c-cleanup
    (lambda (x) (list "if(" x ") { free(" x "); }"))))
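
In C terms, the cleanup default amounts to something like the sketch
below.  It is only an illustration under my own assumptions: the
str_or_false struct stands in for a Scheme value that is either a
string or #f, name_length() is a made-up function being wrapped, and
the cleanup_arg flag models the per-type default (or the per-function
override).

```c
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Stand-in for a Scheme value that is a string or #f (s == NULL). */
typedef struct { const char *s; } str_or_false;

/* Mirrors the from-scm converter above: #f becomes NULL, anything
   else becomes a freshly allocated copy that the caller owns. */
char *to_c_string(str_or_false v)
{
    if (v.s == NULL)
        return NULL;
    char *copy = malloc(strlen(v.s) + 1);
    if (copy != NULL)
        strcpy(copy, v.s);
    return copy;
}

/* Mirrors fn-c-cleanup. */
void cleanup_c_string(char *p)
{
    if (p) { free(p); }
}

/* Hypothetical C function being wrapped; treats NULL as empty. */
size_t name_length(const char *name)
{
    return name ? strlen(name) : 0;
}

/* Wrapper shape: build the C-side temporary, call, then clean up the
   temporary only if the cleanup setting says the caller owns it. */
size_t wrapped_name_length(str_or_false arg, int cleanup_arg)
{
    char *tmp = to_c_string(arg);
    size_t ret = name_length(tmp);
    if (cleanup_arg)
        cleanup_c_string(tmp);
    return ret;
}
```

With cleanup_arg set to 0 the temporary is deliberately left alone,
which is what you'd want when the C function takes ownership of it.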

That covers simple and complex types where there's a direct
scheme-side representation of the type, but for any types that don't
really have a direct scheme-side representation, g-wrap has
pointer-tokens.  On the scheme side, pointer-tokens appear as an
opaque "holder" for a C-side pointer.  In reality, a pointer-token is
just a smob containing both an indicator of what kind of pointer is
being contained and the C-side pointer itself.  You can't really do
anything with pointer-tokens on the scheme side except pass them
around to other wrapped functions, but in nearly all, if not all the
cases we've had in gnucash, this has been sufficient.
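
Stripped of the smob machinery, a pointer-token boils down to
something like the following C sketch.  The struct layout and the
function names are my own illustration, not g-wrap's actual
internals, and the printed form is only meant to approximate the
"<pt:TYPE:0xADDR>" style.

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

/* A pointer-token: an indicator of the pointer's kind plus the raw
   C-side pointer itself. */
typedef struct {
    const char *type_name;  /* e.g. "GtkWindow" */
    void *ptr;
} pointer_token;

pointer_token make_pointer_token(const char *type_name, void *ptr)
{
    pointer_token t = { type_name, ptr };
    return t;
}

/* Hand back the raw pointer only if the token is of the expected
   kind; otherwise return NULL so the wrapper can signal an error. */
void *pointer_token_unwrap(pointer_token t, const char *expected_type)
{
    return strcmp(t.type_name, expected_type) == 0 ? t.ptr : NULL;
}

/* Write a printed representation into out; returns what snprintf
   returns. */
int pointer_token_print(pointer_token t, char *out, size_t out_size)
{
    return snprintf(out, out_size, "<pt:%s:0x%lX>", t.type_name,
                    (unsigned long)(uintptr_t)t.ptr);
}
```

The kind check in pointer_token_unwrap() is the only "smart" thing a
wrapper does with the token; everything else is left to the C API.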

In fact, I'm about to add guile-side access to some of the glib
containers like GLists, GSLists, etc., and I'm just going to use
pointer-tokens.  This means that a thousand element GList* (on the
C-side) would just come over as an opaque (few-byte) pointer-token
rather than being, as you might expect, automatically exploded into a
thousand element scheme list of the contained pointer-tokens.

However, I'm also going to provide conversion functions that'll
"explode" a GList pointer-token into a real scheme list of
sub-pointer-tokens when that's what you need, but by not performing
the "explosion" by default we avoid incurring the overhead of
conversion in cases where all you really want to do is pass the GList*
to another function that expects a GList*.
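
The trade-off looks roughly like this in C.  The node struct is a
cut-down stand-in for glib's GList, and list_explode() is a
hypothetical name for the on-demand conversion, not something g-wrap
provides today.

```c
#include <stdlib.h>
#include <stddef.h>

/* Cut-down stand-in for glib's GList (only the fields used here). */
typedef struct node {
    void *data;
    struct node *next;
} node;

/* Walk the list once to count its elements. */
size_t list_length(const node *list)
{
    size_t n = 0;
    for (const node *p = list; p != NULL; p = p->next)
        n++;
    return n;
}

/* On-demand "explosion": copy every element pointer into a newly
   allocated array (the caller frees it).  This costs O(n) time and
   space, which is why it should not happen on every call. */
void **list_explode(const node *list, size_t *count)
{
    size_t n = list_length(list);
    void **items = malloc((n ? n : 1) * sizeof *items);
    if (items == NULL) {
        *count = 0;
        return NULL;
    }
    size_t i = 0;
    for (const node *p = list; p != NULL; p = p->next)
        items[i++] = p->data;
    *count = n;
    return items;
}
```

Passing the list head around as an opaque pointer costs the same no
matter how long the list is; only callers that actually need the
elements pay for list_explode().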

As an example of pointer-tokens, here's how you might wrap up
GtkWindow*'s in g-wrap:

  (make-pointer-token-type 'GtkWindow* "GtkWindow*")

so that then you could define a wrapper for gtk_window_set_title like

   'void "gtk_window_set_title" '((GtkWindow* window) (const-string title))
   "Set the given window's title.")

As I mentioned, on the guile side, a GtkWindow* would just be
represented as a garbage collected smob that "knew" it was a
GtkWindow*.  It would have a printed representation something like
this: "<pt:GtkWindow:0xAF0AD>".  For instances of GtkWindow*, and
indeed for all pointer-tokens, allocation/deallocation semantics are
left up to the API user, as they would be if you were just programming
gtk from C as well.  This is because in general, I tend to favor this
approach to one that tries to hide the allocation semantics.  Like it
or not, when you're wrapping a C API, I think you generally *do* have
to know (and care) about the allocation semantics, and I tend to feel
that trying to automate them too much (beyond the "'cleanup
'no-cleanup" stuff we already have in g-wrap) is just going to hurt
more than help.

> Whenever a type alias can be used, it is also possible to use the
> keyword "native", which implies that the type in question is too
> C-specific to represent. Then a c-declaration will typically be
> available for use.

This sounds something like g-wrap's pointer-token.

>   (enum DirectionType
>     (in-module Gtk)
>     (c-name GtkDirectionType)
>     (value (nick tab-forward) (c-name GTK_DIR_TAB_FORWARD))

You may want to consider using strings rather than symbols for the
c-name, since under R5RS you can't count on case being preserved when
you go from symbol->string.

>       (c-declaration "c-type-and-name")) ;; c-declaration only required
>                                          ;; if the type alias is "native"
>   (varargs #t) ;; has varargs at the end
> )

How do you handle varargs from guile?  Robert and I tried to figure
out a way to do that last week and came to the conclusion that it
wasn't possible without some non-portable assembly nonsense (or
similar) since C doesn't have anything remotely like "apply" for
va_lists; ISTR Robert actually found something in the C FAQ about
this.
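
To illustrate the problem: the only portable workaround I know of is
to dispatch on the argument count up to some fixed maximum, as in the
C sketch below.  sum_ints() is a made-up variadic function and
apply_sum_ints() a made-up helper; the point is just that the call
sites have to be written out by hand, one per arity.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic C function we would like to call from
   scheme: sums `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* There is no portable way to build a va_list from an array whose
   length is only known at run time, so spell out one call per
   supported arity.  Returns 0 on success, -1 if there are more
   arguments than this helper supports. */
int apply_sum_ints(const int *args, size_t count, int *result)
{
    switch (count) {
    case 0: *result = sum_ints(0); return 0;
    case 1: *result = sum_ints(1, args[0]); return 0;
    case 2: *result = sum_ints(2, args[0], args[1]); return 0;
    case 3: *result = sum_ints(3, args[0], args[1], args[2]); return 0;
    default: return -1;
    }
}
```

This only works when the argument types are fixed and the maximum
count is small, so it is not a general answer for wrapped varargs.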


Overall, as I somewhat suggested above, and bearing in mind that I
haven't yet *fully* grokked what you're proposing since I don't have
as much familiarity with gtk, gnome, and the current state of
guile-{gnome,gtk} as you do, it sounds like your current .defs
proposal goes far beyond what g-wrap was intended to do, covering
quite a bit more ground.

Some of that ground seems like it just involves adding more detail to
the wrapping of C-side data structures than g-wrap has heretofore
attempted, but some of it sounds like it's just trying to
automagically handle a lot of very glib/gtk/gnome specific bits,
including bits of the object system -- as it stands now, g-wrap knows
nothing of object hierarchies, struct contents, etc.

So where does this leave us?

With respect to aspects of the .defs proposal that argue for a richer
wrapping of C-side data structures, I'd be in favor of augmenting
g-wrap to cover those bits, though I don't want to lose the ability to
just ignore C-side issues (as you can with pointer-tokens) when you
don't care about fancy handling on the scheme side, and all you want
is performance.

One thing I noticed you mentioned was the handling of signals
(i.e. c-side callbacks).  If we could think of a good way to handle
that generically, then I'd like to add that to g-wrap.  As yet, g-wrap
doesn't have *any* special facilities for dealing with C-side
callbacks (to/from scheme).  In gnucash, we just use the special
g-wrap type SCM, which lets you pass a guile pointer straight through
to C, and then requires you to set up your own handling of the "thunk
pointer" on the C-side with hand-written functions.
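
The pattern is the usual C "callback plus opaque user data" idiom.
Here is a minimal sketch of it; signal_slot, scheme_thunk, and the
function names are all invented for illustration, and the thunk
struct only stands in for the guile value that would really be
passed through.

```c
#include <stddef.h>

/* C-side callback type: the function plus an opaque pointer that the
   API hands back untouched. */
typedef void (*callback_fn)(int event, void *user_data);

/* Hypothetical C API object that stores a callback for later use. */
typedef struct {
    callback_fn fn;
    void *user_data;
} signal_slot;

void signal_connect(signal_slot *slot, callback_fn fn, void *user_data)
{
    slot->fn = fn;
    slot->user_data = user_data;
}

void signal_emit(const signal_slot *slot, int event)
{
    if (slot->fn)
        slot->fn(event, slot->user_data);
}

/* Stand-in for the scheme thunk that was passed straight through. */
typedef struct {
    int calls;
    int last_event;
} scheme_thunk;

/* Hand-written trampoline: recover the thunk from the opaque pointer
   and "call" it (here we just record that the call happened). */
void thunk_trampoline(int event, void *user_data)
{
    scheme_thunk *thunk = user_data;
    thunk->calls++;
    thunk->last_event = event;
}
```

A generic facility would have to generate the trampoline itself, and
also keep the scheme object alive for as long as the C side holds the
pointer, which is the part that still has to be done by hand.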

On the issue of "wrapping C structures in more detail" -- right now
g-wrap presumes the C API is "function rich".  By that I mean that it
presumes that you can do everything you need with a structure or other
C datatype via the functional API.  This means that if you want to
wrap an API that requires you to access struct members directly, you
have to write a set of helper functions that are then themselves
g-wrappable.  This has its drawbacks, but it also has the fairly
substantial advantage of keeping the semantics and implementation of
g-wrap simpler than they would be otherwise.
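
For example, given a struct whose members the API expects you to poke
at directly, the helper functions would look something like this in
C.  The rect struct and the accessor names are hypothetical, purely
to show the shape.

```c
/* Hypothetical struct from an API that exposes its members. */
typedef struct {
    int width;
    int height;
} rect;

/* Hand-written accessors: each one is an ordinary function, so it
   can be wrapped like any other part of a functional API. */
int rect_get_width(const rect *r) { return r->width; }
int rect_get_height(const rect *r) { return r->height; }

void rect_set_width(rect *r, int width) { r->width = width; }
void rect_set_height(rect *r, int height) { r->height = height; }
```

On the scheme side the struct itself then only ever needs to appear
as a pointer-token that gets handed to these accessors.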

As a final note, I'm wondering if it might be possible, and make more
sense, to think of g-wrap as "assembly language", or as the backend,
for the guile-{gtk,gnome} definitions.  g-wrap could remain targeted
at providing "everything necessary" for wrapping generic C APIs for
scheme (guile), but in perhaps a somewhat primitive fashion.  Then
there would be a translator from the .defs files into g-wrap code that
would handle the GNOME/GTK specifics like the object system and
hierarchy, etc.

Presuming that's even feasible, then I think that could still provide
us a lot of mutual benefit, dividing some of the labor, and sharing the
development of the common bits.

Whew.  Sorry for the length...

Rob Browning <rlb@cs.utexas.edu> PGP=E80E0D04F521A094 532B97F5D64E3930