Wednesday, May 12, 2010

Better SWIG wrappers: variant-like types and enums

Today I was trying to improve our JNI wrappers for a complex structure. The structure is a tree-like object with different kinds of nodes carrying variant-like data. It represents the parse tree of a C/C++ source file, so it has a lot of nodes, and by the library's design all of these nodes should be as compact as possible. As a result, different node types are represented by different structures, like this:

enum FooKind {
foo1,
foo2,
foo3,
};

struct Foo {
... // some data common for all Foos
unsigned int kind:2; // reducing field size
union {
T1 field1; // when kind == foo1
T2 field2; // when kind == foo2
T3 field3; // when kind == foo3
} variant;
};

enum BarKind {
bar1,
bar2,
...
bar10,
};

struct Bar {
... // some data common for all Bars
unsigned int kind:4; // reducing field size (BarKind has 10 values, so 2 bits are not enough)
union {
T4 field1; // when kind == bar1
...
} variant;
};


Whether such structures are a good idea or not, the wrappers generated for the "kind" fields are unsafe. We have already had problems caused by misused node kinds, which led to access to the wrong member of the variant, corrupted data, non-deterministic bugs that were hard to reproduce, and JVM crashes (these are my favourite).
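To make the hazard concrete, here is a small Java sketch of what untyped kind values allow. The class and constant names are hypothetical and are not taken from our wrappers; the only assumption is that kinds reach Java as plain integers.

```java
// Hypothetical stand-in: node kinds exposed to Java as plain ints.
final class RawKinds {
    // Values mirror the C enums: both start counting from 0.
    static final int FOO1 = 0, FOO2 = 1, FOO3 = 2;
    static final int BAR1 = 0, BAR2 = 1;

    // Compiles without complaint although it tests a Foo kind against a
    // Bar constant; the caller would then read the wrong union member.
    static boolean isBar2(int fooKind) {
        return fooKind == BAR2;
    }
}
```

Here `RawKinds.isBar2(RawKinds.FOO2)` is true, which is meaningless. With two distinct Java enum types the same comparison would be rejected by the compiler.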
Those errors could have been prevented by type-safe wrappers for enums. I didn't want to manually edit our wrappers generated by SWIG (yeah, we do this sometimes), so I decided to find a way to make SWIG do it for me. I didn't have much SWIG experience, but I had already implemented some tricks with typemaps, so I decided to make use of them once more. The basic idea was the following: "The only place where I really need type-safe enums is my proxy class, so I need to do something with its getter for the dangerous field". I started with only one type, and my typemap code looked like this:
%typemap(jni) unsigned int kind "jbyte" // It's smaller than the original!
%typemap(jtype) unsigned int kind "byte"
%typemap(jstype) unsigned int kind "FooKind"
%typemap(javaout) unsigned int kind {
return FooKind.swigToEnum($jnicall);
}
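
For orientation, this is roughly what the typemaps above aim for on the Java side. It is a hand-written sketch, not SWIG output: `FooProxy` and `nativeKind` are made-up stand-ins for the generated proxy class and the `$jnicall` expression, and `swigToEnum` is written in the style of the lookup method SWIG generates for Java enums.

```java
// Java counterpart of the C enum FooKind.
enum FooKind {
    foo1, foo2, foo3;

    // Maps the raw value coming from native code to a constant and
    // rejects anything out of range instead of passing it along.
    static FooKind swigToEnum(int swigValue) {
        FooKind[] values = values();
        if (swigValue < 0 || swigValue >= values.length) {
            throw new IllegalArgumentException("No FooKind with value " + swigValue);
        }
        return values[swigValue];
    }
}

// Hypothetical stand-in for the generated proxy class.
final class FooProxy {
    private final byte nativeKind; // plays the role of $jnicall

    FooProxy(byte nativeKind) {
        this.nativeKind = nativeKind;
    }

    // Shape of the getter that the jstype/javaout typemaps are meant to produce.
    FooKind getKind() {
        return FooKind.swigToEnum(nativeKind);
    }
}
```

Calling `new FooProxy((byte) 1).getKind()` yields `FooKind.foo2`, while a raw value of 3 raises `IllegalArgumentException` in Java rather than surfacing later as a bad union access.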

The approach above is simple and powerful (it's typemaps!), but it has at least two disadvantages:
  1. The "enum implementation" is hardcoded in the javaout typemap, which makes it harder to change

  2. The typemap above will match every entity named "kind" of type "unsigned int"

I couldn't find a way to match only the field of the Foo type and use a different typemap for Bar: things like '%typemap(jstype) unsigned int Foo::kind "FooKind"' didn't work, so the entire approach was unusable for my case.

The other way to affect the generated wrapper code is type extensions. I got the idea to ignore the existing field and add my own with the proper type:
%ignore Foo::kind;
%extend Foo {
FooKind kind;
}


This didn't work because I was trying to extend the type with a field that was being ignored. By accident I found that if you extend a type with a field that is already defined in it, only the first declaration (your extension) is used, and SWIG produces a warning about the field being defined more than once. So the following code may actually solve the problem:
%extend Foo {
FooKind kind;
}


Those who hate warnings can ignore the old field and generate a new field with the desired type and a different name. In that case SWIG will reference an undefined function in the getter for the new field, so one more item has to be implemented:
%{
static FooKind Foo_safeKind_get(Foo *self) {
return (FooKind) self->kind; // explicit cast: C++ does not convert integers to enums implicitly
}
%}

%ignore Foo::kind;
%extend Foo {
FooKind safeKind;
}
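
On the Java side the new field is then consumed through an enum-typed getter. The sketch below uses a hypothetical `FooView` class in place of the generated proxy, assumes the getter ends up being called `getSafeKind()`, and uses strings in place of the real `T1`, `T2` and `T3` payloads.

```java
enum FooKind { foo1, foo2, foo3 }

// Hypothetical stand-in for the proxy of struct Foo.
final class FooView {
    private final FooKind kind;
    private final String payload; // stands in for the union member

    FooView(FooKind kind, String payload) {
        this.kind = kind;
        this.payload = payload;
    }

    FooKind getSafeKind() {
        return kind;
    }

    // The switch is over FooKind, so a BarKind constant cannot appear
    // in a case label; mixing the two is a compile-time error.
    String describe() {
        switch (getSafeKind()) {
            case foo1: return "field1=" + payload;
            case foo2: return "field2=" + payload;
            case foo3: return "field3=" + payload;
            default: throw new AssertionError("unhandled kind " + kind);
        }
    }
}
```

For example, `new FooView(FooKind.foo2, "x").describe()` returns `"field2=x"`.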

Of course, the snippet above can be wrapped in a macro to keep things simple:
%define CHANGE_FIELD_TYPE(type, field, newFieldType)

%{
static newFieldType type ## _safe ## field ## _get(type *self) {
return (newFieldType) self->field;
}
%}

%ignore type::field;
%extend type {
newFieldType safe ## field;
}

%enddef

CHANGE_FIELD_TYPE(Foo, kind, FooKind)