A story of C++

For the past 4 months I was trying - and by trying I mean struggling - to simultaneously learn C++ and create a wrapper around gbwtgraph. Having only limited experience exposing an API from C, I assumed that it would be a similar undertaking. It turned out quite the opposite: it’s a huge deal to export method functionality from C++. The resulting wrapper code kept growing, was of low quality and was not worth investing any more time in. Maybe it would be better to rewrite the whole thing in a more modular way, in C or Rust.

Enter SWIG

As I was ready to defenestrate a significant amount of work hours, I stumbled upon SWIG. SWIG stands for Simplified Wrapper and Interface Generator. It abstracts class methods in a way similar to my own attempt: it exposes classes and non-primitive return values as "void*", while the resulting function’s name is prefixed by that of the class. Unlike my version of a wrapper, it offers many other goodies, like the following:

overloaded functions

Pretty cool IMHO, function overloading with proper error messages.
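Before moving on, the basic flattening described above (objects passed around as "void*", function names prefixed by the class name) can be sketched as follows. This is only a minimal illustration of the idea, not the code SWIG actually generates: the Graph class and the Graph_* functions are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C++ class standing in for a gbwtgraph type.
class Graph {
public:
    void add_node(long id) { nodes_.push_back(id); }
    std::size_t node_count() const { return nodes_.size(); }
private:
    std::vector<long> nodes_;
};

// Flat, C-callable wrappers: the object travels as an opaque void*,
// and each method becomes a free function prefixed with the class name.
extern "C" {

void* new_Graph() { return new Graph(); }

void delete_Graph(void* self) { delete static_cast<Graph*>(self); }

void Graph_add_node(void* self, long id) {
    assert(self != nullptr);
    static_cast<Graph*>(self)->add_node(id);
}

std::size_t Graph_node_count(void* self) {
    assert(self != nullptr);
    return static_cast<const Graph*>(self)->node_count();
}

}  // extern "C"
```

A caller would create the object with new_Graph(), pass the returned pointer to every Graph_* call, and release it with delete_Graph(). Writing this by hand for every class and method is exactly the part that kept growing in my own wrapper.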

Another cool aspect is the typemaps. SWIG does not go through the target language's FFI but rather uses language-specific C embedding (at least for Racket), so with a small section of code like the following

scheme handle t typemap

One can achieve seamless conversion between gbwtgraph types and types from the target language:

nodes to handles

The output of GRAPH-nodes-to-handles is normally a byte array (char*) that holds a uint64_t with a byte appended at the end to indicate the orientation of traversal (left = 0, right = 1), which is not very user friendly. With the typemaps it is translated to a pair of either "+" or "-" and a Racket integer. More visual, and handleable (pun intended).
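To illustrate what such a conversion has to do, here is a rough sketch of decoding that 9-byte layout in plain C++. It is not the typemap itself: decode_handle and encode_handle are hypothetical helpers, and two details are assumptions on my part, namely that the integer is stored in the host's native byte order and that orientation byte 0 maps to "+" and anything else to "-".

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Decode a 9-byte buffer: 8 bytes of node id followed by 1 orientation byte.
// Assumptions (not confirmed by gbwtgraph): the id is in the host's native
// byte order, and orientation byte 0 maps to "+", anything else to "-".
std::pair<std::string, std::uint64_t> decode_handle(const char* bytes) {
    assert(bytes != nullptr);
    std::uint64_t id = 0;
    std::memcpy(&id, bytes, sizeof id);          // first 8 bytes: the id
    const char orientation = bytes[sizeof id];   // 9th byte: orientation flag
    return std::make_pair(std::string(orientation == 0 ? "+" : "-"), id);
}

// Inverse helper, used here only to build example inputs for the sketch.
void encode_handle(std::uint64_t id, bool flipped, char out[9]) {
    std::memcpy(out, &id, sizeof id);
    out[sizeof id] = flipped ? 1 : 0;
}
```

In the real wrapper the result is built as a Racket pair inside the typemap rather than a std::pair, but the byte handling follows the same idea.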

Building the SWIG based wrapper (Racket)

Basic requirements

All libraries required to build gbwtgraph are also required to build the wrapper, so first build the libraries as instructed in How to compile the library.

Prepare gbwtgraph

Next, there is a peculiarity of SWIG that must be dealt with. The SWIG preprocessor fails when it encounters "const static", and this is not a bug (see here). So do an in-place replacement of "const static" with "static const" in gbwtgraph’s directory:

find ./ -type f | xargs -I % perl -pi -e 's/const static/static const/g' %

Similarly, comment out the [[deprecated attributes in handlegraph's types.hpp:

perl -pi -e 's/^\[\[deprecated/\/\/\[\[deprecated/' /usr/local/include/handlegraph/types.hpp

Racket CGC

The default Racket binary in most distributions is "Racket CS". We want "Racket BC" because this is what SWIG uses. We also want it to use the CGC garbage collector. Following the documentation here, grab the latest - at the time of writing - Racket source code and, after unzipping, run

  ./configure --enable-bcdefault  --enable-cgcdefault --prefix={installation folder here}

Replace {installation folder here} with your preferred installation location (I use ~/.local).

Compile swig

Finally grab the swig configuration file gbwtgraph.i and run swig on it:

swig -c++ -mzscheme -declaremodule  gbwtgraph.i;

Then use Racket’s raco tool to compile the resulting .cxx file with the appropriate flags for CGC and C++ support, into an object file.

raco ctool ++ccf  -fpermissive ++ccf -lstdc++  --cc --cgc  gbwtgraph_wrap.cxx ;

Assuming that /usr/local/lib is the location of the gbwtgraph dependencies, link the object files into a dynamic library.

g++ -o gbwtgraph.so -Wl,--whole-archive \
                /usr/local/lib/libgbwtgraph.a \
                /usr/local/lib/libhandlegraph.a \
                /usr/local/lib/libsdsl.a \
                /usr/local/lib/libgbwt.a -Wl,--no-whole-archive -shared  ~/.local/lib/racket/mzdyn.o  gbwtgraph_wrap.o  -pthread -fopenmp

Make the necessary Racket module directories, so that you can load the compiled Racket module library, gbwtgraph.so, from the current folder.

mkdir  -p $(racket -e '(string->symbol (path->string (build-path "compiled" "native" (system-library-subpath))))' | cut -c2-)

Move the dynamic library there.

mv gbwtgraph.so  $(racket -e '(string->symbol (path->string (build-path "compiled" "native" (system-library-subpath))))' | cut -c2-)

Now you can use gbwtgraph from within Racket by doing

(require "gbwtgraph")

The names of all the exported functions are located in the scheme_reload function in the gbwtgraph_wrap.cxx file generated by SWIG.

Next steps: Differential privacy

Update

Instead of building it step by step, you can use the Dockerfile.

Next step

Now that we have managed to embed gbwtgraph in Racket - with the potential of adding other languages - we have the necessary tools to actually make a DSL for differential privacy on variation graphs.