Commit e1778aec authored by Mick Jordan

doc: add developer section

parent e7fa39b5
fficall contains the implementation of the R FFI, as described in https://cran.r-project.org/doc/manuals/r-release/R-exts.html.
It's actually a bit more than that, as it also contains code copied from GnuR, for example code that supports graphics or that is sufficiently
simple that it is neither necessary nor desirable to reimplement in Java. As this has evolved, a better name for 'fficall' would be 'main',
for compatibility with GnuR.
There are four sub-directories:
include
common
jni
variable_defs
include
=======
'include' should be thought of as analogous to GnuR's src/include, i.e. internal headers needed by the code in 'src/main'.
What we are trying to do by redefining them here is provide a boundary so that we don't accidentally capture code from GnuR that
is specific to the implementation of GnuR and differs in FastR, e.g., the representation of R objects. Evidently not every
piece of GnuR code or internal header has that characteristic, but this strategy allows us some control to draw the boundary as
tightly as possible. Obviously we want to avoid duplicating (copying) code, as this requires validating the copy when migrating GnuR versions,
so there are three levels of implementation choice for the content of a header in this directory:
* Leave empty. This allows a #include to succeed and, if code does not actually use any symbols from the header, is ok.
* Indirect to the real GnuR header. This is potentially dangerous but a simple default for code that uses symbols from the header.
* Extract specific definitions from the GnuR header into a cut-down version. While this copies code it may be necessary
to avoid unwanted aspects of the GnuR header. In principle this can be done by a 'copy with sed' approach.
The indirection requires the use of the quote form of the #include directive. To avoid using a path that is GnuR version dependent,
the file gnurheaders.mk provides a make variable GNUR_HEADER_DEFS with a set of appropriate -D CFLAGS.
Ideally, code is always compiled in such a way that headers are never implicitly read from GnuR, only via the 'include' directory.
Unfortunately this cannot always be guaranteed, as a directive of the form #include "foo.h" (as opposed to #include <foo.h>) in the
GnuR C code will always access a header in the same directory as the code being compiled. I.e., only the angle-bracket form can be controlled
by the -I compiler flag. If this is a problem, the only solution is to 'copy with sed' the .c file and convert the quote form to the
angle-bracket form.
common
======
'common' contains code that has no explicit JNI dependencies and has been extracted for reuse in other implementations. This code is mostly
copied/included from GnuR. N.B. Some modified files have a "_fastr" suffix to avoid a clash with an existing file in GnuR that would match
the Makefile rule for compiling directly from the GnuR file.
jni
===
'jni' contains the implementation that is based on and has explicit dependencies on Java JNI.
The R FFI is rather baroque and defined in a large set of header files in the 'include' directory that is a sibling of 'fficall'.
In GnuR, the implementation of the functions is spread over the GnuR C files in 'src/main'. To ease navigation of the FastR implementation,
in general, the implementation of the functions in a header file 'Rxxx.h' is stored in the file 'Rxxx.c'.
The points of entry from Java are defined in the file rfficall.c. Various utility functions are defined in rffiutils.{h,c}.
variable_defs
=============
The GnuR FFI defines a large number of (extern) variables, the definitions of which, in GnuR, are scattered across the source files.
In FastR these are collected into one file, variable_defs.h. However, the actual initialization of the variables is, in general, implementation
dependent. In order to support a JNI and a non-JNI implementation, the file is stored in a separate directory.
@@ -4,3 +4,4 @@
[Limitations](Limitations.md)
[For Developers](dev/Index.md)
# FastR Developer Documentation
## Index
[R FFI Implementation](ffi.md)
# The R FFI Implementation
# Introduction
The implementation of the [R FFI](https://cran.r-project.org/doc/manuals/r-release/R-exts.html) is contained in the `fficall` directory of
the `com.oracle.truffle.r.native` project. It's actually a bit more than that, as it also contains code copied from GNU R, for example code that supports graphics or that is sufficiently
simple that it is neither necessary nor desirable to reimplement in Java. As this has evolved, a better name for `fficall` would probably be `main`,
for compatibility with GNU R.
There are four sub-directories in `fficall/src`:
* `include`
* `common`
* `variable_defs`
* `jni`
## The `fficall/include` directory
`include` should be thought of as analogous to GNU R's `src/include`, i.e. internal headers needed by the code in `src/main`.
What we are trying to do by redefining them here is provide a boundary so that we don't accidentally capture code from GNU R that
is specific to the implementation of GNU R and differs in FastR, e.g., the representation of R objects. Evidently not every
piece of GNU R code or internal header has that characteristic, but this strategy allows us some control to draw the boundary as
tightly as possible. Obviously we want to avoid duplicating (copying) code, as this requires validating the copy when migrating GNU R versions,
so there are three levels of implementation choice for the content of a header in this directory:
* Leave empty. This allows a `#include` to succeed and, if code does not actually use any symbols from the header, is ok.
* Indirect to the real GNU R header. This is potentially dangerous but a simple default for code that uses symbols from the header.
* Extract specific definitions from the GNU R header into a cut-down version. While this copies code it may be necessary to avoid unwanted aspects of the GNU R header. In principle this can be done by a "copy with sed" approach.
The indirection requires the use of the quote form of the `#include` directive. To avoid using a path that is GNU R version dependent,
the file `gnurheaders.mk` provides a make variable `GNUR_HEADER_DEFS` with a set of appropriate `-D` `CFLAGS`.
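
The indirection relies on the preprocessor accepting a macro as the operand of `#include`. The following is a minimal single-file sketch of that mechanism, not the actual FastR header: the macro name `DEMO_HEADER`, the example path in the comment and the function `demo_name_length` are hypothetical, and a standard header is used as the fallback target so that the example compiles on its own. In the real build the macros would come from `GNUR_HEADER_DEFS`.

```c
/*
 * Sketch of a macro-indirected include. In a real build the macro would be
 * supplied on the command line, for example (hypothetical name and path):
 *   cc -DDEMO_HEADER='"/path/to/gnur/src/include/Defn.h"' ...
 */
#ifndef DEMO_HEADER
#define DEMO_HEADER "string.h"  /* fallback so the sketch is self-contained */
#endif

#include DEMO_HEADER            /* quote form; the path is chosen by the macro */

/* Uses a symbol from whichever header the macro selected (here: strlen). */
size_t demo_name_length(const char *name) {
    return strlen(name);
}
```

Because the path is carried by a `-D` flag rather than written into the header, moving to a different GNU R version should only require changing the make variable.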
Ideally, code is always compiled in such a way that headers are never implicitly read from GNU R, only via the `include` directory.
Unfortunately this cannot always be guaranteed, as a directive of the form `#include "foo.h"` (as opposed to `#include <foo.h>`) in the
GNU R C code will always access a header in the same directory as the code being compiled. I.e., only the angle-bracket form can be controlled
by the `-I` compiler flag. If this is a problem, the only solution is to "copy with sed" the `.c` file and convert the quote form to the
angle bracket form.
## The `common` directory
`common` contains code that has no explicit JNI dependencies and has been extracted for reuse in other implementations. This code is mostly
copied/included from GNU R. N.B. Some modified files have a `_fastr` suffix to avoid a clash with an existing file in GNU R that would match
the Makefile rule for compiling directly from the GNU R file.
## The `variable_defs` directory
The GNU R FFI defines a large number of (extern) variables the definitions of which, in GNU R, are scattered across the source files.
In FastR these are collected into one file, `variable_defs.h`. However, the actual initialization of the variables is, in general, implementation
dependent. In order to support a JNI and a non-JNI implementation, the file is stored in a separate directory.
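
The split between collected definitions and implementation-dependent initialization can be sketched as follows. This is an illustration under stated assumptions rather than the contents of `variable_defs.h`: the type `DEMO_SEXP`, the variables `Demo_NilValue` and `Demo_GlobalEnv`, and the function `demo_init_variables` are all hypothetical stand-ins, and the placeholder objects merely take the place of whatever values a given implementation would supply.

```c
#include <stddef.h>

/* Hypothetical stand-in for R's SEXP; the real type comes from the R headers. */
typedef void *DEMO_SEXP;

/* --- Part 1: definitions collected in one place (the role of a shared header) --- */
DEMO_SEXP Demo_NilValue;
DEMO_SEXP Demo_GlobalEnv;

/* --- Part 2: initialization, which differs per implementation --- */
static int demo_nil_object;         /* placeholder for an implementation-specific value */
static int demo_global_env_object;  /* placeholder for an implementation-specific value */

/* One implementation's way of giving the shared variables their values. */
void demo_init_variables(void) {
    Demo_NilValue = &demo_nil_object;
    Demo_GlobalEnv = &demo_global_env_object;
}
```

A second implementation would reuse part 1 unchanged and provide its own version of part 2.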
## The `jni` directory
`jni` contains the implementation that is based on and has explicit dependencies on Java JNI. It is described in more detail [here](jni_ffi.md).
# Notes on the JNI implementation
# Introduction
The R FFI is rather baroque and defined in a large set of header files in the `include` directory that is a sibling of `fficall`.
In GNU R, the implementation of the functions is spread over the GNU R C files in `src/main`. To ease navigation of the FastR implementation,
in general, the implementation of the functions in a header file `Rxxx.h` is stored in the file `Rxxx.c`.
The points of entry from Java are defined in the file `rfficall.c`. Various utility functions are defined in `rffiutils.{h,c}`.
## JNI References
Java object values are passed to native code using JNI local references that are valid for the duration of the call. The reference protects the object from garbage collection. Evidently, if native code holds on to a local reference by storing it in a native variable, that object might be collected once the call returns, possibly causing incorrect behavior (at best) later in the execution. It is possible to convert a local reference to a global reference that preserves the object across multiple JNI calls, but this risks preventing objects from being collected. The global variables defined in the R FFI, e.g. `R_NilValue`, are necessarily handled as global references. By default, other values are left as local references, with some risk that native code might capture a value that would then be collected once the call completes; this can be changed by setting the variable `alwaysUseGlobal` in `rffiutils.c` to a non-zero value.
## Vector Content Copying
The R FFI provides access to vector contents as raw C pointers, e.g., `int *`. This requires the use of the JNI functions to access/copy the underlying data. In addition it requires that multiple calls on the same SEXP always return the same raw pointer.
Similar to the discussion on JNI references, the raw data is released at the end of the call. There is currently no provision to retain this data across multiple JNI calls.
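
The two requirements above (the same raw pointer for repeated requests, and release at the end of the call) can be modelled with a small per-call cache. The sketch below is plain C and deliberately avoids JNI so that it stands alone; `demo_vector`, `demo_integer_ptr`, `demo_cached_count` and `demo_call_exit` are hypothetical names, and the `memcpy` stands in for whatever JNI array access function the real code uses. Copying modified data back to the Java side is omitted.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_CACHED 16

/* Hypothetical stand-in for an integer vector whose storage lives on the Java side. */
typedef struct {
    const int *managed_data;
    size_t length;
} demo_vector;

/* One entry per vector whose contents were handed out during the current call. */
typedef struct {
    const demo_vector *owner;
    int *native_copy;
} demo_cache_entry;

static demo_cache_entry demo_cache[DEMO_MAX_CACHED];
static size_t demo_cache_used = 0;

/* Return a raw pointer to a native copy of the vector's contents.
 * Repeated requests for the same vector within one call return the same pointer. */
int *demo_integer_ptr(const demo_vector *v) {
    for (size_t i = 0; i < demo_cache_used; i++) {
        if (demo_cache[i].owner == v) {
            return demo_cache[i].native_copy;
        }
    }
    if (demo_cache_used == DEMO_MAX_CACHED) {
        return NULL;  /* cache full; a real implementation would grow it */
    }
    /* Allocate at least one element so an empty vector still gets a valid pointer. */
    size_t count = v->length > 0 ? v->length : 1;
    int *copy = malloc(count * sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    if (v->length > 0) {
        memcpy(copy, v->managed_data, v->length * sizeof *copy);
    }
    demo_cache[demo_cache_used].owner = v;
    demo_cache[demo_cache_used].native_copy = copy;
    demo_cache_used++;
    return copy;
}

/* Number of copies currently held for the call in progress. */
size_t demo_cached_count(void) {
    return demo_cache_used;
}

/* Release every copy at the end of the call; pointers handed out earlier become invalid. */
void demo_call_exit(void) {
    for (size_t i = 0; i < demo_cache_used; i++) {
        free(demo_cache[i].native_copy);
        demo_cache[i].owner = NULL;
        demo_cache[i].native_copy = NULL;
    }
    demo_cache_used = 0;
}
```

In this model, native code that keeps a pointer obtained from `demo_integer_ptr` after `demo_call_exit` has run is left with a dangling pointer, which mirrors the lack of any provision to retain the data across calls.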