diff --git a/core/alien/alien-docs.factor b/core/alien/alien-docs.factor index 95b29ee50b..7bba9d7332 100755 --- a/core/alien/alien-docs.factor +++ b/core/alien/alien-docs.factor @@ -210,8 +210,9 @@ $nl ARTICLE: "alien-callback" "Calling Factor from C" "Callbacks can be defined and passed to C code as function pointers; the C code can then invoke the callback and run Factor code:" { $subsection alien-callback } -"There are some details concerning the conversion of Factor objects to C values, and vice versa. See " { $link "c-data" } "." -{ $subsection "alien-callback-gc" } ; +"There are some caveats concerning the conversion of Factor objects to C values, and vice versa. See " { $link "c-data" } "." +{ $subsection "alien-callback-gc" } +{ $see-also "byte-arrays-gc" } ; ARTICLE: "dll.private" "DLL handles" "DLL handles are a built-in class of objects which represent loaded native libraries. DLL handles are instances of the " { $link dll } " class, and have a literal syntax used for debugging prinouts; see " { $link "syntax-aliens" } "." @@ -290,7 +291,7 @@ $nl "The C library interface is entirely self-contained; there is no C code which one must write in order to wrap a library." $nl "C library interface words are found in the " { $vocab-link "alien" } " vocabulary." -{ $warning "Since C does not retain runtime type information or do any kind of runtime type checking, any C library interface is not pointer safe. Improper use of C functions can crash the runtime or corrupt memory in unpredictible ways." } +{ $warning "C does not perform runtime type checking, automatic memory management or array bounds checks. Incorrect usage of C library functions can lead to crashes, data corruption, and security exploits." 
} { $subsection "loading-libs" } { $subsection "alien-invoke" } { $subsection "alien-callback" } diff --git a/core/alien/c-types/c-types-docs.factor b/core/alien/c-types/c-types-docs.factor index fe6873ac3a..8d2b03467b 100755 --- a/core/alien/c-types/c-types-docs.factor +++ b/core/alien/c-types/c-types-docs.factor @@ -158,6 +158,19 @@ HELP: define-out { $description "Defines a word " { $snippet "<" { $emphasis "name" } ">" } " with stack effect " { $snippet "( value -- array )" } ". This word allocates a byte array large enough to hold a value with C type " { $snippet "name" } ", and writes the value at the top of the stack to the array." } { $notes "This is an internal word called when defining C types, there is no need to call it on your own." } ; +ARTICLE: "byte-arrays-gc" "Byte arrays and the garbage collector" +"The Factor garbage collector can move byte arrays around, and it is only safe to pass byte arrays to C functions if the garbage collector will not run while C code still has a reference to the data." +$nl +"In particular, a byte array can only be passed as a parameter if the C function does not use the parameter after one of the following occurs:" +{ $list + "the C function returns" + "the C function calls Factor code via a callback" +} +"Returning from C to Factor, as well as invoking Factor code via a callback, may trigger garbage collection, and if the function had stored a pointer to the byte array somewhere, this pointer may cease to be valid." +$nl +"If this condition is not satisfied, " { $link "malloc" } " must be used instead." +{ $warning "Failure to comply with these requirements can lead to crashes, data corruption, and security exploits." } ; + ARTICLE: "c-out-params" "Output parameters in C" "A frequently-occurring idiom in C code is the \"out parameter\". If a C function returns more than one value, the caller passes pointers of the correct type, and the C function writes its return values to those locations." 
$nl @@ -229,13 +242,11 @@ $nl { $subsection } { $subsection } { $warning -"The Factor garbage collector can move byte arrays around, and it is only safe to pass byte arrays to C functions if the function does not store a pointer to the byte array in some global structure, or retain it in any way after returning." -$nl -"Long-lived data for use by C libraries can be allocated manually, just as when programming in C. See " { $link "malloc" } "." } +"The Factor garbage collector can move byte arrays around, and code passing byte arrays to C must obey important guidelines. See " { $link "byte-arrays-gc" } "." } { $see-also "c-arrays" } ; ARTICLE: "malloc" "Manual memory management" -"Sometimes data passed to C functions must be allocated at a fixed address, and so garbage collector managed byte arrays cannot be used. See the warning at the bottom of " { $link "c-byte-arrays" } " for a description of when this is the case." +"Sometimes data passed to C functions must be allocated at a fixed address. See " { $link "byte-arrays-gc" } " for an explanation of when this is the case." $nl "Allocating a C datum with a fixed address:" { $subsection malloc-object } @@ -245,8 +256,6 @@ $nl { $subsection malloc } { $subsection calloc } { $subsection realloc } -"The return value of the above three words must always be checked for a memory allocation failure:" -{ $subsection check-ptr } "You must always free pointers returned by any of the above words when the block of memory is no longer in use:" { $subsection free } "You can unsafely copy a range of bytes from one memory location to another:" @@ -271,20 +280,25 @@ ARTICLE: "c-strings" "C strings" { $subsection string>u16-alien } { $subsection malloc-char-string } { $subsection malloc-u16-string } -"The first two allocate " { $link byte-array } "s, and the latter allocates manually-managed memory which is not moved by the garbage collector and has to be explicitly freed by calling " { $link free } "." 
+"The first two allocate " { $link byte-array } "s, and the latter allocates manually-managed memory which is not moved by the garbage collector and has to be explicitly freed by calling " { $link free } ". See " { $link "byte-arrays-gc" } " for a discussion of the two approaches." $nl "Finally, a set of words can be used to read and write " { $snippet "char*" } " and " { $snippet "ushort*" } " strings at arbitrary addresses:" { $subsection alien>char-string } -{ $subsection alien>u16-string } ; +{ $subsection alien>u16-string } +"For example, if a C function returns a " { $snippet "char*" } " but stipulates that the caller must deallocate the memory afterward, you must define the function as returning " { $snippet "void*" } ", and call one of the above words before passing the pointer to " { $link free } "." ; ARTICLE: "c-data" "Passing data between Factor and C" -"Two defining characteristics of Factor are dynamic typing and automatic memory management, which are somewhat incompatible with the machine-level data model exposed by C. Factor's C library interface defines its own set of C data types, distinct from Factor language types, together with automatic conversion between Factor values and C types. For example, C integer types must be declared and are fixed-width, whereas Factor supports arbitrary-precision integers. Also Factor's garbage collector can move objects in memory, which means that special support has to be provided for passing blocks of memory to C code." +"Two defining characteristics of Factor are dynamic typing and automatic memory management, which are somewhat incompatible with the machine-level data model exposed by C. Factor's C library interface defines its own set of C data types, distinct from Factor language types, together with automatic conversion between Factor values and C types. For example, C integer types must be declared and are fixed-width, whereas Factor supports arbitrary-precision integers." 
+$nl +"Furthermore, Factor's garbage collector can move objects in memory; for a discussion of the consequences, see " { $link "byte-arrays-gc" } "." { $subsection "c-types-specs" } { $subsection "c-byte-arrays" } { $subsection "malloc" } { $subsection "c-strings" } { $subsection "c-arrays" } { $subsection "c-out-params" } +"Important guidelines for passing data in byte arrays:" +{ $subsection "byte-arrays-gc" } "C-style enumerated types are supported:" { $subsection POSTPONE: C-ENUM: } "C types can be aliased for convenience and consitency with native library documentation:" diff --git a/core/alien/compiler/compiler-tests.factor b/core/alien/compiler/compiler-tests.factor index 7e2e23726b..f9dc426de1 100755 --- a/core/alien/compiler/compiler-tests.factor +++ b/core/alien/compiler/compiler-tests.factor @@ -330,11 +330,11 @@ FUNCTION: double ffi_test_36 ( test-struct-12 x ) ; ! Hack; if we're on ARM, we probably don't have much RAM, so ! skip this test. -cpu "arm" = [ - [ "testing" ] [ - "testing" callback-5a callback_test_1 - ] unit-test -] unless +! cpu "arm" = [ +! [ "testing" ] [ +! "testing" callback-5a callback_test_1 +! ] unit-test +! ] unless : callback-6 "void" { } "cdecl" [ [ continue ] callcc0 ] alien-callback ;