Calling C/C++ extensions with ctypes
Objectives
You will learn:
- how to call C/C++ compiled code from Python using the ctypes module
- how to compile a C++ extension using setuptools
We’ll use the code under cext. Start by changing to that directory:
cd cext
Why extend Python with C/C++
- You want to call a function that is implemented in C, C++ or Fortran. This can give you access to a vast collection of libraries so you won’t have to reinvent the wheel
- You have identified a performance bottleneck - reimplementing some parts of your Python code in C, C++ or Fortran could give you a performance boost
- It makes your code type safe. In contrast to C, C++ and Fortran, Python is dynamically typed: you can pass any object to any Python function, so type errors only surface at runtime. In C, C++ or Fortran, many of these errors are caught by the compiler before the program runs
Pros
- A good way to glue Python with an external library
- Can be used to incrementally migrate code to C/C++
- Very flexible
- Simpler and easier to maintain than custom C extensions
Cons
- Has a learning curve, one must understand how Python and C work
- Mistakes often lead to segmentation faults, which can be hard to debug
Learn the basics
Let’s go to our example of computing the sum of all the elements of an array. In order not to interfere with the scatter code, let’s create a directory mysum_example and go to that directory:
mkdir mysum_example
cd mysum_example
Open your editor and copy-paste the following code
/**
 * Compute the sum of an array
 * @param n number of elements
 * @param array input array
 * @return sum
 */
extern "C" // required when using C++ compiler
long long mysum(int n, int* array) {
    // return type is 64 bit integer
    long long res = 0;
    for (int i = 0; i < n; ++i) {
        res += array[i];
    }
    return res;
}
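Save the above in file mysum.cpp.
Note: the extern "C" line gives function mysum C linkage. This disables C++ name mangling, so the symbol in the shared library is named exactly mysum and the function can be called from C. Because Python is written in C and ctypes looks functions up by name, your external function must be C callable.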
To compile mysum.cpp we need to write a setup.py file. Open your editor, copy-paste the lines
from setuptools import setup, Extension
# Compile *mysum.cpp* into a shared library
setup(
    #...
    ext_modules=[Extension('mysum', ['mysum.cpp'],),],
)
and save in file setup.py. The fact that mysum.cpp has the .cpp extension indicates that the source file is written in C++.
Compile the code with the command:
python setup.py build
This will compile the code and produce a shared library under build/lib.linux-x86_64-3.6, something like mysum.cpython-36m-x86_64-linux-gnu.so. The exact directory and file names depend on your platform and Python version. The extension .so indicates that the above is a shared library (also called dynamic-link library or shared object). The advantage of creating a shared library over a static library is that, with the former, the Python interpreter need not be recompiled. The good news is that setuptools knows how to compile shared libraries so you won’t have to worry about the details.
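To see where the file name comes from, here is a minimal sketch that asks the running interpreter for its extension suffix and searches a build directory for the matching file. The helper name find_built_library and the build/lib*/ pattern are illustrative assumptions, not part of the lesson code; the directory layout can differ between setuptools versions.

```python
import glob
import sysconfig

# Suffix that this interpreter appends to compiled extension modules,
# for example ".cpython-36m-x86_64-linux-gnu.so" on Linux with Python 3.6
suffix = sysconfig.get_config_var("EXT_SUFFIX")
expected_name = "mysum" + suffix
print(expected_name)


def find_built_library(module_name, build_dir="build"):
    """Hypothetical helper: return the first matching library path, or None."""
    pattern = f"{build_dir}/lib*/{module_name}{suffix}"
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


# Prints None until "python setup.py build" has been run in this directory
print(find_built_library("mysum"))
```

Returning None rather than indexing into an empty list makes a missing build easier to diagnose than an IndexError.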
Notes:
- by convention this file should be named setup.py
- a more realistic example might have include directories and libraries listed in setup.py if the C++ extension depends on external packages. An example of a setup.py file can be found here.
Steps required to call an external function from Python
To call mysum from Python we’ll use the ctypes module. The steps are:
- use function CDLL to open the shared library. CDLL expects the path to the shared library and returns a shared library object.
- declare the argument and result types of the function. These are set through the function’s argtypes (a list) and restype attributes, respectively. Use for instance ctypes.c_int for a C int. See the table below to find out how to translate other C types to their corresponding ctypes objects.
- call the function, casting the Python objects into ctypes objects if required. The table below shows which Python objects correspond to some common C/C++ types and can be handed over to an external C/C++ function.
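The three steps can be tried out before building anything, using a function that already exists in the C standard library. This is a minimal sketch that assumes a POSIX-like system (Linux or macOS) where ctypes.util.find_library can locate the C library; it uses abs, whose C signature is int abs(int).

```python
import ctypes
import ctypes.util

# 1. open the shared library (here the C standard library, not mysum)
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# 2. declare the argument and result types of: int abs(int)
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

# 3. call the function; the Python int is converted to a C int automatically
result = libc.abs(-5)
print(result)
```

Declaring argtypes is what lets ctypes reject a call such as libc.abs("5") with an exception instead of passing garbage to the C function.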
Translation table for some Python and C/C++ types
The following table can be used to translate some common types between Python and C:
Python | C/C++ type | Comments |
---|---|---|
None | NULL | |
ctypes.c_char_p | char* | |
ctypes.c_int | int | No need to cast |
ctypes.c_longlong | long long | |
ctypes.c_double | double | |
numpy.ctypeslib.ndpointer(dtype=numpy.float64) | double* | pass a numpy array of type numpy.float64 |
numpy.ctypeslib.ndpointer(dtype=numpy.int32) | int* | pass a numpy array of type numpy.int32 |
For a complete list of C to ctypes type mapping see the Python documentation.
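As a quick illustration of the scalar rows of the table, the sketch below creates a few ctypes objects and reads them back. The variable names are arbitrary, and the sizes printed are those of the platform the code runs on.

```python
import ctypes

# Scalar wrappers hold a C value; .value converts it back to a Python object
count = ctypes.c_int(42)
total = ctypes.c_longlong(2**40)   # too large for a 32-bit int
ratio = ctypes.c_double(2.5)
name = ctypes.c_char_p(b"hello")   # char* wraps bytes, not str
null = ctypes.c_char_p(None)       # None stands for a NULL pointer

print(count.value, total.value, ratio.value, name.value, null.value)

# Size in bytes of each C type on this platform
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "double": ctypes.sizeof(ctypes.c_double),
}
print(sizes)
```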
Calling mysum from Python
Let’s return to our mysum C++ function, which we would like to call from Python:
import ctypes
import numpy
import glob
# find the shared library, the path depends on the platform and Python version
libfile = glob.glob('build/*/mysum*.so')[0]
# 1. open the shared library
mylib = ctypes.CDLL(libfile)
# 2. tell Python the argument and result types of function mysum
mylib.mysum.restype = ctypes.c_longlong
mylib.mysum.argtypes = [ctypes.c_int,
numpy.ctypeslib.ndpointer(dtype=numpy.int32)]
array = numpy.arange(0, 100000000, 1, numpy.int32)
# 3. call function mysum
array_sum = mylib.mysum(len(array), array)
print('sum of array: {}'.format(array_sum))
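The result can be checked without the C++ library, and the same check shows why restype is declared as ctypes.c_longlong. This is a small sketch using only the closed-form sum 0 + 1 + ... + (n - 1) = n(n - 1)/2; it relies on ctypes integer types wrapping silently instead of raising on overflow.

```python
import ctypes

n = 100_000_000               # number of elements in the array above
expected = n * (n - 1) // 2   # 0 + 1 + ... + (n - 1)

# ctypes integer types do no overflow checking, they wrap around
fits_in_int = ctypes.c_int(expected).value == expected
fits_in_longlong = ctypes.c_longlong(expected).value == expected

print(expected, fits_in_int, fits_in_longlong)
```

The sum does not survive a round trip through a C int, so declaring restype as ctypes.c_int (the ctypes default) would silently return a wrong value.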
Additional explanation
- By default, arguments are passed by value. To pass an array of ints (int*), specify numpy.ctypeslib.ndpointer(dtype=numpy.int32) in the argtypes list. You can declare double* similarly by using numpy.ctypeslib.ndpointer(dtype=numpy.float64)
- Strings will need to be converted to byte strings in Python 3 (str(mystring).encode('ascii'))
- Passing by reference, for instance int&, can be achieved using ctypes.byref(myvar_t) with myvar_t of type ctypes.c_int
- Integer numpy arrays created without an explicit dtype typically have precision numpy.int64 on 64-bit platforms (the numpy.int alias was removed in NumPy 1.24), so make sure to create an array of type numpy.int32, which has the same precision as C int.
- When passing arrays, it is possible to specify extra restrictions on the numpy arrays at the interface level, for example the number of dimensions the array should have or its shape. If an array passed in as an argument does not meet the specified requirements, an exception will be raised. A full list of possible options can be found in the numpy.ctypeslib.ndpointer documentation.
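The points above can be sketched without the mysum library. The example assumes a POSIX-like system where ctypes.util.find_library can locate the C and math libraries. It calls the ndpointer type's from_param method directly, which is the check ctypes runs on each argument, and uses the standard C functions strlen and frexp as stand-ins. Note that frexp takes an int* rather than a C++ int&; from the Python side both are handled by passing the address of a ctypes object.

```python
import ctypes
import ctypes.util

import numpy
from numpy.ctypeslib import ndpointer

# --- Arrays: ndpointer validates dtype, dimensions and memory layout ---
int_vector = ndpointer(dtype=numpy.int32, ndim=1, flags="C_CONTIGUOUS")


def accepted(array):
    """Return True if array passes the checks ctypes would run for int_vector."""
    try:
        int_vector.from_param(array)
    except TypeError:
        return False
    return True


good = numpy.arange(10, dtype=numpy.int32)
checks = {
    "int32, 1-D, contiguous": accepted(good),
    "int64 dtype": accepted(good.astype(numpy.int64)),
    "2-D shape": accepted(good.reshape(2, 5)),
    "strided view": accepted(good[::2]),
}
print(checks)

# --- Strings: char* arguments take bytes, not str ---
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
length = libc.strlen("hello".encode("ascii"))

# --- Output arguments: pass the address of a ctypes object with byref ---
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.frexp.argtypes = [ctypes.c_double, ctypes.POINTER(ctypes.c_int)]
libm.frexp.restype = ctypes.c_double
exponent = ctypes.c_int(0)
mantissa = libm.frexp(8.0, ctypes.byref(exponent))  # 8.0 == 0.5 * 2**4
print(length, mantissa, exponent.value)
```

Only the first array is accepted; the others fail on dtype, number of dimensions and contiguity, respectively.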
Exercises
We’ve created a version of scatter.py that computes the scattered wave from a contour segment in C++. Compile the code using python setup.py build. (Make sure you have the BOOST_DIR environment variable set as described here.) The code runs faster than the pure Python code under original/; however, there is still room for improvement. Your task will be to replace the computation in Python function isInsideContour defined in scatter.py with the C++ version implemented in src/is_inside_contour.cpp.
- run and time/profile scatter.py, making note of the checksum
- look into src/is_inside_contour.h to determine the interface of the function
- in setup.py, add src/is_inside_contour.cpp to the wave library extension
- define the C++ function interface in scatter.py
- modify scatter.py to call the C++ function
- re-run and time/profile scatter.py, making sure the checksum has not changed