Tuesday, April 15, 2014

NumPy on PyPy - Status Update

Work on NumPy on PyPy continued in March, though at a lighter pace than the previous few months. Progress was made on both compatibility and speed fronts. Several behavioral issues reported to the bug tracker were resolved. The most significant of these was probably the correction of casting to built-in Python types. Previously, int/long conversions of numpy scalars such as inf/nan/1e100 would return bogus results. Now, they raise or return values, as appropriate.

On the speed front, enhancements to the PyPy JIT were made to support virtualizing the raw_store/raw_load memory operations used in numpy arrays. Further work remains here in virtualizing the alloc_raw_storage when possible. This will allow scalars to have storages but still be virtualized when possible in loops.

Aside from continued work on compatibility/speed of existing code, we also hope to begin implementing the C-level components of other numpy modules such as mtrand, nditer, linalg, and so on. Several approaches could be taken to get C-level code in these modules working, ranging from reimplementing in RPython to interfacing with existing code with CFFI, if possible. The appropriate approach depends on many factors and will probably vary from module to module.

To try out PyPy + NumPy, grab a nightly PyPy and install our NumPy fork. Feel free to report comments/issues to IRC, our mailing list, or bug tracker. Thanks to the contributors to the NumPy on PyPy proposal for supporting this work.


Werner Beroux said...

Trying to install scipy on top gives me an error while compiling scipy/cluster/src/vq_module.c; isn't scipy yet supported?

Anonymous said...

scipy is not supported. Sometimes scipy functions are in fact in numpy in which case you can just copy the code. Otherwise you need to start learning cffi.

Yichao Yu said...

You mentioned storage and scalar types. Is it related to this bug

vak said...

what is the status about incorporating BLAS library?

Anonymous said...

How far is running Pandas on Pypy? Will it be just a recompile when Numpy is ported, or is it heavy work to port Pandas to Pypy after Numpy is done? Should I look after another solution than plan to run Pandas on Pypy?