Monday, April 27, 2009

My project proposal, "Implementing NumPy's ufuncs using CorePy", was accepted for GSoC 2009. I'm pretty excited on a number of levels -- I get to experience GSoC, become more involved in the general Python community, and do the actual project itself.

Some background:

NumPy is a general-purpose Python library for scientific programming. In particular, NumPy contains an N-dimensional array class and a set of element-wise operations over these arrays called ufuncs (universal functions). These operations include basic arithmetic (add, multiply), math functions (sin, cos), and logic/comparison functions (min, max, greater than). Ufuncs also provide more advanced operations like reduce and accumulate.
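To make that concrete, here is a small sketch of a few ufuncs using NumPy's public API. The array values are made up purely for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# Element-wise arithmetic and comparison ufuncs.
total = np.add(a, b)          # same as a + b
bigger = np.greater(b, a)     # element-wise b > a
smallest = np.minimum(a, b)   # element-wise minimum of the two arrays

# reduce and accumulate are methods on binary ufuncs such as np.add.
sum_all = np.add.reduce(a)        # folds the array down to a single sum
running = np.add.accumulate(a)    # running (cumulative) sum
```

Every one of these calls loops over the array elements in compiled code, which is the part this project aims to speed up.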

CorePy provides a framework for generating and executing high-performance computation kernels (assembly code) at runtime directly in Python. Runtime-generated code can be specialized/optimized in domain-specific ways, using parameters that might only be available at runtime (think generative programming).

Put these two pieces together and you have my project. NumPy's ufuncs serve as a base for developing more complex array-based operations and algorithms, making them an ideal area for significant performance optimization. CorePy is an excellent tool for this task, especially due to the way it exposes the underlying processor architecture. x86 SSE instructions can (will) be used to develop vectorized ufunc operations, and multiple cores can be leveraged to obtain even more parallelism.
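As a rough illustration of the multi-core idea only, here is a sketch that splits an element-wise add into chunks and hands each chunk to a thread. It uses plain NumPy and the standard library rather than CorePy, so it is not how the project itself will work; `parallel_add` and its `workers` parameter are hypothetical names I made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def parallel_add(a, b, workers=4):
    """Hypothetical helper: add two equal-length 1-D arrays chunk by chunk."""
    out = np.empty(a.shape, dtype=np.result_type(a, b))
    # Chunk boundaries; each worker writes to its own disjoint slice of out.
    bounds = np.linspace(0, a.size, workers + 1, dtype=int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # The existing ufunc does the element-wise work on this chunk.
        np.add(a[lo:hi], b[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, range(workers)))  # wait for all chunks to finish
    return out


x = np.arange(10.0)
y = np.ones(10)
result = parallel_add(x, y)
```

Because the chunks never overlap, the workers do not need any locking. Whether this actually runs faster depends on the array size and the machine, so treat it as a picture of the decomposition rather than a benchmark.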

I'll keep updating here as I progress through the project.