

We are going to create our first Numba version of the NumPy-based ray tracer. We are working to make our processing as fast as possible. For context, you can find the article on the original NumPy version here.
The code for both versions is quite small and available in a single file in the Git repository.
It’s as simple as…
If you want to use Numba, you just need to decorate the relevant functions in the code linked above with numba.njit:
import numba

@numba.njit
def hit_sphere(center, radius, ray):
    ...

@numba.njit
def color_sphere(ray):
    ...

@numba.njit
def run_raytrace():
    ...
That is all there is to it, really. Or is it?
This is a bit of a lie
Honestly, I made our life easier by designing the NumPy version carefully from the outset, so that it would be easy to turn it into efficient Numba code.
Sprinkling the code with Numba decorators doesn’t do much most of the time. We have to be mindful of the Python dialect we are using, and many times we will have to change the code in order for Numba to really be able to optimize it.
So, do not expect your first code to be this easy to optimize: tweaks are normally necessary. Further below you will find a list of design choices in the NumPy version that made creating an efficient Numba version so easy. We will be revisiting the design concerns of writing code for Numba in future posts.
That being said, this is a good starting point to discuss important Numba issues. Let’s get to it.
Decorating with jit vs njit
We used numba.njit as our decorator. There is a more general decorator, numba.jit (JIT meaning Just-In-Time compilation). njit is equivalent to jit(nopython=True). What is going on here?
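By default, jit will try to generate code that doesn’t interact with the Python object machinery but, and this is crucial, it will fall back to inter-operating with the Python object machinery if it fails to convert everything to low-level code. (This describes older Numba releases; since Numba 0.59, jit defaults to nopython=True and behaves like njit unless you explicitly ask for object mode.)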
By far the biggest obstacle to fast code is interaction with the Python object machinery. So we want, when possible, to generate code that is as free of such interaction as possible!
The njit decorator, on the other hand, will fail if it isn’t able to generate code that has zero interactions with the Python object machinery. It is stricter and, quite frankly, these are the semantics we want most of the time if we want really fast code.
Actually, you can pepper your code with jit decorators and Numba will happily oblige. The problem: the generated solution will probably gain very little in performance. It can even end up slower than a pure-Python solution.
Use njit. Yes, it will be painful, as you will have to redesign your code - you will get many failures from Numba. But you can gain orders of magnitude in performance. Just as importantly, at least in my view, when you suffer through njit error messages - and redesign your code accordingly - you will learn a lot about what patterns make code efficient in the Python world. This pain actually leads to some gain in terms of knowledge and maturity about performance.
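To make the strictness of njit concrete, here is a minimal sketch. The names plain_norm, normalize and compiles_with_njit are hypothetical and not part of the ray tracer; the sketch also assumes a recent Numba release, where TypingError lives in numba.core.errors. It tries to compile one self-contained function and one function that calls an undecorated Python helper, which counts as an interaction with the Python object machinery.

```python
import numpy as np


def plain_norm(v):
    # Ordinary, undecorated Python function (hypothetical helper).
    return np.sqrt((v * v).sum())


def normalize(v):
    # Calls plain_norm, which Numba only sees as an opaque Python object.
    return v / plain_norm(v)


def compiles_with_njit(func, *args):
    """Return True if numba.njit compiles and runs func(*args), False on a typing failure."""
    # Imported here so the plain functions above stay usable without Numba.
    import numba
    from numba.core.errors import TypingError  # assumed location in recent Numba releases

    try:
        numba.njit(func)(*args)  # compilation happens on the first call
    except TypingError:
        return False
    return True
```

With Numba installed, compiles_with_njit(plain_norm, np.array([3.0, 4.0])) should return True, while the same call with normalize should return False, because njit refuses to call the undecorated plain_norm from compiled code.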
Side note: We will be using different syntactic sugar from @njit
If you look at the code in the repository, you will see that we don’t use decorators as above. Because we want to easily compare the performance of Numba vs non-Numba implementations, we are going to do something less common than using a decorator with the typical syntax.
We want to be able to call all functions with and without Numba. So, instead of using the decorators as you see above, we are going to wrap the functions only when we actually decide to use Numba. Compare the two implementations below:
@app.command()
def run_non_numba():
    array = run_raytrace()
    # Pillow is only called here, outside the code that Numba compiles
    img = Image.fromarray(np.transpose(array, (1, 0, 2)))
    img.save("out.png")

@app.command()
def run_numba():
    # Rebind the module-level helpers so that the compiled run_raytrace calls compiled versions
    global hit_sphere, color_sphere
    hit_sphere = numba.njit()(hit_sphere)
    color_sphere = numba.njit()(color_sphere)
    array = numba.njit()(run_raytrace)()
    img = Image.fromarray(np.transpose(array, (1, 0, 2)))
    img.save("out.png")

app.run()
This allows us to optimize the code only if we decide to use the Numba version. We are replacing the functions - originally not Numba based - with Numba-decorated versions when we use run_numba. This generates the same code as using the standard decoration syntax, really. The only difference is that you decorate only when you decide to use Numba.
This is just decorator syntax shuffling so that we can choose at run time whether to optimize; it is not really important for our optimization problem. No need to dwell on it much. Back to our scheduled programming…
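The same idea can be packed into a small helper. This is only a sketch: maybe_njit and sum_of_squares are hypothetical names that do not exist in the repository. It relies on the fact that a decorator is just a function that takes a function, so applying it by hand is equivalent to the @ syntax.

```python
def maybe_njit(func, use_numba):
    """Return func compiled with numba.njit, or func itself when use_numba is False."""
    if not use_numba:
        return func
    import numba  # imported lazily, so the plain path does not need Numba at all
    return numba.njit(func)


def sum_of_squares(n):
    # Toy workload (hypothetical), written in a style that Numba can compile.
    total = 0.0
    for i in range(n):
        total += i * i
    return total


plain = maybe_njit(sum_of_squares, use_numba=False)
print(plain(4))  # 0 + 1 + 4 + 9 = 14.0
```

Note that a helper like this only covers functions that call nothing else. For run_raytrace, the helpers it calls must also be rebound under their original names, which is what the global statement in run_numba is for.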
Calling the code
Calling the Numba-based code is as simple as:
time python -m numba_raytracer.basic run-numba
You can find an image with the rendering in the file out.png. Not exactly the most exciting of images: a red sphere on a blue gradient, but hey, it works.
Remember that you can call the non-optimized version with:
time python -m numba_raytracer.basic run-non-numba
Performance analysis
On my computer the Numba version is 13 times faster than the non-Numba one! Not bad for just adding 3 simple code annotations.
No, I am not going to post the absolute numbers, and that is on purpose. I am not particularly fond of over-quantification at this level of discussion. The point here is to have a quasi-intuitive grasp of the advantages of Numba. Feel free to check the values on your machine.
But don’t worry, in several future posts we are going to dig really deep into performance analysis and profiling of code. For now I am keeping the comparative analysis deliberately shallow, but that will surely change later.
Where I am fooling you
It seems so easy, doesn’t it? You just add some decorators and off you go!
It is rarely that easy (unless you use jit instead of njit, but as we discussed above that would not really mean much), and it only worked in this case because the original code was designed to make the conversion easy.
In most cases code will have to be adapted to be processed by Numba.
Here are a few pitfalls that I fell into while coding the first version, in the previous article, and that I needed to rewrite for Numba njit to do its magic. These are no longer visible in the code, as I fixed them, but I want to make sure you are aware of them.
You need to decorate all the functions that you are calling. So, it’s not enough to decorate the top-level run_raytrace: we also need to decorate color_sphere and hit_sphere. Not a big deal for such a small application. Remember that you should not decorate everything - just the computationally expensive part.
Calls to the Pillow image library are done outside the optimized code. While Numba can optimize NumPy, that is actually an exception among Python libraries. If you use external libraries, you normally have to make sure any calls to those libraries are not inside an optimized (njit) function. Numba won’t process that code and it will fail. Again, this is mostly OK, as Numba is not meant to convert all code, only the performance-critical part.
As you can see, I did not use typing annotations. We will be discussing type annotations of Numba code at a later stage.
Numba requires the return types of a function to be more consistent than pure Python does. For example, look at this version of color_sphere:
def color_sphere_doesnt_work(ray):
    if hit_sphere(np.array([0, 0, -1]), 0.5, ray):
        return np.array([1, 0, 0])
    unit_direction = ray[1] / np.linalg.norm(ray[1])
    t = 0.5 * (unit_direction[1] + 1.0)
    return np.array([1.0, 1.0, 1.0]) * (1 - t) + np.array([0.5, 0.7, 1.0]) * t
Note that the first return produces an array of integers, whereas the second return is float based. Numba will refuse to process this code (which is perfectly valid otherwise) as it expects both returns to have the same type. So for this to work we need to annotate the first array creation like this: return np.array([1, 0, 0], dtype=np.float64). Then both returns will have the same type - an array of floats in this case.
As a question of style and bug-resilience I actually prefer that Numba requires this. You are assured that you always return the same type.
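The type mismatch is already visible in plain NumPy, without involving Numba at all. The sketch below makes some assumptions: hit_sphere is a stand-in based on the usual discriminant test, not necessarily the repository’s version, and a ray is taken to be a 2x3 array with the origin in ray[0] and the direction in ray[1], as the indexing in color_sphere suggests. Dot products and the norm are written as explicit sums purely to keep the sketch dependency-light.

```python
import numpy as np

# The two array literals really do have different element types.
int_color = np.array([1, 0, 0])                      # integer dtype (platform dependent width)
float_color = np.array([1, 0, 0], dtype=np.float64)  # explicit float64


def hit_sphere(center, radius, ray):
    # Stand-in (assumption): the ray hits the sphere if the quadratic has real roots.
    oc = ray[0] - center
    a = (ray[1] * ray[1]).sum()
    b = 2.0 * (oc * ray[1]).sum()
    c = (oc * oc).sum() - radius * radius
    return b * b - 4.0 * a * c > 0.0


def color_sphere(ray):
    if hit_sphere(np.array([0.0, 0.0, -1.0]), 0.5, ray):
        # dtype given explicitly so this branch matches the float branch below
        return np.array([1, 0, 0], dtype=np.float64)
    unit_direction = ray[1] / np.sqrt((ray[1] * ray[1]).sum())
    t = 0.5 * (unit_direction[1] + 1.0)
    return np.array([1.0, 1.0, 1.0]) * (1 - t) + np.array([0.5, 0.7, 1.0]) * t
```

Checking color_sphere(ray).dtype for a ray that hits the sphere and for one that misses it is a cheap way to confirm that both branches agree before handing the function to njit.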
Wrapping up
This concludes our first example with Numba.
We started a discussion on how to write Python code that can actually be optimized by Numba. We introduced the decorator syntax for using Numba, the two fundamental compilation approaches (jit and njit), and some of the pitfalls related to making the code compilable with njit.
We still have a long, long journey ahead of us…