
Optimize A Function That Acts On A Numpy Array With An If Statement

Suppose I have a code like:

import numpy as np

def value_error(x):
    if x > 10:
        return 0.
    else:
        return np.sin(x)

This could give me a ValueError: The tru
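For reference, here is a minimal sketch that reproduces the failure. The sample array `x` is an assumption chosen for illustration; the function is the one from the question.

```python
import numpy as np

def value_error(x):
    if x > 10:
        return 0.
    else:
        return np.sin(x)

# Works for a scalar: the comparison yields a single boolean.
scalar_result = value_error(0.5)

# Fails for an array with more than one element: `x > 10` is a boolean
# array, and `if` cannot reduce it to a single True/False.
x = np.array([0.1, 11.0])  # hypothetical sample input
try:
    value_error(x)
    raised = False
except ValueError:
    raised = True
```

The answers below all avoid the `if` by computing the condition as a boolean array and using it to select elements.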

Solution 1:

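Many ufuncs accept a where parameter, which applies the function only where the mask is True. The remaining positions of out are left untouched, so out should be initialised first: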

In [98]: x=np.arange(10)*2
In [99]: mask = x<10
In [100]: y = np.zeros(10)
In [101]: np.sin(x,where=mask,out=y)
Out[101]: 
array([ 0.        ,  0.90929743, -0.7568025 , -0.2794155 ,  0.98935825,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ])

While this is a small case, timeit suggests that the where parameter has little advantage over the mask approach in @divakar's answer:

In [104]: timeit np.sin(x,where=mask,out=y)
5.17 µs ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [105]: timeit y[mask] = np.sin(x[mask])
4.69 µs ± 9.54 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

(For much larger x, the where parameter has a slight time advantage over the mask approach.)
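The where/out pattern can be wrapped in a small function, sketched below. The name `sin_where` and the `limit` parameter are hypothetical, and the threshold follows the question (`x > 10` maps to 0) rather than the `x<10` mask used in the session above.

```python
import numpy as np

def sin_where(x, limit=10):
    """Return sin(x) where x <= limit and 0 elsewhere (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    mask = x <= limit
    # Positions where mask is False are not written by the ufunc, so `out`
    # must be initialised; np.empty here would leave arbitrary values.
    out = np.zeros_like(x)
    np.sin(x, where=mask, out=out)
    return out

x = np.arange(10) * 2.0   # 0, 2, ..., 18
y = sin_where(x)
```

Converting to float up front keeps the dtype of `out` compatible with the result of `np.sin`, even when the input is an integer array.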


Solution 2:

Here's a mask-based approach that applies np.sin only to the valid elements -

out = np.zeros(x.shape)
mask = x <= 10
out[mask] = np.sin(x[mask])
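As a quick sanity check, the mask version can be compared against an element-by-element loop over the scalar logic from the question. This is only a sketch: `scalar_version`, `masked_sin` and the sample values are assumptions for illustration.

```python
import numpy as np

def scalar_version(v):
    # Scalar logic from the question, applied to one value at a time.
    return 0.0 if v > 10 else np.sin(v)

def masked_sin(x):
    x = np.asarray(x, dtype=float)
    out = np.zeros(x.shape)
    mask = x <= 10
    # np.sin only sees the elements selected by the mask.
    out[mask] = np.sin(x[mask])
    return out

x = np.array([0.1, 0.2, 0.3, 11.0, 10.0, 25.0])  # includes the boundary 10.0
expected = np.array([scalar_version(v) for v in x])
result = masked_sin(x)
```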

Leveraging the numexpr module for faster transcendental operations -

import numexpr as ne

out = np.zeros(x.shape)
mask = x <= 10
x_masked = x[mask]
out[mask] = ne.evaluate('sin(x_masked)')

Solution 3:

Note that your function does not take advantage of numpy's vectorisation. There are a few possible options.

Option 1
This seems like a good use case for np.where -

y = np.where(x > 10, 0, np.sin(x))

np.where picks each output value from one of the two inputs according to the mask provided. Here's a sample -

x
array([  0.1,   0.2,   0.3,  11. ,   0.1,  11. ])

np.where(x > 10, 0, np.sin(x))
array([ 0.09983342,  0.19866933,  0.29552021,  0.        ,  0.09983342,  0.        ])

Note that this method still calls the "expensive function" on every element, because np.sin(x) is evaluated in full before np.where selects from it.
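One way to see this is to record how many elements the function receives. The sketch below uses a hypothetical `expensive` wrapper around `np.sin` and compares np.where with boolean-mask assignment.

```python
import numpy as np

evaluated = []  # sizes of the arrays passed to the function

def expensive(a):
    # Hypothetical stand-in for a costly function; records its input size.
    evaluated.append(a.size)
    return np.sin(a)

x = np.array([0.1, 0.2, 0.3, 11.0, 0.1, 11.0])

# The argument is computed before np.where runs, so all 6 elements are used.
y_where = np.where(x > 10, 0, expensive(x))

# With boolean indexing, only the 4 elements with x <= 10 are passed in.
mask = x <= 10
y_mask = np.zeros(x.shape)
y_mask[mask] = expensive(x[mask])
```

Both approaches give the same result here; they differ only in how much work the function does.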


Option 2
Another possibility is to use a mask and set values conditionally -

y = np.sin(x)
y[x > 10] = 0

Similar to above, you could multiply x by a mask and call np.sin on the result. This works only because sin(0) = 0, and it still evaluates np.sin on every element -

y = np.sin(x * (x <= 10))

As Divakar mentioned, you can use numexpr with this condition -

import numexpr as ne
y = ne.evaluate("sin(x * (x <= 10))")

This may be faster than the options above, particularly for large arrays; time it on your own data to confirm.
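One caveat about the multiply-by-mask trick in Option 2: it zeroes the input rather than the output, so it gives the intended result only for functions that map 0 to 0. The sketch below uses `np.cos` as an assumed counterexample, with hypothetical sample values.

```python
import numpy as np

x = np.array([0.1, 0.2, 11.0, 12.0])
keep = x <= 10

# sin(0) == 0, so zeroing the input also zeroes the output.
y_sin = np.sin(x * keep)

# cos(0) == 1, so the same trick yields 1 where 0 was wanted.
y_cos_wrong = np.cos(x * keep)

# Masking the output instead works for any function.
y_cos_right = np.where(keep, np.cos(x), 0)
```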

