Public channel for discussing Numba usage. Don't post confidential info here! Consider posting questions to: https://numba.discourse.group/ !
Does changing float64[:,:] to float64[:,::1] get rid of the error?
With _debug_print=True in nrtdynmod.py, the values printed by NRT_Incref make no sense (e.g. I am getting *** NRT_Incref 393217 [...]), while NRT_Decref prints more reasonable values (e.g. ** NRT_Decref 2 [...]). Is this normal? (The count is printed with %zu for a value I believe to be of type signed int; changing the format to %hhi does little to solve the problem.)
Hi, all. I am a newbie to Numba and have been looking for a chance to contribute. I was trying to fix issue #6949 over the past weekend, which is my first attempt, but I am stuck now and would like some help.
The result of an array inplace binop (such as +=) is incorrect when an alias of the lhs exists in the expression. For example (provided in #6949):
@jit
def self_addition(x):
    x += x.T
    return x

x = np.array([[1, 2], [3, 4]])
self_addition(x)
> array([[2, 5], [8, 8]])
In this case, x will be temporarily changed to [[2, 5], [3, 4]], since the first row is computed first when the matrix addition is performed. Therefore x.T will be [[2, 3], [5, 4]] (the correct value is [[1, 3], [2, 4]]) when the second row is computed. I am still not sure which pass caused this error; does anyone know anything about this?
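For reference, a quick sketch of what plain NumPy does with the same overlap. My understanding is that since NumPy 1.13, ufuncs detect memory overlap between inputs and outputs and buffer internally, so the interpreted result is order-independent; an explicit copy of the transpose is the usual way to sidestep the aliasing entirely:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
x += x.T  # NumPy detects that x and x.T overlap and buffers the rhs
print(x)  # [[2 5]
          #  [5 8]]

y = np.array([[1, 2], [3, 4]])
y += y.T.copy()  # explicit copy: same result, no aliasing to worry about
print(y)
```

So a workaround for the jitted version, until the pass is fixed, would be to write x += x.T.copy() instead.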
from numba import cuda
from time import sleep
import numpy as np
@cuda.jit
def report_progress(progress):
    for i in range(1000000):
        cuda.atomic.inc(progress, 0, 10000000)
        cuda.nanosleep(1000)

progress = cuda.managed_array(1, dtype=np.uint64)
progress[0] = 0
report_progress[1, 1](progress)

val = 0
count = 0
while val < 1000000:
    sleep(0.001)
    val = progress[0]
    print(f"{count}: {val}")
    count += 1
Gives
0: 369
1: 1119
2: 1825
...
1251: 998488
1252: 999286
1253: 1000000
for me on an RTX 8000.
Hello,
I am unable to call one gufunc from another gufunc defined with @guvectorize.
Example:
import numpy as np
import numba
from numba import int64, guvectorize, float64
@guvectorize(["float64[:], float64[:]"], "(i)->()", nopython=True)
def mysum(x, y):
    "Reimplement x.sum(axis=-1)"
    acc = 0
    for xi in x:
        t = np.array([0.])
        mysum2(x, t)
        acc += t[0]
    y[0] = acc

@guvectorize(["float64[:], float64[:]"], "(i)->()", nopython=True)
def mysum2(x, y):
    "Reimplement x.sum(axis=-1)"
    acc = 0
    for xi in x:
        acc += xi
    y[0] = acc

print(mysum(np.array([1., 2., 3.]), np.array([0.])))
This raises: Untyped global name 'mysum2': cannot determine Numba type of <class 'numpy.ufunc'>
Swapping the order does not fix the issue.
Thanks
Decorate mysum2 with @njit instead of @guvectorize and it should work.
number_of_points = 100000
depth = np.linspace(10, 20, number_of_points)
unc = np.linspace(1, 2, number_of_points)
cell_indices = np.random.randint(0, 400, number_of_points)
grid = np.full((20, 20), np.nan)
uncgrid = np.full((20, 20), np.nan)
@numba.jit(nopython=True, nogil=True)
def grid_mean(depth: np.ndarray, uncertainty: np.ndarray, cell_indices: np.ndarray, grid: np.ndarray, uncertainty_grid: np.ndarray):
    unique_indices = np.unique(cell_indices)
    flatgrid = grid.ravel()
    flatunc = uncertainty_grid.ravel()
    for uniq in unique_indices:
        msk = cell_indices == uniq
        flatgrid[uniq] = np.mean(depth[cell_indices[msk]])
        flatunc[uniq] = np.mean(uncertainty[cell_indices[msk]])
    return flatgrid.reshape(grid.shape), flatunc.reshape(grid.shape)
TypingError Traceback (most recent call last)
<ipython-input-79-d59e448f35b9> in <module>
3 points = np.linspace((1,2),(5,6),20)
4 return points
----> 5 clus()
~/.local/lib/python3.6/site-packages/numba/core/dispatcher.py in _compile_for_args(self, args, *kws)
413 e.patch_message(msg)
414
--> 415 error_rewrite(e, 'typing')
416 except errors.UnsupportedError as e:
417 # Something unsupported is present in the user code, add help info
~/.local/lib/python3.6/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
356 raise e
357 else:
--> 358 reraise(type(e), e, None)
359
360 argtypes = []
~/.local/lib/python3.6/site-packages/numba/core/utils.py in reraise(tp, value, tb)
78 value = tp()
79 if value.traceback is not tb:
---> 80 raise value.with_traceback(tb)
81 raise value
82
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function linspace at 0x7fa6305d9840>) found for signature:
linspace(Tuple(Literalint, Literalint), Tuple(Literalint, Literalint), Literalint)
There are 2 candidate implementations:
- Of which 2 did not match due to:
Overload of function 'linspace': File: numba/core/typing/npydecl.py: Line 615.
With argument(s): '(UniTuple(int64 x 2), UniTuple(int64 x 2), int64)':
No match.
During: resolving callee type: Function(<function linspace at 0x7fa6305d9840>)
During: typing of call at <ipython-input-79-d59e448f35b9> (3)
File "<ipython-input-79-d59e448f35b9>", line 3:
def clus():
    points = np.linspace((1,2),(5,6),20)
AttributeError: 'ManagedNDArray' object has no attribute 'alloc_size'
I don't get this error when I use this function on one GPU only, but when I use it on several GPUs with cuda.gpus[i] I do.
from numba import njit

@njit
def div_by_zero(x):
    return x / 0

div_by_zero(10)

@njit
def parent(x):
    return div_by_zero(x)

parent(10)
In an example like this, the error message is
----> 15 parent(10)
ZeroDivisionError: division by zero
Is there a way (e.g. a debug option) to pinpoint where the error actually happens, i.e. return x/0? In a larger code base, it gets harder to find where the error actually occurs.
I got many warnings like the ones below when I run python -m numba.runtests, something like:
..../root/huawei/numba/numba/tests/test_sort.py:934: NumbaPendingDeprecationWarning:
Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'A' of function 'make_quicksort_impl.<locals>.GET'.
For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types
File "numba/misc/quicksort.py", line 55:
@wrap
def GET(A, idx_or_val):
^
new_x.sort(key=key)
and
/root/huawei/numba/numba/core/object_mode_passes.py:161: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit
File "numba/tests/test_builtins.py", line 40:
def any_usecase(x, y):
^
warnings.warn(errors.NumbaDeprecationWarning(msg,
..................................<string>:2: RuntimeWarning: invalid value encountered in arccos
........<string>:2: RuntimeWarning: invalid value encountered in arccos
.........................<string>:2: RuntimeWarning: invalid value encountered in arccosh
........<string>:2: RuntimeWarning: invalid value encountered in arccosh
.........................<string>:2: RuntimeWarning: invalid value encountered in arcsin
........<string>:2: RuntimeWarning: invalid value encountered in arcsin
.............................................................<string>:2: RuntimeWarning: divide by zero encountered in arctan
....<string>:2: RuntimeWarning: divide by zero encountered in arctan
...............................................<string>:2: RuntimeWarning: invalid value encountered in arctanh
........<string>:2: RuntimeWarning: invalid value encountered in arctanh
...................../root/huawei/numba/numba/core/typed_passes.py:331: NumbaPerformanceWarning:
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.
To find out why, try turning on parallel diagnostics, see https://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.
File "numba/tests/test_blackscholes.py", line 27:
def cnd_array(d):
<source elided>
(K * (A1 + K * (A2 + K * (A3 + K * (A4 + K * A5))))))
return np.where(d > 0, 1.0 - ret_val, ret_val)
^
def cnd_array(d):
/root/huawei/numba/numba/core/object_mode_passes.py:151: NumbaWarning: Function "cnd_array" was compiled in object mode without forceobj=True.
Is this normal, or is something wrong with my environment? conda list shows:
packages in environment at /root/miniconda3/envs/numbaenv:
Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 51_gnu
blas 1.0 openblas
ca-certificates 2021.4.13 hd43f75c_1
certifi 2020.12.5 py38hd43f75c_0
cffi 1.14.5 py38hdced402_0
jinja2 2.11.3 pyhd3eb1b0_0
ld_impl_linux-aarch64 2.36.1 h0ab8de2_3
libffi 3.3 h7c1a80f_2
libgcc-ng 10.2.0 h1234567_51
libgfortran-ng 10.2.0 h9534d94_51
libgfortran5 10.2.0 h1234567_51
libgomp 10.2.0 h1234567_51
libllvm10 10.0.1 h0eb0881_6
libopenblas 0.3.13 hf4835c0_1
libstdcxx-ng 10.2.0 h1234567_51
llvmdev 9.0.1 0 numba
llvmlite 0.37.0.dev0 pypi_0 pypi
markupsafe 1.1.1 py38hfd63f10_0
ncurses
Hi, I am trying to use the normal CDF function within a Numba njitted function, e.g. via numba_scipy, but so far it fails. Here is my code:
from numba import njit
from scipy.special import ndtr
from scipy.stats import norm as no
# import numba_scipy

a = no.cdf(0.)  # desired test outcome: 0.5

@njit
def test(x):
    a = ndtr(x)
    b = no.cdf(x)
    return a, b  # naive implementations via the scipy normal cdf functions

print(test(0.))
From what I read about numba_scipy, it just needs to be installed... however I get an error:
Untyped global name 'ndtr': Cannot determine Numba type of <class 'numpy.ufunc'>
How should I code it? Thanks!
As I understood it, telling Numba to inline a function mostly helps type inference and shouldn't affect performance. However, I have a function that IS getting a performance benefit from inlining, which I don't quite understand. I was wondering if anyone here had any ideas:
@nb.jit(nopython=True, inline='always')
def bounded_uint(ub):
    x = np.random.randint(low=2**32)
    m = ub * x
    l = np.bitwise_and(m, 2**32 - 1)
    if l < ub:
        t = (2**32 - ub) % ub
        while l < t:
            x = np.random.randint(low=2**32)
            m = ub * x
            l = np.bitwise_and(m, 2**32 - 1)
    return m >> 32

@nb.jit(nopython=True)
def nb_fast_shuffle(arr):
    i = arr.shape[0] - 1
    while i > 0:
        j = bounded_uint(i + 1)
        arr[i], arr[j] = arr[j], arr[i]
        i -= 1

x = np.arange(1000)

%%timeit
nb_fast_shuffle(x)
4.25 µs ± 47.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# change inline to "never"
%%timeit
nb_fast_shuffle(x)
7.73 µs ± 699 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Thanks!
Hello, I hope you are all well. I'm encountering a small but annoying error which must be related to the context used by CUDA. Let me explain. I run several kernels and I want to read the data from the CPU while a kernel is running. So far the solution proposed by @gmarkall worked perfectly: use a managed array, then read the contents of the variables used in the kernel.
The idea is that I run a kernel in a context using:
with cuda.gpus[gpu_number]:
and I read what's inside while the GPU is running, so I can follow the evolution of my variables.
I do the same operation several times: I restart the kernel once the previous one has finished. But it doesn't work after the first iteration; I can't read the variables. The code waits just after the kernel function is called and prevents me from reading what's inside. Only after the kernel has finished its calculations do I have access to the variables again.
The problem is a bit unclear on my side, and I have the impression it is related to CUDA contexts. Does anybody have an idea? Thanks.
Given index_list_type = typeof(typed.List.empty_list(types.int64)), is there any way to create an instance of that type inside a jitted fn? i.e. how can I simplify this code?
@overload(_groupby_indices_impl)
def _groupby_indices_specialize(group):
    index_list_type = typeof(typed.List.empty_list(types.int64))
    group_type = typeof(group.dtype)
    def groupby(group):
        keys = typed.Dict.empty(group_type, types.uint64)
        groups = typed.List.empty_list(index_list_type)
        for i, g in enumerate(group):
            if g not in keys:
                key = keys[g] = len(keys)
                groups.append(typed.List.empty_list(types.int64))
            key = keys[g]
            groups[key].append(i)
        return groups
    return groupby
Is it possible to dispatch on the concrete namedtuple class in a generated_jit function? I was hoping something like the following would work:
from collections import namedtuple
import numpy as np
from numba import generated_jit

conf1 = namedtuple("conf1", "opt1 opt2 opt3")
conf2 = namedtuple("conf2", "opt1 opt2")

@generated_jit(nogil=True, nopython=True)
def g(config):
    if isinstance(config, conf1):
        def impl(config):
            print("Using conf1 impl.")
            return
    else:
        def impl(config):
            print("Using conf2 impl.")
            return
    return impl

c1 = conf1(np.ones(10), "321", 1)
c2 = conf2("123", 4.5)
g(c1)
g(c2)