Graham Markall
I think it's because your spec specifies arrays that may not be contiguous, because the last dimension has no stride specified
if you change float64[:,:] to float64[:,::1] does that get rid of the error?
oops, just saw the in-thread reply :-)
2 replies
Jakub Nabaglo
I’m trying to learn more about defining bespoke low-level types for use in Numba. What’s the difference between reflection vs boxing/unboxing?
3 replies
@gmarkall It worked, thank you for the help kind sir.
At more experienced numba developers: when setting _debug_print=True in nrtdynmod.py, the values printed by NRT_Incref make no sense (e.g. I am getting *** NRT_Incref 393217 [...]), while NRT_Decref prints more reasonable values. (e.g. ** NRT_Decref 2 [...]). Is this normal?
20 replies
(Ignoring the fact that the printf strings use %zu for a value I believe to be of type signed int. Changing the format to %hhi does little to solve the problem.)
3 replies
Is it possible to access values used in the kernel while the GPU is running a task? I have seen that you can use cuda.stream, but it slows down the GPU a lot.
10 replies
Guoqiang QI

Hi, all. I am a newbie to numba and have been looking for a chance to contribute. I was trying to fix issue #6949 this past weekend, which is my first try, but I am stuck now and asking for some help.
The result of an array in-place binop (such as +=) is incorrect when an alias of the lhs exists in the function expression. For example (provided in #6949):

import numpy as np
from numba import njit

@njit
def self_addition(x):
    x += x.T
    return x

x = np.array([[1, 2], [3, 4]])
self_addition(x)

> array([[2, 5], [8, 8]])

In this case, x will be temporarily changed to [[2, 5], [3, 4]] since the first row is calculated first when the matrix addition is performed. Therefore, x.T will be [[2, 3], [5, 4]] (the correct value is [[1, 3], [2, 4]]) when the second row is calculated. I am still not sure which pass caused this error; does anyone know anything about this?
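For comparison, plain NumPy gives the correct answer for the same in-place addition, because NumPy detects the overlap between x and the view x.T and buffers the rhs before writing:

```python
import numpy as np

# NumPy copies the overlapping operand first, so the result is the correct
# [[2, 5], [5, 8]], not the [[2, 5], [8, 8]] described above.
x = np.array([[1, 2], [3, 4]])
x += x.T
print(x)
```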

Would it be possible to replicate Python Deque object type in Numba, if it took ONLY numpy 2D array in input and output? I need to use this in Numba's Jitclass.
7 replies
Graham Markall
@jdenozi try this for an example (requires CC 6.0, not exhaustively tested, YMMV, etc etc):
from numba import cuda
import numpy as np

@cuda.jit
def report_progress(progress):
    for i in range(1000000):
        cuda.atomic.inc(progress, 0, 10000000)

progress = cuda.managed_array(1, dtype=np.uint64)
progress[0] = 0

report_progress[1, 1](progress)

val = 0
count = 0

while val < 1000000:
    val = progress[0]
    print(f"{count}: {val}")
    count += 1
19 replies


0: 369
1: 1119
2: 1825
1251: 998488
1252: 999286
1253: 1000000

for me on RTX 8000

CARON Renaud

I am unable to call a GUfunc from another GUfunc defined with @guvectorize.
Example:

import numpy as np
import numba
from numba import int64, guvectorize, float64

@guvectorize(["float64[:], float64[:]"], "(i)->()", nopython=True)
def mysum(x, y):
    "Reimplement x.sum(axis=-1) via mysum2"
    t = np.array([0.])
    mysum2(x, t)
    y[0] = t[0]

@guvectorize(["float64[:], float64[:]"], "(i)->()", nopython=True)
def mysum2(x, y):
    "Reimplement x.sum(axis=-1)"
    acc = 0
    for xi in x:
        acc += xi
    y[0] = acc

print(mysum(np.array([1.,2.,3.]), np.array([0.])))

This raises:
Untyped global name 'mysum2': cannot determine Numba type of <class 'numpy.ufunc'>

Swapping the order does not fix the issue.


Siu Kwan Lam
@BlackBip, it's a known limitation that guvectorize cannot be used in other jit-compiled code. You can rewrite mysum2 as a @numba.jit function
you can probably just replace the decorator on mysum2 and it should work
CARON Renaud
@sklam But if I do that I can't use cuda for faster calculations right?
16 replies
Guilherme Leobas
Does llvmlite 0.36 support LLVM 11 or only LLVM 10?
6 replies
Guoqiang QI
Does anyone know how to get the corresponding LLVM type for a given Numba type, e.g. getting IntType(32) from int32?
10 replies
Eric Younkin
Hi, wondering if I could get a quick look at this code sample just to see if there are some obvious optimizations I could make. The code takes the mean of each cell in a grid. It's about 4x faster without the numba decorator, so I feel like there is something I am missing. I did look at the inspect_types method, and I think it is inferring types correctly.
import numba
import numpy as np

number_of_points = 100000
depth = np.linspace(10, 20, number_of_points)
unc = np.linspace(1, 2, number_of_points)
cell_indices = np.random.randint(0, 400, number_of_points)
grid = np.full((20, 20), np.nan)
uncgrid = np.full((20, 20), np.nan)

@numba.jit(nopython=True, nogil=True)
def grid_mean(depth: np.array, uncertainty: np.array, cell_indices: np.array, grid: np.ndarray, uncertainty_grid: np.ndarray):
    unique_indices = np.unique(cell_indices)
    flatgrid = grid.ravel()
    flatunc = uncertainty_grid.ravel()
    for uniq in unique_indices:
        msk = cell_indices == uniq
        flatgrid[uniq] = np.mean(depth[cell_indices[msk]])
        flatunc[uniq] = np.mean(uncertainty[cell_indices[msk]])
    return flatgrid.reshape(grid.shape), flatunc.reshape(grid.shape)
9 replies
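For reference, a NumPy-only baseline for the same per-cell mean (a sketch of my own, assuming the intent is the mean of the points falling in each cell): np.bincount does it in one pass instead of building a mask per unique index.

```python
import numpy as np

number_of_points = 100000
depth = np.linspace(10, 20, number_of_points)
cell_indices = np.random.randint(0, 400, number_of_points)

# One pass: per-cell counts and per-cell sums, then the ratio.
counts = np.bincount(cell_indices, minlength=400)
sums = np.bincount(cell_indices, weights=depth, minlength=400)
means = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
grid = means.reshape(20, 20)
```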
Hi, I was able to use linspace in numba, but when I try to pass tuples to linspace I get an error
2 replies
from numba import njit
import numpy as np

@njit
def clus():
    points = np.linspace((1, 2), (5, 6), 20)
    return points

TypingError Traceback (most recent call last)

<ipython-input-79-d59e448f35b9> in <module>
3 points = np.linspace((1,2),(5,6),20)
4 return points
----> 5 clus()

~/.local/lib/python3.6/site-packages/numba/core/dispatcher.py in _compile_for_args(self, args, *kws)
413 e.patch_message(msg)
--> 415 error_rewrite(e, 'typing')
416 except errors.UnsupportedError as e:
417 # Something unsupported is present in the user code, add help info

~/.local/lib/python3.6/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
356 raise e
357 else:
--> 358 reraise(type(e), e, None)
360 argtypes = []

~/.local/lib/python3.6/site-packages/numba/core/utils.py in reraise(tp, value, tb)
78 value = tp()
79 if value.traceback is not tb:
---> 80 raise value.with_traceback(tb)
81 raise value

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function linspace at 0x7fa6305d9840>) found for signature:

linspace(Tuple(Literal[int](1), Literal[int](2)), Tuple(Literal[int](5), Literal[int](6)), Literal[int](20))

There are 2 candidate implementations:

  - Of which 2 did not match due to:
  Overload of function 'linspace': File: numba/core/typing/npydecl.py: Line 615.
    With argument(s): '(UniTuple(int64 x 2), UniTuple(int64 x 2), int64)':
   No match.

During: resolving callee type: Function(<function linspace at 0x7fa6305d9840>)
During: typing of call at <ipython-input-79-d59e448f35b9> (3)

File "<ipython-input-79-d59e448f35b9>", line 3:
def clus():
points = np.linspace((1,2),(5,6),20)

Hello, I have a question: when I use @jit(cache=True), it raises the warning: Cannot cache compiled function "deal1" as it uses lifted code
Is there a way I can see information about the code that was lifted in debug mode? I want to use this function with AOT
1 reply
CARON Renaud
Hello, how could one do S -= P efficiently inside a CUDA device function, with S and P being (n,2) and (1,2) arrays, please?
45 replies
Graham Markall
@BlackBip I just saw you also posted on discourse about the previous question - please do ping here if you create posts on discourse, then I can jump on there and have a look
Hello all!
1 reply
Gabriele Maurina
Hi, is it possible to compile and save vectorize and guvectorize functions? I am working on an application where time is key, and 6 seconds to load numba and compile every time is really long. I wish I could compile them only once and then distribute the binaries.
6 replies
Samuel Holt
Hi All, can we do another release of the library? I love Numba and actively use it for mathematical processing at scale. I would really like to use the new f-string feature from PR numba/numba#6608, which is merged but not yet released. How can I help run a release of the latest master on GitHub? Also, thank you all for such an amazing project and library.
6 replies
Filippo Vicentini
Hi all (devs), what is the relationship between experimental.jitclass, experimental.structref and numba-extras/generic_jitclass? Is there consensus on what will become the final implementation, and on whether some higher-level interface will be built on top of structref?
4 replies
Hello, does anyone have any idea where this error may come from? I am trying to use cuda.managed_array but I get this error: AttributeError: 'ManagedNDArray' object has no attribute 'alloc_size'. I don't get this error when I use this function on one GPU only, but when I use it on several GPUs with cuda.gpus[i] I do.
4 replies
@gmarkall is there a method similar to register_attr and lower_getattr that adds indexing capability to a custom CUDA struct?
2 replies
from numba import njit

@njit
def div_by_zero(x):
    return x / 0


@njit
def parent(x):
    return div_by_zero(x)


In an example like this, the error message is

----> 15 parent(10)
ZeroDivisionError: division by zero

Is there a way (e.g. a debug option) to pinpoint where the error actually happens (the return x/0)? In a larger code base, it gets harder to find where the error actually occurs.

8 replies
Guoqiang QI

I have got many warnings like the ones below when I run python -m numba.runtests:

..../root/huawei/numba/numba/tests/test_sort.py:934: NumbaPendingDeprecationWarning: 
Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'A' of function 'make_quicksort_impl.<locals>.GET'.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "numba/misc/quicksort.py", line 55:
        def GET(A, idx_or_val):



/root/huawei/numba/numba/core/object_mode_passes.py:161: NumbaDeprecationWarning: 
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "numba/tests/test_builtins.py", line 40:

def any_usecase(x, y):

..................................<string>:2: RuntimeWarning: invalid value encountered in arccos
........<string>:2: RuntimeWarning: invalid value encountered in arccos
.........................<string>:2: RuntimeWarning: invalid value encountered in arccosh
........<string>:2: RuntimeWarning: invalid value encountered in arccosh
.........................<string>:2: RuntimeWarning: invalid value encountered in arcsin
........<string>:2: RuntimeWarning: invalid value encountered in arcsin
.............................................................<string>:2: RuntimeWarning: divide by zero encountered in arctan
....<string>:2: RuntimeWarning: divide by zero encountered in arctan
...............................................<string>:2: RuntimeWarning: invalid value encountered in arctanh
........<string>:2: RuntimeWarning: invalid value encountered in arctanh
...................../root/huawei/numba/numba/core/typed_passes.py:331: NumbaPerformanceWarning: 
The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.

To find out why, try turning on parallel diagnostics, see https://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.
File "numba/tests/test_blackscholes.py", line 27:
def cnd_array(d):
    <source elided>
               (K * (A1 + K * (A2 + K * (A3 + K * (A4 + K * A5))))))
    return np.where(d > 0, 1.0 - ret_val, ret_val)

  def cnd_array(d):
/root/huawei/numba/numba/core/object_mode_passes.py:151: NumbaWarning: Function "cnd_array" was compiled in object mode without forceobj=True.

Is this normal, or is something wrong with my environment? conda list shows:

# packages in environment at /root/miniconda3/envs/numbaenv:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
_openmp_mutex             5.1                      51_gnu
blas                      1.0                    openblas
ca-certificates           2021.4.13            hd43f75c_1
certifi                   2020.12.5        py38hd43f75c_0
cffi                      1.14.5           py38hdced402_0
jinja2                    2.11.3             pyhd3eb1b0_0
ld_impl_linux-aarch64     2.36.1               h0ab8de2_3
libffi                    3.3                  h7c1a80f_2
libgcc-ng                 10.2.0              h1234567_51
libgfortran-ng            10.2.0              h9534d94_51
libgfortran5              10.2.0              h1234567_51
libgomp                   10.2.0              h1234567_51
libllvm10                 10.0.1               h0eb0881_6
libopenblas               0.3.13               hf4835c0_1
libstdcxx-ng              10.2.0              h1234567_51
llvmdev                   9.0.1                         0    numba
llvmlite                  0.37.0.dev0              pypi_0    pypi
markupsafe                1.1.1            py38hfd63f10_0
@guoqiangqi many many warnings during Numba testing are normal
some of them test warnings, some of them are expected and so on...
if none of the tests actually fail you are probably good to go
Guoqiang QI
@esc got it, thanks for your reply.

Hi, I am trying to use the normal CDF function within a numba njitted function, e.g. using numba_scipy, but it fails so far. Here is my code:

from numba import njit
from scipy.special import ndtr
from scipy.stats import norm as no
# import numba_scipy

a = no.cdf(0.)                      # desired test outcome: 0.5

@njit
def test(x):
    a   = ndtr(x)
    b   = no.cdf(x)
    return a,b                   # naive implementations via scipy normal cdf functions


From what I read about numba_scipy, it would just need to be installed... however I get an error

Untyped global name 'ndtr': Cannot determine Numba type of <class 'numpy.ufunc'>

How should I code it? Thanks!

10 replies
Rishi Kulkarni

As I understood it, telling Numba to inline a function is mostly for type inference and doesn't affect performance. However, I have a function that IS getting performance benefits from inlining that I'm not quite understanding. I was wondering if anyone here had any ideas:

import numba as nb
import numpy as np

@nb.jit(nopython=True, inline='always')
def bounded_uint(ub):
    x = np.random.randint(low=2**32)
    m = ub * x
    l = np.bitwise_and(m, 2**32 - 1)
    if l < ub:
        t = (2**32 - ub) % ub
        while l < t:
            x = np.random.randint(low=2**32)
            m = ub * x
            l = np.bitwise_and(m, 2**32 - 1)            
    return m >> 32

@nb.jit(nopython=True)
def nb_fast_shuffle(arr):
    i = arr.shape[0] - 1
    while i > 0:
        j = bounded_uint(i + 1)
        arr[i], arr[j] = arr[j], arr[i]
        i -= 1

x = np.arange(1000)

4.25 µs ± 47.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

#change inline to "never"

7.73 µs ± 699 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


1 reply
Hi. I'm wrapping scipy.sparse.csr_matrix in a minimal way. If I make a tuple of arrays, or a jitclass with three fields, will the arrays be copied or passed by reference when I pass the struct/object around?
2 replies

Hello, I hope you are all well. I'm encountering a small and annoying error which must be related to the context used by CUDA. Let me explain. I run several kernels and I want to read the data from the CPU while a kernel is running. So far, no problem: the solution proposed by @gmarkall (use a managed array and then read the contents of the variables used in the kernel) worked perfectly well.

The idea is that I run a kernel in a context using:
with cuda.gpus[gpu_number]:

I read what's inside while the gpu is running. So I can have the evolution of my variables.

And I do the same operation several times: I restart the kernel several times, once the previous one has finished.
Only it doesn't work after the first iteration. I can't read the variables. The code waits just after the kernel function is called and prevents me from reading what's inside. It is only after the kernel has finished its calculations that I have access to the variables again.

The problem is a bit unclear on my side, and I have the impression that it is related to the contexts of cuda. Anybody have an idea? Thanks to you.

1 reply
Robin Carter
Hi, I have a function deep within my code decorated with parallel=True. Works great. For some starting configurations of my program I want to use the multiprocessing pool; sometimes it runs almost all of the code in serial. This is intended. However, in that case the numba function will be called from what is already 1 of 8 running processes. What will numba do in that case? Will it somehow know that it should only attempt to use 1 core, or will it potentially spawn 64 workers and incur a performance penalty? In the latter case, do I need two versions of my function, one with parallel=True and one without, and pass the configuration data down to where it is called so I can call the right version depending on the configuration (multiprocessing pool or serial)?
4 replies
Angus Hollands
Hi all,
If I have a type alias e.g. index_list_type = typeof(typed.List.empty_list(types.int64)), is there any way to create an instance of that type inside a jitted fn? i.e. how can I simplify this code?
from numba import njit, typed, types, typeof

def _groupby_indices_specialize(group):
    index_list_type = typeof(typed.List.empty_list(types.int64))
    group_type = typeof(group.dtype)

    @njit
    def groupby(group):
        keys = typed.Dict.empty(group_type, types.uint64)
        groups = typed.List.empty_list(index_list_type)

        for i, g in enumerate(group):
            if g not in keys:
                keys[g] = len(keys)
                groups.append(typed.List.empty_list(types.int64))

            key = keys[g]
            groups[key].append(i)
        return groups

    return groupby
4 replies
I'd also like to avoid testing for the key each time, i.e. performance-wise I expect the condition to be false most of the time.
Hi! What is the correct way of checking the type of a namedtuple inside a generated_jit function? I was hoping something like the following would work:
from collections import namedtuple

import numpy as np
from numba import generated_jit

conf1 = namedtuple("conf1", "opt1 opt2 opt3")
conf2 = namedtuple("conf2", "opt1 opt2")

@generated_jit(nogil=True, nopython=True)
def g(config):

    if isinstance(config, conf1):
        def impl(config):
            print("Using conf1 impl.")
    elif isinstance(config, conf2):
        def impl(config):
            print("Using conf2 impl.")

    return impl

c1 = conf1(np.ones(10), "321", 1)
c2 = conf2("123", 4.5)

10 replies
Could it be that @overload does not work at all for the CUDA target? I am getting a mysterious error message: No definition for lowering <built-in method sub_impl of _dynfunc._Closure object at 0x000001D6B08FCBE0>(<Signature>)
14 replies
Is there a reliable way to create a Record on the GPU? cuda.local.array((1,), dtype)[0] doesn't work for me.
5 replies
The types provided to a jitted function for AOT compilation != the types provided when actually calling the jitted function. Is there a place in the code where this mapping is defined, so that one can for instance check whether the user passed in the wrong arguments, or is this mapping done explicitly?
2 replies
Pablo Rodríguez Robles
I am having trouble extending the following answer (point 1) to several arguments: https://stackoverflow.com/questions/49683653/how-to-pass-additional-parameters-to-numba-cfunc-passed-as-lowlevelcallable-to-s. The problem is about using a numba integrand with scipy.integrate.quad and passing additional arguments to the integrand using LowLevelCallable.