stuartarchibald
@stuartarchibald
np
brandonwillard
@brandonwillard:matrix.org
[m]
@luk-f-a: those of us in the Aesara group will start looking into the new NumPy RNG API soon, so we should be able to help with a Numba implementation/interface
3 replies
on an unrelated note, can Numba's nditer currently be used to implement a version of broadcast_to?
Rishi Kulkarni
@rishi-kulkarni
Is there a way to turn a heterogeneous tuple to a homogeneous tuple if they can be cast to the same dtype? For example, (float64, int64, int64) -> (float64, float64, float64)
2 replies
Graham Markall
@gmarkall
Anyone interested in automatic differentiation? I wonder if Enzyme could be integrated with Numba to do AD of jitted functions: https://enzyme.mit.edu/getting_started/UsingEnzyme/
Dave Hirschfeld
@dhirschfeld
Doesn't aesara do AD using numba to compile the graph?
4 replies
brandonwillard
@brandonwillard:matrix.org
[m]
it doesn't use AD on the JITed or LLVM IR results, though
it performs AD on its own high-level graph, then uses Numba to JIT the resulting graph
Pau Ramos
@brugalada_gitlab
Hi everyone! I'm extremely new to Numba, but we are evaluating the viability of a science project and we want to use the ray-tracing cores of the new Turing GPUs for something other than actual ray tracing. I wanted to find out whether it is possible to access and control the RT cores with Numba. Please let me know if this is not the place to ask. Thanks in advance!
44 replies
David Wynter
@davidwynter_gitlab

I have a fn that will not compile, but I have checked the dtypes where it fails and they are correct when not using @njit. Here is the fn.

@njit(float64[:](int64, float64[:], float64[:], float64[:], float64[:], int64, float64[:], float64[:]))
def _density_donoho(j, c_coef, d_coef, x_array, y_array, len_signal, t, phi_t):
    """
    Compute the signal's density (Linear estimator)
    :param j:           scaling parameter [int]
    :param c_coef:      scaling coefficients C [1D array]
    :param x_array:     density index [1D array]
    :return:            density values [1D array]
    """
    k_lim_ = int(2. ** -j * len_signal)
    z_array = np.empty(shape=(len(x_array), len(y_array)))
    cpt = 0.
    perc = 0.
    for i in prange(0, len(x_array)):
        for ii in prange(0, len(y_array)):
            sum_ = 0.
            for k in prange(-k_lim_, k_lim_):
                for k2 in prange(-k_lim_, k_lim_):
                    p1 = _derived_father_wavelet(x_array[i], j, k, t, phi_t)
                    p2 = _derived_father_wavelet(y_array[ii], j, k2, t, phi_t)
                    print(k, k2, k_lim_)
                    cc = c_coef[k + k_lim_][k2 + k_lim_]     # line for error below
                    sum_ = sum_ + cc * p1 * p2

The error relates to the second-to-last line above. It shows, I think, that the result of the first array index is represented as a float64, but when not in @njit mode it shows as an int64

>>> getitem(float64, int64)

There are 22 candidate implementations:
   - Of which 22 did not match due to:
   Overload of function 'getitem': File: <numerous>: Line N/A.
     With argument(s): '(float64, int64)':
    No match.

Can numba change the type of the argument?

6 replies
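A hedged guess at the cause, from the signature above: c_coef is declared as float64[:] (1D), so c_coef[k + k_lim_] is already a float64 scalar, and the second [k2 + k_lim_] then tries to index that scalar, which is exactly the getitem(float64, int64) the error reports. A minimal sketch of the corresponding fix, assuming c_coef really holds 2D data (the helper name here is illustrative, not the poster's code):

from numba import njit, float64, int64

@njit(float64(float64[:, :], int64, int64))
def _read_coef(c_coef, i, j):
    # c_coef declared as 2D in the signature, indexed with a single 2D index
    return c_coef[i, j]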
Rishub
@rishubn
Hi all, is it possible to use globals in a numba @njit function?
5 replies
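Not a definitive answer, but a minimal sketch of the usual behaviour: globals can be read inside an @njit function, yet they are treated as compile-time constants, so reassigning the global afterwards is not seen by the already-compiled code.

from numba import njit

FACTOR = 3   # module-level global

@njit
def scale(x):
    return x * FACTOR   # FACTOR is read at compile time and baked in

print(scale(2))   # 6
FACTOR = 10
print(scale(2))   # still 6: the compiled function kept the old value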
Rishub
@rishubn
I have an intense for-loop which I am trying to parallelize, but I am getting the wrong results:
@njit(parallel=True)
def gen_ddt():
    ddt = [[0 for _ in range(65536)] for _ in range(65536)]
    sbox = [s16(i) for i in range(0,65536)]
    print('done sbox')
    for di in prange(1000):
        for do in prange(1000):
            ddt[di ^ do][sbox[di] ^ sbox[do]] += 1

    return ddt

def gen_ddt_single():
    ddt = [[0 for _ in range(65536)] for _ in range(65536)]
    sbox = [s16(i) for i in range(0,65536)]
    print('done sbox')
    for di in range(1000):
        for do in range(1000):
            ddt[di ^ do][sbox[di] ^ sbox[do]] += 1

    return ddt
Testing both functions: with @njit, the value of the returned array at [0][0] is 997; for the non-Numba version, [0][0] is 1000 (correct).
the s16() function is decorated with @njit
What's going on here? I'm new to Numba, so I'm not sure if I converted the function correctly.
stuartarchibald
@stuartarchibald
What stops multiple threads writing to the same index of ddt at the same time?
Rishub
@rishubn
Nothing I suppose. Though it shouldn't matter at the end right?
the final count is important, not the order
stuartarchibald
@stuartarchibald
The updates to the array are not atomic?
Rishub
@rishubn
aha
ya i see now.
Graham Markall
@gmarkall
we need some sort of Stuart / Graham lock here :-)
Rishub
@rishubn
Is there a way to make the increment of an index atomic?
Graham Markall
@gmarkall
There's a feature request for CPU atomics: numba/numba#2988
(I haven't looked at it lately to see what the state of it / thoughts on it are)
Rishub
@rishubn
:/
Graham Markall
@gmarkall
Maybe this could fit your use case (linked from the issue): https://github.com/KatanaGraph/katana/blob/master/python/katana/numba_support/numpy_atomic.py
Rishub
@rishubn
interesting, can I just lift these functions to my file as a short test?
Graham Markall
@gmarkall
or just save the whole file and import it
11 replies
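For reference, a minimal sketch of one atomics-free workaround: give each iteration a private array and let Numba treat the whole-array += as a parallel reduction. This uses a small histogram rather than the 65536x65536 table above, and all names are illustrative only.

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def hist_racy(data, nbins):
    counts = np.zeros(nbins, dtype=np.int64)
    for i in prange(data.shape[0]):
        counts[data[i] % nbins] += 1      # concurrent non-atomic writes can be lost
    return counts

@njit(parallel=True)
def hist_reduced(data, nbins):
    counts = np.zeros(nbins, dtype=np.int64)
    for i in prange(data.shape[0]):
        local = np.zeros(nbins, dtype=np.int64)
        local[data[i] % nbins] += 1
        counts += local                   # whole-array += is handled as a reduction
    return counts

data = np.random.randint(0, 1 << 16, size=1_000_000)
assert hist_reduced(data, 16).sum() == 1_000_000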
andrewgwalker
@andrewgwalker
I'm having an issue with a memory leak that I suspect is caused by this issue
Is there likely to be any movement for fixing it or is there some sort of workaround?
9 replies
David Wynter
@davidwynter_gitlab
I have a numpy array of arrays, z_array = np.empty(shape=(len(x_array), len(y_array))). After some processing to fill these arrays up, the original code used z_array[z_array < 0] = 0 to set all negative values in all the arrays to 0. This does not work with Numba; it gives the error TypeError: unsupported array index type array(bool, 2d, C) in [array(bool, 2d, C)]. This is understandable, but is there a better way than looping through the arrays and applying the same expression to each sub-array?
10 replies
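One possibility, a minimal sketch rather than the definitive answer: express the clamp elementwise, which nopython mode does support, instead of 2D boolean-mask assignment.

import numpy as np
from numba import njit

@njit
def clamp_non_negative(z_array):
    # equivalent to z_array[z_array < 0] = 0, but supported in nopython mode
    return np.maximum(z_array, 0.0)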
joey00072
@joey00072
import numpy as np
import numba
from numba import jit,njit

from numba import int32, float32 ,float64   # import the types
from numba.experimental import jitclass
spec = [
    ('w', float64[:,:]),               # 2D weight matrix
    ('b', float64[:,:]),               # 2D bias (row vector)
]

@jitclass(spec)
class Linear:
    def __init__(self,input_shape,output_shape):
        self.w = np.ascontiguousarray(
                                        np.random.rand(input_shape,output_shape)
                                    )
        self.b = np.ascontiguousarray(
                                        np.random.rand(1,output_shape)
                                        )

    def forward(self, x):
        return x @ self.w + self.b

l = Linear(4,5)

x = np.random.rand(10,4)

l.forward(x)

<string>:3: NumbaPerformanceWarning: '@' is faster on contiguous arrays, called on (array(float64, 2d, C), array(float64, 2d, A))

<string>:3: NumbaPerformanceWarning: '@' is faster on contiguous arrays, called on (array(float64, 2d, C), array(float64, 2d, A))

Can anyone explain to me how to fix this? I am using np.ascontiguousarray and still getting this warning.
1 reply
Graham Markall
@gmarkall
I think it's because your spec specifies arrays that may not be contiguous, because the last dimension has no step
if you change float64[:,:] to float64[:,::1] does that get rid of the warning?
oops, just saw the in-thread reply :-)
2 replies
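For completeness, a minimal sketch of the spec change Graham suggests, so '@' sees two C-contiguous operands:

spec = [
    ('w', float64[:, ::1]),   # C-contiguous 2D array
    ('b', float64[:, ::1]),   # C-contiguous 2D array
]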
Jakub Nabaglo
@nbgl
I’m trying to learn more about defining bespoke low-level types for use in Numba. What’s the difference between reflection and boxing/unboxing?
3 replies
joey00072
@joey00072
@gmarkall It worked, thank you for the help, kind sir.
MegaIng
@MegaIng
To the more experienced Numba developers: when setting _debug_print=True in nrtdynmod.py, the values printed by NRT_Incref make no sense (e.g. I am getting *** NRT_Incref 393217 [...]), while NRT_Decref prints more reasonable values (e.g. ** NRT_Decref 2 [...]). Is this normal?
20 replies
(Ignoring the fact that the printf strings use %zu for a value I believe to be of type signed int; changing the format to %hhi does little to solve the problem.)
3 replies
Neofelis
@jdenozi
Is it possible to access values used in the kernel while the GPU is running a task? I have seen that you can use cuda.stream, but it slows down the GPU a lot.
10 replies
Guoqiang QI
@guoqiangqi

Hi, all. I am a newbie to Numba and have been looking for a chance to contribute. I was trying to fix issue #6949 over the past weekend, which is my first try, but I am stuck now and am asking for some help.
The result of an array in-place binop (such as +=) is incorrect when aliases of the lhs exist in the expression. For example (provided in #6949):

@jit 
def self_addition(x):
    x += x.T
    return x

x = np.array([[1,2],[3,4]])

self_addition(x)
> array([[2, 5],[8, 8]])

In this case, x will temporarily be changed to [[2,5],[3,4]], since the first row is calculated first when the matrix addition is performed. Therefore, x.T will be [[2,3],[5,4]] (the correct value is [[1,3],[2,4]]) when the second row is calculated. I am still not sure which pass caused this error; does anyone know anything about this?
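As a side note, a minimal sketch of a workaround (not a fix for the Numba pass itself): do the addition out of place, so the transpose is fully read before anything is overwritten.

import numpy as np
from numba import njit

@njit
def self_addition_safe(x):
    return x + x.T   # allocates a temporary, so the aliasing problem never arises

x = np.array([[1, 2], [3, 4]])
print(self_addition_safe(x))   # prints [[2 5] [5 8]]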

accygupta
@accygupta
Would it be possible to replicate Python's deque object type in Numba, if it took ONLY numpy 2D arrays as input and output? I need to use this in a Numba jitclass.
6 replies
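Not an authoritative answer, but a rough sketch of one way this is often approximated: a fixed-capacity ring buffer over a preallocated 3D array, wrapped in a jitclass. All names and the fixed-shape assumption are illustrative only.

import numpy as np
from numba import float64, int64
from numba.experimental import jitclass

spec = [
    ('buf', float64[:, :, ::1]),   # capacity x rows x cols storage
    ('head', int64),               # index of the oldest element
    ('size', int64),               # number of elements currently held
]

@jitclass(spec)
class ArrayDeque:
    def __init__(self, capacity, rows, cols):
        self.buf = np.zeros((capacity, rows, cols))
        self.head = 0
        self.size = 0

    def append(self, arr2d):
        idx = (self.head + self.size) % self.buf.shape[0]
        self.buf[idx, :, :] = arr2d
        if self.size < self.buf.shape[0]:
            self.size += 1
        else:
            # buffer full: overwrite the oldest entry, like a bounded deque
            self.head = (self.head + 1) % self.buf.shape[0]

    def popleft(self):
        out = self.buf[self.head].copy()
        self.head = (self.head + 1) % self.buf.shape[0]
        self.size -= 1
        return out

d = ArrayDeque(8, 2, 3)
d.append(np.ones((2, 3)))
print(d.popleft())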
Graham Markall
@gmarkall
@jdenozi try this for an example (requires CC 6.0, not exhaustively tested, YMMV, etc etc):
from numba import cuda
from time import sleep
import numpy as np


@cuda.jit
def report_progress(progress):
    # the kernel bumps a counter in managed memory that the host polls concurrently
    for i in range(1000000):
        cuda.atomic.inc(progress, 0, 10000000)
        cuda.nanosleep(1000)


progress = cuda.managed_array(1, dtype=np.uint64)
progress[0] = 0

report_progress[1, 1](progress)

val = 0
count = 0

while val < 1000000:
    sleep(0.001)
    val = progress[0]
    print(f"{count}: {val}")
    count += 1

Gives

0: 369
1: 1119
2: 1825
...
1251: 998488
1252: 999286
1253: 1000000

for me on RTX 8000
19 replies