import numpy as np
import numba as nb
from numba import cuda
import math

hundred_twenty_eight_floats = np.zeros(128)
hundred_twenty_eight_floats[:] = list(range(128))

@cuda.jit(nb.void(nb.float64[::1], nb.bool_))
def cycle(vals, update_early):
    offset = 64 if update_early else 0
    for i in range(5000):
        vals[cuda.grid(1) + offset] = math.sin(vals[cuda.grid(1) + offset])

stream = cuda.stream()
stream2 = cuda.stream()
for i in range(25):
    cycle[2, 32, stream](hundred_twenty_eight_floats, False)
    cycle[2, 32, stream2](hundred_twenty_eight_floats, True)
import numpy as np
import numba as nb
from numba import cuda
import math

hundred_twenty_eight_floats_h = np.zeros(128)
hundred_twenty_eight_floats_h[:] = list(range(128))
hundred_twenty_eight_floats = cuda.to_device(hundred_twenty_eight_floats_h)

@cuda.jit(nb.void(nb.float64[::1], nb.bool_))
def cycle(vals, update_early):
    offset = 64 if update_early else 0
    for i in range(5000):
        vals[cuda.grid(1) + offset] = math.sin(vals[cuda.grid(1) + offset])

stream = cuda.stream()
stream2 = cuda.stream()
for i in range(25):
    cycle[2, 32, stream](hundred_twenty_eight_floats, False)
    cycle[2, 32, stream2](hundred_twenty_eight_floats, True)

hundred_twenty_eight_floats.to_host()
print(hundred_twenty_eight_floats_h.sum())
Hi, I am trying to install numba in a virtualenv (pip install numba) on a JupyterHub (remote instance), but I get the following error:
error: Command "gcc -pthread -B /sw/spack-rhel6/jupyterhub/jupyterhub/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Inumba -I/mnt/lustre01/pf/b/b381465/kernels/data/include -I/sw/spack-rhel6/jupyterhub/jupyterhub/include/python3.8 -c numba/_devicearray.cpp -o build/temp.linux-x86_64-3.8/numba/_devicearray.o -std=c++11" failed with exit status 1
llvmlite is already installed and working, but I can never get numba to install.
njit function (A) with prange inside from a prange loop in another @njit(parallel=True) function (B). Will the prange in the first function (A) be sequential when called from within the second function (B)?
LoweringError: Failed in nopython mode pipeline (step: nopython mode backend)
Operands must be the same type, got (i64, i32)
Here's a small one - cuda device array behaves differently than numpy. Numpy will happily reshape using a -1 to fill in excess, numba-cuda will not.
import numpy as np
import numba as nb
from numba import cuda

sixty_four = np.zeros(64)
blocks_of_thirty_two = sixty_four.reshape(-1, 32)

nb_sixty_four = cuda.to_device(sixty_four)
nb_sixty_four.reshape(-1, 32)
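One possible workaround, as a sketch: resolve the -1 on the host and pass an explicit shape to the device array. The helper resolve_shape below is a hypothetical name, not part of numba. The device call is left as a comment because it needs a CUDA GPU.

```python
import numpy as np


def resolve_shape(size, shape):
    """Replace a single -1 in `shape` with the length implied by `size`."""
    shape = tuple(shape)
    if shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    if -1 not in shape:
        return shape
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if known == 0 or size % known != 0:
        raise ValueError("cannot reshape array of size %d into %r" % (size, shape))
    return tuple(size // known if dim == -1 else dim for dim in shape)


sixty_four = np.zeros(64)
explicit = resolve_shape(sixty_four.size, (-1, 32))
blocks_of_thirty_two = sixty_four.reshape(explicit)

# On the device (requires a GPU, so not executed here):
# nb_sixty_four = cuda.to_device(sixty_four)
# nb_sixty_four.reshape(*explicit)
```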
double square(double) you can easily call it if you define
extern double square(double) asm ("cfunc._ZN10mymodule11square$2419Ed");
The case of a function like unicode_type myrepeat(unicode_type s, int count) seems to be a bit more complex. I am not quite sure what the signature of this function should look like, or how I should allocate my unicode_type argument (NRT or Py_BuildValue?). Maybe you guys can give me a hint?
import numpy as np
from numba import jitclass, typeof

counter_dtype = np.dtype([('element', np.int32, 5)])
one_counter = np.zeros(1, dtype=counter_dtype)
spec = [
    ("counter", typeof(one_counter))
]

@jitclass(spec)
class UpdatingStuff:
    def __init__(self, counter):
        self.counter = counter

UpdatingStuff(one_counter)
NUMBA_DEBUG_CACHE output gives no useful information about why the cache is recreated. This added up to a lot of files taking some GB of storage when the program was executed a couple of thousand times.
dfq['c'] = dfq['b'] + '|' + dfq['a'] and c_array = dfq['c'].unique(). I saw that numba now supports str operations, but so far my attempts have failed. It would be useful to combine these 2 operations into a single function.
@njit(parallel=True)
def concat_pipe_str(prefix, suffix):
    for i in range(len(prefix)):
        prefix[i] = prefix[i] + '|' + suffix[i]
    return prefix
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type array(pyobject, 1d, C)
During: typing of argument at <ipython-input-2-577501f82f1f> (120)

File "<ipython-input-2-577501f82f1f>", line 120:
def concat_pipe_str(prefix, suffix):
    for i in range(len(prefix)):
    ^
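The array(pyobject, 1d, C) in the error says the columns arrive as object-dtype arrays, which nopython mode cannot type. A sketch of one alternative that skips numba and does both steps in a single function using NumPy's vectorized string routines. This is a swapped-in technique, not a numba solution, and concat_pipe_unique is a made-up name. Note that np.unique returns sorted values, while pandas unique() keeps order of first appearance.

```python
import numpy as np


def concat_pipe_unique(prefix, suffix):
    """Join prefix[i] + '|' + suffix[i] element-wise and return the unique results."""
    # Convert object arrays (e.g. from pandas) to fixed-width unicode arrays.
    prefix = np.asarray(prefix, dtype=str)
    suffix = np.asarray(suffix, dtype=str)
    joined = np.char.add(np.char.add(prefix, "|"), suffix)
    return np.unique(joined)  # sorted, unlike pandas' unique()


c_array = concat_pipe_unique(["x", "y", "x"], ["1", "2", "1"])

# With the dataframe from the question (assumed usage, not executed here):
# c_array = concat_pipe_unique(dfq['b'].to_numpy(), dfq['a'].to_numpy())
```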
I am projecting a bunch of triangles from a 3D triangular mesh onto a 2D detector with many pixels. I currently handle all triangles in parallel with a cuda kernel. Each triangle takes only a little work to project because the triangles are small, so each projection overlaps only a few pixels.
However, sometimes my algorithm runs into large triangles, which cover almost the entire detector, causing one cuda thread to loop over all detector pixels sequentially. This is so slow it almost locks the machine.
To solve this I think it might help to change this loop to a new kernel call, but in numba I cannot call a kernel from a kernel. Is there some solution to this in numba? Or do I have to translate the entire thing to C++ and write a wrapper?
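As far as I know, numba's CUDA target cannot launch a kernel from inside a kernel. One workaround that stays in Python is to split the work on the host before launching: cut each triangle's pixel bounding box into fixed-size tiles and launch one thread per (triangle, tile) pair instead of one per triangle. Below is a minimal host-side sketch, assuming an integer pixel bounding box can be computed per triangle. build_work_items, TILE and the bounding box layout are hypothetical, not part of numba.

```python
import numpy as np

TILE = 16  # hypothetical tile edge length in pixels


def build_work_items(bboxes, tile=TILE):
    """Split per-triangle pixel bounding boxes into tile-sized work items.

    bboxes: (n, 4) integer array of [x0, y0, x1, y1] with exclusive upper bounds.
    Returns an (m, 3) int32 array of (triangle_index, tile_x0, tile_y0).
    """
    items = []
    for tri, (x0, y0, x1, y1) in enumerate(bboxes):
        for ty in range(y0, y1, tile):
            for tx in range(x0, x1, tile):
                items.append((tri, tx, ty))
    return np.array(items, dtype=np.int32).reshape(-1, 3)


# One small triangle and one covering a 64 x 32 pixel region.
bboxes = np.array([[0, 0, 4, 4], [0, 0, 64, 32]])
work_items = build_work_items(bboxes)
```

A kernel launched over work_items would have each thread read its row and loop over at most TILE * TILE pixels, clipped to the triangle's bounding box, so a large triangle is spread over many threads rather than one.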