stuartarchibald
@stuartarchibald
from numba import njit
import numpy as np

@njit
def set_seed(seed):
    # Seeds Numba's internal RNG state, which is separate from NumPy's.
    np.random.seed(seed)

@njit
def foo():
    return np.random.rand()

seed = 1234

set_seed(seed)        # seed the Numba RNG
np.random.seed(seed)  # seed the NumPy RNG

n = 5

for _ in range(n):
    print(foo())          # jitted: draws from Numba's RNG
    print(foo.py_func())  # pure Python: draws from NumPy's RNG
Mateus Interciso
@minterciso
hmm... to stay on the same thread do I need to set parallel=False?
stuartarchibald
@stuartarchibald
what are you hoping to achieve with a parallel RNG?
Mateus Interciso
@minterciso
not on the RNG
on the final code
keeping the same seed is just so I know that my fitness function is actually evolving as I want
stuartarchibald
@stuartarchibald
ok, where does the RNG call get made? Is it made upfront to get a load of samples and then they are used in the parallel section?
Mateus Interciso
@minterciso
no, it's made on the fly when it's needed
stuartarchibald
@stuartarchibald
inside a parallel region ?
Mateus Interciso
@minterciso
yes
for instance
for i in prange(10):
    for j in prange(20):
        if np.random.rand() < pm:
            mutated = np.random.choice(pool)

and this is inside an njit(parallel=True) function
but the np.random.seed() is at the start of the Python code
If getting the same results is just a matter of setting parallel=False and creating a set_seed() njit function, no worries then ;)
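For reference, a minimal sketch of that serial approach (hypothetical pm/pool values; set_seed seeds Numba's own RNG state, as in the example above):

from numba import njit
import numpy as np

@njit
def set_seed(seed):
    # Seeds Numba's internal RNG state, which is separate from NumPy's.
    np.random.seed(seed)

@njit  # parallel=False is the default, so the draw order is deterministic
def mutate(pool, pm):
    out = np.zeros((10, 20))
    for i in range(10):
        for j in range(20):
            if np.random.rand() < pm:
                out[i, j] = np.random.choice(pool)
    return out

set_seed(1234)
print(mutate(np.arange(5.0), 0.1))  # identical output on every run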
stuartarchibald
@stuartarchibald
right, so in that case the seed has no effect
Mateus Interciso
@minterciso
@stuartarchibald exactly
stuartarchibald
@stuartarchibald
because it's not really clear what to do, and the most useful thing for most users, it seems, is for each thread to have its own random sequence, e.g. for use in Monte Carlo
were all threads to inherit the seed from the parent thread, the sequences on each thread would be the same
Mateus Interciso
@minterciso
of course
stuartarchibald
@stuartarchibald
someone was asking about this the other week though, and I proposed making it such that if a seed is set in the parent then each child gets seed + thread_ID or something, so there are predictable but random streams
Mateus Interciso
@minterciso
the way I sometimes did it in CUDA (before learning how to use cuRAND on the device) was to generate all the numbers beforehand on the host, and then each CUDA thread would pick up the numbers it needed...
that's nice
stuartarchibald
@stuartarchibald
that might be your best bet for now, I appreciate it's not particularly convenient though
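A rough sketch of that draw-everything-upfront pattern (hypothetical shapes and pm; all sampling happens serially under a fixed seed, and the parallel region only reads):

import numpy as np
from numba import njit, prange

np.random.seed(1234)
pool = np.arange(5.0)
# Draw every sample upfront, serially, so the run is reproducible.
rands = np.random.rand(10, 20)
choices = np.random.choice(pool, size=(10, 20))

@njit(parallel=True)
def mutate(rands, choices, pm):
    out = np.zeros(rands.shape)
    for i in prange(rands.shape[0]):
        for j in range(rands.shape[1]):
            if rands[i, j] < pm:
                out[i, j] = choices[i, j]  # consume the pre-drawn sample
    return out

print(mutate(rands, choices, 0.1))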
stuartarchibald
@stuartarchibald
@minterciso numba/numba#4452
Mateus Interciso
@minterciso
@stuartarchibald no worries
Joshua Adelman
@synapticarbors_twitter
I was answering a question on Stack Overflow about typed dicts and there was some confusion about the docs that say, "Acceptable key/value types include but are not limited to: unicode strings, arrays, scalars, tuples." This seems to imply that numpy arrays would be valid keys, which they are not because they are not hashable. Should the docs be clarified?
Pearu Peterson
@pearu
PR #4425 implements hashing for byte and str array items.
But yeah, arrays are not hashable (as they are mutable)
stuartarchibald
@stuartarchibald
@synapticarbors_twitter yes, please feel free to fix the docs, arrays can't be keys, the use of arrays as values is fine.
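For example, a small sketch of the rule (string keys, array values):

import numpy as np
from numba import types
from numba.typed import Dict

# Arrays are fine as values; they cannot be keys (not hashable).
d = Dict.empty(
    key_type=types.unicode_type,
    value_type=types.float64[:],
)
d["posx"] = np.asarray([1.0, 0.5, 2.0], dtype=np.float64)
print(d["posx"])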
Joshua Adelman
@synapticarbors_twitter
Would something like "Acceptable key/value types include but are not limited to: unicode strings, arrays (value only), scalars, tuples." be good, or would the following be better: "Acceptable key/value types include but are not limited to: unicode strings, scalars, tuples. Arrays may be used as values only."
stuartarchibald
@stuartarchibald
I think the former is great, thanks!
stuartarchibald
@stuartarchibald
Also, @synapticarbors_twitter, thanks for answering questions on Stack Overflow, much appreciated :)
Greg
@grej
Hello everyone. Is there a way to get the current system time from within jitted Numba code? I am looking for a way to profile bottlenecks in a fairly large Numba codebase, and I’d like to store a summary of the time spent in each function. The new Numba dictionaries would be ideal for this, but I am uncertain how to get the system time from inside the jitted code.
Pearu Peterson
@pearu
@grej, not sure if it's usable for reliable timing, but here's an example of using the LLVM readcyclecounter intrinsic:
from numba import types, njit
from numba.extending import intrinsic
from llvmlite import ir

@intrinsic
def readcyclecounter(typingctx):
    sig = types.int64()
    def codegen(context, builder, signature, args):
        # Declare the LLVM intrinsic and emit a call to it.
        fnty = ir.FunctionType(ir.IntType(64), [])
        fn = builder.module.declare_intrinsic('llvm.readcyclecounter', fnty=fnty)
        return builder.call(fn, [])
    return sig, codegen

@njit
def foo():
    start = readcyclecounter()
    # Do something here
    end = readcyclecounter()
    elapsed = end - start
    return elapsed

print(foo())
Hameer Abbasi
@hameerabbasi
Is np.array(typed_list) implemented in master?
Guryanov Alexey
@Goorman
Hi guys
Is it possible to force prange to use a limited number of threads at runtime? (i.e. I want a parallel function to execute with 1/2/4/8 threads without rewriting the code)
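(Newer Numba releases expose set_num_threads/get_num_threads for exactly this; a sketch, assuming a version that has them — otherwise the NUMBA_NUM_THREADS environment variable, set before import, is the only knob:)

import numpy as np
from numba import njit, prange, set_num_threads, get_num_threads

@njit(parallel=True)
def total(x):
    acc = 0.0
    for i in prange(x.shape[0]):  # parallel reduction over x
        acc += x[i]
    return acc

x = np.ones(1_000_000)
for nthreads in (1, 2, 4, 8):
    set_num_threads(nthreads)  # cannot exceed NUMBA_NUM_THREADS
    print(get_num_threads(), total(x))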
stuartarchibald
@stuartarchibald
@grej if you can tolerate jumping back into the interpreter temporarily you can do this:
In [16]: from time import time; from numba import njit, objmode

In [17]: @njit
    ...: def time_now():
    ...:     with objmode(t='float64'):  # hop back to the interpreter briefly
    ...:         t = time()
    ...:     return t
    ...:

In [18]: time_now()
Out[18]: 1566205053.2636802

In [19]: time_now()
Out[19]: 1566205054.8763537

In [20]: time_now()
Out[20]: 1566205056.2839804

In [21]: @njit
    ...: def work():
    ...:     ts = time_now()
    ...:     acc = 0
    ...:     for i in range(1000000):
    ...:         acc += i
    ...:         acc /= 7.
    ...:     te = time_now()
    ...:     print("Elapsed = ", te - ts)
    ...:     return acc
    ...:

In [22]: work()
Elapsed =  0.012462377548217773
Out[22]: 166666.47222222222
@hameerabbasi not yet, please open a feature request if this is important to you.
Pearu Peterson
@pearu
@stuartarchibald , @grej, anybody: just wondering what would be needed in order to call the POSIX time or clock_gettime function directly from a jitted function? Are there any concerns when doing so?
stuartarchibald
@stuartarchibald
@pearu from an @intrinsic I think you'd need to ABI-compatibly define the C structs declared in time.h, recreate the glibc/kernel-level definitions of e.g. clk_id, make sure you're on a system that supports these things, and then builder.module.get_or_insert_function(type, "c_func_name") and builder.call that.
Doing this reliably is easier said than done. The most reliable thing is probably to recreate what CPython does in time.time and add C code to Numba's _helperlib.c as needed to hide the ABI difficulties noted.
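A lighter-weight sketch in that direction: Numba can call ctypes-wrapped C functions from nopython code, so binding an argument-free call like libc's clock() sidesteps the struct/ABI issues (assumes Linux/glibc and a 64-bit clock_t):

import ctypes
from numba import njit

libc = ctypes.CDLL("libc.so.6")
c_clock = libc.clock          # clock_t clock(void)
c_clock.restype = ctypes.c_int64
c_clock.argtypes = ()

@njit
def cpu_seconds():
    # POSIX defines CLOCKS_PER_SEC as 1000000.
    return c_clock() / 1_000_000.0

print(cpu_seconds())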
Bryan Dixon
@javawolfpack_gitlab

So curious if anyone has experience with the Numba CUDA implementation? Not sure why not all of the results of a matrix operation show up in the output...

import numpy as np
from numba import cuda

@cuda.jit
def matadd(A, B, C):
    i, j = cuda.grid(2)
    C[i][j] = A[i][j] + B[i][j]

n = 4
a = np.random.uniform(low=-100, high=100, size=(n, n)).astype(np.float32)
b = np.random.uniform(low=-100, high=100, size=(n, n)).astype(np.float32)
result = np.zeros((n, n), dtype=np.float32)
matadd(a, b, result)
print(result)

The result only shows a value in the first row/column; the rest are still 0s

For example, this is the output:
[[-13.121542   0.         0.         0.      ]
 [  0.         0.         0.         0.      ]
 [  0.         0.         0.         0.      ]
 [  0.         0.         0.         0.      ]]
Bryan Dixon
@javawolfpack_gitlab
Seems it's a CUDA memory/shared memory detail I was missing...
stuartarchibald
@stuartarchibald
@javawolfpack_gitlab https://numba.pydata.org/numba-doc/dev/cuda/kernels.html#absolute-positions, take a look at the example calls in there; note it is function_name[blocks_per_grid, threads_per_block](args). Without a launch configuration it'll default to, I think, a 1x1 grid, which is not what you want!
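Concretely, a sketch of the matadd example from above with an explicit launch configuration (plus a bounds guard in case the grid overshoots the array):

import numpy as np
from numba import cuda

@cuda.jit
def matadd(A, B, C):
    i, j = cuda.grid(2)
    if i < C.shape[0] and j < C.shape[1]:  # guard against grid overshoot
        C[i, j] = A[i, j] + B[i, j]

n = 4
a = np.random.uniform(-100, 100, size=(n, n)).astype(np.float32)
b = np.random.uniform(-100, 100, size=(n, n)).astype(np.float32)
result = np.zeros((n, n), dtype=np.float32)

threads_per_block = (16, 16)
blocks_per_grid = ((n + 15) // 16, (n + 15) // 16)
matadd[blocks_per_grid, threads_per_block](a, b, result)
print(result)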
Bryan Dixon
@javawolfpack_gitlab
@stuartarchibald thanks!! definitely what I was missing initially... not sure how best to fine-tune it, been way too long since I last programmed w/ CUDA
stuartarchibald
@stuartarchibald
@javawolfpack_gitlab see https://numba.pydata.org/numba-doc/latest/cuda/kernels.html#choosing-the-block-size, but basically, the best place to look is the CUDA C programming guide: https://docs.nvidia.com/cuda/cuda-c-programming-guide/