I've an array with shape (4, 2, 12)
and ive got a function
def cos(v1 : Vec, v2 : Vec) -> float:
return jnp.sum(v1 * v2) / (jnp.sqrt(jnp.sum(v1**2)) * (jnp.sqrt(jnp.sum(v2**2))) + 0.000001)
for each (2, 12) of the 4 elements, id like to apply the cos function, how can i vectorize it?
e.g this is what it looks like
(4, 2, 12)
[[[0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]
and result should be something like:
[.9 1 1 1]
[cos2(x) for x in arr]
where cos2 = def cos2(v : Vec) -> float:
print(v.shape)
v1 = v[0]
v2 = v[1]
return jnp.sum(v1 * v2) / (jnp.sqrt(jnp.sum(v1**2)) * (jnp.sqrt(jnp.sum(v2**2))) + 0.000001)
__array_ufunc__
__array_function__
interfaces (which are really great) but I have seen many ideas here and there (extensible dtypes, duckarray, context-local and global override API and more). My question is which of those solutions is implemented and available for testing ?
Hey all :),
I would really appreciate your help on optimizing the following code:
_cells = np.copy(self._cells)
for pcells, pstate in self._patterns:
height, width = pcells.shape
for cy, cx in np.ndindex(_cells.shape):
selection = _cells[cy : cy + height, cx : cx + width]
if np.array_equal(selection, pcells):
for sy, sx in np.ndindex(selection.shape):
if selection[sy, sx] == Cell.Alive:
self._cells[cy + sy, cx + sx] = pstate
What it basically does is that it marks certain patterns (of different size) within a big matrix. I wonder if there is a more purely numpy way of doing this? Any ideas how to optimize this? Because this is currently very computational heavy :/
Thank you very much in advance <3
Best,
Felix
In [1]: from threadpoolctl import ThreadpoolController
...: import numpy as np
...: import time
...:
...: controller = ThreadpoolController()
...: a = np.random.random((500, 321, 321))
...: b = np.random.random((500, 321))
In [2]: st = time.perf_counter()
...: _ = np.linalg.solve(a, b)
...: ed = time.perf_counter()
...: print(ed - st)
5.766736300000002
In [3]: st = time.perf_counter()
...: with controller.limit(limits=1, user_api='blas'):
...: _ = np.linalg.solve(a, b)
...: ed = time.perf_counter()
...: print(ed - st)
0.5034510000000019
Hey all, I'm trying to pip install
the requirements.txt specified here, but I'm running into an issue. When pip
gets to the point where scipy-1.4.1 is being installed, the build fails with an error:
Collecting scipy==1.4.1
Using cached scipy-1.4.1.tar.gz (24.6 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... error
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [60 lines of output]
...
Original error was: /tmp/pip-build-env-z8oxljl3/overlay/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so: undefined symbol: cblas_sgemm
The error message indicated that there were many ways in which this issue could arise, so here's some more info for context:
pyenv
to manage python versions. I'm running this on python version 3.9.12, and this error happens with a totally fresh install.uname -rms
: Linux 5.17.8-arch1-1 x86_64
pyenv global 3.9.12
, however, and I've confirmed that I can pip install scipy
without issue.Anyone have any advice?
numpy.convolve
had no hints)data = load_data(...)
data = data[:,1]
dc_part = gaussian(data, sigma)
rectified = abs(data - dc_part)
...
save_dat(...)
but the objects are either backed by file storage (effectively similar to swapping them out, but some of the data is already on disk so that wouldn't need to go through swap space, and accessors might be a bit more wary of preferring sequential access) or even creating a chain of lazily performed operations that get run over windows of the original data when accessed / saved.[array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2], dtype=int32)
array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2], dtype=int32)
array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0], dtype=int32)
array([0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 2, 0], dtype=int32)
array([0, 1, 0, 1, 1, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0], dtype=int32)
array([0, 0, 1, 1, 0, 0, 1, 2, 0, 0, 0, 2, 2, 0, 0, 0], dtype=int32)
array([0, 0, 0, 1, 1, 0, 0, 0, 1, 2, 2, 0, 2, 0, 0, 0], dtype=int32)
array([0, 0, 0, 1, 2, 1, 1, 2, 0, 0, 0, 2, 0, 0, 0, 0], dtype=int32)
array([0, 0, 0, 0, 1, 0, 0, 2, 1, 2, 0, 1, 2, 0, 0, 0], dtype=int32)
array([0, 0, 0, 1, 2, 0, 1, 2, 1, 0, 2, 0, 0, 0, 0, 0], dtype=int32)]
Why does this return (10,) whenever I print np.shape, and not (10,16)?
NPY_NO_EXPORT
, which is defined by #define NPY_NO_EXPORT NPY_VISIBILITY_HIDDEN
. I saw a lot of usage of NPY_NO_EXPORT
in definition of various c functions and python types, e.g., NPY_NO_EXPORT PyTypeObject PyArray_Type = {
. Any explaination is good for me! Thanks. I'd guess this MACRO means the defined funcs or types are only used in NumPy C world, and not in NumPy Python wrold?
NPY_NO_EXPORT
means the functions are internal to NumPy and are not available for general use. The functions marked with NUMPY_API
are exported via a mechanism that overrides the NPY_NO_EXPORT
macro at build time. All the non-static functions should be marked with NPY_NO_EXPORT
, if there are any public ones they must be expressly declared as such and become part of the NumPy C-API.
-O0 -g
to disable compiler optimization and enable debugging info. Where should I modify in the setup.py
process? NumPy building process and the distutitls
package is kind of overwhelming for me... Emmm... Or any document about this is helpful.
Deferred UFunc evalulation
using C++ expression template, https://numpy.org/neps/nep-0011-deferred-ufunc-evaluation.html. But its status is deferred, I wonder if numpy's already implemented lazy ufunc evalulation using another way? I searched in the numpy source code, I feel numpy doesn't impl this feature.