guydoodmanbro
@guydoodmanbro:matrix.org

I have an array with shape (4, 2, 12)

and I've got a function

def cos(v1 : Vec, v2 : Vec) -> float:
    return jnp.sum(v1 * v2) / (jnp.sqrt(jnp.sum(v1**2)) * (jnp.sqrt(jnp.sum(v2**2))) + 0.000001)

For each of the 4 elements of shape (2, 12), I'd like to apply the cos function. How can I vectorize it?

e.g this is what it looks like
(4, 2, 12)

[[[0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]

and result should be something like:
[.9 1 1 1]

so like the behaviour of [cos2(x) for x in arr] where cos2 =
def cos2(v : Vec) -> float:
    print(v.shape)
    v1 = v[0]
    v2 = v[1]
    return jnp.sum(v1 * v2) / (jnp.sqrt(jnp.sum(v1**2)) * (jnp.sqrt(jnp.sum(v2**2))) + 0.000001)
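
One way to get the [cos2(x) for x in arr] behaviour without a Python loop is to reduce over the last axis only, so all 4 pairs are handled in one batched expression. The sketch below uses NumPy so that it runs without JAX installed; the same expressions should carry over with jnp in place of np, and jax.vmap(cos2) is another option. The name batched_cos and the random sample data are made up for illustration.

```python
import numpy as np

EPS = 0.000001  # same stabiliser as in cos / cos2 above

def batched_cos(arr):
    """Cosine similarity of arr[i, 0] and arr[i, 1] for every i.

    arr is assumed to have shape (n, 2, d); the result has shape (n,).
    """
    v1 = arr[:, 0, :]
    v2 = arr[:, 1, :]
    # Reduce over the last axis only, so each pair gets its own value.
    dot = np.sum(v1 * v2, axis=-1)
    norms = np.sqrt(np.sum(v1**2, axis=-1)) * np.sqrt(np.sum(v2**2, axis=-1))
    return dot / (norms + EPS)

rng = np.random.default_rng(0)
arr = rng.random((4, 2, 12))   # illustrative data, not the array from the question
result = batched_cos(arr)      # shape (4,)
```

Note that with this formula an all-zero pair gives 0.0, because the numerator is zero.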
falseywinchnet
@falseywinchnet
Summing an n-dimensional array whose rows contain floating-point numbers and zeros gives a deterministically different floating-point result depending on which rows the numbers are in and how many rows of zeros lie between them.
Why is this?
I.e., depending on the index of the last row in the array, the array sums either to 0.0 or to a value that is off by around 1e-14.
falseywinchnet
@falseywinchnet
ok, no. it's not a bug, it's a shortcoming inherent to the pairwise summation approach.
numba's sum function does the same thing. Kahan summation solved the issue
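
For readers unfamiliar with Kahan summation, here is a minimal pure-Python sketch of the idea: a second variable carries the low-order bits that a plain running sum drops. The function names and the four-element input are chosen for illustration only; this is not the code used in the discussion above.

```python
def naive_sum(values):
    """Plain left-to-right floating-point sum."""
    total = 0.0
    for x in values:
        total += x
    return total

def kahan_sum(values):
    """Compensated (Kahan) summation."""
    total = 0.0
    comp = 0.0  # running estimate of the rounding error lost so far
    for x in values:
        y = x - comp            # re-inject the error lost in earlier steps
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was just lost
        total = t
    return total

# Near 1e16 the spacing between doubles is 2, so adding 1.0 is lost by the
# naive sum, while the compensated sum keeps track of it.
values = [1e16, 1.0, 1.0, -1e16]
print(naive_sum(values), kahan_sum(values))  # prints 0.0 2.0
```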
reckoner
@reckoner

I can do

python setup.py install

from the main repo to build a custom NumPy, but I don't know how to have my customized NumPy installed next to the regular NumPy I get using conda install numpy.

Any directions?

Matti Picus
@mattip
@reckoner I don't understand what you are asking. When you do python setup.py install; cd /tmp; python -c"import numpy; print(numpy)", you should see that numpy is the one you built.
reckoner
@reckoner
@mattip Yes. I can do that. What I was hoping for was import my_custom_numpy alongside the usual import numpy as np. I don't know how to change the module name in the build process so that it doesn't conflict with a previously installed NumPy.
Matti Picus
@mattip
I don't think you can; the name numpy is hardcoded into the library in many places
mocquin
@mocquin
Hello folks, I am looking for options to make my unit package even more numpy-compatible. I already use the __array_ufunc__ and __array_function__ interfaces (which are really great), but I have seen many ideas here and there (extensible dtypes, duckarray, context-local and global override API, and more). My question is: which of those solutions are implemented and available for testing?
Felix Uellendall
@feluelle

Hey all :),
I would really appreciate your help on optimizing the following code:

        _cells = np.copy(self._cells)

        for pcells, pstate in self._patterns:
            height, width = pcells.shape
            for cy, cx in np.ndindex(_cells.shape):
                selection = _cells[cy : cy + height, cx : cx + width]
                if np.array_equal(selection, pcells):
                    for sy, sx in np.ndindex(selection.shape):
                        if selection[sy, sx] == Cell.Alive:
                            self._cells[cy + sy, cx + sx] = pstate

What it basically does is mark certain patterns (of different sizes) within a big matrix. I wonder if there is a more purely numpy way of doing this? Any ideas on how to optimize it? Because this is currently very computationally heavy :/

Thank you very much in advance <3
Best,
Felix

Dave Hirschfeld
@dhirschfeld
You might get more help if you posted a complete, self-contained example that others could run (by simply copy/pasting)
e.g. include dummy data of the same size/shape to test the algorithm
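
As an illustration of what such a self-contained example might look like, here is a sketch of the pattern-marking loop with dummy data, using numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or later) to compare every window against a pattern at once. The names ALIVE and mark_patterns, the integer stand-in for Cell.Alive, and the 6x6 grid are assumptions made for the example, not part of the original code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

ALIVE = 1  # hypothetical stand-in for Cell.Alive

def mark_patterns(cells, patterns):
    """Return a copy of cells in which the alive cells of every exact
    pattern match are set to that pattern's state.

    patterns is an iterable of (pattern_array, state) pairs.
    """
    out = cells.copy()
    for pcells, pstate in patterns:
        height, width = pcells.shape
        # windows[cy, cx] is the (height, width) view starting at (cy, cx).
        windows = sliding_window_view(cells, (height, width))
        # True wherever the whole window equals the pattern.
        matches = (windows == pcells).all(axis=(2, 3))
        alive = pcells == ALIVE
        for cy, cx in np.argwhere(matches):
            # The slice is a view into out, so this writes in place.
            out[cy:cy + height, cx:cx + width][alive] = pstate
    return out

# Dummy data: one L-shaped pattern placed at row 2, column 3.
cells = np.zeros((6, 6), dtype=int)
cells[2, 3] = cells[3, 3] = cells[3, 4] = ALIVE
pattern = np.array([[1, 0],
                    [1, 1]])
marked = mark_patterns(cells, [(pattern, 7)])
```

The comparison builds a temporary boolean array with one entry per window element, so for very large grids or patterns memory use may become the limiting factor.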
Kevin Anderson
@kanderso-nrel
Hello, I'm hoping someone can point me towards reference material regarding the underlying BLAS's threading capabilities. Specifically I'd like to understand this timing difference (numpy 1.22.3 installed from PyPI, so OpenBLAS in this case):
In [1]: from threadpoolctl import ThreadpoolController
...: import numpy as np
...: import time
...:
...: controller = ThreadpoolController()
...: a = np.random.random((500, 321, 321))
...: b = np.random.random((500, 321))

In [2]: st = time.perf_counter()
...: _ = np.linalg.solve(a, b)
...: ed = time.perf_counter()
...: print(ed - st)
5.766736300000002

In [3]: st = time.perf_counter()
...: with controller.limit(limits=1, user_api='blas'):
...:     _ = np.linalg.solve(a, b)
...: ed = time.perf_counter()
...: print(ed - st)
0.5034510000000019
Matti Picus
@mattip
What machine are you running the code on? How many actual cores does it have? When I run this code on my AMD Ryzen 9 3900X 12-Core Processor on ubuntu 20.04, I get 1.2 (without the limit) and 1.7 (with limit=1) seconds
... and what operating system
Kevin Anderson
@kanderso-nrel
Thanks @mattip -- my timings are from Windows 10 on an i7-8850H (6 real cores)
Kevin Anderson
@kanderso-nrel
Worth pointing out that if I install numpy from conda-forge (which afaik uses MKL instead of OpenBLAS) the timing difference goes away. Only the PyPI wheels produce the above times.
Kevin Anderson
@kanderso-nrel
This issue is very relevant: numpy/numpy#15764
Peyton Murray
@peytondmurray

Hey all, I'm trying to pip install the requirements.txt specified here, but I'm running into an issue. When pip gets to the point where scipy-1.4.1 is being installed, the build fails with an error:

Collecting scipy==1.4.1
Using cached scipy-1.4.1.tar.gz (24.6 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
error: subprocess-exited-with-error

× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [60 lines of output]
...

Original error was: /tmp/pip-build-env-z8oxljl3/overlay/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so: undefined symbol: cblas_sgemm

The error message indicated that there were many ways in which this issue could arise, so here's some more info for context:

• I'm using pyenv to manage python versions. I'm running this on python version 3.9.12, and this error happens with a totally fresh install.
• This is being run on arch linux, with uname -rms: Linux 5.17.8-arch1-1 x86_64
• I do currently have multiple versions of python (system version; 3.9.12; and 3.10.2). I've set pyenv global 3.9.12, however, and I've confirmed that I can pip install scipy without issue.
• If it matters, I'm running gcc 12.1.0

chrysn
@chrysn:matrix.org
I'm processing analog signals from high-bitrate audio files into digital data, and am reaching RAM limits. (40 minutes of 192ksamples/s represented as a complex float each are almost 4GB, I can fit that in RAM once but not multiple times).
I'm looking to do things more efficiently along two approaches, for neither of which I have found support in numpy so far:
• Can I do things like convolution (I'm low-pass filtering the signal after multiplying in a complex carrier) in-place? (numpy.convolve had no hints)
• Is there a format in which I can program sequentially, but it'll stash out data into memory maps, or even evaluate lazily? (In the image space I think that GEGL might do that for me). The idea is that I'd write
data = load_data(...)
data = data[:,1]
dc_part = gaussian(data, sigma)
rectified = abs(data - dc_part)
...
save_dat(...)
but the objects are either backed by file storage (effectively similar to swapping them out, but some of the data is already on disk so that wouldn't need to go through swap space, and accessors might be a bit more wary of preferring sequential access) or even creating a chain of lazily performed operations that get run over windows of the original data when accessed / saved.
What I do right now is to do in-place what can be done in-place, and (around the big costly low-pass filtering of the IQ multiplied signal) del everything and reload after I've dropped the input data to the filtering, but that's a rather tedious manual process
Matti Picus
@mattip
@chrysn:matrix.org ray should release the pinning for scipy 1.4.1, which is very old
sorry, @peytondmurray ^^^^
@chrysn:matrix.org one technique is to use chunking and process the data in smaller pieces. You need to be careful about edge effects.
chrysn
@chrysn:matrix.org
yeah, chunking would create edges for my gaussians. i'd need to work concurrently on several chunks, and that's something i'd rather do with an existing library than doing it locally.
@chrysn:matrix.org is looking into dask -- that could just do the right thing
Matti Picus
@mattip
right, dask is a good tool for chunking
chrysn
@chrysn:matrix.org
i might need to do my gaussians manually (as scipy's ndimage.gaussian_filter1d appears to create some numpy arrays internally), but that shouldn't be an issue -- there seems to be a path toward .convolve() there
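
To make the chunking and edge-effect point concrete, here is a minimal NumPy sketch of filtering in chunks with overlap: each chunk is read together with half a kernel width of neighbouring samples on each side, so the stitched result matches a single full-length convolution. The function name, the chunk size and the Gaussian kernel are illustrative assumptions; the input could in principle be an np.memmap rather than an in-memory array, and a library such as dask may handle this bookkeeping for you.

```python
import numpy as np

def chunked_convolve_same(signal, kernel, chunk_size):
    """Chunked equivalent of np.convolve(signal, kernel, mode="same").

    Assumes a 1-D signal at least as long as the kernel, and an
    odd-length kernel.
    """
    half = len(kernel) // 2
    n = len(signal)
    out = np.empty(n, dtype=np.result_type(signal, kernel))
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # Extend the chunk by `half` samples on each side where available.
        lo = max(start - half, 0)
        hi = min(stop + half, n)
        full = np.convolve(signal[lo:hi], kernel, mode="full")
        # In "full" mode, the output aligned with piece index j is full[j + half].
        offset = (start - lo) + half
        out[start:stop] = full[offset:offset + (stop - start)]
    return out

# Illustrative data: a random signal and a normalised Gaussian kernel.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1025)
t = np.arange(-10, 11)
kernel = np.exp(-0.5 * (t / 3.0) ** 2)
kernel /= kernel.sum()

chunked = chunked_convolve_same(signal, kernel, chunk_size=256)
```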
prittjam
@prittjam
It seems that in NumPy it is more natural to right-multiply matrices than to left-multiply, e.g., x.T @ A.T vs A @ x. In particular, when vectorizing linear algebra operations, there is often a need to reshape or add dimensions. Row vectors seem to be a more natural fit with the assumptions in the NumPy API than column vectors for these types of operations. E.g., adding dimensions prepends rather than appends. For easy array reshaping and slicing, for speed, and for interoperability with other libraries, should the row-vector representation be preferred?
prittjam
@prittjam
I have an example. E.g., the atleast_2d and atleast_3d functions don't have an axis argument; they always prepend. So if you're working with column vectors, you need to re-implement these to get columns for linear algebra operations.
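
A small sketch of the difference, using only standard NumPy calls: np.atleast_2d turns a 1-D array into a row, while indexing with np.newaxis or np.expand_dims with axis=-1 gives a column. The array values are arbitrary.

```python
import numpy as np

x = np.arange(3.0)

row = np.atleast_2d(x)               # shape (1, 3): the new axis is prepended
col = x[:, np.newaxis]               # shape (3, 1): the new axis is appended
col2 = np.expand_dims(x, axis=-1)    # shape (3, 1), same result as col

A = np.arange(9.0).reshape(3, 3)
left = A @ col                       # column-vector convention, shape (3, 1)
right = row @ A.T                    # row-vector convention, shape (1, 3)
```

Both conventions carry the same numbers, since right is the transpose of left.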
Bigyan Karki
@bigyankarki
[array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2], dtype=int32)
array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2], dtype=int32)
array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0], dtype=int32)
array([0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 2, 0], dtype=int32)
array([0, 1, 0, 1, 1, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0], dtype=int32)
array([0, 0, 1, 1, 0, 0, 1, 2, 0, 0, 0, 2, 2, 0, 0, 0], dtype=int32)
array([0, 0, 0, 1, 1, 0, 0, 0, 1, 2, 2, 0, 2, 0, 0, 0], dtype=int32)
array([0, 0, 0, 1, 2, 1, 1, 2, 0, 0, 0, 2, 0, 0, 0, 0], dtype=int32)
array([0, 0, 0, 0, 1, 0, 0, 2, 1, 2, 0, 1, 2, 0, 0, 0], dtype=int32)
array([0, 0, 0, 1, 2, 0, 1, 2, 1, 0, 2, 0, 0, 0, 0, 0], dtype=int32)]
Why does this return (10,) whenever I print np.shape, and not (10,16)?
Jody Klymak
@jklymak
Because it’s a list of 10 arrays
Bigyan Karki
@bigyankarki
hmm is there any way i can reshape it to a 10x16 array?
Jody Klymak
@jklymak
np.array the whole thing
Bigyan Karki
@bigyankarki
wym by array the whole thing?
Bigyan Karki
@bigyankarki
If you meant np.array(arr), it didn't work
Angus Hollands
@agoose77:matrix.org
@bigyankarki: where are you getting this list from? You probably should modify that so that you generate the correct shape to begin with. But, what do you mean by "it didn't work"? Did you get an error?
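
One situation that produces exactly these symptoms, offered as a guess since the origin of the data isn't shown, is a NumPy array with dtype=object whose ten elements are separate arrays. In that case np.array(arr) keeps the (10,) shape, while np.stack joins the rows into a regular 2-D array. The variable names and values below are made up for illustration.

```python
import numpy as np

# Build an object array holding ten separate int32 arrays of length 16.
rows = [np.arange(16, dtype=np.int32) for _ in range(10)]
arr = np.empty(10, dtype=object)
for i, r in enumerate(rows):
    arr[i] = r

print(np.shape(arr))         # (10,)
print(np.array(arr).shape)   # (10,)  -- still an object array

stacked = np.stack(arr)      # joins the rows along a new first axis
print(stacked.shape)         # (10, 16)
```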
Romit Maulik
@Romit-Maulik
Hello - I was wondering if anyone had experience with calling a Python module using numpy from C++. Specifically - I am trying to do some simple task parallelism from C++ using std::thread and while (I'd like to believe) my coupling works appropriately when I do not use numpy - I seem to have a deadlock issue when calling any numpy function (such as np.sum(x)) etc.
I'd be happy to send across a minimum working example as an attachment
Da Li
@dlee992
Hi, guys. I want to know the meaning of NPY_NO_EXPORT, which is defined by #define NPY_NO_EXPORT NPY_VISIBILITY_HIDDEN. I saw a lot of usage of NPY_NO_EXPORT in the definitions of various C functions and Python types, e.g., NPY_NO_EXPORT PyTypeObject PyArray_Type = {. Any explanation would be welcome! Thanks. I'd guess this macro means the defined funcs or types are only used in the NumPy C world, and not in the NumPy Python world?
Matti Picus
@mattip
NPY_NO_EXPORT means the functions are internal to NumPy and are not available for general use. The functions marked with NUMPY_API are exported via a mechanism that overrides the NPY_NO_EXPORT macro at build time. All the non-static functions should be marked with NPY_NO_EXPORT, if there are any public ones they must be expressly declared as such and become part of the NumPy C-API.
Da Li
@dlee992
Thanks, matti! Got it.
Da Li
@dlee992
Hi, guys. I want to compile NumPy extensions with the flags -O0 -g to disable compiler optimization and enable debugging info. Where in the setup.py process should I make this change? The NumPy build process and the distutils package are kind of overwhelming for me... Or any documentation about this would be helpful.
hi! I was wondering how the acceptance of pep-646 (variadic generics) will affect the module numpy.typing. Are there any plans to implement static type checking for array shapes?
Hi, guys. I found an old NEP about deferred ufunc evaluation using C++ expression templates, https://numpy.org/neps/nep-0011-deferred-ufunc-evaluation.html. But its status is deferred. I wonder if numpy has already implemented lazy ufunc evaluation in another way? I searched the numpy source code, and it looks to me like numpy doesn't implement this feature.