

hiya, back to my question from above about loading big EDS maps. After using a hack that @sem-geologist suggested over on GitHub, I can get all my BCF files to load, but now I want to start exploring the data with the decomposition tools of HyperSpy. I have loaded the data as a lazy signal:

```
sig = hs.load('hdf5/BA_map_*.hspy', stack=True, lazy=True)
sig
```

which gives me an object: `<LazyEDSSEMSpectrum, title: hdf5, dimensions: (999, 999, 108|2048)>`

But when I try to run my stack through PCA I keep getting the error

`Axis value must be an integer, got range(0, 3)`

Any suggestions on what is causing the problem? Here is the full traceback:

```
----> 2 sig.decomposition(True, algorithm='PCA', output_dimension=20)
      3 #sigt.plot_decomposition_results()
      4 sig.plot_explained_variance_ratio(log=False)
      5 sig.plot_explained_variance_ratio(log=True)

~\AppData\Local\conda\conda\envs\hyperspy2\lib\site-packages\hyperspy\_signals\lazy.py in decomposition(self, normalize_poissonian_noise, algorithm, output_dimension, signal_mask, navigation_mask, get, num_chunks, reproject, bounds, **kwargs)
    762         sdim = self.axes_manager.signal_dimension
    763         bH, aG = da.compute(
--> 764             data.sum(axis=range(ndim)),
    765             data.sum(axis=range(ndim, ndim + sdim)))
    766         bH = da.where(sm, bH, 1)

~\AppData\Local\conda\conda\envs\hyperspy2\lib\site-packages\dask\array\core.py in sum(self, axis, dtype, keepdims, split_every, out)
   1754         from .reductions import sum
   1755         return sum(self, axis=axis, dtype=dtype, keepdims=keepdims,
-> 1756                    split_every=split_every, out=out)
   1757
   1758     @derived_from(np.ndarray)

~\AppData\Local\conda\conda\envs\hyperspy2\lib\site-packages\dask\array\reductions.py in sum(a, axis, dtype, keepdims, split_every, out)
    229     dt = getattr(np.empty((1,), dtype=a.dtype).sum(), 'dtype', object)
    230     return reduction(a, chunk.sum, chunk.sum, axis=axis, keepdims=keepdims,
--> 231                      dtype=dt, split_every=split_every, out=out)
    232
    233

~\AppData\Local\conda\conda\envs\hyperspy2\lib\site-packages\dask\array\reductions.py in reduction(x, chunk, aggregate, axis, keepdims, dtype, split_every, combine, name, out, concatenate, output_size)
    127     if isinstance(axis, int):
    128         axis = (axis,)
--> 129     axis = validate_axis(axis, x.ndim)
    130
    131     if dtype is None:

~\AppData\Local\conda\conda\envs\hyperspy2\lib\site-packages\dask\array\utils.py in validate_axis(axis, ndim)
    142         return tuple(validate_axis(ax, ndim) for ax in axis)
    143     if not isinstance(axis, numbers.Integral):
--> 144         raise TypeError("Axis value must be an integer, got %s" % axis)
    145     if axis < -ndim or axis >= ndim:
    146         raise AxisError("Axis %d is out of bounds for array of dimension %d"

TypeError: Axis value must be an integer, got range(0, 3)
```
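The last frame of the traceback shows the root cause: HyperSpy passes `axis=range(ndim)` to `dask.array.sum`, but dask's `validate_axis` only accepts an integer or a tuple of integers, and a `range` object is neither. A simplified sketch of the check, following the logic visible in the traceback (not dask's actual source):

```python
import numbers

def validate_axis(axis, ndim):
    """Simplified sketch of dask's validate_axis, per the traceback above."""
    if isinstance(axis, tuple):
        # A tuple is unpacked and each element validated individually.
        return tuple(validate_axis(ax, ndim) for ax in axis)
    if not isinstance(axis, numbers.Integral):
        raise TypeError("Axis value must be an integer, got %s" % axis)
    if axis < -ndim or axis >= ndim:
        raise ValueError("Axis %d is out of bounds for dimension %d" % (axis, ndim))
    return axis

# A range object is neither a tuple nor an Integral, so it is rejected:
try:
    validate_axis(range(0, 3), 4)
except TypeError as e:
    print(e)  # Axis value must be an integer, got range(0, 3)

# Wrapping the range in a tuple makes it acceptable:
print(validate_axis(tuple(range(0, 3)), 4))  # (0, 1, 2)
```

So a fix on the HyperSpy side would be to pass `tuple(range(...))` instead of a bare `range`.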

@jeinsle Could you try slicing the data to get a tiny dataset (`sig2 = sig.inav[:10, :10, :5]` should be fine) and then running

`sig2.compute()`

followed by `sig2.decomposition(True, algorithm='PCA', output_dimension=20)`

? I'm just wondering if the problem is with the lazy part or the regular decomposition part.
@jeinsle, PCA doesn't work properly in lazy mode even when it runs. We need to fix that. NMF should work better, but setting the various parameters is not straightforward. I would try optimizing the NMF parameters on a small section of the dataset and then go for the whole thing.
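The tune-on-a-subset workflow can be sketched with a toy NumPy NMF using Lee-Seung multiplicative updates; this is illustrative code, not HyperSpy's API (in HyperSpy you would keep calling `decomposition` with the NMF algorithm on the cropped signal):

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(X, n_components, n_iter=300, eps=1e-9):
    """Minimal NMF via Lee-Seung multiplicative updates: X ~= W @ H."""
    n, m = X.shape
    W = rng.random((n, n_components))
    H = rng.random((n_components, m))
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative throughout.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# A small non-negative "dataset" with 3 underlying components, standing in
# for a cropped section of the full spectrum image.
true_W = rng.random((100, 3))
true_H = rng.random((3, 50))
X = true_W @ true_H

W, H = nmf(X, n_components=3)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {err:.4f}")
```

Once `n_components` (and any other settings) look right on the subset, the same parameters can be applied to the full dataset.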

@MurilooMoreira, you get the explained variance in `explained_variance` only when setting `centre=True` and `normalize_poissonian_noise=False`; this is the only decomposition that should be called PCA. But I don't advise you to do that, since then the decomposition will be worse (more components and more noise). The thing is that, when we use different settings (typically `centre=False`), we still call it PCA, but it isn't PCA, just plain SVD. Therefore, what you get in the wrongly named `explained_variance` attribute is the singular values squared and divided by the number of components. This becomes the explained variance only when doing standard PCA with the settings mentioned above.
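The relation between singular values and explained variance can be checked numerically: under the usual PCA convention, for centred data the squared singular values scaled by the number of observations equal the variances along the principal components (the eigenvalues of the covariance matrix). A NumPy sketch, not HyperSpy code:

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated toy data: 200 observations of 5 variables.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

Xc = X - X.mean(axis=0)            # centring is what makes SVD into PCA
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

explained_variance = s**2 / Xc.shape[0]

# The same quantities are the eigenvalues of the covariance matrix:
cov_eigvals = np.sort(np.linalg.eigvalsh(Xc.T @ Xc / Xc.shape[0]))[::-1]
print(np.allclose(explained_variance, cov_eigvals))  # True
```

Without the centring step, the singular values no longer have this variance interpretation, which is the point being made above.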
Ahh, that is great to know. I can maybe do that. The thing for me is to figure out how many of the 108 tiles are representative enough to work with. Does anyone in the community have documented methods for working with sections of a dataset and then generalizing?

> `sig2.compute()`
>
> followed by `sig2.decomposition(True, algorithm='PCA', output_dimension=20)`
>
> ? I'm just wondering if the problem is with the lazy part or the regular decomposition part.

Yeah, this cropped-down version seems to be running, so I will now look into how to break down big datasets and piece together solutions.

Hi, I'm trying to get the intensity values for different diffraction spots. I'm using interactive ROI to select the spots:

```
roi = hs.roi.CircleROI(20, 20, 20, r_inner=0)
signal.plot()
roi_circ = roi.interactive(signal, color='red')
```

and then using

```
roi_circ.events.data_changed.trigger(roi_circ)
roi_circ.data.sum()
```

to return the intensity values. Currently I am re-running the cell to get the values. Is there a way to continuously return `roi_circ.data.sum()` every time I move the ROI?

Thanks
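One way to get a live readout is to register a callback on the change event instead of re-running the cell; HyperSpy signals and ROIs expose an `events` container for exactly this (exact event names may vary by version). The idea, sketched with a minimal stand-alone observer so it runs without HyperSpy; `FakeCircleROI` and `on_roi_changed` are hypothetical names:

```python
import numpy as np

# Minimal stand-in for HyperSpy's event mechanism: register a callback
# that recomputes the ROI sum whenever the ROI moves.
class Event:
    def __init__(self):
        self._callbacks = []
    def connect(self, fn):
        self._callbacks.append(fn)
    def trigger(self, *args, **kwargs):
        for fn in self._callbacks:
            fn(*args, **kwargs)

class FakeCircleROI:
    """Hypothetical circular ROI: stores centre/radius, fires `changed`."""
    def __init__(self, cx, cy, r):
        self.cx, self.cy, self.r = cx, cy, r
        self.changed = Event()
    def move(self, cx, cy):
        self.cx, self.cy = cx, cy
        self.changed.trigger(self)

image = np.arange(100.0).reshape(10, 10)
sums = []

def on_roi_changed(roi):
    # Sum the pixels inside the circle each time the ROI changes.
    yy, xx = np.ogrid[:image.shape[0], :image.shape[1]]
    mask = (xx - roi.cx) ** 2 + (yy - roi.cy) ** 2 <= roi.r ** 2
    sums.append(image[mask].sum())

roi = FakeCircleROI(2, 2, 1.5)
roi.changed.connect(on_roi_changed)
roi.move(2, 2)   # each move recomputes the sum automatically
roi.move(7, 7)
print(sums)  # [198.0, 693.0]
```

In HyperSpy itself the equivalent would be connecting your summing function to the sliced signal's `events.data_changed` rather than triggering it manually.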

Trying to investigate the failing tests of #2240, I'm noticing that on master, running

`pytest --mpl hyperspy\tests\drawing\test_plot_signal2d.py`

results in a lot of failed image comparisons.
The following are the original, test-generated and difference images:

Are mpl-tests platform-dependent?

no errors or warnings

@DrYGuo

The reason that happens is that jupyter notebook defaults to a *non-interactive* "backend" for matplotlib.

Unfortunately, this doesn't seem to be working in (the otherwise very impressive) google colab environment.

When I call the former, there is just no output.

The attached video shows how it should look with a simple example.

Sorry for being off-topic, but we're looking for someone to do data analysis and development across electron and X-ray imaging using HyperSpy and related tools. Please forward this to anyone you think might be interested:

https://vacancies.diamond.ac.uk/vacancy/data-analysis-scientist-xray-and-electron-scanning-microscopy-391631.html


```
s2 = hs.datasets.example_signals.object_hologram()
line1 = hs.roi.Line2DROI(200, 200, 400, 400)
s2.plot()
r = line1.interactive(s2, color='red')
```

HyperSpy loads it as a `ComplexSignal2D`

plotting it like so:

whereas DM shows:

DM displays the log of the modulus of complex values by default, with bilinear interpolation

there may be some value in making such a display the default for power spectra
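The DM-style default described above is easy to reproduce by hand; a NumPy sketch (the function name is illustrative, not HyperSpy's API):

```python
import numpy as np

def dm_style_display(complex_image, eps=1e-12):
    """Log of the modulus, as DM displays complex data by default."""
    return np.log(np.abs(complex_image) + eps)  # eps avoids log(0)

# Example: the FFT of a real image has a huge dynamic range; taking the
# log of the modulus compresses it into something viewable.
image = np.random.default_rng(2).random((64, 64))
fft = np.fft.fftshift(np.fft.fft2(image))
display = dm_style_display(fft)
print(display.shape, display.dtype)
```

(Interpolation, such as DM's bilinear resampling, would then be a plotting concern rather than a data transform.)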