Of course records are supposed to be fixed-length and sorted by time, so this should still not happen.
The data is fixed length, not the length field itself, isn't it?
Dear all who are currently reviewing the nveto plugins and hitlet functions: thanks a lot. Please note that I uploaded a new notebook which explains in detail the purpose of the different hitlet functions and how they are embedded in the nveto plugins. You can find the corresponding comment here: https://github.com/AxFoundation/strax/pull/275#issuecomment-656025873
The same notebook also serves as a mini-introduction to strax(en) for the nveto subgroup, so some comments may be less interesting for the reviewers.
All issues raised so far have been resolved. I hope this new notebook helps speed up the reviewing process, so that more people from the nveto subgroup can start playing with the data without installing their own strax(en).
But it's easy to check: change that line to strax.endtime(self.data).max() and reprocess the run. If this hypothesis is right, it will crash earlier, on some raw_records / records chunk.
@JelleAalbers you are right. We end up with this:
ValueError: Attempt to create chunk [008543.raw_records: 1594281342sec 999999000 ns - 1594281348sec 499999000 ns, 5836752 items, 258.9 MB/s] whose data ends late at 1594281348499999220
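For reference, a minimal sketch of the check Jelle describes; check_chunk_end is a hypothetical helper, but strax.endtime is the real function (it returns time + length * dt per record):

import strax

def check_chunk_end(data, chunk_end):
    # The latest endtime in the data must not exceed the declared chunk end.
    data_end = strax.endtime(data).max()
    if data_end > chunk_end:
        raise ValueError(f'Data ends late: {data_end} > {chunk_end}')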
Alright, thanks Jelle and Daniel! It seems fixed. Here are the corresponding PRs:
AxFoundation/strax#281
XENONnT/straxen#146
Looks fine, but maybe we can do a bit better for raw_records at least.
def cleanup(self, iters, wait_for):
    for d in iters.keys():
        if not self._fetch_chunk(d, iters):
            print(f'Source {d} is finished.')
        else:
            print(f'Source {d} is not exhausted, but was stopped early since another source finished first')
            # TODO: shut down the thread for source d
Hi Jason, as you likely guessed, there is no nice support for this in strax yet; we assume each plugin always exhausts all its inputs. Even if your plugin doesn't, some other plugin or saver may want all of the data for that input, so it's risky to let just one plugin flip a 'kill switch'.
Of course, shutting things down is always possible. You could define a custom exception class (class JasonsKillSwitchActivated(Exception): pass) and .throw() it into the iterator/generator for the input you no longer need. That will kill that input's mailbox, which will shut down the thread writing to it, and anyone else as soon as they try to read from it.
You might have to modify the mailbox code to propagate the exception to mailboxes further upstream, if you have any (i.e. use kill with upstream=True when your exception comes along). If the exception shows up on the screen, you could look where the printing happens and similarly bypass it when your custom exception is caught. If there is another plugin or saver reading the forcefully stopped datatype, it will show the exception in its metadata, but you probably want that behavior (the data is now incomplete). If/when we replace the mailbox system with another concurrency backend, these modifications would have to be revisited.
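A minimal sketch of that kill-switch pattern, assuming iters[d] is the generator for the input you want to stop (as in the cleanup snippet above); the exception name is of course made up:

class JasonsKillSwitchActivated(Exception):
    pass

# Once input d is no longer needed, throw the exception into its generator:
try:
    iters[d].throw(JasonsKillSwitchActivated('input no longer needed'))
except (JasonsKillSwitchActivated, StopIteration):
    pass  # the generator is now closed; the mailbox feeding it shuts down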
Use st.set_config() and specify your config settings there; then when you run st.make() or st.get_array() etc., you should trigger reprocessing of the data, because you have reset your context. Every time I've had plugins share a config setting, this method has worked for me.
You can also pass the settings when creating the context, st = straxen.contexts.your_context(some_settings). I just always do it after, for ease.
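A minimal sketch of that workflow; the run id and option name are illustrative, while set_config and get_array are real strax Context methods:

import straxen

st = straxen.contexts.demo()
# Override an option shared by several plugins (name is illustrative):
st.set_config(dict(hit_min_amplitude=30))
# The new config changes the plugin lineage, so strax reprocesses:
hits = st.get_array('012345', 'hits')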
In @strax.takes_config(...) I set defaults for the config options.
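For example, a minimal plugin skeleton with a default set via strax.Option (the plugin and option names are made up):

import strax

@strax.takes_config(
    strax.Option('hit_min_amplitude', default=15,
                 help='Minimum hit amplitude (ADC counts)'),
)
class MyHits(strax.Plugin):
    depends_on = 'records'
    provides = 'my_hits'

    def compute(self, records):
        # The default is available here as self.config['hit_min_amplitude']
        ...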