Sameroom
@sameroom-bot
[Adam Delgado, Neuraxio] Looking forward to your presentation. Thanks for the invitation to the slack channel :)
Sameroom
@sameroom-bot
[ajay vamsi, Neuraxio] hello everyone !
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] Reminder: I will present at the CAMDEA Digital Forum between approx 6:25pm- 6:45pm! Link in my previous comment above.
Sameroom
@sameroom-bot
[aihang, Neuraxio] :dancer: Just arrived!
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Hi, just presenting myself: I'm Antoine Tremblay, working at https://www.shutterstock.com. I'm a principal engineer there working in the AI Eng team... basically my role is to assist the Data Science team by developing data, training, and inference pipelines, mainly related to Computer Vision scenarios. Some challenges:
• Scale: We don't use local disks except for non-mandatory caching
• Making sense of all the possible tools and ways to go about anything (The most difficult part imo)
• Performance and cost efficiency (on Cloud vs on Prem, CPU vs GPU intensive tasks)
We currently have a small pipeline framework I made, but are looking at using Neuraxle... main reasons are:
• Having framework-agnostic code
• Something we are not the only ones to dev, and can collaborate on
• You guys understand how to make good code, which can be rare in the Python / DS world unfortunately
[Antoine Tremblay, Neuraxio] These days I'm toying with the idea of having pipelines that could be powered by a suite of services, potentially using https://developer.nvidia.com/nvidia-triton-inference-server, since in a CV pipeline many steps are CPU-intensive while others are GPU-intensive, and having machines on AWS that can do both is not really efficient. I'm wondering if it would be a good arch to have transformers be stand-alone in a gRPC service, for example, and build service-driven pipelines like:
Get Images -> Resize Images -> InceptionV2 -> Train Some model
[Antoine Tremblay, Neuraxio] Any ideas / references / comments welcome :)
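A minimal sketch of what such a service-driven step could look like as a Neuraxle step, assuming a hypothetical client object exposing an `infer(batch)` RPC method (Triton's real client API is not shown here):
```
from neuraxle.base import BaseStep, NonFittableMixin

class RemoteModelStep(NonFittableMixin, BaseStep):
    """Delegates the GPU-bound transform to a remote inference service."""

    def __init__(self, client):
        BaseStep.__init__(self)
        NonFittableMixin.__init__(self)
        self.client = client  # hypothetical gRPC client exposing infer(batch)

    def transform(self, data_inputs):
        # CPU-bound steps stay local; this step only ships batches over the wire.
        return self.client.infer(data_inputs)
```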
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Oh, and while I'm at it: if you feel like this is something you'd like to do professionally, we're recruiting
Sameroom
@sameroom-bot

[Guillaume Chevalier, Neuraxio] Hi @Antoine Tremblay! I'm glad that you enjoyed our Clean Machine Learning training back in March, and that it is solving practical business problems for you and your team. Note that at Neuraxio, we can offer some coaching on the side of our other activities if you'd like us to prepare you a plan: Computer Vision is not our main focus, although I've personally worked on 5+ (if not 10) Computer Vision projects, and we have the capacity to offer you a detailed plan/design for your pipeline that your internal resources & employees could implement. You can write to me in private to request a few coaching hours as such.

In case you missed my last conference in mid-September, here is the recording link. Note that it is completely different from the Clean Machine Learning Training you attended in March:
https://www.youtube.com/watch?v=OFe223rUBY8

Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Was there ever thought of having Neuraxle pipelines running as Metaflow steps, or as Kubeflow pipelines?
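Not an official integration, but since a Neuraxle pipeline is an ordinary Python object, one plausible arrangement is simply to run it inside a Metaflow step; a rough sketch (run with `python flow.py run`):
```
from metaflow import FlowSpec, step

class NeuraxlePipelineFlow(FlowSpec):

    @step
    def start(self):
        # A Neuraxle pipeline can be fit/transformed inside a Metaflow step
        # like any other Python call:
        from neuraxle.pipeline import Pipeline
        from neuraxle.steps.numpy import MultiplyByN

        p = Pipeline([MultiplyByN(multiply_by=2)])
        self.outputs = p.transform([1, 2, 3])
        self.next(self.end)

    @step
    def end(self):
        print(self.outputs)

if __name__ == '__main__':
    NeuraxlePipelineFlow()
```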
Sameroom
@sameroom-bot
[Adam Delgado, Neuraxio] Hey Gang!
CAMDEA Digital Forum will have a booth at the Hackernest Tech Social event on Saturday 14th November from 2pm-4pm.
You can check out their website here: https://hackernest.com/
Hackernest is a not-for-profit that runs technical networking and social events, and it has 16k+ members on its meetup group: https://www.meetup.com/hackernest/
Both Hackernest & CAMDEA Digital Forum are focused on adding value to the Canadian tech community in a not-for-profit model so it's a great partnership.
Perhaps you'd like to join us on Saturday?
Stay Healthy & Happy!
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Question for the Neuraxle lib... say I wanted a Step to not pass data to the next step and just end processing of that batch and move on to the next one... what would be the proper way?
Context is that I'm doing some feature extraction and want to skip processing if I find the feature already stored...
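One possible shape for this (a sketch only, with a hypothetical `feature_store.exists(item)` check; not an official Neuraxle recipe) is a filtering step that emits an empty batch when everything is already stored:
```
from neuraxle.base import BaseStep, NonFittableMixin

class SkipAlreadyStored(NonFittableMixin, BaseStep):
    def __init__(self, feature_store):
        BaseStep.__init__(self)
        NonFittableMixin.__init__(self)
        self.feature_store = feature_store  # hypothetical store with exists(item)

    def transform(self, data_inputs):
        # Emit only the items that still need processing; if everything is
        # already stored, downstream steps receive an empty batch.
        return [x for x in data_inputs if not self.feature_store.exists(x)]
```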
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Hmm, weird thing: if I do something like:
```
pipeline = SequentialQueuedPipeline(
    [SomeTransformer()], batch_size=10, max_queue_size=10, n_workers_per_step=1
).with_context(self._context)
data = DataContainer(data_inputs=some_data, current_ids=ids)
pipeline.transform_data_container(data)
```
I get:
```
AttributeError: 'StepWithContext' object has no attribute 'transform_data_container'
```
The same pipeline with `pipeline.transform(some_data)` works fine, or if I use:
```
pipeline._transform_data_container(data, self._context)
```
that works too...
Sameroom
@sameroom-bot

[I P, Neuraxio] Hello! I'm beginning to enjoy Neuraxle... thanks for the great library.

I have a question:
How do I enable multiclass label support in AutoML? I keep getting the following error in the callback... and I don't know where to set this parameter:

ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
Thank you!
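That ValueError comes from sklearn's metric functions, which default to average='binary'. A sketch of one possible fix is to give the scoring callback a metric with the average fixed up front (the ScoringCallback usage below is modeled on Neuraxle's examples; verify the import path for your version):
```
from functools import partial
from sklearn.metrics import f1_score
from neuraxle.metaopt.callbacks import ScoringCallback  # path may vary by version

# f1_score defaults to average='binary', which fails on multiclass targets:
macro_f1 = partial(f1_score, average='macro')
scoring_callback = ScoringCallback(macro_f1, higher_score_is_better=True)
```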

Sameroom
@sameroom-bot
[I P, Neuraxio] let me know if you have any ideas about this...thanks!
Sameroom
@sameroom-bot

[I P, Neuraxio] I have another question... how do I convert the best model returned from auto_ml.get_best_model() to the sklearn object? I need this to access, for example, feature_importances_

thanks once again!

[I P, Neuraxio] I'm sorry if my questions sound newbie
[I P, Neuraxio] AttributeError: 'Pipeline' object has no attribute 'feature_importances_'
I get this error when I do it the naive way
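For what it's worth, a sketch of one way to dig the sklearn estimator out, assuming the model was wrapped with SKLearnWrapper (the attribute name `wrapped_sklearn_predictor` and the step name below are assumptions to verify against your version):
```
best_pipeline = auto_ml.get_best_model()
step = best_pipeline['decision_tree']  # hypothetical step name in your pipeline
sklearn_estimator = step.wrapped_sklearn_predictor  # attribute of SKLearnWrapper
print(sklearn_estimator.feature_importances_)
```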
Sameroom
@sameroom-bot
[Jens E. Pedersen, Neuraxio] Hello everyone. I just found this Slack space via your brilliant [SNN/PyTorch exposition](https://github.com/guillaume-chevalier/Spiking-Neural-Network-SNN-with-PyTorch-where-Backpropagation-engenders-STDP) @. Thanks for that (and the pretty weird [YouTube video](https://www.youtube.com/watch?v=Jo6dkHgT6TI)).
A brief introduction: I'm one of the authors of the PyTorch SNN library [Norse](https://github.com/norse/norse/) and am a PhD student at KTH, Sweden, working on neurorobotics and RL. Looking forward to following this chat :-)
Sameroom
@sameroom-bot
[Yassine Hamdaoui, Neuraxio] Hello everyone, I want to introduce @Elyes Manai. Elyes is a Data Science Consultant & Google Developer Expert in ML who specializes in efficient NLP and wishes to pursue a Ph.D. in AI
Sameroom
@sameroom-bot
[Elyes Manai, Neuraxio] thank you @Yassine Hamdaoui for the introduction and hello everyone, glad to be part of this :)
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] If there's an error in a transformer, like an assert failing, in Neuraxle... in a SequentialQueuedPipeline, is there a way to make the pipeline fail? It seems like it just exits the child processes and stalls there in the main process...?
[Antoine Tremblay, Neuraxio] not sure it exits that process...
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] @Antoine Tremblay Neuraxio/Neuraxle#418
[Antoine Tremblay, Neuraxio] k :( thx
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Question on neuraxle... let's assume you have a pipeline with steps that can fail for some input elements for example:
FetchSomeImagesFromIds -> Resize -> DoSomethingElse
In this case the 1st step downloads 10 out of 100 images... and passes those to resize... and the pipeline works
But I'm looking for suggestions on how to report or handle this missing data...?
My current implementation removes the missing keys from current_keys so that the key -> data mapping is kept
And it actually exits the whole program if there's anything missing... given the previous problem with issue #418
Thoughts?
[Antoine Tremblay, Neuraxio] (I'm not looking to fill the values, more to report the problem)
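A plain-Python sketch of the "report, don't fill" idea, with a hypothetical `try_fetch` helper; the fetch step fails loudly instead of silently shrinking the batch:
```
def fetch_images(image_ids):
    fetched = {i: try_fetch(i) for i in image_ids}  # try_fetch is hypothetical
    missing = [i for i, img in fetched.items() if img is None]
    if missing:
        # Report the problem instead of passing a silently smaller batch downstream:
        raise RuntimeError(f"{len(missing)} image(s) could not be fetched: {missing}")
    return [fetched[i] for i in image_ids]
```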
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] ^ posted on SO... I had forgotten that
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Just curious: I see some doc pointing to https://github.com/Neuraxio/Neuraxle-PyTorch ... but that's a 404 now... was that deleted, or?
[Antoine Tremblay, Neuraxio] (that doc was outdated, still curious...)
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Question for the Neuraxle lib... should the names of the steps in a SequentialQueuedPipeline be set on the actual steps? As it is, the QueueWorker steps take the name but not the wrapped steps...
[Antoine Tremblay, Neuraxio] this gives things like:
```
[('QueueWorkerskip', QueueWorker( wrapped=SkipExistingS3( name=SkipExistingS3, hyperparameters=HyperparameterSamples() ), For pipeline ('skip', SkipExistingS3()}
```
[Antoine Tremblay, Neuraxio] I got confused trying to refer to my step's name inside of it... it was SkipExistingS3 instead of the expected 'skip' ... even that would have been wrong however since it's QueueWorkerskip in the end...
[Antoine Tremblay, Neuraxio] As a workaround I'll set it directly in the step's __init__
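That workaround, sketched out (assuming BaseStep accepts a `name` argument, which it does in the versions I've seen):
```
from neuraxle.base import BaseStep, NonFittableMixin

class SkipExistingS3(NonFittableMixin, BaseStep):
    def __init__(self):
        # Pin the step's name here, since the ('skip', step) tuple name
        # ends up on the QueueWorker wrapper rather than on this step:
        BaseStep.__init__(self, name='skip')
        NonFittableMixin.__init__(self)

    def transform(self, data_inputs):
        return data_inputs  # placeholder body
```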
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] This also has the side effect that wrapped Steps names can be identical...
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Maybe I'm not looking at this right, but:
:param include_incomplete_batch: (Optional.) A bool representing whether the last batch should be dropped in the case it has fewer than `batch_size` elements; the default behavior is not to drop the smaller batch.
In a queued sequential pipeline, this actually defaults to False, so maybe the doc is wrong?
Also, I think if get_n_batches returns 0, because math.floor(len / batch_size) is 0 and include_incomplete_batch is False, then I get exceptions like:
```
File "/usr/local/lib/python3.7/site-packages/neuraxle/base.py", line 2477, in handle_transform
    data_container = super().handle_transform(data_container, context)
File "/usr/local/lib/python3.7/site-packages/neuraxle/base.py", line 775, in handle_transform
    data_container = self._transform_data_container(data_container, context)
File "/usr/local/lib/python3.7/site-packages/neuraxle/base.py", line 2492, in _transform_data_container
    data_container = self.wrapped.handle_transform(data_container, context)
File "/usr/local/lib/python3.7/site-packages/neuraxle/base.py", line 1210, in handle_transform
    data_container = self.transform_data_container(data_container, context)
File "/usr/local/lib/python3.7/site-packages/neuraxle/distributed/streaming.py", line 526, in transform_data_container
    data_container = self[-1].join(original_data_container=data_container)
File "/usr/local/lib/python3.7/site-packages/neuraxle/distributed/streaming.py", line 733, in join
    data_containers = self._join_all_step_results()
File "/usr/local/lib/python3.7/site-packages/neuraxle/distributed/streaming.py", line 745, in _join_all_step_results
    self._raise_exception_throwned_by_workers_if_needed(data_containers)
File "/usr/local/lib/python3.7/site-packages/neuraxle/distributed/streaming.py", line 756, in _raise_exception_throwned_by_workers_if_needed
    raise exception
AttributeError: 'AttributeError' object has no attribute 'summary_id'
Setting up task environment.
```
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] Doh, I misread that about include_incomplete_batch, never mind... I still got that error however, but it's unclear what it is yet
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] It's kinda confusing having an option named include_incomplete_batch... you may think that you can set it to True to include it... but it actually does the opposite
[Antoine Tremblay, Neuraxio] In fact looking at the code:
```
def get_n_batches(self, batch_size: int, include_incomplete_batch: bool = False) -> int:
    if include_incomplete_batch:
        return math.ceil(len(self.data_inputs) / batch_size)
    else:
        return math.floor(len(self.data_inputs) / batch_size)
```
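Concretely, the two branches of that code behave like this:
```
import math

# With 25 data inputs and batch_size=10:
math.ceil(25 / 10)   # == 3: include_incomplete_batch=True keeps the last batch of 5
math.floor(25 / 10)  # == 2: include_incomplete_batch=False drops those last 5 items
```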
Sameroom
@sameroom-bot
[Vincent Antaki, Neuraxio] Good catch Antoine! I've included the fix in a PR that we plan to merge this week.
Sameroom
@sameroom-bot
[Antoine Tremblay, Neuraxio] thanks! :)
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] Hi all! If some of you would like to contribute to Neuraxle by improving the TPE AutoML algorithm and publish a paper related to that, you can join the #automl channel where we will post updates and share resources on this project. There are already five of us collaborating on it. Cheers!
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] (image attachment: CAMDEA.png)
Sameroom
@sameroom-bot

[Vishal Kajjam, Neuraxio] :wave: I am new to Neuraxle. I am trying to use the MiniBatchSequentialPipeline. Here is my example script:
```
from neuraxle.base import NonTransformableMixin, Identity, BaseStep, NonFittableMixin, MetaStepMixin
from neuraxle.pipeline import MiniBatchSequentialPipeline, Joiner

from neuraxle.steps.numpy import MultiplyByN

import random
import string


class VectorizeStep(NonFittableMixin, BaseStep):
    def __init__(self):
        BaseStep.__init__(self)
        NonFittableMixin.__init__(self)

    def transform(self, ids):
        print(f'----ids: {ids}----')
        letters = string.ascii_lowercase
        vecs = []
        for i in ids:
            vecs.append(''.join(random.choice(letters) for i in range(10)))
        return ids, vecs


def main():
    p = MiniBatchSequentialPipeline([
        # MultiplyByN()
        VectorizeStep(),
        # Joiner(batch_size=2)
    ], batch_size=2)
    out = p.transform(list(range(10)))
    print(out)
    import pdb; pdb.set_trace()


if __name__ == '__main__':
    main()
```
This is the output of the pipeline:
```
[[0, 1], ['qjcoatbxqd', 'otbundmkqx'], [2, 3], ['vxnswvrlfr', 'maucumfqnb'], [4, 5], ['ixesikzfiq', 'uklnfeskdh'], [6, 7], ['jxqunaymok', 'waizgsnlvj'], [8, 9], ['comyezliyx', 'jrzioeibgt']]
```
How can I get the output of the pipeline to be:
```
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], ['qjcoatbxqd', 'otbundmkqx', 'vxnswvrlfr', 'maucumfqnb', 'ixesikzfiq', 'uklnfeskdh', 'jxqunaymok', 'waizgsnlvj', 'comyezliyx', 'jrzioeibgt']]
```
I tried using the `Joiner` step but that did not work. Would `_transform_data_container` be a good approach?

Thanks in advance.
cc: @

Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] Check out this job at Neuraxio: https://www.linkedin.com/jobs/view/2596236251
Sameroom
@sameroom-bot

[Rohith, Neuraxio] Below is the final logic; I have also updated the PR accordingly with the new logic. I have tried to optimise as much as possible.
Neuraxio/Neuraxle#502
```
from collections import OrderedDict, Counter
from typing import List

_separator = "__"
_flat_hps = OrderedDict([
    ("Pipeline__ABCD__test__copy", True),     # copy, test__copy, ABCD__test__copy, Pipeline__ABCD__test__copy
    ("Pipeline__ABCD__test1__copy", False),   # copy, test1__copy, ABCD__test1__copy, Pipeline__ABCD__test1__copy
    ("Pipeline__parent1__test__copy", True),  # copy, test__copy, parent1__test__copy
    ("Pipeline__parent2__test__copy", False),
    ("Pipeline__parent1__hps", 23),
    ("Pipeline__parent1__abcd__test", "parallel"),
    ("Pipeline__abcd__test", None),
    ("Sklearn__done", "abcd")
])

wildcard_hps_variant_default = [
    "*ABCD__test__copy",
    "*test1__copy",
    "*parent1*copy",
    "*parent2*copy",
    "*hps",
    "*parent1*test",
    "Pipeline__abcd__test",
    "*done"
]


def _generate_reverse_suffix_strings(array: List[str]):
    """
    Generates the reverse suffix set of strings for a given input array of strings.
    :param array:
    :return:

    >>> ip_array = ["a", "b", "c"]
    >>> _generate_reverse_suffix_strings(ip_array)
    {'a__b__c', 'b__c', 'c'}
    """
    suffix_strings = set()
    for idx in range(1, len(array) + 1):
        suffix_strings.add(f"{_separator}".join(array[-idx:]))
    return suffix_strings


def _generate_string_pairs(array: List[str]):
    """
    Generates a set of string pair tuples.
    :param array:
    :return:

    >>> temp_array = ["a", "b", "c"]
    >>> _generate_string_pairs(temp_array)
    {('a', 'c'), ('b', 'c')}
    """
    string_pairs = set()
    y = array[-1]
    for x in array[:-1]:
        string_pairs.add((x, y))
    return string_pairs


def _can_be_pruned_further(unique_str: List[str], string_pairs_freq: Counter) -> bool:
    return string_pairs_freq[(unique_str[0], unique_str[-1])] <= 1


reverse_suffix_strings_freq: Counter = Counter()
all_string_pairs_freq: Counter = Counter()
for hp in _flat_hps:
    temp = hp.split(_separator)
    reverse_suffix_strings_freq.update(_generate_reverse_suffix_strings(temp))  # phase1 preprocessing step
    all_string_pairs_freq.update(_generate_string_pairs(temp))  # phase2 preprocessing step

selected_string_key_for_each_hp: dict = {}
wild_card_compressed_strings: OrderedDict = OrderedDict()
for hp in _flat_hps:
    fitted_list = hp.split(_separator)
    # phase1: selection of unique strings for each absolute hyperparameter path
    for i in range(1, len(fitted_list) + 1):
        temp = f"{_separator}".join(fitted_list[-i:])
        if reverse_suffix_strings_freq[temp] <= 1:
            selected_string_key_for_each_hp[hp] = temp
            break
    compressed_key = hp
    # phase2: pruning (converting to wildcard format) the output from phase1
    if compressed_key == selected_string_key_for_each_hp[hp]:  # if `selected_string` is the same as the absolute hp
        wild_card_compressed_strings[compressed_key] = _flat_hps[hp]
        continue
    selected_hp: List[str] = selected_string_key_for_each_hp[hp].split(_separator)
    current_selected_hp_len: int = len(selected_hp)
    compressed_key = f"*{selected_string_key_for_each_hp[hp]}"
    if _can_be_pruned_further(selected_hp, all_string_pairs_freq) and current_selected_hp_len >= 3:
        compressed_key = f"*{selected_hp[0]}*{selected_hp[-1]}"
    wild_card_compressed_strings[compressed_key] = _flat_hps[hp]

assert list(wild_card_compressed_strings.keys()) == wildcard_hps_variant_default
print(wild_card_compressed_strings.keys())
```

Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] https://news.ycombinator.com/item?id=27815520
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] You may now try to run the code in your browser for the AutoML for Time Series Processing example notebook from the video conference pasted just above:
https://mybinder.org/v2/gh/Neuraxio/Neuraxle/d0ab8c088f880d9ff655a0bfd9c8a3e3c44972[…]=examples/Introduction%20to%20Time%20Series%20Processing.ipynb
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] New article: How to unit test machine learning code?
https://www.neuraxio.com/blogs/news/how-to-unit-test-machine-learning-code
Sameroom
@sameroom-bot
[Guillaume Chevalier, Neuraxio] Presenting at the CAMDEA in a few minutes tonight on the topic of: "How to be a better AI programmer?"
https://www.crowdcast.io/e/camdea-technology--mlops