Jacob
@jacobmontiel
I see, you mean that the training data comes from one stream and prediction data from a different stream. Unfortunately skmultiflow does not provide native support for this scenario. However you can create your own evaluator using the individual components. For example in the skmultiflow.metrics module you will find some methods to track performance.
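A minimal sketch of such a custom evaluator, assuming the training and prediction streams deliver batches in lockstep. `MajorityClassModel` and `evaluate_two_streams` are hypothetical names used for illustration and are not part of skmultiflow. Any model exposing partial_fit and predict could replace the stand-in, and the plain accuracy counter could be swapped for one of the trackers in skmultiflow.metrics.

```python
from collections import Counter


class MajorityClassModel:
    """Hypothetical stand-in for an incremental classifier."""

    def __init__(self):
        self.counts = Counter()

    def partial_fit(self, X, y):
        # Accumulate how often each label has been seen so far
        self.counts.update(y)
        return self

    def predict(self, X):
        # Before any training there is nothing to predict with
        if not self.counts:
            return [None] * len(X)
        label = self.counts.most_common(1)[0][0]
        return [label for _ in X]


def evaluate_two_streams(model, train_batches, test_batches):
    """Score on one stream while training on another.

    Returns the running accuracy after each step.
    """
    correct = 0
    seen = 0
    history = []
    for (X_train, y_train), (X_test, y_test) in zip(train_batches, test_batches):
        # Predict on the prediction stream first, then learn from the training stream
        y_pred = model.predict(X_test)
        correct += sum(p == t for p, t in zip(y_pred, y_test))
        seen += len(y_test)
        model.partial_fit(X_train, y_train)
        history.append(correct / seen if seen else 0.0)
    return history


# Toy batches of (X, y); real code would pull these from two stream objects
train = [([[0.1], [0.2]], [1, 1]), ([[0.3], [0.4]], [1, 0])]
test = [([[0.5]], [1]), ([[0.6]], [1])]
history = evaluate_two_streams(MajorityClassModel(), train, test)
```

The first step scores against an untrained model, so the running accuracy starts low and improves once the model has seen training data.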
Liyan
@sunnysong14
good information~ thanks Jacob.
hmm, I can see some time difference in our group, hah
Jacob
@jacobmontiel
Yes, I believe there are members from multiple places and timezones :)
Liyan
@sunnysong14
magic modern life, hah
Jacob
@jacobmontiel
We are introducing a new/improved base class in release 0.3.0. Please refer to the corresponding post in the project's webpage for detailed information about this change and its impact on your workflow.
Liyan
@sunnysong14
would the docs for release 0.2.0 still be available?
bec. my work is based on 0.2.0 and I do not want to move to the new version yet :-(
Jacob
@jacobmontiel
This is a valid point. Our current host (gh-pages) does not provide 'versioning', I will look into what options we have. If somebody has a suggestion please let me know. Worst case scenario we can make the documentation for 0.2.0 available in a zip file.
Liyan
@sunnysong14
it would be much better if we could get access to the documentation of all versions (I hope your nice group can keep supporting this :-P)
I am not good at these things, so sadly I don't have any suggestions for a solution. :-(
fwille
@fwille
Does the data_points_for_classification option of EvaluatePrequential() only work for certain classifiers or do I need to specify any other option too? When I try to use it, the plot shows but is empty and the model isn't trained any further
Could you maybe provide a small working example for this option? I would highly appreciate that! :)
Jacob
@jacobmontiel
A new release of scikit-multiflow is available! Please refer to the corresponding changelog entry for detailed information.
Check the new [package map](https://scikit-multiflow.github.io/scikit-multiflow/package_map.html), designed to help navigate the available methods in scikit-multiflow.
Documentation for previous releases is kept as zip files, starting from version 0.2.0. Links to the documentation files are available on the [previous versions](https://scikit-multiflow.github.io/scikit-multiflow/versions.html) page in the documentation.
@fwille can you share a MWE to trigger the empty plot?
fwille
@fwille
creating the MWE actually showed me what the problem is: it occurs whenever output_file is also specified
MWE:
from skmultiflow.data import FileStream
from skmultiflow.evaluation import EvaluatePrequential
from skmultiflow.trees import HoeffdingTree


# setup data stream
stream = FileStream("skmultiflow/data/datasets/elec.csv")

stream.prepare_for_use()


# instantiate the HoeffdingTree classifier
ht = HoeffdingTree()

# evaluate prediction results
evaluator = EvaluatePrequential(show_plot=True, pretrain_size=1000, data_points_for_classification=True, max_samples=30000, output_file="test.csv")

evaluator.evaluate(stream=stream, model=[ht], model_names=['HT'])
Jacob
@jacobmontiel
I see, can you create an issue to track this?
fwille
@fwille
sure! :)
Liyan
@sunnysong14
hi Jacob (I guess :-P), here are my questions regarding the Hoeffding tree.
Does version 0.2.0 support active learning? I.e., I have a pre-trained Hoeffding tree along with data streams for training and testing.
Can we initialize a Hoeffding tree with that pre-trained tree and those data streams?
thanks a lot ;-)
Jacob
@jacobmontiel
there is no native support for initializing a new Hoeffding tree with a pre-trained model, however you can use the copy.deepcopy function from the standard library to make a copy of the object with the trained model
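A minimal sketch of the copy-then-continue idea, using only the standard library. `CountingModel` is a hypothetical stand-in for a trained classifier and is not a skmultiflow class. The assumption is that the real model object can be deep-copied in the same way.

```python
import copy


class CountingModel:
    """Hypothetical incremental model that stores per-class counts."""

    def __init__(self):
        self.class_counts = {}

    def partial_fit(self, X, y):
        # Update the internal state one label at a time
        for label in y:
            self.class_counts[label] = self.class_counts.get(label, 0) + 1
        return self


# "Pre-train" one model, then branch off an independent copy
pretrained = CountingModel().partial_fit([[0], [1]], [0, 1])
continued = copy.deepcopy(pretrained)

# Training the copy leaves the pre-trained original untouched
continued.partial_fit([[2]], [1])
```

Because deepcopy duplicates the nested state, the copy starts from the trained model rather than from an empty one.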
Liyan
@sunnysong14
thanks. but how can I keep using the Hoeffding tree? I did not find any input argument for passing such info
Liyan
@sunnysong14
hmm, if there is no way to pass the pre-trained Hoeffding tree to the new Hoeffding tree, there is little point in deepcopy(), right?
Jacob
@jacobmontiel
in this case you could continue training one of the objects without "losing" the pre-trained model
in case you want to try different paths you can make multiple instances (copies)
Liyan
@sunnysong14
cheers, i think i know how to make it
by using partial_fit()
Liyan
@sunnysong14
hello Jacob,
is there any method to save and load an OzaBagging-based model pls?
sorry that i am quite new to python ;-(
Jacob
@jacobmontiel
I haven't tried, but this should be possible using Python's built-in tools
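One built-in option is the pickle module. The sketch below uses a plain dictionary as a stand-in for a trained ensemble; whether a particular skmultiflow model pickles cleanly is an assumption that would need to be checked, since it depends on all of the model's attributes being picklable. The file name and the dictionary contents are made up for illustration.

```python
import os
import pickle
import tempfile

# Stand-in for a trained ensemble; a real model object would be passed to
# pickle.dump in the same way, provided all of its attributes are picklable.
model = {
    "name": "toy-ensemble",
    "member_class_counts": [{0: 3, 1: 5}, {0: 4, 1: 4}],
}

# Write to a temporary directory so the example leaves the working dir clean
path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Save the object to disk
with open(path, "wb") as f:
    pickle.dump(model, f)

# Load it back as a new, independent object
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.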
Liyan
@sunnysong14
I found skmultiflow.core.clone in version 0.3.0.
oh, thx, let me see this link,
Jacob
@jacobmontiel
the clone method in version 0.3.0 generates a new instance (clone) of an object with the same configuration as the original but with an empty model. I think this is different from what you want
fwille
@fwille
Hello Jacob,
I realized that evaluation_visualizer.py is missing the elif blocks that create the subplot captions for the new metrics precision, recall and F1, which results in 'Unknown metric' as the caption in the evaluation plot when using the new metrics. Basically, only these 9 lines of code are missing in __configure():
                elif metric_id == constants.PRECISION:
                    plot_tracker.sub_plot_obj.set_title('Precision')
                    plot_tracker.sub_plot_obj.set_ylabel('precision')
                elif metric_id == constants.RECALL:
                    plot_tracker.sub_plot_obj.set_title('Recall')
                    plot_tracker.sub_plot_obj.set_ylabel('recall')
                elif metric_id == constants.F1_SCORE:
                    plot_tracker.sub_plot_obj.set_title('F1 Score')
                    plot_tracker.sub_plot_obj.set_ylabel('f1')
Jacob
@jacobmontiel
Thanks @fwille , could you open an issue on github so we can fix it?
fwille
@fwille

I have a question regarding the output of get_model_description() called on an instance of a Hoeffding Tree: for instance, one of the leaves of my HT looks like this:

Leaf = Class 1 | {0: 50.70403538482331, 1: 843.5275737873848}

Could you explain the meaning of this?

Saulo Martiello Mastelini
@smastelini

Hi @fwille. The specific portion of the HT code that generates the line you sent is:
https://github.com/smastelini/scikit-multiflow/blob/52a94913ba1692b1b6bc545d0ebb824d188b0d1d/src/skmultiflow/trees/hoeffding_tree.py#L256-L258

From your example, I gather that your task is a binary classification problem. According to the code I linked, the HT calculates which class is the majority class in that leaf.

The HT basically shows 'Majority class | dictionary showing the weights for each class'
Is it clear now? Don't hesitate to ask me! :D
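The reading above can be reproduced by hand from the dictionary in the question. This is only an illustration of the "majority class | weights" interpretation, not the library's own code. The proportions are an extra step added here to show how dominant the majority class is.

```python
# Class weights copied from the leaf description in the question
leaf_weights = {0: 50.70403538482331, 1: 843.5275737873848}

# The majority class is the key with the largest accumulated weight
majority_class = max(leaf_weights, key=leaf_weights.get)

# Relative share of each class in this leaf (illustrative extra step)
total = sum(leaf_weights.values())
proportions = {label: weight / total for label, weight in leaf_weights.items()}
```

Here the majority class is 1, which holds roughly 94% of the weight observed at that leaf.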
Jacob
@jacobmontiel
@all A new release v0.4.0 is available!
Changelog
  • Feature | Robust Soft Learning Vector Quantization classifier prototype.RobustSoftLearningVectorQuantization
  • Feature | Stacked Single-Target Hoeffding Tree regressor trees.StackedSingleTargetHoeffdingTreeRegressor
  • Feature | Half-Space Trees one-class classifier for anomaly detection anomaly_detection.HalfSpaceTrees
  • Fix | Fix bug in data.HyperplaneGenerator which resulted in corrupted data when using batch_size > 1.
  • Enhancement | Documentation improvements.
Jacob
@jacobmontiel
@/all A new (patch) release v0.4.1 is available
This patch includes a fix for a bug in the calculation of Precision and Recall that impacted F1 and Geometric-mean scores. The bug was only triggered under specific circumstances depending on the arrival order of class-values.
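For context, a sketch of how these scores relate in the binary case, which shows why an error in Precision or Recall carries over to F1 and the geometric mean. It uses the textbook definitions, with the geometric mean assumed to be the square root of recall times specificity. `binary_scores` is a hypothetical helper and not skmultiflow code.

```python
import math


def binary_scores(tp, fp, fn, tn):
    """Compute binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1 is the harmonic mean of precision and recall
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # Geometric mean of the per-class recalls (assumed binary definition)
    g_mean = math.sqrt(recall * specificity)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "g_mean": g_mean,
    }


scores = binary_scores(tp=8, fp=2, fn=2, tn=8)
```

Since F1 and the geometric mean are both computed from precision and recall style ratios, any miscount in those ratios propagates to the derived scores.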