Beck Gaël
@beckgael
Hello first visitor. I hope you will enjoy your trip in our lands :)
vikas gautam
@vikasgautam18
Do you have some examples for k-prototype clustering?
Beck Gaël
@beckgael
Hello @vikasgautam18, unfortunately no, but the principle is the same as with KMeans and KModes, which have notebooks available.
vikas gautam
@vikasgautam18
Thanks @beckgael
vikas gautam
@vikasgautam18
Hello Beck.. thanks for the response. I have another related question. How do you evaluate the clustering model?
In the Apache Spark API, they have something like the following:
// Evaluate clustering by computing Within Set Sum of Squared Errors
val WSSSE = clusters.computeCost(parsedData)
println(s"Within Set Sum of Squared Errors = $WSSSE")
Do you have something similar to computeCost above? If not, any pointers as to how we could do this would be really helpful.
Beck Gaël
@beckgael
Hi @vikasgautam18
We chose another stopping criterion: a distance epsilon. Convergence is considered achieved once every prototype moves less than this threshold. Others prefer WSSSE, which also requires a % threshold. We could let the user choose whether the algorithm stops on epsilon or on WSSSE. The latter is easy to implement once you have the prototypes and the cluster assignments...
Look at the link : https://discuss.analyticsvidhya.com/t/what-is-within-cluster-sum-of-squares-by-cluster-in-k-means/2706/2
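For readers who want to compute this themselves, here is a minimal sketch in plain Scala of both quantities discussed above: the WSSSE given prototypes and cluster assignments, and the epsilon check on prototype movement. It does not use the Clustering4Ever or Spark APIs; the object name `ClusteringEvaluation`, the `Point` alias, and the function names are hypothetical and chosen only for illustration.

```scala
// Hypothetical helper, not part of Clustering4Ever or Spark.
object ClusteringEvaluation {
  type Point = Array[Double]

  /** Squared Euclidean distance between two points of equal dimension. */
  def squaredDistance(a: Point, b: Point): Double = {
    require(a.length == b.length, "points must have the same dimension")
    a.indices.map { i => val d = a(i) - b(i); d * d }.sum
  }

  /** WSSSE: sum over all points of the squared distance to the assigned prototype. */
  def wssse(data: Seq[Point], assignments: Seq[Int], prototypes: Map[Int, Point]): Double = {
    require(data.length == assignments.length, "one assignment per point")
    data.zip(assignments).map { case (p, clusterId) =>
      squaredDistance(p, prototypes(clusterId))
    }.sum
  }

  /** Epsilon criterion: true when every prototype moved less than epsilon. */
  def hasConverged(previous: Map[Int, Point], updated: Map[Int, Point], epsilon: Double): Boolean =
    previous.forall { case (id, oldProto) =>
      math.sqrt(squaredDistance(oldProto, updated(id))) < epsilon
    }
}
```

On a distributed dataset the same sum would presumably be computed with a map over the points followed by a reduce, but that part is left out here.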
vikas gautam
@vikasgautam18
thanks Beck!!
gitgitwhat
@gitgitwhat
Your BinNNMS paper says that the algorithm can be found in your GitHub page (Clustering4Ever). However, I don't see any mention of BinNNMS. Could you point me to the right place?
Beck Gaël
@beckgael
Hi @gitgitwhat
It is a combination of Binary Gradient Ascent and Binary-Epsilon-proximity
Hope it helps
gitgitwhat
@gitgitwhat
Thanks @beckgael. Now I just need to figure out how to convert to Python and I'll be good.
Nicola
@120534
Hello, Beck. Is there an Apache Spark based Jenks natural breaks algorithm? I have no idea how to rewrite the code with the Apache Spark API.
Beck Gaël
@beckgael
Hi @120534, sorry, but I don't know of any distributed implementation.
Nicola
@120534
@beckgael Thx for your reply, I'll go deep into the algorithm and Spark implementation.
lukasstreit
@lukasstreit
Hi Beck,
whoops sent that a bit early. I saw your answer in
lukasstreit
@lukasstreit
a Stack Exchange thread about image segmentation using mean shift clustering. I'm trying to segment art images (e.g. oil paintings) and would like to try your approach out. Do you have some pointers on which algorithm of this repository to pick for that task? And do you think this could reasonably work for art as well? Sorry about the three messages, I'm posting this from the Gitter website on mobile and don't see how to delete or edit my messages. Best regards, Lukas
Beck Gaël
@beckgael
Hi @lukasstreit ,
Combine the scalable versions (the Spark ones) of gradient ascent and epsilon-proximity with Euclidean distance, and prefer LUV space to RGB. I applied this algorithm to each image independently, not to compare many pictures with each other, so I'm curious about your application. Let me know if you encounter issues ;)
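To make the two steps concrete, here is a rough, non-distributed sketch in plain Scala: a mean shift gradient ascent with a Gaussian kernel and Euclidean distance, followed by a greedy grouping of the resulting modes by epsilon. This is not the Clustering4Ever implementation, and the library's epsilon-proximity may group points differently. The names `MeanShiftSketch`, `shift`, `gradientAscent`, `epsilonProximity`, and the parameters `bandwidth`, `tol`, and `maxIter` are all assumptions made for illustration. Each point stands for one pixel's color, assumed to be already converted to LUV elsewhere.

```scala
// Hypothetical sketch, not the Clustering4Ever API.
object MeanShiftSketch {
  type Point = Array[Double]

  /** Euclidean distance between two points of equal dimension. */
  def euclidean(a: Point, b: Point): Double =
    math.sqrt(a.indices.map { i => val d = a(i) - b(i); d * d }.sum)

  /** One gradient-ascent step: move p to the Gaussian-weighted mean of the data. */
  def shift(p: Point, data: Seq[Point], bandwidth: Double): Point = {
    val weights = data.map { q =>
      val d = euclidean(p, q)
      math.exp(-(d * d) / (2 * bandwidth * bandwidth))
    }
    val total = weights.sum
    Array.tabulate(p.length) { j =>
      data.zip(weights).map { case (q, w) => q(j) * w }.sum / total
    }
  }

  /** Shift every point until it moves less than tol or maxIter is reached. */
  def gradientAscent(data: Seq[Point], bandwidth: Double,
                     tol: Double = 1e-4, maxIter: Int = 100): Seq[Point] =
    data.map { start =>
      var current = start
      var moved = Double.MaxValue
      var iter = 0
      while (moved >= tol && iter < maxIter) {
        val next = shift(current, data, bandwidth)
        moved = euclidean(current, next)
        current = next
        iter += 1
      }
      current
    }

  /** Greedy grouping: a mode within epsilon of an earlier representative takes its label. */
  def epsilonProximity(modes: Seq[Point], epsilon: Double): Seq[Int] = {
    val representatives = scala.collection.mutable.ArrayBuffer.empty[Point]
    modes.map { m =>
      val idx = representatives.indexWhere(r => euclidean(r, m) <= epsilon)
      if (idx >= 0) idx
      else { representatives += m; representatives.length - 1 }
    }
  }
}
```

The labels returned by `epsilonProximity` line up with the input points, so each pixel's label gives its segment. The cost is quadratic in the number of pixels, which is why a scalable version matters for real images.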
lukasstreit
@lukasstreit
Thanks for the advice! Just to clarify - I'm not trying to group images, just trying to get segments on each distinct image. I'm currently working for my exams so I don't have much time to work on this but I'll let you know how it goes :)
Mariana
@Maki_AG_twitter
Hi, I'm trying to use the kmodes model on version 9.6 ... but I'm struggling to get the structure of the data like this ... required: GS[Cz[O,org.clustering4ever.vectors.BinaryVector]]
I try to follow the examples of the notebook, but I think that notebook was for version 8.4
Beck Gaël
@beckgael

Hi @Maki_AG_twitter , try something like that:

val parData = rawData.zipWithIndex.par.map{ case (v, id) => EasyClusterizable(id.toLong, BinaryVector(v)) }

where v is an Array[Int]

Mariana
@Maki_AG_twitter
I think this method only works for version 8.4, but I'm trying to use 9.6 ... @beckgael do you have an example for this version?
Beck Gaël
@beckgael
@Maki_AG_twitter I've checked the EasyClusterizable companion object, and its apply method works in 0.9.6 as shown previously.
http://www.clustering4ever.org/API%20Documentation/0.9.6/#org.clustering4ever.clusterizables.EasyClusterizable$
What is your error ?
Mariana
@Maki_AG_twitter
@beckgael I was constructing the EasyClusterizable object incorrectly. I corrected it and have no problem now. Thanks!
Josiah Blaisdell
@josiahblaisdell_twitter
Hi, I have been working on an implementation of something called a "Probabilistic Self-Organizing Map" (based on work by Anouar et al. 1997 and Lebbah 2015). I found out about Clustering4Ever while googling for M. Lebbah, the author of the more recent prSOM paper. My implementation is in C++ and Qt, I visualize the training process using OpenGL. My implementation seems to be working, I get similar results to what I have seen in some other work. But when I use a trained prSOM to predict the class of a new observation, I think there may be a bug in my code that is making the prediction incorrect, or possibly overestimating the likelihood in some cases because my prediction doesn't include covariance, it assumes all the neurons are spherical gaussians. Does Clustering4Ever have any interest or plan to release an implementation of the prSOM?