Beck Gaël
@beckgael
Hello first visitor. I hope you will enjoy your trip in our lands :)
vikas gautam
@vikasgautam18
Do you have some examples for k-prototype clustering?
Beck Gaël
@beckgael
Hello @vikasgautam18, unfortunately not, but the principle is the same as with KMeans and KModes, which have notebooks available.
vikas gautam
@vikasgautam18
Thanks @beckgael
vikas gautam
@vikasgautam18
Hello Beck, thanks for the response. I have another related question. How do you evaluate the clustering model?
In the Apache Spark API, they have something like the following:
// Evaluate clustering by computing Within Set Sum of Squared Errors
val WSSSE = clusters.computeCost(parsedData)
println(s"Within Set Sum of Squared Errors = $WSSSE")
Do you have something similar to computeCost above? If not, any pointers as to how we could do this would be really helpful.
Beck Gaël
@beckgael
Hi @vikasgautam18
We chose to use another stopping criterion: a distance epsilon. Convergence is considered achieved once every prototype moves less than this threshold between two iterations. Others prefer WSSSE, which also requires a % threshold. We could imagine letting the user choose whether the algorithm stops on epsilon or on WSSSE. The latter is easy to implement once you have the prototypes and the cluster assignments.
Have a look at this link: https://discuss.analyticsvidhya.com/t/what-is-within-cluster-sum-of-squares-by-cluster-in-k-means/2706/2
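For readers following along, the two criteria discussed above can be sketched in plain Scala. This is only an illustration under stated assumptions: it does not use the Clustering4Ever API, and the names `ClusteringCriteria`, `wssse` and `hasConverged` are hypothetical. Points and prototypes are assumed to be dense `Array[Double]` vectors of equal length, compared with Euclidean distance.

```scala
// Hypothetical helper, not part of Clustering4Ever or Spark.
object ClusteringCriteria {
  type Point = Array[Double]

  // Squared Euclidean distance between two vectors of equal length.
  def squaredDistance(a: Point, b: Point): Double =
    a.indices.map { i => val d = a(i) - b(i); d * d }.sum

  // WSSSE: squared distance from each point to its nearest prototype, summed over all points.
  def wssse(points: Seq[Point], prototypes: Seq[Point]): Double =
    points.map(p => prototypes.map(c => squaredDistance(p, c)).min).sum

  // Epsilon criterion: true when every prototype moved less than epsilon
  // between two successive iterations. Both sequences must be in the same order.
  def hasConverged(previous: Seq[Point], current: Seq[Point], epsilon: Double): Boolean =
    previous.zip(current).forall { case (p, c) => math.sqrt(squaredDistance(p, c)) < epsilon }
}
```

With points (0,0), (0,2), (10,0) and prototypes (0,1), (10,0), `wssse` sums 1 + 1 + 0. On a Spark RDD the same per-point computation could be mapped and summed, assuming the prototypes are small enough to ship to the executors.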
vikas gautam
@vikasgautam18
thanks Beck!!
gitgitwhat
@gitgitwhat
Your BinNNMS paper says that the algorithm can be found in your GitHub page (Clustering4Ever). However, I don't see any mention of BinNNMS. Could you point me to the right place?
Beck Gaël
@beckgael
Hi @gitgitwhat
It is a combination of Binary Gradient Ascent and Binary-Epsilon-proximity.
Hope it helps.
gitgitwhat
@gitgitwhat
Thanks @beckgael. Now I just need to figure out how to convert to Python and I'll be good.
Nicola
@120534
Hello, Beck. Is there an Apache Spark based Jenks natural breaks algorithm? I have no idea how to rewrite the code with the Apache Spark API.
Beck Gaël
@beckgael
Hi @120534, sorry, but I don't know of any distributed implementation.
Nicola
@120534
@beckgael Thx for your reply, I'll go deep into the algorithm and Spark implementation.
lukasstreit
@lukasstreit
Hi Beck,
whoops, sent that a bit early. I saw your answer in
lukasstreit
@lukasstreit
a Stack Exchange post about image segmentation using mean shift clustering. I'm trying to segment art images (e.g. oil paintings) and would like to try your approach out. Do you have any pointers on which algorithm of this repository to pick for that task? And do you think this could reasonably work for art as well? Sorry about the three messages, I'm posting this from the Gitter website on mobile and don't see how to delete or edit my messages. Best regards, Lukas
Beck Gaël
@beckgael
Hi @lukasstreit,
Combine the scalable versions (the Spark ones) of gradient ascent and epsilon-proximity with Euclidean distance, and prefer the LUV color space to RGB. I applied this algorithm to each image independently, not to compare many pictures with each other, so I'm curious about its application. Let me know if you encounter issues ;)
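To make the advice above concrete, here is a small, non-distributed sketch of the general idea: a flat-kernel mean shift gradient ascent followed by a greedy grouping of the resulting modes. It is a generic textbook-style illustration, not the Clustering4Ever implementation. The names `MeanShiftSketch`, `shift`, `ascend` and `label`, the flat kernel, and this particular reading of "epsilon-proximity" are all assumptions. Pixels are assumed to be already converted to the chosen color space (e.g. LUV) as `Array[Double]`.

```scala
// Hypothetical sketch, not the library's algorithm.
object MeanShiftSketch {
  type Point = Array[Double]

  // Euclidean distance between two vectors of equal length.
  def distance(a: Point, b: Point): Double =
    math.sqrt(a.indices.map { i => val d = a(i) - b(i); d * d }.sum)

  // One flat-kernel step: move `mode` to the mean of all points within `bandwidth`.
  def shift(mode: Point, data: Seq[Point], bandwidth: Double): Point = {
    val neighbours = data.filter(p => distance(p, mode) <= bandwidth)
    if (neighbours.isEmpty) mode
    else mode.indices.map(i => neighbours.map(_(i)).sum / neighbours.size).toArray
  }

  // Gradient ascent: repeat the step until the mode moves less than `tol`
  // or the iteration budget is exhausted.
  @annotation.tailrec
  def ascend(mode: Point, data: Seq[Point], bandwidth: Double,
             tol: Double = 1e-6, maxIter: Int = 100): Point = {
    val next = shift(mode, data, bandwidth)
    if (maxIter <= 1 || distance(next, mode) < tol) next
    else ascend(next, data, bandwidth, tol, maxIter - 1)
  }

  // Assumed reading of epsilon-proximity: a mode joins the first existing group
  // whose representative is within `epsilon`, otherwise it starts a new group.
  def label(modes: Seq[Point], epsilon: Double): Seq[Int] = {
    val reps = scala.collection.mutable.ArrayBuffer.empty[Point]
    modes.map { m =>
      val idx = reps.indexWhere(r => distance(r, m) <= epsilon)
      if (idx >= 0) idx else { reps += m; reps.size - 1 }
    }
  }
}
```

For one image, `data.map(p => MeanShiftSketch.ascend(p, data, bandwidth))` followed by `label` would give one segment id per pixel. This naive version is quadratic in the number of pixels, which is presumably why the scalable Spark versions are recommended above.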
lukasstreit
@lukasstreit
Thanks for the advice! Just to clarify - I'm not trying to group images, just trying to get segments on each distinct image. I'm currently working for my exams so I don't have much time to work on this but I'll let you know how it goes :)
Mariana
@Maki_AG_twitter
Hi, I'm trying to use the KModes model on version 9.6 ... but I'm struggling to get the data into the required structure ... required: GS[Cz[O,org.clustering4ever.vectors.BinaryVector]]
I tried to follow the examples in the notebook, but I think that notebook was for version 8.4.
Beck Gaël
@beckgael

Hi @Maki_AG_twitter, try something like this:

val parData = rawData.zipWithIndex.par.map{ case (v, id) => EasyClusterizable(id.toLong, BinaryVector(v)) }

where v is an Array[Int]

Mariana
@Maki_AG_twitter
I think this method works only for version 8.4. But I'm trying to use 9.6 ... @beckgael do you have an example for this version?
Beck Gaël
@beckgael
@Maki_AG_twitter I've checked the EasyClusterizable companion object, and its apply method works in 0.9.6 as shown previously.
http://www.clustering4ever.org/API%20Documentation/0.9.6/#org.clustering4ever.clusterizables.EasyClusterizable$
What is your error?
Mariana
@Maki_AG_twitter
@beckgael I was constructing the EasyClusterizable object wrongly. I corrected it and have no problems now. Thanks!