These are chat archives for ldnclj/chat

15th
May 2015
Thomas
@thomas-shares
May 15 2015 07:49
Hello
Thomas
@thomas-shares
May 15 2015 10:07
Just found out there is a local OSM mapping group here in So'ton.
Bruce Durling
@otfrom
May 15 2015 11:55
I would have thought w/University of Southampton doing so much open data w/Sir Nigel that there would be quite a lot going on there with OSM
Thomas
@thomas-shares
May 15 2015 12:07
Ordnance Survey is based here as well of course... so there is local "competition" so to speak.
Thomas
@thomas-shares
May 15 2015 13:57
Clojure O'Clock again.
core.typed today
Korny Sietsma
@kornysietsma
May 15 2015 15:43
ok, a friday puzzle - given a lazy sequence of email addresses, not guaranteed unique, produce a collection of n unique emails. For bonus points, produce a lazy sequence of unique emails :)
(I'm doing this now - using faker to make the emails - they're pretty random, but still have collisions if you make enough of them)
Benedek Fazekas
@benedekfazekas
May 15 2015 15:47
(take n (set emails)) ?
Korny Sietsma
@kornysietsma
May 15 2015 15:48
I don't think set is lazy
thattommyhall
@thattommyhall
May 15 2015 15:48
does not work on infinite list
Korny Sietsma
@kornysietsma
May 15 2015 15:48
I'm looking at (distinct emails) ...
Benedek Fazekas
@benedekfazekas
May 15 2015 15:49
you said that was for bonus. always do the brute force first, right? ;)
thattommyhall
@thattommyhall
May 15 2015 15:49
I would not have thought distinct was lazy either
Benedek Fazekas
@benedekfazekas
May 15 2015 15:49
seems it is...
Korny Sietsma
@kornysietsma
May 15 2015 15:51
yep - looks like (take 10 (distinct (emails))) works.
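A minimal sketch of the point above, assuming a stand-in for the faker-generated data: the `emails` function here is hypothetical and simply yields an infinite lazy sequence with deliberate repeats. Because `distinct` is lazy, `take` only realizes as much of the input as it needs, whereas `(set (emails))` on the same input would never return.

```clojure
;; Hypothetical stand-in for the faker-generated emails: an infinite lazy
;; sequence in which earlier addresses keep reappearing.
(defn emails []
  (mapcat (fn [i]
            [(str "user" i "@example.com")
             (str "user" (quot i 2) "@example.com")])
          (range)))

;; Lazily drop repeats, then realize only the first n unique addresses.
(defn unique-emails [n]
  (take n (distinct (emails))))
```

Leaving off the `take` gives the lazy sequence of unique emails asked for as the bonus.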
Thomas
@thomas-shares
May 15 2015 15:51
wouldn't that be a potential memory leak...first email address might come again after 100M...
Korny Sietsma
@kornysietsma
May 15 2015 15:52
definitely it's leaky, but that's unavoidable - something has to track all the emails you've seen so far.
Thomas
@thomas-shares
May 15 2015 15:53
ok, that's what I thought
Korny Sietsma
@kornysietsma
May 15 2015 15:53
interesting to read the source:
Thomas
@thomas-shares
May 15 2015 15:53
good exercise though
Benedek Fazekas
@benedekfazekas
May 15 2015 15:53
it is local distinct ;)
Korny Sietsma
@kornysietsma
May 15 2015 15:53
(defn distinct
  "Returns a lazy sequence of the elements of coll with duplicates removed"
  {:added "1.0"
   :static true}
  [coll]
    (let [step (fn step [xs seen]
                   (lazy-seq
                    ((fn [[f :as xs] seen]
                      (when-let [s (seq xs)]
                        (if (contains? seen f) 
                          (recur (rest s) seen)
                          (cons f (step (rest s) (conj seen f))))))
                     xs seen)))]
      (step coll #{})))
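For readers trying to remember how `lazy-seq` fits in, here is a simplified sketch of the same idea. The name `my-distinct` is made up for illustration and this is not the core implementation: each call to `step` returns an unrealized lazy sequence, and the set of seen items is threaded through as an argument, growing only when a new item is emitted.

```clojure
(defn my-distinct
  "Illustrative re-implementation of the idea behind distinct."
  [coll]
  (letfn [(step [xs seen]
            ;; Nothing in this body runs until a consumer asks for an item.
            (lazy-seq
              (loop [s (seq xs)]
                (when s
                  (let [x (first s)]
                    (if (contains? seen x)
                      ;; Already seen: skip it without growing the stack.
                      (recur (next s))
                      ;; New item: emit it and remember it for the rest.
                      (cons x (step (rest s) (conj seen x)))))))))]
    (step coll #{})))
```

The `seen` set is what makes this "leaky": it holds every unique item emitted so far for as long as the sequence is being consumed.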
thattommyhall
@thattommyhall
May 15 2015 16:00
nice
that's how I would have done the orig problem
Korny Sietsma
@kornysietsma
May 15 2015 16:04
That's how I like to think I would have done it... probably after an hour of head-scratching and trying to remember how lazy-seq works.
Thomas
@thomas-shares
May 15 2015 16:19
going home now... ttfn
otfrom @otfrom wonders if you need a Bloom filter to keep the memory down
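A rough sketch of what that could look like, under stated assumptions: `bloom-distinct`, `num-bits` and `num-hashes` are hypothetical names, and the two hash values are derived from Clojure's `hash` in an ad hoc way rather than taken from any library. Memory stays fixed at `num-bits`, but the trade-off is false positives: an email never seen before can occasionally be mistaken for a repeat and dropped. A true repeat is never emitted.

```clojure
(import 'java.util.BitSet)

(defn bloom-distinct
  "Lazy, approximate distinct using a fixed-size Bloom filter.
  May drop some unique items (false positives); never emits a repeat."
  [num-bits num-hashes coll]
  (let [^BitSet bits (BitSet. num-bits)
        ;; Derive num-hashes bit positions from two base hashes (an assumption
        ;; made for this sketch, not a tuned hashing scheme).
        indexes (fn [x]
                  (let [h1 (hash x)
                        h2 (hash [x ::salt])]
                    (mapv #(int (mod (+ h1 (* % h2)) num-bits))
                          (range num-hashes))))
        step (fn step [xs]
               (lazy-seq
                 (when-let [s (seq xs)]
                   (let [x (first s)
                         idxs (indexes x)]
                     (if (every? #(.get bits (int %)) idxs)
                       ;; All bits already set: treat as seen and skip.
                       (step (rest s))
                       ;; At least one bit unset: definitely new, so record and emit.
                       (do (doseq [i idxs] (.set bits (int i)))
                           (cons x (step (rest s)))))))))]
    (step coll)))
```

The filter is mutable state shared by the whole sequence, so this sketch is not safe to consume from several threads at once.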
Korny Sietsma
@kornysietsma
May 15 2015 16:46
I'm building a one-off data seeder so can be pretty hacky, thankfully.