These are chat archives for thunder-project/thunder

5th
Apr 2016
lilumb
@lilumb
Apr 05 2016 18:20
@lilumb With apologies for pestering @freeman-lab and others, but ... Can I assume that the ability to cross-correlate two time series will be included in a future version of Thunder? I'm writing up an article, and want to ensure I don't misrepresent Thunder on this point.
Davis Bennett
@d-v-b
Apr 05 2016 18:30
@lilumb you can do this in spark already
and therefore in thunder
lilumb
@lilumb
Apr 05 2016 18:33
@d-v-b Thank you for your response. My original question had to do with Thunder's own capability for cross-correlation (http://thunder-project.org/thunder/docs/generated/thunder.TimeSeries.html#thunder.TimeSeries.crossCorr).
Davis Bennett
@d-v-b
Apr 05 2016 18:37
@lilumb as of now cross correlation is a method on Series objects in the upcoming thunder 1.0 release
lilumb
@lilumb
Apr 05 2016 18:44
@d-v-b So Thunder's support remains the same as described at https://gitter.im/thunder-project/thunder?at=55c6349b2ee3da6275c3345e ... the emphasis is on correlating all time series to a single target, not correlating a pair of time series.
Davis Bennett
@d-v-b
Apr 05 2016 18:50
@lilumb I think what you are describing is taking two series objects a and b, each with the same number of records, and then generating a third series object where each record is the result of xcorr(a,b)?
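For readers of this archive, the record-wise operation described here can be sketched in plain Python, with no Thunder or Spark involved. The helper name `xcorr` and the toy lists below are illustrative assumptions, not Thunder's API. The sketch assumes real-valued records and computes the "full" cross-correlation for every lag.

```python
def xcorr(a, b):
    """Full cross-correlation of two real-valued sequences.

    The output has len(a) + len(b) - 1 entries. Entry i corresponds to
    lag i - (len(b) - 1), so zero lag sits at index len(b) - 1.
    """
    n, m = len(a), len(b)
    out = []
    for lag in range(-(m - 1), n):
        total = 0.0
        for j in range(m):
            i = j + lag
            if 0 <= i < n:
                total += a[i] * b[j]
        out.append(total)
    return out


# Two hypothetical "series" collections with the same number of records.
series_a = [[1, 2, 3], [0, 1, 0]]
series_b = [[1, 2, 3], [0, 0, 1]]

# Third collection: record k is xcorr(series_a[k], series_b[k]).
series_c = [xcorr(a, b) for a, b in zip(series_a, series_b)]
```

Pairing by position with `zip` stands in for pairing by key; in a distributed setting the records would be matched by their keys instead.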
lilumb
@lilumb
Apr 05 2016 18:51
@d-v-b Precisely! Others have expressed similar interests here ;-)
Davis Bennett
@d-v-b
Apr 05 2016 18:53
I don't know if anyone is planning on making this in thunder, but I have done this myself using the underlying rdd objects
lilumb
@lilumb
Apr 05 2016 18:54
@d-v-b Great to know this is indeed possible. Any tips, references, examples, etc. would be greatly appreciated.
Davis Bennett
@d-v-b
Apr 05 2016 19:02
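from scipy.signal import fftconvolve

def fun(a, b):
  # cross-correlation of real-valued a and b: convolve a with b reversed
  return fftconvolve(a, b[::-1])

# join pairs records by key, giving (key, (a, b)); mapValues passes the (a, b) tuple as one argument
xcorr_rdd = series_a.rdd.join(series_b.rdd).mapValues(lambda ab: fun(ab[0], ab[1]))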
lilumb
@lilumb
Apr 05 2016 19:04
@d-v-b Thank you kindly!!!
Davis Bennett
@d-v-b
Apr 05 2016 19:04
no guarantee that this actually works :) I re-arranged something I wrote somewhere else
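For reference, PySpark's `join` on two pair RDDs yields `(key, (a, b))` records, and `mapValues` calls its function with that single `(a, b)` tuple, so a two-argument function has to be unpacked. A rough plain-Python emulation of that pattern follows. The helpers `join`, `map_values`, and `dot` are hypothetical stand-ins written for this sketch, not Spark or Thunder functions, and the sketch assumes each key appears once per collection.

```python
def join(left, right):
    """Inner join of two lists of (key, value) pairs, assuming unique keys."""
    right_by_key = dict(right)
    return [(k, (v, right_by_key[k])) for k, v in left if k in right_by_key]


def map_values(pairs, func):
    """Apply func to the value of each (key, value) pair, keeping the key."""
    return [(k, func(v)) for k, v in pairs]


def dot(a, b):
    """Zero-lag correlation (dot product), used here as a simple pairwise function."""
    return sum(x * y for x, y in zip(a, b))


# Records keyed by a hypothetical (row, column) index, listed in different orders.
series_a = [((0, 0), [1, 2, 3]), ((0, 1), [0, 1, 0])]
series_b = [((0, 1), [4, 5, 6]), ((0, 0), [1, 0, 1])]

joined = join(series_a, series_b)
# Each joined value is one (a, b) tuple, so unpack it before calling dot.
result = map_values(joined, lambda ab: dot(ab[0], ab[1]))
```

Unlike this list-based emulation, a real Spark join does not promise any particular record order, so results should be looked up by key.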
lilumb
@lilumb
Apr 05 2016 19:04
@d-v-b Of course not ;-)
Davis Bennett
@d-v-b
Apr 05 2016 19:05
but a general rule is that if thunder doesn't have some functionality you want, it's usually possible to get that functionality easily with the interface to the underlying rdd
lilumb
@lilumb
Apr 05 2016 19:11
@d-v-b Absolutely! If this isn't drawn out in the docs already, perhaps it should be???
Davis Bennett
@d-v-b
Apr 05 2016 19:13
it's all drawn out in the pyspark docs http://spark.apache.org/docs/latest/api/python/
lilumb
@lilumb
Apr 05 2016 19:17
@d-v-b ... a reference I am aware of - obviously, not deeply enough ... RTFM, @lilumb! ;-)