    Matthew Powers
    @MrPowers
    @nvander1 Sweet, just joined the Mill Gitter channel
    Nik Vanderhoof
    @nvander1
    We'll also need to figure out how to do build matrices with different spark/scala versions. To backport the higher order functions api, I think we need slightly differing implementations depending on the spark version people use
    Sbt has support for cross-scala builds out of the box, but I'm trying to find out if people also build against different versions of dependencies
    Hoping mill turns out to be at least as speedy, because I know how to do those types of matrices in mill
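For the sbt side of the question above, here is a minimal sketch of one common way to combine cross-Scala builds with a selectable Spark version. It is an assumption, not the spark-daria build: the `spark.version` system property name, the default versions, and the `scala-spark-<major.minor>` source directory layout are all hypothetical, and the slash syntax assumes sbt 1.1 or later.

```scala
// build.sbt (sketch; names and versions are illustrative assumptions)

// Pick the Spark version from a JVM system property, falling back to a default.
val sparkVersion = sys.props.getOrElse("spark.version", "2.4.0")

// sbt's built-in cross-building: `sbt +test` runs against each Scala version.
crossScalaVersions := Seq("2.11.12", "2.12.8")

libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"

// Add a Spark-version-specific source directory, e.g. src/main/scala-spark-2.4,
// so slightly different implementations can live side by side.
Compile / unmanagedSourceDirectories += {
  val majorMinor = sparkVersion.split('.').take(2).mkString(".")
  (Compile / sourceDirectory).value / s"scala-spark-$majorMinor"
}
```

With this layout, something like `sbt -Dspark.version=2.3.0 +test` would select the other Spark line. Not every Spark/Scala pairing is published (older Spark releases have no Scala 2.12 artifacts), so a real matrix would need to skip the invalid combinations.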
    Matthew Powers
    @MrPowers
    Yea, might not be worth the complexity in that case, haha. I'm all about having a spark-daria philosophy of "upgrade your Spark to the latest version if you want the latest features"
    Nik Vanderhoof
    @nvander1
    Yeah that's another option
    Matthew Powers
    @MrPowers
    @nvander1 - I need to sign off now. Can we jump on a Hangout at some point to brainstorm next steps?
    Nik Vanderhoof
    @nvander1
    Yeah not today though. Sometime during the week will work for me though.
    Matthew Powers
    @MrPowers
    Cool, sounds good.
    Nik Vanderhoof
    @nvander1
    @manuzhang I probably won't have the example of the higher order functions pushed until we get the Mill / Maven stuff sorted
    Manu Zhang
    @manuzhang
    @nvander1 Take your time. It's not a must-have since users can always leverage UDAFs as in MrPowers/spark-daria#79. Also, following @MrPowers's philosophy, we can merge in your Spark PR apache/spark#24232 for the latest Spark version if it doesn't get into the main tree.
    Manu Zhang
    @manuzhang
    I've also experimented with Mill in https://github.com/gearpump/gearpump-externals. My impression is that Mill makes it easier for developers to dig in and figure things out, while SBT always feels like a mystery.
    On the other hand, it has surprised me, as in lihaoyi/mill#385.
    Manu Zhang
    @manuzhang
    My concern is that Mill is more of a one-man project, while SBT has a larger community, a company behind it, and many plugins.
    Matthew Powers
    @MrPowers
    It’s a good thing spark-daria is such a simple project. It’s probably easy to use either build tool with spark-daria.
    Joaquín Chemile
    @jchemile
    Hello! Greetings from Buenos Aires!!
    Matthew Powers
    @MrPowers
    Welcome @jchemile :)
    Matthew Powers
    @MrPowers
    I released v0.32.0, but scoverage caused my downstream CI builds to fail with this error when spark-daria was included in other projects: scoverage/sbt-scoverage#228. Super annoying. I set coverageEnabled := false in the spark-daria build.sbt file to fix this and released v0.32.1.
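The fix mentioned above amounts to a one-line setting. The fragment below is a sketch of how it might look, assuming the sbt-scoverage plugin is already on the build's plugin classpath; the comment about published artifacts is an assumption about the cause, and the linked issue has the actual details.

```scala
// build.sbt (fragment)

// Keep coverage instrumentation off by default so that released artifacts
// are built without scoverage (assumed cause of the downstream failures).
coverageEnabled := false
```

Coverage can still be switched on for a single session with the plugin's own commands, for example `sbt coverage test coverageReport`, without changing what gets released.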
    Nik Vanderhoof
    @nvander1
    I don't think we can switch to mill soon. Not until it supports shading. I was taking a look down into https://github.com/shevek/jarjar and into Mill. If I find the time, I'll try to get a PR on mill for it.
    Nik Vanderhoof
    @nvander1
    @MrPowers Do you manually update the docs for spark-daria?
    Maybe we could add a hook in Travis to build the docs for each tagged commit and push them to the appropriate GitHub Pages site?
    ie
    Nik Vanderhoof
    @nvander1
    @MrPowers @manuzhang Re: the long name, I think daria._ is acceptable. The only thing to be cautious of is forcing users to change existing imports, although it should just be a simple find and replace for them.
    Nik Vanderhoof
    @nvander1
    mill mill.scalalib.PublishModule/publishAll --sonatypeCreds "$SONATYPE_USER:$SONATYPE_PASS" --publishArtifacts __.publishArtifacts --release false
    Matthew Powers
    @MrPowers
    mill mill.scalalib.PublishModule/publishAll --sonatypeCreds X:Y --publishArtifacts __.publishArtifacts --release false
    [218/218] mill.scalalib.PublishModule.publishAll
    1 targets failed
    mill.scalalib.PublishModule.publishAll os.SubprocessException: CommandResult 2
    os.proc.call(ProcessOps.scala:74)
    mill.scalalib.publish.SonatypePublisher.poorMansSign(SonatypePublisher.scala:146)
    mill.scalalib.publish.SonatypePublisher.$anonfun$publishAll$4(SonatypePublisher.scala:33)
    scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
    scala.collection.Iterator.foreach(Iterator.scala:941)
    scala.collection.Iterator.foreach$(Iterator.scala:941)
    scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    scala.collection.IterableLike.foreach(IterableLike.scala:74)
    scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    scala.collection.TraversableLike.map(TraversableLike.scala:237)
    scala.collection.TraversableLike.map$(TraversableLike.scala:230)
    scala.collection.AbstractTraversable.map(Traversable.scala:108)
    mill.scalalib.publish.SonatypePublisher.$anonfun$publishAll$2(SonatypePublisher.scala:32)
    scala.collection.TraversableLike$WithFilter.$anonfun$map$2(TraversableLike.scala:742)
    scala.collection.Iterator.foreach(Iterator.scala:941)
    scala.collection.Iterator.foreach$(Iterator.scala:941)
    scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    scala.collection.IterableLike.foreach(IterableLike.scala:74)
    scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:741)
    mill.scalalib.publish.SonatypePublisher.publishAll(SonatypePublisher.scala:24)
    mill.scalalib.PublishModule$.$anonfun$publishAll$2(PublishModule.scala:117)
    scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    Manu Zhang
    @manuzhang
    @nvander1 :thumbsup: hopefully this could get into the main repo
    Matthew Powers
    @MrPowers
    @nvander1 @manuzhang - I wrote a blog post on dependency injection: https://medium.com/@mrpowers/dependency-injection-with-spark-8367b6956343 Let me know what you think!
    @nvander1 - I created a Giter8 template to easily create Spark SBT projects: https://github.com/MrPowers/spark-sbt.g8 Do you think we should make a Giter8 template project for Mill? That’d hopefully make it easier for other developers to start using Mill.
    Manu Zhang
    @manuzhang
    @MrPowers
    Considering the following code, where do we get that Spark session? Also, I'm not sure it's good style to have such a long default parameter.
    def withStateFullNameInjectDF(
      stateMappingsDF: DataFrame = spark
        .read
        .option("header", true)
        .csv(Config.get("stateMappingsPath"))
    )(df: DataFrame): DataFrame = {
      df
        .join(
          broadcast(stateMappingsDF),
          df("state") <=> stateMappingsDF("state_abbreviation"),
          "left_outer"
        )
        .drop("state_abbreviation")
    }
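One possible way to address both questions is sketched below. It is not the approach from the blog post: the `StateTransforms` object, the `loadStateMappings` helper, and the choice of `SparkSession.builder().getOrCreate()` as the session source are assumptions made for illustration. The join logic is copied from the snippet above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.broadcast

// Hypothetical container object, used only for this sketch.
object StateTransforms {

  // Assumption: reuse the already-running session (or create one)
  // instead of relying on an implicit `spark` value in scope.
  private def spark: SparkSession = SparkSession.builder().getOrCreate()

  // Hypothetical helper: the long default argument, pulled out and named.
  def loadStateMappings(path: String): DataFrame =
    spark.read.option("header", true).csv(path)

  // The mapping DataFrame is now an ordinary required parameter,
  // so a test can pass in a small in-memory DataFrame instead.
  def withStateFullName(stateMappingsDF: DataFrame)(df: DataFrame): DataFrame =
    df.join(
        broadcast(stateMappingsDF),
        df("state") <=> stateMappingsDF("state_abbreviation"),
        "left_outer"
      )
      .drop("state_abbreviation")
}
```

A caller would then wire it up explicitly, for example `df.transform(StateTransforms.withStateFullName(StateTransforms.loadStateMappings(Config.get("stateMappingsPath"))))`, where `Config` is the object from the original snippet. The trade-off is a slightly longer call site in exchange for a signature with no hidden file read.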
    Manu Zhang
    @manuzhang
    @MrPowers Please check out my reply on MrPowers/spark-daria#104
    I also fixed a small issue on README. Since the change was minor, I didn't send a pull request
    By the way, have you seen a java.lang.OutOfMemoryError: Metaspace error when building the project? I have to add -XX:MaxMetaspaceSize=1024M to the Java options.
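If the Metaspace error shows up in tests rather than in sbt itself, one place the flag could go is the build definition. This is a sketch under that assumption; the message above does not say where the option was added, and the slash syntax assumes sbt 1.1 or later.

```scala
// build.sbt (fragment)

// javaOptions only apply to forked JVMs, so forking must be enabled.
Test / fork := true

// Raise the Metaspace ceiling for the forked test JVM.
Test / javaOptions += "-XX:MaxMetaspaceSize=1024M"
```

These settings do not affect the JVM that runs sbt. If compilation itself runs out of Metaspace, the same flag would instead go in the project's `.jvmopts` file or the `SBT_OPTS` environment variable.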
    Matthew Powers
    @MrPowers
    @manuzhang - sorry for not responding earlier… I haven’t been on Gitter in a really long time. Sorry!!