Rob Norris
@tpolecat
I believe it's just a common convention. Haoyi stuff and Coursier use :: and sbt uses %%
D Cameron Mauch
@DCameronMauch
Cats adds a catchNonFatal method to Either. Any idea why there is no equivalent for Option?
Rob Norris
@tpolecat
Such a method would lose information in the error case. You can use .toOption on your Either if you want to swallow the error.
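The information-loss point above can be shown with the standard library's Try, which mirrors the same trade-off (a stdlib analogy, not the cats API itself):

```scala
import scala.util.Try

// Try.toEither keeps the Throwable, like cats' Either.catchNonFatal;
// Try.toOption is the point where the error information is dropped.
val parsed: Either[Throwable, Int] = Try("abc".toInt).toEither // Left(NumberFormatException)
val swallowed: Option[Int]         = Try("abc".toInt).toOption // None: why it failed is gone

// Going through Either first costs nothing; calling .toOption on it
// makes discarding the error an explicit, visible choice.
val viaEither: Option[Int] = parsed.toOption // None
```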
D Cameron Mauch
@DCameronMauch
Gotcha. I see the idea. But from another perspective, I would be choosing to lose that information, just by using it.
Seems kind of a waste to make an Either just to turn it immediately into an Option
Rob Norris
@tpolecat
cats might take a PR adding this, dunno
experience has shown that swallowing exceptions is bad
D Cameron Mauch
@DCameronMauch

Upgrading our code from Spark 2.4.7 to 3.0.1, and getting this error: object typed in package scalalang is deprecated (since 3.0.0): please use untyped builtin aggregate functions in this code:

    val meanScores: Dataset[(Int, Double)] = classified
      .filter(e => e._2 >= 0.0 && e._2 <= 1.0)
      .groupByKey { case (typeId, _) => typeId }
      .agg(typed.avg(_._2))

But I can’t figure out what it’s supposed to look like. All the examples I could find online use that typed.someAggregateFunction pattern...

The type of classified in this example is Dataset[(Int, Double)]
Hanns Holger Rutz
@Sciss
Does anyone know of a Mastodon bot written in Scala? Like posting toots, preferably with image attachments.
D Cameron Mauch
@DCameronMauch
I have to write my own avg function? Seems like a pretty standard kind of thing to do. Just can’t find any avg with a signature like A => B, where A is the dataset type and B is the simple numeric type to average.
Hanns Holger Rutz
@Sciss
Huh, this https://github.com/d6y/brighton-tide-post still seems to be posting, so I guess the API is up to date: https://mastodon.social/@brightontide
Luis Miguel Mejía Suárez
@BalmungSan
@DCameronMauch those should already exist, not sure how to use them.
Maybe just write plain SQL? Spark is deprecating everything typed anyways.
D Cameron Mauch
@DCameronMauch
I can only find avg that takes a String or a Column
Luis Miguel Mejía Suárez
@BalmungSan
Yeah you should pass the column name.
D Cameron Mauch
@DCameronMauch
I don’t have a column name
This is a Dataset, not a DataFrame
Luis Miguel Mejía Suárez
@BalmungSan
You can give it one before the agg
D Cameron Mauch
@DCameronMauch
What would that look like?
Luis Miguel Mejía Suárez
@BalmungSan
There is a method to name columns, I do not remember which one it was.
alias I believe.
Also, it may be better to just use DataFrames?
Ethan
@esuntag:matrix.org
[m]
Maybe frameless can help? It's at least an attempt to bring some order to Spark. I think a DataFrame is just a Dataset[Row] in the Scala API, so that conversion should be possible if it's helpful
D Cameron Mauch
@DCameronMauch
Yeah, I got it working, but only by converting it to a DataFrame, and then doing groupBy and agg(avg(…)), and back to Dataset… I feel dirty
Jeez, I would probably prefer the custom aggregate UDF
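The DataFrame round-trip described above might look like the following untested sketch (assuming Spark 3.x; the column names typeId/score and the sample data are made up for illustration):

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.avg

val spark: SparkSession = SparkSession.builder().getOrCreate()
import spark.implicits._

// Stand-in for the classified Dataset[(Int, Double)] from the question
val classified: Dataset[(Int, Double)] =
  Seq((1, 0.5), (1, 0.7), (2, 0.2)).toDS()

// Name the tuple columns, aggregate untyped, then come back to a Dataset
val meanScores: Dataset[(Int, Double)] =
  classified
    .toDF("typeId", "score")          // Dataset -> DataFrame with named columns
    .groupBy("typeId")
    .agg(avg("score").as("meanScore")) // untyped builtin avg on the named column
    .as[(Int, Double)]                 // DataFrame -> Dataset again
```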
karlmcphee
@karlmcphee
I'm trying to train a ml model on imagenet. Are there specific spark frameworks or language integrations that work well with very large datasets?
Ethan
@esuntag:matrix.org
[m]
Define large dataset. If you're talking matrix transformations across multi machine clusters, spark is probably your best option. If you're on one machine I think breeze is supposed to be good
zeroexcuses
@zeroexcuses

If you are training on a single GPU, the most important thing is keeping the GPU 'fed' (i.e. sending it enough data so the compute units aren't waiting on memory DMA).

If you are training on a cluster of GPUs -- I don't have the budget for this -- but the most important thing is probably figuring out how to update the training weights (across multiple GPUs).

karlmcphee
@karlmcphee
Is spark faster than keras when they're both running an ml script on one gpu? And not counting resources, is there a particular way to run spark to make it less likely to crash on large datasets?
Ethan
@esuntag:matrix.org
[m]
I've never used GPU with spark before, although apparently you can set it up to do so. If I had to guess, spark will probably lose every benchmark in a one machine situation against anything, it's really only useful if you need a cluster
karlmcphee
@karlmcphee
Alright, thanks for the help.
Ethan
@esuntag:matrix.org
[m]
It looks like there's guides on using spark and keras together too
Leif Warner
@LeifW
So, SBT generates JUnit XML test reports by itself these days - should I just use those, and not Scatest's JUnit XML generation?
Matt Hicks
@darkfrog26
is there a performance impact from calling toVector on an Array to avoid mutability?
of course, there's always an impact from any operation, but for a very large array is it more than an inconsequential impact?
Ethan
@esuntag:matrix.org
[m]
@darkfrog26: I'm fairly certain it will have to copy the entire array. Ultimately going to need to profile though, how consequential it is depends on what else you're doing.
Nik
@your-psychiatrist:ellipsen.net
[m]
is there some cool scala merch like shirts etc.?
except redbubble
Luis Miguel Mejía Suárez
@BalmungSan
@darkfrog26 why not use ArraySeq?
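The difference matters for the performance question above: toVector copies every element, while the stdlib's ArraySeq.unsafeWrapArray (Scala 2.13+) wraps the existing array with no copy, at the cost of immutability only being as good as your promise not to mutate the array afterwards:

```scala
import scala.collection.immutable.ArraySeq

val arr = Array(1, 2, 3)

// toVector copies all elements into a new persistent structure: O(n)
val v: Vector[Int] = arr.toVector

// unsafeWrapArray is O(1): no copy, "unsafe" because later mutation
// of arr shows through the immutable-typed wrapper
val wrapped: ArraySeq[Int] = ArraySeq.unsafeWrapArray(arr)

arr(0) = 99
// v is unaffected by the mutation; wrapped observes it
```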
Rob Norris
@tpolecat
When we start having in-person conferences again the swag will start flowing. I don’t know of any online store.
daenyth
@daenyth:matrix.org
[m]
@darkfrog26: you might be able to use fs2.Chunk instead, which should wrap the mutable array safely
Rob Norris
@tpolecat
What do you need to do with it once it’s immutable? Do you really need a collection type?
peterstorm
@peterstorm:matrix.org
[m]
Um, if I package a Scala program as a fatJar or something, and want to call it from Java code, would that be trivial?
Using IOApp from CE as my main
Rob Norris
@tpolecat
Yes, you can certainly call main from Java since it’s part of the JVM spec and it has a perfectly normal Java type.
You can’t call into Scala code from Java in general but if you’re calling a method or constructor with types that make sense in Java (like main) it will work fine.
peterstorm
@peterstorm:matrix.org
[m]
Ok, so if I use something like sbt-assembly to package a jar, and import that into my Java app, I should be able to call my main from that?