Martijn
@martijnhoekstra:matrix.org
[m]
which we added to the collection
Rob Norris
@tpolecat
i feel like you're starting to channel som snytt
Martijn
@martijnhoekstra:matrix.org
[m]
I may only dream of one day reaching their insightful / understandable ratio
Henri Cook
@henricook
Hi all, I'm a noob to scalamock, I've been a Mockito guy for years. I'm trying to convert a Mockito unit test to a scalamock one and I can't see how to mock a val of a class - can anyone help me out? I feel like it's probably a really simple question but I can't see it anywhere in the docs
e.g. I have class Foo(val aThing: AObject, val bThing: BObject, val cThing: CObject) and I want a mock[Foo] where I can set the value of aThing only
Rohan Sircar
@rohan-sircar

and the real collection was the friends we made along the way
which we added to the collection

aha!

Jim Newton
@jimka2001
I have a set of test suites written using scalatest. When I run all the tests (at least from within IntelliJ), the tests fail for lack of heap space. However, when I run each of the tests in isolation, there is no out-of-heap error. I'd love to hear suggestions about how to solve this problem.
Rob Norris
@tpolecat
Have you tried increasing your heap space? The default isn't very much.
-Xmx<size> like -Xmx8000m
actually I think you can say -Xmx8g now
Jim Newton
@jimka2001
Sorry, but I don't know where to put -Xmx8g
Rob Norris
@tpolecat
On the commandline this is what you do. I don't know how to tell IntelliJ to do it. My guess is that it's in the run configuration for your tests.
Someone here who uses IJ can tell you.
D Cameron Mauch
@DCameronMauch
I’m trying to figure out how to create a method with this signature: def fields[T](): List[String] where the output is a list of the field names in a case class T
Rob Norris
@tpolecat
You can do that with shapeless.
D Cameron Mauch
@DCameronMauch
@jimka2001 “IntelliJ IDEA” -> “preferences” -> “build, execute, deploy” -> “compiler” -> “scala compiler” -> “scala compile server”
Jim Newton
@jimka2001
I can add -Xmx8g to the VM options? beginning? end? doesn't matter?
Rob Norris
@tpolecat
Doesn't matter.
D Cameron Mauch
@DCameronMauch
There is the setting right there for maximum heap size
I just change that
Thanks for the pointer
Rob Norris
@tpolecat
In Scala 3 you can do it with Mirror but for Scala 2 you need Shapeless.
An awful lot of questions here (and elsewhere) are about things that are at the edge of what the language can do, and these are exactly the kinds of things that changed in Scala 3. So I think a lot of questions are going to have two answers for a while.
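For the Scala 3 side, a minimal sketch of the Mirror approach (the case class and names here are hypothetical, stdlib only):

```scala
import scala.compiletime.constValueTuple
import scala.deriving.Mirror

case class Person(name: String, age: Int)

// Read the field labels off the case class's Mirror at compile time;
// MirroredElemLabels is a tuple of string literal types.
inline def fieldNames[T](using m: Mirror.ProductOf[T]): List[String] =
  constValueTuple[m.MirroredElemLabels].toList.map(_.toString)

@main def demo(): Unit =
  println(fieldNames[Person]) // prints List(name, age)
```

Unlike runtime reflection, this sees exactly the primary-constructor fields, in declaration order.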
D Cameron Mauch
@DCameronMauch
This seems to work:
implicit class DatasetOps[T: Encoder](ds: Dataset[T]) {
    def asCleaned[U: Encoder](): Dataset[U] = {
      ds.select(
        classOf[U]
          .getDeclaredFields
          .toList
          .map(_.getName)
          .map(col): _*
      ).as[U]
    }
}
Spark is weird. If you have a DataFrame with 10 columns, and convert it to a Dataset of some case class with 6 fields, those extra 4 columns are still there. Taking space, slowing down shuffles, etc. This is my attempt at removing all the crud.
D Cameron Mauch
@DCameronMauch
Ah, looks like the above doesn’t compile, though IntelliJ is not showing any errors
Rob Norris
@tpolecat
The declared fields are not necessarily the same thing as the primary constructor arguments.
Which I assume is what you meant by "fields"
D Cameron Mauch
@DCameronMauch
This compiles:
    def asCleaned[U: ClassTag: Encoder](): Dataset[U] = {
      val fields: List[Column] = implicitly[ClassTag[U]].runtimeClass.getDeclaredFields.toList.map(_.getName).map(col)
      ds.select(fields: _*).as[U]
    }
It seems to generate the expected list, except in Databricks, which adds some $outer fields to the end...
I’m not sure I understand the difference
I didn’t create some alternative apply
Spark seems to also get this list of fields, and map each column with the right name to a field to then construct the class instance
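The $outer entries can be reproduced without Spark: a case class nested inside a class captures its enclosing instance as a synthetic field. A stdlib-only sketch (class names hypothetical):

```scala
import scala.reflect.ClassTag

case class User(name: String, age: Int)

class Wrapper {
  case class Inner(x: Int)
}

// Same reflective lookup as the asCleaned snippet above.
def declaredFieldNames[T](using ct: ClassTag[T]): List[String] =
  ct.runtimeClass.getDeclaredFields.toList.map(_.getName)

@main def show(): Unit = {
  // Top-level case class: only the constructor fields appear.
  println(declaredFieldNames[User])
  // Nested in a class: the compiler adds a synthetic $outer reference,
  // which is the extra field seen on Databricks (where notebook code
  // is compiled inside a wrapper class).
  println(declaredFieldNames[Wrapper#Inner])
}
```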
Rob Norris
@tpolecat
This may work for your specific case but it is very fragile in general.
D Cameron Mauch
@DCameronMauch
So best to stick with the Shapeless solution?
Rob Norris
@tpolecat
I think that would probably be safer.
@ implicitly[ClassTag[String]].runtimeClass.getDeclaredFields.toList.map(_.getName) 
res1: List[String] = List(
  "value",
  "coder",
  "hash",
  "serialVersionUID",
  "COMPACT_STRINGS",
  "serialPersistentFields",
  "CASE_INSENSITIVE_ORDER",
  "LATIN1",
  "UTF16"
)
Those are certainly not the names of the fields of the string constructor.
D Cameron Mauch
@DCameronMauch
Oy, dang
Okay, Shapeless it is
Eric K Richardson
@ekrich
How many lines of code does it take to just select the fields you want and map them into case classes? Or do you have so much of it that getting rid of that code is important?
D Cameron Mauch
@DCameronMauch
I was trying to come up with a generic solution, such that a developer could take any case class T and do something like df.as[T], without having to do something like keep a companion object with the list of fields. Though that would be much more straightforward.
Alessandro
@ImGrayMouser_twitter

Hi everyone,
I was playing with a coding challenge. Basically I need to remove duplicates.
My first implementation used a mutable Array (because the given method signature was providing and expecting Array).
To make it short, it all boils down to the following. Given

val a1 = Array(1,2,3)
val a2 = Array(1,2,3)

val l1 = List(1,2,3)
val l2 = List(1,2,3)

scala> a1 == a2
res153: Boolean = false

scala> l1 == l2
res154: Boolean = true

Consequently happens this:

scala> val s1 = Set(a1,a2)
s1: scala.collection.immutable.Set[Array[Int]] = Set(Array(1, 2, 3), Array(1, 2, 3))

scala> val s2 = Set(l1,l2)
s2: scala.collection.immutable.Set[List[Int]] = Set(List(1, 2, 3))

Why do Arrays with the same elements not compare equal while Lists do?

Is there a way to have Set work as expected with Arrays too?
Thanks

Luis Miguel Mejía Suárez
@BalmungSan
@DCameronMauch wow, really? Maybe it would be good to open an issue in Spark; I am pretty sure that isn't the intended behaviour.
@ImGrayMouser_twitter because Arrays are not real collections; they are JVM primitives.
And you shouldn't use them, especially when learning. They are only useful for performance-sensitive code.
They are mutable, they are invariant, and they have neither a pretty toString nor a sensible equals.
Let me guess, you are trying to solve some LeetCode exercises?
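To make the Set question concrete: one common fix is wrapping the arrays in an immutable ArraySeq, which compares by content. A small sketch (Scala 2.13+ / 3):

```scala
import scala.collection.immutable.ArraySeq

@main def arrays(): Unit = {
  val a1 = Array(1, 2, 3)
  val a2 = Array(1, 2, 3)

  // Arrays compare by reference, so == is false despite equal contents.
  println(a1 == a2)            // prints false
  // sameElements compares contents.
  println(a1.sameElements(a2)) // prints true

  // ArraySeq wraps the array without copying and has structural
  // equality, so Set deduplicates as expected.
  val s = Set(ArraySeq.unsafeWrapArray(a1), ArraySeq.unsafeWrapArray(a2))
  println(s.size)              // prints 1
}
```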
D Cameron Mauch
@DCameronMauch
People here have said before that a Dataset is more like a view/projection kind of thing. The underlying data structure is still there, no matter how you view it.