Aaron Stannard
@Aaronontheweb
as @vasily-kirichenko pointed out it's probably less effort to use an off-the-shelf tool like Spark for doing deep learning style stuff
since you can leverage a lot of built-in infrastructure and ecosystem knowledge that way
there might be legitimate reasons for using Akka.NET to do it
(one I can think of is needing to execute real-time reactions to deep learnings discovered in real time, but that might be a bit contrived)
lots of users do use Akka.NET for real-time machine learning, but it's not using the deep multi-level networks and stuff (edit: as far as I am personally aware)
they're doing simpler problems like real-time classification
Vasily Kirichenko
@vasily-kirichenko
Spark parallelizes data and its processing automagically, and gives you fault tolerance, checkpointing, etc. You can implement it yourself, but it's a lot of work. Also, Spark runs on the JVM, which means you have nearly bug-free clients for all data stores, message brokers, Kafka, etc., and lots of libs. .NET is faaaar behind and I doubt it will ever catch up.
Aaron Stannard
@Aaronontheweb
my experience working with some of our customers who use Spark
traditionally .NET companies with large amounts of data that is still being processed using really old OLAP systems
is that ones who already have the personnel or time to invest in learning enough of the JVM ecosystem are usually happy with the results
others decide that it's worth the trouble of rolling something in-house because there are too many unknowns all at once
going down the Spark route
not issuing a right/wrong judgment on either
but saying that the key success factor in adopting Spark has been being able to commit to supporting the JVM platform long-term, even in a majority .NET shop
we used Hive in addition to Akka.NET at my last company and had great success with it, but that's because we'd committed to understanding the JVM ecosystem years earlier when we went all-in on Apache Cassandra for our storage solution
later on we were able to port those Hive jobs to a very early version of Spark
anyway, bit of a tangent - but you should think critically about what's the right tool for the job both today and years from now
Vasily Kirichenko
@vasily-kirichenko
I absolutely agree. Our company decided to do all new projects in Scala (and there is no rush, it's a strategic decision) because we are really tired of the state of big data in .NET.
Aaron Stannard
@Aaronontheweb
that's why stuff like Mobius exists in the first place
even Microsoft threw in the towel on their own big data solutions
Dryad et al
they decided it was easier to port all of those old C# queries to run on top of the JVM via an adapter layer like Mobius
and leverage the benefits of thousands of man-years worth of work there
Vasily Kirichenko
@vasily-kirichenko
I've not seen great activity in the Mobius repo tho.
Aaron Stannard
@Aaronontheweb
the Mobius project itself is basically a series of transpilation hacks
and yeah
I agree with you there
I think with many of these OSS projects Microsoft has released lately
stuff that's not core to their business
or to their customers
Vasily Kirichenko
@vasily-kirichenko
and it's still scary to use it in production. One of the main points of using Spark is that it's used by lots and lots of large companies, so there is a great chance it will work with zero problems for you.
Aaron Stannard
@Aaronontheweb
i.e. Mobius being a good example
they let it languish once they get it to a state where it solves MSFT's internal problems
and don't really commit to supporting it
social capital is a huge part of the value of an OSS ecosystem in general
Vasily Kirichenko
@vasily-kirichenko
TBH, I like the Java/Scala community a lot more than .NET and MS as a whole.
(so far :) )
Aaron Stannard
@Aaronontheweb
lol
I haven't been on the contribution side of any real JVM project much
Vasily Kirichenko
@vasily-kirichenko
Me neither ;)
but the "MS will do everything for us" mindset is unhealthy.
Aaron Stannard
@Aaronontheweb
totally agree
looking at some of the stuff MSFT is doing around .NET Core
i.e. killing off the need for third party libraries for things like dependency injection
helps create that mentality too
Vasily Kirichenko
@vasily-kirichenko
"oh, no, another shitty open source library..." - I heard this from C# devs several times.
yes.
Another good thing about the JVM: a lot of libraries. It's shocking at first, especially after years of writing F# :)
Aaron Stannard
@Aaronontheweb
there does seem to be an awfully large amount of NIH-ing in .NET sometimes
like there is some effort threshold