my experience working with some of our customers who use Spark
traditionally .NET companies with large amounts of data that is still being processed using really old OLAP systems
is that ones who already have the personnel or time to invest in learning enough of the JVM ecosystem are usually happy with the results
others decide that it's worth the trouble of rolling something in-house because there's too many unknowns all at once
going down the Spark route
not issuing a right/wrong judgment on either
but saying that the key success factor in adopting Spark has been being able to commit to supporting the JVM platform long-term, even in a majority .NET shop
we used Hive in addition to Akka.NET at my last company and had great success with it, but that's because we'd committed to understanding the JVM ecosystem years earlier when we went all-in on Apache Cassandra for our storage solution
later on we were able to port those Hive jobs to a very early version of Spark
anyway, bit of a tangent - but you should think critically about what's the right tool for the job both today and years from now
I absolutely agree. Our company decided to do all new projects in Scala (and there is no rush, it's a strategic decision) because we are really tired of the state of big data in .NET.
that's why stuff like Mobius exists in the first place
even Microsoft threw in the towel on their own big data solutions
Dryad et al
they decided it was easier to port all of those old C# queries to run on top of the JVM via an adapter layer like Mobius
and leverage the benefits of thousands of man-years worth of work there
I've not seen much activity in the Mobius repo tho.
the Mobius project itself is basically a series of transpilation hacks
I agree with you there
I think with many of these OSS projects Microsoft has released lately
stuff that's not core to their business
or to their customers
and it's still scary to use it in production. One of the main points of using Spark is that it's used by lots and lots of large companies, so there is a great chance it will work with zero problems for you.
i.e. Mobius being a good example
they let it languish once they get it to a state where it solves MSFT's internal problems
and don't really commit to supporting it
social capital is a huge part of the value of an OSS ecosystem in general
TBH, I like the Java/Scala community a lot more than .NET and MS as a whole.
(so far :) )
I haven't been on the contribution side of any real JVM project much
Me neither ;)
but the "MS will do everything for us" mindset is unhealthy.
looking at some of the stuff MSFT is doing around .NET Core
i.e. killing off the need for third party libraries for things like dependency injection
helps create that mentality too
"oh, no, another shitty open source library..." - I heard this from C# devs several times.
Another good thing about the JVM - a lot of libraries. It's shocking at first, especially after years of writing F# :)
there does seem to be an awfully large amount of NIH-ing in .NET sometimes
like there is some effort threshold
where if the effort is low enough to just roll your own library, people will do that rather than coalesce around a standard
that's why there's like 30+ Zipkin client implementations in .NET, each missing different features and none of them feature-complete
there should still be some multiple choices / alternatives for libraries - helps keep the ecosystem healthy
but the ecosystem can never really take off if there isn't some knowledge institutionalized in the use of key ones
I'm really glad stuff like DotNetty and Akka.NET exists today; back when I first needed those tools five years ago I had to make them myself