    I've just come across this project, and as a long-time .NET dev who's just getting started with Cassandra/Spark deployments, this is very compelling. I have a few questions: is the Mobius project going to support .NET Core on Linux (or does it already)? We've made an investment into DNX already, and it would be interesting to reuse this within Spark. And, additionally, is it possible to incorporate external libraries into the Spark worker - for example, processing a video frame using ffmpeg?
    Kaarthik Sivashanmugam
    @kirk-marple Mobius is built on Linux and unit tests are run as a part of CI. We leverage Mono for this. Beyond this we have not done any validation on Linux. With Mobius you should be able to use existing .NET libraries. If your video processing library is built for .NET you can use it.
    When I call var sparkContext = new SparkContext(new SparkConf().SetAppName("Sample App")), I get this exception: "System memory 259522560 must be at least 4.718592E8. Please use a larger heap size." Is there anything I need to set?
    Kaarthik Sivashanmugam
    Are you using Xmx setting for JVM?
    No, I don't think so. I am using the SparkClrSubmit.cmd command that we have in the SparkCLR solution.
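
    For readers who hit the same exception: the threshold 4.718592E8 in the message matches Spark's unified memory manager check, which reserves 300 MB and requires the JVM to see at least 1.5 times that amount (450 MB). The reported 259522560 bytes is about 247.5 MB, so the JVM heap is too small. The sketch below only reproduces that arithmetic; the class and method names are hypothetical and are not part of Mobius or Spark.

```csharp
using System;

// Hypothetical helper: reproduces the arithmetic behind the
// "System memory ... must be at least 4.718592E8" message.
public static class HeapCheck
{
    // Spark's reserved system memory: 300 MB.
    const long ReservedBytes = 300L * 1024 * 1024;

    // Spark requires at least 1.5x the reserved memory.
    public static long MinSystemMemory() => (long)(ReservedBytes * 1.5);

    public static void Main()
    {
        long reported = 259522560; // value taken from the exception message
        long required = MinSystemMemory();

        Console.WriteLine($"required = {required} bytes ({required / (1024 * 1024)} MB)");
        Console.WriteLine($"reported = {reported} bytes ({reported / (1024.0 * 1024.0):F1} MB)");
        Console.WriteLine(reported >= required ? "heap is large enough" : "heap is too small");
    }
}
```

    The usual remedy is to give the driver JVM a larger heap, for example with spark-submit's --driver-memory option. Whether SparkClrSubmit.cmd forwards that option unchanged in this setup is an assumption worth checking.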
    Scott Lyons
    Hello from #sparksummit
    Kaarthik Sivashanmugam
    Hi Scott - I missed the opportunity to chat with you after the talk. There were a lot of questions after the talk and I guess you had to leave for your next session.
    Scott Lyons
    Unfortunately I had to catch my flight or I would have stayed
    Kaarthik Sivashanmugam
    @slyons - good that we at least had the opportunity to say Hi before the talk...thanks for stopping by and introducing yourself

    Hi Kaarthik. I just cloned your Git repo. I am not really a Spark developer, so my question might be fairly elemental. I got everything building, and the unit tests run just fine. What I want to do is debug the Pi sample from within Visual Studio 2015. I first opened up a Visual Studio Developer's Command Prompt and as per your debug instructions, ran the command "sparkclr-submit.cmd debug". I got the message "[CSharpRunner.main] Backend running debug mode. Press enter to exit" .... so far, so good. Then, I opened the SparkCLR.sln solution in Visual Studio, set the startup program to the "Samples" project, and in the project properties, I set the arguments to "--torun pi*". I then started the debug session. After the 300,000 integer array was initialized, I got these errors:

    JVM method execution failed: Static method collectAndServe failed for class org.apache.spark.api.python.PythonRDD when called with 1 parameters ([Index=1, Type=JvmObjectReference, Value=13], )
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketException: Connection reset by peer: socket write error

    Do you have any idea? I am sure that I need to set some jar file path or class path somewhere.

    Kaarthik Sivashanmugam
    @magmasystems - it looks like you have not set CSharpWorkerPath in app.config. Refer to https://github.com/Microsoft/Mobius/blob/master/notes/running-mobius-app.md#debug-mode for instructions
    If you are running the RDD samples/examples in the Mobius repo in debug mode, CSharpWorkerPath is required. DataFrame examples that do not use UDFs do not require a CSharpWorkerPath value to be specified.
    Thanks very much, @skaarthik . That did the trick. I did have the CSharpWorkerPath value specified, but the path was to another version of CSharpWorker.exe in the root directory.
    Hi all
    Hi, I am new to Spark. How can I submit the word count (Mobius driver) program to Azure HDInsight Spark?
    Stefan Mršević
    I am experiencing a similar problem to @magmasystems. I have CSharpWorkerPath set up as well as the port, and I have tried both the version from my /bin/debug and the driver version. I am trying to run the Pi program.
    The exact error is:
    {"JVM method execution failed: Static method collectAndServe failed for class org.apache.spark.api.python.PythonRDD when called with 1 parameters ([Index=1, Type=JvmObjectReference, Value=13], )"}
    I have also tried with the temp parameter but it does not do any good as far as I have seen. I have completely disabled my AV software and I have also disabled MS Defender.
    Loch Wansbrough
    Is it possible to use Mobius to create a custom RDD? And if so, is there an example of that? I essentially need spark-redis but written in C# with Mobius.
    Loch Wansbrough
    I see now that Mobius is essentially proxying requests to Spark. So I believe I'd need to have Mobius build spark with spark-redis, then I'd have to write the proxying code. Correct?
    Hi guys, I want to do some math using Mobius. I got the rdd, similar to the Pi example and I want to compute factorial of each value in the Map function. But it seems that I cannot include the for loop inside the map function. Is there any way to compute the factorial?
    Kaarthik Sivashanmugam
    @lwansbrough - correct
    Kaarthik Sivashanmugam
    @MarWin93 - try using Reduce() on your RDD
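
    A note on the factorial question: in plain C#, a lambda may have a statement body, so a for loop inside a mapping function is legal, and the same product can also be written as a fold, which is closer in spirit to the Reduce() suggestion. The sketch below uses LINQ's Select and Aggregate on a local array as stand-ins for the RDD operations. It does not call Mobius, and whether Mobius can serialize a given lambda to the worker is an assumption it does not test. All names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Numerics;

// Hypothetical sketch: two ways to compute a factorial per element.
public static class FactorialSketch
{
    // Loop-based factorial, usable as a named mapping function.
    public static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
        {
            result *= i;
        }
        return result;
    }

    // Fold-based factorial: multiply the numbers 1..n together.
    public static BigInteger FactorialViaAggregate(int n) =>
        Enumerable.Range(1, n).Aggregate(BigInteger.One, (acc, i) => acc * i);

    public static void Main()
    {
        var values = new[] { 0, 1, 5, 10 };

        // Select stands in for a map step over the data set.
        var looped = values.Select(v => Factorial(v)).ToArray();
        var folded = values.Select(v => FactorialViaAggregate(v)).ToArray();

        Console.WriteLine(string.Join(", ", looped));
        Console.WriteLine(looped.SequenceEqual(folded));
    }
}
```

    Moving the loop into a named static method, rather than an inline lambda, keeps the mapping function free of captured local state.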
    Hi Karthik..
    StreamingContext.GetOrCreate method throws an exception when checkpointpath is empty.
    Ron DeFreitas
    Hi all, I'm curious if anyone is working on getting Mobius working on Spark 2.1? I'm targeting the latest HDInsight and would love to leverage C# for building Spark applications.
    Hi all, has anyone tried Mobius with Azure Data Warehouse?
    Hi all, has anyone tried Mobius with Azure Data Warehouse or with Azure Cosmos?
    Bill Berry
    Hey All ... just created this issue and would love some assistance: Microsoft/Mobius#679
    Hi all, I have a large amount of data in SQL Server that I want to migrate to Cassandra. I already have a multi-threaded tool (written in C#) to migrate that historical data, but it is really slow. Can I use Mobius (Spark) to read the data from SQL Server faster and write it to Cassandra? Is that possible? How can I do it? Thanks, all.
    @joaocepa94 Hi, have you solved your problem?
    Hi. I'm curious why there are no basic string functions like "trim" included in the Functions class to be used with the .WithColumn method. I think it's supported in Apache Spark: https://spark.apache.org/docs/2.0.0/api/java/org/apache/spark/sql/functions.html
    Apostolos Daniel Apostolidis

    Hi all, I'm looking to try out Mobius as a way of trying out Spark to build a simple ETL. Our data sources are mostly SQL Server, a little bit of Oracle and some CSV files. I have just started reading up on Mobius but was curious to see if this is a solution that could work. I should also mention that we use Entity Framework extensively. However, my understanding is that pure SQL or JSON files is how Mobius/Spark is used.

    Many thanks in advance!


    I'm having trouble using the C# Mobius Spark package. I'm fairly new to C# and am using .NET Core. I'm getting the error

    Unhandled Exception: System.MissingMethodException: Method not found: 'System.AppDomainSetup System.AppDomain.get_SetupInformation()'.
    at Microsoft.Spark.CSharp.Services.LoggerServiceFactory.GetDefaultLogger()
    at System.Lazy`1.ViaFactory(LazyThreadSafetyMode mode)
    at System.Lazy`1.ExecutionAndPublication(LazyHelper executionAndPublication, Boolean useDefaultConstructor)
    at System.Lazy`1.CreateValue()
    at Microsoft.Spark.CSharp.Core.SparkConf..ctor(Boolean loadDefaults)
    at Microsoft.Spark.CSharp.Examples.SparkProcessor.Main(String[] args) in /Users/jokim/workspace/IFFParserSpark/SparkProcessor/SparkProcessor/Program.cs:line 20

    Is the package not compatible with .NET Core? Am I doing something wrong, or is there somewhere I can change the framework?

    I tried using Mono but I keep getting

    Program.cs(5,17): error CS0234: The type or namespace name 'Spark' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)
    Program.cs(6,17): error CS0234: The type or namespace name 'Spark' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)

    when doing csc Program.cs

    using Microsoft.Spark.CSharp.Core;
    using Microsoft.Spark.CSharp.Streaming;

    I added the package and it exists, so in the editor nothing is red and everything compiles, but when I try to run it using Mono I keep getting that error. I'm using a Mac.

    I have a requirement of using c# in Azure Databricks notebooks
    How can I do that?
    Can you reply at the earliest?