    Pierre Gronlier
    @ticapix
    Hello
    Can I specify the number of executors (VMs) with ovh-spark-submit?
    (At the moment, the bottleneck of my job is the network, fetching the data from Swift. If I can have more VMs, I can increase my total bandwidth.)
    Mojtaba Imani
    @mojtabaimani
    Hello. No, the --num-executors option is for YARN only. Here we use the "standalone" cluster manager.
    You can get more VMs with the --total-executors-cores option.
    Pierre Gronlier
    @ticapix
    By default, do we know what kind of VM it is spawning?
    Mojtaba Imani
    @mojtabaimani
    b2-15
    each VM has 4 cores
    There is always +1 VM for the master node. You can do the math.
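
The math mentioned above can be sketched as follows. This is only an illustration based on the figures given in the chat (4 cores per b2-15 VM, plus one master VM); the function name `vm_count` is hypothetical, and rounding up when the core count is not a multiple of 4 is an assumption, not confirmed behavior of ovh-spark-submit.

```python
import math

CORES_PER_VM = 4  # b2-15 flavor, as stated in the chat


def vm_count(total_executor_cores: int, cores_per_vm: int = CORES_PER_VM) -> int:
    """Return the number of worker VMs plus one master VM.

    Assumption: a partially used VM still counts as a whole VM (round up).
    """
    workers = math.ceil(total_executor_cores / cores_per_vm)
    return workers + 1  # +1 for the master node


for cores in (16, 32, 64):
    print(cores, "cores ->", vm_count(cores), "VMs")
```

Under these assumptions, asking for 64 cores means 16 workers and 1 master, so the quota has to cover 17 servers.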
    Pierre Gronlier
    @ticapix
    ok
    ovh-spark-submit --properties-file config.conf --total-executors-cores 64 --class DemoWiki target/scala-2.11/DemoWiki-assembly-1.0.jar
    flag provided but not defined: -total-executors-cores
    Mojtaba Imani
    @mojtabaimani
    remove the "s" after executor
    --total-executor-cores
    Pierre Gronlier
    @ticapix
    yep, better
    Mojtaba Imani
    @mojtabaimani
    Just make sure that you have enough quota to create the number of servers you want.
    Pierre Gronlier
    @ticapix

    Creating servers...
    Error creating server: 3915b6f5-1e3b-4e1d-922a-b5863800b163
    Error creating server

    +-----------------------------------------+--------+----------+
    | Name                                    | Status | Networks |
    +-----------------------------------------+--------+----------+
    | 3915b6f5-1e3b-4e1d-922a-b5863800b163-17 | ERROR  |          |
    | 3915b6f5-1e3b-4e1d-922a-b5863800b163-16 | ERROR  |          |
    ....
    | 3915b6f5-1e3b-4e1d-922a-b5863800b163-2  | ERROR  |          |
    | 3915b6f5-1e3b-4e1d-922a-b5863800b163-1  | ERROR  |          |
    +-----------------------------------------+--------+----------+
    Server creating error.
    Deleting all servers...

    Mojtaba Imani
    @mojtabaimani
    Do you have enough quota?
    Pierre Gronlier
    @ticapix
    checking
    yes, quota looks good
    My guess would be an underlying 500 error from OpenStack when spawning the VM.
    Pierre Gronlier
    @ticapix
    maybe not. It also fails with 32
    also failing with 16
    Mojtaba Imani
    @mojtabaimani
    Did you succeed in running the SparkPI sample?
    Pierre Gronlier
    @ticapix
    IIRC, yes.
    Mojtaba Imani
    @mojtabaimani
    Would you please repeat the SparkPI test with 16 or 32 cores?
    Mojtaba Imani
    @mojtabaimani
    Maybe there is a problem with PCI or with a specific region. Would you please try another region?
    You can change your region by editing the OS_REGION_NAME environment variable in your openrc.sh.
    I just tried UK1 and it worked.
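
A minimal sketch of the region change described above, assuming openrc.sh contains a line of the form `export OS_REGION_NAME="..."` (the usual layout of an OpenStack RC file, but not shown in this chat). The helper name `set_region` and the sample file contents are hypothetical.

```python
import re


def set_region(openrc_text: str, region: str) -> str:
    """Replace the value on the OS_REGION_NAME export line, leaving other lines intact."""
    pattern = re.compile(r'^(\s*export\s+OS_REGION_NAME=).*$', re.MULTILINE)
    # A function replacement avoids any backslash handling in the region string.
    new_text, count = pattern.subn(lambda m: f'{m.group(1)}"{region}"', openrc_text)
    if count == 0:
        raise ValueError("no OS_REGION_NAME export line found")
    return new_text


# Hypothetical openrc.sh contents, for illustration only.
sample = 'export OS_TENANT_NAME="example"\nexport OS_REGION_NAME="GRA5"\n'
print(set_region(sample, "UK1"))
```

After editing the file, it has to be sourced again in the shell so that the new value is exported before the next ovh-spark-submit run.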
    Pierre Gronlier
    @ticapix
    indeed, sample seems to work in SBG5
    I was using GRA5
    Mojtaba Imani
    @mojtabaimani
    I received an error in GRA5 as well.
    Can you run your cluster in a region other than GRA5?
    Mojtaba Imani
    @mojtabaimani
    Did you succeed in creating servers in another region?
    Pierre Gronlier
    @ticapix
    yes
    Mojtaba Imani
    @mojtabaimani
    ok, good
    Pierre Gronlier
    @ticapix
    Hello :)
    I managed to count the number of English Wikipedia pages. It took 10 minutes with 16x b2-15
    and there are 19441539 pages :)
    Server creation time = 98
    Spark setup time = 78
    Cluster creation time (Server creation + Spark Setup) = 176
    Spark application time = 439
    Total time = 634
    Mojtaba Imani
    @mojtabaimani
    Great
    So 3 minutes for cluster creation and 7 minutes for the Spark computation.
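
The rounding above can be checked with a short calculation. This sketch assumes the reported timings are in seconds, which the log does not state explicitly; the variable names are illustrative.

```python
# Timings reported in the chat, assumed to be in seconds.
server_creation = 98
spark_setup = 78
spark_application = 439
total = 634

cluster_creation = server_creation + spark_setup
# Part of the total that is not covered by the reported components.
unaccounted = total - (cluster_creation + spark_application)

print(f"cluster creation:  {cluster_creation} s = {cluster_creation / 60:.1f} min")
print(f"Spark application: {spark_application} s = {spark_application / 60:.1f} min")
print(f"total:             {total} s = {total / 60:.1f} min")
print(f"unaccounted:       {unaccounted} s")
```

The components add up to 615 of the 634 reported; the log does not say what the remaining 19 cover.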
    Rémi Alvado
    @remialvado_gitlab
    Hello. I'm going to do some tests with Analytics Data Compute to stream data from our MongoDB cluster to Google BigQuery for analytics. Our MongoDB cluster only listens on our vRack. I know we can attach a Public Cloud instance to a vRack, but is it possible to do it with ovh-spark-command?
    baaastijn
    @baaastijn
    hi @remialvado_gitlab , we’ve seen your question and you’ll have a full answer tomorrow
    Rémi Alvado
    @remialvado_gitlab
    Hi @baaastijn and thanks :)
    Mojtaba Imani
    @mojtabaimani
    Hello @remialvado_gitlab, in ADC it is possible to create the Spark cluster inside your vRack, so the cluster will have access to your resources in that vRack.
    Rémi Alvado
    @remialvado_gitlab
    Nice! Is there any documentation about this feature?
    Mojtaba Imani
    @mojtabaimani
    First I need your tenant ID to give you some quota for floating IPs. Would you please send me your tenant ID?
    Mojtaba Imani
    @mojtabaimani
    To create a Spark cluster in a vRack private network, we currently use floating IP and virtual router technology, which is in beta at OVH and is not activated by default for users. But we can enable these features for a customer if needed, so we can activate them for a tenant ID.
    Mateusz Worotyński
    @worotyns_twitter
    Hi everyone, have a nice day :)
    Mojtaba Imani
    @mojtabaimani
    Hello