    christophe mogentale
    @cmogentale_twitter

    Hello,

    I'm seeking advice, as I've been struggling for days with problems under heavy load on the Phoenix Query Server.
    To explain a bit more: I have 3 Phoenix Query Servers behind Knox, load-balanced through the HA role in Knox.
    I'm accessing Phoenix through PHP web services using Simba's ODBC driver. When testing each web service individually, everything works fine.
    (The only curious thing is that it takes nearly one second to establish the first connection (odbc_connect) to Phoenix; subsequent odbc_connect calls are very fast.)
    When I open my website, which takes quite a load (but nothing more than a single MySQL server used to handle, so I'm wondering why), within one minute my Apache log fills with errors about failed connections to Phoenix,
    mainly two different errors (I use PHP's error_log function to log what odbc_errormsg returns):

    after a failed odbc_connect:
    S1000 ## [unixODBC][Hortonworks][Phoenix] (40) Error with HTTP request, response code: 500
    after a failed cluster query (which might result from the previous error):
    S1000 ## [Hortonworks][Phoenix] (2100) An error occured while preparing statement: \n8org.apache.calcite.avatica.proto.Responses$ErrorResponse\x12\x1a\x13\n\x1a\x12org.apache.calcite.avatica.NoSuchConnectionException\n\tat org.apache.calcite.avatica.jdbc.JdbcMeta.getConnection(JdbcMeta.java:565)\n\tat org.apache.calcite.avatica.jdbc.JdbcMeta.prepare(JdbcMeta.java:690)\n\tat org.apache.calcite.avatica.remote.LocalService.apply(LocalService.java:209)\n\tat org.apache.calcite.avatica.remote.Service$PrepareRequest.accept(Service.java:1199)\n\ta##select next value for "akinator_device_tags_id" as "nextvalue"

    Would you have any clue or insight about that? Am I missing some important Phoenix config options?
    At first I thought it was write latency linked to index updates in Phoenix, but I still get the errors even when deactivating UPSERT queries and allowing only SELECTs.
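
    For reference, a minimal PHP sketch of the connection-and-logging pattern described above (the DSN name and credentials are placeholders, not from this thread):

        <?php
        // Connect through the Simba ODBC DSN; suppress the warning so we can
        // log the driver's own diagnostics instead.
        $conn = @odbc_connect("PhoenixPQS", "user", "secret");
        if ($conn === false) {
            // odbc_error() returns the SQLSTATE, odbc_errormsg() the driver text,
            // matching the "S1000 ## ..." lines seen in the Apache log
            error_log(odbc_error() . " ## " . odbc_errormsg());
        }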

    Hugo Larcher
    @Hugoch
    Hello Christophe! Knox load balancing is just for HA: in case of backend failure it switches to the next available backend, so one Phoenix server is effectively handling all the load. Depending on the number of queries, it may queue them, leading to timeouts or connection pool exhaustion.
    christophe mogentale
    @cmogentale_twitter
    Hello Hugo, thank you for your answer. Do you think the error message I get is a symptom of a lack of load balancing? Do you know if there are parameters to increase the maximum queue size on Phoenix? I saw parameters linked to threading but I'm not very confident changing them ^^
    Hugo Larcher
    @Hugoch
    Hey @cmogentale_twitter, first I would try to disable AVATICA HA in Knox. The NoSuchConnectionException may mean that Knox switches from one Phoenix server to another, so a connection opened on Phoenix 1 is then reused on Phoenix 2, leading to NoSuchConnectionException. If the error disappears, it means that load balancing (if needed) must be done in front of Knox: spawn multiple Knox servers and install a load balancer in front of them (you may use the existing Nginx on your LB); a sketch follows below.
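
    As a rough illustration, an Nginx front end for several Knox instances could look like this (hostnames, port, and certificate paths are assumptions; ip_hash keeps each client on the same Knox so Avatica connections are not bounced between backends):

        upstream knox {
            ip_hash;                          # sticky: same client -> same Knox
            server knox1.example.local:8443;
            server knox2.example.local:8443;
        }
        server {
            listen 443 ssl;
            ssl_certificate     /etc/nginx/lb.crt;
            ssl_certificate_key /etc/nginx/lb.key;
            location / {
                proxy_pass https://knox;
            }
        }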
    christophe mogentale
    @cmogentale_twitter
    Hello Hugo, thank you very much, I will try to disable HA first.
    christophe mogentale
    @cmogentale_twitter
    Hello, I recently ran into a problem when launching queries fetching massive amounts of data from Phoenix on the cluster. I had to increase the proxy timeout values in nginx.conf, and I also had to add parameters in the custom gateway-site for Knox:
    gateway.httpclient.connectionTimeout,
    gateway.httpclient.socketTimeout,
    httpclient.connectionTimeout,
    httpclient.socketTimeout,
    all set to 600000 to avoid the problem. I thought it might be helpful to add these settings to the default config for future clusters.
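
    For reference, a sketch of what those Knox entries look like in gateway-site XML (the 600000 ms value comes from the message above; the httpclient.* pair is set the same way as the two shown):

        <property>
          <name>gateway.httpclient.connectionTimeout</name>
          <value>600000</value>
        </property>
        <property>
          <name>gateway.httpclient.socketTimeout</name>
          <value>600000</value>
        </property>

    On the Nginx side, the matching directives are presumably the proxy timeout family, e.g. proxy_read_timeout 600s; and proxy_send_timeout 600s;.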
    Hugo Larcher
    @Hugoch
    Hello Christophe! Right, timeouts can be difficult to deal with for such big queries! It totally makes sense to raise them for long-polling endpoints. Just take care of possible connection pool exhaustion if you have many clients 😊 We will add that to the documentation, thanks!
    Vincent Rodier
    @RodierVincent_twitter
    Hello, I deployed a Data Analytics Platform with the OVH API and captured the bastion IP to connect to the cluster and other nodes. Is there a way to retrieve or generate the FreeIPA admin password from the OVH API? I need it to create an account to gain access to the other cluster nodes. Since I need to automate this, I cannot do it manually through the Public Cloud web interface. Thanks
    impolitepanda
    @impolitepanda
    Hello! The generation of passwords is done only through the web page. This is for security reasons: this way we at OVHcloud never have access to your passwords in cleartext.
    I'll find the JavaScript code used to generate what you need, so you'll be able to adapt it for your tool.
    This is the link to the open-source code of the page generating the credentials. It should help you design what you need.
    Vincent Rodier
    @RodierVincent_twitter
    thanks, I will check that
    Vincent Rodier
    @RodierVincent_twitter
    Hello, I have another question. When deploying an analytics platform, SSH on the bastion node is in interactive mode only by default. How can I automate the config change to allow non-interactive SSH with commands or scp? I can do it manually in /etc/ssh/sshd_config, but is there any kind of OVH action script to do it automatically when the node starts?
    thanks
    Vincent
    impolitepanda
    @impolitepanda
    Hello @RodierVincent_twitter. For audit purposes, the bastion logs everything that goes through it, and that's why SSH is in interactive mode only. For this reason, we don't have any tool/script to change it to non-interactive, sorry.
    Vincent Rodier
    @RodierVincent_twitter
    Hello, OK. I saw a userData field on the API at /cloud/project/{serviceName}/instance. Is it for this purpose, i.e. running a script when the instance first launches? If so, would it be possible to have this in the future for analytics platform instances?
    thanks
    Because at the moment it's impossible to automate an analytics platform deployment plus install scripts on the nodes without manual action.
    Pirionfr
    @Pirionfr
    Hello @RodierVincent_twitter, we may have found a solution for you. We are running some tests on it.
    Pirionfr
    @Pirionfr

    Hello @RodierVincent_twitter,

    Interactive mode is required for audit logs on the bastion, for security purposes. The problem with this feature is that it breaks TCP forwarding, and you can't use tools like Ansible to deploy/update tools onto instances of this ADP cluster.

    You have two options to bypass that:

    1. Deactivate interactive mode on the bastion. That means no more auditing.
    2. Add a new host in the same private network, dedicated to your deployment for example.

    1- Deactivate interactive mode on bastion - Step-by-step guide

    1. Edit /etc/ssh/sshd_config
    2. Comment out these lines
      #ForceCommand /opt/bastion/bastion
      #AllowTcpForwarding no
    3. Restart sshd
      systemctl restart sshd
      This change is permanent: no code forces interactive mode back on at boot time.

    Of course this deactivation can be done temporarily. An automated procedure, based on Ansible for example, can be set up to deactivate interactive mode just for the time it takes to apply some changes (see the sketch after this list):

    • Connect to bastion and disable interactive mode (See above)
    • Apply your changes onto instances of your ADP cluster
    • Reactivate the interactive mode
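
    A minimal shell sketch of that temporary toggle, under the assumption that sed's .bak backup is an acceptable way to restore the original file:

        # Disable interactive mode: comment out the two directives (keeps a backup)
        sed -i.bak -e 's/^ForceCommand/#ForceCommand/' \
                   -e 's/^AllowTcpForwarding/#AllowTcpForwarding/' /etc/ssh/sshd_config
        systemctl restart sshd

        # ... apply your changes to the ADP instances here ...

        # Re-enable interactive mode by restoring the original config
        mv /etc/ssh/sshd_config.bak /etc/ssh/sshd_config
        systemctl restart sshd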

    2 - Create a new bastion - Step-by-step guide

    1. Create a new instance in Horizon with these requirements
      • same Public cloud project
      • one public network interface and use the same private network (vrack) for the second one
      • same region
      • same OS (CentOS 7), not mandatory
      • same public key as the one used to spawn your ADP cluster
    2. Now, you can use this new instance as a bastion (a CLI sketch of step 1 follows below)
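
    For those scripting this step, a hypothetical equivalent with the OpenStack CLI (image, flavor, network, and key names are assumptions to adapt to your project):

        openstack server create \
          --image "Centos 7" \
          --flavor b2-7 \
          --key-name adp-key \
          --network Ext-Net \
          --network my-private-vrack \
          new-bastion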

    This instance is not registered in FreeIPA, so DNS resolution and connection setup can take a long time.

    If you use tools like Ansible to automate deployment, you first have to increase the SSH connection timeout, or add an entry with the private IP of this new bastion to /etc/hosts on all the instances you want to connect to.
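
    A sketch of both options (the timeout value and IP address are assumptions):

        # ansible.cfg: raise the SSH connection timeout
        [defaults]
        timeout = 60

        # or, /etc/hosts entry on each target instance:
        10.1.2.3  new-bastion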

    Vincent Rodier
    @RodierVincent_twitter
    Hello, thanks, option 1 is what I was doing, but my problem is that it needs manual intervention. I was wondering if there was a kind of cloud-init script available to run these operations on the bastion. My goal is to chain the data platform deployment and our Ansible run without any intervention.
    Best regards
    Vincent
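
    For context, the kind of user-data Vincent is asking about would look roughly like this, assuming the bastion image runs cloud-init and the API's userData field is honored for ADP instances (which this thread does not confirm):

        #cloud-config
        runcmd:
          # comment out the directives that force interactive-only SSH
          - sed -i -e 's/^ForceCommand/#ForceCommand/' -e 's/^AllowTcpForwarding/#AllowTcpForwarding/' /etc/ssh/sshd_config
          - systemctl restart sshd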
    Vincent Rodier
    @RodierVincent_twitter

    This is the link to the opensource code of the page generating the credentials. It should help you in designing what you need.

    Hello, and about generating passwords for the cluster: I saw the code on your GitHub. It is the code to create the passwords, but I didn't find the endpoint to send the passwords to my OVH serviceName (analytics cluster ID). Does this endpoint exist, for a POST or PUT request?

    Vincent
    Arthur Lutz
    @arthurlogilab_gitlab
    Hi, I'm interested in the Jupyter notebook feature mentioned in an #ecosystemexperience presentation.
    Is there a way to follow that somewhere?
    brugere
    @brugere
    Hello Arthur! I guess you found the right place for follow-up :)
    If you have specific needs, I invite you to describe them so that we can add them to upcoming work.
    For data-processing-specific needs there is https://gitter.im/ovh/data-processing
    baaastijn
    @baaastijn
    Also @arthurlogilab_gitlab, just to be sure: are you the one who requested notebooks for NLP?
    Because in 3 weeks we will have the AI Training release:
    Jupyter notebooks plugged into GPUs, with a dev environment installed (TensorFlow and co).
    It's not made for Apache Spark, but still, don't hesitate if you also want to talk about that!
    Nabil Ameziane
    @nameziane_gitlab
    Hello,
    Is there a possibility to use HDP 3.x.x on the Data Analytics Platform? Or to upgrade the service in the future?
    baaastijn
    @baaastijn
    @nameziane_gitlab hello! It's not possible so far, and not planned to be provided directly,
    but you are admin (root) on the platform, so it's possible to upgrade it manually.
    Nabil Ameziane
    @nameziane_gitlab
    @baaastijn Thank you for your answer. I also wanted to know if there is a possibility to connect Hive on OVH-HDP to Object Storage, for example?
    baaastijn
    @baaastijn
    Yes, you can do that via the S3 protocol (our Object Storage is compliant).
    Nabil Ameziane
    @nameziane_gitlab
    @baaastijn Ah OK, thank you for your answer.
    baaastijn
    @baaastijn
    Create S3 credentials: https://docs.ovh.com/gb/en/public-cloud/getting_started_with_the_swift_S3_API/
    Then, in your core-site.xml, add this code:

    `<configuration>
      <property>
        <name>fs.s3a.access.key</name>
        <value>myS3AccessKey</value>
      </property>

      <property>
        <name>fs.s3a.secret.key</name>
        <value>myS3SecretKey</value>
      </property>

      <property>
        <name>fs.s3a.endpoint</name>
        <value>https://s3.<public cloud region>.cloud.ovh.net</value>
      </property>

      <property>
        <name>fs.s3a.path.style.access</name>
        <value>true</value>
      </property>
    </configuration>`
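
    As a quick check (the bucket name below is hypothetical), listing the bucket through HDFS tooling should now work:

        hdfs dfs -ls s3a://my-bucket/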

    Nabil Ameziane
    @nameziane_gitlab
    @baaastijn Nice thank you :)
    baaastijn
    @baaastijn
    Do the same configuration in hive-site.xml and restart HDFS + Hive.
    You can then access your S3 data via:
    `CREATE EXTERNAL TABLE IF NOT EXISTS <schema_name.table_name>
    (<column_name> STRING)
    LOCATION 's3a://<your-S3-bucket>/';`
    have a good day !
    Nabil Ameziane
    @nameziane_gitlab
    @baaastijn Thank you, it works!
    baaastijn
    @baaastijn
    np :)