    Hugo Larcher
    @Hugoch
    you then need to restart Knox, I am going to send you the command ... 1 sec
    christophe mogentale
    @cmogentale_twitter
    thank you
    christophe mogentale
    @cmogentale_twitter
    I managed to restart Knox by logging in as the knox user and launching /usr/hdp/2.6..../knox/bin/gateway.sh start. Ambari is displaying something again but acting as if no service was installed; I land on a page where I must choose services to install
    mmm, Ambari finally returned. I'm trying to "start all services", fingers crossed ... thank you Hugo
    Hugo Larcher
    @Hugoch

    ok so normally Knox must be restarted through the Ambari API; you can use:

    export CLUSTER_ID=XXXXX
    export CLUSTER_NAME=XXXXXX
    
    echo '{"RequestInfo":{"command":"RESTART","context":"OVH - Restart Knox gateway","operation_level":{"level":"SERVICE","cluster_name":"$CLUSTER_NAME","service_name":"KNOX"}},"Requests/resource_filters":[{"service_name":"KNOX","component_name":"KNOX_GATEWAY","hosts":"ovh-mnode0.$CLUSTER_ID.datalake.ovh"}]}' \
      | sed "s/\$CLUSTER_NAME/$CLUSTER_NAME/g" \
      | sed "s/\$CLUSTER_ID/$CLUSTER_ID/g" \
      | curl "http://ovh-mnode0.$CLUSTER_ID.datalake.ovh:8080/api/v1/clusters/$CLUSTER_NAME/requests" \
          -H "Content-type: text/plain" \
          -H "X-Requested-By: ambari" \
          -d @-

    replace XXXXX with your cluster ID (a UUID) and your cluster name

    the way you restarted Knox kinda "works", but it uses the wrong JVM, so it may not behave as expected
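
    The sed pipeline in the snippet above only templates the two shell variables into the JSON body before curl posts it; a minimal, standalone illustration of that templating step (values here are hypothetical placeholders):

    ```shell
    # Hypothetical values, just to demonstrate the templating step.
    export CLUSTER_ID=0000-demo
    export CLUSTER_NAME=demo-cluster

    # Single quotes keep $CLUSTER_ID literal in the template;
    # sed then substitutes the real value before the body reaches curl.
    echo 'ovh-mnode0.$CLUSTER_ID.datalake.ovh' \
      | sed "s/\$CLUSTER_ID/$CLUSTER_ID/g"
    # → ovh-mnode0.0000-demo.datalake.ovh
    ```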
    christophe mogentale
    @cmogentale_twitter
    Thank you Hugo, I will do as you recommend then
    Hugo Larcher
    @Hugoch
    perfect, wait for the services to restart, then you will restart Knox from Ambari
    take care when you restart all services: the problem is that it restarts Knox too
    christophe mogentale
    @cmogentale_twitter
    I will be more careful next time, I promise!
    I was a bit scared ^^
    Hugo Larcher
    @Hugoch
    so you will lose connectivity with Ambari while Knox restarts (i.e. your "restart Knox" task will seem blocked)
    in that case wait a few minutes, then just refresh the page
    yeah, I get it ^^
    christophe mogentale
    @cmogentale_twitter
    Thank you Hugo, I will follow your tips
    Hugo Larcher
    @Hugoch
    you're welcome
    let me know if it works
    christophe mogentale
    @cmogentale_twitter
    you saved my day :)
    I will
    christophe mogentale
    @cmogentale_twitter
    It seems ok except for the Flume agents. It displays 0 live, 0 dead, and I can only stop the service
    Hugo Larcher
    @Hugoch
    well, if you did not install any Flume agent, that seems ok
    christophe mogentale
    @cmogentale_twitter
    before, I had 4/4 (I think it was this way from the start)
    I'm not using Flume, but it bothers me if it can indicate I missed something
    christophe mogentale
    @cmogentale_twitter

    Hello,

    I'm seeking advice, as I've been struggling for days with problems under heavy load on the Phoenix Query Server.
    To explain a bit more: I have 3 Phoenix Query Servers behind Knox, load balanced through the HA role in Knox.
    I'm accessing Phoenix through PHP web services using Simba's ODBC driver. When testing each web service, everything works fine.
    (The only curious thing is that it takes nearly one second to establish the first connection (odbc_connect) to Phoenix; after that, odbc_connect is very fast.)
    When I open my website, which takes quite a load (but nothing a single MySQL server couldn't handle, so I'm wondering why), within one minute my Apache log fills with errors about failed connections to Phoenix:
    mainly two different errors (I use PHP's error_log function to log in these files what odbc_error_message returns):

    after a failed odbc_connect:
    S1000 ## [unixODBC][Hortonworks][Phoenix] (40) Error with HTTP request, response code: 500
    after a failed cluster query (which might result from the previous error):
    S1000 ## [Hortonworks][Phoenix] (2100) An error occured while preparing statement: \n8org.apache.calcite.avatica.proto.Responses$ErrorResponse\x12\x1a\x13\n\x1a\x12org.apache.calcite.avatica.NoSuchConnectionException\n\tat org.apache.calcite.avatica.jdbc.JdbcMeta.getConnection(JdbcMeta.java:565)\n\tat org.apache.calcite.avatica.jdbc.JdbcMeta.prepare(JdbcMeta.java:690)\n\tat org.apache.calcite.avatica.remote.LocalService.apply(LocalService.java:209)\n\tat org.apache.calcite.avatica.remote.Service$PrepareRequest.accept(Service.java:1199)\n\ta##select next value for "akinator_device_tags_id" as "nextvalue"

    Would you have any clue or insight about this? Am I missing some important Phoenix config options?
    At first I thought it was write latency, linked to index updates in Phoenix, but I also get the errors even when deactivating upsert queries, allowing only select ones.

    Hugo Larcher
    @Hugoch
    Hello Christophe! Knox load balancing is just for HA: in case of backend failure, it switches to the next available backend. So a single Phoenix server is effectively handling all the load. Depending on the number of queries, it may queue them, leading to timeouts or connection pool exhaustion.
    christophe mogentale
    @cmogentale_twitter
    Hello Hugo, thank you for your answer. Do you think the error message I get is a symptom of a lack of load balancing? Do you know if there are parameters to increase the maximum queue size on Phoenix? I saw parameters linked to threading but I'm not very confident changing them ^^
    Hugo Larcher
    @Hugoch
    Hey @cmogentale_twitter, first I would try to disable Avatica HA in Knox. The NoSuchConnectionException may mean that Knox switches from one Phoenix to another, so a connection opened on Phoenix 1 is then used on Phoenix 2, leading to the NoSuchConnectionException. If the error disappears, it means that load balancing (if needed) must be done in front of Knox: you must spawn multiple Knox servers and install a load balancer in front of them (you may use the existing Nginx on your LB)
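
    A load balancer in front of several Knox instances could look roughly like this in Nginx (a sketch only: hostnames and ports are hypothetical placeholders, and TLS details are elided; ip_hash pins each client to one Knox instance, which matters here since Avatica connections are stateful and must not hop between backends):

    ```nginx
    # Sketch: hostnames/ports below are hypothetical placeholders.
    upstream knox_gateways {
        ip_hash;                      # pin each client IP to one Knox instance
        server knox1.internal:8443;
        server knox2.internal:8443;
    }

    server {
        listen 8443 ssl;
        # ssl_certificate / ssl_certificate_key directives go here

        location / {
            proxy_pass https://knox_gateways;
        }
    }
    ```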
    christophe mogentale
    @cmogentale_twitter
    Hello Hugo, thank you very much, I will try to disable HA first
    christophe mogentale
    @cmogentale_twitter
    Hello, I recently ran into a problem when launching queries fetching massive amounts of data from Phoenix on the cluster. I had to increase the proxy timeout values in nginx.conf, and I also had to add parameters in the custom gateway-site for Knox: gateway.httpclient.connectionTimeout, gateway.httpclient.socketTimeout, httpclient.connectionTimeout and httpclient.socketTimeout, all set to 600000, to avoid the problem. I thought it might be helpful to add these settings to your default config for future clusters
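
    For reference, outside Ambari those Knox settings would live in gateway-site.xml; a fragment with the values from the message above (600000 ms = 10 minutes) might look like this:

    ```xml
    <!-- Knox gateway-site.xml fragment; values in milliseconds,
         taken from the message above (600000 ms = 10 min) -->
    <property>
      <name>gateway.httpclient.connectionTimeout</name>
      <value>600000</value>
    </property>
    <property>
      <name>gateway.httpclient.socketTimeout</name>
      <value>600000</value>
    </property>
    ```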
    Hugo Larcher
    @Hugoch
    Hello Christophe! Right, timeouts can be difficult to deal with for such big queries! It totally makes sense to raise them for long-polling endpoints. Just take care of possible connection pool exhaustion if you have many clients 😊 We will add that to the documentation, thx!
    Vincent Rodier
    @RodierVincent_twitter
    Hello, I deployed a Data Analytics Platform with the OVH API and captured the bastion IP to connect to the cluster and other nodes. Is there a way to retrieve or generate the FreeIPA admin password from the OVH API? I need it to create an account to gain access to the other cluster nodes. As I need to automate this, I cannot use the Public Cloud web interface manually. Thanks
    impolitepanda
    @impolitepanda
    Hello! Password generation is done only through the webpage. This is for security reasons: this way we, at OVHcloud, never have access to your passwords in clear text.
    I'll find the JavaScript code used to generate what you need, so you'll be able to adapt it for your tool
    This is the link to the open-source code of the page generating the credentials. It should help you design what you need.
    Vincent Rodier
    @RodierVincent_twitter
    thanks, I will check that
    Vincent Rodier
    @RodierVincent_twitter
    Hello, I have another question. When deploying an analytics platform, by default SSH on the bastion node is in interactive mode only. How can I automate the config change to allow non-interactive SSH with commands or scp? I can do it manually in /etc/ssh/sshd_config, but is there any kind of OVH action script to do it automatically when the node starts?
    thanks
    Vincent
    impolitepanda
    @impolitepanda
    Hello @RodierVincent_twitter. For audit purposes, the bastion logs everything that goes through it, and that's why SSH is in interactive mode only. For this reason, we don't have any tool/script to change it to non-interactive, sorry
    Vincent Rodier
    @RodierVincent_twitter
    Hello, ok. I saw a userData field on the API /cloud/project/{serviceName}/instance. Is it for this purpose, i.e. running a script when the instance first launches? If yes, would it be possible to have this in the future for analytics platform instances?
    thanks
    because at the moment it's impossible to automate an analytics platform deployment + install scripts on the nodes without manual action
    Pirionfr
    @Pirionfr
    hello @RodierVincent_twitter. We may have found a solution for you; we are running some tests on it.
    Pirionfr
    @Pirionfr

    Hello @RodierVincent_twitter,

    Interactive mode is required for audit logs on the bastion, for security purposes. The problem with this feature is that it breaks TCP forwarding, and you can't use tools like Ansible to deploy/update tools onto the instances of this ADP cluster.

    You have two options to bypass that:

    1. deactivate interactive mode on the bastion. That means no more auditing.
    2. add a new host in the same private network, dedicated to your deployments for example.

    1- Deactivate interactive mode on bastion - Step-by-step guide

    1. Edit /etc/ssh/sshd_config
    2. Comment out these lines
      #ForceCommand /opt/bastion/bastion
      #AllowTcpForwarding no
    3. Restart sshd
      systemctl restart sshd
      This action is permanent; there is no code that forces interactive mode back at boot time.

    Of course, this deactivation can be done temporarily. An automated procedure, based on Ansible for example, can be set up to deactivate interactive mode just for the time it takes to apply some changes:

    • Connect to bastion and disable interactive mode (See above)
    • Apply your changes onto instances of your ADP cluster
    • Reactivate the interactive mode
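
    The toggle steps above can be sketched as a pair of shell helpers (a sketch only, assuming the stock sshd_config lines quoted in the guide; run as root on the bastion, and restart sshd after each toggle as in step 3):

    ```shell
    #!/bin/sh
    # Sketch: toggle the bastion's interactive-only mode by commenting /
    # uncommenting the two sshd_config lines from the guide above.
    # Assumes the stock lines "ForceCommand /opt/bastion/bastion" and
    # "AllowTcpForwarding no"; adjust if your config differs.
    CFG=${CFG:-/etc/ssh/sshd_config}

    disable_interactive() {
      sed -i -e 's/^ForceCommand/#ForceCommand/' \
             -e 's/^AllowTcpForwarding no/#AllowTcpForwarding no/' "$CFG"
    }

    enable_interactive() {
      sed -i -e 's/^#ForceCommand/ForceCommand/' \
             -e 's/^#AllowTcpForwarding no/AllowTcpForwarding no/' "$CFG"
    }

    # Typical flow (as root on the bastion):
    #   disable_interactive && systemctl restart sshd
    #   ...apply your changes onto the ADP instances (Ansible, scp, ...)...
    #   enable_interactive && systemctl restart sshd
    ```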

    2 - Create a new bastion - Step-by-step guide

    1. Create a new instance in Horizon with these requirements
      • same Public Cloud project
      • one public network interface, and the same private network (vRack) for the second interface
      • same region
      • same OS (CentOS 7) (not mandatory)
      • same public key as the one used to spawn your ADP cluster
    2. Now you can use this new instance as a bastion

    This instance is not registered in FreeIPA, so DNS resolution and connection setup can take a long time.

    If you use tools like Ansible to automate deployment, you first have to increase the SSH connection timeout, or add an entry with the private IP of this new bastion to /etc/hosts on all the instances you want to connect to.
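
    If the Ansible route is taken, the SSH connection timeout can be raised once in ansible.cfg instead of per command; a possible fragment (the 60 s value is just an example, not a recommendation):

    ```ini
    # ansible.cfg fragment; 60 s is an example value, tune to your setup
    [defaults]
    timeout = 60
    ```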

    Vincent Rodier
    @RodierVincent_twitter
    Hello, thanks. Option 1 is what I was doing, but my problem is that it needs manual intervention. I was wondering if there was some kind of cloud-init script available to run these operations on the bastion. My goal is to link the data platform deployment and our Ansible run without any intervention.
    Best regards
    Vincent
    Vincent Rodier
    @RodierVincent_twitter

    This is the link to the opensource code of the page generating the credentials. It should help you in designing what you need.

    Hello, and about generating the password on the cluster: I saw the code on your GitHub. It is the code to create the passwords, but I didn't find the endpoint to send the passwords to my OVH serviceName (analytics cluster ID). Does this endpoint exist, so I can send a POST or PUT request?

    Vincent
    Arthur Lutz
    @arthurlogilab_gitlab
    Hi, I am interested in the Jupyter notebook feature mentioned in an #ecosystemexperience presentation
    is there a way to follow that somewhere?
    brugere
    @brugere
    Hello Arthur! I guess you found the right place for follow-up :)