    Illya Chekrygin
    @ichekrygin
    same thing after installing rbd on master
    bzub
    @bzub
    ok, hmm
    I see --allow-privileged=true for the kubelet, and I assume they are all like that. Double check that option is set for the API server as well
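    (For reference, --allow-privileged=true is a flag on both the kubelet and the kube-apiserver binaries; a minimal sketch of where it would appear, with hypothetical paths since the actual kube-aws layout may differ:)
    # kubelet systemd unit (hypothetical excerpt)
    ExecStart=/opt/bin/kubelet --allow-privileged=true ...
    # kube-apiserver invocation (hypothetical excerpt)
    /opt/bin/kube-apiserver --allow-privileged=true ...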
    Illya Chekrygin
    @ichekrygin
    ohh.. kube-api doesn't have PATH set at all... will add it and rerun
    bzub
    @bzub
    ah good idea
    about your earlier comment, I think you should indeed see a related error in one of the kubelet logs
    Steve Leon
    @kokhang
    according to https://coreos.com/kubernetes/docs/latest/kubelet-wrapper.html, you need to have 1.3.6 or higher coreos running
    bzub
    @bzub
    he's not using kubelet-wrapper
    Steve Leon
    @kokhang
    ahh @bzub already pointed that out
    bzub
    @bzub
    apparently kube-aws doesn't use it
    bzub
    @bzub
    @ichekrygin if that doesn't work, the only thing I can think of right now is to try installing all the Ceph clients like you did with /opt/bin/rbd
    this should do it: sudo /usr/bin/docker run --rm -v /opt/bin:/opt/bin quay.io/coffeepac/ceph-install
    actually, I wouldn't use that image
    I don't know why they still have that there; the official Docker image is kept up to date. Same command though:
    /usr/bin/docker run --rm -v /opt/bin:/opt/bin ceph/install-utils
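    (Assuming that image drops the Ceph client binaries into the mounted /opt/bin, a quick sanity check afterwards would be something like:)
    /opt/bin/rbd --version
    /opt/bin/rbd showmapped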
    Steve Leon
    @kokhang
    @ichekrygin if you want to get started with Rook quickly, you can try Rook's CoreOS K8s Vagrant setup: https://github.com/rook/coreos-kubernetes
    This has all the modprobe volumes added
    bzub
    @bzub
    http://stackoverflow.com/a/39180049 similar issue on stackoverflow. They show their rbd script and said it worked after upgrading CoreOS. hmm
    Illya Chekrygin
    @ichekrygin
    hi, just to follow up, finally got it working, needed to add rbd to the PATH in kube-controller-manager.service
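    (For anyone hitting the same thing, a minimal sketch of that fix as a systemd drop-in; the file name is hypothetical and the PATH assumes the Ceph clients live in /opt/bin:)
    # /etc/systemd/system/kube-controller-manager.service.d/10-opt-bin-path.conf
    [Service]
    Environment="PATH=/opt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    # then apply it:
    sudo systemctl daemon-reload
    sudo systemctl restart kube-controller-manager.service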
    bzub
    @bzub
    wow ok. Good to know thanks!
    Bassam Tabbara
    @bassam
    Nice. Our proposed volume plugin for Rook will make sysfs calls directly and remove the need for rbd and an installation of ceph on the host/kubelet level
    Illya Chekrygin
    @ichekrygin
    totally excited, got tired of the EBS mounting nonsense
    Bassam Tabbara
    @bassam
    @ichekrygin would love to understand your scenarios of running rook in AWS. we've been assuming that most people would be using rook on-premise and not necessarily in AWS/Azure etc.
    Illya Chekrygin
    @ichekrygin
    hi, to circle back, after running Prometheus w/ Rook on K8S for the last 12 hrs:
    [image: Prometheus metrics graph, running with Rook]
    the last gap is from me forcing the Prometheus pod to start on another EC2 instance
    In comparison to:
    [image: Prometheus metrics graph, running with EBS]
    Running Prometheus w/ EBS on K8S: the first gap is the EC2 instance with the Prometheus pod being killed by AWS, the second is a Prometheus pod restart
    Bassam Tabbara
    @bassam
    what's the Y axis?
    Bassam Tabbara
    @bassam
    @ichekrygin not sure I follow the significance of the gaps
    Illya Chekrygin
    @ichekrygin
    time it took to restart a pod
    the y-axis is just a random metric (in this case CPU usage)
    the significance is the x-axis (time), specifically the gap in metrics due to the pod not being able to start
    Bassam Tabbara
    @bassam
    @ichekrygin what do you conclude from this?
    Illya Chekrygin
    @ichekrygin
    w/ Rook, I see no more than a 60 sec delay; with EBS I at times see a delay of over a full hour
    DanKerns
    @DanKerns
    How can it be so slow on EBS?
    Illya Chekrygin
    @ichekrygin
    it took a while for the EBS volume to be marked as detached before it could be re-mounted on a new instance
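    (That wait is on the AWS side; one way to watch it, with a hypothetical volume ID:)
    aws ec2 describe-volumes --volume-ids vol-0123456789abcdef0 \
      --query 'Volumes[0].Attachments[0].State' --output text
    # reports "detaching" until AWS finishes releasing the volume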
    Bassam Tabbara
    @bassam
    oh wow
    that's bad
    Illya Chekrygin
    @ichekrygin
    an hour was an extreme case, but on average we see a 5-10 min delay just on pod restart
    with EBS, the pod must be started on a node in the same AWS AZ as the volume
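    (That zone constraint comes from the labels the AWS cloud provider puts on the EBS-backed PersistentVolume; a hypothetical PV excerpt showing them:)
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv-prometheus-data          # hypothetical name
      labels:
        failure-domain.beta.kubernetes.io/region: us-east-1
        failure-domain.beta.kubernetes.io/zone: us-east-1a
    spec:
      capacity:
        storage: 100Gi
      accessModes:
        - ReadWriteOnce
      awsElasticBlockStore:
        volumeID: aws://us-east-1a/vol-0123456789abcdef0   # hypothetical volume ID
        fsType: ext4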
    DanKerns
    @DanKerns
    There is an opportunity for Rook!
    Illya Chekrygin
    @ichekrygin
    with Rook, it could be started on any node
    I will let it bake for a few more days to accumulate more data (a larger volume) and will repeat my test
    Bassam Tabbara
    @bassam
    @ichekrygin thanks for the insights here.
    next question: how do we get it to 0 delay on pod restart?
    Illya Chekrygin
    @ichekrygin
    I think it's a tall order with Prometheus specifically, since it needs to read (reload) all the data back from disk before becoming operational, hence there is an inherent delay. EBS was adding an additional delay on top of that.