    actually, I wouldn't use that image
    I don't know why they have that still there, the official docker image is kept up to date, same command though:
    /usr/bin/docker run --rm -v /opt/bin:/opt/bin ceph/install-utils
    Steve Leon
    @kokhang
    @ichekrygin if you want to get started with Rook quickly, you can try to use rook's coreos K8s vagrant. https://github.com/rook/coreos-kubernetes
    This has all the modprobe volumes added
    bzub
    @bzub
    http://stackoverflow.com/a/39180049 similar issue on stackoverflow. They show their rbd script and said it worked after upgrading CoreOS. hmm
    Illya Chekrygin
    @ichekrygin
    hi, just to follow up, finally got it working; needed to add rbd to the PATH in kube-controller-manager.service
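    One way to do that is a systemd drop-in that extends the unit's PATH (a sketch; the drop-in filename and the directories listed are assumptions — point it at wherever your rbd binary actually lives):

```ini
# /etc/systemd/system/kube-controller-manager.service.d/10-rbd-path.conf
# (hypothetical drop-in name; run `systemctl daemon-reload` and restart
# the unit after adding it)
[Service]
Environment="PATH=/opt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
```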
    bzub
    @bzub
    wow ok. Good to know thanks!
    Bassam Tabbara
    @bassam
    Nice. Our proposed volume plugin for Rook will make sysfs calls directly and remove the need for rbd and an installation of ceph on the host/kubelet level
    Illya Chekrygin
    @ichekrygin
    totally excited, got tired of the mounting EBS nonsense
    Bassam Tabbara
    @bassam
    @ichekrygin would love to understand your scenarios of running rook in AWS. we've been assuming that most people would be using rook on-premise and not necessarily in AWS/Azure etc.
    Illya Chekrygin
    @ichekrygin
    hi, to circle back, after running Prometheus w/ Rook on K8S for the last 12 hrs:
    [chart image]
    the last gap is me forcing the Prometheus pod to start on another EC2 instance
    In comparison to:
    [chart image]
    Running Prometheus w/ EBS on K8S: the first gap is the EC2 instance running the Prometheus pod being killed by AWS; the second is the Prometheus pod restart
    Bassam Tabbara
    @bassam
    what's the Y axis?
    Bassam Tabbara
    @bassam
    @ichekrygin not sure I follow the significance of the gaps
    Illya Chekrygin
    @ichekrygin
    time it took to restart a pod
    y-axis is just a random metric (in this case CPU usage)
    the significance is the x-axis (time), specifically gap in metrics due to pod not being able to start
    Bassam Tabbara
    @bassam
    @ichekrygin what do you conclude from this?
    Illya Chekrygin
    @ichekrygin
    w/ Rook, i see no more than a 60 sec delay; with EBS i at times see a delay of over a full hour
    DanKerns
    @DanKerns
    How can it be so slow on EBS?
    Illya Chekrygin
    @ichekrygin
    it took a while for the EBS volume to be marked as detached before it could be re-mounted on a new instance
    Bassam Tabbara
    @bassam
    oh wow
    that's bad
    Illya Chekrygin
    @ichekrygin
    an hour was an extreme case, but on average we see a 5-10 min delay just on pod restart
    with EBS, the pod must be started on a node running in the same AWS AZ
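    That zone pinning was commonly expressed with the well-known zone label of that Kubernetes era (a sketch; the zone value is illustrative, and the scheduler applies an equivalent constraint automatically for zonal volumes):

```yaml
# Sketch: a pod using an EBS-backed volume is effectively pinned to the
# volume's availability zone, as if it carried a selector like this.
nodeSelector:
  failure-domain.beta.kubernetes.io/zone: us-east-1a
```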
    DanKerns
    @DanKerns
    There is an opportunity for Rook!
    Illya Chekrygin
    @ichekrygin
    with Rook, it could be started on any node
    i will let it bake for a few more days to accumulate more data (larger volume) and will repeat my test
    Bassam Tabbara
    @bassam
    @ichekrygin thanks for the insights here.
    next question: how do we get it to 0 delay on pod restart
    Illya Chekrygin
    @ichekrygin
    i think it's a tall order with Prometheus specifically, since it needs to read (reload) all the data back from disk before becoming operational, so there is an inherent delay. EBS was adding an additional delay on top of that.
    Elson Rodriguez
    @elsonrodriguez_twitter

    Heya, I tried rook, but I’m getting a sad cluster:

    OSDs:
    TOTAL     UP        IN        FULL      NEAR FULL
    4         1         1         false     false

    The OSDs are all complaining along these lines:

    2017-03-06 10:08:02.340442 I | osd1: 2017-03-06 10:07:50.174195 f439bf80700 -1  Processor -- bind was unable to bind. Trying again in 5 seconds 
    2017-03-06 10:08:02.340456 I | osd1: 2017-03-06 10:07:55.192786 f439bf80700 -1  Processor -- bind unable to bind to 172.16.2.250:7300/1023 on any port in range 6800-7300: (99) Cannot assign requested address

    Any tips?

    172.16.2.250 is the Host IP, not the Pod IP
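    A quick way to sanity-check where a bound address falls (a sketch; the pod CIDR below is a made-up example — substitute your cluster's actual pod network):

```python
# Check whether the address the OSD tried to bind lies inside the pod
# network CIDR. The address comes from the OSD log above; the pod CIDR
# itself is a hypothetical example.
import ipaddress

pod_cidr = ipaddress.ip_network("10.2.0.0/16")      # assumed pod network
bound_addr = ipaddress.ip_address("172.16.2.250")   # address from the OSD log

print(bound_addr in pod_cidr)  # False -> the OSD picked a host address
```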
    bzub
    @bzub
    Are you using v0.3.0 of rook-operator?
    Elson Rodriguez
    @elsonrodriguez_twitter
    Yes
    bzub
    @bzub
    When I run it, it doesn't use hostNetworking, hm
    should be pod IPs
    Elson Rodriguez
    @elsonrodriguez_twitter
    Correct
    Odd, pods have a proper pod ip
    Travis Nielsen
    @travisn
    Rook doesn't specify which ip to use, so Ceph tries to find one on the same subnet
    the OSD pods run as privileged, which means they could find the host ip
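    If that's the cause, Ceph can be told explicitly which network to bind on via its standard network options (a sketch; the CIDR is an assumption, and whether Rook v0.3 exposes these settings is not confirmed here):

```ini
# ceph.conf sketch: restrict Ceph daemons to the pod network so OSDs
# don't bind a host address. Substitute your cluster's pod CIDR.
[global]
public network = 10.2.0.0/16
cluster network = 10.2.0.0/16
```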
    bzub
    @bzub
    interesting. I wonder what's different on your cluster then.
    should still be in the pod network namespace I would think without hostNetworking