Min RK
@minrk
@callummole I'd be happy to. Just about to get food, are you free in ~1hr?
Min RK
@minrk
Back and free any time for the next couple of hours
minrk
@minrk:matrix.org
[m]
Sorry, missed you. I’m free most of tomorrow
mriduls
@mriduls:matrix.org
[m]
The binderhub version being picked up after jupyterhub/mybinder.org-deploy#2328 seems to be stuck at jupyterhub/binderhub@0990726. I just upgraded to 0.2.0-n988.h72e1852 on GESIS servers and it's picking up jupyterhub/binderhub@4ce6dc1
When I say picked up I am looking at https://gke.mybinder.org/versions
Callum Mole
@callummole
Can someone please post some example logs from the proxy pod in a healthy cluster?
Sarah Gibson
@sgibson91
@mriduls:matrix.org is there an issue with GESIS's Docker Hub? I noticed an issue a couple of days ago with rate-limiting and someone else just reported an issue in the binder channel too https://gitter.im/jupyterhub/binder
mriduls
@mriduls:matrix.org
[m]
Thanks for flagging this. One of the nodes didn't have the Docker Hub creds, and the request probably went to that node. Added the creds now.
Sarah Gibson
@sgibson91
Thanks for following up! :)
rcurtin
@ryan:ratml.org
[m]
Hi there everyone, I am trying to track down what appears to be a strange firewall issue that appears only on OVH binder notebook instances. I run the mlpack open-source machine learning library, and many of the examples in our examples repository first fetch data from datasets.mlpack.org. But I am finding that on an OVH instance (e.g. a notebook running on 51.178.95.56), connections to datasets.mlpack.org (209.195.13.98) simply time out. I've checked the firewall configuration on datasets.mlpack.org and found no issues there; notebooks running on other non-OVH servers seem to be able to connect fine. It seems likely to me that there is some OVH firewall rule blocking access to datasets.mlpack.org. Could someone here help with that, or point me to the right place to get the issue resolved? Thanks so much! (and sorry for the wall of text :))
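A minimal sketch of how a timeout like this could be reproduced from inside a running notebook, so results on OVH and non-OVH members can be compared. The helper name `can_connect` and the choice of port are assumptions for illustration; the report above does not say which port or protocol the examples use.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper, not part of mlpack or mybinder.org tooling.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures alike.
        return False
```

For example, calling `can_connect("datasets.mlpack.org", 443)` (port 443 is an assumption) in a notebook on each federation member would show whether only the OVH-hosted sessions fail.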
Erik Sundell
@consideRatio

Using the mybinder.org federation I ended up on turing, but my server never started; it got stuck after the "image already available" message.

Is the turing federation member down without mybinder.org detecting that it is down?
Min RK
@minrk
@DivineOkorhi_twitter it's okay for multiple people to work on the same task. We're assembling some follow-up tasks now.
Okorhi Divine
@DivineOkorhi_twitter
@minrk thanks so much, I'm really grateful.
rcurtin
@ryan:ratml.org
[m]
Hey, I just wanted to ping again about the OVH firewall issue I mentioned above. Is there a better place that I should report this issue? Thanks!
Erik Sundell
@consideRatio
That is a very clear problem statement of a technical nature, worth having a GitHub issue for. Could you more or less copy-paste that to github.com/jupyterhub/mybinder.org-deploy @rprimet ? Thanks for the clear description and investigative work! To me, it seems wrong that there's a difference between OVH and other mybinder.org members, so I consider it a clear and reproducible bug.
rcurtin
@ryan:ratml.org
[m]
Thanks @consideratio! I will do that shortly. Is there anyone specific I should tag on the issue?
Min RK
@minrk
got a Google notice about miners on GKE again, checking it out.
Min RK
@minrk
I don't see any suspicious activity right now
from the notice, it was on node gke-prod-user-202201-0769dbf8-jxwd, looking at logs for minesweeper-ptlgl
Min RK
@minrk
Don't really see anything during the affected window. cryptnono has no logs, no pods using excessive CPU
Sarah Gibson
@sgibson91
I'm always semi-dubious about Google alerts. It's not very transparent what they count as "suspicious activity". How does it differentiate between cryptomining and just a long-running task, for instance?
Min RK
@minrk
I believe they are analyzing outbound network traffic, since they identified IP addresses being connected to
Sarah Gibson
@sgibson91
So is the workflow to confirm that those processes are cryptomining and then add the identified IP addresses to ban.py?
Min RK
@minrk
processes aren't identified, only originating VMs
basically, they have "it looks like this VM is sending mining traffic to these IPs" and then we can look at our logs to see if we can identify the processes/pods doing it.
Min RK
@minrk
But I don't see anything in our logs that points to any mining traffic. When I poke around the destinations, though, they are clearly proxies run in the cloud to obfuscate traffic.
Sarah Gibson
@sgibson91
Right, so probable but unconfirmed
manics
@manics:matrix.org
[m]
jupyterhub/mybinder.org-deploy#2360 from a month ago: a request for resources, £50 charge for attendees to cover other costs, 300-600 people. Thoughts?
minrk
@minrk:matrix.org
[m]
Seems fine to me
YuviPanda
@Yuvi:matrix.org
[m]
https://archive.analytics.mybinder.org/ looks like it stopped last week
Erik Sundell
@consideRatio
analytics-publisher                 0/1     1            0           2y2d
The analytics-publisher isn't ready. It has been restarting every ~20 seconds for 9 days.
Archiving today's events 2022-10-30
Finding blobs with prefix binderhub-events-text/2022/10/30
Traceback (most recent call last):
  File "/srv/run.py", line 34, in <module>
    archive_events(
  File "/srv/archiver.py", line 54, in archive_events
    event = json.loads(json.loads(line)["jsonPayload"]["message"])
KeyError: 'message'
Opening issue.
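A minimal sketch of one way the parsing step in the traceback could tolerate log entries whose `jsonPayload` has no `message` key. This is an assumption for illustration, not the fix actually applied to the archiver; the function name `extract_event` is hypothetical.

```python
import json


def extract_event(line):
    """Return the event dict from one exported log line, or None if it has no message.

    Hypothetical helper: mirrors the double json.loads from the traceback,
    but skips entries that lack jsonPayload["message"] instead of raising KeyError.
    """
    record = json.loads(line)
    message = record.get("jsonPayload", {}).get("message")
    if message is None:
        return None
    return json.loads(message)
```

A caller would then skip `None` results rather than crash, which would keep one unexpected log entry from putting the whole job into a restart loop.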
manics
@manics:matrix.org
[m]
Does anyone know what the maximum image size (GKE, for building, and for running) is? https://discourse.jupyter.org/t/unknown-parent-image-id/16774
Min RK
@minrk
I don't know!
I wonder if it's hitting some garbage collection as a large untagged layer
I think the new OVH cluster is ready to be added to CI: jupyterhub/mybinder.org-deploy#2414
once that's confirmed to be working, I can do a separate PR to add it to the federation
Min RK
@minrk
We deploy grafana on all our clusters, but we really only use it on gke-prod. Should we maybe remove grafana from most federation members? Would lighten core resources a bit.
rcurtin
@ryan:ratml.org
[m]
@minrk: I am trying to test jupyterhub/mybinder.org-deploy#2378 on ovh2.mybinder.org, but builds fail with "Error resolving ref for gh:mlpack/examples/HEAD: HTTP 599: Could not resolve host: api.github.com". Is OVH2 in a working state or should I give it a little while? Thanks!
Erik Sundell
@consideRatio
Thanks for helping out with testing @ryan:ratml.org!!!! :heart: :tada:
Min RK
@minrk
@ryan:ratml.org thanks for testing! coredns was running out of memory, I've patched it and I think it's working now.
Min RK
@minrk
OVH2 cluster appears to be healthy. I think we can add it to the federation to see how it goes: jupyterhub/mybinder.org-deploy#2425 . Current capacity is just 10.
Min RK
@minrk
Running the old-image culler script on GKE, since we haven't in a long time and the size is getting costly
Min RK
@minrk
Apparently our events archive doesn't have any events from GKE since October 20
What's super duper weird is our logs do have the events, but for some reason the GKE logs aren't being included in the logs export
Min RK
@minrk
Figured it out and fixed it: jupyterhub/mybinder.org-deploy#2441. Now we just need to figure out how to 're-sink' the missed events that are still in the logs bucket but not in the storage bucket.
manics
@manics:matrix.org
[m]
I'm a member of https://github.com/orgs/binder-examples/people but I don't have permission to merge/close PRs
manics
@manics:matrix.org
[m]
Thanks to whoever just fixed it :-)
minrk
@minrk:matrix.org
[m]
No problem! I think we should look over the permissions and membership on that org