    Elliot Blackburn
    @BlueHatbRit
    We now have automated docker builds :tada: https://cloud.docker.com/u/statsd/repository/docker/statsd/statsd
    And the automated npm publishing should be working again for post 0.8.2 releases.
    nitinpandey-154
    @nitinpandey-154
Hi, do you know how to list all the available stats in statsd?
    Elliot Blackburn
    @BlueHatbRit
Hi @nitinpandey-154, sorry I’ve been away for a little bit so I missed this. Statsd really just pushes your stats onto a consumer such as graphite, so you most likely want to check your metric consumer rather than statsd itself.
    Elliot Blackburn
    @BlueHatbRit
However, if you wish to check statsd, you can use the management console commands. Send help to your statsd instance's management port (defaults to 8126, plain TCP rather than HTTP) and it’ll tell you what commands you can use.
    The commands like stats and counters should get you what you’re after.
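A minimal way to try this from a shell, assuming a statsd instance on localhost with the default management port (hostname and port here are illustrative):

```shell
# Query statsd's management interface (plain TCP, default port 8126).
# "help" lists the available commands; "counters" dumps current counters.
echo "help" | nc -w 1 localhost 8126 || echo "no statsd listening on 8126"
echo "counters" | nc -w 1 localhost 8126 || echo "no statsd listening on 8126"
```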
    Samba
    @gsambasiva
Hi,
I'm unable to connect to the statsd service running locally from a docker container.
My statsd service is running locally on port 8125.
Could anyone help with this?
I even tried giving the host as "host.docker.internal".
    Elliot Blackburn
    @BlueHatbRit
    @gsambasiva you’re going to need to give a few more details, what are you using to run your container? docker, kubernetes, swarm, docker-compose? And what’s your configuration on the container through those?
    Andrey Ivanov
    @a-nigredo
Hi all, I noticed that if I send the same metric twice, statsd merges the values into one. Is it possible to avoid that?
    Elliot Blackburn
    @BlueHatbRit
    @a-nigredo sorry for the slow reply, the holidays got away from me. Could you give some more detail, what sort of metric are you sending, how are you sending it, what values are you setting?
    Bishwa
    @Bishwa05

    After running the docker image from https://hub.docker.com/r/statsd/statsd,
    I do see,
    24 Mar 09:30:40 - [1] reading config file: config.js
    24 Mar 09:30:40 - server is up INFO

but I don't see anything running on localhost:8125

    Elliot Blackburn
    @BlueHatbRit
    @Bishwa05 it sounds like you haven’t exposed the port locally via your docker configuration. You need to link the ports using the ports flag with whatever you’re using to run your containers.
See https://docs.docker.com/engine/reference/commandline/run/#publish-or-expose-port--p---expose if you’re using the docker cli. If you’re using another tool then you’ll want to find the equivalent docs for that system (docker-compose, kubernetes, etc).
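For the docker cli, a sketch of what that looks like with the official statsd/statsd image mentioned above (the port mappings shown are the statsd defaults):

```shell
# Publish statsd's ports to the host:
#   8125/udp - the metrics ingestion port
#   8126/tcp - the management interface
docker run -d --rm --name statsd \
  -p 8125:8125/udp \
  -p 8126:8126 \
  statsd/statsd || echo "docker not available"
```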
    Bishwa
    @Bishwa05
I am running the StatsD docker daemon locally and want the metrics in Splunk (also running in docker locally). What would the docker run command be to run StatsD so it can send the metrics to a port on localhost?
    Elliot Blackburn
    @BlueHatbRit
    @Bishwa05 it sounds like you want the -p option on the docker-cli as I linked above. That will expose the port from the container to your host machine, you can then make calls to the port.
    Bishwa
    @Bishwa05
@BlueHatbRit , I tried that, but after doing this I can't open the same host port for Splunk (running a Splunk docker container as well). E.g. if I use 8125:8125 for StatsD, the same port can't be used for Splunk.
Then how will Splunk communicate with StatsD?
    Elliot Blackburn
    @BlueHatbRit
If you’re running splunk in a container as well, you’ll need to enable networking between them rather than just exposing both their ports.
    I’d suggest reading a guide on docker-compose for this sort of thing, when trying to run multiple containers locally it’s a big help and makes life much easier. You can very easily create a network between two containers that way for testing :)
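Without docker-compose, one sketch of the same idea uses a user-defined docker network; container names then resolve as hostnames between the containers (the splunk/splunk image name and its required options are assumptions here, so check its docs before relying on this):

```shell
# Put both containers on a shared network so they can talk directly.
docker network create metrics-net || echo "network exists or docker unavailable"
docker run -d --rm --name statsd --network metrics-net statsd/statsd || true
docker run -d --rm --name splunk --network metrics-net splunk/splunk || true
# Inside the network, Splunk can reach statsd at udp://statsd:8125 -
# no host port mapping is needed for container-to-container traffic.
```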
    corentin
    @corenti13711539_twitter
    Hi! Not a lot of activity recently - is this chat still alive? :smile:
    Elliot Blackburn
    @BlueHatbRit
    :wave: @corenti13711539_twitter it's not super active here but I get email notifications it seems, what's up?
    corentin
    @corenti13711539_twitter
    yay! :raised_hands: :smile:
    So, I'm pondering about which metric type to use. I'd like to publish pre-aggregated timing data via statsd for visualizing in Grafana. I'm currently using the timing metric type, but that doesn't seem right because then the pre-aggregated data gets another round of aggregation.
    corentin
    @corenti13711539_twitter
The actual use case is akin to measuring estimated duration of customer service interactions. At irregular intervals we compute the estimated total time that each ongoing interaction will take to finish and then compute a median of these estimates across x % of the most recently started interactions. The median gets sent to statsd. The computation interval varies based on the busyness of the system, from tens of seconds to tens of minutes, and there can be long gaps.
    @BlueHatbRit Would the counting metric with absolute values be appropriate in this case?
    Elliot Blackburn
    @BlueHatbRit
    @corenti13711539_twitter what database are you putting the data into from statsd? Is it going into something like graphite or influx?
    corentin
    @corenti13711539_twitter
    influx
    Elliot Blackburn
    @BlueHatbRit

Statsd timing is designed for something like: emitting the time it took for an HTTP endpoint to return a response. Those stats can be produced at any time and so statsd will aggregate them into blocks and flush at the pre-determined interval to help normalise the data. Your data might look like:

    • 320ms
    • 200ms
    • 80ms

If you've already got the timing (i.e. the HTTP response time) and the time you want it to land in the database, i.e. data that looks like:

    • 320ms - 10:14:32
    • 200ms - 10:14:42
    • 80ms - 10:14:42

    Then I'd suggest just tossing it straight into influx yourself and ignoring statsd as you're not really getting any benefit from using statsd.

    If you have the first form of data, then timing is definitely a good metric type to use, and statsd will reliably aggregate it and flush the data for you at your interval.
    So I guess timing is a great metric to use, if you want to toss the data into statsd the moment it's produced by the customer service interaction ending.
    If you aggregate up the data and then want to push it in at the end of a working day for example, then I'd just toss it directly into influx as you can then control the timestamps you assign to it yourself. You can then just write a bit of code to normalise the timings down to 10s intervals (or whatever) to get a nice normalised statsd like input.
    But statsd expects to handle the timestamps for you, and expects to input it into the db in real time, so that might trip you up if it's pre-aggregated with timestamps.
    Does that answer your question, or have I totally missed the point? :laughing:
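For reference, a timer stat like the ones above is just a one-line UDP datagram; this sketch builds one and fires it at a local statsd (the metric name, host, and port are illustrative):

```shell
# statsd wire format for a timer: <name>:<value_in_ms>|ms
metric="http.response_time:320|ms"
echo "$metric"
# fire-and-forget over UDP to a local statsd; nothing is returned
echo "$metric" | nc -u -w 1 localhost 8125 || true
```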
    corentin
    @corenti13711539_twitter
The timings are already aggregated when they're sent to statsd, and they occur at irregular intervals. This would be similar to if you were operating a call center and at irregular intervals would estimate the expected total duration of each ongoing call and then compute the median duration across the ongoing calls, so e.g.
    • 15 min 33 sec - 10:14:32
    • 10 min 11 sec - 10:16:09
    • 12 min 44 sec - 10:17:01
      etc.
I guess sending the data directly to influxdb could be an option to consider. The reason for using statsd for this is that we have an existing API and "pipeline" for delivering the data to influxdb, and UDP is used by default, so it's fire-and-forget, avoiding delays in the main application flow when delivering data.
    corentin
    @corenti13711539_twitter
    Also, connecting to influxdb directly would involve adding a new dependency, configuring influx endpoints for the app etc. OTOH, I understand your point about using influxdb and this may not be a really good use case for statsd in general.
    Elliot Blackburn
    @BlueHatbRit
Okay so your timings actually don't have timestamps against them yet, you're using statsd for that and want it to flush them into influx in real time. In that case timing is a perfectly fine one to use. Whether all the other bits like median, 99th percentile etc. are useful is up to you; you may wish to turn those off.
    This seems like a totally fine use case for statsd and for the timing metric imo :)
    corentin
    @corenti13711539_twitter
Each timing is associated with a timestamp (e.g. 10:14:32 above). Hmm. So, it's possible to turn off the 99th percentile etc. - how?
    Elliot Blackburn
    @BlueHatbRit

    Hey sorry it got a bit late last night and I think I misread a few things.

    The timings are already aggregated when they're sent to statsd and they occur with irregular intervals.

This is really the important bit: statsd lets you hand it a timing duration, and assumes the timestamp to store it against is whatever statsd understands the current time on the system to be. So if you want to pass it timing data plus a specific timestamp, you won't be able to do that.

    If you're just passing it the timing data (ie, producing these stats in real time when a customer service interaction ends) and tossing it straight to statsd then you're all good and statsd can help you. Just ignore your own timestamps, pass it the duration and it'll do everything for you.

It is possible to turn off various calculated stats for the timer metrics - you can use the calculatedTimerMetrics option, which is shown in exampleConfig.js.
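As a sketch, the option goes in the config file you pass to statsd; the exact list of stat names to keep is documented alongside exampleConfig.js, so treat these values as illustrative:

```shell
# Write a minimal config.js that limits which timer stats get calculated.
cat > config.js <<'EOF'
{
  port: 8125,
  // only emit these derived timer stats; percentiles etc. are dropped
  calculatedTimerMetrics: ["mean", "upper", "lower", "count"]
}
EOF
cat config.js
```

statsd is then started against that file with `node stats.js config.js` as usual.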

    Elliot Blackburn
    @BlueHatbRit
    If the data is pre-aggregated with timestamps and you want to use those, then using statsd probably isn't best because you can't pass in timestamps to use. Statsd is designed to handle real-time stats being produced and normalise the data as it goes in.
    corentin
    @corenti13711539_twitter
    Thanks for the feedback and discussion @BlueHatbRit - very much appreciated! :bow: :thumbsup: