    serg
    @urd405_twitter
    okey doke
    Dmitry Spasibenko
    @dspasibenko
    cool!
    Ilya Z
    @ilya-zz
    hey
    :steam_locomotive:
    Dmitry Spasibenko
    @dspasibenko

    In our RPC implementation we try to reduce the number of allocations because of the high-volume traffic we serve. The RPC needs byte slices of different sizes to handle the network traffic. The obvious solution would be to use sync.Pool to avoid extra allocations and reuse already allocated, now-freed slices. The problem is that we don't know in advance what buffer size will be requested. I don't even know whether the size distribution is uniform or whether it will tend to cluster around a mean...

    One of the ideas is to create several pool objects, each serving only a particular buffer size: say up to 100, up to 1K, up to 10K and up to 100K. If the requested size is 4500, we go to the pool that holds 10K slices and ask there.

    We could measure the request statistics as well, but wouldn't that be overengineering for the task?
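    A minimal sketch of this size-class idea in Go; the class boundaries (100 B, 1 KB, 10 KB, 100 KB), the package name and the Get/Put helpers are illustrative assumptions, not the actual Logrange code:

    package bufpool

    import "sync"

    // Illustrative size classes; a request is served from the smallest class
    // that fits it, e.g. 4500 bytes comes from the 10K pool.
    var classes = []int{100, 1024, 10 * 1024, 100 * 1024}

    var pools = func() []*sync.Pool {
        ps := make([]*sync.Pool, len(classes))
        for i, c := range classes {
            c := c // capture the class size for the closure
            ps[i] = &sync.Pool{New: func() interface{} { return make([]byte, c) }}
        }
        return ps
    }()

    // Get returns a slice of length size backed by a pooled buffer when the
    // size fits one of the classes; larger requests fall back to a plain make.
    func Get(size int) []byte {
        for i, c := range classes {
            if size <= c {
                return pools[i].Get().([]byte)[:size]
            }
        }
        return make([]byte, size)
    }

    // Put hands the buffer back to the pool of its capacity class.
    func Put(b []byte) {
        for i := len(classes) - 1; i >= 0; i-- {
            if cap(b) >= classes[i] {
                pools[i].Put(b[:classes[i]])
                return
            }
        }
        // smaller than any class: just let the GC collect it
    }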

    Ilya Z
    @ilya-zz
    is it a real problem right now? does LR have any known performance issues?
    Kostya
    @klastochkin
    hi
    you could use the cache idea to implement a pool of buffers. instead of keys you could use the size of the required buffer; eventually the most-used buffer sizes will be the ones that end up allocated
    Dmitry Spasibenko
    @dspasibenko
    @klastochkin I did not get it. The requests are going to have arbitrary sizes, let's say 20, 5340, 250, 1200, 8000, etc. How are we going to turn those sizes into keys?
    Kostya
    @klastochkin
    let's say you need 250. you ask the cache for a buffer of size 250 (it doesn't mean 251 or even 260 wouldn't do). if there is a free buffer, it's returned and marked 'busy'. if there is no such buffer but there is free memory left, a new one of size 250 (or, say, 256) is allocated. If there is no free memory, some previously allocated buffer of a different size gets released (least recently used, or by other params, maybe taking size/usefulness into account) and a 250 one is allocated. And so on. What's good here is that it's automatic: you don't need stats for it to start working, and you don't need to think about differences between platforms and use cases
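    A rough sketch of this cache idea (exact-size keys, a byte budget, LRU eviction of free buffers); the names, the budget mechanism and the single-threaded simplification are assumptions for illustration only:

    package bufcache

    import "container/list"

    type entry struct{ buf []byte }

    // Cache keeps freed buffers keyed by their exact size and drops the least
    // recently returned ones when the total held memory exceeds the budget.
    // Not thread-safe; a real version would need locking.
    type Cache struct {
        budget int                     // max bytes kept in free buffers
        held   int                     // bytes currently kept
        free   map[int][]*list.Element // free buffers grouped by size
        lru    *list.List              // most recently returned at the front
    }

    func New(budget int) *Cache {
        return &Cache{budget: budget, free: map[int][]*list.Element{}, lru: list.New()}
    }

    // Get returns a buffer of exactly size bytes, reusing a free one if any;
    // taking a buffer out of the cache is the "mark busy" step.
    func (c *Cache) Get(size int) []byte {
        if els := c.free[size]; len(els) > 0 {
            el := els[len(els)-1]
            c.free[size] = els[:len(els)-1]
            c.lru.Remove(el)
            c.held -= size
            return el.Value.(entry).buf
        }
        return make([]byte, size)
    }

    // Put hands a buffer back; the oldest free buffers are evicted until the
    // cache fits into its budget again.
    func (c *Cache) Put(b []byte) {
        size := cap(b)
        el := c.lru.PushFront(entry{buf: b[:size]})
        c.free[size] = append(c.free[size], el)
        c.held += size
        for c.held > c.budget {
            victim := c.lru.Back()
            vbuf := victim.Value.(entry).buf
            c.lru.Remove(victim)
            c.held -= cap(vbuf)
            els := c.free[cap(vbuf)]
            for i, e := range els {
                if e == victim {
                    c.free[cap(vbuf)] = append(els[:i], els[i+1:]...)
                    break
                }
            }
        }
    }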
    Dmitry Spasibenko
    @dspasibenko
    I presume we will hardly ever hit the same value again. The sizes will be in the range 100..50000 with a mean of about 4501 (for example), so the chances that the next request is exactly 4501 are low, but it will be somewhere around it. Also I am going to use sync.Pool, which has its own specific, system-level behavior. Anyway, thanks for the idea with the caching approach!
    Kostya
    @klastochkin
    As I said, you can have some specific predefined sizes like 256, 1000, etc. If you need 250, don't ask for 250 but for the nearest available size, 256. you'll have some memory overhead, but sometimes it's worth it. For big sizes, you might want to not use it at all
    and if you'll hardly ever hit the same size again but you still want a pooled buffer, you'll have overhead anyway
    Dmitry Spasibenko
    @dspasibenko
    @klastochkin yep, clear enough, thanks! So the idea is to have multiple buckets, each covering a range of sizes and each served by its own Pool. If our distribution is normal, then just a few pools will be involved. The problem comes if the size distribution is uniform: every bucket gets hit with the same probability, so many pools will be created. I will try to elaborate the idea...
    Kostya
    @klastochkin
    yeah, sync.Pool isn't working well with my idea :)
    well, another idea then :) you create a new sync.Pool automatically when you see that there are many requests for a specific size interval
    Kostya
    @klastochkin
    dynamically. for example you request 250 and create a sync.Pool for <500, and so on. at some point you see that a lot of buffers are below 250, so you create another one for <250. After that you see that <500 isn't used much, so you merge it with <1000. But then you notice that 99.9% of the buffers there are <800, so you create <800 and leave the rest to <1200, etc.
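    A very rough sketch of the dynamic part of this idea, assuming power-of-two size intervals and a made-up request threshold before an interval gets its own sync.Pool (the merging/splitting of intervals described above is left out):

    package adaptivepool

    import (
        "math/bits"
        "sync"
    )

    const poolAfter = 1000 // requests seen in an interval before it gets a pool

    // Adaptive counts requests per power-of-two size interval and only starts
    // pooling an interval once it has been requested often enough.
    type Adaptive struct {
        mu    sync.Mutex
        seen  [64]int64
        pools [64]*sync.Pool
    }

    // bucket maps a size to its power-of-two interval, e.g. 250 -> 8 (256 bytes).
    func bucket(size int) int {
        if size <= 1 {
            return 0
        }
        return bits.Len(uint(size - 1))
    }

    func (a *Adaptive) Get(size int) []byte {
        b := bucket(size)
        a.mu.Lock()
        a.seen[b]++
        p := a.pools[b]
        if p == nil && a.seen[b] >= poolAfter {
            bcap := 1 << b
            p = &sync.Pool{New: func() interface{} { return make([]byte, bcap) }}
            a.pools[b] = p
        }
        a.mu.Unlock()
        if p == nil {
            return make([]byte, size) // interval not popular enough yet
        }
        return p.Get().([]byte)[:size]
    }

    func (a *Adaptive) Put(buf []byte) {
        b := bucket(cap(buf))
        a.mu.Lock()
        p := a.pools[b]
        a.mu.Unlock()
        if p != nil && cap(buf) == 1<<b {
            p.Put(buf[:cap(buf)])
        }
    }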
    Slava
    @Slava-P
    Probably obvious but anyway, real-time stats may help a little bit on the size prediction side
    oops, missed it, the question was already mentioned
    Dmitry Spasibenko
    @dspasibenko
    logdevice.io
    Slava
    @Slava-P
    C++ :-)
    Dmitry Spasibenko
    @dspasibenko
    yep, but all the internals and their "know-how" sequencers are very similar to what we have in Logrange. Pretty competitive.
    Dmitry Spasibenko
    @dspasibenko
    @ilya-zz we don't have any problems with it at the moment, just polishing the memory management for the RPC
    Dmitry Spasibenko
    @dspasibenko
    we did some performance tests
    Sid Grover
    @sidgrovr_twitter
    Hi all: cool project. Have you conducted any benchmarks against FB's LogDevice? Anecdotally, I am told it is also architected to favor write over read performance (referring to your performance test results posted on the blog).
    Dmitry Spasibenko
    @dspasibenko
    Hi @sidgrovr_twitter, thanks for your interest. We have not conducted any yet. I don't think we deliberately tried to make writes faster than reads; we just got those results in an aggressive-write environment. To be honest, we did not tune Logrange but ran the tests in the basic configuration. If we increase the read buffer, the read results will be better, I am sure.
    Akhilesh
    @akhilesh2011_twitter
    Hi, are custom fields supported in Logrange? The documentation says that records can have optional fields in key-value form. How can they be used?
    Dmitry Spasibenko
    @dspasibenko

    @akhilesh2011_twitter, every record can have custom fields. A custom field is a key-value pair that can be assigned on a per-record basis. The fields can be used in WHERE conditions for filtering records in LQL, the same way the msg or ts fields are used. For instance, if you know that some of your records have the field 'fld_error' and you want to select only the records where this field is not empty, you write a query like this:

    SELECT FROM app="myapp" WHERE fields.fld_error != "" LIMIT 1000

    Vasiliy Tolstov
    @vtolstov
    hi
    @dspasibenko i think this is a more suitable place for questions
    as i wrote in the issue, i need a pure Go API to be able to read/write logs in logrange.
    loki is like Prometheus but for logs: it can store logs, deduplicate them, index by labels and so on
    also you can set a retention period, so older logs are transferred to S3-compatible storage but can still be requested via loki (it keeps a local index with file chunk names and positions)
    so for hot data you get logs from local storage, for archived data - from S3
    the main drawback of loki is that it's written to run standalone inside docker, so when you want to bundle it with your own app it's a pain, and it has big deps like aws and cortex
    Dmitry Spasibenko
    @dspasibenko

    @vtolstov I think logrange has similar functionality, with some variations. Logrange is built with the idea of being fast and effective. We compared its ingestion speed with Kafka's and believe we can make it even better.

    I think your main request is about embedding Logrange into your app, but after some back and forth it seems that you simply need another API. Either way it is possible, so let me know which you would prefer and we can discuss the details.

    Vasiliy Tolstov
    @vtolstov
    hi!
    i have a roadmap question - can you publish some milestones/roadmap info about future plans? for example, do you plan to have retention support or deduplication when writing to persistent storage?
    Dmitry Spasibenko
    @dspasibenko
    @vtolstov, the idea to create and publish a roadmap is quite good, so we are considering publishing one soon. Regarding deduplication - I am not sure that such a feature will be on the roadmap. It looks like a particular kind of after-save processing, which could be part of features built on the auto-processing functionality. Anyway, you are welcome to come up with any suggestions or PRs regarding it.
    Akhilesh
    @akhilesh2011_twitter
    Hi Team, When is HA support planned for Logrange?
    arisettia
    @arisettia
    Hi Logrange team, is any Web UI planned, or is anyone in the community working on providing a Web UI for viewing Logrange data?
    Abhishek Sharma
    @Abhishek627
    Hi logrange team, do we have any support for the truncate API from the golang client?
    I see only the Query receiver, and it supports only select queries. What are the other alternatives?
    Abhishek Sharma
    @Abhishek627
    @dspasibenko