These are chat archives for rust-lang/rust

18th
Oct 2018
Ichoran
@Ichoran
Oct 18 2018 00:12
Overhead on uncontested CAS is really low. I'd use that instead.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:13
it's probably not uncontested though? it's a pretty hot code-path
Ichoran
@Ichoran
Oct 18 2018 00:15
You won't be updating the same token on the same user with a lot of separate threads at the same time, will you?
The whole array, yes, but I mean that each token should be an AtomicUsize.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:16
i don't think so, but maybe if some nefarious user has a script that pipelines and spams requests, maybe?
shouldn't work for long though
Ichoran
@Ichoran
Oct 18 2018 00:16
Then the threads that handle that user will get slow, but I don't see how you can avoid that anyway.
And not that slow.
Successful CAS is ~1-2 ns; unsuccessful is more like 10 ns per iteration. Lock/unlock tends to be in the 10-30 ns range depending on details.
You do have to get the bitwise math right for packing stuff into a usize, however.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:20
it needs to be usize? currently i'm only using 16 bits not 32
Ichoran
@Ichoran
Oct 18 2018 00:21
Yeah. Unless you're running on an embedded device, who cares? Your processor is going to fetch the whole cache line--64 bytes probably--and the atomic instructions are the machine word size, which is probably 8 bytes (which usize would be on those machines).
John
@gitlabaccount624_gitlab
Oct 18 2018 00:22
well it doubles the ram usage and i can use that same cache line for other users
Ichoran
@Ichoran
Oct 18 2018 00:23
Probably not unless the IDs are consecutive, and if they are, you are in grave danger of different users' threads clobbering each other.
At best you'll get slowdowns from contention at the hardware level instead of the language level.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:23
yeah the ids are consecutive that's why it works as an array
Ichoran
@Ichoran
Oct 18 2018 00:24
I really think that having a rate limiter that is highly flaky and can possibly clobber state of other users is worse than nothing.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:24
wait i don't understand the rate limiter array only has tokens and timestamps in it how does that mess with other users state?
Ichoran
@Ichoran
Oct 18 2018 00:25
I can't imagine that 6 bytes extra per user is going to be an issue.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:25
other than maybe giving them a free token somehow?
Ichoran
@Ichoran
Oct 18 2018 00:25
Because the CPU doesn't shuffle 2 byte pieces of data around very effectively.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:26
i'm not really understanding. could you explain how the cpu could potentially mess with a different index? i'm just not seeing how that would happen
Ichoran
@Ichoran
Oct 18 2018 00:26
One optimization is to load the whole word (4 or 8 bytes), fix the part that you want to change, and put the whole word back.
If you are single-threaded, that should be totally safe.
If you are not, your past state leaks into your present state.
I don't know whether this optimization is used, but if it is, your strategy will cause problems beyond the 2-byte struct you think you're dealing with.
That's why I'd recommend first implementing it with the standard CAS operations--that will be fast and correct--and then, if it's a bottleneck, trying direct writes with your scheme.
Anyway, it's not like anything can go disastrously wrong, it's just a feature that may not really do what you want very well, at which point one might as well just leave it out.
John
@gitlabaccount624_gitlab
Oct 18 2018 00:31
i just don't understand enough. a lot of the things you're saying i'm not sure how to interpret. and i think some people are fear mongering/didn't read my use-case entirely and i don't know enough to separate the legitimate claims from the illegitimate
Ichoran
@Ichoran
Oct 18 2018 00:31
(And also, you may find it surprisingly slow as the CPUs have to deal with memory contention for different parts of the same word--core-to-core communication isn't something I've really looked into in detail, but I know it's slower than L1 cache.)
Why don't you write a little benchmark to mock it out on a small array?
Anyway, I need to get going. Good luck!
John
@gitlabaccount624_gitlab
Oct 18 2018 00:33
ok i'm going to spend some time re-reading everything you said. and i don't really trust myself to benchmark properly it's hard
John
@gitlabaccount624_gitlab
Oct 18 2018 01:02
is this ever gonna get merged? rust-lang/rust#32976
John
@gitlabaccount624_gitlab
Oct 18 2018 03:11
nightly only for u8, no?
John
@gitlabaccount624_gitlab
Oct 18 2018 04:40
question: is assignment to a u32 atomic? what about a u8?
for example: if i do x = 5 on one thread and x = 10 on another thread, is it possible that the final result would ever be anything other than 5 or 10?
like, a mangling of bits, such that the final result would be 8 or something?
and same question for reads
Thiez
@Thiez
Oct 18 2018 05:12
Depends on your CPU, cache line size, memory alignment... Just give up this madness and use real synchronization! You can't reliably communicate between threads without synchronisation. It's like people trying to communicate between threads using volatile back in the old days; it doesn't work, it may look like it works, and it is UB.
Vladislav
@vmarkushin
Oct 18 2018 05:21
@Thiez because volatile is just a qualifier used to declare that an object can be modified in the program by another thread or something else
John
@gitlabaccount624_gitlab
Oct 18 2018 05:30
@Thiez "just give up this madness" =/ i'm trying to learn here. someone in the thread recommended i read https://manishearth.github.io/blog/2015/05/17/the-problem-with-shared-mutability/ and in the post he said "The way caches and memory work; we’ll never need to worry about two processes writing to the same memory location simultaneously and coming up with a hybrid value, or a read happening halfway through a write." and i was asking here to make sure that was true because i thought it might not be
also it doesn't need to be that reliable as i stated already
Thiez
@Thiez
Oct 18 2018 05:54
@NEGIhere volatile can be used for multithreading in languages such as Java and C#, but it has no such meaning in the C/C++ memory model, which is also the one that Rust is based on.
Vladislav
@vmarkushin
Oct 18 2018 05:59
@Thiez yes
Thiez
@Thiez
Oct 18 2018 06:05
@gitlabaccount624_gitlab you say you're trying to learn, but you also seem to be ignoring everyone who tells you that what you are doing is a bad idea / doesn't work. Having a shared data structure is inherently communication between threads, and such communication should always use the right synchronization primitives. You appear to be ignoring this requirement so that your program will be more "efficient", but it looks like you have absolutely no profiling information / benchmarks that prove that your solution is measurably faster. If I look at the algorithm you have presented on reddit a few hours ago, do you know what immediately springs out as an opportunity for optimization to me? Utc::now().timestamp(). Are you making a system call to get the current time for every request? That will be where most of your time is being spent, and that is where you should point your attention if you really care about the efficiency of your rate limiter.
John
@gitlabaccount624_gitlab
Oct 18 2018 06:06
i'm not really ignoring people as much as i am not understanding them because i lack a lot of information so i'm not really sure how to interpret their replies. and yeah i had the same thought about the timestamp but i'm not sure how else to get a unix timestamp in an efficient manner
i think you need timestamps of some form to do rate limiting.. not sure though
that's a good point though. i wonder if the synchronization cost would be dwarfed by the cost of grabbing a timestamp.. hmm
i saw the reply about using queues with mpsc but i'm confused how that would work. i'd still need to wrap it in an arc mutex right?
Thiez
@Thiez
Oct 18 2018 06:24
If you use striped locking you would use mutexes but they would statistically have very low contention. But
You may not need that - perhaps the arc and mutex approach is fast enough, you're not holding the lock very long
I think you could have a thread running in the background that keeps track of timestamps and stores them in an AtomicUsize, so your main threads can look at that instead of doing the system call
John
@gitlabaccount624_gitlab
Oct 18 2018 06:28
!
that's clever!
Thiez
@Thiez
Oct 18 2018 06:28
Or you could put it inside your mutex because your threads need to access that anyway
John
@gitlabaccount624_gitlab
Oct 18 2018 06:28
wait let me make sure i understand that point properly
wait you just gave me another idea actually
Thiez
@Thiez
Oct 18 2018 06:29
And it doesn't matter if the timestamp thread holds the lock for a few microseconds every few seconds. You don't need to update every second
John
@gitlabaccount624_gitlab
Oct 18 2018 06:29
couldn't i just call the timestamp function once at the start of the program in a thread that runs forever and sleeps every second and all the thread does is replace a global value with the current time?
Thiez
@Thiez
Oct 18 2018 06:30
Sure you could
John
@gitlabaccount624_gitlab
Oct 18 2018 06:30
hmmmmmmmm that's not a bad idea
but wait isn't that a similar problem where there's a global mutable value that is also being read? or does it not matter in that case?
i think it'd matter if there was that issue i was asking about earlier whether or not there can be a case where there is a read in the middle of a write.. i guess it'd need to be an atomic u32
but that's no big deal because rust has that
Thiez
@Thiez
Oct 18 2018 06:34
You could just have an Arc<Mutex<(timestamp, HashMap<userId, (timestamp, tokens)>)>>
your magic thread would update the timestamp regularly, the rest of the threads would access the mutex and perform their bookkeeping.
You could represent the global timestamp as a separate atomic, but it wouldn't help you because the threads checking the tokens have to take the lock to the mutex anyway
John
@gitlabaccount624_gitlab
Oct 18 2018 06:36
ah i see
hmm interesting
Thiez
@Thiez
Oct 18 2018 06:36
so it's not saving you anything to put the timestamp outside the mutex
John
@gitlabaccount624_gitlab
Oct 18 2018 06:37
but now we get back to my earlier problem of not knowing if arc mutexes are the best solution. people were recommending queues, actors, atomics, stripes, etc. and i don't know enough to evaluate the choices.. and then people say just do whatever but the thing is i'm curious about all the other solutions and i want to know. i guess i just need to read more
Thiez
@Thiez
Oct 18 2018 06:38
I would suggest you just implement the mutex solution. It's the simplest, and doing that will give you a better understanding of the problem. Then you can benchmark your solution. Once you are at that point you can try to implement other solutions and compare them
John
@gitlabaccount624_gitlab
Oct 18 2018 06:39
ok
Sergey Bushnyak
@sigrlami
Oct 18 2018 08:24
Is there a guide on what to consider writing software "the Rust way"? I mean there should be already something like best practices for different parts. Something like "don’t Index, Iterate" or similar. Interested in your experience
Sylwester Rąpała
@xoac
Oct 18 2018 08:27
I have found some information here https://doc.rust-lang.org/1.0.0/style/README.html but it has a lot of FIXMEs and RFCs, and I think it's a little outdated now. But still good if you are new to Rust
Sergey Bushnyak
@sigrlami
Oct 18 2018 08:35
@xoac thanks, still a WIP. I'm more interested in Rust-only things, like RAII guards, or "never do this". Maybe there is something related? What do you consider when starting a new project? I have experience with Haskell and OCaml, and there it's typical to start with types first, defining your domain, and then progress to functionality.
Sylwester Rąpała
@xoac
Oct 18 2018 08:41

I didn't see such a guide. You define the structs and traits you need, or import crates, and then the functionality, like in other languages..
In the book is simple application: https://doc.rust-lang.org/stable/book/second-edition/ch20-00-final-project-a-web-server.html

But I think you'll want to use some existing frameworks.. like tokio, actix, etc.. depends what kind of application you want to create.

Maybe you will be interested in Rust news https://this-week-in-rust.org
For example functionality available since rust 1.26 https://blog.rust-lang.org/2018/05/10/Rust-1.26.html
Sergey Bushnyak
@sigrlami
Oct 18 2018 08:49
@xoac thanks, "this week in Rust" might be interesting. I just found good article on what I actually need https://llogiq.github.io/2017/06/01/perf-pitfalls.html list of performance pitfalls from developer experience, now I can avoid some of them. I think that's a good way to describe best practices.
Sylwester Rąpała
@xoac
Oct 18 2018 08:54
You can collect posts and then publish interesting list :)
Michal 'vorner' Vaner
@vorner
Oct 18 2018 09:15
@sigrlami I guess I've learned by reading a lot of code. Like, if there's some interesting function in the docs, click on the [src] button. Or just look at how the API looks. But there are also a lot of links here (Rust by Example can be a good source). And there's this (I don't know why it is not linked anywhere) https://rust-lang-nursery.github.io/api-guidelines/
Sergey Bushnyak
@sigrlami
Oct 18 2018 09:31
@vorner thanks, nice link. Never saw it. Yes, that's the natural way to learn and how I usually do it in other programming languages, but sometimes you need to dive in very quickly and avoid common errors.
raja sekar
@rajasekarv
Oct 18 2018 12:56
As I am experimenting with Rust, I noticed its string-to-float method is considerably slower than hand-written code. Go's string-to-float is also faster than Rust's. Why can't we do it the Go way: use the fast algorithm where rounding won't be required, and keep the slow, correct algorithm for the cases where rounding will be required?
Michal 'vorner' Vaner
@vorner
Oct 18 2018 13:13

@rajasekarv First thing, are you running a release build? If so and you've got a faster (correct) implementation, you can of course open a merge request on the standard library. Or, even if you have a way that should be faster, but haven't written it yet, you can open an issue there describing it.

However, if you'd need to change the API (eg. offering two parsing methods instead of one), then that would need to go through the full RFC process (and changing behaviour of the current one would probably not be accepted, things are supposed to stay compatible).

Anyway, in general, if there's a balancing between correctness and speed, Rust first tries to do both and if that fails, prefers correctness in the place where people are going to reach for the thing first (and maybe offer alternatives as additional methods or something).

raja sekar
@rajasekarv
Oct 18 2018 13:15
I am running a release build only. More than 90% of the time is
raja sekar
@rajasekarv
Oct 18 2018 13:21
Spent on float parsing itself, which led me to look into this. I reimplemented the code in Go, which ran considerably faster. So I checked out their implementation. There, they first check whether the string, when converted to float, needs to be rounded. Depending on yes/no they apply one of two algorithms. One is slow but always gives the proper answer; the other is fast but works correctly only where no rounding is required. So even though they spend some time checking whether the number needs to be truncated, in real-world cases like mine the overall code works faster. In Rust, as far as I checked, they use the slow code no matter what the input string is
I am not an expert in this, it would be helpful if someone explains the rationale behind Rust's choice here
Michal 'vorner' Vaner
@vorner
Oct 18 2018 13:26

Well, quite a big possibility is that there's no rationale about it, that someone just did it at the early days and nobody touched it since.

But if there is a rationale and it was discussed, the place to look for the historical records would probably be either the RFC repository or the internals forum. Some discussion might have happened in the rust repo's issues as well, maybe.

If you find none, then assume the first possibility. Then you can open up the discussion ‒ probably in the internals forum.

Denis Lisov
@tanriol
Oct 18 2018 13:27
raja sekar
@rajasekarv
Oct 18 2018 13:34
Thanks @tanriol . This clears up a lot.
Sergey Bushnyak
@sigrlami
Oct 18 2018 13:46
Is there any list of libraries that are really needed in the Rust ecosystem? Or better, domain areas? I'm trying to find a specific niche where I can learn more by implementing something, and I wanted it to be useful to the community. Interestingly, whatever term I try to search for already has a Rust library, which is nice :)
Ichoran
@Ichoran
Oct 18 2018 14:10
There aren't many tools for static-length arrays (as opposed to slices of them). arraymap is the only crate I know of that treats them directly. But to do this at all well you'd need to use macros, which is a bit finicky for a first project.
There are also really big things like image IO libraries that aren't very complete, but again, not a great starting project to do comprehensively.
Sergey Bushnyak
@sigrlami
Oct 18 2018 14:44
@Ichoran what about profiling, performance tracking tools?
Vladislav
@vmarkushin
Oct 18 2018 16:31
@sigrlami I think you can use the same tools as C programmers use
For example, GDB works perfectly with Rust programs
Sergey Bushnyak
@sigrlami
Oct 18 2018 16:41
@NEGIhere thanks, I was more interested in performance analysis, already found criterion.rs for that
Michal 'vorner' Vaner
@vorner
Oct 18 2018 16:54
@sigrlami https://this-week-in-rust.org/ comes with call-for-help issues, so you can look in there. Or you can pick a project (or library) you like and either fix your own pain points in it, or look for some E-Easy tasks there. The big ones (eg. Rust itself) often offer mentoring on some of them too.
@Ichoran There's arrayvec and smallvec. They act a lot like vectors and implement a lot of their traits (like FromIterator), so that covers a lot. Or what tools would you need?