I'm trying to parse a (big) XML file, do some aggregation and then send the resulting data as JSON to another service. What I did so far is parse the XML into data structures, turn the data structures into JSON, send the JSON to the other service, all one after the other. That was using quite a bit of memory and also wasn't too fast. I thought using hyper::Body::channel could help with that. So I've been using it somewhat like this: https://gist.github.com/endor/4aab0dc39f844af634b6dbf37f9ad731.
I imagine this is probably not the most efficient way to use both hyper and tokio, but I haven't really understood the concepts in tokio yet.
First I tried to send every little bit of data I just processed with sender.send_data().await, but that seemed to be very slow even though it was using less memory. Then I tried to compile larger chunks first and send those the same way and that was faster (as in total request/processing time), but was using a bit more memory.
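The trade-off described here (many tiny sends versus fewer, larger ones) can be sketched with a small buffer that collects fragments and only hands them on once a size threshold is reached. This is a minimal, std-only sketch and not the code from the gist: `ChunkBuffer`, `threshold` and `sink` are hypothetical names, and the `sink` closure merely stands in for whatever actually sends the chunk. In the async version that would presumably be an awaited `sender.send_data(...)` call instead of a synchronous closure.

```rust
/// Collects small byte fragments and hands them to a sink in larger chunks.
/// `sink` is a stand-in for the real send step (hypothetical, for illustration).
struct ChunkBuffer<F: FnMut(Vec<u8>)> {
    buf: Vec<u8>,
    threshold: usize,
    sink: F,
}

impl<F: FnMut(Vec<u8>)> ChunkBuffer<F> {
    fn new(threshold: usize, sink: F) -> Self {
        Self {
            buf: Vec::with_capacity(threshold),
            threshold,
            sink,
        }
    }

    /// Append one fragment; flush as soon as the buffer reaches the threshold.
    fn push(&mut self, fragment: &[u8]) {
        self.buf.extend_from_slice(fragment);
        if self.buf.len() >= self.threshold {
            self.flush();
        }
    }

    /// Hand whatever is buffered to the sink. Call this once more at the end
    /// so the final partial chunk is not lost.
    fn flush(&mut self) {
        if !self.buf.is_empty() {
            let chunk = std::mem::replace(&mut self.buf, Vec::with_capacity(self.threshold));
            (self.sink)(chunk);
        }
    }
}
```

The threshold is the knob between the two behaviours described above: a small value approaches "send every little bit", a large one approaches "build everything first", and peak memory for the outgoing data stays roughly bounded by the threshold plus one fragment.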
My question is whether I'm using hyper::Body::channel correctly in combination with tokio and whether there is some way to make this still faster. For reference, the XML I'm parsing is 147MB, the peak memory usage is about 430MB and the time it takes from start to finish is 3min, which seems quite long.
Body::channel() is probably the least of your worries
I did some measurements a while ago, but just to make sure I ran one again. The XML parsing and aggregation part takes about 37s. This could likely also be improved, but it still leaves ~2.5min for the JSON sending, or rather for building the JSON from the internal data structures and sending it.
I assumed the sending is what takes so long, because the total time changes depending on whether I send very small pieces or first build larger JSON pieces and then send those.
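To tell the JSON building apart from the actual sending, each phase can be timed separately. A minimal sketch using only `std::time`; the helper name `timed` is made up for illustration and is not part of hyper or tokio:

```rust
use std::time::{Duration, Instant};

/// Runs `f` and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

Wrapping the parse step, the JSON-building step and the send step individually (for async code, taking `Instant::now()` before and `elapsed()` after the `.await`) should show which of them accounts for the ~2.5min.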
@inzanez It turns out we should be downcasting to AddrStream. From there you can access the inner TcpStream if you want:
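use hyper::server::conn::AddrStream;
// `upgraded` is the hyper::upgrade::Upgraded connection; downcast it to the concrete IO type.
let upgraded_parts = upgraded.downcast::<AddrStream>().unwrap();
// Unwrap the AddrStream to get at the underlying TcpStream.
let upgraded_tcp_stream = upgraded_parts.io.into_inner();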
Credit to sfackler on Discord.