@certik, multiplying series (cos*sin) is fastest with Flint, see below:

```
                                 N=100   N=1000
flint fmpq_poly_mul()             0.27       88
flint fmpq_poly_mullow()          0.29       91
piranha psin*pcos trunc.          0.69      690
piranha psin*pcos no trunc.       5.0      4670
pari sin(x)+cos(x)               11        1400
```

*The pari row is sin times cos, not sin plus cos.

@rwst great job! This is awesome. How do you like benchpress?

I started playing with Nonius in #634, but I don't like its dependence on Boost.

@bluescarni any ideas why Piranha is so slow?

Looking at https://github.com/rwst/series-benchmark/blob/bf4d56268878af0c7b142592e42460c83565f153/piranha.cpp#L17, wouldn't it be better to use the kronecker monomial?

@certik, benchpress needs some love: I had to fiddle quite a bit and install gcc-4.9, and even then it only worked the way I figured out myself, not as advertised. But I like how lightweight it is. And you're right, I should test the kronecker monomial.

@rwst another problem is with the length `N`. What *exactly* does `fmpq_poly_sin_series(x, x, N)` do?
In Piranha, the following code

```
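// Fragment: prod (presumably an integer starting at 1), psin and x are declared elsewhere.
for (unsigned int i = 0; i < 100; i++) {
    const short j = 2*i + 1;  // odd exponent: 1, 3, ..., 199
    if (i != 0)
        prod *= 1 - j;        // multiply by -(j-1), which also flips the sign
    prod *= j;                // prod is now (-1)^i * j!
    psin += rational{1, prod} * x.pow(j);
}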
```

returns a series for $\sin(x)$ up to $O(x^{199})$, but in Flint you might only be computing the series up to $O(x^{99})$.

@rwst would you mind rerunning the benchmarks with the correct series size? When we say length N, let's just make it mean $O(x^N)$, I think that's the most intuitive.
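To make the size mismatch concrete, here is a minimal sketch of the bookkeeping, under the convention proposed above that length N means $O(x^N)$, i.e. only exponents below N are kept. The helper names `highest_exponent` and `iterations_for_order` are made up for illustration and are not part of Piranha or Flint.

```cpp
#include <cassert>

// Highest exponent produced by a loop of the form
//   for (i = 0; i < iterations; i++) { j = 2*i + 1; ... x.pow(j) ... }
// Hypothetical helper, for illustration only.
unsigned highest_exponent(unsigned iterations) {
    assert(iterations >= 1);
    return 2 * (iterations - 1) + 1;
}

// Number of loop iterations needed so that only odd exponents j < N appear,
// i.e. the series is truncated at O(x^N) under the convention above.
unsigned iterations_for_order(unsigned N) {
    return N / 2;  // odd numbers below N: 1, 3, ..., so floor(N/2) of them
}
```

With these, `highest_exponent(100)` is 199 (the loop as posted), while `iterations_for_order(100)` is 50, which stops at $x^{99}$. So if the Flint call really keeps only terms below $x^{N}$, the Piranha loop bound would need to be `N/2` rather than `N` for the two benchmarks to do comparable work.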

I just submitted a benchmark when double precision coefficients are used: rwst/series-benchmark#1

And it runs at about 88% of theoretical peak performance on my computer.

@bluescarni we've discussed this already in private email, but since others might also be interested: what's nice is that in Piranha you can also switch to double precision coefficients. Comparing against my Fortran code then tells you the overall slowdown of Piranha's machinery relative to the theoretical maximum (which can essentially be attained with the above code). And double precision coefficients might actually be useful in some applications.
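For readers who want to see what the double precision variant boils down to, here is a rough sketch of a truncated dense product of the sin and cos series. This is neither the Fortran code from the pull request nor Piranha's implementation, just a plain schoolbook loop. The names `trig_series` and `mul_trunc` are hypothetical, and N again means $O(x^N)$.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Coefficients of sin(x) (first = 1) or cos(x) (first = 0), truncated to O(x^n).
// Hypothetical helper; coefficients are plain doubles.
std::vector<double> trig_series(std::size_t n, std::size_t first) {
    std::vector<double> c(n, 0.0);
    double term = 1.0;  // coefficient of x^first
    for (std::size_t k = first; k < n; k += 2) {
        c[k] = term;
        // Next coefficient two degrees up: divide by (k+1)(k+2) and flip sign.
        term = -term / static_cast<double>((k + 1) * (k + 2));
    }
    return c;
}

// Schoolbook product of two dense series, keeping only terms below x^n.
std::vector<double> mul_trunc(const std::vector<double>& a,
                              const std::vector<double>& b,
                              std::size_t n) {
    assert(a.size() >= n && b.size() >= n);
    std::vector<double> out(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; i + j < n; ++j)
            out[i + j] += a[i] * b[j];
    return out;
}
```

Calling `mul_trunc(trig_series(N, 1), trig_series(N, 0), N)` should reproduce the coefficients of $\sin(2x)/2$, which gives a quick sanity check. Note that with doubles the high-order coefficients become tiny and eventually underflow for large N, so such a run measures arithmetic throughput rather than a useful result.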