
how the pypy pieces fit together http://rpython.readthedocs.io/en/latest/translation.html#how-it-fits-together

Difficult / interesting things are: 1) edge case semantics (NaN, NaT etc.) 2) broadcasting 3) complicated indexing 4) the myriad of weird high-level helpers Numpy has
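To make the first two concrete, a minimal sketch of the kind of edge-case and broadcasting semantics a reimplementation has to match:

```python
import numpy as np

# edge-case semantics: NaN breaks reflexive equality
# (NaT is the datetime64 analogue)
a = np.array([1.0, np.nan])
print(a == a)            # [ True False]

# broadcasting: shapes (3, 1) and (2,) combine to (3, 2)
x = np.arange(3).reshape(3, 1)
y = np.array([10, 20])
print((x + y).shape)     # (3, 2)
```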

Of course you also want the exposed IR to be expressive enough for efficient optimizations (hence my remark about nditer() for commutative reductions)
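For context, `nditer` can already express a reduction directly — a minimal sketch, following the `reduce_ok` pattern from the nditer docs:

```python
import numpy as np

# nditer-driven reduction: the 0-d output broadcasts against the
# input, and 'reduce_ok' permits writing to the broadcast operand
a = np.arange(24).reshape(2, 3, 4)
total = np.array(0)
it = np.nditer([a, total], flags=['reduce_ok'],
               op_flags=[['readonly'], ['readwrite']])
for x, y in it:
    y[...] += x
print(total)  # sum of 0..23
```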

though `__add__` is a bit tricky in that numpy's scalar `__add__` does overflow checking
Call it "numnum"

"pitrou"

you can constant/variable expand the scalar value into a vector
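i.e. something along the lines of `np.broadcast_to`, which expands without copying — sketch:

```python
import numpy as np

# "expanding" a scalar into a vector without copying: broadcast_to
# returns a zero-stride view, so every element aliases the one value
v = np.broadcast_to(np.int32(7), (4,))
print(v)          # [7 7 7 7]
print(v.strides)  # (0,)
```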

and for no sensible reason, scalar add has overflow checking and `np.add` does not
(personally I think the ideal semantics would be that numpy by default does overflow-checking for both scalars and arrays, and that this can be controlled by something like `np.errstate`, so if people need maximum speed they can explicitly turn off the checking. In this case there would still be a branch to check whether checking was enabled, but this seems easily hoistable.)
```
In [7]: np.int32(2**31 - 1) * np.int32(2**31 - 1)
/home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: overflow encountered in int_scalars
#!/home/njs/.user-python3.5-64bit/bin/python3.5
Out[7]: 1
```

^^ python language semantics may disagree but :-/

and if you use arrays instead, you get 1 without the warning
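for comparison with the scalar case above (behavior as of the NumPy of that era):

```python
import numpy as np
import warnings

# same values as the scalar example, but as arrays: the int32
# multiplication wraps around silently -- no RuntimeWarning
a = np.array([2**31 - 1], dtype=np.int32)
with warnings.catch_warnings():
    warnings.simplefilter("error")  # any warning would raise here
    print(a * a)  # [1]
```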

(controlled by something like np.errstate)

right now it's unconditional for scalars and impossible for arrays; I think it should be consistent between scalars and arrays, controllable, and default to on (in decreasing order of priority)

Representative of the things that people would want to be faster in numpy

It could also help make it clearer how the work on the add function would end up translating into speedups

@kmod : there were some links to existing benchmarks earlier in chat here: https://gitter.im/python-compilers-workshop/chat?at=57851290b79455146fa44595

It seems like there's not much cross-C work to be done there

Or well, I don't know, it's not obvious to me what people would want to be faster

yeah, that cronbach benchmark is not going to really benefit from any fancy JIT

I bet if you google any numba talk or tutorial you will find a nice example of a function that could be made much faster given a jit that understands numpy :-)
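e.g. the classic tutorial shape — a hypothetical loop-heavy kernel (names here are illustrative, not from any particular talk) that is painfully slow in pure Python but trivially vectorizable/JIT-able:

```python
import numpy as np

# hypothetical tutorial-style kernel: nested Python loops over an
# array -- exactly the pattern a numpy-aware JIT can make fast
def pairwise_dist(X):
    n, d = X.shape
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(d):
                diff = X[i, k] - X[j, k]
                s += diff * diff
            out[i, j] = s ** 0.5
    return out

X = np.random.rand(10, 3)
# matches the vectorized equivalent
assert np.allclose(pairwise_dist(X),
                   np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)))
```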

@njsmith , I have a minimal PyIR interpreter working for our "array sum" function at https://github.com/sklam/pyir_interpreter

Mark had some foresight here :)