Looked at Falcon; about the data sizes it handles: 10M flights in the browser, and ~180M flights or ~1.7B stars when connected to OmniSciDB (formerly known as MapD).
That reminds me that, some months ago, I had to increase Chrome's heap size to around 12GB to visualize one of my datasets (BokehJS frontend, Go backend), since even the 64-bit version of Chrome has a default heap-size limit of about 3.5GB:
performance.memory.jsHeapSizeLimit / 1024 / 1024 / 1024
3.501772880554199
It seems Falcon hasn't hit this limit, while Bokeh already had.
The numeric-libs-benchmarks repo is still very incomplete w.r.t. fastest-matrices.
Suppose a 2D array initialisation where each entry depends on the entries immediately above it, to its left, and diagonally above-left. That can naturally be computed in parallel by traversing along the anti-diagonals. Example:
1 2 3
4 5 6
7 8 9
You start with 1, then compute 4 and 2 in parallel, then 7, 5, 3 in parallel, then 8 and 6 in parallel, then finally 9.
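To make that traversal order concrete, here is a minimal base-only sketch (the name `antiDiagonals` is mine, not massiv's) that enumerates the indices of an m-by-n grid grouped by anti-diagonal; all indices within one group are independent of each other, so each group can be computed in parallel once the previous group is done:

```haskell
-- Indices of an m-by-n grid, grouped by anti-diagonal (all (i, j) with i + j = d).
-- Each entry only depends on entries in earlier groups (above / left / above-left),
-- so the groups must run in order, but members of one group may run in parallel.
antiDiagonals :: Int -> Int -> [[(Int, Int)]]
antiDiagonals m n =
  [ [ (i, d - i) | i <- [max 0 (d - n + 1) .. min (m - 1) d] ]
  | d <- [0 .. m + n - 2]
  ]
```

For the 3x3 example above this yields [(0,0)], then [(0,1),(1,0)], then [(0,2),(1,1),(2,0)], and so on, i.e. exactly the 1, then 2 and 4, then 3, 5, 7 order.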
-- assuming these imports (from massiv, primitive, exceptions, unliftio-core):
import Control.Monad (forM_)
import Control.Monad.Catch (MonadThrow)
import Control.Monad.IO.Unlift (MonadUnliftIO)
import Control.Monad.Primitive (PrimMonad)
import Data.Massiv.Array as A

waterfallCreate ::
     (Mutable r Ix2 a, PrimMonad m, MonadUnliftIO m, MonadThrow m)
  => Sz2
  -> (Maybe a -> Maybe a -> a)
  -> (a -> a -> a)
  -> m (Array r Ix2 a)
waterfallCreate sz@(Sz2 m n) g f =
  createArray_ Par sz $ \scheduler marr -> do
    -- fill the left edge (A.read returns Nothing for the out-of-bounds index at i = 0)
    forM_ (0 ..: m) $ \i ->
      writeM marr (i :. 0) . g Nothing =<< A.read marr (i - 1 :. 0)
    forM_ (1 ..: n) $ \j -> do
      -- fill the top edge, then schedule the anti-diagonal below (0 :. j)
      writeM marr (0 :. j) . (`g` Nothing) =<< A.read marr (0 :. j - 1)
      let jIter = rangeStepSize Seq j (-1) (Sz (min (m - 1) j))
      iforSchedulerM_ scheduler jIter $ \i' -> writeF marr (i' + 1)
    forM_ (2 ..: m) $ \i -> do
      -- remaining anti-diagonals starting on the left edge
      let jIter = rangeStepSize Seq (n - 1) (-1) (Sz (min (n - 1) (m - i)))
      iforSchedulerM_ scheduler jIter $ \i' -> writeF marr (i' + i)
  where
    writeF marr i j = do
      left <- readM marr (i :. j - 1)
      top <- readM marr (i - 1 :. j)
      writeM marr (i :. j) $ f left top
    {-# INLINE writeF #-}
{-# INLINE waterfallCreate #-}
Note that it takes two functions, g and f, instead of one, because it is more performant if you supply separate functions: one that produces the elements on the borders and another one for the interior. This way f always knows that it will get the neighbours it needs and doesn't have to check for that edge case. Here is the cool part: because you know that the whole array will be filled up, and since f and g are pure, we are guaranteed that the whole waterfallCreate is pure, so we can safely wrap it in unsafePerformIO. Moreover, the interior read and write functions are guaranteed not to go out of bounds, so once you are done with all the testing you can improve the performance even further by switching to the unsafe* functions:
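As a sanity check for any of these variants, here is a purely sequential, base-only sketch of the same recurrence (quadratic list indexing, so only for small sizes; `waterfallRef` is a made-up name, not part of massiv):

```haskell
-- Lazy self-referential grid: row 0 and column 0 come from g, the interior
-- from f. Argument order mirrors waterfallCreate: g takes the left neighbour
-- then the top neighbour (Nothing on the borders), f takes left then top.
waterfallRef :: Int -> Int -> (Maybe a -> Maybe a -> a) -> (a -> a -> a) -> [[a]]
waterfallRef m n g f = grid
  where
    grid = [ [ cell i j | j <- [0 .. n - 1] ] | i <- [0 .. m - 1] ]
    cell 0 0 = g Nothing Nothing
    cell i 0 = g Nothing (Just ((grid !! (i - 1)) !! 0))  -- left edge: top only
    cell 0 j = g (Just ((grid !! 0) !! (j - 1))) Nothing  -- top edge: left only
    cell i j = f ((grid !! i) !! (j - 1)) ((grid !! (i - 1)) !! j)
```

Laziness takes care of the evaluation order here, so this version is handy for comparing results against the parallel one in ghci.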