    df = df.repartition(partition_size="100MB")

I get ValueError: need at least one array to concatenate
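For what it's worth, a hedged workaround sketch, not an official fix: compute the frame's in-memory size yourself and repartition by npartitions, which sidesteps the partition_size-based path that raises here. The toy frame below just stands in for the real df, and the 100 MB target is taken from the snippet above.

    import math
    import pandas as pd
    import dask.dataframe as dd

    # Toy frame standing in for the real df (assumed data).
    df = dd.from_pandas(pd.DataFrame({"x": range(1_000_000)}), npartitions=8)

    target_bytes = 100 * 1024 ** 2                                 # ~100 MB per partition
    total_bytes = int(df.memory_usage(deep=True).sum().compute())  # in-memory size of the data
    nparts = max(1, math.ceil(total_bytes / target_bytes))         # at least one partition
    df = df.repartition(npartitions=nparts)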
    def __init__(self, client):
        self.out_folder = None  # output location, assigned later
        self.client = client    # dask client passed in by the caller
I've been watching some videos about speeding up the dask scheduler and this is really exciting!
The last time I tested dask (1-2 years ago) it was too slow for our workflows: simple task graphs on 500-1000 nodes (~20,000 cores). Is anyone aware of the largest scale at which someone has run dask computations?
.visualize() that prevents me from seeing all these extra tasks:
False by default in
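If the option being discussed is optimize_graph (an assumption on my part, but it matches the "False by default" reply above), a minimal sketch of the difference it makes when rendering the graph:

    import dask.array as da

    # Toy graph; any dask collection works the same way.
    x = da.ones((1000, 1000), chunks=(100, 100))
    y = (x + 1).sum()

    # optimize_graph defaults to False, so this draws the raw low-level graph,
    # including every intermediate task.
    y.visualize(filename="raw.svg")

    # Rendering the optimized graph instead fuses many of those intermediates away.
    y.visualize(filename="optimized.svg", optimize_graph=True)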