- A+

I have some large arrays (~100 million points) that I need to interactively plot. I am currenlty using Matplotlib. Plotting the arrays as-is gets very slow and is a waste since you can't visualize that many points anyway.

So I made a min/max decimation function that I tied to the 'xlim_changed' callback of the axis. I went with a min/max approach because the data contains fast spikes that I do not want to miss by just stepping through the data. There are more wrappers that crop to the x-limits, and skip processing under certain conditions but the relevant part is below:

`def min_max_downsample(x,y,num_bins): """ Break the data into num_bins and returns min/max for each bin""" pts_per_bin = x.size // num_bins #Create temp to hold the reshaped & slightly cropped y y_temp = y[:num_bins*pts_per_bin].reshape((num_bins, pts_per_bin)) y_out = np.empty((num_bins,2)) #Take the min/max by rows. y_out[:,0] = y_temp.max(axis=1) y_out[:,1] = y_temp.min(axis=1) y_out = y_out.ravel() #This duplicates the x-value for each min/max y-pair x_out = np.empty((num_bins,2)) x_out[:] = x[:num_bins*pts_per_bin:pts_per_bin,np.newaxis] x_out = x_out.ravel() return x_out, y_out `

This works pretty well and is sufficiently fast (~80ms on 1e8 points & 2k bins). There is very little lag as it periodically recalculates & updates the line's x & y-data.

However, my only complaint is in the x-data. This code duplicates the x-value of each bin's left edge and doesn't return the true x-location of the y min/max pairs. I typically set the number of bins to double the axis pixel width. So you can't really see the difference because the bins are so small...but I know its there... and it bugs me.

So attempt number 2 which does return the actual x-values for every min/max pair. However it is about 5x slower.

`def min_max_downsample_v2(x,y,num_bins): pts_per_bin = x.size // num_bins #Create temp to hold the reshaped & slightly cropped y y_temp = y[:num_bins*pts_per_bin].reshape((num_bins, pts_per_bin)) #use argmax/min to get column locations cc_max = y_temp.argmax(axis=1) cc_min = y_temp.argmin(axis=1) rr = np.arange(0,num_bins) #compute the flat index to where these are flat_max = cc_max + rr*pts_per_bin flat_min = cc_min + rr*pts_per_bin #Create a boolean mask of these locations mm_mask = np.full((x.size,), False) mm_mask[flat_max] = True mm_mask[flat_min] = True x_out = x[mm_mask] y_out = y[mm_mask] return x_out, y_out `

This takes roughly 400+ ms on my machine which becomes pretty noticeable. So my question is basically is there a way to go faster and provide the same results? The bottleneck is mostly in the `numpy.argmin`

and `numpy.argmax`

functions which are a good bit slower than `numpy.min`

and `numpy.max`

.

The answer might be to just live with version #1 since it visually doesn't really matter. Or maybe try to speed it up something like cython (which I have never used).

FYI using Python 3.6.4 on Windows ... example usage would be something like this:

`x_big = np.linspace(0,10,100000000) y_big = np.cos(x_big ) x_small, y_small = min_max_downsample(x_big ,y_big ,2000) #Fast but not exactly correct. x_small, y_small = min_max_downsample_v2(x_big ,y_big ,2000) #correct but not exactly fast. `

I managed to get an improved performance by using the output of `arg(min|max)`

directly to index the data arrays. This comes at the cost of an extra call to `np.sort`

but the axis to be sorted has only two elements (the min. / max. indices) and the overall array is rather small (number of bins):

`def min_max_downsample_v3(x, y, num_bins): pts_per_bin = x.size // num_bins x_view = x[:pts_per_bin*num_bins].reshape(num_bins, pts_per_bin) y_view = y[:pts_per_bin*num_bins].reshape(num_bins, pts_per_bin) i_min = np.argmin(y_view, axis=1) i_max = np.argmax(y_view, axis=1) r_index = np.repeat(np.arange(num_bins), 2) c_index = np.sort(np.stack((i_min, i_max), axis=1)).ravel() return x_view[r_index, c_index], y_view[r_index, c_index] `

I checked the timings for your example and I obtained:

`min_max_downsample_v1`

: 110 ms ± 5 ms`min_max_downsample_v2`

: 240 ms ± 8.01 ms`min_max_downsample_v3`

: 164 ms ± 1.23 ms

I also checked returning directly after the calls to `arg(min|max)`

and the result was equally 164 ms, i.e. there's no real overhead after that anymore.