I have a fairly large 1-D NumPy array `Xold`. Its values should be replaced according to a rule specified by a 2-D NumPy array `Y`. An example would be:

```python
import numpy as np

Xold = np.array([0, 1, 2, 3, 4])
Y = np.array([[0, 0], [1, 100], [3, 300], [4, 400], [2, 200]])
```

Whenever a value in Xold is identical to a value in Y[:,0], the new value in Xnew should be the corresponding value in Y[:,1]. This is accomplished by two nested for loops:

```python
Xnew = np.zeros(len(Xold))
for i in range(len(Xold)):
    for j in range(len(Y)):
        if Xold[i] == Y[j, 0]:
            Xnew[i] = Y[j, 1]
```

With the given example, this yields `Xnew=[0,100,200,300,400]`. However, for large data sets this procedure is quite slow. What is a faster and more elegant way to accomplish this task?

We can use `np.searchsorted` for the generic case where the data in the first column of `Y` is not necessarily sorted -

```python
sidx = Y[:,0].argsort()
out = Y[sidx[np.searchsorted(Y[:,0], Xold, sorter=sidx)],1]
```

Sample run -

```
In [53]: Xold
Out[53]: array([14, 10, 12, 13, 11])

In [54]: Y
Out[54]:
array([[ 10,   0],
       [ 11, 100],
       [ 13, 300],
       [ 14, 400],
       [ 12, 200]])

In [55]: sidx = Y[:,0].argsort()
    ...: out = Y[sidx[np.searchsorted(Y[:,0], Xold, sorter=sidx)],1]

In [56]: out
Out[56]: array([400,   0, 200, 300, 100])
```
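As a quick sanity check, the one-liner can be compared against the question's nested loops on the original example. This is only a minimal sketch: the helper names `remap_loop` and `remap_searchsorted` are made up for illustration, and it assumes every value in `Xold` has an entry in `Y[:,0]`.

```python
import numpy as np

def remap_loop(Xold, Y):
    # Reference implementation: the nested loops from the question.
    Xnew = np.zeros(len(Xold))
    for i in range(len(Xold)):
        for j in range(len(Y)):
            if Xold[i] == Y[j, 0]:
                Xnew[i] = Y[j, 1]
    return Xnew

def remap_searchsorted(Xold, Y):
    # Assumption: every value in Xold appears in Y[:, 0].
    sidx = Y[:, 0].argsort()                            # order that sorts the keys
    pos = np.searchsorted(Y[:, 0], Xold, sorter=sidx)   # positions in sorted keys
    return Y[sidx[pos], 1]                              # map back to rows of Y

Xold = np.array([0, 1, 2, 3, 4])
Y = np.array([[0, 0], [1, 100], [3, 300], [4, 400], [2, 200]])

result = remap_searchsorted(Xold, Y)
print(result)  # [  0 100 200 300 400]
```

The loop version does `len(Xold) * len(Y)` comparisons, while the `searchsorted` version sorts the keys once and then does a binary search per element, which is where the speedup on large inputs should come from.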

If not every element of `Xold` has a corresponding mapping in `Y`, then we need to do a bit more work, like so -

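```python
sidx = Y[:,0].argsort()
sorted_indx = np.searchsorted(Y[:,0], Xold, sorter=sidx)
# Values larger than every key land one past the end; clip them to the last index
sorted_indx[sorted_indx==len(sidx)] = len(sidx)-1
idx_out = sidx[sorted_indx]
out = Y[idx_out,1]
# Where the key found does not equal the query, there is no mapping
out[Y[idx_out,0]!=Xold] = 0 # NA values as 0s
```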
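The same steps can be wrapped in a small function so that the placeholder for unmapped values is configurable. This is a hedged sketch rather than part of the original answer: the name `remap_with_default` and the `fill` parameter are hypothetical, and it assumes `Xold` is a 1-D array and that `fill` fits the dtype of `Y`.

```python
import numpy as np

def remap_with_default(Xold, Y, fill=0):
    # Hypothetical helper: same logic as above, with a configurable fill value.
    keys, vals = Y[:, 0], Y[:, 1]
    sidx = keys.argsort()
    pos = np.searchsorted(keys, Xold, sorter=sidx)
    # Queries larger than every key land one past the end; clip them.
    pos[pos == len(sidx)] = len(sidx) - 1
    idx_out = sidx[pos]
    out = vals[idx_out]                 # fancy indexing returns a copy, Y is untouched
    out[keys[idx_out] != Xold] = fill   # no exact key match: mark as missing
    return out

Xold = np.array([14, 10, 99, 13, 5])    # 99 and 5 have no entry in Y[:, 0]
Y = np.array([[10, 0], [11, 100], [13, 300], [14, 400], [12, 200]])

out = remap_with_default(Xold, Y, fill=-1)
print(out.tolist())  # [400, 0, -1, 300, -1]
```

Using a fill value that cannot occur as a real mapping (here `-1`) keeps missing entries distinguishable from keys that legitimately map to `0`.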