Replace elements in numpy array avoiding loops


I have a fairly large 1D numpy array Xold. Its values should be replaced according to a rule given by a 2D numpy array Y. For example:

    Xold = np.array([0, 1, 2, 3, 4])
    Y = np.array([[0, 0], [1, 100], [3, 300], [4, 400], [2, 200]])

Whenever a value in Xold is identical to a value in Y[:,0], the new value in Xnew should be the corresponding value in Y[:,1]. This is accomplished by two nested for loops:

    Xnew = np.zeros(len(Xold))
    for i in range(len(Xold)):
        for j in range(len(Y)):
            if Xold[i] == Y[j, 0]:
                Xnew[i] = Y[j, 1]

With the given example, this yields Xnew=[0,100,200,300,400]. However, for large data sets this procedure is quite slow. What is a faster and more elegant way to accomplish this task?


We can use np.searchsorted, passing the argsort of the first column of Y as the sorter argument, so that the first column does not need to be sorted:

    sidx = Y[:,0].argsort()
    out = Y[sidx[np.searchsorted(Y[:,0], Xold, sorter=sidx)],1]

Sample run:

    In [53]: Xold
    Out[53]: array([14, 10, 12, 13, 11])

    In [54]: Y
    Out[54]:
    array([[ 10,   0],
           [ 11, 100],
           [ 13, 300],
           [ 14, 400],
           [ 12, 200]])

    In [55]: sidx = Y[:,0].argsort()
        ...: out = Y[sidx[np.searchsorted(Y[:,0], Xold, sorter=sidx)],1]

    In [56]: out
    Out[56]: array([400,   0, 200, 300, 100])
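To see why the one-liner works, it may help to split it into its intermediate steps. The following is a minimal sketch using the example arrays from the question; the names keys, pos and rows are introduced here for illustration only and are not part of the original answer.

```python
import numpy as np

Xold = np.array([0, 1, 2, 3, 4])
Y = np.array([[0, 0], [1, 100], [3, 300], [4, 400], [2, 200]])

keys = Y[:, 0]                 # lookup keys, not necessarily sorted
sidx = keys.argsort()          # permutation that would sort the keys

# For each element of Xold, find its position within the *sorted* keys.
# The sorter argument lets searchsorted work on the unsorted array.
pos = np.searchsorted(keys, Xold, sorter=sidx)

rows = sidx[pos]               # translate sorted positions back to rows of Y
Xnew = Y[rows, 1]              # pick the replacement values

print(Xnew)
```

This assumes every value in Xold occurs in the first column of Y; otherwise rows can point at the wrong entry or fall out of range.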

If some elements of Xold have no corresponding mapping in Y, we need a bit more work: clamp the out-of-range search positions, then overwrite the entries whose key does not actually match:

    sidx = Y[:,0].argsort()
    sorted_indx = np.searchsorted(Y[:,0], Xold, sorter=sidx)
    sorted_indx[sorted_indx==len(sidx)] = len(sidx)-1
    idx_out = sidx[sorted_indx]
    out = Y[idx_out,1]
    out[Y[idx_out,0]!=Xold] = 0 # NA values as 0s
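The same steps could be wrapped in a small helper, roughly as sketched below. The function name remap and its fill parameter are hypothetical and chosen here for illustration; np.clip is used in place of the explicit index assignment, which should be equivalent because searchsorted never returns a negative position.

```python
import numpy as np

def remap(x, mapping, fill=0):
    """Replace values of x using a (key, value) table; unmatched entries get fill.

    Hypothetical helper: assumes mapping is a non-empty 2D array whose first
    column holds unique keys and whose second column holds the new values.
    """
    keys, values = mapping[:, 0], mapping[:, 1]
    sidx = keys.argsort()
    pos = np.searchsorted(keys, x, sorter=sidx)
    # Values larger than every key yield len(keys); clamp them into range.
    pos = np.clip(pos, 0, len(sidx) - 1)
    rows = sidx[pos]
    # A lookup only counts if the key found really equals the queried value.
    found = keys[rows] == x
    return np.where(found, values[rows], fill)

Y = np.array([[10, 0], [11, 100], [13, 300], [14, 400], [12, 200]])
Xold = np.array([14, 99, 12, 5, 11])   # 99 and 5 have no mapping in Y

result = remap(Xold, Y)
result_marked = remap(Xold, Y, fill=-1)
print(result, result_marked)
```

Using a fill value that cannot occur among the real replacement values (here -1) keeps missing entries distinguishable from keys that legitimately map to 0.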

