# Why does Python copy NumPy arrays when the lengths of the dimensions are the same?


I have a problem with references to a NumPy array. I have a list of arrays of the form

```python
import numpy as np

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
```

and if I now create a new variable

```python
b = np.array(a)
```

and do

```python
b += 1
print(a)
```

then `a` does not change:

```
a = [array([0. , 0.2, 0.4, 0.6, 0.8]),
     array([0. , 0.2, 0.4, 0.6, 0.8]),
     array([0. , 0.2, 0.4, 0.6, 0.8])]
```

But if I do the same thing with:

```python
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6])]
```

where I removed one number from the end of the last array, and then run this again:

```python
b = np.array(a)
b += 1
print(a)
```

Now `a` changes, which is what I thought was the normal behavior in Python.

```
a = [array([1. , 1.2, 1.4, 1.6, 1.8]),
     array([0. , 0.2, 0.4, 0.6, 0.8]),
     array([0. , 0.2, 0.4, 0.6])]
```

Can anybody explain this to me?

In a nutshell, this is a consequence of your data. The two cases behave differently because, in the second one, your arrays are not equally sized.

With equally sized sub-arrays, the elements can be loaded into a memory-efficient scheme in which any N-D array is represented by a compact 1-D array in memory. NumPy then handles the translation of multi-dimensional indexes to 1-D indexes internally. For example, index [i, j] of a 2-D array maps to i*N + j, where N is the length of a row (if storing in row-major format). The data from the original list of arrays is copied into this compact 1-D array, so any modifications made to the new array do not affect the original.
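The row-major mapping can be checked with a short sketch. The names `N`, `flat`, `same_element` and `shares` are illustrative and do not come from the question. It builds the 2-D array from three equally sized sub-arrays, looks up one element through its flat position i*N + j, and confirms that the new array does not share memory with the inputs.

```python
import numpy as np

# Three equally sized 1-D arrays, as in the first example of the question.
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]

b = np.array(a)            # data is copied into one contiguous (3, 5) block
N = b.shape[1]             # row length

flat = b.ravel()           # 1-D view of the same memory (b is C-contiguous)
i, j = 2, 3
same_element = flat[i * N + j] == b[i, j]

# Check whether the new block overlaps any of the original arrays.
shares = any(np.shares_memory(b, x) for x in a)

b += 1                     # modifies only b's own buffer

print(b.shape, b.dtype)    # (3, 5) float64
print(same_element)        # True
print(shares)              # False
print(a[0])                # the original values, unchanged
```

`np.shares_memory` is used here only as a convenient way to make the copy visible. Comparing `a` before and after `b += 1` shows the same thing.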

With ragged lists/arrays, this cannot be done. NumPy instead builds a 1-D array of dtype `object`, which is effectively a Python list where each element is a Python object. For efficiency, only the object references are copied and not the data. This is why you can mutate the original list elements in the second case but not the first.
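A minimal sketch of the ragged case, under two assumptions that are not in the question. First, it passes `dtype=object` explicitly, because recent NumPy releases refuse to build an array from ragged input without it, while older releases did so implicitly as in the question. Second, instead of `b += 1` it mutates a single element in place, which is enough to show that the array stores references to the original objects rather than copies of their data.

```python
import numpy as np

# Ragged input: the last sub-array is one element shorter.
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6])]

# dtype=object is an assumption added here; see the note above.
b = np.array(a, dtype=object)

print(b.shape, b.dtype)    # (3,) object
print(b[0] is a[0])        # True: b holds a reference to the same ndarray

b[0] += 1                  # in-place add on the shared ndarray
print(a[0])                # a[0] has changed as well
print(a[1])                # untouched
```

Because `b[0]` and `a[0]` are the same object, any in-place operation reached through one name is visible through the other.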