# Why does nansum work for input that exceeds matrix dimensions?


I am wondering about MATLAB's `nansum` function.

When I use the example from the documentation

```matlab
X = magic(3);
X([1 6:9]) = repmat(NaN, 1, 5);

X =
   NaN     1   NaN
     3     5   NaN
     4   NaN   NaN
```

and then call

```matlab
>> nansum(X, 1)

ans =
     7     6     0

>> nansum(X, 2)

ans =
     1
     8
     4
```

it works as expected.

However, what I did not expect is that it also works for

```matlab
>> nansum(X, 400)

ans =
     0     1     0
     3     5     0
     4     0     0
```

What is the reasoning here? Why doesn't this fail with an error saying that `dim` exceeds the matrix dimensions?

In MATLAB, every array or matrix has infinitely many trailing singleton dimensions.

A singleton dimension is a dimension `dim` for which `size(A,dim)` is 1. It is called a trailing singleton dimension when it comes after all non-singleton dimensions (i.e. it doesn't change the structure of the matrix).
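As a quick check of this definition, the following minimal sketch queries a few trailing dimensions of an ordinary 2-D matrix. The matrix `A` and the chosen dimension numbers are just illustrative.

```matlab
A = ones(3,4);

ndims(A)          % >> ans = 2   (trailing singleton dimensions are not counted)
size(A, 3)        % >> ans = 1   (the 3rd dimension is a trailing singleton)
size(A, 400)      % >> ans = 1   (and so is the 400th)

% Indexing with extra trailing 1s addresses the same element
A(2, 3, 1, 1)     % >> ans = 1
```

So asking for dimension 400 is perfectly legal; it is just a dimension of length 1.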

Any function (including `nansum`) which can operate on a specific dimension can do so on any one of the infinitely many singleton dimensions. Often you won't see any effect (for instance, using `max` or `sum` in this way simply returns the input), but `nansum` replaces `NaN` with zero, so that is the only change you see.

Note that `nansum(A,dim)` is the same as `sum(A,dim,'omitnan')`. You can see this by typing `edit nansum`. So my examples use `sum` for simplicity. See the note [1] at the bottom of this answer for references about defined behaviour.
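Given that equivalence, the result from the question can be reproduced without `nansum` at all. This is a sketch that assumes a MATLAB release where `sum` accepts the `'omitnan'` flag; the variable `Y` is only an illustrative name.

```matlab
X = magic(3);
X([1 6:9]) = NaN;

% Each "sum" along dimension 400 covers exactly one element.
% With 'omitnan' a lone NaN is skipped, leaving an empty sum of 0.
Y = sum(X, 400, 'omitnan');

isequal(size(Y), size(X))   % the shape is unchanged
any(isnan(Y(:)))            % checks whether any NaN survived
```

Every non-`NaN` element passes through untouched, which is why the output looks like the input with `NaN` swapped for 0.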

Let's try to visualise this:

```matlab
A = ones(3,4);

size( A )     % >> ans = [3, 4]
              % Under the hood:
              % size( A ) = [3, 4, 1, 1, 1, 1, ...]

sum( A, 1 )   % Sum through the rows, or the 1st dimension, which has 3 elements per sum
              % >> ans = [3 3 3 3]
sum( A, 2 )   % Sum through the columns, or the 2nd dimension, which has 4 elements per sum
              % >> ans = [4; 4; 4]
sum( A, 400 ) % Sum through the ???, the 400th dimension, which has 1 element per sum
              % >> ans = [1 1 1 1; 1 1 1 1; 1 1 1 1]
```

If you wanted, you could `reshape` the original matrix to have singleton 2nd through 399th dimensions to further this:

```matlab
% Set up dimensions as [3, 1, 1, ..., 1, 1, 4], for a 400-D array!
dims = num2cell( [3 ones(1,398), 4] );
% Note we'll now still have trailing singleton dims, but have 398 in the structure too
B = reshape( A, dims{:} );
```

Now we can do a similar `sum` example. The final thing to know is that `squeeze` removes the non-trailing singleton dimensions, so we can use it to tidy up the outputs:

```matlab
sum( B, 1 );              % >> ans(:,:,1,1,1,...,1) = 3
                          % >> ans(:,:,1,1,1,...,2) = 3
                          % >> ans(:,:,1,1,1,...,3) = 3
                          % >> ans(:,:,1,1,1,...,4) = 3
squeeze( sum( B, 1 ) );   % >> ans = [3; 3; 3; 3]
% similarly
squeeze( sum( B, 2 ) );   % >> ans = [1 1 1 1; 1 1 1 1; 1 1 1 1]
squeeze( sum( B, 400 ) ); % >> ans = [4; 4; 4]
```

We can see that, now that the array has been reshaped, summing in the 400th dimension does the same as originally summing in the 2nd dimension, and vice versa. This would be easier to visualise if you replaced 400 with 3!
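Following that suggestion, here is the same experiment with 3 in place of 400. It is a small sketch; `B3` is a hypothetical name chosen to avoid clashing with the 400-D `B`.

```matlab
A  = ones(3,4);
B3 = reshape( A, [3 1 4] );   % 3-D analogue: the 4 elements now lie along dimension 3

size( B3 )                    % >> ans = [3, 1, 4]
squeeze( sum( B3, 1 ) )       % >> ans = [3; 3; 3; 3]
squeeze( sum( B3, 2 ) )       % >> ans = [1 1 1 1; 1 1 1 1; 1 1 1 1]
squeeze( sum( B3, 3 ) )       % >> ans = [4; 4; 4]
```

Here dimension 2 is the singleton one, so summing along it leaves the data unchanged, exactly as summing along dimension 400 did for the original 2-D matrix.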

[1] See the `sum` and `max` documentation as examples where the behaviour is explicitly defined "if dim is greater than `ndims(A)`." In both cases the implementation is made more efficient by just returning `A`. In the case of `nansum` there has to be some computation in case elements are `NaN`.
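The documented behaviour for `sum` and `max` is easy to confirm directly. This is a minimal check; the dimension 5 is an arbitrary choice greater than `ndims(A)`.

```matlab
A = magic(3);

isequal( sum(A, 5), A )       % >> ans = 1 (true): sum returns A unchanged
isequal( max(A, [], 5), A )   % >> ans = 1 (true): so does max
```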