The well-known formula for the OLS estimator is (X'X)^(-1)X'y, where X is n x K and y is n x 1. One way to implement this in Julia is (X'*X)\X'*y.
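As a minimal sketch of the formula above, the snippet below solves the normal equations with Julia's left-division operator `\` and compares the result with a direct least-squares solve `X \ y`. The sizes `n` and `K`, the coefficient vector `beta`, and the simulated data are all hypothetical values chosen only for illustration.

```julia
using LinearAlgebra, Random

Random.seed!(1)              # fixed seed so the sketch is reproducible
n, K = 200, 3                # hypothetical sample size and number of regressors
X = randn(n, K)              # simulated design matrix (assumption)
beta = [1.0, -2.0, 0.5]      # hypothetical true coefficients
y = X * beta + 0.1 * randn(n)

# Normal equations: solve (X'X) b = X'y without forming an explicit inverse.
b_normal = (X' * X) \ (X' * y)

# Least-squares solve on X directly; for a tall dense matrix Julia uses a
# pivoted QR factorization here.
b_qr = X \ y

# Distance between the two estimates; expected to be tiny for
# well-conditioned data like this.
println(norm(b_normal - b_qr))
```

Both lines compute the same estimator in exact arithmetic. The direct solve avoids forming X'X, which squares the condition number, so it is usually the safer default when X is poorly conditioned.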

## Why does a*b*a take longer than (a'*(a*b)')' when using gpuArray in MATLAB scripts?

The code below performs the same operation on gpuArrays a and b in two different ways. The first part computes (a'*(a*b)')', while the second part computes a*b*a. The results are then verified to be the same.
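For reference, the two expressions are algebraically identical. This follows from the transpose rule $(AB)^\top = B^\top A^\top$, and the same steps go through when `'` is the conjugate transpose:

$$
\left(a^\top (ab)^\top\right)^\top = \left((ab)^\top\right)^\top \left(a^\top\right)^\top = (ab)\,a = a\,b\,a
$$

Any timing difference therefore comes from how the operations are executed, not from what is being computed.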