The NVidia GP100 has 30 TPC circuits and 240 "texture units". Do the TPCs and texture units get used by TensorFlow, or are these disposable bits of silicon for machine learning?
The code below performs the operation the same operation on gpuArrays a and b in two different ways. The first part computes (a'*(a*b)')' , while the second part computes a*b*a. The results are then verified to be the same.