Why does perf report so many context switches?


I was trying to understand Linux perf and found some really confusing behavior:

I wrote a simple multithreaded example with one thread pinned to each core; each thread runs its computation locally and does not communicate with the others (see test.cc below). I expected this example to have very few context switches, if any. However, profiling it with perf shows thousands of context switches, far more than I expected. For comparison I also profiled the Linux command sleep 20, which shows far fewer context switches.

This profile result does not make any sense to me. What is causing so many context switches?

> sudo perf stat -e sched:sched_switch ./test

 Performance counter stats for './test':

             6,725      sched:sched_switch

      20.835 seconds time elapsed

> sudo perf stat -e sched:sched_switch sleep 20

 Performance counter stats for 'sleep 20':

                 1      sched:sched_switch

      20.001 seconds time elapsed

To reproduce the results, run the following commands:

perf stat -e context-switches sleep 20
perf stat -e context-switches ./test

To compile the source code, run:

g++ -std=c++11 -pthread -o test test.cc

// test.cc
#include <pthread.h>  // pthread_setaffinity_np
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET

#include <iostream>
#include <thread>
#include <vector>

int main(int argc, const char** argv) {
  unsigned num_cpus = std::thread::hardware_concurrency();
  std::cout << "Launching " << num_cpus << " threads\n";

  std::vector<std::thread> threads(num_cpus);
  for (unsigned i = 0; i < num_cpus; ++i) {
    // Pure busy loop: no sleeping, no I/O, no communication.
    threads[i] = std::thread([i] {
      int j = 0;
      while (j++ < 100) {
        int tmp = 0;
        while (tmp++ < 110000000) { }
      }
    });

    // Pin thread i to CPU i.
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(i, &cpuset);
    int rc = pthread_setaffinity_np(threads[i].native_handle(),
                                    sizeof(cpu_set_t), &cpuset);
    if (rc != 0) {
      std::cerr << "Error calling pthread_setaffinity_np: " << rc << "\n";
    }
  }

  for (auto& t : threads) {
    t.join();
  }
  return 0;
}


We can't tell you exactly what is being scheduled - but you can find out yourself using perf.

perf record -e sched:sched_switch ./test 

Note that this requires a mounted debugfs and root permissions. Afterwards, perf report gives you an overview of what the scheduler was switching to (or use perf script for a full listing). Nothing in your code would obviously cause a context switch (e.g. sleeping or waiting for I/O), so most likely other tasks are being scheduled on these cores.

The reason why sleep has almost no context switches is simple. It goes to sleep almost immediately - which is one context switch. While the task is not active, it cannot be displaced by another task.
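The two cases can also be told apart from inside the process. The following is only a minimal sketch, not part of the original test.cc: it assumes Linux, where getrusage() fills in ru_nvcsw (voluntary switches, e.g. the task went to sleep) and ru_nivcsw (involuntary switches, i.e. the task was preempted). The names SwitchCounts, read_switch_counts, measure and print_demo are made up for illustration.

```cpp
// Hypothetical helper, not part of the original test.cc.
#include <sys/resource.h>  // getrusage, struct rusage, RUSAGE_SELF

#include <chrono>
#include <iostream>
#include <thread>

struct SwitchCounts {
  long voluntary;    // task gave up the CPU itself (sleep, blocking I/O)
  long involuntary;  // scheduler took the CPU away (preemption)
};

// Read the counters for the whole process (RUSAGE_SELF by default).
// Returns {-1, -1} if getrusage() fails.
inline SwitchCounts read_switch_counts(int who = RUSAGE_SELF) {
  struct rusage ru {};
  if (getrusage(who, &ru) != 0) return {-1, -1};
  return {ru.ru_nvcsw, ru.ru_nivcsw};
}

// Run `work` and return how many switches of each kind happened meanwhile.
template <typename Work>
SwitchCounts measure(Work work) {
  SwitchCounts before = read_switch_counts();
  work();
  SwitchCounts after = read_switch_counts();
  return {after.voluntary - before.voluntary,
          after.involuntary - before.involuntary};
}

// Compare a sleeping task with a busy loop similar to the one in test.cc.
inline void print_demo() {
  SwitchCounts slept = measure(
      [] { std::this_thread::sleep_for(std::chrono::milliseconds(50)); });
  SwitchCounts spun = measure([] {
    volatile unsigned long tmp = 0;  // volatile keeps the loop from being removed
    while (tmp < 100000000UL) { tmp = tmp + 1; }
  });
  std::cout << "sleep: voluntary=" << slept.voluntary
            << " involuntary=" << slept.involuntary << "\n"
            << "spin:  voluntary=" << spun.voluntary
            << " involuntary=" << spun.involuntary << "\n";
}
```

Calling print_demo() from main should show the sleep as a voluntary switch, while any switches during the busy loop would be expected to appear as involuntary ones, which fits the explanation that other tasks are displacing the pinned threads. The exact numbers depend on what else is running on the machine. On Linux, passing RUSAGE_THREAD instead of RUSAGE_SELF restricts the counters to the calling thread.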

