What's causing so much overhead in Google BigQuery query?

  • A+

I am running the following function to profile a BigQuery query:

# q = "SELECT * FROM bqtable LIMIT 1'''  def run_query(q):     t0 = time.time()     client = bigquery.Client()     t1 = time.time()     res = client.query(q)     t2 = time.time()     results = res.result()     t3 = time.time()     records = [_ for _ in results]     t4 = time.time()     print (records[0])     print ("Initialize BQClient: %.4f | ExecuteQuery: %.4f | FetchResults: %.4f | PrintRecords: %.4f | Total: %.4f | FromCache: %s" % (t1-t0, t2-t1, t3-t2, t4-t3, t4-t0, res.cache_hit)) 

And, I get something like the following:

Initialize BQClient: 0.0007 | ExecuteQuery: 0.2854 | FetchResults: 1.0659 | PrintRecords: 0.0958 | Total: 1.4478 | FromCache: True

I am running this on a GCP machine and it is only fetching ONE result in location US (same region, etc.), so the network transfer should (I hope?) be negligible. What's causing all the overhead here?

I tried this on the GCP console and it says the cache hit takes less than 0.1s to return, but in actuality, it's over a second. Here is an example video to illustrate: https://www.youtube.com/watch?v=dONZH1cCiJc.

Notice for the first query, for example, it says it returned in 0.253s from cache:

What's causing so much overhead in Google BigQuery query?

However, if you view the above video, the query actually STARTED at 7 seconds and 3 frames --

What's causing so much overhead in Google BigQuery query?

And it COMPLETED at 8 seconds and 13 frames --

What's causing so much overhead in Google BigQuery query?

That is well over a second -- almost a second and a half!! That number is similar to what I get when I execute a query from the command-line in python.

So why then does it report that it only took 0.253s when in actuality, to do the query and return the one result, it takes over five times that amount?

In other words, it seems like there's about a second overhead REGARDLESS of the query time (which are not noted at all in the execution details). Are there any ways to reduce this time?


The UI is reporting the query execution time, not the total time.

Query execution time is how long it takes BigQuery to actually scan the data and compute the result. If it's just reading from cache then it will be very quick and usually under 1 second, which reflects the timing you're seeing.

However that doesn't include downloading the result table and displaying it in the UI. You measured this yourself in your Python script which shows the FetchResults step taking over 1 second, and this is the same thing that's happening in the browser console. For example, a cached query containing millions of rows will be executed very quickly but might take 30 seconds to fully download.

BigQuery is a large-scale analytical (OLAP) system and is designed for throughput rather than latency. It uses a distributed design with an intensive planning process and writes all results to temporary tables. This allows it to process petabytes in seconds but the trade-off is that every query will take a few seconds to run, no matter how small.

You can look at the official documentation for more info on query planning and performance, but in this situation there is no way to reduce the total latency any further and a few seconds is currently the best case scenario for BigQuery.

If you need lower response times for repeated queries then you should look into storing the results in your own caching layer (like Redis), or using BigQuery to aggregate data into a much smaller dataset and then storing that in a traditional relational database (like Postgres or MySQL).


:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: