How to do binary search by table with known data order in specific fields SQL

  • A+
Category:Languages

I have a table with structure like this:

create table to_much_data (     id primary key clustered,     dt datetime,     data varbinary(400) ) 

it didn't have an index by datetime, but i know that dt non-decreasing sequence. I need to query data from this table with specific condition by date field like this:

select *  from to_much_data where dt between '20190220' and '20190221' 

since no index for dt, i prefer to convert query to:

select *  from to_much_data where id between StartDateID and EndDateID  

I believe that StartDateID and EndDateID could be found with log(N) or better complexity. But I didn't know any solution to do this.

Does any one know the way to do it?

UPD

It looks like no wide-known ready to use solution exists. If index creation is not possible some workarounds can be used:

  • filtered index, but it can affect table performance and increase locks
  • another table with mapping, but it needs to be manually updated (or through trigger or stored procedure) and it can affect performance and increase locks
  • t-sql code with silly binary search but it looks like bicycle reinventing

despite this I believe that databases can be more effective and intuitive in some cases like this. I will be glad if some day we will be able to write:

select *  from to_much_data with(sequence_order(id asc, dt asc)) where dt between '20190220' and '20190221' 

 


You could just reproduce the binary search algorithm in TSQL or use a recursive CTE but this is still going to need greater than 70 seeks to get both ends and is tedious to do.

A possible middle ground might be to create an indexed view with at least every nth row. For example

CREATE VIEW dbo.to_much_data_Sample WITH SCHEMABINDING AS   SELECT id,          dt   FROM   dbo.to_much_data   WHERE  id % 100000 = 0  GO  CREATE UNIQUE CLUSTERED INDEX ix   ON dbo.to_much_data_Sample(dt, id); 

you can then use (assuming id is an integer)

DECLARE @StartDate DATETIME = '20190220',         @EndDate   DATETIME = '20190221';  DECLARE @StartDateID INT,         @EndDateID   INT;  SELECT TOP 1 @StartDateID = id FROM   dbo.to_much_data_Sample WITH (NOEXPAND) WHERE  dt < @StartDate ORDER  BY dt DESC;  SELECT TOP 1 @EndDateID = id FROM   dbo.to_much_data_Sample WITH (NOEXPAND) WHERE  dt > @EndDate ORDER  BY dt ASC;  SELECT * FROM   to_much_data WHERE  id BETWEEN isnull(@StartDateID, -2147483648) AND isnull(@EndDateID, 2147483647)        AND dt BETWEEN @StartDate AND @EndDate;  

How to do binary search by table with known data order in specific fields SQL

The value of n will be a trade off between index size and number of additional rows read at runtime.

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: