Efficiently find an integer not in a set of size 40, 400, or 4000

  • A+
Category:Languages

Related to the classic problem find an integer not among four billion given ones but not exactly the same.

To clarify, by integers what I really mean is only a subset of its mathemtical definition. That is, assume there are only finite number of integers. Say in C++, they are int in the range of [INT_MIN, INT_MAX].

Now given a std::vector<int> (no duplicates) or std::unordered_set<int>, whose size can be 40, 400, 4000 or so, but not too large, how to efficiently generate a number that is guaranteed to be not among the given ones?

If there is no worry for overflow, then I could multiply all nonzero ones together and add the product by 1. But there is. The adversary test cases could delibrately contain INT_MAX.

I am more in favor of simple, non-random approaches. Is there any?

Thank you!

Update: to clear up ambiguity, let's say an unsorted std::vector<int> which is guaranteed to have no duplicates. So I am asking if there is anything better than O(n log(n)). Also please note that test cases may contain both INT_MIN and INT_MAX.

 


You could just return the first of N+1 candidate integers not contained in your input. The simplest candidates are the numbers 0 to N. This requires O(N) space and time.

 int find_not_contained(container<int> const&data)  {      const int N=data.size();      std::vector<char> known(N+1, 0);   // one more candidates than data      for(int i=0; i< N; ++i)          if(data[i]>=0 && data[i]<=N)              known[data[i]]=1;      for(int i=0; i<=N; ++i)          if(!known[i])              return i;      assert(false);                     // should never be reached.  } 

Random methods can be more space efficient, but may require more passes over the data in the worst case.

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: