Efficiently accessing arbitrarily deep dictionaries

  • A+

Suppose I have a multi-level dictionary like this

mydict = {     'first': {         'second': {             'third': {                 'fourth': 'the end'              }          }      } } 

I'd like to access it like this

test = get_entry(mydict, 'first.second.third.fourth') 

What I have so far is

def get_entry(dict, keyspec):     keys = keyspec.split('.')      result = dict[keys[0]]     for key in keys[1:]:        result = dict[key]      return result 

Are there more efficient ways to do it? According to %timeit the runtime of the function is 1.26us, while accessing the dictionary the standard way like this

foo = mydict['first']['second']['third']['fourth'] 

takes 541ns. I'm looking for ways to trim it to 800ns range if possible.


There's really only one solution. Rebuild your dictionary. But do it just once.

def recursive_flatten(mydict):     d = {}     for k, v in mydict.items():         if isinstance(v, dict):             for k2, v2 in recursive_flatten(v).items():                 d[k + '.' + k2] = v2          else:             d[k] = v     return d 

In [786]: new_dict = recursive_flatten(mydict); new_dict Out[786]: {'first.second.third.fourth': 'the end'} 

(Some more tests)

In [788]: recursive_flatten({'x' : {'y' : 1, 'z' : 2}, 'y' : {'a' : 5}, 'z' : 2}) Out[788]: {'x.y': 1, 'x.z': 2, 'y.a': 5, 'z': 2}  In [789]: recursive_flatten({'x' : 1, 'y' : {'x' : 234}}) Out[789]: {'x': 1, 'y.x': 234} 

Every access becomes constant time from here on.

Now, just access your value using new_dict['first.second.third.fourth']. Should work for any arbitrarily nested dictionary that does not contain a self-reference.

Note that every solution has its fair share of tradeoffs, this is no exception. Unless you're firing millions of queries at your data such that preprocessing is an acceptable overhead, then this is it. With the other solutions, you are only sidestepping the problem instead of addressing it - which is dealing with the dictionary's structure. OTOH, if you're going to do this once on many such similar data structures, it make no sense to preprocess just for a single query, in which case you may prefer one of the other solutions.


:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: