Flatten a list of nested ordered dictionaries in Python


I am a bit confused about how to extract information from a nested list of ordered dictionaries in Python. For example:

list_of_interest = [
    OrderedDict([('name', 'Viscozyme'), ('company', 'Roche (Chile)')]),
    [OrderedDict([('name', 'Davictrel'), ('company', None)]),
     OrderedDict([('name', 'Enbrel Sureclick'), ('company', None)]),
     OrderedDict([('name', 'Tunex'), ('company', None)])],
    OrderedDict([('name', 'Angiox'), ('company', None)]),
    [OrderedDict([('name', 'Enantone'), ('company', None)]),
     OrderedDict([('name', 'Leuplin'), ('company', 'Takeda')]),
     OrderedDict([('name', 'LeuProMaxx'), ('company', 'Baxter/Teva')]),
     OrderedDict([('name', 'Leupromer'), ('company', None)]),
     OrderedDict([('name', 'Lutrate'), ('company', None)]),
     OrderedDict([('name', 'Memryte'), ('company', 'Curaxis')]),
     OrderedDict([('name', 'Prostap 3'), ('company', 'Takeda UK')]),
     OrderedDict([('name', 'Prostap SR'), ('company', 'Takeda UK')]),
     OrderedDict([('name', 'Viadur'), ('company', 'Bayer AG')])],
    OrderedDict([('name', 'Geref'), ('company', 'Serono Pharma')]),
]

I need to extract all items under 'name'.

So I need a function:

get_names(list_of_interest) --> ['Viscozyme', 'Davictrel', 'Enbrel Sureclick', 'Tunex', 'Angiox', 'Enantone', ..., 'Geref'] 

I tried nested list comprehensions, generator expressions, and even a pandas DataFrame, but they all fail because some elements of the outer list are single dictionaries rather than lists.

 


from collections import OrderedDict

list_of_interest = [
    OrderedDict([('name', 'Viscozyme'), ('company', 'Roche (Chile)')]),
    [OrderedDict([('name', 'Davictrel'), ('company', None)]),
     OrderedDict([('name', 'Enbrel Sureclick'), ('company', None)]),
     OrderedDict([('name', 'Tunex'), ('company', None)])],
    OrderedDict([('name', 'Angiox'), ('company', None)]),
    [OrderedDict([('name', 'Enantone'), ('company', None)]),
     OrderedDict([('name', 'Leuplin'), ('company', 'Takeda')]),
     OrderedDict([('name', 'LeuProMaxx'), ('company', 'Baxter/Teva')]),
     OrderedDict([('name', 'Leupromer'), ('company', None)]),
     OrderedDict([('name', 'Lutrate'), ('company', None)]),
     OrderedDict([('name', 'Memryte'), ('company', 'Curaxis')]),
     OrderedDict([('name', 'Prostap 3'), ('company', 'Takeda UK')]),
     OrderedDict([('name', 'Prostap SR'), ('company', 'Takeda UK')]),
     OrderedDict([('name', 'Viadur'), ('company', 'Bayer AG')])],
    OrderedDict([('name', 'Geref'), ('company', 'Serono Pharma')]),
]

names = []
for item in list_of_interest:
    if isinstance(item, OrderedDict):
        # Top-level element is a single dictionary.
        names.append(item['name'])
    else:
        # Top-level element is a list of dictionaries.
        for list_ord_dict in item:
            names.append(list_ord_dict['name'])

print(names)
# ['Viscozyme', 'Davictrel', 'Enbrel Sureclick', 'Tunex', 'Angiox',
#  'Enantone', 'Leuplin', 'LeuProMaxx', 'Leupromer', 'Lutrate', 'Memryte',
#  'Prostap 3', 'Prostap SR', 'Viadur', 'Geref']

Your list contains two kinds of items: single OrderedDicts and lists of OrderedDicts. You can confirm this by iterating over the main list and printing the type of each item. If the nesting goes deeper, you can use a recursive function that calls itself whenever it encounters a list. For the dataset you provided, the code above works fine.
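For deeper nesting, the recursive idea might look something like the sketch below. It keeps the name get_names from the question; the key parameter and the small sample list are assumptions added for illustration, and the sketch assumes every non-list item is a dictionary that contains the requested key.

```python
from collections import OrderedDict


def get_names(items, key="name"):
    """Collect the value stored under `key` from arbitrarily nested lists of dicts."""
    names = []
    for item in items:
        if isinstance(item, (list, tuple)):
            # Nested sequence: recurse and keep the results in order.
            names.extend(get_names(item, key))
        elif isinstance(item, dict):
            # OrderedDict is a dict subclass, so this branch covers it too.
            names.append(item[key])
        else:
            raise TypeError(f"unexpected item type: {type(item).__name__}")
    return names


# Hypothetical sample with one extra level of nesting around 'Tunex'.
sample = [
    OrderedDict([("name", "Viscozyme"), ("company", "Roche (Chile)")]),
    [
        OrderedDict([("name", "Davictrel"), ("company", None)]),
        [OrderedDict([("name", "Tunex"), ("company", None)])],
    ],
    OrderedDict([("name", "Geref"), ("company", "Serono Pharma")]),
]

print(get_names(sample))
# ['Viscozyme', 'Davictrel', 'Tunex', 'Geref']
```

Checking against dict rather than OrderedDict means plain dictionaries are handled as well, and raising TypeError on anything else makes unexpected data visible instead of silently skipping it.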
