Pythonic way of collapsing/grouping a list to aggregating max/min

  • A+
Category:Languages

Lets say I have the following list in python. It is ordered first by Equip, then by Date:

my_list = [     {'Equip': 'A-1', 'Job': 'Job 1', 'Date': '2018-01-01'},     {'Equip': 'A-1', 'Job': 'Job 1', 'Date': '2018-01-02'},     {'Equip': 'A-1', 'Job': 'Job 1', 'Date': '2018-01-03'},     {'Equip': 'A-1', 'Job': 'Job 2', 'Date': '2018-01-04'},     {'Equip': 'A-1', 'Job': 'Job 2', 'Date': '2018-01-05'},     {'Equip': 'A-2', 'Job': 'Job 1', 'Date': '2018-01-03'},     {'Equip': 'A-2', 'Job': 'Job 3', 'Date': '2018-01-04'},     {'Equip': 'A-2', 'Job': 'Job 3', 'Date': '2018-01-05'} ] 

What I want to do is collapse the list by each set where a given piece of Equipment's job does not change, and grab the first and last date the equipment was there. E.g., this simple example should change to:

list_by_job = [     {'Equip': 'A-1', 'Job': 'Job 1', 'First': '2018-01-01', 'Last': '2018-01-03'},     {'Equip': 'A-1', 'Job': 'Job 2', 'First': '2018-01-04', 'Last': '2018-01-05'},     {'Equip': 'A-2', 'Job': 'Job 1', 'First': '2018-01-03', 'Last': '2018-01-03'},     {'Equip': 'A-2', 'Job': 'Job 3', 'First': '2018-01-04', 'Last': '2018-01-05'} ] 

A couple of things to note:

  1. A-2 on Job 1 is only there for a single day, thus its First and Last Date should be the same.
  2. A piece of equipment could be on a job, leave that job, and come back. In this case, I'd need to see an entry for each time it was on the job, not just one single summary.
  3. As stated before, the list is already sorted first by Equip, then by Date, so that ordering can be assumed. (If there is a better way to sort to accomplish this, I am all ears)

For point 3, the list

my_list = [     {'Equip': 'A-1', 'Job': 'Job 1', 'Date': '2018-01-01'},     {'Equip': 'A-1', 'Job': 'Job 2', 'Date': '2018-01-02'},     {'Equip': 'A-1', 'Job': 'Job 1', 'Date': '2018-01-03'} ] 

should yield

    list_by_job = [         {'Equip': 'A-1', 'Job': 'Job 1', 'First': '2018-01-01', 'Last': '2018-01-01'},         {'Equip': 'A-2', 'Job': 'Job 2', 'First': '2018-01-02', 'Last': '2018-01-02'},         {'Equip': 'A-1', 'Job': 'Job 1', 'First': '2018-01-03', 'Last': '2018-01-03'}     ] 

Currently I am doing so in a simple loop/non-pythonic way:

list_by_job = []  last_entry = None for entry in my_list:     if last_entry is None or last_entry['Equip'] != entry['Equip'] or last_entry['Job'] != entry['Job']:       list_by_job.append({'Equip': entry['Equip'], 'Job': entry['Job'], 'First': entry['Date'], 'Last': entry['Date']})     else:       list_by_job[-1]['Last'] = entry['Date']     last_entry = entry 

Is there a more pythonic way to do this using Python's list comprehension, etc?

 


You can use itertools.groupby:

import itertools def _key(d):   return (d['Equip'], d['Job'])  my_list = [{'Date': '2018-01-01', 'Equip': 'A-1', 'Job': 'Job 1'}, {'Date': '2018-01-02', 'Equip': 'A-1', 'Job': 'Job 1'}, {'Date': '2018-01-03', 'Equip': 'A-1', 'Job': 'Job 1'}, {'Date': '2018-01-04', 'Equip': 'A-1', 'Job': 'Job 2'}, {'Date': '2018-01-05', 'Equip': 'A-1', 'Job': 'Job 2'}, {'Date': '2018-01-03', 'Equip': 'A-2', 'Job': 'Job 1'}, {'Date': '2018-01-04', 'Equip': 'A-2', 'Job': 'Job 3'}, {'Date': '2018-01-05', 'Equip': 'A-2', 'Job': 'Job 3'}] new_data = [[a, list(b)] for a, b in itertools.groupby(my_list, key=_key)] final_result = [{"Equip":c, 'Job':d, 'First':b[0]['Date'], 'Last':b[-1]['Date']} for [c, d], b in new_data] 

Output:

[{'Equip': 'A-1', 'Job': 'Job 1', 'Last': '2018-01-03', 'First': '2018-01-01'},   {'Equip': 'A-1', 'Job': 'Job 2', 'Last': '2018-01-05', 'First': '2018-01-04'},   {'Equip': 'A-2', 'Job': 'Job 1', 'Last': '2018-01-03', 'First': '2018-01-03'},   {'Equip': 'A-2', 'Job': 'Job 3', 'Last': '2018-01-05', 'First': '2018-01-04'}] 

Edit:

Using data as suggested in your comment:

my_list = [{'Date': '2018-01-01', 'Equip': 'A-1', 'Job': 'Job 1'}, {'Date': '2018-01-02', 'Equip': 'A-1', 'Job': 'Job 2'}, {'Date': '2018-01-03', 'Equip': 'A-1', 'Job': 'Job 1'}, {'Date': '2018-01-04', 'Equip': 'A-1', 'Job': 'Job 2'}, {'Date': '2018-01-05', 'Equip': 'A-1', 'Job': 'Job 2'}, {'Date': '2018-01-03', 'Equip': 'A-2', 'Job': 'Job 1'}, {'Date': '2018-01-04', 'Equip': 'A-2', 'Job': 'Job 3'}, {'Date': '2018-01-05', 'Equip': 'A-2', 'Job': 'Job 3'}] 

Output:

[{'Equip': 'A-1', 'Job': 'Job 1', 'Last': '2018-01-01', 'First': '2018-01-01'},   {'Equip': 'A-1', 'Job': 'Job 2', 'Last': '2018-01-02', 'First': '2018-01-02'},   {'Equip': 'A-1', 'Job': 'Job 1', 'Last': '2018-01-03', 'First': '2018-01-03'},   {'Equip': 'A-1', 'Job': 'Job 2', 'Last': '2018-01-05', 'First': '2018-01-04'},   {'Equip': 'A-2', 'Job': 'Job 1', 'Last': '2018-01-03', 'First': '2018-01-03'},   {'Equip': 'A-2', 'Job': 'Job 3', 'Last': '2018-01-05', 'First': '2018-01-04'}] 

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: