Using pandas to add list elements together


I have the following array of dicts:

items = [
    {'FirstName': 'David', 'Language': ['en']},
    {'FirstName': 'David', 'Language': ['fr']},
    {'FirstName': 'David', 'Language': ['en']},
    {'FirstName': 'Bob', 'Language': ['en']},
]

I want to group these by FirstName and combine the unique languages for each name, like so:

items = [
    {'FirstName': 'David', 'Language': ['en', 'fr']},
    {'FirstName': 'Bob', 'Language': ['en']},
]

The equivalent SQL would be:

SELECT FirstName, GROUP_CONCAT(DISTINCT Language ORDER BY Language) FROM items GROUP BY FirstName 

Using pandas, how would I group by FirstName and get a list of the unique languages for each group? Here is what I have so far:

>>> import pandas
>>> df = pandas.DataFrame(items)
>>> (df.groupby('FirstName')['Language']
...    .apply(lambda x: list(set(x)))  # this line is off
...    .reset_index()
...    .to_dict(orient='records'))


Aggregate everything with sum() (which concatenates the lists within each group), transform the values to sets, and then call to_dict():

>>> df.groupby('FirstName').sum()["Language"].transform(set).reset_index().to_dict(orient='records')
[{'FirstName': 'Bob', 'Language': {'en'}},
 {'FirstName': 'David', 'Language': {'en', 'fr'}}]
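
Note that the result above contains sets rather than lists, and the groups come back sorted by name (Bob before David). If you need sorted lists in order of first appearance, as in the expected output in the question, something along these lines might work. This is only a sketch: the helper name unique_languages is made up for illustration, and it assumes every Language value is a list of strings.

```python
import pandas as pd

items = [
    {'FirstName': 'David', 'Language': ['en']},
    {'FirstName': 'David', 'Language': ['fr']},
    {'FirstName': 'David', 'Language': ['en']},
    {'FirstName': 'Bob', 'Language': ['en']},
]


def unique_languages(rows):
    """Group rows by FirstName and collect sorted, unique languages.

    Hypothetical helper, not part of pandas.
    """
    df = pd.DataFrame(rows)
    return (
        # sort=False keeps groups in order of first appearance.
        df.groupby('FirstName', sort=False)['Language']
        # Each group is a Series of lists: flatten it, de-duplicate
        # with a set comprehension, then sort into a plain list.
        .apply(lambda lists: sorted({lang for sub in lists for lang in sub}))
        .reset_index()
        .to_dict(orient='records')
    )


result = unique_languages(items)
print(result)
```

Flattening inside the lambda avoids calling set() directly on a Series of lists, which fails because lists are not hashable. The sorted() call plays the role of ORDER BY Language in the SQL version.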
