Add a new column to a pandas DataFrame based on another column


I'm trying to add a new column to a pandas DataFrame. The new column, df['Year_Prod'], is derived from an existing column, df['title'], from which I extract the year.

Data example:

country    designation     title
Italy      Vulkà Bianco    Nicosia 2013 Vulkà Bianco (Etna)
Portugal   Avidagos        Quinta dos Avidagos 2011 Avidagos Red (Douro)

Code:

import re
import pandas as pd

df = pd.read_csv(r'test.csv', index_col=0)
df['Year_Prod'] = re.findall('\d+', df['title'])
print(df.head(10))

I am getting the following error:

  File "C:/Python37/lib/site-packages/pandas/core/frame.py", line 3119, in __setitem__
    self._set_item(key, value)
  File "C:/Python37/lib/site-packages/pandas/core/frame.py", line 3194, in _set_item
    value = self._sanitize_column(key, value)
  File "C:/Python37/lib/site-packages/pandas/core/frame.py", line 3391, in _sanitize_column
    value = _sanitize_index(value, self.index, copy=False)
  File "C:/Python37/lib/site-packages/pandas/core/series.py", line 4001, in _sanitize_index
    raise ValueError('Length of values does not match length of ' 'index')
ValueError: Length of values does not match length of index

Please let me know your thoughts on this, thanks.

 


You can use the pandas str.extract method:

df['Year_Prod'] = df.title.str.extract(r'(\d{4})')

    country     designation     title                                           Year_Prod
0   Italy       Vulkà Bianco    Nicosia 2013 Vulkà Bianco (Etna)                2013
1   Portugal    Avidagos        Quinta dos Avidagos 2011 Avidagos Red (Douro)   2011
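For anyone who wants to try this without the CSV, here is a minimal, self-contained sketch of the same str.extract approach. The third row is a made-up title with no year (it is not from the question), included only to show that unmatched rows come back as missing values. The conversion to the nullable Int64 dtype is an optional extra and assumes a pandas version that supports it.

```python
import pandas as pd

# First two rows are the sample data from the question; the third row is
# a hypothetical title without a year, added to show the no-match case.
df = pd.DataFrame({
    "country": ["Italy", "Portugal", "France"],
    "designation": ["Vulkà Bianco", "Avidagos", None],
    "title": [
        "Nicosia 2013 Vulkà Bianco (Etna)",
        "Quinta dos Avidagos 2011 Avidagos Red (Douro)",
        "Non-vintage Brut (Champagne)",
    ],
})

# With a single capture group, expand=False returns a Series of strings;
# rows where the pattern does not match become NaN.
year = df["title"].str.extract(r"(\d{4})", expand=False)

# Optional: store the year as a nullable integer so missing years stay <NA>.
df["Year_Prod"] = pd.to_numeric(year).astype("Int64")

print(df[["title", "Year_Prod"]])
```

Passing expand=False makes str.extract return a Series rather than a one-column DataFrame, which is convenient when the pattern has only one capture group.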

Edit: As @Paul H. pointed out in the comments, your code fails because re.findall expects a single string, but you are passing it a whole Series. You could use apply instead, which passes each row's title to the function as a string, but there is little reason to, since str.extract is more efficient.

df.title.apply(lambda x: re.findall(r'\d{4}', x)[0])
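One caveat with the apply version: re.findall returns an empty list when a title contains no four-digit run, so indexing the result with [0] raises IndexError. Below is a sketch of a more defensive variant. The helper name first_year is hypothetical, and the third title is invented to exercise the no-match case.

```python
import re

import pandas as pd

YEAR_RE = re.compile(r"\d{4}")


def first_year(title):
    """Return the first four-digit run in title, or None if there is none."""
    if not isinstance(title, str):
        return None  # e.g. NaN from a missing title
    match = YEAR_RE.search(title)
    return match.group(0) if match else None


titles = pd.Series([
    "Nicosia 2013 Vulkà Bianco (Etna)",
    "Quinta dos Avidagos 2011 Avidagos Red (Douro)",
    "Non-vintage Brut (Champagne)",  # hypothetical row without a year
])

# Each element is passed to first_year as a plain Python string.
years = titles.apply(first_year)
print(years)
```

The result holds the years as strings; rows without a match are left empty instead of raising an error.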
