How to treat NULL as a normal string with pandas?


I have a CSV file with a column of strings that I want to read with pandas. In this file the string null occurs as an actual value and should not be treated as a missing value.

Example:

    import pandas as pd
    from io import StringIO

    data = u'strings,numbers\nfoo,1\nbar,2\nnull,3'
    print(pd.read_csv(StringIO(data)))

This gives the following output:

      strings  numbers
    0     foo        1
    1     bar        2
    2     NaN        3

What can I do to get the value null as it is (and not as NaN) into the DataFrame? The file can be assumed to not contain any actually missing values.

 


You can specify a converters argument for the string column.

    pd.read_csv(StringIO(data), converters={'strings': str})

      strings  numbers
    0     foo        1
    1     bar        2
    2    null        3

This bypasses pandas' automatic parsing for that column, so null is kept as an ordinary string. Other columns are still parsed as usual.
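To see the difference side by side, here is a minimal sketch that reads the same data twice, once with default settings and once with the converter. The variable names default_df and converted_df are only illustrative.

```python
from io import StringIO

import pandas as pd

data = "strings,numbers\nfoo,1\nbar,2\nnull,3"

# Default parsing: the literal text "null" is treated as a missing value.
default_df = pd.read_csv(StringIO(data))

# The converter str is applied to every raw field of the 'strings' column,
# so "null" is kept as ordinary text. 'numbers' is still parsed as integers.
converted_df = pd.read_csv(StringIO(data), converters={"strings": str})

print(default_df["strings"].isna().tolist())
print(converted_df["strings"].tolist())
print(converted_df["numbers"].tolist())
```

Only the column named in converters is affected, which is what makes this the more targeted of the two approaches.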


Another option is setting na_filter=False:

    pd.read_csv(StringIO(data), na_filter=False)

      strings  numbers
    0     foo        1
    1     bar        2
    2    null        3

This turns off missing-value detection for the entire DataFrame, so use it with caution. I recommend the converters option if you want to apply this surgically to selected columns instead.
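Why the caution matters can be sketched with a hypothetical variant of the data in which one numbers field is empty (the question says the real file has no such gaps, so this is only an assumption for illustration). The names data_with_gap, unfiltered and per_column are made up for this example.

```python
from io import StringIO

import pandas as pd

# Hypothetical variant of the data: the 'numbers' field of the second row is empty.
data_with_gap = "strings,numbers\nfoo,1\nbar,\nnull,3"

# na_filter=False disables missing-value detection in every column, so the
# empty field stays an empty string and 'numbers' is no longer numeric.
unfiltered = pd.read_csv(StringIO(data_with_gap), na_filter=False)

# converters only touches 'strings'; the empty 'numbers' field is still
# detected as missing and the column stays numeric.
per_column = pd.read_csv(StringIO(data_with_gap), converters={"strings": str})

print(unfiltered["numbers"].tolist())
print(per_column["numbers"].isna().tolist())
print(per_column["strings"].tolist())
```

In this sketch both calls keep null as text, but only the converters call still flags the empty numbers field as missing.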
