Subset pandas dataframe by dtype [duplicate]



I have a pandas dataframe df with a column, call it A, that contains multiple data types. I want to select all rows of df where A has a particular data type.

For example, suppose that A contains both int and str values. I want to do something like df[type(df['A']) == int].

 


Setup

import pandas as pd

df = pd.DataFrame({'A': ['hello', 1, 2, 3, 'bad']})

Because the values are mixed, this entire column is assigned dtype object. If you just want to find the numeric values:

pd.to_numeric(df.A, errors='coerce').dropna()  

1    1.0
2    2.0
3    3.0
Name: A, dtype: float64

However, this would also allow floats, string representations of numbers, etc. into the mix. If you really want to find elements that are of type int, you can use a list comprehension:

df.loc[[isinstance(val, int) for val in df.A], 'A'] 

1    1
2    2
3    3
Name: A, dtype: object

But notice that the dtype of the result is still object.
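If an integer dtype is wanted for the filtered values, one possible follow-up (not part of the original answer) is to cast the result with astype. A minimal sketch, where the names ints and ints_typed are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': ['hello', 1, 2, 3, 'bad']})

# Keep only the rows whose value in A is an int; the result is still object dtype.
ints = df.loc[[isinstance(val, int) for val in df.A], 'A']

# Cast the filtered values to an integer dtype (assumes no bools or NaN slipped through).
ints_typed = ints.astype(int)

print(ints.dtype)
print(ints_typed.dtype)
```

The original index labels (1, 2, 3) are preserved, so the result can still be aligned with df.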


If the column has Boolean values, these will be kept as well, since bool is a subclass of int. If you don't want this behavior, you can compare with type instead of using isinstance.
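To illustrate the difference, here is a small sketch using a made-up column that also contains True and False (this sample data is an assumption, not from the question):

```python
import pandas as pd

# Hypothetical sample column mixing str, int, and bool values.
df = pd.DataFrame({'A': ['hello', 1, True, 2, 'bad', False]})

# isinstance also matches True/False, because bool subclasses int.
with_bools = df.loc[[isinstance(val, int) for val in df.A], 'A']

# An exact type check keeps only plain Python ints.
ints_only = df.loc[[type(val) is int for val in df.A], 'A']

print(with_bools.tolist())
print(ints_only.tolist())
```

The exact check also excludes other int subclasses, so it is the stricter of the two filters.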
