Checking that a pandas.Series.index contains a value


I (think I) know how to check if a value is contained in the index of a pandas Series, but I can't get it to work in the example below. Is it a bug perhaps?

First, I generate some random numbers:

import numpy as np
import pandas as pd

some_numbers = np.random.randint(0, 4, size=10)
print(some_numbers)


[0 2 2 3 1 1 2 2 3 2] 

Then, I create a Series with those numbers and compute their frequencies:

s = pd.Series(some_numbers)
gb = s.groupby(s).size() / len(s)
print(gb)


0    0.1
1    0.2
2    0.5
3    0.2
dtype: float64

So far, so good. But I do not understand the output of the next line of code:

1.3 in gb

True



Shouldn't the output be False? (I have pandas 0.20.3 on Python 3.6.2)

I know that I could use

1.3 in list(gb.index) 

but this is not very efficient if the Series is large.
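
One way to avoid building a Python list is to compare against the index's underlying NumPy array. This is only a minimal sketch: the numbers are the ones printed above, hard-coded so the result is reproducible, and the helper name index_contains is made up for illustration.

```python
import numpy as np
import pandas as pd

# Same numbers as printed in the question, hard-coded for reproducibility.
some_numbers = np.array([0, 2, 2, 3, 1, 1, 2, 2, 3, 2])
s = pd.Series(some_numbers)
gb = s.groupby(s).size() / len(s)

def index_contains(series, value):
    """Hypothetical helper: exact, vectorized membership test on the index labels."""
    # An elementwise comparison on the raw array sidesteps any casting
    # that the index's own label lookup might apply.
    return bool((series.index.values == value).any())

print(index_contains(gb, 1.3))  # False
print(index_contains(gb, 1))    # True
```

The comparison is still linear in the length of the index, but it runs in NumPy rather than over Python objects.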


import pandas as pd
s = pd.Series([.1,.2,.3])
print(s)

0    0.1
1    0.2
2    0.3
dtype: float64

3.4 in s

False

but, wait for it...

s = pd.Series([.1,.2,.3,.4])
print(s)

0    0.1
1    0.2
2    0.3
3    0.4
dtype: float64

3.4 in s

True


I believe that the issue is that gb.index is an int64 index:

>>> gb.index
Int64Index([0, 1, 2, 3], dtype='int64')

>>> type(gb.index)
<class 'pandas.core.indexes.numeric.Int64Index'>

and so when you test 1.3 against it, the value appears to be converted to an int before the lookup. Some evidence for this: values up to 3.99999 return True, because converting them to int gives at most 3, whereas 4.000001 in gb.index returns False, because converting 4.000001 to int gives 4, which is not in gb.index.
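
The truncation argument can be reproduced in plain Python. This sketch only mimics the suspected cast with int(); it does not call into pandas, and the helper name truncated_lookup is hypothetical.

```python
# The integer labels of gb.index, written out by hand.
labels = [0, 1, 2, 3]

def truncated_lookup(value):
    """Hypothetical helper: truncate to int first, then test membership."""
    return int(value) in labels

for value in (1.3, 3.99999, 4.000001):
    # int() drops the fractional part, so 3.99999 becomes 3 and 4.000001 becomes 4.
    print(value, int(value), truncated_lookup(value))
# 1.3 1 True
# 3.99999 3 True
# 4.000001 4 False
```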

If you force it to a float index, the check returns False, because 1.3 is not in Float64Index([0.0, 1.0, 2.0, 3.0], dtype='float64'):

>>> 1.3 in gb.index.astype('float')
False

Tested with pandas 0.21.1 and Python 3.6.3.
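
If the check has to work on the Series itself rather than on a converted copy of its index, one option is to reassign the index. A minimal sketch, assuming the frequencies from the question rebuilt by hand; the name gb_float is made up for illustration.

```python
import pandas as pd

# Frequencies from the question, rebuilt by hand with an integer index.
gb = pd.Series([0.1, 0.2, 0.5, 0.2], index=[0, 1, 2, 3])

# Work on a copy so the original integer index is left untouched.
gb_float = gb.copy()
gb_float.index = gb_float.index.astype(float)

print(1.3 in gb_float)  # False
print(1.0 in gb_float)  # True
```

Because `in` on a Series tests the index, the float labels are now compared without any truncation to int.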

