# How to efficiently fillna(0) if series is all-nan, or else remaining non-nan entries are zero?


Given that I have a pandas Series, I want to fill the NaNs with zero if either all the values are NaN or if all the values are either zero or NaN.

For example, I would want to fill the NaNs in the following Series with zeroes.

```
0      0
1      0
2      NaN
3      NaN
4      NaN
5      NaN
6      NaN
7      NaN
8      NaN
```

But, I would not want to fillna(0) the following Series:

```
0      0
1      0
2      2
3      0
4      NaN
5      NaN
6      NaN
7      NaN
8      NaN
```
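For reference, the two example Series above can be constructed like this (the names `s_fill` and `s_keep` are just illustrative):

```python
import numpy as np
import pandas as pd

# All values are either 0 or NaN -> we want to fillna(0)
s_fill = pd.Series([0, 0] + [np.nan] * 7)

# Contains a non-zero, non-NaN value (2) -> leave the NaNs alone
s_keep = pd.Series([0, 0, 2, 0] + [np.nan] * 5)

print(s_fill)
print(s_keep)
```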

I was looking at the documentation, and it seems like I could use `pandas.Series.value_counts` to ensure the values are only 0 and NaN, and then simply call `fillna(0)`. In other words, I want to check whether `set(s.unique().astype(str)).issubset(['0.0', 'nan'])` holds; if so, call `fillna(0)`, and otherwise leave the Series alone.

Considering how powerful pandas is, it seemed like there may be a better way to do this. Does anyone have suggestions for doing this cleanly and efficiently?

Potential solution, thanks to cᴏʟᴅsᴘᴇᴇᴅ:

```python
if s.dropna().eq(0).all():
    s = s.fillna(0)
```

You can compare with `0` and check `isna` to test whether the Series contains only `NaN`s and `0`s, and then `fillna`:

```python
if ((s == 0) | (s.isna())).all():
    s = pd.Series(0, index=s.index)
```

Or compare unique values:

```python
if pd.Series(s.unique()).fillna(0).eq(0).all():
    s = pd.Series(0, index=s.index)
```

@cᴏʟᴅsᴘᴇᴇᴅ's solution, thank you - compare the Series with the `NaN`s removed via `dropna`:

```python
if s.dropna().eq(0).all():
    s = pd.Series(0, index=s.index)
```

The solution from the question - the values need to be converted to `string`s, because direct comparisons with `NaN`s are problematic:

```python
if set(s.unique().astype(str)).issubset(['0.0', 'nan']):
    s = pd.Series(0, index=s.index)
```

Timings:

```python
s = pd.Series(np.random.choice([0, np.nan], size=10000))

In [68]: %timeit ((s == 0) | (s.isna())).all()
The slowest run took 4.85 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 574 µs per loop

In [69]: %timeit pd.Series(s.unique()).fillna(0).eq(0).all()
1000 loops, best of 3: 587 µs per loop

In [70]: %timeit s.dropna().eq(0).all()
The slowest run took 4.65 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 774 µs per loop

In [71]: %timeit set(s.unique().astype(str)).issubset(['0.0','nan'])
The slowest run took 5.78 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 157 µs per loop
```
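The timings above only exercise a Series of zeros and NaNs. As a sanity check on correctness (not speed), each of the four conditions should also return `False` for the mixed Series from the question; a quick sketch:

```python
import numpy as np
import pandas as pd

# The Series from the question that should NOT be filled (contains a 2)
s = pd.Series([0, 0, 2, 0] + [np.nan] * 5)

# Every condition evaluates to False, so fillna(0) is correctly skipped
print(((s == 0) | (s.isna())).all())                          # False
print(pd.Series(s.unique()).fillna(0).eq(0).all())            # False
print(s.dropna().eq(0).all())                                 # False
print(set(s.unique().astype(str)).issubset(['0.0', 'nan']))   # False
```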