Description
The comment below is a TL;DR for this feature request.
See below for the original bug report.
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
    import pandas as pd

    def function_that_should_work_with_any_arg_type(obj, default_value=None):
        if obj != default_value:
            ...

    function_that_should_work_with_any_arg_type(pd.DataFrame())
Issue Description
Background
I have close to zero experience with pandas, but I'm touching some input and output pandas objects (like DataFrames) as part of a Django application - which uses Sentry for monitoring. In one place, we're using Django's cache framework to cache GeoDataFrame objects (which inherit from DataFrame).
Now, Sentry has a Django integration that hooks onto the caching framework, and it does the following check when calling cache.get():
https://github.com/getsentry/sentry-python/blob/b069aa24fdf3c52a9e8b75f4f83d5fee035c3234/sentry_sdk/integrations/django/caching.py#L86
In our case, value is a GeoDataFrame, and default_value is None - which creates the example scenario above.
Potentially relevant installed package versions
Django : 4.2.23
geopandas : 1.0.1
sentry-sdk : 2.43.0
Issue
The code in the example above will raise ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). (because obj != default_value returns a DataFrame object).
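To illustrate the mechanism without depending on pandas, here is a minimal sketch. `FrameLike` and `ElementwiseResult` are hypothetical stand-ins that I made up for this example; they only model the two pieces that matter here (a comparison that returns a non-bool object, and a `__bool__` that refuses to produce a single truth value) and are not how pandas is actually implemented.

```python
class ElementwiseResult:
    """Hypothetical stand-in for the object an element-wise comparison returns."""

    def __bool__(self):
        # Models the refusal to collapse many values into one truth value.
        raise ValueError("The truth value of an ElementwiseResult is ambiguous.")


class FrameLike:
    """Hypothetical stand-in for a DataFrame; only the comparison hooks are modeled."""

    def __eq__(self, other):
        return ElementwiseResult()

    def __ne__(self, other):
        return ElementwiseResult()


def function_that_should_work_with_any_arg_type(obj, default_value=None):
    # `if` implicitly calls bool() on whatever `!=` returned.
    if obj != default_value:
        return "different"
    return "same"


def demo():
    """Return the comparison result's type name and the error message, if any."""
    result = FrameLike() != None  # noqa: E711 (deliberate, mirrors the report)
    try:
        function_that_should_work_with_any_arg_type(FrameLike())
    except ValueError as exc:
        return type(result).__name__, str(exc)
    return type(result).__name__, None
```

The comparison itself succeeds; the exception only appears once the `if` statement asks the returned object for its truth value.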
Use cases similar to Sentry's Django cache hook seem perfectly reasonable and innocuous: comparing two arbitrary objects for equality and checking the truth value of the result should always be a type-safe operation. This therefore feels like a bug in pandas, which is why I chose the bug report issue template. However, from my understanding, this behavior is a deliberate design choice on your end (even if it feels like a slight misuse of Python's magic method system, to my naive mind 😅), so you can also consider this a feature request, hence "ENH" in the issue title 🙂
Lastly, it could be relevant to mention that the current implementation of DataFrame.__eq__() and DataFrame.__ne__() (through OpsMixin) violates the conventional return type contract of object.__eq__() and object.__ne__(), which are annotated as returning bool:
https://github.com/python/typeshed/blob/4239e94a3148fd3250e894f9b8e5a4ccbfe2f5a0/stdlib/builtins.pyi#L124-L125
Expected Behavior
That obj != default_value (in the example above) returns True (i.e. a bool, not some DataFrame) and that obj == default_value returns False, and that ValueError is not raised.
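Until the comparison itself returns a bool, a caller could approximate the expected behavior with a guard along these lines. This is only a sketch: `differs_from_default` is a hypothetical helper (not part of pandas, Django, or sentry-sdk), and `FrameLike` is a made-up stand-in so the example runs without pandas. It assumes that an object whose comparison result has no single truth value should count as "different" from the default, which is reasonable when the default is None but not a general equality test.

```python
class FrameLike:
    """Hypothetical stand-in whose comparison result has no single truth value."""

    class _Result:
        def __bool__(self):
            raise ValueError("The truth value of this result is ambiguous.")

    def __ne__(self, other):
        return FrameLike._Result()


def differs_from_default(value, default=None):
    """Return a plain bool telling whether value differs from default.

    Hypothetical helper; assumes an ambiguous comparison means "different".
    """
    if value is default:
        # Identity check never triggers custom comparison methods.
        return False
    try:
        return bool(value != default)
    except ValueError:
        # The result could not be reduced to one truth value, and we already
        # know value is not the same object as default.
        return True
```

With this guard, `differs_from_default(FrameLike())` returns True instead of raising, which matches the behavior requested above for the `default_value=None` case.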
Installed Versions
INSTALLED VERSIONS
------------------
commit : 2cc37625532045f4ac55b27176454bbbc9baf213
python : 3.10.11
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.26200
machine : AMD64
processor : Intel64 Family 6 Model 170 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 2.3.0
numpy : 2.2.6
pytz : 2025.2
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : 5.4.0
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : 2.9.11
pymysql : None
pyarrow : None
pyreadstat : None
pytest : 8.4.2
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None