Pandas: filter a column containing a list with df.query()

pandas, python
Published on 2017-02-26 | Jérémy Grèze

The method Dataframe.query() of the library Pandas allows you to filter a dataframe with a textual query (string). Personally, I like the readability that this brings and it is easy to use with booleans.

For example, I use it when I build webapps: an HTML page contains a form with available filters (multiselect), the user makes his filter choices, then a Javascript query and a Python backend apply filters with the method query(), the results are then returned to the page.

I noticed that beyond about thirty filters, errors were frequent. The trick is then to use equality with a list, which scales much better!

# not good
df.query(' (column_a == "val1") | (column_a == "val2" ) | ...', inplace=True)

# better!
df.query(' column_a == ["val1", "val2", ...]', inplace=True)