Filter Transform#

The filter transform removes objects from a data stream based on a provided filter expression, selection, or other filter predicate. A filter can be added at the top level of a chart using the Chart.transform_filter() method. The argument to transform_filter can be one of a number of expressions and objects:

  1. A Vega expression expressed as a string or built using the expr module

  2. A Field predicate, such as FieldOneOfPredicate, FieldRangePredicate, FieldEqualPredicate, FieldLTPredicate, FieldGTPredicate, FieldLTEPredicate, or FieldGTEPredicate

  3. A Selection predicate or object created by selection_point() or selection_interval()

  4. A Logical operand that combines any of the above

We’ll show a brief example of each of these in the following sections.

Filter Expression#

A filter expression uses the Vega expression language, either specified directly as a string, or built using the expr module. This can be useful when, for example, selecting only a subset of data.

For example:

import altair as alt
from altair import datum

from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    (datum.year == 2000) & (datum.sex == 1)
)

Notice that, as in the Calculate Transform, data values are referenced via the name datum.

Field Predicates#

Field predicates overlap somewhat in function with expression predicates, but have the advantage that their contents are validated by the schema.

Here is an example of a FieldEqualPredicate used to select just the values from year 2000 as in the above chart:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldEqualPredicate(field='year', equal=2000)
)

A FieldOneOfPredicate is similar, but allows selection of any number of specific values:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldOneOfPredicate(field='year', oneOf=[1900, 1950, 2000])
)

Finally, a FieldRangePredicate allows selecting values within a particular continuous range:

import altair as alt
from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).transform_filter(
    alt.FieldRangePredicate(field='year', range=[1960, 2000])
)

Selection Predicates#

Selection predicates can be used to filter data based on a selection. While these can be constructed directly using a SelectionPredicate class, in Altair it is often more convenient to construct them using a function such as selection_point(). For example, this chart uses a point selection that allows the user to click or shift-click on the bars in the bottom chart to select the data to be shown in the top chart:

import altair as alt
from vega_datasets import data
pop = data.population.url

selection = alt.selection_point(fields=['year'])

top = alt.Chart().mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).properties(
    width=600, height=200
).transform_filter(
    selection
)

bottom = alt.Chart().mark_bar().encode(
    x='year:O',
    y='sum(people):Q',
    color=alt.condition(selection, alt.value('steelblue'), alt.value('lightgray'))
).properties(
    width=600, height=100
).add_params(
    selection
)

alt.vconcat(
    top, bottom,
    data=pop
)

Logical Operands#

At times it is useful to combine several types of predicates into a single filter. This can be accomplished using the various logical operand classes.

These are not yet part of the Altair interface (see Issue 695) but can be constructed explicitly; for example, here we plot US population distributions for all data except the years 1950-1960, by applying a LogicalNotPredicate schema to a FieldRangePredicate:

import altair as alt
from vega_datasets import data

pop = data.population.url

alt.Chart(pop).mark_line().encode(
    x='age:O',
    y='sum(people):Q',
    color='year:O'
).properties(
    width=600, height=200
).transform_filter(
    {'not': alt.FieldRangePredicate(field='year', range=[1950, 1960])}
)

Transform Options#

The transform_filter() method is built on the FilterTransform class, which has the following options:

Property    Type                    Description

filter      PredicateComposition    The filter property must be a predicate definition, which can take one of the following forms:

  1. an expression string, where datum can be used to refer to the current data object. For example, {filter: "datum.b2 > 60"} would make the output data include only items whose value in the field b2 is over 60.

  2. one of the field predicates: equal, lt, lte, gt, gte, range, oneOf, or valid,

  3. a selection predicate, which defines the names of a selection that the data point should belong to (or a logical composition of selections).

  4. a logical composition of (1), (2), or (3).