Announcing VegaFusion 1.4
Improved Vega coverage, external data source foundations, extended architecture support
By: Jon Mease
The VegaFusion team is happy to announce the release of version 1.4. Along with the usual bug fixes and updates to the core Arrow and DataFusion dependencies, this release improves coverage of Vega’s features by supporting the q1/q3 aggregation functions and bitwise operators. It also lays important foundations for supporting external data sources and compute engines, and publishes pip and conda packages for additional architectures.
Improved Vega coverage
VegaFusion 1.4 adds support for the q1 and q3 aggregation functions. This makes it possible for VegaFusion to evaluate all the transforms associated with a Vega-Lite boxplot. Here’s a Vega-Altair example:
import altair as alt
from vega_datasets import data
import vegafusion as vf

# Enable the VegaFusion mime renderer
vf.enable()

source = data.cars()

# Boxplot whose whiskers extend to the minimum and maximum values
chart = alt.Chart(source).mark_boxplot(extent="min-max").encode(
    alt.X("Miles_per_Gallon:Q").scale(zero=False),
    alt.Y("Origin:N"),
)
chart
An easy way to confirm that the transforms are supported is to extract the transformed data with vf.transformed_data:
vf.transformed_data(chart)
| | Origin | lower_box_Miles_per_Gallon | upper_box_Miles_per_Gallon | mid_box_Miles_per_Gallon | lower_whisker_Miles_per_Gallon | upper_whisker_Miles_per_Gallon |
|---|---|---|---|---|---|---|
| 0 | USA | 15 | 24 | 18.5 | 9 | 39 |
| 1 | Europe | 24 | 30.65 | 26.5 | 16.2 | 44.3 |
| 2 | Japan | 25.7 | 34.05 | 31.6 | 18 | 46.6 |
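For readers unfamiliar with the q1 and q3 aggregations, they are the lower and upper quartile boundaries of a field, which a boxplot uses as the edges of its box. The following is a minimal sketch using only Python's standard library, not VegaFusion itself. The sample values and the quartiles helper are made up for illustration, and the "inclusive" linear interpolation method is an assumption that may not match the exact interpolation VegaFusion applies.

```python
import statistics


def quartiles(values):
    """Return (q1, median, q3) using inclusive linear interpolation.

    Illustrative helper only; this is not part of VegaFusion's API.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, median, q3


# Hypothetical miles-per-gallon values for a single group
mpg = [15.0, 18.0, 21.0, 24.0, 30.0]

q1, median, q3 = quartiles(mpg)

# The box spans q1..q3; with extent="min-max" the whiskers reach min and max
lower_whisker, upper_whisker = min(mpg), max(mpg)
print(q1, median, q3, lower_whisker, upper_whisker)
```

The box of a boxplot spans q1 to q3, which is why both aggregations are needed before the boxplot transforms can be evaluated ahead of rendering.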
In addition, the full complement of bitwise operators is now supported in the Vega expression language, including |, &, ^, <<, and >>.
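As a quick refresher on what these five operators do, here is a small sketch in plain Python rather than in a Vega expression. The flag names are hypothetical. Note that Vega expressions follow JavaScript semantics, where bitwise operators work on 32-bit integers, so results can differ from Python's arbitrary-precision integers for very large or negative operands; the small non-negative values below avoid that difference.

```python
# Hypothetical permission flags packed into a single integer
READ, WRITE, EXEC = 0b001, 0b010, 0b100

perms = READ | WRITE        # OR combines flags
can_write = perms & WRITE   # AND tests whether a flag is set
toggled = perms ^ READ      # XOR flips a flag
shifted_left = 1 << 4       # left shift multiplies by 2**4
shifted_right = 64 >> 2     # right shift divides by 2**2

print(perms, can_write, toggled, shifted_left, shifted_right)
```

In a chart, an expression of this kind would typically appear inside a calculate or filter transform, for example to test a flag bit stored in an integer column.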
External data source foundations
VegaFusion 1.4 lays some important foundations toward the goal of supporting external data sources and compute engines. The vegafusion.dataset.sql.SqlDataset abstract class defines the interface for implementing SQL data sources in Python for any of VegaFusion’s 10 supported SQL dialects. Implementations for DuckDB and Snowpark are also provided. In addition, the vegafusion.dataset.DataFrameDataset abstract class defines the interface for implementing VegaFusion’s data transformations with external DataFrame libraries. Motivating examples include the future ability to dispatch VegaFusion data transformations to the Ibis and Polars Python libraries.
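To give a feel for the general pattern of an abstract SQL data source, here is a self-contained sketch built on Python's standard sqlite3 module. It does not use or reproduce VegaFusion's actual SqlDataset interface: the HypotheticalSqlSource class, its method names, and the SqliteSource implementation are all assumptions invented for illustration. Consult the VegaFusion source for the real abstract methods.

```python
import sqlite3
from abc import ABC, abstractmethod


class HypotheticalSqlSource(ABC):
    """Illustrative stand-in only; NOT VegaFusion's SqlDataset interface."""

    @abstractmethod
    def dialect(self) -> str:
        """Name of the SQL dialect that queries should be written in."""

    @abstractmethod
    def table_name(self) -> str:
        """Name of the table that queries should refer to."""

    @abstractmethod
    def fetch_query(self, query: str) -> list:
        """Run a query against the source and return the result rows."""


class SqliteSource(HypotheticalSqlSource):
    """Hypothetical implementation backed by an in-memory SQLite database."""

    def __init__(self, conn: sqlite3.Connection, table: str):
        self._conn = conn
        self._table = table

    def dialect(self) -> str:
        return "sqlite"

    def table_name(self) -> str:
        return self._table

    def fetch_query(self, query: str) -> list:
        return self._conn.execute(query).fetchall()


# Made-up sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (origin TEXT, mpg REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("USA", 18.0), ("USA", 24.0), ("Japan", 31.0)],
)

source = SqliteSource(conn, "cars")

# A visualization engine could push an aggregation down to the database
# instead of loading every row into memory first.
rows = source.fetch_query(
    f"SELECT origin, AVG(mpg) FROM {source.table_name()} "
    "GROUP BY origin ORDER BY origin"
)
print(rows)
```

The point of such an abstraction is that the data stays in the external engine and only the aggregated result is transferred, which is the motivation behind supporting external data sources and compute engines.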
In the upcoming Vega-Altair 5.1 release, it will be possible to pass implementations of SqlDataset and DataFrameDataset to Altair Chart objects instead of pandas DataFrames. Stay tuned for more information and examples after the release of Altair 5.1!
Extended architecture support
VegaFusion wheels are now built and published to PyPI for the aarch64 Linux architecture. In addition, conda-forge packages are now published for the Apple Silicon architecture.
Updates to Arrow and DataFusion dependencies
VegaFusion 1.4 updates the dependency on arrow-rs to version 42.0.0 and DataFusion to version 27.0.0.
Looking ahead
The upcoming Vega-Altair 5.1 release will include first-class integration with VegaFusion to support extracting transformed data from a chart with chart.transformed_data(). It will also include a "vegafusion" data transformer that will cause Altair to use VegaFusion to pre-evaluate data transformations and remove unused columns when saving or displaying charts. The timeline has not yet been decided, but the plan is to eventually deprecate VegaFusion’s transformed_data and save_* functions and the VegaFusion mime renderer in favor of the integrations built into Altair.
Another near-term focus is on lowering the barrier to contributing to VegaFusion by adopting the Pixi environment manager.
Learn more
Check out these resources to learn more: