Inline Datasets
The VegaFusion transform methods (data, spec, and extract) and chart state all support an inline_datasets argument, which can be used to pass Arrow tables, DataFrames, or DataFusion logical plans into Vega specifications. Vega-Altair’s "vegafusion" data transformer uses this approach to reference DataFrames from Vega specs without writing them to disk or converting them to JSON.
Overview
Vega specs may include data entries with a url of the form "vegafusion+dataset://{dataset_name}", where a dataset named {dataset_name} is expected to be provided using inline_datasets.
Here is an example Vega specification:
{
  ...
  "data": [
    {
      "name": "source0",
      "url": "vegafusion+dataset://movies",
      "transform": [
        ...
      ]
    }
  ],
  ...
}
In this case, VegaFusion expects that inline_datasets will contain a dataset named movies.
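To make the naming convention concrete, the following is a minimal, standard-library sketch of how one might list the dataset names a spec references through this URL scheme and check them against an inline_datasets mapping before calling VegaFusion. The helper names (referenced_inline_datasets, missing_inline_datasets) are hypothetical and are not part of the VegaFusion API. The sketch assumes data entries appear at the top level of the spec or inside nested marks.

```python
# Illustrative only: these helpers are hypothetical, not VegaFusion functions.
PREFIX = "vegafusion+dataset://"


def referenced_inline_datasets(spec):
    """Collect dataset names referenced via vegafusion+dataset:// URLs."""
    names = set()

    def visit(node):
        if not isinstance(node, dict):
            return
        # Data entries may sit at the top level or inside group marks.
        for entry in node.get("data", []):
            url = entry.get("url") if isinstance(entry, dict) else None
            if isinstance(url, str) and url.startswith(PREFIX):
                names.add(url[len(PREFIX):])
        for mark in node.get("marks", []):
            visit(mark)

    visit(spec)
    return names


def missing_inline_datasets(spec, inline_datasets):
    """Return referenced dataset names that inline_datasets does not provide."""
    return referenced_inline_datasets(spec) - set(inline_datasets)
```

For the example specification above, referenced_inline_datasets would report the single name movies, so an inline_datasets mapping without a movies key would be flagged as incomplete.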
Python
In Python, inline_datasets should be a dict from dataset names (e.g. movies in the example above) to DataFrames or Arrow tables. “DataFrames” may be of any type supported by Narwhals (including pandas, Polars, PyArrow, Vaex, Ibis, etc.) and “Arrow tables” may be any object supporting the Arrow PyCapsule interface (e.g. arro3, nanoarrow, etc.).
In the case of types supported by Narwhals, VegaFusion will use get_column_usage to project down to the minimal collection of columns that are required, then rely on Narwhals’ support for the Arrow PyCapsule API to convert these required columns to an arro3 Arrow table for zero-copy transfer to Rust.
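The projection step can be pictured with a small standard-library sketch. This is not VegaFusion's implementation. Tables are stood in for by plain dicts of column name to list of values, project_columns is a hypothetical helper, and the column-usage mapping is assumed to map each dataset name to a list of required column names, or to None when usage cannot be determined (in which case all columns are kept).

```python
# Illustrative only: project_columns is a hypothetical helper, and dicts of
# lists stand in for real DataFrames or Arrow tables.
def project_columns(inline_datasets, column_usage):
    """Keep only the columns each dataset is known to need."""
    projected = {}
    for name, table in inline_datasets.items():
        columns = column_usage.get(name)
        if columns is None:
            # Usage unknown (assumption): pass the table through untouched.
            projected[name] = table
        else:
            # Drop every column the spec never reads.
            projected[name] = {c: table[c] for c in columns if c in table}
    return projected


movies = {
    "Title": ["A", "B"],
    "IMDB Rating": [7.1, 6.2],
    "Director": ["X", "Y"],
}
slim = project_columns({"movies": movies}, {"movies": ["IMDB Rating"]})
```

Here slim["movies"] holds only the IMDB Rating column, so only that column would need to be converted to Arrow and handed across to Rust.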
See inline_datasets.py for a complete example with pandas.
Rust
In Rust, inline_datasets should be a HashMap<String, VegaFusionDataset> from dataset names (e.g. movies in the example above) to VegaFusionDataset instances. VegaFusionDataset is an enum that may be either a VegaFusionTable (which is a thin wrapper around Arrow RecordBatches), or a DataFusion LogicalPlan (which represents an arbitrary DataFusion query).
See inline_datasets.rs for a complete example using a VegaFusionTable, and see inline_datasets_plan.rs for a complete example using a DataFusion LogicalPlan.