Column Usage
VegaFusion provides a function for introspecting a Vega specification and determining which columns are referenced from each root dataset. A root dataset is one defined at the top level of the spec that includes a url or values property. This is useful in contexts where it is more efficient to minimize the number of columns provided to the Vega specification. For example, the Python library uses this function to reduce the input DataFrame to the referenced columns before converting it to Arrow.
When VegaFusion cannot precisely determine which columns are referenced from a root dataset, this function returns None or null for that dataset.
Python
- vegafusion.get_column_usage(spec: dict[str, Any]) -> dict[str, list[str] | None]
Compute the columns from each root dataset that are referenced in a Vega spec.
- Parameters:
spec – Vega spec
- Returns:
Dict from root-level dataset name to either:
- A list of the columns referenced in this dataset, if this can be determined precisely
- None, if it was not possible to determine the full set of columns referenced from this dataset
- Return type:
dict[str, list[str] | None]
See column_usage.py for a complete example.
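As a rough sketch of how the returned mapping might be consumed, the snippet below prunes in-memory datasets down to the referenced columns and leaves a dataset untouched when its usage is None. The prune_datasets helper, the dataset names, and the column names are hypothetical and are not part of VegaFusion; the usage dict is a hand-written stand-in with the same shape as the documented return value, so the example runs without VegaFusion installed.

```python
from typing import Any, Optional

# Each dataset is represented here as a list of row dicts (an assumption
# made for illustration; real code would more likely use a DataFrame).
Records = list[dict[str, Any]]


def prune_datasets(
    datasets: dict[str, Records],
    usage: dict[str, Optional[list[str]]],
) -> dict[str, Records]:
    """Keep only the columns that the usage mapping reports as referenced."""
    pruned: dict[str, Records] = {}
    for name, rows in datasets.items():
        columns = usage.get(name)
        if columns is None:
            # Usage is unknown (None) or the dataset is not listed:
            # keep every column to stay on the safe side.
            pruned[name] = rows
        else:
            keep = set(columns)
            pruned[name] = [
                {key: value for key, value in row.items() if key in keep}
                for row in rows
            ]
    return pruned


# Hand-written stand-in for the mapping described above.
# With VegaFusion installed this would instead be:
#     usage = vegafusion.get_column_usage(spec)
usage = {"movies": ["Title", "Rating"], "lookup": None}

datasets = {
    "movies": [{"Title": "A", "Rating": 7.1, "Budget": 100}],
    "lookup": [{"id": 1, "label": "x"}],
}

pruned = prune_datasets(datasets, usage)
print(pruned["movies"])  # [{'Title': 'A', 'Rating': 7.1}]
print(pruned["lookup"])  # [{'id': 1, 'label': 'x'}]
```

Treating None as "keep everything" is the conservative choice: dropping a column that the spec does reference would break the chart, whereas passing along an unused column only costs some extra data transfer.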
Rust
See column_usage.rs for a complete example.
JavaScript
See the Editor Demo for example usage of the getColumnUsage function in the vegafusion-wasm package.