Transform Extract
The pre_transform_extract method generates a transformed spec like the pre_transform_spec method, but instead of inlining the transformed datasets in the spec, it returns these datasets separately in arrow table format. This can be useful when the inline datasets are large and can be transmitted more efficiently in arrow format.
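The split between inlined and extracted datasets can be pictured with a small pure-Python sketch. This is only an illustration of the row-count rule tied to the extract_threshold parameter in the Python reference below, not VegaFusion's implementation; the function name split_by_threshold and the use of plain lists of row dicts are assumptions made for the example.

```python
from __future__ import annotations

from typing import Any


def split_by_threshold(
    datasets: dict[str, list[dict[str, Any]]],
    extract_threshold: int = 20,
) -> tuple[dict[str, list[dict[str, Any]]], dict[str, list[dict[str, Any]]]]:
    """Partition datasets into (inline, extracted) by row count.

    Hypothetical helper: datasets with fewer than extract_threshold rows
    stay inline, and datasets with extract_threshold rows or more are
    extracted.
    """
    inline: dict[str, list[dict[str, Any]]] = {}
    extracted: dict[str, list[dict[str, Any]]] = {}
    for name, rows in datasets.items():
        if len(rows) < extract_threshold:
            inline[name] = rows
        else:
            extracted[name] = rows
    return inline, extracted


# A 3-row dataset stays inline; a 20-row dataset meets the threshold.
inline, extracted = split_by_threshold(
    {"small": [{"a": 1}] * 3, "big": [{"a": 1}] * 20},
    extract_threshold=20,
)
```

In the real method, the extracted datasets are returned as arrow tables rather than as lists of rows.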
Python
- VegaFusionRuntime.pre_transform_extract(spec: dict[str, Any] | str, local_tz: str | None = None, default_input_tz: str | None = None, preserve_interactivity: bool = True, extract_threshold: int = 20, extracted_format: str = 'arro3', inline_datasets: dict[str, DataFrameLike] | None = None, keep_signals: list[str | tuple[str, list[int]]] | None = None, keep_datasets: list[str | tuple[str, list[int]]] | None = None) → tuple[dict[str, Any], list[tuple[str, list[int], pa.Table]], list[PreTransformWarning]]
Evaluate supported transforms in an input Vega specification.
Produces a new specification with small pre-transformed datasets (under extract_threshold rows) included inline and larger pre-transformed datasets (extract_threshold rows or more) extracted into arrow tables.
- Parameters:
spec – A Vega specification dict or JSON string.
local_tz – Name of timezone to be considered local. E.g. ‘America/New_York’. Defaults to the value of vf.get_local_tz(), which defaults to the system timezone if one can be determined.
default_input_tz – Name of timezone (e.g. ‘America/New_York’) that naive datetime strings should be interpreted in. Defaults to local_tz.
preserve_interactivity – If True (default) then the interactive behavior of the chart will be preserved. This requires that all the data that participates in interactions be included in the resulting spec rather than being pre-transformed. If False, then all possible data transformations are applied even if they break the original interactive behavior of the chart.
extract_threshold – Datasets with fewer than extract_threshold rows are inlined; datasets with extract_threshold rows or more are extracted.
extracted_format –
The format for the extracted datasets. Options are:
- "arro3": (default) arro3.Table
- "pyarrow": pyarrow.Table
- "arrow-ipc": bytes in arrow IPC format
- "arrow-ipc-base64": base64 encoded arrow IPC format
inline_datasets – A dict from dataset names to pandas DataFrames or pyarrow Tables. Inline datasets may be referenced by the input specification using the following url syntax ‘vegafusion+dataset://{dataset_name}’ or ‘table://{dataset_name}’.
keep_signals –
Signals from the input spec that must be included in the pre-transformed spec, even if they are no longer referenced. A list with elements that are either:
The name of a top-level signal as a string
A two-element tuple where the first element is the name of a signal as a string and the second element is the nested scope of the signal as a list of integers
keep_datasets –
Datasets from the input spec that must be included in the pre-transformed spec even if they are no longer referenced. A list with elements that are either:
The name of a top-level dataset as a string
A two-element tuple where the first element is the name of a dataset as a string and the second element is the nested scope of the dataset as a list of integers
- Returns:
Three-element tuple of:
- The Vega specification as a dict with pre-transformed datasets included but left empty.
- Extracted datasets as a list of three-element tuples:
  - dataset name
  - dataset scope list
  - arrow data
- A list of warnings as dictionaries. Each warning dict has a 'type' key indicating the warning type and a 'message' key containing a description of the warning. Potential warning types include:
  - 'RowLimitExceeded': Some datasets in the resulting Vega specification have been truncated to the provided row limit
  - 'BrokenInteractivity': Some interactive features may have been broken in the resulting Vega specification
  - 'Unsupported': No transforms in the provided Vega specification were eligible for pre-transforming
- Return type:
tuple[dict[str, Any], list[tuple[str, list[int], pa.Table]], list[PreTransformWarning]]
Example: See pre_transform_extract.py for a complete example.
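As a rough sketch of how the three-element return value might be consumed, the helpers below index the extracted datasets by name and scope and collect the warning types. The helper names (index_extracted, warning_types, extract) are hypothetical, and the call assumes that the runtime instance is reachable as vf.runtime; pre_transform_extract.py remains the complete example.

```python
from __future__ import annotations

from typing import Any


def index_extracted(
    datasets: list[tuple[str, list[int], Any]],
) -> dict[tuple[str, tuple[int, ...]], Any]:
    """Index extracted datasets by (name, scope) so each table is easy to look up."""
    # Scope lists are converted to tuples because lists are not hashable.
    return {(name, tuple(scope)): table for name, scope, table in datasets}


def warning_types(warnings: list[dict[str, str]]) -> set[str]:
    """Collect the distinct 'type' values from the returned warning dicts."""
    return {warning["type"] for warning in warnings}


def extract(spec: dict[str, Any] | str, extract_threshold: int = 20):
    """Call pre_transform_extract and post-process its three-element result."""
    # Imported lazily so the helpers above work without vegafusion installed.
    import vegafusion as vf

    # Assumption: vf.runtime is the module-level VegaFusionRuntime instance.
    new_spec, datasets, warnings = vf.runtime.pre_transform_extract(
        spec,
        extract_threshold=extract_threshold,
        extracted_format="pyarrow",
    )
    if "BrokenInteractivity" in warning_types(warnings):
        print("Some interactive features may be broken in the transformed spec")
    return new_spec, index_extracted(datasets)
```

Keying the index on both name and scope matters because a dataset name alone may not be unique across nested scopes of a spec.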
Rust
See pre_transform_extract.rs for a complete example.