API Reference#

class vowpal_wabbit_next.PredictionType#

Enum where each variant corresponds to Python types for the different prediction types.

Scalar = float#

Scalars = List[float]#

ActionScores = List[Tuple[int, float]]#: Where the tuple is (action_index, score) and action_index is zero based.

Pdf = List[Tuple[float, float, float]]#: Where the tuple is (left, right, value)

ActionProbs = List[Tuple[int, float]]#: Where the tuple is (action_index, probability) and action_index is zero based.

Multiclass = int#

Multilabels = List[int]#

Prob = float#

DecisionProbs = List[List[Tuple[int, float]]]#: Where the tuple is (action_index, probability) and action_index is zero based.

ActionPdfValue = Tuple[float, float]#: Where the tuple is (action, value)

ActiveMulticlass = Tuple[int, List[int]]#: Where the tuple is (predicted_class, more_info_required_for_classes)

NoPred = None#
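As a rough illustration of consuming one of these prediction shapes, the sketch below samples an action from an ActionProbs value. The helper name `sample_action` and the local `ActionProbs` alias are hypothetical and not part of vowpal_wabbit_next. It assumes the probabilities are non-negative and not all zero.

```python
import random
from typing import List, Optional, Tuple

# Local alias mirroring the ActionProbs shape documented above.
ActionProbs = List[Tuple[int, float]]


def sample_action(
    action_probs: ActionProbs, rng: Optional[random.Random] = None
) -> Tuple[int, float]:
    """Pick one (action_index, probability) pair, weighted by probability."""
    rng = rng or random.Random()
    weights = [prob for _, prob in action_probs]
    # Sample a position in the list, then return the pair stored there.
    position = rng.choices(range(len(action_probs)), weights=weights, k=1)[0]
    return action_probs[position]


# Stand-in prediction; a real one would come from a workspace whose
# prediction_type is PredictionType.ActionProbs.
prediction: ActionProbs = [(0, 0.0), (1, 1.0)]
chosen_index, chosen_prob = sample_action(prediction)
```

Returning the probability alongside the index is convenient because contextual bandit labels typically record the probability with which the action was chosen.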

class vowpal_wabbit_next.LabelType#

CB = <LabelType.CB: 1>#

CBEval = <LabelType.CBEval: 2>#

CCB = <LabelType.CCB: 6>#

CS = <LabelType.CS: 3>#

Continuous = <LabelType.Continuous: 9>#

Multiclass = <LabelType.Multiclass: 5>#

Multilabel = <LabelType.Multilabel: 4>#

NoLabel = <LabelType.NoLabel: 8>#

Simple = <LabelType.Simple: 0>#

Slates = <LabelType.Slates: 7>#

property name#

property value#

class vowpal_wabbit_next.Workspace(args: List[str], *, model_data: Optional[bytes] = None, _existing_workspace: Optional[Workspace] = None)#

__init__(args: List[str], *, model_data: Optional[bytes] = None, _existing_workspace: Optional[Workspace] = None)#

Main object used for making predictions and training a model.

The VW library logs various things while running. There are two streams of logging exposed, which can be accessed via the standard Python logging interface.

vowpal_wabbit_next.log - VW's structured logging stream. If it outputs a warning it will be logged here.
vowpal_wabbit_next.driver - This is essentially the CLI driver output. This is rarely needed from Python.

See the logging example below.

Examples

Load a model from a file:

>>> from vowpal_wabbit_next import Workspace
>>> with open("model.bin", "rb") as f:
...     model_data = f.read()
>>> workspace = Workspace([], model_data=model_data)

Create a workspace for training a contextual bandit with action dependent features model:

>>> from vowpal_wabbit_next import Workspace
>>> workspace = Workspace(["--cb_explore_adf"])

Outputting structured logging messages from VW:

>>> from vowpal_wabbit_next import Workspace
>>> import logging
>>> logging.basicConfig(level=logging.INFO)
>>> logging.getLogger("vowpal_wabbit_next.log").setLevel("INFO")
>>> workspace = Workspace([])
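Because the streams are ordinary Python loggers, messages can also be collected programmatically rather than printed. The following is a minimal sketch using only the standard logging module. The `ListHandler` class is a hypothetical helper, and the `warning` call at the end is a stand-in for a message VW might emit, not real VW output.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Collects formatted log messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


# Logger name taken from the description above.
vw_log = logging.getLogger("vowpal_wabbit_next.log")
vw_log.setLevel(logging.WARNING)
handler = ListHandler()
vw_log.addHandler(handler)

# Stand-in records: the INFO record is filtered out by the logger level.
vw_log.warning("example warning")
vw_log.info("not collected")
```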

Parameters

args (List[str]) – VowpalWabbit command line options for configuring the model. An overall list can be found here. The following options are unsupported: --sort_features, --ngram, --feature_limit, --ignore.
model_data (Optional[bytes], optional) – Bytes of a VW model to be loaded.
_existing_workspace (Optional[_core.Workspace], optional) – This is for internal usage and should not be set by a user.

predict_one(example: Union[Example, List[Example]]) → Optional[Union[float, List[float], List[Tuple[int, float]], List[List[Tuple[int, float]]], int, List[int], List[Tuple[float, float, float]], Tuple[float, float], Tuple[int, List[int]]]]#

Make a single prediction.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
>>> workspace.predict_one(parser.parse_line("| price:.53 sqft:.32 age:.87 1924"))
1.0

Parameters: example (Union[Example, List[Example]]) – Example to use for prediction. This should be a list if this workspace is vowpal_wabbit_next.Workspace.multiline(), otherwise it should be a single Example.
Returns: Prediction produced by this example. The type corresponds to the vowpal_wabbit_next.Workspace.prediction_type() of the model. See vowpal_wabbit_next.PredictionType for the mapping to types.
Return type: Prediction

learn_one(example: Union[Example, List[Example]]) → None#

Learn from one single example. Note, passing a list of examples here means the input is a multiline example, and not several individual examples. The label type of the example must match what is returned by vowpal_wabbit_next.Workspace.label_type().

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
>>> workspace.predict_one(parser.parse_line("| price:.53 sqft:.32 age:.87 1924"))
1.0

Parameters: example (Union[Example, List[Example]]) – Example to learn on.

predict_then_learn_one(example: Union[Example, List[Example]]) → Optional[Union[float, List[float], List[Tuple[int, float]], List[List[Tuple[int, float]]], int, List[int], List[Tuple[float, float, float]], Tuple[float, float], Tuple[int, List[int]]]]#

Make a prediction then learn from the example. This is potentially more efficient than a predict_one call followed by a learn_one call as the implementation is able to avoid duplicated work as long as the prediction is guaranteed to be from before learning.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.predict_then_learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
0.0

Parameters: example (Union[Example, List[Example]]) – Example to use for prediction. This should be a list if this workspace is vowpal_wabbit_next.Workspace.multiline(), otherwise it should be a single Example.
Returns: Prediction produced by this example. The type corresponds to the vowpal_wabbit_next.Workspace.prediction_type() of the model. See vowpal_wabbit_next.PredictionType for the mapping to types.
Return type: Prediction
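Since the prediction is made before the example is learned from, predict_then_learn_one lends itself to progressive validation. The sketch below is one possible way to track that, assuming a scalar prediction and labels held separately by the caller. The function name `progressive_squared_loss` is hypothetical, and the callable passed in is expected to be something like `workspace.predict_then_learn_one`.

```python
from typing import Callable, Iterable, Tuple, TypeVar

E = TypeVar("E")


def progressive_squared_loss(
    predict_then_learn: Callable[[E], float],
    labelled_examples: Iterable[Tuple[E, float]],
) -> float:
    """Average squared error of predictions made before each update."""
    total = 0.0
    count = 0
    for example, label in labelled_examples:
        # The callable returns the pre-update prediction, then learns.
        prediction = predict_then_learn(example)
        total += (prediction - label) ** 2
        count += 1
    return total / count if count else 0.0
```

Passing the method as a callable keeps the helper independent of how examples are parsed or where labels are stored.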

property prediction_type: PredictionType#

Based on the command line parameters used to set up VW, a certain type of prediction is produced. See vowpal_wabbit_next.PredictionType for the list of types and their corresponding Python type.

Returns: The type of prediction this Workspace produces
Return type: PredictionType

property label_type: LabelType#

Based on the command line parameters used to set up VW, a certain label type is required. This can also be thought of as the type of problem being solved.

Returns: The type of label Examples must have to be used by this Workspace
Return type: LabelType

property multiline: bool#

Based on the command line parameters used to set up VW, learn, predict and the parsers expect either single Examples or lists of Examples.

Returns: True if this Workspace is configured as multiline, otherwise False
Return type: bool

serialize() → bytes#

Serialize the current workspace as a VW model that can be loaded by the Workspace constructor or by the command line tool.

Returns: raw bytes of serialized Workspace
Return type: bytes
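A small sketch of persisting the serialized bytes to disk follows. The helper names `save_model` and `load_model_bytes` are hypothetical. The only thing assumed about the workspace argument is the `serialize()` method documented above.

```python
from pathlib import Path
from typing import Union


def save_model(workspace, path: Union[str, Path]) -> int:
    """Write workspace.serialize() to path and return the byte count."""
    data = workspace.serialize()
    Path(path).write_bytes(data)
    return len(data)


def load_model_bytes(path: Union[str, Path]) -> bytes:
    """Read model bytes back, suitable for Workspace([], model_data=...)."""
    return Path(path).read_bytes()
```

The bytes returned by `load_model_bytes` can be handed to the constructor in the same way as the "Load a model from a file" example for Workspace.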

weights() → ndarray[Any, dtype[float32]]#

Access the current weights of the model.

This function returns a view of the weights and any changes to the returned array will be reflected in the model.

There are 3 dimensions:

The feature index (aka weight index)
The index of the interleaved model, which should usually be 0
The weight itself and the extra state stored with the weight

Attention

Only dense weights are supported.

Warning

This is an experimental feature.

Examples

>>> from vowpal_wabbit_next import Workspace
>>> model = Workspace([])
>>> print(model.weights().shape)
(262144, 1, 4)

Returns: Array of weights
Return type: np.ndarray

json_weights(*, include_feature_names: bool = False, include_online_state: bool = False) → str#

Debugging utility which dumps the model's current weights as a JSON string.

Warning

This is an experimental feature.

Parameters

include_feature_names (bool, optional) – Includes the feature names and interaction terms in the output. This requires the workspace to be configured to support it. This is not well exposed to Python currently but the way to do it is: --dump_json_weights_experimental=unused --dump_json_weights_include_feature_names_experimental
include_online_state (bool, optional) – Includes extra save_resume state in the output. This requires the workspace to be configured to support it. This is not well exposed to Python currently but the way to do it is: --dump_json_weights_experimental=unused --dump_json_weights_include_extra_online_state_experimental

Returns

JSON string representing model weights

Return type

str

get_index_for_scalar_feature(feature_name: str, *, feature_value: Optional[str] = None, namespace_name: str = ' ') → int#

Calculate the hash for a given feature.

Working out an index involves fairly complicated logic. This function also takes into account index truncation caused by the index multiplier taking the index out of the standard weight space.

Warning

This is an experimental feature, the interface may change.

Examples

>>> from vowpal_wabbit_next import Workspace
>>> model = Workspace([])
>>> # Feature which looks like "|test thing" in text format
>>> model.get_index_for_scalar_feature("thing", namespace_name="test")
148099

Parameters

feature_name (str) – The name of the feature
feature_value (Optional[str], optional) – String value of feature. If passed chain hashing will be used. In text format this looks like feature_name:feature_value
namespace_name (str, optional) – Namespace of feature. Defaults to " ", which is the default namespace.

Returns

The index of the feature

Return type

int

class vowpal_wabbit_next.Example#

class vowpal_wabbit_next.TextFormatParser(workspace: Workspace)#

__init__(workspace: Workspace)#

Parse VW text format examples.

Parameters: workspace (Workspace) – Workspace object used to configure this parser

parse_line(text: str) → Example#

Parse a single line.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> example = parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976")

Parameters: text (str) – Text to parse
Returns: Parsed example
Return type: Example

class vowpal_wabbit_next.TextFormatReader(workspace: Workspace, file: TextIO)#

__init__(workspace: Workspace, file: TextIO)#

Read VW text format examples from the given text file. This reader produces either single Examples or List[Example] depending on whether the given workspace is multiline.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatReader
>>> workspace = Workspace([])
>>> with open("data.txt", "r") as f:
...     with TextFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)

Parameters

workspace (Workspace) – Workspace object used to configure this reader
file (TextIO) – File to read from

class vowpal_wabbit_next.DSJsonFormatParser(workspace: Workspace)#

__init__(workspace: Workspace)#

Parse VW DSJson format examples.

Parameters: workspace (Workspace) – Workspace object used to configure this parser

parse_json(text: str) → List[Example]#

Parse a single json object in dsjson format.

Examples

>>> from vowpal_wabbit_next import Workspace, DSJsonFormatParser
>>> workspace = Workspace(["--cb_explore_adf"])
>>> parser = DSJsonFormatParser(workspace)
>>> json_str = """
... {
...     "_label_cost": -1.0,
...     "_label_probability": 0.5,
...     "_label_Action": 2,
...     "_labelIndex": 1,
...     "a": [2, 1],
...     "c": {
...         "shared": { "f": "1" },
...         "_multi": [{ "action": { "f": "1" } }, { "action": { "f": "2" } }]
...     },
...     "p": [0.5, 0.5]
... }
... """
>>> example = parser.parse_json(json_str)

Parameters: text (str) – JSON string of input
Returns: List of parsed examples
Return type: List[Example]
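Hand-written JSON strings are easy to get wrong, so one option is to build the record as a Python dict and serialize it with the standard json module. The sketch below mirrors the field values from the example above. The variable names are illustrative, and the final parse call is shown only as a comment because it needs a configured workspace.

```python
import json

# Same fields and values as the hand-written example above.
record = {
    "_label_cost": -1.0,
    "_label_probability": 0.5,
    "_label_Action": 2,
    "_labelIndex": 1,
    "a": [2, 1],
    "c": {
        "shared": {"f": "1"},
        "_multi": [{"action": {"f": "1"}}, {"action": {"f": "2"}}],
    },
    "p": [0.5, 0.5],
}

# json.dumps emits a single line by default, one JSON object per string.
json_str = json.dumps(record)

# With a parser built as in the example above:
# examples = parser.parse_json(json_str)
```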

class vowpal_wabbit_next.DSJsonFormatReader(workspace: Workspace, file: TextIO)#

__init__(workspace: Workspace, file: TextIO)#

Read VW DSJson format examples from the given text file. This reader always produces lists of examples.

Examples

>>> from vowpal_wabbit_next import Workspace, DSJsonFormatReader
>>> workspace = Workspace([])
>>> with open("data.txt", "r") as f:
...     with DSJsonFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)

Parameters

workspace (Workspace) – Workspace object used to configure this reader
file (TextIO) – File to read from

class vowpal_wabbit_next.CacheFormatWriter(workspace: Workspace, file: BinaryIO)#

__init__(workspace: Workspace, file: BinaryIO)#

Creates a VW cache file.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser, CacheFormatWriter
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> with open("data.cache", "wb") as f:
...     with CacheFormatWriter(workspace, f) as writer:
...         writer.write_example(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))

Parameters

workspace (Workspace) – Workspace object used to configure this writer.
file (BinaryIO) – File to write cache to

write_example(example: Union[Example, List[Example]]) → None#

Write a single example to the cache file.

Parameters: example (Union[Example, List[Example]]) – Either a single example or a multiline example (a list of examples) to be written.

class vowpal_wabbit_next.CacheFormatReader(workspace: Workspace, file: BinaryIO)#

__init__(workspace: Workspace, file: BinaryIO)#

Read VW examples in cache format from the given file.

Examples

>>> from vowpal_wabbit_next import Workspace, CacheFormatReader
>>> workspace = Workspace([])
>>> with open("data.cache", "rb") as f:
...     with CacheFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)

Parameters

workspace (Workspace) – Workspace object used to configure this reader
file (BinaryIO) – File to read from

class vowpal_wabbit_next.ModelDelta(data: bytes, *, _existing_model_delta: Optional[ModelDelta] = None)#

__init__(data: bytes, *, _existing_model_delta: Optional[ModelDelta] = None)#

A delta between two VW models.

The standard way to create one is with vowpal_wabbit_next.calculate_delta().

Parameters

data (bytes) – Bytes of a previously serialized ModelDelta to be used for loading
_existing_model_delta (Optional[_core.ModelDelta], optional) – This is an internal parameter and should not be used by end users.

serialize() → bytes#

Serialize the delta.

Returns: The serialized delta.
Return type: bytes

vowpal_wabbit_next.calculate_delta(base_model: Workspace, derived_model: Workspace) → ModelDelta#

Produce a delta between two existing models.

Parameters

base_model (Workspace) – The base of the model
derived_model (Workspace) – The model produced from further training of base_model

Returns

The delta between the models.

Return type

ModelDelta

vowpal_wabbit_next.apply_delta(model: Workspace, delta: ModelDelta) → Workspace#

Apply the delta to the model.

Parameters

model (Workspace) – The model to apply the delta to
delta (ModelDelta) – The delta to apply

Returns

The new model

Return type

Workspace

vowpal_wabbit_next.merge_deltas(deltas: List[ModelDelta]) → ModelDelta#

Merge a list of deltas into a single delta.

Parameters: deltas (List[ModelDelta]) – The deltas to merge. All deltas should come from the same base model.
Returns: The merged delta.
Return type: ModelDelta
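The three delta functions can be chained, for instance to fold several independently trained copies of a base model back into one. The following is only a sketch under assumptions: the function name `combine_worker_models` is hypothetical, the models are assumed to load with an empty argument list as in the Workspace loading example, and every worker model is assumed to have been trained from the same base.

```python
from typing import List


def combine_worker_models(base_bytes: bytes, worker_model_bytes: List[bytes]) -> bytes:
    """Merge models trained from a common base; returns serialized bytes."""
    # Imported lazily so this module can be imported without the package.
    from vowpal_wabbit_next import (
        Workspace,
        apply_delta,
        calculate_delta,
        merge_deltas,
    )

    base = Workspace([], model_data=base_bytes)
    # One delta per worker, each relative to the shared base model.
    deltas = [
        calculate_delta(base, Workspace([], model_data=data))
        for data in worker_model_bytes
    ]
    merged = merge_deltas(deltas)
    return apply_delta(base, merged).serialize()
```

Working with serialized bytes at the boundaries keeps the helper usable across processes, since both Workspace and ModelDelta expose serialize().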

exception vowpal_wabbit_next.CLIError(message: str, driver_output: str, log_output: List[str])#

__init__(message: str, driver_output: str, log_output: List[str])#

Represents a failure when running the CLI. Exposes output for additional context.

Parameters

message (str) – Error message
driver_output (str) – Output of the VW driver
log_output (List[str]) – List of log messages

vowpal_wabbit_next.run_cli_driver(args: List[str], *, onethread: bool = False, cwd: Optional[Path] = None) → Tuple[str, List[str]]#

Equivalent to running the VW command line tool with the given command line.

There are a few differences:

Any input from stdin is not supported
The argfile input to command line is not supported
If any place in VW writes directly to stdout or stderr, that output is not captured. This means that --version and --help are not currently captured.

Warning

This is an experimental feature.

Examples

>>> from vowpal_wabbit_next import run_cli_driver
>>> driver_output, logs = run_cli_driver(["-d", "my_data.txt"])

You can use shlex to split a command line:

>>> from vowpal_wabbit_next import run_cli_driver
>>> import shlex
>>> driver_output, logs = run_cli_driver(shlex.split("-d my_data.txt"))
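The value of shlex shows up when arguments contain spaces or quotes. As a small sketch, the hypothetical helper `cli_args` below splits a shell-style command line and, as a convenience assumed here, drops a leading `vw` program name, since the examples above pass only the options.

```python
import shlex
from typing import List


def cli_args(command_line: str) -> List[str]:
    """Split a shell-style command line into an argument list."""
    args = shlex.split(command_line)
    # Assumption: a pasted command may start with the program name.
    if args and args[0] == "vw":
        args = args[1:]
    return args


# Quoted paths survive as a single argument.
args = cli_args('vw -d "my data.txt" --quiet')
```

The resulting list can be passed as the args parameter of run_cli_driver.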

Parameters

args (List[str]) – Arguments to be passed to the command line driver
onethread (bool, optional) – Whether to disable the background parsing thread. If False, a background thread is used for parsing. If True, no background threads are used and everything is done in the foreground of this call.
cwd (Optional[Path], optional) – The current working directory to use for the command line driver. If None, the current working directory is used.

Raises

CLIError – If there is any error raised by execution.

Returns

driver output and log messages respectively as a tuple

Return type

Tuple[str, List[str]]

vowpal_wabbit_next.VW_COMMIT: str = '8a6c027f6'#: Commit of VowpalWabbit that this package is built with

vowpal_wabbit_next.VW_VERSION: str = '9.7.0'#: Version number of VowpalWabbit that this package is built with