API Reference#

class vowpal_wabbit_next.PredictionType#

Enum where each variant corresponds to Python types for the different prediction types.

Scalar = float#
Scalars = List[float]#
ActionScores = List[Tuple[int, float]]#

Where the tuple is (action_index, score) and action_index is zero based.

Pdf = List[Tuple[float, float, float]]#

Where the tuple is (left, right, value)
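
As a rough illustration of working with the Pdf shape, the sketch below looks up the value of the segment containing a point and sums the mass of all segments. It assumes that value is a density over the half-open interval from left to right; the reference only names the three fields, so treat that reading, the helper names and the sample data as assumptions.

```python
from typing import List, Tuple

Pdf = List[Tuple[float, float, float]]  # (left, right, value), as documented above


def density_at(pdf: Pdf, x: float) -> float:
    """Return the value of the segment containing x, or 0.0 if no segment does."""
    for left, right, value in pdf:
        if left <= x < right:  # assumption: segments are half-open [left, right)
            return value
    return 0.0


def total_mass(pdf: Pdf) -> float:
    """Sum width * value over all segments (1.0 for a normalized density)."""
    return sum((right - left) * value for left, right, value in pdf)


# Hypothetical prediction; a real one would come from predict_one.
pdf: Pdf = [(0.0, 2.0, 0.125), (2.0, 4.0, 0.25), (4.0, 6.0, 0.125)]
inside = density_at(pdf, 3.0)
outside = density_at(pdf, 7.0)
mass = total_mass(pdf)
```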

ActionProbs = List[Tuple[int, float]]#

Where the tuple is (action_index, probability) and action_index is zero based.
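
A common use of an ActionProbs prediction is to sample one action according to its probability and keep that probability for later logging. The following is a minimal sketch using only the standard library; sample_action and the sample prediction are illustrative names and values, not part of this package.

```python
import random
from typing import List, Tuple

ActionProbs = List[Tuple[int, float]]  # (action_index, probability), zero based


def sample_action(action_probs: ActionProbs, rng: random.Random) -> Tuple[int, float]:
    """Draw one (action_index, probability) pair, weighted by probability."""
    weights = [prob for _, prob in action_probs]
    return rng.choices(action_probs, weights=weights, k=1)[0]


# Hypothetical prediction; a real one would come from predict_one.
prediction: ActionProbs = [(0, 0.1), (1, 0.8), (2, 0.1)]
chosen_action, chosen_prob = sample_action(prediction, random.Random(0))
```

Passing an explicit random.Random instance keeps the draw reproducible in tests.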

Multiclass = int#
Multilabels = List[int]#
Prob = float#
DecisionProbs = List[List[Tuple[int, float]]]#

Where the tuple is (action_index, probability) and action_index is zero based.

ActionPdfValue = Tuple[float, float]#

Where the tuple is (action, value)

ActiveMulticlass = Tuple[int, List[int]]#

Where the tuple is (predicted_class, more_info_required_for_classes)

NoPred = None#
class vowpal_wabbit_next.LabelType#
CB = <LabelType.CB: 1>#
CBEval = <LabelType.CBEval: 2>#
CCB = <LabelType.CCB: 6>#
CS = <LabelType.CS: 3>#
Continuous = <LabelType.Continuous: 9>#
Multiclass = <LabelType.Multiclass: 5>#
Multilabel = <LabelType.Multilabel: 4>#
NoLabel = <LabelType.NoLabel: 8>#
Simple = <LabelType.Simple: 0>#
Slates = <LabelType.Slates: 7>#
property name#
property value#
class vowpal_wabbit_next.Workspace(args: List[str], *, model_data: Optional[bytes] = None, _existing_workspace: Optional[Workspace] = None)#
__init__(args: List[str], *, model_data: Optional[bytes] = None, _existing_workspace: Optional[Workspace] = None)#

Main object used for making predictions and training a model.

The VW library logs various things while running. There are two streams of logging exposed, which can be accessed via the standard Python logging interface:

  • vowpal_wabbit_next.log - VW’s structured logging stream. If it outputs a warning it will be logged here.

  • vowpal_wabbit_next.driver - This is essentially the CLI driver output. This is rarely needed from Python.

See the logging example below.

Examples

Load a model from a file:

>>> from vowpal_wabbit_next import Workspace
>>> with open("model.bin", "rb") as f:
...     model_data = f.read()
>>> workspace = Workspace([], model_data=model_data)

Create a workspace for training a contextual bandit with action dependent features model:

>>> from vowpal_wabbit_next import Workspace
>>> workspace = Workspace(["--cb_explore_adf"])

Outputting structured logging messages from VW:

>>> from vowpal_wabbit_next import Workspace
>>> import logging
>>> logging.basicConfig(level=logging.INFO)
>>> logging.getLogger("vowpal_wabbit_next.log").setLevel("INFO")
>>> workspace = Workspace([])
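
Because the streams are ordinary Python loggers, they can also be routed to a custom handler instead of the root configuration. The sketch below collects warnings from vowpal_wabbit_next.log into a list. ListHandler is a hypothetical helper, and the two log calls stand in for messages that VW itself would emit.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Collect formatted log records in memory (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


vw_log = logging.getLogger("vowpal_wabbit_next.log")
vw_log.setLevel(logging.WARNING)  # ignore anything below WARNING
handler = ListHandler()
vw_log.addHandler(handler)

# Stand-ins for messages VW would emit while a Workspace is running.
vw_log.warning("example warning")
vw_log.info("filtered out because it is below WARNING")
```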
Parameters
  • args (List[str]) – VowpalWabbit command line options for configuring the model. A full list is available in the VowpalWabbit command line arguments documentation. The following options are unsupported: --sort_features, --ngram, --feature_limit, --ignore.

  • model_data (Optional[bytes], optional) – Bytes of a VW model to be loaded.

  • _existing_workspace (Optional[_core.Workspace], optional) – This is for internal usage and should not be set by a user.

predict_one(example: Union[Example, List[Example]]) Optional[Union[float, List[float], List[Tuple[int, float]], List[List[Tuple[int, float]]], int, List[int], List[Tuple[float, float, float]], Tuple[float, float], Tuple[int, List[int]]]]#

Make a single prediction.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
>>> workspace.predict_one(parser.parse_line("| price:.53 sqft:.32 age:.87 1924"))
1.0
Parameters

example (Union[Example, List[Example]]) – Example to use for prediction. This should be a list if this workspace is multiline (see vowpal_wabbit_next.Workspace.multiline), otherwise it should be a single Example.

Returns

Prediction produced by this example. The type corresponds to the vowpal_wabbit_next.Workspace.prediction_type() of the model. See vowpal_wabbit_next.PredictionType for the mapping to types.

Return type

Prediction

learn_one(example: Union[Example, List[Example]]) None#

Learn from one single example. Note, passing a list of examples here means the input is a multiline example, and not several individual examples. The label type of the example must match what is returned by vowpal_wabbit_next.Workspace.label_type().

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
>>> workspace.predict_one(parser.parse_line("| price:.53 sqft:.32 age:.87 1924"))
1.0
Parameters

example (Union[Example, List[Example]]) – Example to learn on.

predict_then_learn_one(example: Union[Example, List[Example]]) Optional[Union[float, List[float], List[Tuple[int, float]], List[List[Tuple[int, float]]], int, List[int], List[Tuple[float, float, float]], Tuple[float, float], Tuple[int, List[int]]]]#

Make a prediction then learn from the example. This is potentially more efficient than a predict_one call followed by a learn_one call as the implementation is able to avoid duplicated work as long as the prediction is guaranteed to be from before learning.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.predict_then_learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
0.0
Parameters

example (Union[Example, List[Example]]) – Example to use for prediction. This should be a list if this workspace is multiline (see vowpal_wabbit_next.Workspace.multiline), otherwise it should be a single Example.

Returns

Prediction produced by this example. The type corresponds to the vowpal_wabbit_next.Workspace.prediction_type() of the model. See vowpal_wabbit_next.PredictionType for the mapping to types.

Return type

Prediction
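
The guarantee that the prediction comes from before learning is what allows a single pass to both train and measure a progressive validation style loss. The sketch below shows the idea with a toy stand-in model (RunningMean, a hypothetical class that predicts the mean of the labels seen so far), so it runs without this package. With a real Workspace the call would take a parsed Example rather than a bare label.

```python
from typing import Iterable


class RunningMean:
    """Toy stand-in for a model: predicts the mean of the labels seen so far."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def predict_then_learn_one(self, label: float) -> float:
        # The prediction is computed from the state *before* this label is used.
        prediction = self.total / self.count if self.count else 0.0
        self.total += label
        self.count += 1
        return prediction


def progressive_squared_loss(model: RunningMean, labels: Iterable[float]) -> float:
    """Average squared error of predictions made before seeing each label."""
    losses = [(model.predict_then_learn_one(y) - y) ** 2 for y in labels]
    return sum(losses) / len(losses)


loss = progressive_squared_loss(RunningMean(), [1.0, 1.0, 1.0, 1.0])
```

Only the first prediction (0.0, made before any learning) is wrong here, so the average loss over the four labels is 0.25.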

property prediction_type: PredictionType#

Based on the command line parameters used to set up VW, a certain type of prediction is produced. See vowpal_wabbit_next.PredictionType for the list of types and their corresponding Python type.

Returns

The type of prediction this Workspace produces

Return type

PredictionType

property label_type: LabelType#

Based on the command line parameters used to set up VW, a certain label type is required. This can also be thought of as the type of problem being solved.

Returns

The type of label Examples must have to be used by this Workspace

Return type

LabelType

property multiline: bool#

Based on the command line parameters used to set up VW, learning, prediction and the parsers work with either single Examples or lists of Examples.

Returns

True if this Workspace is configured as multiline, otherwise False

Return type

bool
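
Since a multiline workspace takes lists and a single-line workspace takes single Examples, a small guard can catch shape mistakes before calling learn_one or predict_one. This is a sketch only: check_example_shape is a hypothetical helper, and plain placeholder objects stand in for Example instances so the code runs without the package. In real code the multiline argument would come from the workspace's multiline property.

```python
def check_example_shape(example: object, multiline: bool) -> None:
    """Raise TypeError if the input shape does not match the workspace mode."""
    is_list = isinstance(example, list)
    if multiline and not is_list:
        raise TypeError("multiline workspace expects a list of examples")
    if not multiline and is_list:
        raise TypeError("single-line workspace expects one example, not a list")


placeholder = object()  # stands in for a parsed Example

check_example_shape([placeholder, placeholder], multiline=True)  # accepted
check_example_shape(placeholder, multiline=False)  # accepted

try:
    check_example_shape(placeholder, multiline=True)
except TypeError as exc:
    error_message = str(exc)
```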

serialize() bytes#

Serialize the current workspace as a VW model that can be loaded by the Workspace constructor, or command line tool.

Returns

raw bytes of serialized Workspace

Return type

bytes
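
The bytes returned by serialize() can be written to disk and later passed back as model_data. The following sketch shows that round trip. It uses a FakeWorkspace stand-in (hypothetical) so that it runs without the package installed; save_model and the file name are likewise illustrative.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class Serializable(Protocol):
    def serialize(self) -> bytes: ...


def save_model(model: Serializable, path: Path) -> int:
    """Write the serialized model to path and return the number of bytes written."""
    data = model.serialize()
    path.write_bytes(data)
    return len(data)


class FakeWorkspace:
    """Stand-in for a Workspace; only serialize() is imitated."""

    def serialize(self) -> bytes:
        return b"\x00fake-model-bytes"


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.bin"
    written = save_model(FakeWorkspace(), model_path)
    model_data = model_path.read_bytes()

# With the real package, the bytes would be loaded via:
#     Workspace([], model_data=model_data)
```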

weights() ndarray[Any, dtype[float32]]#

Access the current weights of the model.

This function returns a view of the weights and any changes to the returned array will be reflected in the model.

There are 3 dimensions:

  • The feature index (aka weight index)

  • The index of the interleaved model, which should usually be 0

  • The weight itself and the extra state stored with the weight

Attention

Only dense weights are supported.

Warning

This is an experimental feature.

Examples

>>> from vowpal_wabbit_next import Workspace
>>> model = Workspace([])
>>> print(model.weights().shape)
(262144, 1, 4)
Returns

Array of weights

Return type

np.ndarray
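
To make the three dimensions concrete, the arithmetic below unpacks the shape printed in the example above. The first dimension, 262144, equals 2**18, which is consistent with VW's default of 18 hash bits. The byte count assumes 4 bytes per float32 element, and the variable names are illustrative.

```python
import math

shape = (262144, 1, 4)  # shape printed by the example above
num_feature_slots, num_interleaved_models, floats_per_weight = shape

# 262144 is a power of two, so bit_length() - 1 gives the exponent.
hash_bits = num_feature_slots.bit_length() - 1

total_floats = math.prod(shape)
total_bytes = total_floats * 4  # float32 elements are 4 bytes each
```

For the default workspace this comes to 4 MiB of weight storage.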

json_weights(*, include_feature_names: bool = False, include_online_state: bool = False) str#

Debugging utility which dumps the model's current weights as a JSON string.

Warning

This is an experimental feature.

Parameters
  • include_feature_names (bool, optional) – Includes the feature names and interaction terms in the output. This requires the workspace to be configured to support it. This is not well exposed to Python currently but the way to do it is: --dump_json_weights_experimental=unused --dump_json_weights_include_feature_names_experimental

  • include_online_state (bool, optional) – Includes extra save_resume state in the output. This requires the workspace to be configured to support it. This is not well exposed to Python currently but the way to do it is: --dump_json_weights_experimental=unused --dump_json_weights_include_extra_online_state_experimental

Returns

JSON string representing model weights

Return type

str

get_index_for_scalar_feature(feature_name: str, *, feature_value: Optional[str] = None, namespace_name: str = ' ') int#

Calculate the hash for a given feature.

The logic for working out an index is rather complicated. This function also takes into account index truncation caused by the index multiplier taking the index out of the standard weight space.

Warning

This is an experimental feature, the interface may change.

Examples

>>> from vowpal_wabbit_next import Workspace
>>> model = Workspace([])
>>> # Feature which looks like "|test thing" in text format
>>> model.get_index_for_scalar_feature("thing", namespace_name="test")
148099
Parameters
  • feature_name (str) – The name of the feature

  • feature_value (Optional[str], optional) – String value of feature. If passed chain hashing will be used. In text format this looks like feature_name:feature_value

  • namespace_name (str, optional) – Namespace of feature. Defaults to " ", which is the default namespace.

Returns

The index of the feature

Return type

int

class vowpal_wabbit_next.Example#
class vowpal_wabbit_next.TextFormatParser(workspace: Workspace)#
__init__(workspace: Workspace)#

Parse VW text format examples.

Parameters

workspace (Workspace) – Workspace object used to configure this parser

parse_line(text: str) Example#

Parse a single line.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> example = parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976")
Parameters

text (str) – Text to parse

Returns

Parsed example

Return type

Example

class vowpal_wabbit_next.TextFormatReader(workspace: Workspace, file: TextIO)#
__init__(workspace: Workspace, file: TextIO)#

Read VW text format examples from the given text file. This reader produces either single Examples or List[Example], depending on whether the given workspace is multiline.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatReader
>>> workspace = Workspace([])
>>> with open("data.txt", "r") as f:
...     with TextFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)
Parameters
  • workspace (Workspace) – Workspace object used to configure this reader

  • file (TextIO) – File to read from

class vowpal_wabbit_next.DSJsonFormatParser(workspace: Workspace)#
__init__(workspace: Workspace)#

Parse VW DSJson format examples.

Parameters

workspace (Workspace) – Workspace object used to configure this parser

parse_json(text: str) List[Example]#

Parse a single json object in dsjson format.

Examples

>>> from vowpal_wabbit_next import Workspace, DSJsonFormatParser
>>> workspace = Workspace(["--cb_explore_adf"])
>>> parser = DSJsonFormatParser(workspace)
>>> json_str = """
... {
...     "_label_cost": -1.0,
...     "_label_probability": 0.5,
...     "_label_Action": 2,
...     "_labelIndex": 1,
...     "a": [2, 1],
...     "c": {
...         "shared": { "f": "1" },
...         "_multi": [{ "action": { "f": "1" } }, { "action": { "f": "2" } }]
...     },
...     "p": [0.5, 0.5]
... }
... """
>>> example = parser.parse_json(json_str)
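
Building the DSJson string from a Python dict with the standard json module avoids hand-quoting mistakes. The sketch below reproduces the event from the example above. The resulting single-line string is the kind of value that could be passed to parse_json, and it also suits files that store one event per line (an assumption about how such files are usually laid out).

```python
import json

# Same event as in the example above, expressed as a Python dict.
event = {
    "_label_cost": -1.0,
    "_label_probability": 0.5,
    "_label_Action": 2,
    "_labelIndex": 1,
    "a": [2, 1],
    "c": {
        "shared": {"f": "1"},
        "_multi": [{"action": {"f": "1"}}, {"action": {"f": "2"}}],
    },
    "p": [0.5, 0.5],
}

# json.dumps emits a single line by default.
json_str = json.dumps(event)
```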
Parameters

text (str) – JSON string of input

Returns

List of parsed examples

Return type

List[Example]

class vowpal_wabbit_next.DSJsonFormatReader(workspace: Workspace, file: TextIO)#
__init__(workspace: Workspace, file: TextIO)#

Read VW DSJson format examples from the given text file. This reader always produces lists of examples.

Examples

>>> from vowpal_wabbit_next import Workspace, DSJsonFormatReader
>>> workspace = Workspace([])
>>> with open("data.txt", "r") as f:
...     with DSJsonFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)
Parameters
  • workspace (Workspace) – Workspace object used to configure this reader

  • file (TextIO) – File to read from

class vowpal_wabbit_next.CacheFormatWriter(workspace: Workspace, file: BinaryIO)#
__init__(workspace: Workspace, file: BinaryIO)#

Creates a VW cache file.

Examples

>>> from vowpal_wabbit_next import Workspace, TextFormatParser, CacheFormatWriter
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> with open("data.cache", "wb") as f:
...     with CacheFormatWriter(workspace, f) as writer:
...         writer.write_example(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
Parameters
  • workspace (Workspace) – Workspace object used to configure this writer.

  • file (BinaryIO) – File to write cache to

write_example(example: Union[Example, List[Example]]) None#

Write a single example to the cache file.

Parameters

example (Union[Example, List[Example]]) – Either a single example or a multiline example to be written.

class vowpal_wabbit_next.CacheFormatReader(workspace: Workspace, file: BinaryIO)#
__init__(workspace: Workspace, file: BinaryIO)#

Read VW examples in cache format from the given file.

Examples

>>> from vowpal_wabbit_next import Workspace, CacheFormatReader
>>> workspace = Workspace([])
>>> with open("data.cache", "rb") as f:
...     with CacheFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)
Parameters
  • workspace (Workspace) – Workspace object used to configure this reader

  • file (BinaryIO) – File to read from

class vowpal_wabbit_next.ModelDelta(data: bytes, *, _existing_model_delta: Optional[ModelDelta] = None)#
__init__(data: bytes, *, _existing_model_delta: Optional[ModelDelta] = None)#

A delta between two VW models.

The standard way to create one is with vowpal_wabbit_next.calculate_delta().

Parameters
  • data (bytes) – Bytes of a previously serialized ModelDelta to be used for loading

  • _existing_model_delta (Optional[_core.ModelDelta], optional) – This is an internal parameter and should not be used by end users.

serialize() bytes#

Serialize the delta.

Returns

The serialized delta.

Return type

bytes

vowpal_wabbit_next.calculate_delta(base_model: Workspace, derived_model: Workspace) ModelDelta#

Produce a delta between two existing models.

Parameters
  • base_model (Workspace) – The base of the model

  • derived_model (Workspace) – The model produced from further training of base_model

Returns

The delta between the models.

Return type

ModelDelta

vowpal_wabbit_next.apply_delta(model: Workspace, delta: ModelDelta) Workspace#

Apply the delta to the model.

Parameters
  • model (Workspace) – The model to apply the delta to

  • delta (ModelDelta) – The delta to apply

Returns

The new model

Return type

Workspace

vowpal_wabbit_next.merge_deltas(deltas: List[ModelDelta]) ModelDelta#

Merge a list of deltas into a single delta.

Parameters

deltas (List[ModelDelta]) – The deltas to merge. All deltas should come from the same base model.

Returns

The merged delta.

Return type

ModelDelta

exception vowpal_wabbit_next.CLIError(message: str, driver_output: str, log_output: List[str])#
__init__(message: str, driver_output: str, log_output: List[str])#

Represents a failure when running the CLI. Exposes output for additional context.

Parameters
  • message (str) – Error message

  • driver_output (str) – Output of the VW driver

  • log_output (List[str]) – List of log messages

vowpal_wabbit_next.run_cli_driver(args: List[str], *, onethread: bool = False, cwd: Optional[Path] = None) Tuple[str, List[str]]#

Equivalent to running the VW command line tool with the given arguments.

There are a few differences:

  • Input from stdin is not supported

  • The argfile input to the command line is not supported

  • Anything VW writes directly to stdout or stderr is not captured. This means that the output of --version and --help is not currently captured.

Warning

This is an experimental feature.

Examples

>>> from vowpal_wabbit_next import run_cli_driver
>>> driver_output, logs = run_cli_driver(["-d", "my_data.txt"])

You can use shlex to split a command line:

>>> from vowpal_wabbit_next import run_cli_driver
>>> import shlex
>>> driver_output, logs = run_cli_driver(shlex.split("-d my_data.txt"))
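
The advantage of shlex.split over a plain str.split shows up as soon as an argument contains spaces. The sketch below compares the two on a command line with a quoted file name. The file name is hypothetical, and the resulting list is the kind of value that would be passed as args.

```python
import shlex

command_line = '-d "my data.txt" --quiet'

# shlex honours shell-style quoting, so the file name stays one argument.
args = shlex.split(command_line)

# A plain whitespace split breaks the quoted file name into two pieces.
naive_args = command_line.split()
```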
Parameters
  • args (List[str]) – Arguments to be passed to the command line driver

  • onethread (bool, optional) – Whether to do all parsing in the foreground. If False (the default), a background thread is used for parsing. If True, no background threads are used and everything is done in the foreground of this call.

  • cwd (Optional[Path], optional) – The working directory to use for the command line driver. If None, the current working directory of the process is used.

Raises

CLIError – If there is any error raised by execution.

Returns

driver output and log messages respectively as a tuple

Return type

Tuple[str, List[str]]

vowpal_wabbit_next.VW_COMMIT: str = '8a6c027f6'#

Commit of VowpalWabbit that this package is built with

vowpal_wabbit_next.VW_VERSION: str = '9.7.0'#

Version number of VowpalWabbit that this package is built with
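
When code depends on a minimum VowpalWabbit version, it is safer to compare parsed integer tuples than raw strings. The sketch below assumes the version string has the plain dotted form shown above. parse_version and the minimum version are illustrative, and VW_VERSION is hard-coded here rather than imported so the snippet runs on its own.

```python
from typing import Tuple

VW_VERSION = "9.7.0"  # value shown above; normally imported from vowpal_wabbit_next


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '9.7.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


MINIMUM = (9, 7, 0)  # hypothetical requirement
is_supported = parse_version(VW_VERSION) >= MINIMUM

# String comparison misorders versions: "9.10.0" sorts before "9.7.0".
string_order_is_wrong = "9.10.0" < "9.7.0"
tuple_order_is_right = parse_version("9.10.0") > parse_version("9.7.0")
```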