API Reference#
- class vowpal_wabbit_next.PredictionType#
Enum whose variants map each VW prediction type to the Python type returned for it.
- Scalar = float#
- Scalars = List[float]#
- ActionScores = List[Tuple[int, float]]#
Where the tuple is (action_index, score) and action_index is zero based.
- Pdf = List[Tuple[float, float, float]]#
Where the tuple is (left, right, value)
- ActionProbs = List[Tuple[int, float]]#
Where the tuple is (action_index, probability) and action_index is zero based.
- Multiclass = int#
- Multilabels = List[int]#
- Prob = float#
- DecisionProbs = List[List[Tuple[int, float]]]#
Where the tuple is (action_index, probability) and action_index is zero based.
- ActionPdfValue = Tuple[float, float]#
Where the tuple is (action, value)
- ActiveMulticlass = Tuple[int, List[int]]#
Where the tuple is (predicted_class, more_info_required_for_classes)
- NoPred = None#
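As a rough illustration of how these Python types can be consumed, the sketch below samples an action from an ActionProbs value and looks up the density of a Pdf value at a point. The helper names sample_action and pdf_value_at are hypothetical and not part of vowpal_wabbit_next, and treating each Pdf segment as the half-open interval [left, right) is an assumption.

```python
import random
from typing import List, Tuple

# Plain Python shapes matching the ActionProbs and Pdf variants above.
ActionProbs = List[Tuple[int, float]]
Pdf = List[Tuple[float, float, float]]


def sample_action(action_probs: ActionProbs, rng: random.Random) -> Tuple[int, float]:
    """Draw one (action_index, probability) pair according to the probabilities."""
    weights = [prob for _, prob in action_probs]
    return rng.choices(action_probs, weights=weights, k=1)[0]


def pdf_value_at(pdf: Pdf, x: float) -> float:
    """Return the density of the segment containing x, or 0.0 outside all segments.

    Assumption: each (left, right, value) segment covers [left, right).
    """
    for left, right, value in pdf:
        if left <= x < right:
            return value
    return 0.0


# Hand-written values standing in for real predictions.
action, prob = sample_action([(0, 0.0), (1, 1.0)], random.Random(0))
density = pdf_value_at([(0.0, 1.0, 0.25), (1.0, 2.0, 0.75)], 1.5)
```

The sampled probability is returned together with the action index because contextual bandit logging typically needs both.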
- class vowpal_wabbit_next.LabelType#
- CB = <LabelType.CB: 1>#
- CBEval = <LabelType.CBEval: 2>#
- CCB = <LabelType.CCB: 6>#
- CS = <LabelType.CS: 3>#
- Continuous = <LabelType.Continuous: 9>#
- Multiclass = <LabelType.Multiclass: 5>#
- Multilabel = <LabelType.Multilabel: 4>#
- NoLabel = <LabelType.NoLabel: 8>#
- Simple = <LabelType.Simple: 0>#
- Slates = <LabelType.Slates: 7>#
- property name#
- property value#
- class vowpal_wabbit_next.Workspace(args: List[str], *, model_data: Optional[bytes] = None, _existing_workspace: Optional[Workspace] = None)#
- __init__(args: List[str], *, model_data: Optional[bytes] = None, _existing_workspace: Optional[Workspace] = None)#
Main object used for making predictions and training a model.
The VW library logs various things while running. There are two streams of logging exposed, which can be accessed via the standard Python logging interface:
* vowpal_wabbit_next.log - VW’s structured logging stream. If it outputs a warning it will be logged here.
* vowpal_wabbit_next.driver - This is essentially the CLI driver output. This is rarely needed from Python.
See the logging example below.
Examples
Load a model from a file:
>>> from vowpal_wabbit_next import Workspace
>>> with open("model.bin", "rb") as f:
...     model_data = f.read()
>>> workspace = Workspace([], model_data=model_data)
Create a workspace for training a contextual bandit with action dependent features model:
>>> from vowpal_wabbit_next import Workspace
>>> workspace = Workspace(["--cb_explore_adf"])
Outputting structured logging messages from VW:
>>> from vowpal_wabbit_next import Workspace
>>> import logging
>>> logging.basicConfig(level=logging.INFO)
>>> logging.getLogger("vowpal_wabbit_next.log").setLevel("INFO")
>>> workspace = Workspace([])
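Because both streams are ordinary Python loggers, they can be configured independently with the standard library alone. The following is a minimal sketch: the ListHandler class is hypothetical, and the severity levels chosen here are illustrative assumptions, not a statement about the levels the library emits at.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted records in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


# Capture warnings (and above) from VW's structured logging stream.
vw_log = logging.getLogger("vowpal_wabbit_next.log")
vw_log.setLevel(logging.WARNING)
captured = ListHandler()
vw_log.addHandler(captured)

# Raise the threshold on the driver stream so lower-severity records are dropped.
vw_driver = logging.getLogger("vowpal_wabbit_next.driver")
vw_driver.setLevel(logging.ERROR)
```

After a Workspace has run, captured.messages can be inspected in tests or forwarded elsewhere.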
- Parameters
args (List[str]) – VowpalWabbit command line options for configuring the model. An overall list can be found here. The following options are unsupported: --sort_features, --ngram, --feature_limit, --ignore.
model_data (Optional[bytes], optional) – Bytes of a VW model to be loaded.
_existing_workspace (Optional[_core.Workspace], optional) – This is for internal usage and should not be set by a user.
- predict_one(example: Union[Example, List[Example]]) Optional[Union[float, List[float], List[Tuple[int, float]], List[List[Tuple[int, float]]], int, List[int], List[Tuple[float, float, float]], Tuple[float, float], Tuple[int, List[int]]]] #
Make a single prediction.
Examples
>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
>>> workspace.predict_one(parser.parse_line("| price:.53 sqft:.32 age:.87 1924"))
1.0
- Parameters
example (Union[Example, List[Example]]) – Example to use for prediction. This should be a list if this workspace is vowpal_wabbit_next.Workspace.multiline(), otherwise it should be a single Example.
- Returns
Prediction produced by this example. The type corresponds to the vowpal_wabbit_next.Workspace.prediction_type() of the model. See vowpal_wabbit_next.PredictionType for the mapping to types.
- Return type
Prediction
- learn_one(example: Union[Example, List[Example]]) None #
Learn from a single example. Note that passing a list of examples here means the input is one multiline example, not several individual examples. The label type of the example must match what is returned by vowpal_wabbit_next.Workspace.label_type().
Examples
>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
>>> workspace.predict_one(parser.parse_line("| price:.53 sqft:.32 age:.87 1924"))
1.0
- predict_then_learn_one(example: Union[Example, List[Example]]) Optional[Union[float, List[float], List[Tuple[int, float]], List[List[Tuple[int, float]]], int, List[int], List[Tuple[float, float, float]], Tuple[float, float], Tuple[int, List[int]]]] #
Make a prediction then learn from the example. This is potentially more efficient than a predict_one call followed by a learn_one call as the implementation is able to avoid duplicated work as long as the prediction is guaranteed to be from before learning.
Examples
>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> workspace.predict_then_learn_one(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
0.0
- Parameters
example (Union[Example, List[Example]]) – Example to use for prediction. This should be a list if this workspace is vowpal_wabbit_next.Workspace.multiline(), otherwise it should be a single Example.
- Returns
Prediction produced by this example. The type corresponds to the vowpal_wabbit_next.Workspace.prediction_type() of the model. See vowpal_wabbit_next.PredictionType for the mapping to types.
- Return type
Prediction
- property prediction_type: PredictionType#
Based on the command line parameters used to set up VW, a certain type of prediction is produced. See vowpal_wabbit_next.PredictionType for the list of types and their corresponding Python type.
- Returns
The type of prediction this Workspace produces
- Return type
PredictionType
- property label_type: LabelType#
Based on the command line parameters used to set up VW, a certain label type is required. This can also be thought of as the type of problem being solved.
- Returns
The type of label Examples must have to be used by this Workspace
- Return type
LabelType
- property multiline: bool#
Based on the command line parameters used to set up VW, learn, predict and the parsers expect either single Examples or lists of Examples.
- Returns
True if this Workspace is configured as multiline, otherwise False
- Return type
bool
- serialize() bytes #
Serialize the current workspace as a VW model that can be loaded by the Workspace constructor or by the command line tool.
- Returns
raw bytes of serialized Workspace
- Return type
bytes
- weights() ndarray[Any, dtype[float32]] #
Access the current weights of the model.
This function returns a view of the weights and any changes to the returned array will be reflected in the model.
There are 3 dimensions:
The feature index (aka weight index)
The index of the interleaved model, which should usually be 0
The weight itself and the extra state stored with the weight
Attention
Only dense weights are supported.
Warning
This is an experimental feature.
Examples
>>> from vowpal_wabbit_next import Workspace
>>> model = Workspace([])
>>> print(model.weights().shape)
(262144, 1, 4)
- Returns
Array of weights
- Return type
np.ndarray
- json_weights(*, include_feature_names: bool = False, include_online_state: bool = False) str #
Debugging utility which dumps the model's current weights as a JSON string.
Warning
This is an experimental feature.
- Parameters
include_feature_names (bool, optional) – Includes the feature names and interaction terms in the output. This requires the workspace to be configured to support it. This is not well exposed to Python currently but the way to do it is: --dump_json_weights_experimental=unused --dump_json_weights_include_feature_names_experimental
include_online_state (bool, optional) – Includes extra save_resume state in the output. This requires the workspace to be configured to support it. This is not well exposed to Python currently but the way to do it is: --dump_json_weights_experimental=unused --dump_json_weights_include_extra_online_state_experimental
- Returns
JSON string representing model weights
- Return type
str
- get_index_for_scalar_feature(feature_name: str, *, feature_value: Optional[str] = None, namespace_name: str = ' ') int #
Calculate the hash for a given feature.
Working out an index involves fairly complicated logic. This function also takes into account index truncation caused by the index multiplier taking the index out of the standard weight space.
Warning
This is an experimental feature, the interface may change.
Examples
>>> from vowpal_wabbit_next import Workspace
>>> model = Workspace([])
>>> # Feature which looks like "|test thing" in text format
>>> model.get_index_for_scalar_feature("thing", namespace_name="test")
148099
- Parameters
feature_name (str) – The name of the feature
feature_value (Optional[str], optional) – String value of feature. If passed, chain hashing will be used. In text format this looks like feature_name:feature_value
namespace_name (str, optional) – Namespace of feature. Defaults to " ", which is the default namespace.
- Returns
The index of the feature
- Return type
int
- class vowpal_wabbit_next.Example#
- class vowpal_wabbit_next.TextFormatParser(workspace: Workspace)#
- __init__(workspace: Workspace)#
Parse VW text format examples.
- Parameters
workspace (Workspace) – Workspace object used to configure this parser
- parse_line(text: str) Example #
Parse a single line.
Examples
>>> from vowpal_wabbit_next import Workspace, TextFormatParser
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> example = parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976")
- Parameters
text (str) – Text to parse
- Returns
Parsed example
- Return type
Example
- class vowpal_wabbit_next.TextFormatReader(workspace: Workspace, file: TextIO)#
- __init__(workspace: Workspace, file: TextIO)#
Read VW text format examples from the given text file. This reader produces either single Examples or List[Example], depending on whether the given workspace is multiline.
Examples
>>> from vowpal_wabbit_next import Workspace, TextFormatReader
>>> workspace = Workspace([])
>>> with open("data.txt", "r") as f:
...     with TextFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)
- Parameters
workspace (Workspace) – Workspace object used to configure this reader
file (TextIO) – File to read from
- class vowpal_wabbit_next.DSJsonFormatParser(workspace: Workspace)#
- __init__(workspace: Workspace)#
Parse VW DSJson format examples.
- Parameters
workspace (Workspace) – Workspace object used to configure this parser
- parse_json(text: str) List[Example] #
Parse a single json object in dsjson format.
Examples
>>> from vowpal_wabbit_next import Workspace, DSJsonFormatParser
>>> workspace = Workspace(["--cb_explore_adf"])
>>> parser = DSJsonFormatParser(workspace)
>>> json_str = """
... {
...     "_label_cost": -1.0,
...     "_label_probability": 0.5,
...     "_label_Action": 2,
...     "_labelIndex": 1,
...     "a": [2, 1],
...     "c": {
...         "shared": { "f": "1" },
...         "_multi": [{ "action": { "f": "1" } }, { "action": { "f": "2" } }]
...     },
...     "p": [0.5, 0.5]
... }
... """
>>> example = parser.parse_json(json_str)
- Parameters
text (str) – JSON string of input
- Returns
List of parsed examples
- Return type
List[Example]
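When events are produced in Python, building the JSON text with the standard json module avoids hand-quoting mistakes. The sketch below only constructs the string that would be passed to parse_json. The field values are copied from the example above, and the variable name event is illustrative.

```python
import json

# Same event as in the parse_json example, expressed as a Python dict.
event = {
    "_label_cost": -1.0,
    "_label_probability": 0.5,
    "_label_Action": 2,
    "_labelIndex": 1,
    "a": [2, 1],
    "c": {
        "shared": {"f": "1"},
        "_multi": [{"action": {"f": "1"}}, {"action": {"f": "2"}}],
    },
    "p": [0.5, 0.5],
}

# json.dumps emits a single line by default (no embedded newlines).
json_str = json.dumps(event)
```

The resulting json_str can then be handed to a DSJsonFormatParser created for a suitably configured workspace.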
- class vowpal_wabbit_next.DSJsonFormatReader(workspace: Workspace, file: TextIO)#
- __init__(workspace: Workspace, file: TextIO)#
Read VW DSJson format examples from the given text file. This reader always produces lists of examples.
Examples
>>> from vowpal_wabbit_next import Workspace, DSJsonFormatReader
>>> workspace = Workspace([])
>>> with open("data.txt", "r") as f:
...     with DSJsonFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)
- Parameters
workspace (Workspace) – Workspace object used to configure this reader
file (TextIO) – File to read from
- class vowpal_wabbit_next.CacheFormatWriter(workspace: Workspace, file: BinaryIO)#
- __init__(workspace: Workspace, file: BinaryIO)#
Creates a VW cache file.
Examples
>>> from vowpal_wabbit_next import Workspace, TextFormatParser, CacheFormatWriter
>>> workspace = Workspace([])
>>> parser = TextFormatParser(workspace)
>>> with open("data.cache", "wb") as f:
...     with CacheFormatWriter(workspace, f) as writer:
...         writer.write_example(parser.parse_line("1.0 | price:.18 sqft:.15 age:.35 1976"))
- Parameters
workspace (Workspace) – Workspace object used to configure this writer.
file (BinaryIO) – File to write cache to
- class vowpal_wabbit_next.CacheFormatReader(workspace: Workspace, file: BinaryIO)#
- __init__(workspace: Workspace, file: BinaryIO)#
Read VW examples in cache format from the given file.
Examples
>>> from vowpal_wabbit_next import Workspace, CacheFormatReader
>>> workspace = Workspace([])
>>> with open("data.cache", "rb") as f:
...     with CacheFormatReader(workspace, f) as reader:
...         for example in reader:
...             workspace.predict_one(example)
- Parameters
workspace (Workspace) – Workspace object used to configure this reader
file (BinaryIO) – File to read from
- class vowpal_wabbit_next.ModelDelta(data: bytes, *, _existing_model_delta: Optional[ModelDelta] = None)#
- __init__(data: bytes, *, _existing_model_delta: Optional[ModelDelta] = None)#
A delta between two VW models.
The standard way to create one is with vowpal_wabbit_next.calculate_delta().
- Parameters
data (bytes) – Bytes of a previously serialized ModelDelta to be used for loading
_existing_model_delta (Optional[_core.ModelDelta], optional) – This is an internal parameter and should not be used by end users.
- serialize() bytes #
Serialize the delta.
- Returns
The serialized delta.
- Return type
bytes
- vowpal_wabbit_next.calculate_delta(base_model: Workspace, derived_model: Workspace) ModelDelta #
Produce a delta between two existing models.
- Parameters
base_model (Workspace) – The base model
derived_model (Workspace) – The derived model
- Returns
The delta between the models.
- Return type
ModelDelta
- vowpal_wabbit_next.apply_delta(model: Workspace, delta: ModelDelta) Workspace #
Apply the delta to the model.
- Parameters
model (Workspace) – The model to apply the delta to
delta (ModelDelta) – The delta to apply
- Returns
The new model
- Return type
Workspace
- vowpal_wabbit_next.merge_deltas(deltas: List[ModelDelta]) ModelDelta #
Merge a list of deltas into a single delta.
- Parameters
deltas (List[ModelDelta]) – The deltas to merge. All deltas should come from the same base model.
- Returns
The merged delta.
- Return type
ModelDelta
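The three delta functions can be combined to fold several models trained from the same starting point back into one. The following is a sketch under stated assumptions, not a definitive recipe: the function name combine_worker_models is hypothetical, every worker model is assumed to derive from the same base model, and models are assumed to be loadable with Workspace([], model_data=...) as in the Workspace examples.

```python
from typing import List


def combine_worker_models(base_model: bytes, worker_models: List[bytes]) -> bytes:
    """Fold several models trained from the same base back into one model."""
    # Imported inside the function so this sketch can be loaded even where
    # the package is not installed.
    from vowpal_wabbit_next import (
        Workspace,
        apply_delta,
        calculate_delta,
        merge_deltas,
    )

    base = Workspace([], model_data=base_model)
    # One delta per worker, each measured against the shared base model.
    deltas = [
        calculate_delta(base, Workspace([], model_data=data))
        for data in worker_models
    ]
    merged = merge_deltas(deltas)
    # apply_delta returns a new Workspace; serialize it for storage.
    return apply_delta(base, merged).serialize()
```

Passing serialized bytes in and out keeps the helper independent of how the models are stored or transported.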
- exception vowpal_wabbit_next.CLIError(message: str, driver_output: str, log_output: List[str])#
- __init__(message: str, driver_output: str, log_output: List[str])#
Represents a failure when running the CLI. Exposes output for additional context.
- Parameters
message (str) – Error message
driver_output (str) – Output of the VW driver
log_output (List[str]) – List of log messages
- vowpal_wabbit_next.run_cli_driver(args: List[str], *, onethread: bool = False, cwd: Optional[Path] = None) Tuple[str, List[str]] #
Equivalent to running the VW command line tool with the given command line.
There are a few differences:
Input from stdin is not supported
The argfile input to the command line is not supported
Anything that VW writes directly to stdout or stderr is not captured. This means that the output of --version and --help is not currently captured.
Warning
This is an experimental feature.
Examples
>>> from vowpal_wabbit_next import run_cli_driver
>>> driver_output, logs = run_cli_driver(["-d", "my_data.txt"])
You can use shlex to split a command line:
>>> from vowpal_wabbit_next import run_cli_driver
>>> import shlex
>>> driver_output, logs = run_cli_driver(shlex.split("-d my_data.txt"))
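One reason to prefer shlex over str.split is that it honours shell-style quoting, so paths containing spaces stay in a single argument. The sketch below uses only the standard library. The file name is made up for illustration.

```python
import shlex

# A quoted path with a space stays one argument instead of being split in two.
args = shlex.split('-d "my data.txt" --quiet')
print(args)  # ['-d', 'my data.txt', '--quiet']
```

The resulting list has the List[str] shape that run_cli_driver expects for its args parameter.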
- Parameters
args (List[str]) – Arguments to be passed to the command line driver
onethread (bool, optional) – Whether to disable the background parsing thread. If False, a background thread is used for parsing. If True, no background threads are used and everything is done in the foreground of this call.
cwd (Optional[Path], optional) – The working directory to use for the command line driver. If None, the current working directory of the calling process is used.
- Raises
CLIError – If there is any error raised by execution.
- Returns
Driver output and log messages, respectively, as a tuple
- Return type
Tuple[str, List[str]]
- vowpal_wabbit_next.VW_COMMIT: str = '8a6c027f6'#
Commit of VowpalWabbit that this package is built with
- vowpal_wabbit_next.VW_VERSION: str = '9.7.0'#
Version number of VowpalWabbit that this package is built with