Title: Interface to 'TensorFlow' Estimators
Description: Interface to 'TensorFlow' Estimators <https://www.tensorflow.org/guide/estimator>, a high-level API that provides implementations of many different model types including linear models and deep neural networks.
Authors: JJ Allaire [aut], Yuan Tang [aut], Kevin Ushey [aut], Kevin Kuo [aut], Tomasz Kalinowski [cre], Daniel Falbel [ctb, cph], RStudio [cph, fnd], Google Inc. [cph]
Maintainer: Tomasz Kalinowski <[email protected]>
License: Apache License 2.0
Version: 1.9.2
Built: 2024-12-11 03:41:12 UTC
Source: https://github.com/rstudio/tfestimators
Construct a boosted trees estimator.
boosted_trees_regressor( feature_columns, n_batches_per_layer, model_dir = NULL, label_dimension = 1L, weight_column = NULL, n_trees = 100L, max_depth = 6L, learning_rate = 0.1, l1_regularization = 0, l2_regularization = 0, tree_complexity = 0, min_node_weight = 0, config = NULL ) boosted_trees_classifier( feature_columns, n_batches_per_layer, model_dir = NULL, n_classes = 2L, weight_column = NULL, label_vocabulary = NULL, n_trees = 100L, max_depth = 6L, learning_rate = 0.1, l1_regularization = 0, l2_regularization = 0, tree_complexity = 0, min_node_weight = 0, config = NULL )
feature_columns: An R list containing all of the feature columns used by the model (typically generated by feature_columns()).
n_batches_per_layer: The number of batches to collect statistics per layer.
model_dir: Directory to save the model parameters, graph, and so on. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model.
label_dimension: Number of regression targets per example. This is the size of the last dimension of the labels and logits tensors.
weight_column: A string, or a numeric column created by column_numeric(), defining the feature column representing example weights.
n_trees: Number of trees to be created.
max_depth: Maximum depth of the tree to grow.
learning_rate: Shrinkage parameter to be used when a tree is added to the model.
l1_regularization: Regularization multiplier applied to the absolute weights of the tree leaves.
l2_regularization: Regularization multiplier applied to the squared weights of the tree leaves.
tree_complexity: Regularization factor to penalize trees with more leaves.
min_node_weight: Minimum Hessian a node must have for a split to be considered. The value will be compared with sum(leaf_hessian) / (batch_size * n_batches_per_layer).
config: A run configuration created by run_config().
n_classes: The number of label classes.
label_vocabulary: A list of strings representing possible label values. If given, labels must be of string type and take values in label_vocabulary.
Other canned estimators: dnn_estimators, dnn_linear_combined_estimators, linear_estimators
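A minimal training sketch (using mtcars purely for illustration; boosted trees estimators expect bucketized or categorical feature columns, and all column names and boundaries below are assumptions):

library(tfestimators)

cols <- feature_columns(
  # Boosted trees operate on discretized inputs, so bucketize the raw numerics
  column_bucketized(column_numeric("drat"), boundaries = c(3, 4)),
  column_bucketized(column_numeric("wt"), boundaries = c(2.5, 3.5))
)

model <- boosted_trees_classifier(
  feature_columns = cols,
  n_batches_per_layer = 1L,
  n_trees = 50L,
  max_depth = 4L
)

# Train on the binary `vs` column of mtcars
model %>% train(input_fn(vs ~ drat + wt, data = mtcars))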
If users keep data in TensorFlow Example format, they need to call tf$parse_example with a proper feature spec. This utility helps with two main things:
First, users need to combine the parsing specs of features with those of labels and weights (if any), since they are all parsed from the same tf$Example instance. This utility combines these specs.
Second, it is difficult to map the label expected by a classifier such as dnn_classifier to the corresponding tf$parse_example spec. This utility encodes it by getting the related information from users (key, dtype).
classifier_parse_example_spec( feature_columns, label_key, label_dtype = tf$int64, label_default = NULL, weight_column = NULL )
feature_columns: An iterable containing all feature columns. All items should be instances of classes derived from a feature column.
label_key: A string identifying the label; that is, the key under which the label is stored in the tf$Example.
label_dtype: A tf$dtype identifying the type of the label. The default is tf$int64.
label_default: Used as the label if label_key does not exist in a given tf$Example.
weight_column: A string or a numeric column created by column_numeric() defining the feature column representing weights.
Returns a dict mapping each feature key to a FixedLenFeature or VarLenFeature value.
Raises:
ValueError: if the label is used in feature_columns.
ValueError: if weight_column is used in feature_columns.
ValueError: if any of the given feature_columns is not a feature column instance.
ValueError: if weight_column is not a numeric column instance.
ValueError: if label_key is NULL.
Other parsing utilities:
regressor_parse_example_spec()
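A hedged usage sketch (the feature, label, and weight key names are illustrative assumptions):

library(tfestimators)

fcs <- feature_columns(column_numeric("age"))

parse_spec <- classifier_parse_example_spec(
  feature_columns = fcs,
  label_key = "my_label",
  weight_column = "my_weight"
)

# parse_spec now maps "age", "my_label", and "my_weight" to parsing features,
# suitable for passing to tf$parse_example()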
Base Documentation for Feature Column Constructors
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
Construct a bucketized column, representing discretized dense input. Buckets include the left boundary, and exclude the right boundary.
column_bucketized(source_column, boundaries)
source_column: A one-dimensional dense column, as generated by column_numeric().
boundaries: A sorted numeric vector specifying the bucket boundaries.
Returns a bucketized column.
Raises:
ValueError: if source_column is not a numeric column, or if it is not one-dimensional.
ValueError: if boundaries is not sorted.
Other feature column constructors: column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()
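A small sketch (the "price" feature name and boundaries are assumptions):

library(tfestimators)

price <- column_numeric("price")

# Buckets: (-Inf, 0), [0, 10), [10, 100), [100, Inf)
bucketized_price <- column_bucketized(price, boundaries = c(0, 10, 100))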
Use this when each of your sparse inputs has both an ID and a value. For example, if you're representing text documents as a collection of word frequencies, you can provide 2 parallel sparse input features ('terms' and 'frequencies' below).
column_categorical_weighted( categorical_column, weight_feature_key, dtype = tf$float32 )
categorical_column: A categorical column created by one of the column_categorical_*() functions.
weight_feature_key: String key for the weight values.
dtype: Type of the weights, such as tf$float32.
Returns a categorical column composed of two sparse features: one represents the id, the other represents the weight (value) of the id feature in that example.
Raises ValueError if dtype is not convertible to float.
Other feature column constructors: column_bucketized(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()
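Following the 'terms' / 'frequencies' description above, a hedged sketch:

library(tfestimators)

terms <- column_categorical_with_hash_bucket("terms", hash_bucket_size = 1000L)

# Pair each term id with its frequency, taken from the "frequencies" feature
weighted_terms <- column_categorical_weighted(
  categorical_column = terms,
  weight_feature_key = "frequencies"
)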
Use this when your sparse features are in string or integer format, and you want to distribute your inputs into a finite number of buckets by hashing: output_id = Hash(input_feature_string) % bucket_size. For the input dictionary features, features$key is either a tensor or a sparse tensor. If it's a tensor, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.
column_categorical_with_hash_bucket(..., hash_bucket_size, dtype = tf$string)
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
hash_bucket_size: An integer > 1: the number of buckets.
dtype: The type of features. Only string and integer types are supported.
Returns a hashed categorical column.
Raises:
ValueError: if hash_bucket_size is not greater than 1.
ValueError: if dtype is neither string nor integer.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()
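A short sketch (the "keywords" feature name and bucket count are assumptions):

library(tfestimators)

keywords <- column_categorical_with_hash_bucket("keywords", hash_bucket_size = 10000L)

# Hashed categorical columns can be fed directly to a linear model
model <- linear_classifier(feature_columns = feature_columns(keywords))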
Use this when your inputs are integers in the range [0, num_buckets), and you want to use the input value itself as the categorical ID. Values outside this range will result in default_value if specified; otherwise the lookup will fail.
column_categorical_with_identity(..., num_buckets, default_value = NULL)
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
num_buckets: Number of unique values.
default_value: If NULL, this column's graph operations will fail for out-of-range inputs; otherwise, this value must be in the range [0, num_buckets) and will replace out-of-range inputs.
Typically, this is used for contiguous ranges of integer indexes, but it doesn't have to be. This might be inefficient, however, if many of the IDs are unused. Consider column_categorical_with_hash_bucket() in that case. For the input dictionary features, features$key is either a tensor or a sparse tensor. If it's a tensor, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.
Returns a categorical column that returns identity values.
Raises:
ValueError: if num_buckets is less than one.
ValueError: if default_value is not in the range [0, num_buckets).
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()
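A short sketch (the "video_id" feature name and bucket count are assumptions):

library(tfestimators)

# Integer ids in [0, 1e6) are used as-is; out-of-range inputs map to id 0
video_id <- column_categorical_with_identity(
  "video_id", num_buckets = 1000000L, default_value = 0L
)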
Use this when your inputs are in string or integer format, and you have a vocabulary file that maps each value to an integer ID. By default, out-of-vocabulary values are ignored. Use either (but not both) of num_oov_buckets and default_value to specify how to include out-of-vocabulary values. For the input dictionary features, features[key] is either a tensor or a sparse tensor. If it's a tensor, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.
column_categorical_with_vocabulary_file( ..., vocabulary_file, vocabulary_size, num_oov_buckets = 0L, default_value = NULL, dtype = tf$string )
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
vocabulary_file: The vocabulary file name.
vocabulary_size: Number of elements in the vocabulary. This must be no greater than the number of entries in vocabulary_file.
num_oov_buckets: Non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs will be assigned IDs in the range [vocabulary_size, vocabulary_size + num_oov_buckets), based on a hash of the input value.
default_value: The integer ID value to return for out-of-vocabulary feature values. This cannot be specified together with a positive num_oov_buckets.
dtype: The type of features. Only string and integer types are supported.
Returns a categorical column with a vocabulary file.
Raises:
ValueError: if vocabulary_file is missing.
ValueError: if vocabulary_size is missing or < 1.
ValueError: if num_oov_buckets is not a non-negative integer.
ValueError: if dtype is neither string nor integer.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()
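A hedged sketch (the "state" feature and the "states.txt" vocabulary file are hypothetical):

library(tfestimators)

states <- column_categorical_with_vocabulary_file(
  "state",
  vocabulary_file = "states.txt",  # hypothetical file, one value per line
  vocabulary_size = 50L,
  num_oov_buckets = 5L
)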
Use this when your inputs are in string or integer format, and you have an in-memory vocabulary mapping each value to an integer ID. By default, out-of-vocabulary values are ignored. Use default_value to specify how to include out-of-vocabulary values. For the input dictionary features, features$key is either a tensor or a sparse tensor. If it's a tensor, missing values can be represented by -1 for int and '' for string.
column_categorical_with_vocabulary_list( ..., vocabulary_list, dtype = NULL, default_value = -1L, num_oov_buckets = 0L )
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
vocabulary_list: An ordered iterable defining the vocabulary. Each feature is mapped to the index of its value (if present) in vocabulary_list.
dtype: The type of features. Only string and integer types are supported. If NULL, it will be inferred from vocabulary_list.
default_value: The value to use for values not in vocabulary_list.
num_oov_buckets: Non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs will be assigned IDs in the range [length(vocabulary_list), length(vocabulary_list) + num_oov_buckets), based on a hash of the input value. Note that these values are independent of the default_value argument.
Returns a categorical column with an in-memory vocabulary.
Raises:
ValueError: if vocabulary_list is empty, or contains duplicate keys.
ValueError: if dtype is neither integer nor string.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_crossed(), column_embedding(), column_numeric(), input_layer()
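A short sketch (the "color" feature name and vocabulary are assumptions):

library(tfestimators)

colors <- column_categorical_with_vocabulary_list(
  "color",
  vocabulary_list = c("red", "green", "blue"),
  num_oov_buckets = 2L
)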
Returns a column for performing crosses of categorical features. Crossed features will be hashed according to hash_bucket_size.
column_crossed(keys, hash_bucket_size, hash_key = NULL)
keys: An iterable identifying the features to be crossed. Each element can be either a string (a feature name) or a categorical column.
hash_bucket_size: The number of buckets (> 1).
hash_key: Optional: specify the hash_key that will be used by the hashing function to combine the fingerprints of the crossed features.
Returns a crossed column.
Raises:
ValueError: if length(keys) < 2.
ValueError: if any of the keys is neither a string nor a categorical column.
ValueError: if any of the keys is a hashed categorical column.
ValueError: if hash_bucket_size < 1.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_embedding(), column_numeric(), input_layer()
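A short sketch (the crossed feature names are assumptions):

library(tfestimators)

education_x_occupation <- column_crossed(
  keys = c("education", "occupation"),
  hash_bucket_size = 1000L
)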
Use this when your inputs are sparse, but you want to convert them to a dense
representation (e.g., to feed to a DNN). Inputs must be a
categorical column created by any of the column_categorical_*()
functions.
column_embedding( categorical_column, dimension, combiner = "mean", initializer = NULL, ckpt_to_load_from = NULL, tensor_name_in_ckpt = NULL, max_norm = NULL, trainable = TRUE )
categorical_column: A categorical column created by one of the column_categorical_*() functions.
dimension: A positive integer, specifying the dimension of the embedding.
combiner: A string specifying how to reduce if there are multiple entries in a single row. Currently "mean", "sqrtn" and "sum" are supported, with "mean" the default.
initializer: A variable initializer function to be used in embedding variable initialization. If not specified, defaults to a truncated normal initializer.
ckpt_to_load_from: String representing a checkpoint name/pattern from which to restore the column weights. Required if tensor_name_in_ckpt is not NULL.
tensor_name_in_ckpt: Name of the tensor in ckpt_to_load_from from which to restore the column weights. Required if ckpt_to_load_from is not NULL.
max_norm: If not NULL, embedding values are l2-normalized to this value.
trainable: Whether or not the embedding is trainable. Default is TRUE.
Returns a dense column that converts from sparse input.
Raises:
ValueError: if dimension is not > 0.
ValueError: if exactly one of ckpt_to_load_from and tensor_name_in_ckpt is specified.
ValueError: if initializer is specified but not callable.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_numeric(), input_layer()
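A short sketch (the "breed" vocabulary and embedding dimension are assumptions):

library(tfestimators)

breed <- column_categorical_with_vocabulary_list(
  "breed", vocabulary_list = c("beagle", "husky", "pug"))

# Map each breed id to a dense 8-dimensional embedding vector
breed_embedding <- column_embedding(breed, dimension = 8L)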
Use this to wrap any column created by the column_categorical_*() functions (e.g., to feed to a DNN). Use column_embedding() if the inputs are sparse.
column_indicator(categorical_column)
categorical_column: A categorical column created by one of the column_categorical_*() functions.
Returns an indicator column.
Construct a Real-Valued Column
column_numeric( ..., shape = c(1L), default_value = NULL, dtype = tf$float32, normalizer_fn = NULL )
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
shape: An integer vector that specifies the shape of the tensor. An integer can be given, which means a single-dimension tensor with the given width. The tensor representing the column will have shape [batch_size] + shape.
default_value: A single value compatible with dtype, used as the default when data is missing.
dtype: The type for values contained in the column. The default value is tf$float32.
normalizer_fn: If not NULL, a function that can be used to normalize the value of the tensor after default_value is applied for parsing.
Returns a numeric column.
Raises:
TypeError: if any dimension in shape is not an int.
ValueError: if any dimension in shape is not a positive integer.
TypeError: if default_value is an iterable but not compatible with shape.
TypeError: if default_value is not compatible with dtype.
ValueError: if dtype is not convertible to tf$float32.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), input_layer()
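A short sketch (the feature names, shape, and normalizer are assumptions):

library(tfestimators)

cols <- feature_columns(
  column_numeric("price"),
  # A two-element numeric feature with a simple centering/scaling normalizer
  column_numeric("size", shape = c(2L), normalizer_fn = function(x) (x - 10) / 5)
)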
This helper function provides a set of names to be used by tidyselect helpers in e.g. feature_columns().
set_columns(columns) with_columns(columns, expr) scoped_columns(columns)
columns: Either a named R object (whose names will be used to provide a selection context), or a character vector of such names.
expr: An R expression, to be evaluated with the selection context active.
Create a deep neural network (DNN) estimator.
dnn_regressor( hidden_units, feature_columns, model_dir = NULL, label_dimension = 1L, weight_column = NULL, optimizer = "Adagrad", activation_fn = "relu", dropout = NULL, input_layer_partitioner = NULL, config = NULL ) dnn_classifier( hidden_units, feature_columns, model_dir = NULL, n_classes = 2L, weight_column = NULL, label_vocabulary = NULL, optimizer = "Adagrad", activation_fn = "relu", dropout = NULL, input_layer_partitioner = NULL, config = NULL )
hidden_units: An integer vector, indicating the number of hidden units in each layer. All layers are fully connected. For example, c(64, 32) means the first layer has 64 nodes and the second one has 32.
feature_columns: An R list containing all of the feature columns used by the model (typically generated by feature_columns()).
model_dir: Directory to save the model parameters, graph, and so on. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model.
label_dimension: Number of regression targets per example. This is the size of the last dimension of the labels and logits tensors.
weight_column: A string, or a numeric column created by column_numeric(), defining the feature column representing example weights.
optimizer: Either the name of the optimizer to be used when training the model, or a TensorFlow optimizer instance. Defaults to the Adagrad optimizer.
activation_fn: The activation function to apply to each layer. This can either be an actual activation function (e.g. tf$nn$relu) or the name of an activation function (e.g. "relu").
dropout: When not NULL, the probability that a given coordinate will be dropped.
input_layer_partitioner: An optional partitioner for the input layer.
config: A run configuration created by run_config().
n_classes: The number of label classes.
label_vocabulary: A list of strings representing possible label values. If given, labels must be of string type and take values in label_vocabulary.
Other canned estimators: boosted_trees_estimators, dnn_linear_combined_estimators, linear_estimators
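A minimal sketch (using mtcars purely for illustration; the column choices and layer sizes are assumptions):

library(tfestimators)

cols <- feature_columns(column_numeric("drat", "cyl"))

model <- dnn_regressor(hidden_units = c(10, 10), feature_columns = cols)
model %>% train(input_fn(mpg ~ drat + cyl, data = mtcars))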
Also known as wide-n-deep estimators, these are estimators for TensorFlow Linear and DNN joined models, for regression and classification.
dnn_linear_combined_regressor( model_dir = NULL, linear_feature_columns = NULL, linear_optimizer = "Ftrl", dnn_feature_columns = NULL, dnn_optimizer = "Adagrad", dnn_hidden_units = NULL, dnn_activation_fn = "relu", dnn_dropout = NULL, label_dimension = 1L, weight_column = NULL, input_layer_partitioner = NULL, config = NULL ) dnn_linear_combined_classifier( model_dir = NULL, linear_feature_columns = NULL, linear_optimizer = "Ftrl", dnn_feature_columns = NULL, dnn_optimizer = "Adagrad", dnn_hidden_units = NULL, dnn_activation_fn = "relu", dnn_dropout = NULL, n_classes = 2L, weight_column = NULL, label_vocabulary = NULL, input_layer_partitioner = NULL, config = NULL )
model_dir: Directory to save the model parameters, graph, and so on. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model.
linear_feature_columns: The feature columns used by the linear (wide) part of the model.
linear_optimizer: Either the name of the optimizer to be used when training the model, or a TensorFlow optimizer instance. Defaults to the FTRL optimizer.
dnn_feature_columns: The feature columns used by the neural network (deep) part of the model.
dnn_optimizer: Either the name of the optimizer to be used when training the model, or a TensorFlow optimizer instance. Defaults to the Adagrad optimizer.
dnn_hidden_units: An integer vector, indicating the number of hidden units in each layer. All layers are fully connected. For example, c(64, 32) means the first layer has 64 nodes and the second one has 32.
dnn_activation_fn: The activation function to apply to each layer. This can either be an actual activation function (e.g. tf$nn$relu) or the name of an activation function (e.g. "relu").
dnn_dropout: When not NULL, the probability that a given coordinate will be dropped.
label_dimension: Number of regression targets per example. This is the size of the last dimension of the labels and logits tensors.
weight_column: A string, or a numeric column created by column_numeric(), defining the feature column representing example weights.
input_layer_partitioner: An optional partitioner for the input layer.
config: A run configuration created by run_config().
n_classes: The number of label classes.
label_vocabulary: A list of strings representing possible label values. If given, labels must be of string type and take values in label_vocabulary.
Other canned estimators: boosted_trees_estimators, dnn_estimators, linear_estimators
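A minimal wide-and-deep sketch (using mtcars purely for illustration; the split of columns between the wide and deep parts is an assumption):

library(tfestimators)

model <- dnn_linear_combined_regressor(
  linear_feature_columns = feature_columns(column_numeric("cyl")),  # wide part
  dnn_feature_columns = feature_columns(column_numeric("drat")),    # deep part
  dnn_hidden_units = c(8L, 8L)
)

model %>% train(input_fn(mpg ~ drat + cyl, data = mtcars))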
Construct a custom estimator, to be used to train and evaluate TensorFlow models.
estimator( model_fn, model_dir = NULL, config = NULL, params = NULL, class = NULL )
model_fn: The model function. See Model Function for details on the structure of a model function.
model_dir: Directory to save model parameters, graph, and so on. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model. If NULL, a temporary directory is used.
config: Configuration object.
params: List of hyperparameters that will be passed into model_fn.
class: An optional set of R classes to add to the generated object.
The Estimator object wraps a model which is specified by a model_fn, which, given inputs and a number of other parameters, returns the operations necessary to perform training, evaluation, and prediction. All outputs (checkpoints, event files, etc.) are written to model_dir, or a subdirectory thereof. If model_dir is not set, a temporary directory is used.
The config argument can be used to pass a run configuration object containing information about the execution environment. It is passed on to the model_fn if the model_fn has a parameter named "config" (and to the input functions in the same manner). If the config parameter is not passed, it is instantiated by estimator(). Not passing config means that defaults useful for local execution are used. estimator() makes config available to the model (for instance, to allow specialization based on the number of workers available), and also uses some of its fields to control internals, especially regarding checkpointing.
The params argument contains hyperparameters. It is passed to the model_fn if the model_fn has a parameter named "params", and to the input functions in the same manner. estimator() only passes params along; it does not inspect it. The structure of params is therefore entirely up to the developer.
None of estimator's methods can be overridden in subclasses (its
constructor enforces this). Subclasses should use model_fn
to configure the
base class, and may add methods implementing specialized functionality.
The model_fn
should be an R function of the form:
function(features, labels, mode, params) {
  # 1. Configure the model via TensorFlow operations.
  # 2. Define the loss function for training and evaluation.
  # 3. Define the training optimizer.
  # 4. Define how predictions should be produced.
  # 5. Return the result as an `estimator_spec()` object.
  estimator_spec(mode, predictions, loss, train_op, eval_metric_ops)
}
The model function's inputs are defined as follows:
features: The feature tensor(s).
labels: The label tensor(s).
mode: The current training mode ("train", "eval", "infer"). These can be accessed through the mode_keys() object.
params: An optional list of hyperparameters, as received through the estimator() constructor.
See estimator_spec()
for more details as to how the estimator specification
should be constructed, and https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/Estimator for
more information as to how the model function should be constructed.
Other custom estimator methods: estimator_spec(), evaluate.tf_estimator(), export_savedmodel.tf_estimator(), predict.tf_estimator(), train.tf_estimator()
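A hedged end-to-end sketch of a custom estimator: a single-feature linear regression whose model_fn follows the skeleton above. The feature name, learning rate, and lack of mode-specific branching are simplifying assumptions (a production model_fn would also handle the "infer" mode, where labels are absent):

library(tfestimators)

model_fn <- function(features, labels, mode, params) {
  # 1. Configure the model: one linear output unit over the "drat" feature
  x <- tf$cast(tf$expand_dims(features[["drat"]], axis = 1L), tf$float32)
  predictions <- tf$layers$dense(x, units = 1L)
  # 2. Define the loss
  y <- tf$cast(tf$expand_dims(labels, axis = 1L), tf$float32)
  loss <- tf$losses$mean_squared_error(y, predictions)
  # 3. Define the training optimizer
  optimizer <- tf$train$GradientDescentOptimizer(learning_rate = 0.01)
  train_op <- optimizer$minimize(loss, global_step = tf$train$get_global_step())
  # 4./5. Return predictions and ops as an estimator_spec()
  estimator_spec(mode = mode, predictions = predictions,
                 loss = loss, train_op = train_op)
}

model <- estimator(model_fn)
model %>% train(input_fn(mpg ~ drat, data = mtcars))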
Define the estimator specification, used as part of the model_fn defined with custom estimators created by estimator(). See estimator() for more details.
estimator_spec( mode, predictions = NULL, loss = NULL, train_op = NULL, eval_metric_ops = NULL, training_hooks = NULL, evaluation_hooks = NULL, prediction_hooks = NULL, training_chief_hooks = NULL, ... )
mode: A key that specifies whether we are performing training ("train"), evaluation ("eval"), or prediction ("infer"). These can be accessed through the mode_keys() object.
predictions: The prediction tensor(s).
loss: The training loss tensor. Must be either scalar or one-dimensional.
train_op: The training operation: typically, a call to optimizer$minimize(), depending on the type of optimizer used during training.
eval_metric_ops: A list of metrics to be computed as part of evaluation. This should be a named list, mapping metric names (e.g. "rmse") to the operations that compute them.
training_hooks: (Available since TensorFlow v1.4) A list of session run hooks to run on all workers during training.
evaluation_hooks: (Available since TensorFlow v1.4) A list of session run hooks to run during evaluation.
prediction_hooks: (Available since TensorFlow v1.7) A list of session run hooks to run during prediction.
training_chief_hooks: (Available since TensorFlow v1.4) A list of session run hooks to run on the chief worker during training.
...: Other optional (named) arguments, to be passed to the EstimatorSpec constructor.
Other custom estimator methods: estimator(), evaluate.tf_estimator(), export_savedmodel.tf_estimator(), predict.tf_estimator(), train.tf_estimator()
Base Documentation for Canned Estimators
object: A TensorFlow estimator.
feature_columns: An R list containing all of the feature columns used by the model (typically generated by feature_columns()).
model_dir: Directory to save the model parameters, graph, and so on. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model.
label_dimension: Number of regression targets per example. This is the size of the last dimension of the labels and logits tensors.
label_vocabulary: A list of strings representing possible label values. If given, labels must be of string type and take values in label_vocabulary.
weight_column: A string, or a numeric column created by column_numeric(), defining the feature column representing example weights.
n_classes: The number of label classes.
config: A run configuration created by run_config().
input_layer_partitioner: An optional partitioner for the input layer.
partitioner: An optional partitioner for the input layer.
Used with train_and_evaluate(): eval_spec() combines details of evaluating the trained model as well as exporting it. Evaluation consists of computing metrics to judge the performance of the trained model. Export writes out the trained model to external storage.
eval_spec( input_fn, steps = 100, name = NULL, hooks = NULL, exporters = NULL, start_delay_secs = 120, throttle_secs = 600 )
input_fn: Evaluation input function, returning a tuple of features and labels (each a tensor or a named list of tensors).
steps: Positive number of steps for which to evaluate the model. If NULL, evaluation runs until input_fn raises an end-of-input exception.
name: Name of the evaluation, if the user needs to run multiple evaluations on different data sets. Metrics for different evaluations are saved in separate folders, and appear separately in TensorBoard.
hooks: List of session run hooks to run during evaluation.
exporters: List of exporters, or a single one, or NULL.
start_delay_secs: Start evaluating after waiting for this many seconds.
throttle_secs: Do not re-evaluate unless the last evaluation was started at least this many seconds ago. Of course, evaluation does not occur if no new checkpoints are available; hence, this is the minimum.
Other training methods: train_and_evaluate.tf_estimator(), train_spec()
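A hedged sketch of wiring eval_spec() into train_and_evaluate() (the estimator and mtcars-based input functions are illustrative assumptions):

library(tfestimators)

model <- linear_regressor(feature_columns = feature_columns(column_numeric("drat")))

train_and_evaluate(
  model,
  train_spec = train_spec(input_fn = input_fn(mpg ~ drat, data = mtcars),
                          max_steps = 100L),
  eval_spec = eval_spec(input_fn = input_fn(mpg ~ drat, data = mtcars),
                        steps = 10L)
)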
Evaluate an estimator on input data provided by an input_fn().
## S3 method for class 'tf_estimator' evaluate( object, input_fn, steps = NULL, checkpoint_path = NULL, name = NULL, hooks = NULL, simplify = TRUE, ... )
object: A TensorFlow estimator.
input_fn: An input function, typically generated by the input_fn() helper function.
steps: The number of steps for which the model should be evaluated on this particular evaluate() run. If NULL, evaluation continues until the input_fn is exhausted of data.
checkpoint_path: The path to a specific model checkpoint to be used for prediction. If NULL, the latest checkpoint in model_dir is used.
name: Name of the evaluation, if the user needs to run multiple evaluations on different data sets, such as on training data vs test data. Metrics for different evaluations are saved in separate folders, and appear separately in TensorBoard.
hooks: A list of R functions, to be used as callbacks inside the training loop. By default, hook_history_saver() and hook_progress_bar() will be attached.
simplify: Whether to simplify evaluation results into a tibble, as opposed to a list. Defaults to TRUE.
...: Optional arguments passed on to the estimator's evaluate() method.
For each step, this method will call input_fn() to produce a single batch of data. Evaluation continues until either steps batches are processed, or the input_fn() is exhausted of data.
Returns an R list of evaluation metrics.
Other custom estimator methods: estimator_spec(), estimator(), export_savedmodel.tf_estimator(), predict.tf_estimator(), train.tf_estimator()
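A minimal train-then-evaluate sketch (using mtcars purely for illustration):

library(tfestimators)

cols <- feature_columns(column_numeric("drat"))
model <- linear_regressor(feature_columns = cols)

model %>% train(input_fn(mpg ~ drat, data = mtcars))
metrics <- model %>% evaluate(input_fn(mpg ~ drat, data = mtcars))
metrics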
Construct an experiment object.
experiment(object, ...)
object: An R object.
...: Optional arguments passed on to implementing methods.
Save an estimator (alongside its weights) to the directory export_dir_base.
## S3 method for class 'tf_estimator' export_savedmodel( object, export_dir_base, serving_input_receiver_fn = NULL, assets_extra = NULL, as_text = FALSE, checkpoint_path = NULL, overwrite = TRUE, versioned = !overwrite, ... )
object: A TensorFlow estimator.
export_dir_base: A string containing a directory in which to export the SavedModel.
serving_input_receiver_fn: A function that takes no argument and returns a ServingInputReceiver. Required for custom models.
assets_extra: A dict specifying how to populate the assets.extra directory within the exported SavedModel, or NULL if no extra assets are needed.
as_text: Whether to write the SavedModel proto in text format.
checkpoint_path: The checkpoint path to export. If NULL (the default), the most recent checkpoint found within the model directory is chosen.
overwrite: Should the export directory be overwritten if it already exists?
versioned: Should the model be exported under a versioned subdirectory?
...: Optional arguments passed on to the estimator's export_savedmodel() method.
This method builds a new graph by first calling the serving_input_receiver_fn to obtain feature Tensors, and then calling this Estimator's model_fn to generate the model graph based on those features. It restores the given checkpoint (or, lacking that, the most recent checkpoint) into this graph in a fresh session. Finally it creates a timestamped export directory below the given export_dir_base, and writes a SavedModel into it containing a single MetaGraphDef saved from this session. The exported MetaGraphDef will provide one SignatureDef for each element of the export_outputs dict returned from the model_fn, named using the same keys. One of these keys is always signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY, indicating which signature will be served when a serving request does not specify one. For each signature, the outputs are provided by the corresponding ExportOutputs, and the inputs are always the input receivers provided by the serving_input_receiver_fn. Extra assets may be written into the SavedModel via the assets_extra argument. This should be a dict, where each key gives a destination path (including the filename) relative to the assets.extra directory. The corresponding value gives the full path of the source file to be copied. For example, the simple case of copying a single file without renaming it is specified as list("my_asset_file.txt" = "/path/to/my_asset_file.txt").
The path to the exported directory, as a string.
ValueError: if no serving_input_receiver_fn is provided, no export_outputs are provided, or no checkpoint can be found.
Other custom estimator methods: estimator_spec(), estimator(), evaluate.tf_estimator(), predict.tf_estimator(), train.tf_estimator()
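A hedged sketch, assuming a trained canned estimator (for which a serving input receiver can typically be derived from its feature columns; custom estimators need an explicit serving_input_receiver_fn):

library(tfestimators)

cols <- feature_columns(column_numeric("drat"))
model <- linear_regressor(feature_columns = cols)
model %>% train(input_fn(mpg ~ drat, data = mtcars))

# Writes a timestamped SavedModel directory under "savedmodel/"
export_dir <- model %>% export_savedmodel("savedmodel")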
Constructors for feature columns. A feature column defines the expected 'shape' of an input Tensor.
feature_columns(..., names = NULL)
...: One or more feature column definitions. The tidyselect package is used to power generation of feature columns.
names: Available feature names (for selection / pattern matching) as a character vector (or an R object that implements names() or colnames()).
The standard library uses various well-known names to collect and retrieve values associated with a graph.
graph_keys()
For example, the tf$Optimizer subclasses default to optimizing the variables collected under graph_keys()$TRAINABLE_VARIABLES if NULL is specified, but it is also possible to pass an explicit list of variables. The following standard keys are defined:
GLOBAL_VARIABLES: the default collection of Variable objects, shared across the distributed environment (model variables are a subset of these). See tf$global_variables for more details. Commonly, all TRAINABLE_VARIABLES variables will be in MODEL_VARIABLES, and all MODEL_VARIABLES variables will be in GLOBAL_VARIABLES.
LOCAL_VARIABLES: the subset of Variable objects that are local to each machine. Usually used for temporary variables, like counters. Note: use tf$contrib$framework$local_variable to add to this collection.
MODEL_VARIABLES: the subset of Variable objects that are used in the model for inference (feed forward). Note: use tf$contrib$framework$model_variable to add to this collection.
TRAINABLE_VARIABLES: the subset of Variable objects that will be trained by an optimizer. See tf$trainable_variables for more details.
SUMMARIES: the summary Tensor objects that have been created in the graph. See tf$summary$merge_all for more details.
QUEUE_RUNNERS: the QueueRunner objects that are used to produce input for a computation. See tf$train$start_queue_runners for more details.
MOVING_AVERAGE_VARIABLES: the subset of Variable objects that will also keep moving averages. See tf$moving_average_variables for more details.
REGULARIZATION_LOSSES: regularization losses collected during graph construction.
The following standard keys are defined, but their collections are not automatically populated as many of the others are: WEIGHTS, BIASES, ACTIVATIONS.
Other utility functions:
latest_checkpoint()
## Not run:
graph_keys()
graph_keys()$LOSSES
## End(Not run)
Saves Checkpoints Every N Steps or Seconds
hook_checkpoint_saver( checkpoint_dir, save_secs = NULL, save_steps = NULL, saver = NULL, checkpoint_basename = "model.ckpt", scaffold = NULL, listeners = NULL )
checkpoint_dir: The base directory for the checkpoint files.
save_secs: An integer, indicating that checkpoints are saved every N seconds.
save_steps: An integer, indicating that checkpoints are saved every N steps.
saver: A saver object, used for saving.
checkpoint_basename: The base name for the checkpoint files.
scaffold: A scaffold, used to get the saver object.
listeners: List of checkpoint saver listener subclass instances, used for callbacks that run immediately before or after this hook saves a checkpoint.
Other session_run_hook wrappers: hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
This hook delays execution until the global step reaches wait_until_step. It is used to gradually start workers in distributed settings. One example usage would be setting wait_until_step = as.integer(K * log(task_id + 1)), assuming that task_id = 0 is the chief.
hook_global_step_waiter(wait_until_step)
wait_until_step: An integer indicating the global step to wait until.
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
This hook allows users to save the metrics history produced during training or evaluation in a specified frequency.
hook_history_saver(every_n_step = 10)
every_n_step: Save the metrics every N steps.
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
The tensors will be printed to the log, with INFO severity.
hook_logging_tensor( tensors, every_n_iter = NULL, every_n_secs = NULL, formatter = NULL, at_end = FALSE )
tensors: A list that maps string-valued tags to tensors/tensor names.
every_n_iter: An integer value, indicating that the values of tensors will be printed once every N local steps taken on the current worker.
every_n_secs: An integer or float value, indicating that the values of tensors will be printed once every N seconds. Exactly one of every_n_iter and every_n_secs should be provided.
formatter: A function that takes a list of tensor values and returns a string. If NULL, default printing of all tensors is used.
at_end: A boolean value specifying whether to print the values of tensors at the end of the run.
Note that if at_end is TRUE, tensors should not include any tensor whose evaluation produces a side effect such as consuming additional inputs.
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
Monitors loss and stops training if loss is NaN. Can either fail with exception or just stop training.
hook_nan_tensor(loss_tensor, fail_on_nan_loss = TRUE)
loss_tensor |
The loss tensor. |
fail_on_nan_loss |
A boolean indicating whether to raise exception when loss is NaN. |
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
This hook creates and updates a progress bar during training or evaluation.
hook_progress_bar()
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
Steps per Second Monitor
hook_step_counter( every_n_steps = 100, every_n_secs = NULL, output_dir = NULL, summary_writer = NULL )
every_n_steps: Run this counter every N steps.
every_n_secs: Run this counter every N seconds.
output_dir: The output directory.
summary_writer: The summary writer.
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_stop_at_step(), hook_summary_saver(), session_run_hook()
Monitor to Request Stop at a Specified Step
hook_stop_at_step(num_steps = NULL, last_step = NULL)
num_steps: Number of steps to execute.
last_step: Step after which to stop.
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_summary_saver(), session_run_hook()
Saves Summaries Every N Steps
hook_summary_saver( save_steps = NULL, save_secs = NULL, output_dir = NULL, summary_writer = NULL, scaffold = NULL, summary_op = NULL )
save_steps: An integer indicating that summaries are saved every N steps. Exactly one of save_steps and save_secs should be set.
save_secs: An integer indicating that summaries are saved every N seconds.
output_dir: The directory to save the summaries to. Only used if no summary_writer is supplied.
summary_writer: The summary writer. If NULL and an output_dir was passed, one will be created accordingly.
scaffold: A scaffold to get the summary_op if it's not provided.
summary_op: A tensor of type string containing a serialized Summary protocol buffer, or a list of such tensors.
Other session_run_hook wrappers: hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), session_run_hook()
This function constructs input function from various types of input used to feed different TensorFlow estimators.
input_fn(object, ...)

## Default S3 method:
input_fn(object, ...)

## S3 method for class 'formula'
input_fn(object, data, ...)

## S3 method for class 'data.frame'
input_fn(object, features, response = NULL, batch_size = 128, shuffle = "auto",
  num_epochs = 1, queue_capacity = 1000, num_threads = 1, ...)

## S3 method for class 'list'
input_fn(object, features, response = NULL, batch_size = 128, shuffle = "auto",
  num_epochs = 1, queue_capacity = 1000, num_threads = 1, ...)

## S3 method for class 'matrix'
input_fn(object, ...)
object, data: An 'input source', either a data set (e.g. an R data.frame), or another kind of object that can provide the data required for training.
...: Optional arguments passed on to implementing submethods.
features: The names of feature variables to be used.
response: The name of the response variable.
batch_size: The batch size.
shuffle: Whether to shuffle the queue. When "auto" (the default), shuffling will be performed except when this input function is called by a predict() method.
num_epochs: The number of epochs to iterate over data.
queue_capacity: The size of the queue to accumulate.
num_threads: The number of threads used for reading and enqueueing. In order to have predictable and repeatable order of reading and enqueueing, such as in prediction and evaluation mode, num_threads should be 1.
For list objects, this method is particularly useful when constructing dynamic length of inputs for models like recurrent neural networks. Note that some arguments are not available yet for input_fn applied to list objects. See S3 method signatures below for more details.
Other input functions:
numpy_input_fn()
## Not run:
# Construct the input function through the formula interface
input_fn1 <- input_fn(mpg ~ drat + cyl, mtcars)

# Construct the input function from a data.frame object
input_fn1 <- input_fn(mtcars, response = mpg, features = c(drat, cyl))

# Construct the input function from a list object
input_fn1 <- input_fn(
  object = list(
    feature1 = list(
      list(list(1), list(2), list(3)),
      list(list(4), list(5), list(6))),
    feature2 = list(
      list(list(7), list(8), list(9)),
      list(list(10), list(11), list(12))),
    response = list(
      list(1, 2, 3),
      list(4, 5, 6))),
  features = c("feature1", "feature2"),
  response = "response",
  batch_size = 10L)
## End(Not run)
Returns a dense tensor as the input layer, based on the given feature_columns. At the first layer of the model, this column-oriented data should be converted to a single tensor.
input_layer( features, feature_columns, weight_collections = NULL, trainable = TRUE )
features: A mapping from key to tensors. Feature columns look up via these keys. For example, column_numeric("price") will look at the "price" key in this dict. Values can be a sparse tensor or a tensor, depending on the corresponding feature column.
feature_columns: An iterable containing the feature columns to use as inputs to your model. All items should be instances of classes derived from a dense column, such as column_numeric(), column_embedding(), column_bucketized(), or column_indicator(). Categorical features can be wrapped with a column_embedding() or column_indicator().
weight_collections: A list of collection names to which the Variable will be added. Note that variables will also be added to the collections graph_keys()$GLOBAL_VARIABLES and graph_keys()$MODEL_VARIABLES.
trainable: If TRUE, also add the variable to the graph collection graph_keys()$TRAINABLE_VARIABLES (see tf$Variable).
Returns a tensor which represents the input layer of a model. Its shape is (batch_size, first_layer_dimension) and its dtype is float32, where first_layer_dimension is determined based on the given feature_columns.
Raises ValueError if an item in feature_columns is not a dense column.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric()
Create an Estimator from a compiled Keras model
keras_model_to_estimator( keras_model = NULL, keras_model_path = NULL, custom_objects = NULL, model_dir = NULL, config = NULL )
keras_model: A Keras model.
keras_model_path: Directory to a Keras model on disk.
custom_objects: Dictionary for custom objects.
model_dir: Directory to save Estimator model parameters, graph, and so on.
config: Configuration object.
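A hedged sketch (assuming the keras R package is installed; the toy model below is an assumption):

library(keras)
library(tfestimators)

keras_model <- keras_model_sequential() %>%
  layer_dense(units = 1L, input_shape = 1L)
keras_model %>% compile(loss = "mse", optimizer = "sgd")

est <- keras_model_to_estimator(keras_model = keras_model)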
Get the Latest Checkpoint in a Checkpoint Directory
latest_checkpoint(checkpoint_dir, ...)
checkpoint_dir: The path to the checkpoint directory.
...: Optional arguments passed on to the underlying tf$train$latest_checkpoint() call.
Other utility functions:
graph_keys()
Construct a linear model, which can be used to predict a continuous outcome (in the case of linear_regressor()) or a categorical outcome (in the case of linear_classifier()).
linear_regressor( feature_columns, model_dir = NULL, label_dimension = 1L, weight_column = NULL, optimizer = "Ftrl", config = NULL, partitioner = NULL ) linear_classifier( feature_columns, model_dir = NULL, n_classes = 2L, weight_column = NULL, label_vocabulary = NULL, optimizer = "Ftrl", config = NULL, partitioner = NULL )
feature_columns: An R list containing all of the feature columns used by the model (typically generated by feature_columns()).
model_dir: Directory to save the model parameters, graph, and so on. This can also be used to load checkpoints from the directory into an estimator to continue training a previously saved model.
label_dimension: Number of regression targets per example. This is the size of the last dimension of the labels and logits tensors.
weight_column: A string, or a numeric column created by column_numeric(), defining the feature column representing example weights.
optimizer: Either the name of the optimizer to be used when training the model, or a TensorFlow optimizer instance. Defaults to the FTRL optimizer.
config: A run configuration created by run_config().
partitioner: An optional partitioner for the input layer.
n_classes: The number of label classes.
label_vocabulary: A list of strings representing possible label values. If given, labels must be of string type and take values in label_vocabulary.
Other canned estimators: boosted_trees_estimators, dnn_estimators, dnn_linear_combined_estimators
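A minimal classification sketch (using mtcars purely for illustration):

library(tfestimators)

cols <- feature_columns(column_numeric("mpg"))
model <- linear_classifier(feature_columns = cols)

model %>% train(input_fn(am ~ mpg, data = mtcars))
predictions <- predict(model, input_fn(am ~ mpg, data = mtcars))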
The canonical set of keys that can be used to access metrics from canned estimators.
metric_keys()
Other estimator keys: mode_keys(), prediction_keys()
## Not run:
metrics <- metric_keys()

# Get the available keys
metrics

metrics$ACCURACY
## End(Not run)
The names for different possible modes for an estimator. The following standard keys are defined:
mode_keys()
TRAIN: Training mode.
EVAL: Evaluation mode.
PREDICT: Prediction / inference mode.
Other estimator keys: metric_keys(), prediction_keys()
## Not run:
modes <- mode_keys()
modes$TRAIN
## End(Not run)
Get the directory where a model's artifacts are stored.
model_dir(object, ...)
object: Model object.
...: Unused.
This returns a function outputting features and target based on a dict of numpy arrays. The dict features has the same keys as x.
numpy_input_fn( x, y = NULL, batch_size = 128, num_epochs = 1, shuffle = NULL, queue_capacity = 1000, num_threads = 1 )
x: A dict of numpy array objects.
y: A numpy array object.
batch_size: Integer, size of batches to return.
num_epochs: Integer, number of epochs to iterate over the data. If NULL, it will run forever.
shuffle: Boolean; if TRUE, shuffles the queue. Avoid shuffling at prediction time.
queue_capacity: Integer, size of the queue to accumulate.
num_threads: Integer, number of threads used for reading and enqueueing. In order to have predictable and repeatable order of reading and enqueueing, such as in prediction and evaluation mode, num_threads should be 1.
Note that this function is still experimental and should only be used if necessary, e.g. to feed in data that is a dictionary of numpy arrays.
Raises:
ValueError: if the shape of y mismatches the shape of values in x (i.e., values in x must have the same shape).
TypeError: if x is not a dict or shuffle is not a boolean.
Other input functions:
input_fn()
Plots metrics recorded during training.
## S3 method for class 'tf_estimator_history' plot( x, y, metrics = NULL, method = c("auto", "ggplot2", "base"), smooth = getOption("tf.estimator.plot.history.smooth", TRUE), theme_bw = getOption("tf.estimator.plot.history.theme_bw", FALSE), ... )
x: Training history object returned from train().
y: Unused.
metrics: One or more metrics to plot. Defaults to plotting all captured metrics.
method: Method to use for plotting. The default "auto" will use ggplot2 if available, and otherwise will use base graphics.
smooth: Whether a loess smooth should be added to the plot; only available for the ggplot2 method.
theme_bw: Use ggplot2::theme_bw() to plot the history in black and white.
...: Additional parameters to pass to the plot() method.
Generate predicted labels / values for input data provided by input_fn().
## S3 method for class 'tf_estimator' predict( object, input_fn, checkpoint_path = NULL, predict_keys = c("predictions", "classes", "class_ids", "logistic", "logits", "probabilities"), hooks = NULL, as_iterable = FALSE, simplify = TRUE, yield_single_examples = TRUE, ... )
object: A TensorFlow estimator.
input_fn: An input function, typically generated by the input_fn() helper function.
checkpoint_path: The path to a specific model checkpoint to be used for prediction. If NULL, the latest checkpoint in model_dir is used.
predict_keys: The types of predictions that should be produced, as an R list. When this argument is not specified (the default), all possible predicted values will be returned.
hooks: A list of R functions, to be used as callbacks inside the prediction loop. By default, hook_history_saver() and hook_progress_bar() will be attached.
as_iterable: Boolean; should a raw Python generator be returned? When FALSE (the default), the predicted values will be consumed from the generator and returned as an R object.
simplify: Whether to simplify prediction results into a tibble, as opposed to a list. Defaults to TRUE.
yield_single_examples: (Available since TensorFlow v1.7) If FALSE, yields the whole batch as returned by the model_fn instead of decomposing the batch into individual elements. This is useful if the model_fn returns some tensors whose first dimension is not equal to the batch size.
...: Optional arguments passed on to the estimator's predict() method.
Returns the evaluated values of the predictions tensors.
Raises:
ValueError: if a trained model could not be found in model_dir.
ValueError: if the batch lengths of the predictions are not the same.
ValueError: if there is a conflict between predict_keys and predictions; for example, if predict_keys is not NULL but EstimatorSpec.predictions is not a dict.
Other custom estimator methods: estimator_spec(), estimator(), evaluate.tf_estimator(), export_savedmodel.tf_estimator(), train.tf_estimator()
The canonical set of keys used for models and estimators that provide
different types of predicted values through their predict()
method.
prediction_keys()
Other estimator keys: metric_keys(), mode_keys()
## Not run:
keys <- prediction_keys()

# Get the available keys
keys

# Key for retrieving probabilities from prediction values
keys$PROBABILITIES
## End(Not run)
If users keep data in tf$Example format, they need to call tf$parse_example with a proper feature spec. This utility helps with two main things:
First, users need to combine the parsing specs of features with those of labels and weights (if any), since they are all parsed from the same tf$Example instance. This utility combines these specs.
Second, it is difficult to map the label expected by a regressor such as dnn_regressor to the corresponding tf$parse_example spec. This utility encodes it by getting the related information from users (key, dtype).
regressor_parse_example_spec(
  feature_columns,
  label_key,
  label_dtype = tf$float32,
  label_default = NULL,
  label_dimension = 1L,
  weight_column = NULL
)
feature_columns |
An iterable containing all feature columns. All items should be instances of classes derived from _FeatureColumn. |
label_key |
A string identifying the label. It means tf$Example stores labels with this key. |
label_dtype |
A tf$DType identifying the type of labels. By default it is tf$float32. |
label_default |
Used as the label if label_key does not exist in the given tf$Example. |
label_dimension |
Number of regression targets per example. This is the size of the last dimension of the labels and logits Tensor objects (typically, these have shape [batch_size, label_dimension]). |
weight_column |
A string, or a numeric column created by column_numeric() defining the feature column representing weights, used to down-weight or boost examples during training. |
A dict mapping each feature key to a FixedLenFeature or VarLenFeature value.
ValueError: If label is used in feature_columns.
ValueError: If weight_column is used in feature_columns.
ValueError: If any of the given feature_columns is not a _FeatureColumn instance.
ValueError: If weight_column is not a _NumericColumn instance.
ValueError: If label_key is NULL.
Other parsing utilities:
classifier_parse_example_spec()
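A minimal sketch of building a parsing spec; the feature keys "age", "country", "my_label", and "example_weight" are hypothetical:

fc_age <- column_numeric("age")
fc_country <- column_categorical_with_vocabulary_list(
  "country", vocabulary_list = c("US", "CA")
)

parsing_spec <- regressor_parse_example_spec(
  feature_columns = list(fc_age, fc_country),
  label_key = "my_label",
  weight_column = "example_weight"
)

# parsing_spec maps each feature key (plus the label and weight keys) to a
# FixedLenFeature or VarLenFeature, and can be passed to tf$parse_example()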
This class specifies the configurations for an Estimator run.
run_config()
Other run_config methods:
task_type()
## Not run: 
config <- run_config()

# Get the properties of the config
names(config)

# Change the mutable properties of the config
config <- config$replace(tf_random_seed = 11L, save_summary_steps = 12L)

# Print config as key value pairs
print(config)

## End(Not run)
Create a set of session run arguments. These are used as the return value of the before_run(context) callback of a session_run_hook(), for requesting the values of specific tensors in the after_run(context, values) callback.
session_run_args(...)
... |
A set of tensors or operations. |
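For example, a before_run(context) callback might request the current global step on every run (a minimal sketch assuming the TensorFlow 1.x API; see session_run_hook() below for the full wrapper):

before_run <- function(context) {
  # request the value of the global step tensor on each session run;
  # the evaluated value is delivered to the after_run() callback
  session_run_args(global_step = tf$train$get_global_step())
}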
Create a set of session run hooks, used to record information during training of an estimator. See Details for more information on the various hooks that can be defined.
session_run_hook(
  begin = function() { },
  after_create_session = function(session, coord) { },
  before_run = function(context) { },
  after_run = function(context, values) { },
  end = function(session) { }
)
begin |
Called once before using the session. |
after_create_session |
Called when a new TensorFlow session has been created. |
before_run |
Called before each call to run(). |
after_run |
Called after each call to run(). |
end |
Called at the end of the session. Typically, you'll want to define only the callbacks relevant to the information you need to record. |
Other session_run_hook wrappers:
hook_checkpoint_saver(), hook_global_step_waiter(), hook_history_saver(), hook_logging_tensor(), hook_nan_tensor(), hook_progress_bar(), hook_step_counter(), hook_stop_at_step(), hook_summary_saver()
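A minimal sketch of a custom hook built from these callbacks; it assumes the TensorFlow 1.x API and that the evaluated tensors arrive in after_run() under values$results, mirroring TensorFlow's SessionRunValues:

step_logger_hook <- session_run_hook(
  before_run = function(context) {
    # ask for the global step tensor on every session run
    session_run_args(global_step = tf$train$get_global_step())
  },
  after_run = function(context, values) {
    # print the evaluated global step after each run
    cat("global step:", values$results$global_step, "\n")
  }
)

# the hook can then be supplied to train() via the hooks argument:
# train(model, input_fn = my_input_fn, hooks = list(step_logger_hook))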
This constant class gives the constant strings for the available task types used in run_config.
task_type()
Other run_config methods:
run_config()
## Not run: 
task_type()$MASTER

## End(Not run)
This library provides an R interface to the Estimator API inside TensorFlow that's designed to streamline the process of creating, evaluating, and deploying general machine learning and deep learning models.
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.
The TensorFlow API is composed of a set of Python modules that enable constructing and executing TensorFlow graphs. The tensorflow package provides access to the complete TensorFlow API from within R.
For additional documentation on the tensorflow package, see https://tensorflow.rstudio.com.
Train and evaluate the estimator. (Available since TensorFlow v1.4)
## S3 method for class 'tf_estimator'
train_and_evaluate(object, train_spec, eval_spec, ...)
object |
An estimator object to train and evaluate. |
train_spec |
A TrainSpec instance, created by train_spec(), to specify the training specification. |
eval_spec |
An EvalSpec instance, created by eval_spec(), to specify the evaluation and export specification. |
... |
Not used. |
This utility function trains, evaluates, and (optionally) exports the model by using the given estimator. All training-related specification is held in train_spec, including the training input_fn and the maximum number of training steps. All evaluation- and export-related specification is held in eval_spec, including the evaluation input_fn, number of steps, and so on.
This utility function provides consistent behavior for both local (non-distributed) and distributed configurations. Currently, the only supported distributed training configuration is between-graph replication.
Overfitting: In order to avoid overfitting, it is recommended to set up the training input_fn to shuffle the training data properly. It is also recommended to train the model a little longer, say multiple epochs, before performing evaluation, as the input pipeline starts from scratch for each training run. This is particularly important for local training and evaluation.
Stop condition: In order to support both distributed and non-distributed configurations reliably, the only supported stop condition for model training is train_spec.max_steps. If train_spec.max_steps is NULL, the model is trained forever. Use with care if the model's stop condition is different. For example, assume that the model is expected to be trained with one epoch of training data, and the training input_fn is configured to throw OutOfRangeError after going through one epoch, which stops Estimator.train. In a three-training-worker distributed configuration, each worker is likely to go through the whole epoch independently, so the model will be trained with three epochs of training data instead of one.
ValueError: If the environment variable TF_CONFIG is incorrectly set.
Other training methods:
eval_spec(), train_spec()
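A minimal sketch of the call, assuming model is an estimator and train_input_fn / eval_input_fn are hypothetical input functions:

train_and_evaluate(
  model,
  train_spec = train_spec(input_fn = train_input_fn, max_steps = 1000L),
  eval_spec = eval_spec(input_fn = eval_input_fn)
)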
Configuration for the "train" component of train_and_evaluate. TrainSpec determines the input data for the training, as well as the duration. Optional hooks run at various stages of training.
train_spec(input_fn, max_steps = NULL, hooks = NULL)
input_fn |
Training input function returning a tuple of: features, a Tensor or dictionary mapping string feature names to Tensors; and labels, a Tensor or a dictionary of Tensors with labels. |
max_steps |
Positive number of total steps for which to train the model. If NULL, train forever. |
hooks |
List of session run hooks to run on all workers (including chief) during training. |
Other training methods:
eval_spec(), train_and_evaluate.tf_estimator()
Base Documentation for train, evaluate, and predict.
input_fn |
An input function, typically generated by the input_fn() helper function. |
hooks |
A list of R functions, to be used as callbacks inside the training loop. By default, hook_history_saver() and hook_progress_bar() will be attached if not provided, to track metrics and progress. |
checkpoint_path |
The path to a specific model checkpoint to be used for prediction. If NULL, the latest checkpoint in model_dir is used. |
Train an estimator on a set of input data provided by the input_fn().
## S3 method for class 'tf_estimator'
train(
  object,
  input_fn,
  steps = NULL,
  hooks = NULL,
  max_steps = NULL,
  saving_listeners = NULL,
  ...
)
object |
A TensorFlow estimator. |
input_fn |
An input function, typically generated by the input_fn() helper function. |
steps |
The number of steps for which the model should be trained on this particular train() invocation. If NULL (the default), train forever, or until the supplied input_fn() signals the end of input. |
hooks |
A list of R functions, to be used as callbacks inside the training loop. By default, hook_history_saver() and hook_progress_bar() will be attached if not provided, to track metrics and progress. |
max_steps |
The total number of steps for which the model should be trained. If set, steps must be NULL. If the model has already been trained for max_steps steps, no further training is performed. |
saving_listeners |
(Available since TensorFlow v1.4) A list of CheckpointSaverListener objects, used for callbacks that run immediately before or after checkpoint saving. |
... |
Optional arguments, passed on to the estimator's train() method. |
A data.frame of the training loss history.
Other custom estimator methods:
estimator_spec(), estimator(), evaluate.tf_estimator(), export_savedmodel.tf_estimator(), predict.tf_estimator()
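A minimal sketch of a training run; model and my_input_fn are hypothetical:

history <- train(
  model,
  input_fn = my_input_fn,
  steps = 100   # train for 100 steps on this invocation
)
history   # a data.frame of the recorded training loss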
These helper functions extract the names and values of variables in the graphs associated with trained estimator models.
variable_names(object)

variable_value(object, variable = NULL)
object |
A trained estimator model. |
variable |
(Optional) Names of variables to extract as a character vector. If not specified, values for all variables are returned. |
For variable_names(), a vector of variable names. For variable_value(), a named list of variable values.
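A minimal sketch, assuming model is a trained estimator; global_step is a variable present in every estimator graph:

# list the names of all variables in the model's graph
variable_names(model)

# extract the value of a single variable
variable_value(model, "global_step")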