Package 'keras3'

Title: R Interface to 'Keras'
Description: Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Authors: Tomasz Kalinowski [aut, cph, cre], Daniel Falbel [ctb, cph], JJ Allaire [aut, cph], François Chollet [aut, cph], Posit Software, PBC [cph, fnd], Google [cph, fnd], Yuan Tang [ctb, cph], Wouter Van Der Bijl [ctb, cph], Martin Studer [ctb, cph], Sigrid Keydana [ctb]
Maintainer: Tomasz Kalinowski <[email protected]>
License: MIT + file LICENSE
Version: 1.2.0.9000
Built: 2024-10-06 05:35:31 UTC
Source: https://github.com/rstudio/keras3

Help Index


Exponential Linear Unit.

Description

The exponential linear unit (ELU) with alpha > 0 is defined as:

  • x if x > 0

  • alpha * (exp(x) - 1) if x <= 0

ELUs have negative values, which push the mean of the activations closer to zero.

Mean activations that are closer to zero enable faster learning as they bring the gradient closer to the natural gradient. ELUs saturate to a negative value when the argument gets smaller. Saturation means a small derivative which decreases the variation and the information that is propagated to the next layer.
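
The piecewise definition can be checked outside of Keras with a few lines of base R. This is a minimal sketch, not the keras3 implementation: elu_ref is a hypothetical helper name, and it works on plain numeric vectors rather than tensors.

```r
# Base-R reference for ELU: x for x > 0, alpha * (exp(x) - 1) otherwise.
elu_ref <- function(x, alpha = 1) {
  # expm1(x) computes exp(x) - 1 accurately for x close to zero.
  ifelse(x > 0, x, alpha * expm1(x))
}

x <- c(-10, -1, 0, 1, 10)
elu_ref(x)              # negative inputs saturate towards -alpha
elu_ref(x, alpha = 2)   # a larger alpha deepens the negative plateau
```

Positive inputs pass through unchanged, while very negative inputs flatten out near -alpha, which is the saturation described above.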

Usage

activation_elu(x, alpha = 1)

Arguments

x

Input tensor.

alpha

Numeric. See description for details.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Exponential activation function.

Description

Exponential activation function.

Usage

activation_exponential(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Gaussian error linear unit (GELU) activation function.

Description

The Gaussian error linear unit (GELU) is defined as:

gelu(x) = x * P(X <= x) where X ~ N(0, 1), i.e. gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))).

GELU weights inputs by their value, rather than gating inputs by their sign as in ReLU.
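
As a rough illustration, the definition maps directly onto base R, because pnorm(x) is P(X <= x) for a standard normal X. The sketch below uses a hypothetical helper, gelu_ref. Its approximate branch uses the tanh-based formula commonly given for GELU; that this is the exact formula Keras applies when approximate = TRUE is an assumption here.

```r
# Base-R reference for GELU on plain numeric vectors.
gelu_ref <- function(x, approximate = FALSE) {
  if (approximate) {
    # Tanh-based approximation (assumed form, see the lead-in).
    0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))
  } else {
    # Exact form: x * P(X <= x) with X ~ N(0, 1).
    x * pnorm(x)
  }
}

x <- seq(-4, 4, by = 0.1)
# Largest gap between the exact and approximate forms on this grid.
max(abs(gelu_ref(x) - gelu_ref(x, approximate = TRUE)))
```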

Usage

activation_gelu(x, approximate = FALSE)

Arguments

x

Input tensor.

approximate

A bool, whether to enable approximation.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Hard sigmoid activation function.

Description

The hard sigmoid activation is defined as:

  • 0 if x <= -3

  • 1 if x >= 3

  • (x/6) + 0.5 if -3 < x < 3

It's a faster, piecewise linear approximation of the sigmoid activation.
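
The three cases collapse into a single clamp, which can be sketched in base R. The helper name hard_sigmoid_ref is hypothetical, and the code works on numeric vectors rather than tensors.

```r
# Clamp the line x / 6 + 0.5 to the interval [0, 1].
hard_sigmoid_ref <- function(x) pmin(pmax(x / 6 + 0.5, 0), 1)

x <- c(-5, -3, 0, 3, 5)
hard_sigmoid_ref(x)
# Compare with the smooth sigmoid, available in base R as plogis().
round(plogis(x), 3)
```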

Usage

activation_hard_sigmoid(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Hard SiLU activation function, also known as Hard Swish.

Description

It is defined as:

  • 0 if x < -3

  • x if x > 3

  • x * (x + 3) / 6 if -3 <= x <= 3

It's a faster approximation of the silu activation, obtained by replacing the sigmoid in x * sigmoid(x) with the piecewise linear hard sigmoid.
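
A minimal base-R sketch of the three cases follows. hard_silu_ref is a hypothetical helper, written as x times a clamped (x + 3) / 6 so that one expression covers all branches.

```r
# x * clamp(x + 3, 0, 6) / 6 reproduces the three cases of the definition.
hard_silu_ref <- function(x) x * pmin(pmax(x + 3, 0), 6) / 6

x <- c(-4, -3, -1, 0, 1, 3, 4)
hard_silu_ref(x)
# Smooth counterpart for comparison: x * sigmoid(x).
round(x * plogis(x), 3)
```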

Usage

activation_hard_silu(x)

activation_hard_swish(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.


Leaky relu activation function.

Description

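Leaky ReLU returns x for non-negative inputs and negative_slope * x for negative inputs, so negative inputs keep a small, non-zero gradient.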
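
A base-R sketch of this behaviour, using a hypothetical helper leaky_relu_ref on plain numeric vectors (it is not the keras3 implementation):

```r
# Identity for positive inputs, a scaled-down copy for the rest.
leaky_relu_ref <- function(x, negative_slope = 0.2) {
  ifelse(x > 0, x, negative_slope * x)
}

x <- c(-10, -5, 0, 5, 10)
leaky_relu_ref(x)
leaky_relu_ref(x, negative_slope = 0.01)
```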

Usage

activation_leaky_relu(x, negative_slope = 0.2)

Arguments

x

Input tensor.

negative_slope

A float that controls the slope for values lower than the threshold.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Linear activation function (pass-through).

Description

A "linear" activation is an identity function: it returns the input, unmodified.

Usage

activation_linear(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Log-Softmax activation function.

Description

Each input vector is handled independently. The axis argument sets which axis of the input the function is applied along.
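
For intuition, log-softmax of a vector x is x - log(sum(exp(x))). The base-R sketch below uses a hypothetical helper, log_softmax_ref, and subtracts the maximum first so that large inputs do not overflow. Applying it row by row to a matrix stands in for applying it along the last axis.

```r
# Numerically stable log-softmax of a single numeric vector.
log_softmax_ref <- function(x) {
  shifted <- x - max(x)              # leaves the result unchanged
  shifted - log(sum(exp(shifted)))
}

m <- rbind(c(1, 2, 3),
           c(1000, 1000, 1000))
# Each row is handled independently.
out <- t(apply(m, 1, log_softmax_ref))
out
rowSums(exp(out))   # each row exponentiates to a probability distribution
```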

Usage

activation_log_softmax(x, axis = -1L)

Arguments

x

Input tensor.

axis

Integer, axis along which the log-softmax is applied.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Mish activation function.

Description

It is defined as:

mish(x) = x * tanh(softplus(x))

where softplus is defined as:

softplus(x) = log(exp(x) + 1)
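
The two formulas compose directly in base R. In this sketch the helper names softplus_ref and mish_ref are hypothetical, and softplus is written in an overflow-safe form that equals log(exp(x) + 1).

```r
# Overflow-safe softplus: max(x, 0) + log(1 + exp(-|x|)).
softplus_ref <- function(x) pmax(x, 0) + log1p(exp(-abs(x)))

# mish(x) = x * tanh(softplus(x))
mish_ref <- function(x) x * tanh(softplus_ref(x))

x <- c(-5, -1, 0, 1, 5)
mish_ref(x)
mish_ref(800)   # stays finite because softplus_ref does not overflow
```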

Usage

activation_mish(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Applies the rectified linear unit activation function.

Description

With default values, this returns the standard ReLU activation: max(x, 0), the element-wise maximum of 0 and the input tensor.

Modifying the default parameters lets you use a non-zero threshold, cap the maximum value of the activation, and apply a non-zero multiple of the input to values below the threshold.

Usage

activation_relu(x, negative_slope = 0, max_value = NULL, threshold = 0)

Arguments

x

Input tensor.

negative_slope

A numeric that controls the slope for values lower than the threshold.

max_value

A numeric that sets the saturation threshold (the largest value the function will return).

threshold

A numeric giving the threshold value of the activation function below which values will be damped or set to zero.

Value

A tensor with the same shape and dtype as input x.

Examples

x <- c(-10, -5, 0, 5, 10)
activation_relu(x)
## tf.Tensor([ 0.  0.  0.  5. 10.], shape=(5), dtype=float32)

activation_relu(x, negative_slope = 0.5)
## tf.Tensor([-5.  -2.5  0.   5.  10. ], shape=(5), dtype=float32)

activation_relu(x, max_value = 5)
## tf.Tensor([0. 0. 0. 5. 5.], shape=(5), dtype=float32)

activation_relu(x, threshold = 5)
## tf.Tensor([-0. -0.  0.  0. 10.], shape=(5), dtype=float32)
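
How the three parameters interact can be sketched in base R. relu_ref is a hypothetical helper written to reproduce the four outputs above on plain numeric vectors. It is an interpretation of the documented behaviour, not the keras3 source: inputs above threshold pass through and are capped at max_value, and the remaining inputs are mapped to negative_slope * (x - threshold).

```r
relu_ref <- function(x, negative_slope = 0, max_value = NULL, threshold = 0) {
  # Values strictly above the threshold pass through; the rest become 0.
  pos <- ifelse(x > threshold, x, 0)
  # Optional cap on the largest returned value.
  if (!is.null(max_value)) pos <- pmin(pos, max_value)
  # Values at or below the threshold get a scaled, shifted copy.
  neg <- negative_slope * pmin(x - threshold, 0)
  pos + neg
}

x <- c(-10, -5, 0, 5, 10)
relu_ref(x)
relu_ref(x, negative_slope = 0.5)
relu_ref(x, max_value = 5)
relu_ref(x, threshold = 5)
```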

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Relu6 activation function.

Description

It's the ReLU function, but truncated to a maximum value of 6.

Usage

activation_relu6(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Scaled Exponential Linear Unit (SELU).

Description

The Scaled Exponential Linear Unit (SELU) activation function is defined as:

  • scale * x if x > 0

  • scale * alpha * (exp(x) - 1) if x < 0

where alpha and scale are pre-defined constants (alpha = 1.67326324 and scale = 1.05070098).

Basically, the SELU activation function multiplies scale (> 1) with the output of the activation_elu function to ensure a slope larger than one for positive inputs.

The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers as long as the weights are initialized correctly (see initializer_lecun_normal()) and the number of input units is "large enough" (see reference paper for more information).
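
The definition can be sketched in base R with the two constants quoted above. selu_ref is a hypothetical helper operating on numeric vectors, not the keras3 implementation.

```r
# scale * ELU(x, alpha) with the fixed SELU constants.
selu_ref <- function(x, alpha = 1.67326324, scale = 1.05070098) {
  scale * ifelse(x > 0, x, alpha * expm1(x))
}

x <- c(-100, -1, 0, 1, 2)
selu_ref(x)
# Very negative inputs saturate at -scale * alpha.
-1.05070098 * 1.67326324
```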

Usage

activation_selu(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

Notes

  • To be used together with initializer_lecun_normal().

  • To be used together with the dropout variant layer_alpha_dropout() (legacy, deprecated).

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Sigmoid activation function.

Description

It is defined as: sigmoid(x) = 1 / (1 + exp(-x)).

For small values (<-5), sigmoid returns a value close to zero, and for large values (>5) the result of the function gets close to 1.

Sigmoid is equivalent to a 2-element softmax, where the second element is assumed to be zero. The sigmoid function always returns a value between 0 and 1.
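
The equivalence with a 2-element softmax can be verified numerically in base R. Both helpers below, sigmoid_ref and softmax_ref, are hypothetical names introduced for this sketch.

```r
sigmoid_ref <- function(x) 1 / (1 + exp(-x))

# Softmax of one numeric vector, shifted by its maximum for stability.
softmax_ref <- function(v) {
  e <- exp(v - max(v))
  e / sum(e)
}

x <- 1.5
sigmoid_ref(x)
softmax_ref(c(x, 0))[1]     # first element of the softmax over (x, 0)
sigmoid_ref(c(-5, 0, 5))    # close to 0, exactly 0.5, close to 1
```

The identity follows from exp(x) / (exp(x) + exp(0)) = 1 / (1 + exp(-x)).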

Usage

activation_sigmoid(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Swish (or Silu) activation function.

Description

It is defined as: swish(x) = x * sigmoid(x).

The Swish (or Silu) activation function is a smooth, non-monotonic function that is unbounded above and bounded below.
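
Both properties, non-monotonic and bounded below, show up in a short base-R sketch. silu_ref is a hypothetical helper, and the minimum is located numerically with optimize() rather than taken from a reference.

```r
# x * sigmoid(x), written as a single expression.
silu_ref <- function(x) x / (1 + exp(-x))

x <- c(-6, -3, -1, 0, 1, 3)
silu_ref(x)   # dips below zero, then rises again: not monotonic

# Search the negative half-line for the lowest point of the curve.
opt <- optimize(silu_ref, interval = c(-5, 0))
opt$minimum     # location of the minimum
opt$objective   # lower bound of the function
```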

Usage

activation_silu(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()


Softmax converts a vector of values to a probability distribution.

Description

The elements of the output vector are in range [0, 1] and sum to 1.

Each input vector is handled independently. The axis argument sets which axis of the input the function is applied along.

Softmax is often used as the activation for the last layer of a classification network because the result can be interpreted as a probability distribution.

The softmax of each vector x is computed as exp(x) / sum(exp(x)).

The input values are the log-odds of the resulting probability.
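
A minimal base-R sketch of the computation and of the log-odds reading follows. softmax_ref is a hypothetical helper for a single numeric vector. Subtracting max(x) before exponentiating is a standard stability trick that leaves the result unchanged.

```r
softmax_ref <- function(x) {
  e <- exp(x - max(x))
  e / sum(e)
}

x <- c(1, 2, 3)
p <- softmax_ref(x)
p
sum(p)                 # the outputs form a probability distribution
log(p[3] / p[1])       # log-odds between two classes equals x[3] - x[1]
softmax_ref(x + 100)   # adding a constant to every input changes nothing

# Row-wise application stands in for applying softmax along the last axis.
m <- rbind(c(1, 2, 3), c(0, 0, 0))
t(apply(m, 1, softmax_ref))
```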

Usage

activation_softmax(x, axis = -1L)

Arguments

x

Input tensor.

axis

Integer, axis along which the softmax is applied.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softplus()
activation_softsign()
activation_tanh()


Softplus activation function.

Description

It is defined as: softplus(x) = log(exp(x) + 1).
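
Evaluated literally, log(exp(x) + 1) overflows for large x in double precision. The base-R sketch below uses a hypothetical helper, softplus_ref, that computes the same quantity in an overflow-safe form. Whether Keras uses this particular rewrite internally is not stated in the source.

```r
# Identity used: log(1 + exp(x)) = max(x, 0) + log(1 + exp(-|x|)).
softplus_ref <- function(x) pmax(x, 0) + log1p(exp(-abs(x)))

x <- c(-1, 0, 1)
all.equal(softplus_ref(x), log(exp(x) + 1))   # agrees with the definition

log(exp(1000) + 1)    # the naive form overflows to Inf
softplus_ref(1000)    # the safe form stays finite
```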

Usage

activation_softplus(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softsign()
activation_tanh()


Softsign activation function.

Description

Softsign is defined as: softsign(x) = x / (abs(x) + 1).

Usage

activation_softsign(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_tanh()


Hyperbolic tangent activation function.

Description

It is defined as: tanh(x) = sinh(x) / cosh(x), i.e. tanh(x) = ((exp(x) - exp(-x)) / (exp(x) + exp(-x))).

Usage

activation_tanh(x)

Arguments

x

Input tensor.

Value

A tensor, the result from applying the activation to the input tensor x.

See Also

Other activations:
activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()


Create an active property class method

Description

Create an active property class method

Usage

active_property(fn)

Arguments

fn

An R function

Value

fn, with an additional R attribute that will cause fn to be converted to an active property when being converted to a method of a custom subclass.

Example

layer_foo <- Model("Foo", ...,
  metrics = active_property(function() {
    list(self$d_loss_metric,
         self$g_loss_metric)
  }))
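
As a loose analogy only, base R has its own notion of an active binding: a name that runs a function each time it is read. The sketch below uses makeActiveBinding() on a plain environment to show the access pattern (no parentheses, value recomputed on every read). It does not use keras3 or active_property(), and the field names are hypothetical.

```r
obj <- new.env()
obj$d_loss_metric <- 0.25
obj$g_loss_metric <- 0.75

# `metrics` is recomputed from the current fields on every access.
makeActiveBinding(
  "metrics",
  function() list(obj$d_loss_metric, obj$g_loss_metric),
  obj
)

obj$metrics[[1]]           # read like a field, not called like a function
obj$d_loss_metric <- 0.1
obj$metrics[[1]]           # reflects the updated field
```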

Fits the state of the preprocessing layer to the data being passed

Description

Fits the state of the preprocessing layer to the data being passed

Usage

adapt(object, data, ..., batch_size = NULL, steps = NULL)

Arguments

object

Preprocessing layer object

data

The data to train on. It can be passed either as a tf.data.Dataset or as an R array.

...

Used for forwards and backwards compatibility. Passed on to the underlying method.

batch_size

Integer or NULL. Number of samples per state update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of a TF Dataset or a generator (since they generate batches).

steps

Integer or NULL. Total number of steps (batches of samples). When training with input tensors such as TensorFlow data tensors, the default NULL is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If data is a tf.data.Dataset, and steps is NULL, the epoch will run until the input dataset is exhausted. When passing an infinitely repeating dataset, you must specify the steps argument. This argument is not supported with array inputs.

Details

After calling adapt on a layer, a preprocessing layer's state will not update during training. In order to make preprocessing layers efficient in any distribution context, they are kept constant with respect to any compiled tf.Graphs that call the layer. This does not affect the layer use when adapting each layer only once, but if you adapt a layer multiple times you will need to take care to re-compile any compiled functions as follows:

  • If you are adding a preprocessing layer to a keras model, you need to call compile(model) after each subsequent call to adapt().

  • If you are calling a preprocessing layer inside tfdatasets::dataset_map(), you should call dataset_map() again on the input Dataset after each adapt().

  • If you are using a tensorflow::tf_function() directly which calls a preprocessing layer, you need to call tf_function() again on your callable after each subsequent call to adapt().

keras_model() example with multiple adapts:

layer <- layer_normalization(axis = NULL)
adapt(layer, c(0, 2))
model <- keras_model_sequential() |> layer()
predict(model, c(0, 1, 2), verbose = FALSE) # [1] -1  0  1
## [1] -1  0  1

adapt(layer, c(-1, 1))
compile(model)  # This is needed to re-compile model.predict!
predict(model, c(0, 1, 2), verbose = FALSE) # [1] 0 1 2
## [1] 0 1 2

tfdatasets example with multiple adapts:

layer <- layer_normalization(axis = NULL)
adapt(layer, c(0, 2))
input_ds <- tfdatasets::range_dataset(0, 3)
normalized_ds <- input_ds |>
  tfdatasets::dataset_map(layer)
str(tfdatasets::iterate(normalized_ds))
## List of 3
##  $ :<tf.Tensor: shape=(1), dtype=float32, numpy=array([-1.], dtype=float32)>
##  $ :<tf.Tensor: shape=(1), dtype=float32, numpy=array([0.], dtype=float32)>
##  $ :<tf.Tensor: shape=(1), dtype=float32, numpy=array([1.], dtype=float32)>

adapt(layer, c(-1, 1))
normalized_ds <- input_ds |>
  tfdatasets::dataset_map(layer) # Re-map over the input dataset.

normalized_ds |>
  tfdatasets::as_array_iterator() |>
  tfdatasets::iterate(simplify = FALSE) |>
  str()
## List of 3
##  $ : num [1(1d)] 0
##  $ : num [1(1d)] 1
##  $ : num [1(1d)] 2
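
Why a re-compile is needed after a second adapt() can be illustrated with a toy model in base R. Everything here is hypothetical and independent of keras3: new_normalizer() stands in for a normalization layer that stores a mean and a population standard deviation, and compile_call() mimics a compiled function by copying that state once, when it is created.

```r
# Toy stand-in for a normalization layer with adaptable state.
new_normalizer <- function() {
  state <- new.env()
  adapt <- function(data) {
    state$mean <- mean(data)
    state$sd <- sqrt(mean((data - state$mean)^2))  # population sd
    invisible(NULL)
  }
  list(state = state, adapt = adapt)
}

# Mimics compilation: the state is read here and then treated as constant.
compile_call <- function(layer) {
  m <- layer$state$mean
  s <- layer$state$sd
  function(x) (x - m) / s
}

layer <- new_normalizer()
layer$adapt(c(0, 2))
f <- compile_call(layer)
f(c(0, 1, 2))              # normalized with mean 1 and sd 1

layer$adapt(c(-1, 1))
f(c(0, 1, 2))              # unchanged: f still holds the old constants
f <- compile_call(layer)   # "re-compile" to pick up the new state
f(c(0, 1, 2))              # now normalized with mean 0 and sd 1
```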

Value

Returns object, invisibly.


Instantiates the ConvNeXtBase architecture.

Description

Instantiates the ConvNeXtBase architecture.

Usage

application_convnext_base(
  include_top = TRUE,
  include_preprocessing = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "convnext_base"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet-1k), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.
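
The effect of the pooling modes on the output shape can be sketched with a plain R array. The feature map below is made up (batch of 2, a 7 x 7 spatial grid, 4 channels, channels last) and does not reflect the actual ConvNeXt output size. Global pooling collapses the two spatial axes and leaves one value per sample and channel.

```r
# Hypothetical 4D feature map: (batch, height, width, channels).
features <- array(seq_len(2 * 7 * 7 * 4), dim = c(2, 7, 7, 4))

# Keep the batch and channel axes; reduce over height and width.
avg_pooled <- apply(features, c(1, 4), mean)   # analogue of pooling = "avg"
max_pooled <- apply(features, c(1, 4), max)    # analogue of pooling = "max"

dim(features)     # 4D input
dim(avg_pooled)   # 2D output: one row per sample, one column per channel
dim(max_pooled)
```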

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

References

For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.

Note

Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.

When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.


Instantiates the ConvNeXtLarge architecture.

Description

Instantiates the ConvNeXtLarge architecture.

Usage

application_convnext_large(
  include_top = TRUE,
  include_preprocessing = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "convnext_large"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet-1k), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

References

For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.

Note

Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.

When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.


Instantiates the ConvNeXtSmall architecture.

Description

Instantiates the ConvNeXtSmall architecture.

Usage

application_convnext_small(
  include_top = TRUE,
  include_preprocessing = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "convnext_small"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet-1k), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

References

For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.

Note

Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.

When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.


Instantiates the ConvNeXtTiny architecture.

Description

Instantiates the ConvNeXtTiny architecture.

Usage

application_convnext_tiny(
  include_top = TRUE,
  include_preprocessing = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "convnext_tiny"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet-1k), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

References

For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.

Note

Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.

When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.


Instantiates the ConvNeXtXLarge architecture.

Description

Instantiates the ConvNeXtXLarge architecture.

Usage

application_convnext_xlarge(
  include_top = TRUE,
  include_preprocessing = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "convnext_xlarge"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet-1k), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.
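The effect of the pooling modes on the output shape can be illustrated without Keras. The following is a minimal base-R sketch, assuming a channels-last layout and a made-up feature-map size; global_pool() is a hypothetical helper and not part of keras3.

```r
# Hypothetical helper (not part of keras3): reduce a 4D feature map
# (batch, height, width, channels) to a 2D matrix (batch, channels).
global_pool <- function(x, mode = c("avg", "max")) {
  mode <- match.arg(mode)
  stopifnot(length(dim(x)) == 4L)
  reducer <- if (mode == "avg") mean else max
  # Keep the batch (1) and channel (4) axes; collapse height and width.
  apply(x, c(1, 4), reducer)
}

# Made-up feature map: batch of 2, 7 x 7 spatial grid, 5 channels.
set.seed(1)
features <- array(runif(2 * 7 * 7 * 5), dim = c(2, 7, 7, 5))

dim(features)                      # 4D, as with pooling = NULL
dim(global_pool(features, "avg"))  # 2D: batch x channels
dim(global_pool(features, "max"))  # 2D: batch x channels
```

In the model itself the reduction is performed by a pooling layer; the sketch only mirrors the change in tensor rank described above.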

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

References

For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.

Note

Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0, 255] range.

When calling summary() after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better inspect the instantiated model.


Instantiates the Densenet121 architecture.

Description

Instantiates the Densenet121 architecture.

Usage

application_densenet121(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "densenet121"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) with the 'channels_last' data format, or (3, 224, 224) with the 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Keras model instance.

Reference

Optionally loads weights pre-trained on ImageNet. Note that the data format convention used by the model is the one specified in your Keras config at ⁠~/.keras/keras.json⁠.

Note

Each Keras Application expects a specific kind of input preprocessing. For DenseNet, call application_preprocess_inputs() on your inputs before passing them to the model.
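The input_shape constraints listed under Arguments can be checked before a model is built. The following is a base-R sketch that encodes only the rules stated on this page; check_densenet_input_shape() is a hypothetical helper, not part of keras3, and the library's own validation may differ in detail.

```r
# Hypothetical helper (not part of keras3): check an input_shape against
# the constraints documented for the DenseNet applications.
check_densenet_input_shape <- function(input_shape,
                                       include_top = FALSE,
                                       data_format = c("channels_last",
                                                       "channels_first")) {
  data_format <- match.arg(data_format)
  if (length(input_shape) != 3L) return(FALSE)

  if (data_format == "channels_last") {
    spatial  <- input_shape[1:2]
    channels <- input_shape[3]
    default  <- c(224, 224, 3)
  } else {
    channels <- input_shape[1]
    spatial  <- input_shape[2:3]
    default  <- c(3, 224, 224)
  }

  # With the top layer included, only the default shape is documented.
  if (include_top) return(all(input_shape == default))

  # Otherwise: exactly 3 channels, width and height no smaller than 32.
  channels == 3 && all(spatial >= 32)
}

check_densenet_input_shape(c(200, 200, 3))  # documented as a valid value
check_densenet_input_shape(c(16, 200, 3))   # one side smaller than 32
check_densenet_input_shape(c(200, 200, 1))  # wrong number of channels
```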


Instantiates the Densenet169 architecture.

Description

Instantiates the Densenet169 architecture.

Usage

application_densenet169(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "densenet169"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) with the 'channels_last' data format, or (3, 224, 224) with the 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Keras model instance.

Reference

Optionally loads weights pre-trained on ImageNet. Note that the data format convention used by the model is the one specified in your Keras config at ⁠~/.keras/keras.json⁠.
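The two data format conventions differ only in where the channel axis sits. The following base-R sketch, using a made-up array in place of a real image, shows how the same data is laid out under each convention; it does not read or modify the Keras config file.

```r
# Made-up image stored as (height, width, channels), i.e. 'channels_last'.
img_last <- array(seq_len(224 * 224 * 3), dim = c(224, 224, 3))

# Move the channel axis to the front to obtain 'channels_first'.
img_first <- aperm(img_last, c(3, 1, 2))

dim(img_last)   # height, width, channels
dim(img_first)  # channels, height, width

# The same pixel is addressed with the channel index moved accordingly.
identical(img_last[10, 20, 2], img_first[2, 10, 20])
```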

Note

Each Keras Application expects a specific kind of input preprocessing. For DenseNet, call application_preprocess_inputs() on your inputs before passing them to the model.


Instantiates the Densenet201 architecture.

Description

Instantiates the Densenet201 architecture.

Usage

application_densenet201(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "densenet201"
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) with the 'channels_last' data format, or (3, 224, 224) with the 'channels_first' data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".
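To illustrate the difference between logits and "softmax" output, the following base-R sketch applies a softmax to a made-up logit vector. The softmax() function defined here is a local illustration, not a keras3 function, and the logit values are invented.

```r
# Numerically stable softmax over a vector of logits.
softmax <- function(logits) {
  shifted <- logits - max(logits)  # subtracting the max avoids overflow in exp()
  exp(shifted) / sum(exp(shifted))
}

# Made-up logits for a 4-class problem.
logits <- c(2.0, 1.0, 0.1, -1.5)
probs <- softmax(logits)

sum(probs)        # probabilities sum to 1
which.max(probs)  # same winning class as which.max(logits)
```

Logits are unbounded scores, while the softmax output is a probability distribution over classes; the ranking of classes is the same in both.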

name

The name of the model (string).

Value

A Keras model instance.

Reference

Optionally loads weights pre-trained on ImageNet. Note that the data format convention used by the model is the one specified in your Keras config at ⁠~/.keras/keras.json⁠.

Note

Each Keras Application expects a specific kind of input preprocessing. For DenseNet, call application_preprocess_inputs() on your inputs before passing them to the model.


Instantiates the EfficientNetB0 architecture.

Description

Instantiates the EfficientNetB0 architecture.

Usage

application_efficientnet_b0(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb0",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB1 architecture.

Description

Instantiates the EfficientNetB1 architecture.

Usage

application_efficientnet_b1(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb1",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB2 architecture.

Description

Instantiates the EfficientNetB2 architecture.

Usage

application_efficientnet_b2(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb2",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB3 architecture.

Description

Instantiates the EfficientNetB3 architecture.

Usage

application_efficientnet_b3(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb3",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB4 architecture.

Description

Instantiates the EfficientNetB4 architecture.

Usage

application_efficientnet_b4(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb4",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB5 architecture.

Description

Instantiates the EfficientNetB5 architecture.

Usage

application_efficientnet_b5(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb5",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB6 architecture.

Description

Instantiates the EfficientNetB6 architecture.

Usage

application_efficientnet_b6(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb6",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetB7 architecture.

Description

Instantiates the EfficientNetB7 architecture.

Usage

application_efficientnet_b7(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "efficientnetb7",
  ...
)

Arguments

include_top

Whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

...

For forward/backward compatibility.

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNet, input preprocessing is included as part of the model (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. EfficientNet models expect their inputs to be float tensors of pixels with values in the [0, 255] range.


Instantiates the EfficientNetV2B0 architecture.

Description

Instantiates the EfficientNetV2B0 architecture.

Usage

application_efficientnet_v2b0(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-b0"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, by default input preprocessing is included as a part of the model (as a Rescaling layer), and thus application_preprocess_inputs() is actually a pass-through function. In this use case, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.
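When the built-in preprocessing is disabled, the caller has to bring pixel values into the [-1, 1] range. The following base-R sketch assumes a plain linear map from [0, 255]; rescale_to_unit_range() is a hypothetical helper, not part of keras3, and the model's own preprocessing layer may apply different operations.

```r
# Hypothetical helper (not part of keras3): linearly map pixel values
# from [0, 255] to [-1, 1]. This is an assumed stand-in for preprocessing
# done outside the model, not the model's own preprocessing layer.
rescale_to_unit_range <- function(x) {
  x / 127.5 - 1
}

# Made-up pixel values covering both ends and the midpoint of the range.
pixels <- c(0, 127.5, 255)
rescale_to_unit_range(pixels)
```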

See Also


Instantiates the EfficientNetV2B1 architecture.

Description

Instantiates the EfficientNetV2B1 architecture.

Usage

application_efficientnet_v2b1(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-b1"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.
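To make the pooling options concrete, the following base R sketch mimics what global average and global max pooling compute on a 4D feature map. It does not call Keras; the array features and its dimensions (batch, height, width, channels) are made-up stand-ins for the output of a last convolutional layer, and the channels_last layout is an assumption.

```r
# Hypothetical 4D feature map: (batch = 2, height = 7, width = 7, channels = 5).
set.seed(1)
features <- array(runif(2 * 7 * 7 * 5), dim = c(2, 7, 7, 5))

# pooling = NULL: the 4D tensor would be returned unchanged.
dim(features)

# pooling = "avg": average over the spatial axes, keeping batch and channels.
pooled_avg <- apply(features, c(1, 4), mean)

# pooling = "max": take the maximum over the spatial axes instead.
pooled_max <- apply(features, c(1, 4), max)

# Both pooled results are 2D: (batch, channels).
dim(pooled_avg)
dim(pooled_max)
```

Either pooled output can be fed directly to a dense layer, which is why a pooling mode is typically chosen when include_top is FALSE.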

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, input preprocessing is included in the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this configuration, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. The built-in preprocessing (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.
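When include_preprocessing is set to FALSE, the caller has to supply inputs that are already in the [-1, 1] range. One simple way to get there from [0, 255] pixels is a linear map. The base R sketch below assumes the scaling x / 127.5 - 1; the helper name rescale_to_unit_range is hypothetical, and the constants may not match the model's built-in preprocessing exactly.

```r
# Hypothetical helper: linearly map pixel values from [0, 255] to [-1, 1].
# The constants are an assumption, not taken from the Keras source.
rescale_to_unit_range <- function(pixels) {
  pixels / 127.5 - 1
}

# A made-up 2 x 2 single-channel image with values in [0, 255].
img <- matrix(c(0, 64, 191, 255), nrow = 2)
scaled <- rescale_to_unit_range(img)

# The extremes 0 and 255 land on the ends of the target range.
range(scaled)
```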

See Also


Instantiates the EfficientNetV2B2 architecture.

Description

Instantiates the EfficientNetV2B2 architecture.

Usage

application_efficientnet_v2b2(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-b2"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".
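The difference between classifier_activation = "softmax" and classifier_activation = NULL is whether the top layer returns probabilities or raw logits. The base R sketch below illustrates the relationship on a made-up logit vector; the softmax helper is written here for illustration and is not part of the package.

```r
# Numerically stable softmax over a vector of logits (illustrative helper).
softmax <- function(logits) {
  shifted <- exp(logits - max(logits))
  shifted / sum(shifted)
}

# Made-up logits for a 4-class problem, as a top layer without activation
# might return them.
logits <- c(2.0, 1.0, 0.1, -1.0)
probs <- softmax(logits)

# Softmax turns logits into probabilities that sum to 1 ...
sum(probs)

# ... without changing which class scores highest.
which.max(probs) == which.max(logits)
```

Returning logits can be useful when the loss function applies the softmax itself.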

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, input preprocessing is included in the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this configuration, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. The built-in preprocessing (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

See Also


Instantiates the EfficientNetV2B3 architecture.

Description

Instantiates the EfficientNetV2B3 architecture.

Usage

application_efficientnet_v2b3(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-b3"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
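As a rough illustration of the transfer learning use case, the sketch below wraps a frozen EfficientNetV2B3 feature extractor and a new classification head in a function. The function name build_feature_classifier, the 224 x 224 input size, and the single dense head are assumptions made for this example; calling the function requires a working keras3 installation and downloads the ImageNet weights.

```r
# Hypothetical helper: frozen pretrained base plus a new softmax head.
# Defining the function does not require keras3; calling it does.
build_feature_classifier <- function(num_classes,
                                     input_shape = c(224, 224, 3)) {
  # Feature extractor: no top layer, global average pooling -> 2D output.
  base <- keras3::application_efficientnet_v2b3(
    include_top = FALSE,
    weights = "imagenet",
    input_shape = input_shape,
    pooling = "avg"
  )
  # Freeze the pretrained weights so only the new head is trained.
  base$trainable <- FALSE

  inputs <- keras3::keras_input(shape = input_shape)
  # Run the base in inference mode so batch-norm statistics stay fixed.
  features <- base(inputs, training = FALSE)
  outputs <- keras3::layer_dense(
    features,
    units = num_classes,
    activation = "softmax"
  )
  keras3::keras_model(inputs, outputs)
}
```

A call such as build_feature_classifier(num_classes = 10) would return a model that still has to be compiled and fitted on the new dataset.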

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, input preprocessing is included in the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this configuration, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. The built-in preprocessing (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

See Also


Instantiates the EfficientNetV2L architecture.

Description

Instantiates the EfficientNetV2L architecture.

Usage

application_efficientnet_v2l(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-l"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, input preprocessing is included in the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this configuration, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. The built-in preprocessing (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

See Also


Instantiates the EfficientNetV2M architecture.

Description

Instantiates the EfficientNetV2M architecture.

Usage

application_efficientnet_v2m(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-m"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, input preprocessing is included in the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this configuration, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. The built-in preprocessing (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

See Also


Instantiates the EfficientNetV2S architecture.

Description

Instantiates the EfficientNetV2S architecture.

Usage

application_efficientnet_v2s(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "efficientnetv2-s"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE. It should have exactly 3 input channels.

pooling

Optional pooling mode for feature extraction when include_top is FALSE. Defaults to NULL.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • "avg" means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000 (number of ImageNet classes).

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. Defaults to "softmax". When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer at the bottom of the network.

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For EfficientNetV2, input preprocessing is included in the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this configuration, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. The built-in preprocessing (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

See Also


Instantiates the Inception-ResNet v2 architecture.

Description

Instantiates the Inception-ResNet v2 architecture.

Usage

application_inception_resnet_v2(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "inception_resnet_v2"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top of the network.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (299, 299, 3) with the "channels_last" data format, or (3, 299, 299) with the "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • 'avg' means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • 'max' means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For InceptionResNetV2, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.

See Also


Instantiates the Inception v3 architecture.

Description

Instantiates the Inception v3 architecture.

Usage

application_inception_v3(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "inception_v3"
)

Arguments

include_top

Boolean, whether to include the fully-connected layer at the top, as the last layer of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model. input_tensor is useful for sharing inputs between multiple different networks. Defaults to NULL.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (299, 299, 3) with the "channels_last" data format, or (3, 299, 299) with the "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 75. E.g. (150, 150, 3) would be one valid value. input_shape will be ignored if the input_tensor is provided.
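The constraints on input_shape can be expressed as a small check. The base R sketch below is a hypothetical validator (check_input_shape is not a package function) that encodes the rules stated above for Inception v3: exactly 3 channels, and width and height of at least 75.

```r
# Hypothetical validator for an Inception v3 input_shape.
# Returns TRUE when the shape has 3 channels and both spatial
# dimensions are at least min_size.
check_input_shape <- function(shape, min_size = 75,
                              data_format = "channels_last") {
  stopifnot(length(shape) == 3)
  if (data_format == "channels_last") {
    spatial <- shape[1:2]
    channels <- shape[3]
  } else {
    channels <- shape[1]
    spatial <- shape[2:3]
  }
  channels == 3 && all(spatial >= min_size)
}

check_input_shape(c(150, 150, 3))   # valid: 3 channels, sides >= 75
check_input_shape(c(64, 64, 3))     # invalid: sides smaller than 75
check_input_shape(c(3, 299, 299), data_format = "channels_first")
```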

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL (default) means that the output of the model will be the 4D tensor output of the last convolutional block.

  • "avg" means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For InceptionV3, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
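A typical prediction workflow with explicit preprocessing might look like the sketch below. The function name predict_top5 and the argument img_array (assumed to be a numeric array of shape (batch, 299, 299, 3) with pixel values in [0, 255]) are hypothetical; calling the function requires a working keras3 installation and downloads the ImageNet weights.

```r
# Hypothetical helper: classify a batch of images with Inception v3.
# Defining the function does not require keras3; calling it does.
predict_top5 <- function(img_array) {
  model <- keras3::application_inception_v3(weights = "imagenet")

  # Scale raw pixels to the range the model expects (between -1 and 1).
  x <- keras3::application_preprocess_inputs(model, img_array)

  # Class probabilities, one row per image.
  preds <- predict(model, x)

  # Map the highest-scoring class indices to ImageNet labels.
  keras3::application_decode_predictions(model, preds, top = 5)
}
```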

See Also


Instantiates the MobileNet architecture.

Description

Instantiates the MobileNet architecture.

Usage

application_mobilenet(
  input_shape = NULL,
  alpha = 1,
  depth_multiplier = 1L,
  dropout = 0.001,
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = NULL
)

Arguments

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) with the "channels_last" data format, or (3, 224, 224) with the "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. Defaults to NULL. input_shape will be ignored if the input_tensor is provided.

alpha

Controls the width of the network. This is known as the width multiplier in the MobileNet paper.

  • If alpha < 1.0, proportionally decreases the number of filters in each layer.

  • If alpha > 1.0, proportionally increases the number of filters in each layer.

  • If alpha == 1, default number of filters from the paper are used at each layer. Defaults to 1.0.
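The effect of alpha can be illustrated with a few lines of base R. The sketch below assumes the scaled filter count is simply the product truncated to an integer; the helper scale_filters and the filter counts are made up for illustration, and actual implementations may round differently (for example, to a multiple of 8).

```r
# Hypothetical helper: scale per-layer filter counts by the width multiplier.
# Truncation to integer is an assumption about the rounding rule.
scale_filters <- function(filters, alpha) {
  as.integer(filters * alpha)
}

# Made-up filter counts for four consecutive layers.
base_filters <- c(32, 64, 128, 256)

scale_filters(base_filters, alpha = 0.5)   # narrower network
scale_filters(base_filters, alpha = 1)     # filter counts unchanged
scale_filters(base_filters, alpha = 1.25)  # wider network
```

Because the parameter count of a convolution grows with the product of its input and output filters, lowering alpha shrinks the model faster than linearly.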

depth_multiplier

Depth multiplier for depthwise convolution. This is called the resolution multiplier in the MobileNet paper. Defaults to 1.0.

dropout

Dropout rate. Defaults to 0.001.

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model. input_tensor is useful for sharing inputs between multiple different networks. Defaults to NULL.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL (default) means that the output of the model will be the 4D tensor output of the last convolutional block.

  • "avg" means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For MobileNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.

See Also


Instantiates the MobileNetV2 architecture.

Description

MobileNetV2 is very similar to the original MobileNet, except that it uses inverted residual blocks with bottlenecking features. It has a drastically lower parameter count than the original MobileNet. MobileNets support any input size greater than 32 x 32, with larger image sizes offering better performance.

Usage

application_mobilenet_v2(
  input_shape = NULL,
  alpha = 1,
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = NULL
)

Arguments

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) with the "channels_last" data format, or (3, 224, 224) with the "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value. Defaults to NULL. input_shape will be ignored if the input_tensor is provided.

alpha

Controls the width of the network. This is known as the width multiplier in the MobileNet paper.

  • If alpha < 1.0, proportionally decreases the number of filters in each layer.

  • If alpha > 1.0, proportionally increases the number of filters in each layer.

  • If alpha == 1, default number of filters from the paper are used at each layer. Defaults to 1.0.

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

One of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded. Defaults to "imagenet".

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model. input_tensor is useful for sharing inputs between multiple different networks. Defaults to NULL.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL (default) means that the output of the model will be the 4D tensor output of the last convolutional block.

  • "avg" means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • "max" means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified. Defaults to 1000.

classifier_activation

A string or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

Reference

This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For MobileNetV2, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.

See Also


Instantiates the MobileNetV3Large architecture.

Description

Instantiates the MobileNetV3Large architecture.

Usage

application_mobilenet_v3_large(
  input_shape = NULL,
  alpha = 1,
  minimalistic = FALSE,
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  classes = 1000L,
  pooling = NULL,
  dropout_rate = 0.2,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "MobileNetV3Large"
)

Arguments

input_shape

Optional shape tuple, to be specified if you would like to use a model with an input image resolution that is not (224, 224, 3). It should have exactly 3 input channels. You can omit this option if you would like to infer input_shape from an input_tensor. If you provide both input_tensor and input_shape, input_shape is used if the shapes match; if they do not match, an error is thrown. E.g. (160, 160, 3) would be one valid value.

alpha

Controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for consistency with MobileNetV1 in Keras.

  • If alpha < 1.0, proportionally decreases the number of filters in each layer.

  • If alpha > 1.0, proportionally increases the number of filters in each layer.

  • If alpha == 1, default number of filters from the paper are used at each layer.

minimalistic

In addition to the large and small models, this module also contains so-called minimalistic models. These models have the same per-layer dimensions as MobileNetV3, but they do not use any of the advanced blocks (squeeze-and-excite units, hard-swish, and 5x5 convolutions). While these models are less efficient on CPU, they are much more performant on GPU/DSP.

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

String, one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

classes

Integer, optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

pooling

String, optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.
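The effect of the pooling modes above can be seen by comparing output shapes. The following is a minimal sketch, assuming keras3 and a working backend are installed; weights = NULL is used only to avoid downloading the ImageNet weights, and the variable names are illustrative.

```r
library(keras3)

# Feature extractor without the classification head, randomly initialized
# (weights = NULL avoids a download; use "imagenet" for pretrained features).
features_4d <- application_mobilenet_v3_large(
  include_top = FALSE,
  weights = NULL,
  input_shape = c(224, 224, 3),
  pooling = NULL
)

# Same architecture, but with global average pooling applied at the end.
features_2d <- application_mobilenet_v3_large(
  include_top = FALSE,
  weights = NULL,
  input_shape = c(224, 224, 3),
  pooling = "avg"
)

# The first output keeps its spatial dimensions; the second is reduced
# to one feature vector per image.
print(features_4d$output$shape)
print(features_2d$output$shape)
```

The pooled variant is usually the more convenient starting point when a small dense classifier is stacked on top of the extracted features.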

dropout_rate

Fraction of the input units to drop on the last layer.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer (Rescaling) at the bottom of the network. Defaults to TRUE.

name

The name of the model (string).

Value

A model instance.

Reference

The following table describes the performance of MobileNets v3:

MACs stands for Multiply Adds

Classification Checkpoint                 MACs(M)  Parameters(M)  Top1 Accuracy  Pixel1 CPU(ms)
mobilenet_v3_large_1.0_224                217      5.4            75.6           51.2
mobilenet_v3_large_0.75_224               155      4.0            73.3           39.8
mobilenet_v3_large_minimalistic_1.0_224   209      3.9            72.3           44.1
mobilenet_v3_small_1.0_224                66       2.9            68.1           15.8
mobilenet_v3_small_0.75_224               44       2.4            65.4           12.8
mobilenet_v3_small_minimalistic_1.0_224   65       2.0            61.9           12.2

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For MobileNetV3, input preprocessing is included as part of the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this case, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. Preprocessing as part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

Call Arguments

  • inputs: A floating point numpy.array or backend-native tensor, 4D with 3 color channels, with values in the range ⁠[0, 255]⁠ if include_preprocessing is TRUE and in the range ⁠[-1, 1]⁠ otherwise.
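The two input ranges described above are related by a simple affine rescaling. The following is a minimal base R sketch, assuming the built-in Rescaling layer maps [0, 255] linearly onto [-1, 1]; the helper name rescale_to_unit_range is hypothetical and not part of keras3.

```r
# Hypothetical helper: linearly map pixel values from [0, 255] to [-1, 1].
# Assumption: this mirrors what the model's Rescaling layer does when
# include_preprocessing = TRUE.
rescale_to_unit_range <- function(x) {
  x / 127.5 - 1
}

# A stand-in batch of one 2x2 RGB image with values in [0, 255].
set.seed(1)
pixels <- array(runif(1 * 2 * 2 * 3, min = 0, max = 255), dim = c(1, 2, 2, 3))

scaled <- rescale_to_unit_range(pixels)
range(scaled)  # both ends lie within [-1, 1]
```

When a model is built with include_preprocessing = FALSE, inputs prepared this way could be passed to it directly.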

See Also


Instantiates the MobileNetV3Small architecture.

Description

Instantiates the MobileNetV3Small architecture.

Usage

application_mobilenet_v3_small(
  input_shape = NULL,
  alpha = 1,
  minimalistic = FALSE,
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  classes = 1000L,
  pooling = NULL,
  dropout_rate = 0.2,
  classifier_activation = "softmax",
  include_preprocessing = TRUE,
  name = "MobileNetV3Small"
)

Arguments

input_shape

Optional shape tuple, to be specified if you would like to use a model with an input image resolution that is not (224, 224, 3). It should have exactly 3 input channels. You can omit this option if you would like to infer input_shape from an input_tensor. If you provide both input_tensor and input_shape, input_shape is used if the shapes match; if they do not match, an error is thrown. E.g. (160, 160, 3) would be one valid value.

alpha

Controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for consistency with MobileNetV1 in Keras.

  • If alpha < 1.0, proportionally decreases the number of filters in each layer.

  • If alpha > 1.0, proportionally increases the number of filters in each layer.

  • If alpha == 1, the default number of filters from the paper is used at each layer.
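One way to observe the effect of alpha is to compare parameter counts. This is a minimal sketch, assuming keras3 and a working backend are installed; weights = NULL keeps the models randomly initialized so nothing is downloaded, and the variable names are illustrative.

```r
library(keras3)

# Two randomly initialized models that differ only in width.
narrow   <- application_mobilenet_v3_small(alpha = 0.75, weights = NULL)
baseline <- application_mobilenet_v3_small(alpha = 1.0, weights = NULL)

# Fewer filters per layer should translate into fewer trainable scalars.
c(narrow = count_params(narrow), baseline = count_params(baseline))
```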

minimalistic

In addition to the large and small models, this module also contains so-called minimalistic models. These models have the same per-layer dimensions as MobileNetV3, but they don't use any of the advanced blocks (squeeze-and-excite units, hard-swish, and 5x5 convolutions). While these models are less efficient on CPU, they are much more performant on GPU/DSP.

include_top

Boolean, whether to include the fully-connected layer at the top of the network. Defaults to TRUE.

weights

String, one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.
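As an illustration of passing input_tensor, the sketch below builds the model on top of an existing symbolic input and lets input_shape be inferred from it. It assumes keras3 and a working backend are installed; the 160x160 resolution and weights = NULL are arbitrary choices for the example.

```r
library(keras3)

# Symbolic input for 160x160 RGB images.
inputs <- keras_input(shape = c(160, 160, 3))

# input_shape is omitted and inferred from `inputs`.
# weights = NULL skips the ImageNet download.
model <- application_mobilenet_v3_small(
  input_tensor = inputs,
  weights = NULL
)

print(model$input$shape)
```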

classes

Integer, optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

pooling

String, optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

dropout_rate

Fraction of the input units to drop on the last layer.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

include_preprocessing

Boolean, whether to include the preprocessing layer (Rescaling) at the bottom of the network. Defaults to TRUE.

name

The name of the model (string).

Value

A model instance.

Reference

The following table describes the performance of MobileNets v3:

MACs stands for Multiply Adds

Classification Checkpoint                 MACs(M)  Parameters(M)  Top1 Accuracy  Pixel1 CPU(ms)
mobilenet_v3_large_1.0_224                217      5.4            75.6           51.2
mobilenet_v3_large_0.75_224               155      4.0            73.3           39.8
mobilenet_v3_large_minimalistic_1.0_224   209      3.9            72.3           44.1
mobilenet_v3_small_1.0_224                66       2.9            68.1           15.8
mobilenet_v3_small_0.75_224               44       2.4            65.4           12.8
mobilenet_v3_small_minimalistic_1.0_224   65       2.0            61.9           12.2

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For MobileNetV3, input preprocessing is included as part of the model by default (as a Rescaling layer), so application_preprocess_inputs() is a pass-through function. In this case, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [0, 255] range. Preprocessing as part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.

Call Arguments

  • inputs: A floating point numpy.array or backend-native tensor, 4D with 3 color channels, with values in the range ⁠[0, 255]⁠ if include_preprocessing is TRUE and in the range ⁠[-1, 1]⁠ otherwise.

See Also


Instantiates a NASNet model in ImageNet mode.

Description

Instantiates a NASNet model in ImageNet mode.

Usage

application_nasnet_large(
  input_shape = NULL,
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "nasnet_large"
)

Arguments

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (331, 331, 3) for NASNetLarge). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (224, 224, 3) would be one valid value.

include_top

Whether to include the fully-connected layer at the top of the network.

weights

NULL (random initialization) or "imagenet" (ImageNet weights). For loading "imagenet" weights, input_shape should be (331, 331, 3).

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Keras model instance.

Reference

Optionally loads weights pre-trained on ImageNet. Note that the data format convention used by the model is the one specified in your Keras config at ⁠~/.keras/keras.json⁠.

Note

Each Keras Application expects a specific kind of input preprocessing. For NASNet, call application_preprocess_inputs() on your inputs before passing them to the model.

See Also


Instantiates a Mobile NASNet model in ImageNet mode.

Description

Instantiates a Mobile NASNet model in ImageNet mode.

Usage

application_nasnet_mobile(
  input_shape = NULL,
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "nasnet_mobile"
)

Arguments

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (224, 224, 3) for NASNetMobile). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (224, 224, 3) would be one valid value.

include_top

Whether to include the fully-connected layer at the top of the network.

weights

NULL (random initialization) or "imagenet" (ImageNet weights). For loading "imagenet" weights, input_shape should be (224, 224, 3).

input_tensor

Optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional layer.

  • avg means that global average pooling will be applied to the output of the last convolutional layer, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

Optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Keras model instance.

Reference

Optionally loads weights pre-trained on ImageNet. Note that the data format convention used by the model is the one specified in your Keras config at ⁠~/.keras/keras.json⁠.

Note

Each Keras Application expects a specific kind of input preprocessing. For NASNet, call application_preprocess_inputs() on your inputs before passing them to the model.

See Also


Instantiates the ResNet101 architecture.

Description

Instantiates the ResNet101 architecture.

Usage

application_resnet101(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "resnet101"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.
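The preprocessing steps described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name caffe_style_preprocess is hypothetical, and the per-channel ImageNet means are the commonly used values, which are an assumption since this page does not state them.

```r
# Assumed per-channel ImageNet means, in BGR order.
imagenet_mean_bgr <- c(103.939, 116.779, 123.68)

# Hypothetical helper for a channels-last batch of shape
# (batch, height, width, 3) with channels in RGB order.
caffe_style_preprocess <- function(x) {
  stopifnot(length(dim(x)) == 4, dim(x)[4] == 3)
  # 1. Reverse the channel axis: RGB -> BGR.
  x <- x[, , , 3:1, drop = FALSE]
  # 2. Zero-center each channel; no scaling is applied.
  for (ch in 1:3) {
    x[, , , ch] <- x[, , , ch] - imagenet_mean_bgr[ch]
  }
  x
}

# One 2x2 image whose pixels are all pure red (R = 255, G = 0, B = 0).
img <- array(0, dim = c(1, 2, 2, 3))
img[, , , 1] <- 255

out <- caffe_style_preprocess(img)
out[1, 1, 1, ]  # blue, green, red values of one pixel after centering
```

In practice application_preprocess_inputs() should be preferred, since it applies whatever preprocessing the given model was trained with.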

See Also


Instantiates the ResNet101V2 architecture.

Description

Instantiates the ResNet101V2 architecture.

Usage

application_resnet101_v2(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "resnet101v2"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.

See Also


Instantiates the ResNet152 architecture.

Description

Instantiates the ResNet152 architecture.

Usage

application_resnet152(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "resnet152"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.

See Also


Instantiates the ResNet152V2 architecture.

Description

Instantiates the ResNet152V2 architecture.

Usage

application_resnet152_v2(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "resnet152v2"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.

See Also


Instantiates the ResNet50 architecture.

Description

Instantiates the ResNet50 architecture.

Usage

application_resnet50(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "resnet50"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.
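As an illustration of the classes argument, the sketch below builds a randomly initialized ResNet50 with a 10-way classification head. It assumes keras3 and a working backend are installed; the choice of 10 classes is arbitrary.

```r
library(keras3)

# `classes` can differ from 1000 here because include_top is TRUE (the
# default) and no pretrained weights are loaded (weights = NULL).
model <- application_resnet50(
  weights = NULL,
  classes = 10L
)

# The last dimension of the output shape matches `classes`.
print(model$output$shape)
```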

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.

See Also


Instantiates the ResNet50V2 architecture.

Description

Instantiates the ResNet50V2 architecture.

Usage

application_resnet50_v2(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "resnet50v2"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".
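To illustrate classifier_activation = NULL, the sketch below returns logits from the model and converts them to probabilities by hand. It assumes keras3 and a working backend are installed; the input batch is random stand-in data, weights = NULL keeps the model untrained, and the softmax helper is defined locally for the example.

```r
library(keras3)

# Randomly initialized model whose "top" layer returns raw logits.
model <- application_resnet50_v2(
  weights = NULL,
  classifier_activation = NULL
)

# Random stand-in for a preprocessed batch of two images in [-1, 1].
set.seed(1)
x <- array(runif(2 * 224 * 224 * 3, min = -1, max = 1),
           dim = c(2, 224, 224, 3))

logits <- predict(model, x, verbose = 0)

# Numerically stable softmax applied to each row of logits.
softmax <- function(z) {
  e <- exp(z - max(z))
  e / sum(e)
}
probs <- t(apply(logits, 1, softmax))

rowSums(probs)  # each row sums to 1 up to floating point error
```

Keeping the logits is mainly useful when the loss function or a later calibration step expects unnormalized scores.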

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

Note

Each Keras Application expects a specific kind of input preprocessing. For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.

See Also


Instantiates the VGG16 model.

Description

Instantiates the VGG16 model.

Usage

application_vgg16(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "vgg16"
)

Arguments

include_top

whether to include the 3 fully-connected layers at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A Model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The default input size for this model is 224x224.

Note

Each Keras Application expects a specific kind of input preprocessing. For VGG16, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.

See Also


Instantiates the VGG19 model.

Description

Instantiates the VGG19 model.

Usage

application_vgg19(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "vgg19"
)

Arguments

include_top

whether to include the 3 fully-connected layers at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE; otherwise the input shape has to be (224, 224, 3) (with "channels_last" data format) or (3, 224, 224) (with "channels_first" data format). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The default input size for this model is 224x224.

Note

Each Keras Application expects a specific kind of input preprocessing. For VGG19, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.

See Also


Instantiates the Xception architecture.

Description

Instantiates the Xception architecture.

Usage

application_xception(
  include_top = TRUE,
  weights = "imagenet",
  input_tensor = NULL,
  input_shape = NULL,
  pooling = NULL,
  classes = 1000L,
  classifier_activation = "softmax",
  name = "xception"
)

Arguments

include_top

whether to include the fully-connected layer at the top of the network.

weights

one of NULL (random initialization), "imagenet" (pre-training on ImageNet), or the path to the weights file to be loaded.

input_tensor

optional Keras tensor (i.e. output of keras_input()) to use as image input for the model.

input_shape

Optional shape tuple, only to be specified if include_top is FALSE (otherwise the input shape has to be (299, 299, 3)). It should have exactly 3 input channels, and width and height should be no smaller than 71. E.g. (150, 150, 3) would be one valid value.

pooling

Optional pooling mode for feature extraction when include_top is FALSE.

  • NULL means that the output of the model will be the 4D tensor output of the last convolutional block.

  • avg means that global average pooling will be applied to the output of the last convolutional block, and thus the output of the model will be a 2D tensor.

  • max means that global max pooling will be applied.

classes

optional number of classes to classify images into, only to be specified if include_top is TRUE, and if no weights argument is specified.

classifier_activation

A str or callable. The activation function to use on the "top" layer. Ignored unless include_top=TRUE. Set classifier_activation=NULL to return the logits of the "top" layer. When loading pretrained weights, classifier_activation can only be NULL or "softmax".

name

The name of the model (string).

Value

A model instance.

Reference

For image classification use cases, see this page for detailed examples.

For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.

The default input image size for this model is 299x299.

Note

Each Keras Application expects a specific kind of input preprocessing. For Xception, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
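
To make the phrase "scale input pixels between -1 and 1" concrete, here is a minimal base-R sketch. It assumes 8-bit pixel values in the range 0 to 255 and a linear map x / 127.5 - 1; the helper name scale_to_unit_range() is hypothetical and not part of keras3. In real code, call application_preprocess_inputs() rather than reimplementing the scaling.

```r
# Hypothetical helper (not part of keras3): linearly maps pixel values
# from [0, 255] to [-1, 1], assuming the rule x / 127.5 - 1.
scale_to_unit_range <- function(pixels) {
  pixels / 127.5 - 1
}

pixels <- c(0, 63.75, 127.5, 255)
scaled <- scale_to_unit_range(pixels)
print(scaled)
## [1] -1.0 -0.5  0.0  1.0
```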


Generates a tf.data.Dataset from audio files in a directory.

Description

If your directory structure is:

main_directory/
...class_a/
......a_audio_1.wav
......a_audio_2.wav
...class_b/
......b_audio_1.wav
......b_audio_2.wav

Then calling audio_dataset_from_directory(main_directory, labels = 'inferred') will return a tf.data.Dataset that yields batches of audio files from the subdirectories class_a and class_b, together with labels 0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).

Only .wav files are supported at this time.
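
The way labels follow from the directory layout can be illustrated in base R without any audio decoding. The sketch below builds an empty copy of the tree shown above in a temporary directory and derives zero-based labels from the sorted subdirectory names. It only illustrates the labelling rule described here; it is not the implementation used by audio_dataset_from_directory(), and the variable names are chosen for this example.

```r
# Build a throwaway directory tree that mirrors the layout above
# (the .wav files are empty placeholders).
main_directory <- tempfile("audio_")
for (class_dir in c("class_a", "class_b")) {
  dir.create(file.path(main_directory, class_dir), recursive = TRUE)
}
invisible(file.create(
  file.path(main_directory, "class_a", c("a_audio_1.wav", "a_audio_2.wav")),
  file.path(main_directory, "class_b", c("b_audio_1.wav", "b_audio_2.wav"))
))

# Class names are the subdirectory names in sorted order.
class_names <- sort(list.dirs(main_directory, full.names = FALSE,
                              recursive = FALSE))

# Relative paths of the .wav files, then one zero-based label per file.
files <- sort(list.files(main_directory, pattern = "\\.wav$",
                         recursive = TRUE))
labels <- match(dirname(files), class_names) - 1L

print(data.frame(file = files, label = labels))
```

Files under class_a receive label 0 and files under class_b receive label 1, matching the description above.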

Usage

audio_dataset_from_directory(
  directory,
  labels = "inferred",
  label_mode = "int",
  class_names = NULL,
  batch_size = 32L,
  sampling_rate = NULL,
  output_sequence_length = NULL,
  ragged = FALSE,
  shuffle = TRUE,
  seed = NULL,
  validation_split = NULL,
  subset = NULL,
  follow_links = FALSE,
  verbose = TRUE
)

Arguments

directory

Directory where the data is located. If labels is "inferred", it should contain subdirectories, each containing audio files for a class. Otherwise, the directory structure is ignored.

labels

Either "inferred" (labels are generated from the directory structure), NULL (no labels), or a list/tuple of integer labels of the same size as the number of audio files found in the directory. Labels should be sorted according to the alphanumeric order of the audio file paths (obtained via os.walk(directory) in Python).

label_mode

String describing the encoding of labels. Options are:

  • "int" means that the labels are encoded as integers (e.g. for sparse_categorical_crossentropy loss).

  • "categorical" means that the labels are encoded as a categorical vector (e.g. for categorical_crossentropy loss).

  • "binary" means that the labels (there can be only 2) are encoded as float32 scalars with values 0 or 1 (e.g. for binary_crossentropy).

  • NULL (no labels).

class_names

Only valid if "labels" is "inferred". This is the explicit list of class names (must match names of subdirectories). Used to control the order of the classes (otherwise alphanumerical order is used).

batch_size

Size of the batches of data. Default: 32. If NULL, the data will not be batched (the dataset will yield individual samples).

sampling_rate

Audio sampling rate (in samples per second).

output_sequence_length

Maximum length of an audio sequence. Audio files longer than this will be truncated to output_sequence_length. If set to NULL, then all sequences in the same batch will be padded to the length of the longest sequence in the batch.

ragged

Whether to return a Ragged dataset (where each sequence has its own length). Defaults to FALSE.

shuffle

Whether to shuffle the data. Defaults to TRUE. If set to FALSE, sorts the data in alphanumeric order.

seed

Optional random seed for shuffling and transformations.

validation_split

Optional float between 0 and 1, fraction of data to reserve for validation.

subset

Subset of the data to return. One of "training", "validation" or "both". Only used if validation_split is set.

follow_links

Whether to visit subdirectories pointed to by symlinks. Defaults to FALSE.

verbose

Whether to display information on the number of classes and the number of files found. Defaults to TRUE.

Value

A tf.data.Dataset object.

  • If label_mode is NULL, it yields string tensors of shape ⁠(batch_size,)⁠, containing the contents of a batch of audio files.

  • Otherwise, it yields a tuple ⁠(audio, labels)⁠, where audio has shape ⁠(batch_size, sequence_length, num_channels)⁠ and labels follows the format described below.

Rules regarding labels format:

  • if label_mode is int, the labels are an int32 tensor of shape ⁠(batch_size,)⁠.

  • if label_mode is binary, the labels are a float32 tensor of 1s and 0s of shape ⁠(batch_size, 1)⁠.

  • if label_mode is categorical, the labels are a float32 tensor of shape ⁠(batch_size, num_classes)⁠, representing a one-hot encoding of the class index.
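
The three label formats can be mimicked with plain R objects to show the shapes involved. This is only a sketch with a hypothetical batch of four samples and two classes; the dataset itself yields tensors, not R vectors or matrices.

```r
class_index <- c(0L, 1L, 1L, 0L)  # zero-based class indices, batch of 4
num_classes <- 2L

# "int": one integer per sample, shape (batch_size,)
labels_int <- class_index

# "binary": a single column of 0s and 1s, shape (batch_size, 1)
labels_binary <- matrix(as.numeric(class_index), ncol = 1)

# "categorical": one-hot rows, shape (batch_size, num_classes)
labels_categorical <- diag(num_classes)[class_index + 1L, , drop = FALSE]

dim(labels_binary)
## [1] 4 1
dim(labels_categorical)
## [1] 4 2
```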

See Also

Other dataset utils:
image_dataset_from_directory()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()

Other utils:
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Define a custom Callback class

Description

Callbacks can be passed to keras methods such as fit(), evaluate(), and predict() in order to hook into the various stages of the model training, evaluation, and inference lifecycle.

To create a custom callback, call Callback() and override the method associated with the stage of interest.

Usage

Callback(
  classname,
  on_epoch_begin = NULL,
  on_epoch_end = NULL,
  on_train_begin = NULL,
  on_train_end = NULL,
  on_train_batch_begin = NULL,
  on_train_batch_end = NULL,
  on_test_begin = NULL,
  on_test_end = NULL,
  on_test_batch_begin = NULL,
  on_test_batch_end = NULL,
  on_predict_begin = NULL,
  on_predict_end = NULL,
  on_predict_batch_begin = NULL,
  on_predict_batch_end = NULL,
  ...,
  public = list(),
  private = list(),
  inherit = NULL,
  parent_env = parent.frame()
)

Arguments

classname

String, the name of the custom class. (Conventionally, CamelCase).

on_epoch_begin
\(epoch, logs = NULL)

Called at the start of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Args:

  • epoch: Integer, index of epoch.

  • logs: Named List. Currently no data is passed to this argument for this method but that may change in the future.

on_epoch_end
\(epoch, logs = NULL)

Called at the end of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Args:

  • epoch: Integer, index of epoch.

  • logs: Named List, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For training epoch, the values of the Model's metrics are returned. Example: list(loss = 0.2, accuracy = 0.7).

on_train_begin
\(logs = NULL)

Called at the beginning of training.

Subclasses should override for any actions to run.

Args:

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_train_end
\(logs = NULL)

Called at the end of training.

Subclasses should override for any actions to run.

Args:

  • logs: Named list. Currently the output of the last call to on_epoch_end() is passed to this argument for this method but that may change in the future.

on_train_batch_begin
\(batch, logs = NULL)

Called at the beginning of a training batch in fit() methods.

Subclasses should override for any actions to run.

Note that if the steps_per_execution argument to compile in Model is set to N, this method will only be called every N batches.

Args:

  • batch: Integer, index of batch within the current epoch.

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_train_batch_end
\(batch, logs = NULL)

Called at the end of a training batch in fit() methods.

Subclasses should override for any actions to run.

Note that if the steps_per_execution argument to compile in Model is set to N, this method will only be called every N batches.

Args:

  • batch: Integer, index of batch within the current epoch.

  • logs: Named list. Aggregated metric results up until this batch.

on_test_begin
\(logs = NULL)

Called at the beginning of evaluation or validation.

Subclasses should override for any actions to run.

Args:

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_test_end
\(logs = NULL)

Called at the end of evaluation or validation.

Subclasses should override for any actions to run.

Args:

  • logs: Named list. Currently the output of the last call to on_test_batch_end() is passed to this argument for this method but that may change in the future.

on_test_batch_begin
\(batch, logs = NULL)

Called at the beginning of a batch in evaluate() methods.

Also called at the beginning of a validation batch in the fit() methods, if validation data is provided.

Subclasses should override for any actions to run.

Note that if the steps_per_execution argument to compile() in Model is set to N, this method will only be called every N batches.

Args:

  • batch: Integer, index of batch within the current epoch.

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_test_batch_end
\(batch, logs = NULL)

Called at the end of a batch in evaluate() methods.

Also called at the end of a validation batch in the fit() methods, if validation data is provided.

Subclasses should override for any actions to run.

Note that if the steps_per_execution argument to compile() in Model is set to N, this method will only be called every N batches.

Args:

  • batch: Integer, index of batch within the current epoch.

  • logs: Named list. Aggregated metric results up until this batch.

on_predict_begin
\(logs = NULL)

Called at the beginning of prediction.

Subclasses should override for any actions to run.

Args:

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_predict_end
\(logs = NULL)

Called at the end of prediction.

Subclasses should override for any actions to run.

Args:

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_predict_batch_begin
\(batch, logs = NULL)

Called at the beginning of a batch in predict() methods.

Subclasses should override for any actions to run.

Note that if the steps_per_execution argument to compile() in Model is set to N, this method will only be called every N batches.

Args:

  • batch: Integer, index of batch within the current epoch.

  • logs: Named list. Currently no data is passed to this argument for this method but that may change in the future.

on_predict_batch_end
\(batch, logs = NULL)

Called at the end of a batch in predict() methods.

Subclasses should override for any actions to run.

Note that if the steps_per_execution argument to compile in Model is set to N, this method will only be called every N batches.

Args:

  • batch: Integer, index of batch within the current epoch.

  • logs: Named list. Aggregated metric results up until this batch.

..., public

Additional methods or public members of the custom class.

private

Named list of R objects (typically, functions) to include in instance private environments. private methods will have all the same symbols in scope as public methods (see section "Symbols in Scope"). Each instance will have its own private environment. Any objects in private will be invisible from the Keras framework and the Python runtime.

inherit

What the custom class will subclass. By default, the base keras class.

parent_env

The R environment that all class methods will have as a grandparent.

Value

A function that returns the custom Callback instances, similar to the builtin callback functions.

Examples

training_finished <- FALSE
callback_mark_finished <- Callback("MarkFinished",
  on_train_end = function(logs = NULL) {
    training_finished <<- TRUE
  }
)

model <- keras_model_sequential(input_shape = c(1)) |>
  layer_dense(1)
model |> compile(loss = 'mean_squared_error')
model |> fit(op_ones(c(1, 1)), op_ones(c(1, 1)),
             callbacks = callback_mark_finished())
stopifnot(isTRUE(training_finished))

Attributes (accessible via ⁠self$⁠)

  • params: Named list, Training parameters (e.g. verbosity, batch size, number of epochs, ...).

  • model: Instance of Model. Reference of the model being trained.

The logs named list that callback methods take as argument will contain keys for quantities relevant to the current batch or epoch (see method-specific docstrings).
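
As an illustration of reading values from logs, the following sketch records the training loss after every epoch. The class name "RecordLoss", the variable epoch_losses, and the tiny random dataset are all made up for this example, and it assumes a working keras3 installation with a backend available.

```r
library(keras3)

# Collected in the calling environment, in the same way as
# `training_finished` in the example above.
epoch_losses <- numeric()

callback_record_loss <- Callback("RecordLoss",
  on_epoch_end = function(epoch, logs = NULL) {
    # `epoch` is zero-based, so shift by one for R indexing.
    epoch_losses[epoch + 1] <<- logs$loss
  }
)

model <- keras_model_sequential(input_shape = c(4)) |>
  layer_dense(1)
model |> compile(optimizer = "sgd", loss = "mse")

x <- array(runif(32 * 4), dim = c(32, 4))
y <- array(runif(32), dim = c(32, 1))
model |> fit(x, y, epochs = 3, verbose = 0,
             callbacks = list(callback_record_loss()))

length(epoch_losses)  # one recorded loss per epoch
```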

Symbols in scope

All R function custom methods (public and private) will have the following symbols in scope:

  • self: The custom class instance.

  • super: The custom class superclass.

  • private: An R environment specific to the class instance. Any objects assigned here are invisible to the Keras framework.

  • ⁠__class__⁠ and as.symbol(classname): the custom class type object.

See Also

Other callbacks:
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Callback to back up and restore the training state.

Description

callback_backup_and_restore() is intended to recover training from an interruption that happens in the middle of a fit execution, by backing up the training state in a temporary checkpoint file at the end of each epoch. Each backup overwrites the previously written checkpoint file, so at any given time there is at most one such checkpoint file for backup and restore purposes.

If training restarts before completion, the training state (which includes the model weights and epoch number) is restored to the most recently saved state at the beginning of a new fit run. At the completion of a fit run, the temporary checkpoint file is deleted.

Note that the user is responsible for restarting jobs after an interruption. This callback provides the backup and restore mechanism for fault tolerance, and the model to be restored from a previous checkpoint is expected to be the same as the one used to back up. If the user changes the arguments passed to compile or fit, the checkpoint saved for fault tolerance can become invalid.

Usage

callback_backup_and_restore(
  backup_dir,
  save_freq = "epoch",
  delete_checkpoint = TRUE
)

Arguments

backup_dir

String, path of directory where to store the data needed to restore the model. The directory cannot be reused elsewhere to store other files, e.g. by the backup_and_restore callback of another training run, or by another callback (e.g. callback_model_checkpoint) of the same training run.

save_freq

"epoch", integer, or FALSE. When set to "epoch", the callback saves the checkpoint at the end of each epoch. When set to an integer, the callback saves the checkpoint every save_freq batches. Set save_freq = FALSE only if using preemption checkpointing (i.e. with save_before_preemption = TRUE).

delete_checkpoint

Boolean. This backup_and_restore callback works by saving a checkpoint to back up the training state. If delete_checkpoint = TRUE, the checkpoint will be deleted after training is finished. Use FALSE if you'd like to keep the checkpoint for future usage. Defaults to TRUE.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

callback_interrupting <- new_callback_class(
  "InterruptingCallback",
  on_epoch_begin = function(epoch, logs = NULL) {
    if (epoch == 4) {
      stop('Interrupting!')
    }
  }
)

backup_dir <- tempfile()
callback <- callback_backup_and_restore(backup_dir = backup_dir)
model <- keras_model_sequential() %>%
  layer_dense(10)
model %>% compile(optimizer = optimizer_sgd(), loss = 'mse')

# ensure model is built (i.e., weights are initialized) for
# callback_backup_and_restore()
model(op_ones(c(5, 20))) |> invisible()

tryCatch({
  model %>% fit(x = op_ones(c(5, 20)),
                y = op_zeros(5),
                epochs = 10, batch_size = 1,
                callbacks = list(callback, callback_interrupting()),
                verbose = 0)
}, python.builtin.RuntimeError = function(e) message("Interrupted!"))
## Interrupted!
model$history$epoch
## [1] 0 1 2

# model$history %>% keras3:::to_keras_training_history() %>% as.data.frame() %>% print()

history <- model %>% fit(x = op_ones(c(5, 20)),
                         y = op_zeros(5),
                         epochs = 10, batch_size = 1,
                         callbacks = list(callback),
                         verbose = 0)

# Only 6 more epochs are run, since first training got interrupted at
# zero-indexed epoch 4, second training will continue from 4 to 9.
nrow(as.data.frame(history))
## [1] 10

See Also

Other callbacks:
Callback()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Callback that streams epoch results to a CSV file.

Description

Supports all values that can be represented as a string, including 1D iterables such as atomic vectors.

Usage

callback_csv_logger(filename, separator = ",", append = FALSE)

Arguments

filename

Filename of the CSV file, e.g. 'run/log.csv'.

separator

String used to separate elements in the CSV file.

append

Boolean. TRUE: append if file exists (useful for continuing training). FALSE: overwrite existing file.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

csv_logger <- callback_csv_logger('training.log')
model %>% fit(X_train, Y_train, callbacks = list(csv_logger))
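
A self-contained variant of the example above, which also reads the log back, might look like the following sketch. The model, the random data, and the temporary file path are placeholders for illustration, and it assumes a working keras3 installation with a backend available.

```r
library(keras3)

log_path <- tempfile(fileext = ".csv")

model <- keras_model_sequential(input_shape = c(4)) |>
  layer_dense(1)
model |> compile(optimizer = "sgd", loss = "mse")

x <- array(runif(32 * 4), dim = c(32, 4))
y <- array(runif(32), dim = c(32, 1))

model |> fit(x, y, epochs = 3, verbose = 0,
             callbacks = list(callback_csv_logger(log_path)))

# One row per epoch; columns include the epoch index and the loss.
training_log <- read.csv(log_path)
print(training_log)
```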

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Stop training when a monitored metric has stopped improving.

Description

Assume the goal of training is to minimize the loss. In that case, the metric to be monitored would be 'loss', and mode would be 'min'. A model$fit() training loop checks at the end of every epoch whether the loss is no longer decreasing, considering min_delta and patience if applicable. Once it is found to be no longer decreasing, model$stop_training is set to TRUE and training terminates.

The quantity to be monitored needs to be available in the logs list. To make it so, pass the loss or metrics at model$compile().

Usage

callback_early_stopping(
  monitor = "val_loss",
  min_delta = 0L,
  patience = 0L,
  verbose = 0L,
  mode = "auto",
  baseline = NULL,
  restore_best_weights = FALSE,
  start_from_epoch = 0L
)

Arguments

monitor

Quantity to be monitored. Defaults to "val_loss".

min_delta

Minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement. Defaults to 0.

patience

Number of epochs with no improvement after which training will be stopped. Defaults to 0.

verbose

Verbosity mode, 0 or 1. Mode 0 is silent, and mode 1 displays messages when the callback takes an action. Defaults to 0.

mode

One of {"auto", "min", "max"}. In "min" mode, training will stop when the quantity monitored has stopped decreasing; in "max" mode it will stop when the quantity monitored has stopped increasing; in "auto" mode, the direction is automatically inferred from the name of the monitored quantity. Defaults to "auto".

baseline

Baseline value for the monitored quantity. If not NULL, training will stop if the model doesn't show improvement over the baseline. Defaults to NULL.

restore_best_weights

Whether to restore model weights from the epoch with the best value of the monitored quantity. If FALSE, the model weights obtained at the last step of training are used. An epoch will be restored regardless of the performance relative to the baseline. If no epoch improves on baseline, training will run for patience epochs and restore weights from the best epoch in that set. Defaults to FALSE.

start_from_epoch

Number of epochs to wait before starting to monitor improvement. This allows for a warm-up period in which no improvement is expected and thus training will not be stopped. Defaults to 0.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

callback <- callback_early_stopping(monitor = 'loss',
                                   patience = 3)
# This callback will stop the training when there is no improvement in
# the loss for three consecutive epochs.
model <- keras_model_sequential() %>%
  layer_dense(10)
model %>% compile(optimizer = optimizer_sgd(), loss = 'mse')
history <- model %>% fit(x = op_ones(c(5, 20)),
                         y = op_zeros(5),
                         epochs = 10, batch_size = 1,
                         callbacks = list(callback),
                         verbose = 0)
nrow(as.data.frame(history))  # Epochs actually run (fewer than 10 if stopped early).
## [1] 10
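
How min_delta and patience interact can be traced by hand on a vector of losses. The helper below is a simplified base-R sketch for mode = "min" only. It ignores baseline, start_from_epoch and restore_best_weights, is not the code Keras runs, and its name early_stop_epoch() is hypothetical.

```r
# Hypothetical helper (not part of keras3): returns the 1-based epoch at
# which training would stop, or NA if every epoch would run.
early_stop_epoch <- function(losses, patience = 0, min_delta = 0) {
  best <- Inf
  wait <- 0
  for (epoch in seq_along(losses)) {
    if (losses[epoch] < best - min_delta) {
      # Improvement by more than min_delta: remember it, reset the counter.
      best <- losses[epoch]
      wait <- 0
    } else {
      # No sufficient improvement: count the epoch against patience.
      wait <- wait + 1
      if (wait >= patience) return(epoch)
    }
  }
  NA_integer_
}

losses <- c(1.0, 0.8, 0.79, 0.795, 0.80, 0.81)
early_stop_epoch(losses, patience = 3)
## [1] 6
early_stop_epoch(losses, patience = 3, min_delta = 0.05)
## [1] 5
```

With min_delta = 0.05 the drop from 0.8 to 0.79 no longer counts as an improvement, so the patience counter starts one epoch earlier.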

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Callback for creating simple, custom callbacks on-the-fly.

Description

This callback is constructed with anonymous functions that will be called at the appropriate time (during Model.{fit | evaluate | predict}). Note that the callbacks expect positional arguments:

  • on_epoch_begin and on_epoch_end expect two positional arguments: epoch, logs

  • on_train_begin and on_train_end expect one positional argument: logs

  • on_train_batch_begin and on_train_batch_end expect two positional arguments: batch, logs

  • See Callback class definition for the full list of functions and their expected arguments.

Usage

callback_lambda(
  on_epoch_begin = NULL,
  on_epoch_end = NULL,
  on_train_begin = NULL,
  on_train_end = NULL,
  on_train_batch_begin = NULL,
  on_train_batch_end = NULL,
  ...
)

Arguments

on_epoch_begin

called at the beginning of every epoch.

on_epoch_end

called at the end of every epoch.

on_train_begin

called at the beginning of model training.

on_train_end

called at the end of model training.

on_train_batch_begin

called at the beginning of every train batch.

on_train_batch_end

called at the end of every train batch.

...

Any function in Callback() that you want to override by passing function_name = function. For example, callback_lambda(..., on_train_end = train_end_fn). The custom function needs to have the same arguments as the ones defined in Callback().

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

# Print the batch number at the beginning of every batch.
batch_print_callback <- callback_lambda(
  on_train_batch_begin = function(batch, logs) {
    print(batch)
  }
)

# Stream the epoch loss to a file in new-line delimited JSON format
# (one valid JSON object per line)
json_log <- file('loss_log.json', open = 'wt')
json_logging_callback <- callback_lambda(
  on_epoch_end = function(epoch, logs) {
    jsonlite::write_json(
      list(epoch = epoch, loss = logs$loss),
      json_log,
      append = TRUE
    )
  },
  on_train_end = function(logs) {
    close(json_log)
  }
)

# Terminate some processes after having finished model training.
processes <- ...
cleanup_callback <- callback_lambda(
  on_train_end = function(logs) {
    for (p in processes) {
      if (is_alive(p)) {
        terminate(p)
      }
    }
  }
)

model %>% fit(
  ...,
  callbacks = list(
    batch_print_callback,
    json_logging_callback,
    cleanup_callback
  )
)

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Learning rate scheduler.

Description

At the beginning of every epoch, this callback gets the updated learning rate value from the provided schedule function, called with the current epoch and current learning rate, and applies the updated learning rate to the optimizer.

Usage

callback_learning_rate_scheduler(schedule, verbose = 0L)

Arguments

schedule

A function that takes an epoch index (integer, indexed from 0) and current learning rate (float) as inputs and returns a new learning rate as output (float).

verbose

Integer. 0: quiet, 1: log update messages.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

# This function keeps the initial learning rate steady for the first ten epochs
# and decreases it exponentially after that.
scheduler <- function(epoch, lr) {
  if (epoch < 10)
    return(lr)
  else
    return(lr * exp(-0.1))
}

model <- keras_model_sequential() |> layer_dense(units = 10)
model |> compile(optimizer = optimizer_sgd(), loss = 'mse')
model$optimizer$learning_rate |> as.array() |> round(5)
## [1] 0.01

callback <- callback_learning_rate_scheduler(schedule = scheduler)
history <- model |> fit(x = array(runif(100), c(5, 20)),
                        y = array(0, c(5, 1)),
                        epochs = 15, callbacks = list(callback), verbose = 0)
model$optimizer$learning_rate |> as.array() |> round(5)
## [1] 0.00607
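
The final value can be checked without training a model by applying the same schedule by hand in base R. This sketch assumes the starting learning rate of 0.01 printed above and zero-based epoch indices 0 through 14.

```r
scheduler <- function(epoch, lr) {
  if (epoch < 10) lr else lr * exp(-0.1)
}

lr <- 0.01
for (epoch in 0:14) {
  lr <- scheduler(epoch, lr)
}

# Epochs 10 to 14 each multiply the rate by exp(-0.1), i.e. exp(-0.5) overall.
round(lr, 5)
## [1] 0.00607
```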

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Callback to save the Keras model or model weights at some frequency.

Description

callback_model_checkpoint() is used in conjunction with training using model |> fit() to save a model or weights (in a checkpoint file) at some interval, so the model or weights can be loaded later to continue the training from the state saved.

A few options this callback provides include:

  • Whether to only keep the model that has achieved the "best performance" so far, or whether to save the model at the end of every epoch regardless of performance.

  • Definition of "best"; which quantity to monitor and whether it should be maximized or minimized.

  • The frequency it should save at. Currently, the callback supports saving at the end of every epoch, or after a fixed number of training batches.

  • Whether only weights are saved, or the whole model is saved.

Usage

callback_model_checkpoint(
  filepath,
  monitor = "val_loss",
  verbose = 0L,
  save_best_only = FALSE,
  save_weights_only = FALSE,
  mode = "auto",
  save_freq = "epoch",
  initial_value_threshold = NULL
)

Arguments

filepath

string, path to save the model file. filepath can contain named formatting options, which will be filled with the value of epoch and keys in logs (passed in on_epoch_end). The filepath name needs to end with ".weights.h5" when save_weights_only = TRUE, or with ".keras" when checkpointing the whole model (default). For example: if filepath is "{epoch:02d}-{val_loss:.2f}.keras", then the model checkpoints will be saved with the epoch number and the validation loss in the filename. The directory of the filepath should not be reused by any other callbacks to avoid conflicts.
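
The placeholders in filepath use Python format syntax and are filled in by Keras, not by R. To preview what a pattern such as "{epoch:02d}-{val_loss:.2f}.keras" produces, the equivalent sprintf() call can be used. The epoch number and validation loss below are hypothetical values chosen for illustration.

```r
# Hypothetical values for one epoch; during training they come from
# the epoch counter and from `logs`.
epoch <- 3L
val_loss <- 0.4567

# "%02d" zero-pads the epoch to two digits, "%.2f" keeps two decimals.
checkpoint_name <- sprintf("%02d-%.2f.keras", epoch, val_loss)
checkpoint_name
## [1] "03-0.46.keras"
```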

monitor

The metric name to monitor. Typically the metrics are set by the model |> compile() method. Note:

  • Prefix the name with "val_" to monitor validation metrics.

  • Use "loss" or "val_loss" to monitor the model's total loss.

  • If you specify metrics as strings, like "accuracy", pass the same string (with or without the "val_" prefix).

  • If you pass Metric objects (created by one of ⁠metric_*()⁠), monitor should be set to metric$name.

  • If you're not sure about the metric names you can check the contents of the history$metrics list returned by history <- model |> fit()

  • Multi-output models set additional prefixes on the metric names.

verbose

Verbosity mode, 0 or 1. Mode 0 is silent, and mode 1 displays messages when the callback takes an action.

save_best_only

if save_best_only = TRUE, it only saves when the model is considered the "best" and the latest best model according to the quantity monitored will not be overwritten. If filepath doesn't contain formatting options like {epoch} then filepath will be overwritten by each new better model.

save_weights_only

if TRUE, then only the model's weights will be saved (model |> save_model_weights(filepath)), else the full model is saved (model |> save_model(filepath)).

mode

one of {"auto", "min", "max"}. If save_best_only = TRUE, the decision to overwrite the current save file is made based on either the maximization or the minimization of the monitored quantity. For val_acc, this should be "max", for val_loss this should be "min", etc. In "auto" mode, the mode is set to "max" if the quantities monitored are "acc" or start with "fmeasure" and are set to "min" for the rest of the quantities.

save_freq

"epoch" or integer. When using "epoch", the callback saves the model after each epoch. When using integer, the callback saves the model at end of this many batches. If the Model is compiled with steps_per_execution = N, then the saving criteria will be checked every Nth batch. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to "epoch".

initial_value_threshold

Floating point initial "best" value of the metric to be monitored. Only applies if save_best_only = TRUE. Only overwrites the model weights already saved if the performance of the current model is better than this value.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

model <- keras_model_sequential(input_shape = c(10)) |>
  layer_dense(1, activation = "sigmoid") |>
  compile(loss = "binary_crossentropy", optimizer = "adam",
          metrics = c('accuracy'))

EPOCHS <- 10
checkpoint_filepath <- tempfile('checkpoint-model-', fileext = ".keras")
model_checkpoint_callback <- callback_model_checkpoint(
  filepath = checkpoint_filepath,
  monitor = 'val_accuracy',
  mode = 'max',
  save_best_only = TRUE
)

# Model is saved at the end of every epoch, if it's the best seen so far.
model |> fit(x = random_uniform(c(2, 10)), y = op_ones(2, 1),
             epochs = EPOCHS, validation_split = .5, verbose = 0,
             callbacks = list(model_checkpoint_callback))

# The model (the one considered the best) can be loaded with:
load_model(checkpoint_filepath)
## Model: "sequential"
## +---------------------------------+------------------------+---------------+
## | Layer (type)                    | Output Shape           |       Param # |
## +=================================+========================+===============+
## | dense (Dense)                   | (None, 1)              |            11 |
## +---------------------------------+------------------------+---------------+
##  Total params: 35 (144.00 B)
##  Trainable params: 11 (44.00 B)
##  Non-trainable params: 0 (0.00 B)
##  Optimizer params: 24 (100.00 B)

# Alternatively, one could checkpoint just the model weights:
checkpoint_filepath <- tempfile('checkpoint-', fileext = ".weights.h5")
model_checkpoint_callback <- callback_model_checkpoint(
  filepath = checkpoint_filepath,
  save_weights_only = TRUE,
  monitor = 'val_accuracy',
  mode = 'max',
  save_best_only = TRUE
)

# Model weights are saved at the end of every epoch, if they're the best seen
# so far. The fit() call is the same as above.
model |> fit(x = random_uniform(c(2, 10)), y = op_ones(2, 1),
             epochs = EPOCHS, validation_split = .5, verbose = 0,
             callbacks = list(model_checkpoint_callback))

# The model weights (that are considered the best) can be loaded
model |> load_model_weights(checkpoint_filepath)

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Reduce learning rate when a metric has stopped improving.

Description

Models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates. This callback monitors a quantity and if no improvement is seen for a 'patience' number of epochs, the learning rate is reduced.

Usage

callback_reduce_lr_on_plateau(
  monitor = "val_loss",
  factor = 0.1,
  patience = 10L,
  verbose = 0L,
  mode = "auto",
  min_delta = 1e-04,
  cooldown = 0L,
  min_lr = 0,
  ...
)

Arguments

monitor

String. Quantity to be monitored.

factor

Float. Factor by which the learning rate will be reduced. new_lr = lr * factor.

patience

Integer. Number of epochs with no improvement after which learning rate will be reduced.

verbose

Integer. 0: quiet, 1: update messages.

mode

String. One of {'auto', 'min', 'max'}. In 'min' mode, the learning rate will be reduced when the quantity monitored has stopped decreasing; in 'max' mode it will be reduced when the quantity monitored has stopped increasing; in 'auto' mode, the direction is automatically inferred from the name of the monitored quantity.

min_delta

Float. Threshold for measuring the new optimum, to only focus on significant changes.

cooldown

Integer. Number of epochs to wait before resuming normal operation after the learning rate has been reduced.

min_lr

Float. Lower bound on the learning rate.

...

For forward/backward compatibility.
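To see how factor, patience, min_delta, cooldown and min_lr interact, the schedule can be simulated in base R. This is a simplified sketch that handles only the 'min' direction; the function name simulate_reduce_lr() and the loss values are hypothetical, and the real callback may differ in details.

```r
# Hypothetical simulation (not keras3 code) of the learning rate after each
# epoch, assuming the monitored quantity should decrease ("min" mode).
simulate_reduce_lr <- function(losses, lr, factor = 0.1, patience = 10,
                               min_delta = 1e-4, cooldown = 0, min_lr = 0) {
  best <- Inf
  wait <- 0
  cooldown_left <- 0
  lrs <- numeric(length(losses))
  for (epoch in seq_along(losses)) {
    if (cooldown_left > 0) {
      # Epochs inside the cooldown window do not count towards patience.
      cooldown_left <- cooldown_left - 1
      wait <- 0
    }
    if (losses[epoch] < best - min_delta) {
      # Only changes larger than `min_delta` count as an improvement.
      best <- losses[epoch]
      wait <- 0
    } else if (cooldown_left == 0) {
      wait <- wait + 1
      if (wait >= patience) {
        lr <- max(lr * factor, min_lr)  # new_lr = lr * factor, floored at min_lr
        cooldown_left <- cooldown
        wait <- 0
      }
    }
    lrs[epoch] <- lr
  }
  lrs
}

# Made-up losses that plateau at 0.5 after the second epoch.
lrs <- simulate_reduce_lr(c(1, rep(0.5, 8)), lr = 0.1, factor = 0.5,
                          patience = 2, min_lr = 0.03)
print(lrs)
```

With these inputs the rate is halved after two stagnant epochs, and the second reduction is clipped at min_lr rather than dropping to 0.025.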

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

reduce_lr <- callback_reduce_lr_on_plateau(monitor = 'val_loss', factor = 0.2,
                                           patience = 5, min_lr = 0.001)
model %>% fit(x_train, y_train, callbacks = list(reduce_lr))

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Callback used to stream events to a server.

Description

Requires the requests library. Events are sent to root + '/publish/epoch/end/' by default. Calls are HTTP POST, with a data argument which is a JSON-encoded named list of event data. If send_as_json = TRUE, the content type of the request will be "application/json". Otherwise the serialized JSON will be sent within a form.

Usage

callback_remote_monitor(
  root = "http://localhost:9000",
  path = "/publish/epoch/end/",
  field = "data",
  headers = NULL,
  send_as_json = FALSE
)

Arguments

root

String; root url of the target server.

path

String; path relative to root to which the events will be sent.

field

String; JSON field under which the data will be stored. The field is used only if the payload is sent within a form (i.e. when send_as_json = FALSE).

headers

Named list; optional custom HTTP headers.

send_as_json

Boolean; whether the request should be sent as "application/json".

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()


Swaps model weights and EMA weights before and after evaluation.

Description

This callback replaces the model's weight values with the values of the optimizer's EMA weights (the exponential moving average of the past model weight values, implementing "Polyak averaging") before model evaluation, and restores the previous weights after evaluation.

The SwapEMAWeights callback is to be used in conjunction with an optimizer that sets use_ema = TRUE.

Note that the weights are swapped in-place in order to save memory. The behavior is undefined if you modify the EMA weights or model weights in other callbacks.
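The arithmetic behind the swap can be sketched with plain R vectors. This is an illustration only: the momentum value of 0.99, the weight values, and the swap() helper are assumptions made for the example, and the real callback operates on the optimizer's variables rather than on R vectors.

```r
ema_momentum <- 0.99          # assumed value, for illustration only

weights <- c(1.0, 2.0)        # current model weights (made-up values)
ema     <- weights            # the moving average starts at the weights

# One training step changes the weights; the moving average lags behind.
weights <- c(0.0, 4.0)
ema <- ema_momentum * ema + (1 - ema_momentum) * weights

# Hypothetical helper mimicking the swap: the two sets of values trade places.
swap <- function(state) list(weights = state$ema, ema = state$weights)

state       <- list(weights = weights, ema = ema)
during_eval <- swap(state)        # evaluation sees the averaged values
after_eval  <- swap(during_eval)  # swapping again restores the originals

print(during_eval$weights)
print(after_eval$weights)
```

Because the two sets of values simply trade places, a second swap restores the training weights exactly, which is why modifying either set in between leads to undefined behavior.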

Usage

callback_swap_ema_weights(swap_on_epoch = FALSE)

Arguments

swap_on_epoch

Whether to perform swapping at on_epoch_begin() and on_epoch_end(). This is useful if you want to use EMA weights for other callbacks such as callback_model_checkpoint(). Defaults to FALSE.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

# Remember to set `use_ema=TRUE` in the optimizer
optimizer <- optimizer_sgd(use_ema = TRUE)
model |> compile(optimizer = optimizer, loss = ..., metrics = ...)

# Metrics will be computed with EMA weights
model |> fit(X_train, Y_train,
             callbacks = c(callback_swap_ema_weights()))

# If you want to save model checkpoint with EMA weights, you can set
# `swap_on_epoch=TRUE` and place ModelCheckpoint after SwapEMAWeights.
model |> fit(
  X_train, Y_train,
  callbacks = c(
    callback_swap_ema_weights(swap_on_epoch = TRUE),
    callback_model_checkpoint(...)
  )
)

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_tensorboard()
callback_terminate_on_nan()


Enable visualizations for TensorBoard.

Description

TensorBoard is a visualization tool provided with TensorFlow. A TensorFlow installation is required to use this callback.

This callback logs events for TensorBoard, including:

  • Metrics summary plots

  • Training graph visualization

  • Weight histograms

  • Sampled profiling

When used in model |> evaluate() or for regular validation, in addition to the epoch summaries, a summary is written that records evaluation metrics against model$optimizer$iterations. The metric names are prefixed with evaluation, and model$optimizer$iterations is the step shown in TensorBoard.

If you have installed TensorFlow with pip or reticulate::py_install(), you should be able to launch TensorBoard from the command line:

tensorboard --logdir=path_to_your_logs

or from R with tensorflow::tensorboard().

You can find more information about TensorBoard here.

Usage

callback_tensorboard(
  log_dir = "logs",
  histogram_freq = 0L,
  write_graph = TRUE,
  write_images = FALSE,
  write_steps_per_second = FALSE,
  update_freq = "epoch",
  profile_batch = 0L,
  embeddings_freq = 0L,
  embeddings_metadata = NULL
)

Arguments

log_dir

Path of the directory in which to save the log files to be parsed by TensorBoard, e.g., log_dir = file.path(working_dir, 'logs'). This directory should not be reused by any other callbacks.

histogram_freq

frequency (in epochs) at which to compute weight histograms for the layers of the model. If set to 0, histograms won't be computed. Validation data (or split) must be specified for histogram visualizations.

write_graph

(Not supported at this time) Whether to visualize the graph in TensorBoard. Note that the log file can become quite large when write_graph is set to TRUE.

write_images

whether to write model weights to visualize as images in TensorBoard.

write_steps_per_second

whether to log the training steps per second into TensorBoard. This supports both epoch and batch frequency logging.

update_freq

"batch" or "epoch" or integer. When using "epoch", writes the losses and metrics to TensorBoard after every epoch. If using an integer, let's say 1000, all metrics and losses (including custom ones added by Model.compile) will be logged to TensorBoard every 1000 batches. "batch" is a synonym for 1, meaning that they will be written every batch. Note however that writing too frequently to TensorBoard can slow down your training, especially when used with distribution strategies as it will incur additional synchronization overhead. Batch-level summary writing is also available via train_step override. Please see TensorBoard Scalars tutorial # noqa: E501 for more details.

profile_batch

(Not supported at this time) Profile the batch(es) to sample compute characteristics. profile_batch must be a non-negative integer or a tuple of integers. A pair of positive integers signify a range of batches to profile. By default, profiling is disabled.

embeddings_freq

frequency (in epochs) at which embedding layers will be visualized. If set to 0, embeddings won't be visualized.

embeddings_metadata

Named list which maps embedding layer names to the filename of a file in which to save metadata for the embedding layer. In case the same metadata file is to be used for all embedding layers, a single filename can be passed.

Value

A Callback instance that can be passed to fit.keras.src.models.model.Model().

Examples

tensorboard_callback <- callback_tensorboard(log_dir = "./logs")
model %>% fit(x_train, y_train, epochs = 2, callbacks = list(tensorboard_callback))
# Then run the tensorboard command to view the visualizations.

Custom batch-level summaries in a subclassed Model:

MyModel <- new_model_class("MyModel",
  initialize = function() {
    self$dense <- layer_dense(units = 10)
  },
  call = function(x) {
    outputs <- x |> self$dense()
    tf$summary$histogram('outputs', outputs)
    outputs
  }
)

model <- MyModel()
model |> compile(optimizer = 'sgd', loss = 'mse')

# Make sure to set `update_freq = N` to log a batch-level summary every N
# batches. In addition to any `tf$summary` contained in `model$call()`,
# metrics added in `model |> compile()` will be logged every N batches.
tb_callback <- callback_tensorboard(log_dir = './logs', update_freq = 1)
model |> fit(x_train, y_train, callbacks = list(tb_callback))

Custom batch-level summaries in a Functional API Model:

my_summary <- function(x) {
  tf$summary$histogram('x', x)
  x
}

inputs <- layer_input(10)
outputs <- inputs |>
  layer_dense(10) |>
  layer_lambda(my_summary)

model <- keras_model(inputs, outputs)
model |> compile(optimizer = 'sgd', loss = 'mse')

# Make sure to set `update_freq = N` to log a batch-level summary every N
# batches. In addition to any `tf$summary` contained in `model$call()`,
# metrics added in `model |> compile()` will be logged every N batches.
tb_callback <- callback_tensorboard(log_dir = './logs', update_freq = 1)
model |> fit(x_train, y_train, callbacks = list(tb_callback))

Profiling:

# Profile a single batch, e.g. the 5th batch.
tensorboard_callback <- callback_tensorboard(
  log_dir = './logs', profile_batch = 5)
model |> fit(x_train, y_train, epochs = 2,
             callbacks = list(tensorboard_callback))

# Profile a range of batches, e.g. from 10 to 20.
tensorboard_callback <- callback_tensorboard(
  log_dir = './logs', profile_batch = c(10, 20))
model |> fit(x_train, y_train, epochs = 2,
             callbacks = list(tensorboard_callback))

See Also

Other callbacks:
Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_terminate_on_nan()


Resets all state generated by Keras.

Description

Keras manages a global state, which it uses to implement the Functional model-building API and to uniquify autogenerated layer names.

If you are creating many models in a loop, this global state will consume an increasing amount of memory over time, and you may want to clear it. Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

Example 1: calling clear_session() when creating models in a loop

for (i in 1:100) {
  # Without `clear_session()`, each iteration of this loop will
  # slightly increase the size of the global state managed by Keras
  model <- keras_model_sequential()
  for (j in 1:10) {
    model <- model |> layer_dense(units = 10)
  }
}

for (i in 1:100) {
  # With `clear_session()` called at the beginning,
  # Keras starts with a blank state at each iteration
  # and memory consumption is constant over time.
  clear_session()
  model <- keras_model_sequential()
  for (j in 1:10) {
    model <- model |> layer_dense(units = 10)
  }
}

Example 2: resetting the layer name generation counter

layers <- lapply(1:10, \(i) layer_dense(units = 10))

new_layer <- layer_dense(units = 10)
print(new_layer$name)
## [1] "dense_10"

clear_session()
new_layer <- layer_dense(units = 10)
print(new_layer$name)
## [1] "dense"

Usage

clear_session(free_memory = TRUE)

Arguments

free_memory

Whether to call Python garbage collection. It's usually a good practice to call it to make sure memory used by deleted objects is immediately freed. However, it may take a few seconds to execute, so when using clear_session() in a short loop, you may want to skip it.

Value

NULL, invisibly, called for side effects.

See Also

Other backend:
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()

Other utils:
audio_dataset_from_directory()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Clone a Functional or Sequential Model instance.

Description

Model cloning is similar to calling a model on new inputs, except that it creates new layers (and thus new weights) instead of sharing the weights of the existing layers.

Note that clone_model() will not preserve the uniqueness of shared objects within the model (e.g. a single variable attached to two distinct layers will be restored as two separate variables).

Usage

clone_model(
  model,
  input_tensors = NULL,
  clone_function = NULL,
  call_function = NULL,
  recursive = FALSE,
  ...
)

Arguments

model

Instance of Model (could be a Functional model or a Sequential model).

input_tensors

Optional list of input tensors to build the model upon. If not provided, new keras_input() objects will be created.

clone_function

Callable with signature ⁠function(layer)⁠ to be used to clone each layer in the target model (except Input instances). It takes as argument the layer instance to be cloned, and returns the corresponding layer instance to be used in the model copy. If unspecified, this callable defaults to the following serialization/deserialization function: function(layer) layer$`__class__`$from_config(layer$get_config()). By passing a custom callable, you can customize your copy of the model, e.g. by wrapping certain layers of interest (you might want to replace all LSTM instances with equivalent Bidirectional(LSTM(...)) instances, for example). Defaults to NULL.

call_function

Callable with signature ⁠function(layer, ...)⁠ to be used to call each cloned layer and a set of inputs. It takes the layer instance, and the call arguments, and returns the call outputs. If unspecified, this callable defaults to the regular call() method: function(layer, ...) do.call(layer, list(...)). By passing a custom callable, you can insert new layers before or after a given layer.

recursive

Boolean. Whether to recursively clone any Sequential or Functional models encountered in the original Sequential/Functional model. If FALSE, then inner models are cloned by calling clone_function(). If TRUE, then inner models are cloned by calling clone_model() with the same clone_function, call_function, and recursive arguments. Note that in this case, call_function will not be propagated to any Sequential model (since it is not applicable to Sequential models). This argument can only be used with Functional models.

...

For forward/backward compatibility.

Value

An instance of Model reproducing the behavior of the original model, on top of new input tensors, using newly instantiated weights. The cloned model may behave differently from the original model if a custom clone_function or call_function modifies a layer or layer call.

Examples

# Create a test Sequential model.
model <- keras_model_sequential(input_shape = c(728)) |>
  layer_dense(32, activation = 'relu') |>
  layer_dense(1, activation = 'sigmoid')

# Create a copy of the test model (with freshly initialized weights).
new_model <- clone_model(model)

Using a clone_function to make a model deterministic by setting the random seed everywhere:

clone_function <- function(layer) {
  config <- layer$get_config()
  if ("seed" %in% names(config))
    config$seed <- 1337L
  layer$`__class__`$from_config(config)
}

new_model <- clone_model(model, clone_function = clone_function)

Using a call_function to add a Dropout layer after each Dense layer (without recreating new layers):

call_function <- function(layer, ...) {
  out <- layer(...)
  if (inherits(layer, keras$layers$Dense))
    out <- out |> layer_dropout(0.5)
  out
}

inputs <- keras_input(c(728))
outputs <- inputs |>
  layer_dense(32, activation = 'relu') |>
  layer_dense(1, activation = 'sigmoid')
model <- keras_model(inputs, outputs)

new_model <- clone_model(
  model,
  clone_function = function(x) x, # Reuse the same layers.
  call_function = call_function,
)
new_model
## Model: "functional_4"
## +-----------------------------------+--------------------------+---------------+
## | Layer (type)                      | Output Shape             |       Param # |
## +===================================+==========================+===============+
## | keras_tensor_8 (InputLayer)       | (None, 728)              |             0 |
## +-----------------------------------+--------------------------+---------------+
## | dense_2 (Dense)                   | (None, 32)               |        23,328 |
## +-----------------------------------+--------------------------+---------------+
## | dropout (Dropout)                 | (None, 32)               |             0 |
## +-----------------------------------+--------------------------+---------------+
## | dense_3 (Dense)                   | (None, 1)                |            33 |
## +-----------------------------------+--------------------------+---------------+
## | dropout_1 (Dropout)               | (None, 1)                |             0 |
## +-----------------------------------+--------------------------+---------------+
##  Total params: 23,361 (91.25 KB)
##  Trainable params: 23,361 (91.25 KB)
##  Non-trainable params: 0 (0.00 B)

Note that subclassed models cannot be cloned by default, since their internal layer structure is not known. To achieve functionality equivalent to clone_model() in the case of a subclassed model, make sure that the model class implements get_config() (and optionally from_config()), and call:

new_model <- model$`__class__`$from_config(model$get_config())

In the case of a subclassed model, you cannot use a custom clone_function.


Configure a model for training.

Description

Configure a model for training.

Usage

## S3 method for class 'keras.src.models.model.Model'
compile(
  object,
  optimizer = "rmsprop",
  loss = NULL,
  metrics = NULL,
  ...,
  loss_weights = NULL,
  weighted_metrics = NULL,
  run_eagerly = FALSE,
  steps_per_execution = 1L,
  jit_compile = "auto",
  auto_scale_loss = TRUE
)

Arguments

object

Keras model object

optimizer

String (name of optimizer) or optimizer instance. See ⁠optimizer_*⁠ family.

loss

Loss function. May be:

  • a string (name of builtin loss function),

  • a custom function, or

  • a Loss instance (returned by the ⁠loss_*⁠ family of functions).

A loss function is any callable with the signature loss = fn(y_true, y_pred), where y_true are the ground truth values, and y_pred are the model's predictions. y_true should have shape (batch_size, d1, .. dN) (except in the case of sparse loss functions such as sparse categorical crossentropy, which expect integer arrays of shape (batch_size, d1, .. dN-1)). y_pred should have shape (batch_size, d1, .. dN). The loss function should return a float tensor.

metrics

List of metrics to be evaluated by the model during training and testing. Each of these can be:

  • a string (name of a built-in function),

  • a function, optionally with a "name" attribute or

  • a Metric() instance. See the ⁠metric_*⁠ family of functions.

Typically you will use metrics = c('accuracy'). A function is any callable with the signature result = fn(y_true, y_pred). To specify different metrics for different outputs of a multi-output model, you could also pass a named list, such as metrics = list(a = 'accuracy', b = c('accuracy', 'mse')). You can also pass a list to specify a metric or a list of metrics for each output, such as metrics = list(c('accuracy'), c('accuracy', 'mse')) or metrics = list('accuracy', c('accuracy', 'mse')). When you pass the strings 'accuracy' or 'acc', we convert this to one of metric_binary_accuracy(), metric_categorical_accuracy(), metric_sparse_categorical_accuracy() based on the shapes of the targets and of the model output. A similar conversion is done for the strings "crossentropy" and "ce" as well. The metrics passed here are evaluated without sample weighting; if you would like sample weighting to apply, you can specify your metrics via the weighted_metrics argument instead.

If providing an anonymous R function, you can customize the printed name during training by assigning attr(<fn>, "name") <- "my_custom_metric_name", or by calling custom_metric("my_custom_metric_name", <fn>).
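Setting the "name" attribute is plain base R and does not change how the function behaves. A minimal sketch follows; the metric body uses ordinary R numerics only to keep the example self-contained, whereas a metric passed to compile() would receive tensors. The object name metric_fn is hypothetical.

```r
# An anonymous function with the fn(y_true, y_pred) signature.
# Plain numerics are used here for illustration; a real metric operates on
# tensors supplied by Keras.
metric_fn <- function(y_true, y_pred) mean(abs(y_true - y_pred))

# Attach the name that should be displayed during training.
attr(metric_fn, "name") <- "my_custom_metric_name"

# The attribute is stored on the function; calling it works as before.
print(attr(metric_fn, "name"))
print(metric_fn(c(1, 2), c(1, 4)))
```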

...

Additional arguments passed on to the compile() model method.

loss_weights

Optional list (named or unnamed) specifying scalar coefficients (R numerics) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If an unnamed list, it is expected to have a 1:1 mapping to the model's outputs. If a named list, it is expected to map output names (strings) to scalar coefficients.
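The weighted sum described above amounts to a few lines of arithmetic. The sketch below uses made-up per-output loss values and hypothetical output names (main, aux); it only illustrates how a named loss_weights list maps onto the outputs, not how Keras computes the losses themselves.

```r
# Made-up per-output loss values for a model with two named outputs.
output_losses <- c(main = 0.40, aux = 1.20)

# Named list mapping output names to scalar coefficients.
loss_weights <- list(main = 1.0, aux = 0.2)

# Align the coefficients with the outputs by name, then take the weighted sum.
coefs <- unlist(loss_weights)[names(output_losses)]
total_loss <- sum(coefs * output_losses)
print(total_loss)  # 1.0 * 0.40 + 0.2 * 1.20
```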

weighted_metrics

List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.

run_eagerly

Bool. If TRUE, this model's forward pass will never be compiled. It is recommended to leave this as FALSE when training (for best performance), and to set it to TRUE when debugging.

steps_per_execution

Int. The number of batches to run during each compiled function call. Running multiple batches inside a single compiled function call can greatly improve performance on TPUs or small models with a large R/Python overhead. At most, one full epoch will be run each execution. If a number larger than the size of the epoch is passed, the execution will be truncated to the size of the epoch. Note that if steps_per_execution is set to N, Callback$on_batch_begin and Callback$on_batch_end methods will only be called every N batches (i.e. before/after each compiled function execution). Not supported with the PyTorch backend.
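The grouping of batches into executions, including the truncation at the end of an epoch, can be sketched in base R. The values below (10 batches per epoch, N = 4) are arbitrary, and the sketch only mirrors the description above rather than Keras internals.

```r
steps_per_epoch     <- 10  # arbitrary number of batches in one epoch
steps_per_execution <- 4   # N batches per compiled function call

# Assign each batch to the execution that runs it.
batches    <- seq_len(steps_per_epoch)
executions <- split(batches, ceiling(batches / steps_per_execution))

# The batch-level callback hooks would fire once per element of `executions`.
n_executions <- length(executions)
sizes        <- unname(lengths(executions))
print(n_executions)
print(sizes)
```

Here the last execution covers only two batches, since an execution never runs past the end of the epoch.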

jit_compile

Bool or "auto". Whether to use XLA compilation when compiling a model. For the jax and tensorflow backends, jit_compile = "auto" enables XLA compilation if the model supports it, and disables it otherwise. For the torch backend, "auto" will default to eager execution, and jit_compile = TRUE will run with torch.compile using the "inductor" backend.

auto_scale_loss

Bool. If TRUE and the model dtype policy is "mixed_float16", the passed optimizer will be automatically wrapped in a LossScaleOptimizer, which will dynamically scale the loss to prevent underflow.

Value

This is called primarily for the side effect of modifying object in-place. The first argument object is also returned, invisibly, to enable usage with the pipe.

Examples

model |> compile(
  optimizer = optimizer_adam(learning_rate = 1e-3),
  loss = loss_binary_crossentropy(),
  metrics = c(metric_binary_accuracy(),
              metric_false_negatives())
)

See Also

Other model training:
evaluate.keras.src.models.model.Model()
predict.keras.src.models.model.Model()
predict_on_batch()
test_on_batch()
train_on_batch()


Turn off traceback filtering.

Description

Raw Keras tracebacks (also known as stack traces) involve many internal frames, which can be challenging to read through, while not being actionable for end users. By default, Keras filters internal frames in most exceptions that it raises, to keep tracebacks short, readable, and focused on what's actionable for you (your own code).

See also config_enable_traceback_filtering() and config_is_traceback_filtering_enabled().

If you have previously disabled traceback filtering via config_disable_traceback_filtering(), you can re-enable it via config_enable_traceback_filtering().

Usage

config_disable_traceback_filtering()

Value

No return value, called for side effects.

See Also

Other traceback utils:
config_enable_traceback_filtering()
config_is_traceback_filtering_enabled()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()

Other config:
config_backend()
config_disable_interactive_logging()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()


Returns the current default dtype policy object.

Description

Returns the current default dtype policy object.

Usage

config_dtype_policy()

Value

A DTypePolicy object.


Turn on traceback filtering.

Description

Raw Keras tracebacks (also known as stack traces) involve many internal frames, which can be challenging to read through, while not being actionable for end users. By default, Keras filters internal frames in most exceptions that it raises, to keep tracebacks short, readable, and focused on what's actionable for you (your own code).

See also config_disable_traceback_filtering() and config_is_traceback_filtering_enabled().

If you have previously disabled traceback filtering via config_disable_traceback_filtering(), you can re-enable it via config_enable_traceback_filtering().

Usage

config_enable_traceback_filtering()

Value

No return value, called for side effects.

See Also

Other traceback utils:
config_disable_traceback_filtering()
config_is_traceback_filtering_enabled()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()

Other config:
config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()


Check if traceback filtering is enabled.

Description

Raw Keras tracebacks (also known as stack traces) involve many internal frames, which can be challenging to read through, while not being actionable for end users. By default, Keras filters internal frames in most exceptions that it raises, to keep tracebacks short, readable, and focused on what's actionable for you (your own code).

See also config_enable_traceback_filtering() and config_disable_traceback_filtering().

If you have previously disabled traceback filtering via config_disable_traceback_filtering(), you can re-enable it via config_enable_traceback_filtering().

Usage

config_is_traceback_filtering_enabled()

Value

Boolean, TRUE if traceback filtering is enabled, and FALSE otherwise.

See Also

Other traceback utils:
config_disable_traceback_filtering()
config_enable_traceback_filtering()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()

Other config:
config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()


Reload the backend (and the Keras package).

Description

Reload the backend (and the Keras package).

Usage

config_set_backend(backend)

Arguments

backend

String. Name of the backend to switch to, e.g. "tensorflow", "jax", or "torch".

Value

Nothing, this function is called for its side effect.

Examples

config_set_backend("jax")

WARNING

Using this function is dangerous and should be done carefully. Changing the backend will NOT convert the type of any already-instantiated objects. Thus, any layers / tensors / etc. already created will no longer be usable without errors. It is strongly recommended not to keep around any Keras-originated object instances created before calling config_set_backend().

This includes any function or class instance that uses any Keras functionality. All such code needs to be re-executed after calling config_set_backend().

See Also

Other config:
config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()


Sets the default dtype policy globally.

Description

Sets the default dtype policy globally.

Usage

config_set_dtype_policy(policy)

Arguments

policy

A string or DTypePolicy object.

Value

No return value, called for side effects.

Examples

config_set_dtype_policy("mixed_float16")

Set the value of the fuzz factor used in numeric expressions.

Description

Set the value of the fuzz factor used in numeric expressions.

Usage

config_set_epsilon(value)

Arguments

value

float. New value of epsilon.

Value

No return value, called for side effects.

Examples

config_epsilon()
## [1] 1e-07

config_set_epsilon(1e-5)
config_epsilon()
## [1] 1e-05

# Set it back to the default value.
config_set_epsilon(1e-7)

See Also

Other config backend:
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_floatx()
config_set_image_data_format()

Other backend:
clear_session()
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_floatx()
config_set_image_data_format()

Other config:
config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_floatx()
config_set_image_data_format()


Set the default float dtype.

Description

Set the default float dtype.

Usage

config_set_floatx(value)

Arguments

value

String; 'bfloat16', 'float16', 'float32', or 'float64'.

Value

No return value, called for side effects.

Note

It is not recommended to set this to "float16" for training, as this will likely cause numeric stability issues. Instead, use mixed precision, which leverages a mix of float16 and float32. It can be configured by calling keras3::keras$mixed_precision$set_dtype_policy('mixed_float16').

Examples

config_floatx()
## [1] "float32"

config_set_floatx('float64')
config_floatx()
## [1] "float64"

# Set it back to float32
config_set_floatx('float32')

Raises

ValueError: In case of invalid value.

See Also

Other config backend:
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_image_data_format()

Other backend:
clear_session()
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_image_data_format()

Other config:
config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_image_data_format()


Set the value of the image data format convention.

Description

Set the value of the image data format convention.

Usage

config_set_image_data_format(data_format)

Arguments

data_format

string. 'channels_first' or 'channels_last'.

Value

No return value, called for side effects.

Examples

config_image_data_format()
## [1] "channels_last"

# Switch to 'channels_first'
keras3::config_set_image_data_format('channels_first')
config_image_data_format()
## [1] "channels_first"

# Set it back to `'channels_last'`
keras3::config_set_image_data_format('channels_last')

See Also

Other config backend:
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()

Other backend:
clear_session()
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()

Other config:
config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()


Define a custom Constraint class

Description

Base class for weight constraints.

A Constraint() instance works like a stateless function. Users who subclass the Constraint class should override the call() method, which takes a single weight parameter and returns a projected version of that parameter (e.g. normalized or clipped). Constraints can be used with various Keras layers via the kernel_constraint or bias_constraint arguments.

Here's a simple example of a non-negative weight constraint:

constraint_nonnegative <- Constraint("NonNegative",
  call = function(w) {
    w * op_cast(w >= 0, dtype = w$dtype)
  }
)
weight <- op_convert_to_tensor(c(-1, 1))
constraint_nonnegative()(weight)
## tf.Tensor([-0.  1.], shape=(2), dtype=float32)

Usage in a layer:

layer_dense(units = 4, kernel_constraint = constraint_nonnegative())
## <Dense name=dense, built=False>
##  signature: (*args, **kwargs)

Usage

Constraint(
  classname,
  call = NULL,
  get_config = NULL,
  ...,
  public = list(),
  private = list(),
  inherit = NULL,
  parent_env = parent.frame()
)

Arguments

classname

String, the name of the custom class. (Conventionally, CamelCase).

call
\(w)

Applies the constraint to the input weight variable.

By default, the input weight variable is not modified. Users should override this method to implement their own projection function.

Args:

  • w: Input weight variable.

Returns: Projected variable (by default, returns unmodified inputs).

get_config
\()

Function that returns a named list of the object config.

A constraint config is a named list (JSON-serializable) that can be used to reinstantiate the same object (via ⁠do.call(<constraint_class>, <config>)⁠).

..., public

Additional methods or public members of the custom class.

private

Named list of R objects (typically, functions) to include in instance private environments. Private methods will have all the same symbols in scope as public methods (see section "Symbols in Scope"). Each instance will have its own private environment. Any objects in private will be invisible from the Keras framework and the Python runtime.

inherit

What the custom class will subclass. By default, the base keras class.

parent_env

The R environment that all class methods will have as a grandparent.

Value

A function that returns Constraint instances, similar to the builtin constraint functions like constraint_maxnorm().

Symbols in scope

All R function custom methods (public and private) will have the following symbols in scope:

  • self: The custom class instance.

  • super: The custom class superclass.

  • private: An R environment specific to the class instance. Any objects assigned here are invisible to the Keras framework.

  • ⁠__class__⁠ and as.symbol(classname): the custom class type object.

See Also

Other constraints:
constraint_maxnorm()
constraint_minmaxnorm()
constraint_nonneg()
constraint_unitnorm()


MaxNorm weight constraint.

Description

Constrains the weights incident to each hidden unit to have a norm less than or equal to a desired value.

Usage

constraint_maxnorm(max_value = 2L, axis = 1L)

Arguments

max_value

the maximum norm value for the incoming weights.

axis

integer, axis along which to calculate weight norms. For instance, in a Dense layer the weight matrix has shape (input_dim, output_dim); set axis to 1 to constrain each weight vector of length (input_dim,). In a Conv2D layer with data_format = "channels_last", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to c(1, 2, 3) to constrain the weights of each filter tensor of size (rows, cols, input_depth).

Value

A Constraint instance, a callable that can be passed to layer constructors or used directly by calling it with tensors.
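
To make the axis argument concrete, the following base R sketch computes the norms that a max-norm constraint compares against max_value, first for a Dense-style kernel and then for a Conv2D-style kernel. It uses plain R arrays rather than Keras tensors, leaves out the small epsilon a real implementation would add to avoid division by zero, and all variable names are illustrative; it is not the keras3 implementation.

```r
# Dense-style kernel with shape (input_dim = 3, output_dim = 2).
dense_kernel <- matrix(c(1, 2, 2,
                         0, 3, 4), nrow = 3)

# Reduce over the input_dim axis: one norm per output unit (column).
dense_norms <- sqrt(colSums(dense_kernel^2))

# Conv2D-style kernel with shape (rows, cols, input_depth, output_depth).
conv_kernel <- array(1, dim = c(2, 2, 3, 4))

# Reduce over rows, cols and input_depth: one norm per filter.
conv_norms <- sqrt(apply(conv_kernel^2, 4, sum))

# Max-norm projection: shrink only the columns whose norm exceeds max_value.
max_value <- 4
scale <- pmin(dense_norms, max_value) / dense_norms
constrained <- sweep(dense_kernel, 2, scale, FUN = "*")

dense_norms                      # 3 and 5 before the projection
sqrt(colSums(constrained^2))     # 3 and 4 after the projection
```

Only the second column is rescaled, because its norm (5) is the only one above max_value.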

See Also

Other constraints:
Constraint()
constraint_minmaxnorm()
constraint_nonneg()
constraint_unitnorm()


MinMaxNorm weight constraint.

Description

Constrains the weights incident to each hidden unit to have the norm between a lower bound and an upper bound.

Usage

constraint_minmaxnorm(min_value = 0, max_value = 1, rate = 1, axis = 1L)

Arguments

min_value

the minimum norm for the incoming weights.

max_value

the maximum norm for the incoming weights.

rate

rate for enforcing the constraint: weights will be rescaled to yield (1 - rate) * norm + rate * op_clip(norm, min_value, max_value). Effectively, this means that rate = 1.0 stands for strict enforcement of the constraint, while rate < 1.0 means that weights will be rescaled at each step to slowly move towards a value inside the desired interval.

axis

integer, axis along which to calculate weight norms. For instance, in a Dense layer the weight matrix has shape (input_dim, output_dim); set axis to 1 to constrain each weight vector of length (input_dim,). In a Conv2D layer with data_format = "channels_last", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to c(1, 2, 3) to constrain the weights of each filter tensor of size (rows, cols, input_depth).

Value

A Constraint instance, a callable that can be passed to layer constructors or used directly by calling it with tensors.
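
The effect of rate can be sketched in base R. The helper below is hypothetical and works on a plain matrix with norms taken per column; the epsilon default is an assumption added to avoid division by zero. It illustrates the rescaling rule described for rate and is not the keras3 implementation.

```r
# Hypothetical helper written in base R; not the keras3 implementation.
# `w` is an (input_dim, output_dim) matrix; norms are taken per column.
min_max_norm <- function(w, min_value = 0, max_value = 1, rate = 1,
                         epsilon = 1e-7) {
  norms <- sqrt(colSums(w^2))
  clipped <- pmin(pmax(norms, min_value), max_value)
  # Blend the current norm with the clipped norm according to `rate`.
  desired <- (1 - rate) * norms + rate * clipped
  sweep(w, 2, desired / (epsilon + norms), FUN = "*")
}

w <- matrix(c(3, 4, 0.3, 0.4), nrow = 2)  # column norms: 5 and 0.5

strict  <- min_max_norm(w, min_value = 1, max_value = 2, rate = 1)
gradual <- min_max_norm(w, min_value = 1, max_value = 2, rate = 0.5)

sqrt(colSums(strict^2))   # approximately 2 and 1
sqrt(colSums(gradual^2))  # approximately 3.5 and 0.75
```

With rate = 1 both columns land on the nearest bound in a single step; with rate = 0.5 each norm moves halfway towards it.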

See Also

Other constraints:
Constraint()
constraint_maxnorm()
constraint_nonneg()
constraint_unitnorm()


Constrains the weights to be non-negative.

Description

Constrains the weights to be non-negative.

Usage

constraint_nonneg()

Value

A Constraint instance, a callable that can be passed to layer constructors or used directly by calling it with tensors.

See Also

Other constraints:
Constraint()
constraint_maxnorm()
constraint_minmaxnorm()
constraint_unitnorm()


Constrains the weights incident to each hidden unit to have unit norm.

Description

Constrains the weights incident to each hidden unit to have unit norm.

Usage

constraint_unitnorm(axis = 1L)

Arguments

axis

integer, axis along which to calculate weight norms. For instance, in a Dense layer the weight matrix has shape (input_dim, output_dim); set axis to 1 to constrain each weight vector of length (input_dim,). In a Conv2D layer with data_format = "channels_last", the weight tensor has shape (rows, cols, input_depth, output_depth); set axis to c(1, 2, 3) to constrain the weights of each filter tensor of size (rows, cols, input_depth).

Value

A Constraint instance, a callable that can be passed to layer constructors or used directly by calling it with tensors.

See Also

Other constraints:
Constraint()
constraint_maxnorm()
constraint_minmaxnorm()
constraint_nonneg()


Count the total number of scalars composing the weights.

Description

Count the total number of scalars composing the weights.

Usage

count_params(object)

Arguments

object

Layer or model object

Value

An integer count
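
As a sanity check on the returned count, the number of scalars in a dense layer can be worked out by hand: one kernel weight per (input, unit) pair plus one bias per unit. The helper below is a hypothetical base R sketch of that arithmetic, not part of keras3.

```r
# Hypothetical helper: scalars in a dense layer's kernel plus its bias vector.
dense_param_count <- function(input_dim, units, use_bias = TRUE) {
  kernel <- input_dim * units
  bias <- if (use_bias) units else 0
  kernel + bias
}

# A 784 -> 64 -> 10 stack of dense layers.
layer_counts <- c(
  dense_param_count(784, 64),
  dense_param_count(64, 10)
)

layer_counts        # 50240 and 650
sum(layer_counts)   # 50890
```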

See Also

Other layer methods:
get_config()
get_weights()
quantize_weights()
reset_state()


Custom metric function

Description

Custom metric function

Usage

custom_metric(name, metric_fn)

Arguments

name

name used to show training progress output

metric_fn

An R function with signature ⁠function(y_true, y_pred)⁠ that accepts tensors.

Details

You can provide an arbitrary R function as a custom metric. Note that the y_true and y_pred parameters are tensors, so computations on them should use ⁠op_*⁠ tensor functions.

Use the custom_metric() function to define a custom metric. Note that a name ('mean_pred') is provided for the custom metric function: this name is used within training progress output.

If you want to save and load a model with custom metrics, you should also call register_keras_serializable(), or specify the metric in the call to load_model(). For example: load_model("my_model.keras", c('mean_pred' = metric_mean_pred)).

Alternatively, you can wrap all of your code in a call to with_custom_object_scope(), which will allow you to refer to the metric by name just like you do with built-in Keras metrics.

Alternative ways of supplying custom metrics:

  • ⁠custom_metric():⁠ Arbitrary R function.

  • metric_mean_wrapper(): Wrap an arbitrary R function in a Metric instance.

  • Create a custom Metric() subclass.
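
The 'mean_pred' metric referred to in the Details could be defined along the following lines. This is a minimal sketch that assumes a working keras3 installation with a configured backend; the two-layer model is a placeholder chosen only to show where the metric is passed.

```r
library(keras3)

# The name "mean_pred" is what appears in the training progress output.
metric_mean_pred <- custom_metric("mean_pred", function(y_true, y_pred) {
  # y_true and y_pred are tensors, so use op_*() functions on them.
  op_mean(y_pred)
})

# Placeholder model, used only to show how the metric is supplied.
model <- keras_model_sequential(input_shape = 4) |>
  layer_dense(units = 1)

model |> compile(
  optimizer = "adam",
  loss = "mse",
  metrics = list(metric_mean_pred)
)
```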

Value

A callable function with a ⁠__name__⁠ attribute.

See Also

Other metrics:
Metric()
metric_auc()
metric_binary_accuracy()
metric_binary_crossentropy()
metric_binary_focal_crossentropy()
metric_binary_iou()
metric_categorical_accuracy()
metric_categorical_crossentropy()
metric_categorical_focal_crossentropy()
metric_categorical_hinge()
metric_cosine_similarity()
metric_f1_score()
metric_false_negatives()
metric_false_positives()
metric_fbeta_score()
metric_hinge()
metric_huber()
metric_iou()
metric_kl_divergence()
metric_log_cosh()
metric_log_cosh_error()
metric_mean()
metric_mean_absolute_error()
metric_mean_absolute_percentage_error()
metric_mean_iou()
metric_mean_squared_error()
metric_mean_squared_logarithmic_error()
metric_mean_wrapper()
metric_one_hot_iou()
metric_one_hot_mean_iou()
metric_poisson()
metric_precision()
metric_precision_at_recall()
metric_r2_score()
metric_recall()
metric_recall_at_precision()
metric_root_mean_squared_error()
metric_sensitivity_at_specificity()
metric_sparse_categorical_accuracy()
metric_sparse_categorical_crossentropy()
metric_sparse_top_k_categorical_accuracy()
metric_specificity_at_sensitivity()
metric_squared_hinge()
metric_sum()
metric_top_k_categorical_accuracy()
metric_true_negatives()
metric_true_positives()


Boston housing price regression dataset

Description

Dataset taken from the StatLib library which is maintained at Carnegie Mellon University.

Usage

dataset_boston_housing(
  path = "boston_housing.npz",
  test_split = 0.2,
  seed = 113L
)

Arguments

path

Path where to cache the dataset locally (relative to ~/.keras/datasets).

test_split

fraction of the data to reserve as test set.

seed

Random seed for shuffling the data before computing the test split.

Value

Lists of training and test data: ⁠train$x, train$y, test$x, test$y⁠.

Samples contain 13 attributes of houses at different locations around the Boston suburbs in the late 1970s. Targets are the median values of the houses at a location (in k$).
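
The interaction of test_split and seed can be sketched as a seeded shuffle followed by holding out the trailing fraction of the shuffled rows. The helper below is hypothetical and uses base R's random number generator, which differs from the one the library uses, so the rows it selects will not match those returned by dataset_boston_housing().

```r
# Hypothetical helper; illustrates the split arithmetic only.
split_indices <- function(n, test_split = 0.2, seed = 113L) {
  set.seed(seed)
  shuffled <- sample.int(n)
  n_train <- floor(n * (1 - test_split))
  list(
    train = shuffled[seq_len(n_train)],
    test = shuffled[n_train + seq_len(n - n_train)]
  )
}

idx <- split_indices(n = 10, test_split = 0.2)
length(idx$train)  # 8
length(idx$test)   # 2
```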

See Also

Other datasets:
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
dataset_reuters()


CIFAR10 small image classification

Description

Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 10,000 test images.

Usage

dataset_cifar10()

Value

Lists of training and test data: ⁠train$x, train$y, test$x, test$y⁠.

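The x data is an array of RGB image data with shape (num_samples, 32, 32, 3) under the default "channels_last" image data format, or (num_samples, 3, 32, 32) under "channels_first".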

The y data is an array of category labels (integers in range 0-9) with shape (num_samples).
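
If image data needs to be moved between the channels-first layout (num_samples, 3, 32, 32) and the channels-last layout (num_samples, 32, 32, 3), base R's aperm() can permute the axes. The sketch below uses a small stand-in array rather than the real dataset.

```r
# Stand-in for a batch of 5 RGB images in channels-first layout:
# (num_samples, channels, height, width).
x_channels_first <- array(seq_len(5 * 3 * 32 * 32), dim = c(5, 3, 32, 32))

# Move the channel axis to the end: (num_samples, height, width, channels).
x_channels_last <- aperm(x_channels_first, c(1, 3, 4, 2))

dim(x_channels_last)  # 5 32 32 3

# The same pixel is addressed with the channel index in a different position.
same_pixel <- identical(
  x_channels_first[2, 3, 10, 20],
  x_channels_last[2, 10, 20, 3]
)
same_pixel  # TRUE
```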

See Also

Other datasets:
dataset_boston_housing()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
dataset_reuters()


CIFAR100 small image classification

Description

Dataset of 50,000 32x32 color training images, labeled over 100 categories, and 10,000 test images.

Usage

dataset_cifar100(label_mode = c("fine", "coarse"))

Arguments

label_mode

one of "fine", "coarse".

Value

Lists of training and test data: ⁠train$x, train$y, test$x, test$y⁠.

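The x data is an array of RGB image data with shape (num_samples, 32, 32, 3) under the default "channels_last" image data format, or (num_samples, 3, 32, 32) under "channels_first".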

The y data is an array of category labels with shape (num_samples).

See Also

Other datasets:
dataset_boston_housing()
dataset_cifar10()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
dataset_reuters()


Fashion-MNIST database of fashion articles

Description

Dataset of 60,000 28x28 grayscale images of the 10 fashion article classes, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The class labels are encoded as integers from 0-9 which correspond to T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, and Ankle boot.

Usage

dataset_fashion_mnist()

Details

Dataset of 60,000 28x28 grayscale images of 10 fashion categories, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The class labels are:

  • 0 - T-shirt/top

  • 1 - Trouser

  • 2 - Pullover

  • 3 - Dress

  • 4 - Coat

  • 5 - Sandal

  • 6 - Shirt

  • 7 - Sneaker

  • 8 - Bag

  • 9 - Ankle boot
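
Because the labels start at 0 while R vectors are indexed from 1, decoding labels into class names needs an offset. A minimal base R sketch, with a made-up label vector standing in for train$y:

```r
# Class names in label order; label 0 is the first element.
fashion_classes <- c(
  "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
  "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"
)

# Hypothetical label vector, standing in for train$y.
labels <- c(9L, 0L, 0L, 3L, 7L)

# Shift by one to convert 0-based labels into 1-based R indices.
decoded <- fashion_classes[labels + 1]
decoded
```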

Value

Lists of training and test data: ⁠train$x, train$y, test$x, test$y⁠, where x is an array of grayscale image data with shape (num_samples, 28, 28) and y is an array of article labels (integers in range 0-9) with shape (num_samples).

See Also

Other datasets:
dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_imdb()
dataset_mnist()
dataset_reuters()


IMDB Movie reviews sentiment classification

Description

Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".

Usage

dataset_imdb(
  path = "imdb.npz",
  num_words = NULL,
  skip_top = 0L,
  maxlen = NULL,
  seed = 113L,
  start_char = 1L,
  oov_char = 2L,
  index_from = 3L
)

dataset_imdb_word_index(path = "imdb_word_index.json")

Arguments

path

Where to cache the data (relative to ~/.keras/datasets).

num_words

Max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.

skip_top

Skip the top N most frequently occurring words (which may not be informative).

maxlen

sequences longer than this will be filtered out.

seed

random seed for sample shuffling.

start_char

The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.

oov_char

Words that were cut out because of the num_words or skip_top limit will be replaced with this character.

index_from

Index actual words with this index and higher.

Details

As a convention, "0" does not stand for a specific word. It is reserved for padding (see start_char), while words removed by the num_words or skip_top limits are encoded as oov_char.

Value

Lists of training and test data: ⁠train$x, train$y, test$x, test$y⁠.

The x data includes integer sequences. If the num_words argument was specified, the maximum possible index value is num_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen.

The y data includes a set of integer labels (0 or 1).

The dataset_imdb_word_index() function returns a list where the names are words and the values are integers.
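
How start_char, oov_char, index_from, num_words and skip_top combine can be sketched in base R. The helper below is hypothetical: it takes a review expressed as word frequency ranks and encodes it following the argument descriptions above. It is a simplified illustration, not the library's loader.

```r
# Hypothetical helper; `ranks` are word frequency ranks (1 = most frequent).
encode_review <- function(ranks, num_words = NULL, skip_top = 0L,
                          start_char = 1L, oov_char = 2L, index_from = 3L) {
  # Shift word ranks up by index_from and mark the start of the sequence.
  ids <- c(start_char, ranks + index_from)
  # Without num_words there is no upper limit on the index.
  if (is.null(num_words)) num_words <- Inf
  # Anything outside [skip_top, num_words) becomes the out-of-vocabulary id.
  keep <- ids >= skip_top & ids < num_words
  ids[!keep] <- oov_char
  ids
}

# Ranks 1, 5 and 20 become 4, 8 and 23; 23 exceeds the limit and is replaced.
encoded <- encode_review(c(1L, 5L, 20L), num_words = 10L)
encoded  # 1 4 8 2
```

The largest index that survives is num_words - 1, which matches the bound stated for the x data.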

See Also

Other datasets:
dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_mnist()
dataset_reuters()


MNIST database of handwritten digits

Description

Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.

Usage

dataset_mnist(path = "mnist.npz")

Arguments

path

Path where to cache the dataset locally (relative to ~/.keras/datasets).

Value

Lists of training and test data: ⁠train$x, train$y, test$x, test$y⁠, where x is an array of grayscale image data with shape (num_samples, 28, 28) and y is an array of digit labels (integers in range 0-9) with shape (num_samples).

See Also

Other datasets:
dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_reuters()


Reuters newswire topics classification

Description

Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with dataset_imdb(), each wire is encoded as a sequence of word indexes (same conventions).

Usage

dataset_reuters(
  path = "reuters.npz",
  num_words = NULL,
  skip_top = 0L,
  maxlen = NULL,
  test_split = 0.2,
  seed = 113L,
  start_char = 1L,
  oov_char = 2L,
  index_from = 3L
)

dataset_reuters_word_index(path = "reuters_word_index.pkl")

Arguments

path

Where to cache the data (relative to ~/.keras/datasets).

num_words

Max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.

skip_top

Skip the top N most frequently occurring words (which may not be informative).

maxlen

Truncate sequences after this length.

test_split

Fraction of the dataset to be used as test data.

seed

Random seed for sample shuffling.

start_char

The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.

oov_char

Words that were cut out because of the num_words or skip_top limit will be replaced with this character.

index_from

Index actual words with this index and higher.

Value

Lists of training and test data: train$x, train$y, test$x, test$y, with the same format as dataset_imdb(). The dataset_reuters_word_index() function returns a list where the names are words and the values are integers, e.g. word_index[["giraffe"]] might return 1234.

See Also

Other datasets:
dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()


Retrieve the object by deserializing the config dict.

Description

The config dict is a Python dictionary that consists of a set of key-value pairs, and represents a Keras object, such as an Optimizer, Layer, Metrics, etc. The saving and loading library uses the following keys to record information of a Keras object:

  • class_name: String. This is the name of the class, as exactly defined in the source code, such as "LossesContainer".

  • config: Named List. Library-defined or user-defined key-value pairs that store the configuration of the object, as obtained by object$get_config().

  • module: String. The path of the Python module. Built-in Keras classes are expected to have the prefix keras.

  • registered_name: String. The key the class is registered under via register_keras_serializable(package, name) API. The key has the format of '{package}>{name}', where package and name are the arguments passed to register_keras_serializable(). If name is not provided, it uses the class name. If registered_name successfully resolves to a class (that was registered), the class_name and config values in the config dict will not be used. registered_name is only used for non-built-in classes.

For example, the following config list represents the built-in Adam optimizer with the relevant config:

config <- list(
  class_name = "Adam",
  config = list(
    amsgrad = FALSE,
    beta_1 = 0.8999999761581421,
    beta_2 = 0.9990000128746033,
    epsilon = 1e-07,
    learning_rate = 0.0010000000474974513,
    name = "Adam"
  ),
  module = "keras.optimizers",
  registered_name = NULL
)
# Returns an `Adam` instance identical to the original one.
deserialize_keras_object(config)
## <keras.src.optimizers.adam.Adam object>

If the class does not have an exported Keras namespace, the library tracks it by its module and class_name. For example:

config <- list(
  class_name = "MetricsList",
  config =  list(
    ...
  ),
  module = "keras.trainers.compile_utils",
  registered_name = "MetricsList"
)

# Returns a `MetricsList` instance identical to the original one.
deserialize_keras_object(config)

And the following config represents a user-customized MeanSquaredError loss:

# define a custom object
loss_modified_mse <- Loss(
  "ModifiedMeanSquaredError",
  inherit = loss_mean_squared_error)

# register the custom object
register_keras_serializable(loss_modified_mse)

# confirm object is registered
get_custom_objects()
## $`keras3>ModifiedMeanSquaredError`
## <class '<r-namespace:keras3>.ModifiedMeanSquaredError'>
##  signature: (
##    reduction='sum_over_batch_size',
##    name='mean_squared_error',
##    dtype=None
## )

get_registered_name(loss_modified_mse)
## [1] "keras3>ModifiedMeanSquaredError"

# now custom object instances can be serialized
full_config <- serialize_keras_object(loss_modified_mse())

# the `config` arguments will be passed to loss_modified_mse()
str(full_config)
## List of 4
##  $ module         : chr "<r-namespace:keras3>"
##  $ class_name     : chr "ModifiedMeanSquaredError"
##  $ config         :List of 2
##   ..$ name     : chr "mean_squared_error"
##   ..$ reduction: chr "sum_over_batch_size"
##  $ registered_name: chr "keras3>ModifiedMeanSquaredError"

# and custom object instances can be deserialized
deserialize_keras_object(full_config)
## <<r-namespace:keras3>.ModifiedMeanSquaredError object>
##  signature: (y_true, y_pred, sample_weight=None)

# Returns the `ModifiedMeanSquaredError` object

Usage

deserialize_keras_object(config, custom_objects = NULL, safe_mode = TRUE, ...)

Arguments

config

Named list describing the object.

custom_objects

Named list containing a mapping between custom object names and the corresponding classes or functions.

safe_mode

Boolean, whether to disallow unsafe lambda deserialization. When safe_mode=FALSE, loading an object has the potential to trigger arbitrary code execution. This argument is only applicable to the Keras v3 model format. Defaults to TRUE.

...

For forward/backward compatibility.

Value

The object described by the config dictionary.

See Also

Other serialization utilities:
get_custom_objects()
get_registered_name()
get_registered_object()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()


Evaluate a Keras Model

Description

This function returns the loss value and metrics values for the model in test mode. Computation is done in batches (see the batch_size argument).

Usage

## S3 method for class 'keras.src.models.model.Model'
evaluate(
  object,
  x = NULL,
  y = NULL,
  ...,
  batch_size = NULL,
  verbose = getOption("keras.verbose", default = "auto"),
  sample_weight = NULL,
  steps = NULL,
  callbacks = NULL
)

Arguments

object

Keras model object

x

Input data. It could be:

  • An R array (or array-like), or a list of arrays (in case the model has multiple inputs).

  • A tensor, or a list of tensors (in case the model has multiple inputs).

  • A named list mapping input names to the corresponding array/tensors, if the model has named inputs.

  • A tf.data.Dataset. Should return a tuple of either ⁠(inputs, targets)⁠ or ⁠(inputs, targets, sample_weights)⁠.

  • A generator returning ⁠(inputs, targets)⁠ or ⁠(inputs, targets, sample_weights)⁠.

y

Target data. Like the input data x, it could be either R array(s) or backend-native tensor(s). If x is a tf.data.Dataset or generator function, y should not be specified (since targets will be obtained from the iterator/dataset).

...

For forward/backward compatibility.

batch_size

Integer or NULL. Number of samples per batch of computation. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of a TF Dataset or generator (since they generate batches).

verbose

"auto", 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = single line. "auto" becomes 1 for most cases, 2 if in a knitr render or running on a distributed training server. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (e.g. in a production environment). Defaults to "auto".

sample_weight

Optional array of weights for the test samples, used for weighting the loss function. You can either pass a flat (1D) R array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. This argument is not supported when x is a TF Dataset; instead, pass sample weights as the third element of x.

steps

Integer or NULL. Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of NULL. If x is a tf.data.Dataset and steps is NULL, evaluation will run until the dataset is exhausted.

callbacks

List of Callback instances to apply during evaluation.

Value

Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model$metrics_names will give you the display labels for the scalar outputs.
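
A typical call might look like the sketch below. It assumes a working keras3 installation with a configured backend; the model and the random data are placeholders used only to show the shape of the call, so the reported values carry no meaning.

```r
library(keras3)

# Placeholder model and random data, used only to show the call shape.
model <- keras_model_sequential(input_shape = 4) |>
  layer_dense(units = 1)
model |> compile(optimizer = "adam", loss = "mse", metrics = list("mae"))

x_test <- matrix(runif(64 * 4), nrow = 64)
y_test <- matrix(runif(64), ncol = 1)

# 64 samples with batch_size = 16 are processed in 4 batches.
results <- model |> evaluate(x_test, y_test, batch_size = 16, verbose = 0)
str(results)
```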

See Also

Other model training:
compile.keras.src.models.model.Model()
predict.keras.src.models.model.Model()
predict_on_batch()
test_on_batch()
train_on_batch()


Create a TF SavedModel artifact for inference (e.g. via TF-Serving).

Description

Create a TF SavedModel artifact for inference (e.g. via TF-Serving).

Note: This can currently only be used with the TensorFlow or JAX backends.

This method lets you export a model to a lightweight SavedModel artifact that contains the model's forward pass only (its call() method) and can be served via e.g. TF-Serving. The forward pass is registered under the name serve() (see example below).

The original code of the model (including any custom layers you may have used) is no longer necessary to reload the artifact – it is entirely standalone.

Usage

## S3 method for class 'keras.src.models.model.Model'
export_savedmodel(object, export_dir_base, ...)

Arguments

object

A keras model.

export_dir_base

string, file path where to save the artifact.

...

For forward/backward compatibility.

Value

This is called primarily for the side effect of exporting object. The first argument, object, is also returned, invisibly, to enable usage with the pipe.

Examples

# Create the artifact
model |> tensorflow::export_savedmodel("path/to/location")

# Later, in a different process/environment...
library(tensorflow)
reloaded_artifact <- tf$saved_model$load("path/to/location")
predictions <- reloaded_artifact$serve(input_data)

# see tfdeploy::serve_savedmodel() for serving a model over a local web api.

See Also

Other saving and loading functions:
layer_tfsm()
load_model()
load_model_weights()
register_keras_serializable()
save_model()
save_model_config()
save_model_weights()
with_custom_object_scope()


Train a model for a fixed number of epochs (dataset iterations).

Description

Train a model for a fixed number of epochs (dataset iterations).

Usage

## S3 method for class 'keras.src.models.model.Model'
fit(
  object,
  x = NULL,
  y = NULL,
  ...,
  batch_size = NULL,
  epochs = 1L,
  callbacks = NULL,
  validation_split = 0,
  validation_data = NULL,
  shuffle = TRUE,
  class_weight = NULL,
  sample_weight = NULL,
  initial_epoch = 1L,
  steps_per_epoch = NULL,
  validation_steps = NULL,
  validation_batch_size = NULL,
  validation_freq = 1L,
  verbose = getOption("keras.verbose", default = "auto"),
  view_metrics = getOption("keras.view_metrics", default = "auto")
)

Arguments

object

Keras model object

x

Input data. It could be:

  • An array (or array-like), or a list of arrays (in case the model has multiple inputs).

  • A tensor, or a list of tensors (in case the model has multiple inputs).

  • A named list mapping input names to the corresponding array/tensors, if the model has named inputs.

  • A tf.data.Dataset. Should return a tuple of either ⁠(inputs, targets)⁠ or ⁠(inputs, targets, sample_weights)⁠.

  • A generator returning ⁠(inputs, targets)⁠ or ⁠(inputs, targets, sample_weights)⁠.

y

Target data. Like the input data x, it could be either array(s) or backend-native tensor(s). If x is a TF Dataset or generator, y should not be specified (since targets will be obtained from x).

...

Additional arguments passed on to the model fit() method.

batch_size

Integer or NULL. Number of samples per gradient update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of TF Datasets or generators (since they generate batches).

epochs

Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided (unless the steps_per_epoch flag is set to something other than NULL). Note that in conjunction with initial_epoch, epochs is to be understood as "final epoch". The model is not trained for a number of iterations given by epochs, but merely until the epoch of index epochs is reached.

callbacks

List of Callback() instances to apply during training. See callback_*.

validation_split

Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. This argument is not supported when x is a TF Dataset or generator. If both validation_data and validation_split are provided, validation_data will override validation_split.

validation_data

Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. Note that the validation loss computed on data provided via validation_split or validation_data is not affected by regularization layers such as noise and dropout. validation_data will override validation_split. It could be:

  • A tuple ⁠(x_val, y_val)⁠ of arrays or tensors.

  • A tuple ⁠(x_val, y_val, val_sample_weights)⁠ of arrays.

  • A generator returning ⁠(inputs, targets)⁠ or ⁠(inputs, targets, sample_weights)⁠.

shuffle

Boolean, whether to shuffle the training data before each epoch. This argument is ignored when x is a generator or a TF Dataset.

class_weight

Optional named list mapping class indices (integers, 0-based) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class. When class_weight is specified and targets have a rank of 2 or greater, either y must be one-hot encoded, or an explicit final dimension of 1 must be included for sparse class labels.

sample_weight

Optional array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) array/vector with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array (matrix) with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. This argument is not supported when x is a TF Dataset or generator; instead, provide the sample weights as the third element of x. Note that sample weighting does not apply to metrics specified via the metrics argument in compile(). To apply sample weighting to your metrics, specify them via the weighted_metrics argument in compile() instead.

initial_epoch

Integer. Epoch at which to start training (useful for resuming a previous training run).

steps_per_epoch

Integer or NULL. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as backend-native tensors, the default NULL is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If x is a TF Dataset, and steps_per_epoch is NULL, the epoch will run until the input dataset is exhausted. When passing an infinitely repeating dataset, you must specify the steps_per_epoch argument. If steps_per_epoch = -1, the training will run indefinitely with an infinitely repeating dataset.

validation_steps

Only relevant if validation_data is provided. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch. If validation_steps is NULL, validation will run until the validation_data dataset is exhausted. In the case of an infinitely repeated dataset, it will run into an infinite loop. If validation_steps is specified and only part of the dataset will be consumed, the evaluation will start from the beginning of the dataset at each epoch. This ensures that the same validation samples are used every time.

validation_batch_size

Integer or NULL. Number of samples per validation batch. If unspecified, will default to batch_size. Do not specify the validation_batch_size if your data is in the form of TF Datasets or generator instances (since they generate batches).

validation_freq

Only relevant if validation data is provided. Specifies how many training epochs to run before a new validation run is performed, e.g. validation_freq=2 runs validation every 2 epochs.

verbose

"auto", 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. "auto" becomes 1 for most cases, 2 if in a knitr render or running on a distributed training server. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (e.g., in a production environment). Defaults to "auto".

view_metrics

View realtime plot of training metrics (by epoch). The default ("auto") displays the plot when all of the following hold: the session is running within RStudio, metrics were specified during model compile(), epochs > 1, and verbose > 0. Use options(keras.view_metrics = ) to establish a different default.
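The interplay of validation_split, batch_size, and epochs/initial_epoch described above is mostly arithmetic. The base-R sketch below is only an illustration and is not keras3 code: the helper names (split_indices, steps_for, epochs_to_run) are hypothetical, and the rounding rules (floor for the split point, ceiling for a final partial batch) as well as the 1-based reading of initial_epoch are assumptions that the argument descriptions do not spell out.

```r
# Hypothetical helpers illustrating the argument semantics described above.
# None of these are keras3 functions; the rounding rules are assumptions.

# validation_split: the *last* fraction of samples is held out, before shuffling.
split_indices <- function(n_samples, validation_split) {
  split_at <- floor(n_samples * (1 - validation_split))  # assumed rounding
  list(
    train = seq_len(split_at),
    val   = if (split_at < n_samples) seq.int(split_at + 1L, n_samples) else integer()
  )
}

# steps_per_epoch = NULL: samples divided by batch size, counting a final
# partial batch as one step (assumed).
steps_for <- function(n_samples, batch_size = 32L) {
  as.integer(ceiling(n_samples / batch_size))
}

# epochs is the index of the final epoch, not a count; initial_epoch is
# taken to be 1-based here, matching its default of 1L.
epochs_to_run <- function(epochs, initial_epoch = 1L) {
  if (initial_epoch > epochs) integer() else seq.int(initial_epoch, epochs)
}

idx <- split_indices(n_samples = 100, validation_split = 0.25)
range(idx$train)              # 1 75
range(idx$val)                # 76 100
steps_for(length(idx$train))  # 3
epochs_to_run(epochs = 10, initial_epoch = 6)  # 6 7 8 9 10
```

Under these assumptions, resuming at initial_epoch = 6 with epochs = 10 runs five more epochs, not ten.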

Details

Unpacking behavior for iterator-like inputs:

A common pattern is to pass an iterator-like object such as a tf.data.Dataset or a generator to fit(), which will yield not only features (x) but optionally targets (y) and sample weights (sample_weight). Keras requires that the output of such iterator-likes be unambiguous. The iterator should return a tuple() of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively. Any other type provided will be wrapped in a length-one tuple(), effectively treating everything as x. When yielding named lists, they should still adhere to the top-level tuple structure, e.g. tuple(list(x0 = x0, x1 = x1), y). Keras will not attempt to separate features, targets, and weights from the names of a single named list.
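The unpacking rule can be pictured with a small base-R sketch. This is not how keras3 implements it: unpack_batch is a hypothetical helper, and an unnamed R list stands in for tuple() so that the snippet runs without a Python backend.

```r
# Hypothetical helper, not part of keras3: classifies one yielded batch
# according to the rule described above. An unnamed list stands in for tuple().
unpack_batch <- function(batch) {
  is_tuple_like <- is.list(batch) && is.null(names(batch))
  if (!is_tuple_like) {
    batch <- list(batch)  # anything else is treated entirely as x
  }
  if (length(batch) < 1L || length(batch) > 3L) {
    stop("Expected a tuple of length 1, 2, or 3, got length ", length(batch))
  }
  list(
    x = batch[[1]],
    y = if (length(batch) >= 2L) batch[[2]] else NULL,
    sample_weight = if (length(batch) >= 3L) batch[[3]] else NULL
  )
}

x0 <- matrix(0, nrow = 4, ncol = 2)
x1 <- matrix(1, nrow = 4, ncol = 3)
y  <- rep(1, 4)

# Named inputs nested inside the top-level tuple structure:
# x is the named list, y is the target.
ok <- unpack_batch(list(list(x0 = x0, x1 = x1), y))
names(ok$x)     # "x0" "x1"

# A single named list is *not* split by name: all of it becomes x.
flat <- unpack_batch(list(x0 = x0, y = y))
names(flat$x)   # "x0" "y"
is.null(flat$y) # TRUE
```

The second call shows why a generator should wrap named inputs in the outer tuple structure: without it, the element named y is treated as just another feature.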

Value

A keras_training_history object, which is a named list: list(params = <params>, metrics = <metrics>), with S3 methods print(), plot(), and as.data.frame(). The metrics field is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).
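A minimal sketch of a fit() call and of inspecting the returned history follows. The data is synthetic and the tiny architecture is arbitrary, chosen only so that the snippet is self-contained; it assumes keras3 and a working backend are installed.

```r
library(keras3)

# Synthetic binary classification data (illustrative only).
set.seed(1)
x <- matrix(rnorm(200 * 4), nrow = 200, ncol = 4)
y <- matrix(as.numeric(rowSums(x) > 0), ncol = 1)

model <- keras_model_sequential(input_shape = c(4)) |>
  layer_dense(units = 8, activation = "relu") |>
  layer_dense(units = 1, activation = "sigmoid")

model |> compile(
  loss = "binary_crossentropy",
  optimizer = "adam",
  metrics = "accuracy"
)

# Hold out the last 20% of the rows for validation; train for 3 epochs.
history <- model |> fit(
  x, y,
  epochs = 3,
  batch_size = 32,
  validation_split = 0.2,
  verbose = 0
)

# One value per epoch for each recorded quantity.
names(history$metrics)
history$metrics$loss

# Long-format data frame, convenient for custom plotting.
head(as.data.frame(history))
```

Because validation_split is set, the metrics field should also contain validation entries alongside the training ones.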


Freeze and unfreeze weights

Description

Freeze weights in a model or layer so that they are no longer trainable, or unfreeze them so that they are trainable again.

Usage

freeze_weights(object, from = NULL, to = NULL, which = NULL)

unfreeze_weights(object, from = NULL, to = NULL, which = NULL)

Arguments

object

Keras model or layer object

from

Layer instance, layer name, or layer index within model

to

Layer instance, layer name, or layer index within model

which

Layer names, integer positions, layers, logical vector (of length(object$layers)), or a function returning a logical vector.
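When from and to are given as integer positions, negative values appear to count from the end of the layer list, as the calls with from = -5 and to = -6 in the Examples of this entry suggest. The base-R sketch below illustrates that reading; layer_range is a hypothetical helper and not the keras3 implementation.

```r
# Hypothetical helper, not the keras3 implementation: resolves `from`/`to`
# given as integer positions into the layer positions they select.
# Assumes -1 refers to the last layer, -2 to the one before it, and so on.
layer_range <- function(n_layers, from = NULL, to = NULL) {
  resolve <- function(i) if (i < 0) n_layers + i + 1L else i
  from <- if (is.null(from)) 1L else resolve(from)
  to   <- if (is.null(to)) n_layers else resolve(to)
  seq.int(from, to)
}

n <- 19L  # number of layers printed for the VGG16 base in the Examples

layer_range(n, from = -5)  # 15 16 17 18 19: the last 5 layers
layer_range(n, to = -6)    # 1 through 14: everything except the last 5
```

The two ranges are complementary, which matches the Examples: freezing from = -5 and unfreezing to = -6 leave the model in the same state.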

Value

The input object with frozen weights is returned, invisibly. Note, object is modified in place, and the return value is only provided to make usage with the pipe convenient.

Examples

# instantiate a VGG16 model
conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(150, 150, 3)
)

# freeze its weights
freeze_weights(conv_base)

# Note the "Trainable" column
conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer)    | (None, 150, 150, 3)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D)       | (None, 150, 150, 64)  |      1,792 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D)       | (None, 150, 150, 64)  |     36,928 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D)  | (None, 75, 75, 64)    |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D)       | (None, 75, 75, 128)   |     73,856 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D)       | (None, 75, 75, 128)   |    147,584 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D)  | (None, 37, 37, 128)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D)       | (None, 37, 37, 256)   |    295,168 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D)  | (None, 18, 18, 256)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D)       | (None, 18, 18, 512)   |  1,180,160 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D)  | (None, 9, 9, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D)  | (None, 4, 4, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 14,714,688 (56.13 MB)
##  Trainable params: 0 (0.00 B)
##  Non-trainable params: 14,714,688 (56.13 MB)

# create a composite model that includes the base + more layers
model <- keras_model_sequential(input_batch_shape = shape(conv_base$input)) |>
  conv_base() |>
  layer_flatten() |>
  layer_dense(units = 256, activation = "relu") |>
  layer_dense(units = 1, activation = "sigmoid")

# compile
model |> compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(learning_rate = 2e-5),
  metrics = c("accuracy")
)

model
## Model: "sequential"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | vgg16 (Functional)          | (None, 4, 4, 512)     | 14,714,688 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | flatten (Flatten)           | (None, 8192)          |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | dense (Dense)               | (None, 256)           |  2,097,408 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | dense_1 (Dense)             | (None, 1)             |        257 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 16,812,353 (64.13 MB)
##  Trainable params: 2,097,665 (8.00 MB)
##  Non-trainable params: 14,714,688 (56.13 MB)

print(model, expand_nested = TRUE)
## Model: "sequential"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | vgg16 (Functional)          | (None, 4, 4, 512)     | 14,714,688 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > input_layer            | (None, 150, 150, 3)   |          0 |   -   |
## | (InputLayer)                |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block1_conv1 (Conv2D)  | (None, 150, 150, 64)  |      1,792 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block1_conv2 (Conv2D)  | (None, 150, 150, 64)  |     36,928 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block1_pool            | (None, 75, 75, 64)    |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block2_conv1 (Conv2D)  | (None, 75, 75, 128)   |     73,856 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block2_conv2 (Conv2D)  | (None, 75, 75, 128)   |    147,584 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block2_pool            | (None, 37, 37, 128)   |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_conv1 (Conv2D)  | (None, 37, 37, 256)   |    295,168 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_conv2 (Conv2D)  | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_conv3 (Conv2D)  | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_pool            | (None, 18, 18, 256)   |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_conv1 (Conv2D)  | (None, 18, 18, 512)   |  1,180,160 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_conv2 (Conv2D)  | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_conv3 (Conv2D)  | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_pool            | (None, 9, 9, 512)     |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_conv1 (Conv2D)  | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_conv2 (Conv2D)  | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_conv3 (Conv2D)  | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_pool            | (None, 4, 4, 512)     |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## | flatten (Flatten)           | (None, 8192)          |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | dense (Dense)               | (None, 256)           |  2,097,408 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | dense_1 (Dense)             | (None, 1)             |        257 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 16,812,353 (64.13 MB)
##  Trainable params: 2,097,665 (8.00 MB)
##  Non-trainable params: 14,714,688 (56.13 MB)


# unfreeze weights from "block5_conv1" on
unfreeze_weights(conv_base, from = "block5_conv1")

# compile again since we froze or unfroze weights
model |> compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(learning_rate = 2e-5),
  metrics = c("accuracy")
)

conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer)    | (None, 150, 150, 3)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D)       | (None, 150, 150, 64)  |      1,792 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D)       | (None, 150, 150, 64)  |     36,928 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D)  | (None, 75, 75, 64)    |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D)       | (None, 75, 75, 128)   |     73,856 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D)       | (None, 75, 75, 128)   |    147,584 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D)  | (None, 37, 37, 128)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D)       | (None, 37, 37, 256)   |    295,168 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D)  | (None, 18, 18, 256)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D)       | (None, 18, 18, 512)   |  1,180,160 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D)  | (None, 9, 9, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D)  | (None, 4, 4, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 14,714,688 (56.13 MB)
##  Trainable params: 7,079,424 (27.01 MB)
##  Non-trainable params: 7,635,264 (29.13 MB)

print(model, expand_nested = TRUE)
## Model: "sequential"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | vgg16 (Functional)          | (None, 4, 4, 512)     | 14,714,688 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## |    > input_layer            | (None, 150, 150, 3)   |          0 |   -   |
## | (InputLayer)                |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block1_conv1 (Conv2D)  | (None, 150, 150, 64)  |      1,792 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block1_conv2 (Conv2D)  | (None, 150, 150, 64)  |     36,928 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block1_pool            | (None, 75, 75, 64)    |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block2_conv1 (Conv2D)  | (None, 75, 75, 128)   |     73,856 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block2_conv2 (Conv2D)  | (None, 75, 75, 128)   |    147,584 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block2_pool            | (None, 37, 37, 128)   |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_conv1 (Conv2D)  | (None, 37, 37, 256)   |    295,168 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_conv2 (Conv2D)  | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_conv3 (Conv2D)  | (None, 37, 37, 256)   |    590,080 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block3_pool            | (None, 18, 18, 256)   |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_conv1 (Conv2D)  | (None, 18, 18, 512)   |  1,180,160 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_conv2 (Conv2D)  | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_conv3 (Conv2D)  | (None, 18, 18, 512)   |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block4_pool            | (None, 9, 9, 512)     |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_conv1 (Conv2D)  | (None, 9, 9, 512)     |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_conv2 (Conv2D)  | (None, 9, 9, 512)     |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_conv3 (Conv2D)  | (None, 9, 9, 512)     |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## |    > block5_pool            | (None, 4, 4, 512)     |          0 |   -   |
## | (MaxPooling2D)              |                       |            |       |
## +-----------------------------+-----------------------+------------+-------+
## | flatten (Flatten)           | (None, 8192)          |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | dense (Dense)               | (None, 256)           |  2,097,408 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | dense_1 (Dense)             | (None, 1)             |        257 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 16,812,353 (64.13 MB)
##  Trainable params: 9,177,089 (35.01 MB)
##  Non-trainable params: 7,635,264 (29.13 MB)

# freeze only the last 5 layers
freeze_weights(conv_base, from = -5)
conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer)    | (None, 150, 150, 3)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D)       | (None, 150, 150, 64)  |      1,792 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D)       | (None, 150, 150, 64)  |     36,928 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D)  | (None, 75, 75, 64)    |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D)       | (None, 75, 75, 128)   |     73,856 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D)       | (None, 75, 75, 128)   |    147,584 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D)  | (None, 37, 37, 128)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D)       | (None, 37, 37, 256)   |    295,168 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D)  | (None, 18, 18, 256)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D)       | (None, 18, 18, 512)   |  1,180,160 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D)  | (None, 9, 9, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D)  | (None, 4, 4, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 14,714,688 (56.13 MB)
##  Trainable params: 7,635,264 (29.13 MB)
##  Non-trainable params: 7,079,424 (27.01 MB)

# freeze only the last 5 layers, a different way
unfreeze_weights(conv_base, to = -6)
conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer)    | (None, 150, 150, 3)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D)       | (None, 150, 150, 64)  |      1,792 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D)       | (None, 150, 150, 64)  |     36,928 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D)  | (None, 75, 75, 64)    |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D)       | (None, 75, 75, 128)   |     73,856 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D)       | (None, 75, 75, 128)   |    147,584 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D)  | (None, 37, 37, 128)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D)       | (None, 37, 37, 256)   |    295,168 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D)  | (None, 18, 18, 256)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D)       | (None, 18, 18, 512)   |  1,180,160 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D)  | (None, 9, 9, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D)  | (None, 4, 4, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
##  Total params: 14,714,688 (56.13 MB)
##  Trainable params: 7,635,264 (29.13 MB)
##  Non-trainable params: 7,079,424 (27.01 MB)

# Freeze only layers of a certain type, e.g., BatchNorm layers
batch_norm_layer_class_name <- class(layer_batch_normalization())[1]
is_batch_norm_layer <- function(x) inherits(x, batch_norm_layer_class_name)

model <- application_efficientnet_b0()
freeze_weights(model, which = is_batch_norm_layer)
# print(model)

# equivalent to:
for(layer in model$layers) {
  if(is_batch_norm_layer(layer))
    layer$trainable <- FALSE
  else
    layer$trainable <- TRUE
}

Note

The from and to layer arguments are both inclusive.

When applied to a model, the freeze or unfreeze is a global operation over all layers in the model (i.e. layers not within the specified range will be set to the opposite value, e.g. unfrozen for a call to freeze).

Models must be compiled again after weights are frozen or unfrozen.
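
The range semantics described in this note can be modelled in a few lines of base R. This is only a toy sketch, not the package's implementation: resolve_index() and trainable_after_unfreeze() are hypothetical helpers, and the rule that a negative index counts from the end (with -1 as the last layer) is an assumption inferred from the to = -6 example output above.

```r
# Toy model of the from/to semantics; not keras3 code.
# Assumption: negative indices count from the end, so -1 is the last layer.
resolve_index <- function(i, n) if (i < 0L) n + i + 1L else i

# Hypothetical helper: returns the trainable flag of each of n_layers layers
# after unfreezing the inclusive range from..to and freezing everything else.
trainable_after_unfreeze <- function(n_layers, from = 1L, to = -1L) {
  from <- resolve_index(from, n_layers)
  to <- resolve_index(to, n_layers)
  idx <- seq_len(n_layers)
  idx >= from & idx <= to
}

# 19 layers, as in the VGG16 convolutional base printed above.
flags <- trainable_after_unfreeze(19L, to = -6L)
which(!flags)  # indices of the layers left frozen
## [1] 15 16 17 18 19
```

Because the operation is global, every layer outside the inclusive range ends up with the opposite setting, which is why the last five layers are frozen here.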


Layer/Model configuration

Description

A layer config is an object returned from get_config() that contains the configuration of a layer or model. The same layer or model can be reinstantiated later (without its trained weights) from this configuration using from_config(). The config does not include connectivity information, nor the class name (those are handled externally).

Usage

get_config(object)

from_config(config, custom_objects = NULL)

Arguments

object

Layer or model object

config

Object with layer or model configuration

custom_objects

list of custom objects needed to instantiate the layer, e.g., custom layers defined by new_layer_class() or similar.

Value

get_config() returns an object with the configuration, from_config() returns a re-instantiation of the object.

Note

Objects returned from get_config() are not serializable via RDS. If you want to save and restore a model across sessions, you can use save_model_config() (for model configuration only, not weights) or save_model() to save the model configuration and weights to the filesystem.

See Also

Other model functions:
get_layer()
keras_model()
keras_model_sequential()
pop_layer()
summary.keras.src.models.model.Model()

Other layer methods:
count_params()
get_weights()
quantize_weights()
reset_state()


Get/set the currently registered custom objects.

Description

Custom objects set using with_custom_object_scope() are not added to the global list of custom objects, and will not appear in the returned list.

Usage

get_custom_objects()

set_custom_objects(objects = named_list(), clear = TRUE)

Arguments

objects

A named list of custom objects, as returned by get_custom_objects() and set_custom_objects().

clear

bool, whether to clear the custom object registry before populating it with objects.

Value

An R named list mapping registered names to registered objects. set_custom_objects() returns the registry values before updating, invisibly.

Examples

get_custom_objects()

You can use set_custom_objects() to restore a previous registry state.

# within a function, if you want to temporarily modify the registry,
function() {
  orig_objects <- set_custom_objects(clear = TRUE)
  on.exit(set_custom_objects(orig_objects))

  ## temporarily modify the global registry
  # register_keras_serializable(....)
  # ....  <do work>
  # on.exit(), the previous registry state is restored.
}

Note

register_keras_serializable() is preferred over set_custom_objects() for registering new objects.

See Also

Other serialization utilities:
deserialize_keras_object()
get_registered_name()
get_registered_object()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()


Downloads a file from a URL if it is not already in the cache.

Description

By default, the file at the URL origin is downloaded to the cache_dir ~/.keras, placed in the cache_subdir datasets, and given the filename fname. The final location of a file example.txt would therefore be ~/.keras/datasets/example.txt. Files in .tar, .tar.gz, .tar.bz, and .zip formats can also be extracted.

Passing a hash will verify the file after download. The command line programs shasum and sha256sum can compute the hash.
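
As an illustration of the default location, the path can be assembled in base R. This is a sketch only: default_cache_path() is a hypothetical helper that mirrors the documented defaults and is not part of the package.

```r
# Hypothetical helper mirroring the documented defaults; not part of keras3.
default_cache_path <- function(fname,
                               cache_subdir = "datasets",
                               cache_dir = Sys.getenv("KERAS_HOME", "~/.keras")) {
  file.path(cache_dir, cache_subdir, fname)
}

# With the cache directory given explicitly, the result is independent of
# any KERAS_HOME value set in the current session.
default_cache_path("example.txt", cache_dir = "~/.keras")
## [1] "~/.keras/datasets/example.txt"
```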

Usage

get_file(
  fname = NULL,
  origin = NULL,
  ...,
  file_hash = NULL,
  cache_subdir = "datasets",
  hash_algorithm = "auto",
  extract = FALSE,
  archive_format = "auto",
  cache_dir = NULL,
  force_download = FALSE
)

Arguments

fname

Name of the file. If an absolute path, e.g. "/path/to/file.txt" is specified, the file will be saved at that location. If NULL, the name of the file at origin will be used.

origin

Original URL of the file.

...

For forward/backward compatibility.

file_hash

The expected hash string of the file after download. The sha256 and md5 hash algorithms are both supported.

cache_subdir

Subdirectory under the Keras cache dir where the file is saved. If an absolute path, e.g. "/path/to/folder" is specified, the file will be saved at that location.

hash_algorithm

Select the hash algorithm to verify the file. Options are "md5", "sha256", and "auto". The default "auto" detects the hash algorithm in use.

extract

If TRUE, attempts to extract the file as an archive, such as tar or zip.

archive_format

Archive format to try for extracting the file. Options are "auto", "tar", "zip", and NULL. "tar" includes tar, tar.gz, and tar.bz files. The default "auto" corresponds to c("tar", "zip"). NULL or an empty list will return no matches found.

cache_dir

Location to store cached files. When NULL, it defaults to Sys.getenv("KERAS_HOME", "~/.keras/").

force_download

If TRUE, the file will always be re-downloaded regardless of the cache state.

Value

Path to the downloaded file.

Warning on malicious downloads

Downloading something from the Internet carries a risk. NEVER download a file/archive if you do not trust the source. We recommend that you specify the file_hash argument (if the hash of the source file is known) to make sure that the file you are getting is the one you expect.
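
One way to obtain a value for file_hash without leaving R is tools::md5sum(), which ships with base R. The sketch below hashes a small stand-in file; the file and its contents are made up for illustration, and in practice the digest would come from a copy of the file that you already trust.

```r
# Stand-in for a file whose contents you trust (for example, a local copy).
trusted_copy <- tempfile(fileext = ".txt")
writeLines("example contents", trusted_copy)

# tools::md5sum() returns a named character vector; drop the file name.
expected_md5 <- unname(tools::md5sum(trusted_copy))

# An MD5 digest is always 32 hexadecimal characters.
nchar(expected_md5)
## [1] 32
```

The resulting string could then be supplied as file_hash = expected_md5 in a get_file() call; the download itself is not shown here.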

Examples

path_to_downloaded_file <- get_file(
    origin = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz",
    extract = TRUE
)

See Also

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Retrieves a layer based on either its name (unique) or index.

Description

Indices are based on order of horizontal graph traversal (bottom-up) and are 1-based. If name and index are both provided, index will take precedence.

Usage

get_layer(object, name = NULL, index = NULL)

Arguments

object

Keras model object

name

String, name of layer.

index

Integer, index of layer (1-based). Negative values are also valid and count from the end of the model.

Value

A layer instance.

See Also

Other model functions:
get_config()
keras_model()
keras_model_sequential()
pop_layer()
summary.keras.src.models.model.Model()


Returns the name registered to an object within the Keras framework.

Description

This function is part of the Keras serialization and deserialization framework. It maps objects to the string names associated with those objects for serialization/deserialization.

Usage

get_registered_name(obj)

Arguments

obj

The object to look up.

Value

The name associated with the object, or the default name if the object is not registered.

See Also

Other serialization utilities:
deserialize_keras_object()
get_custom_objects()
get_registered_object()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()


Returns the class associated with name if it is registered with Keras.

Description

This function is part of the Keras serialization and deserialization framework. It maps strings to the objects associated with them for serialization/deserialization.

Usage

get_registered_object(name, custom_objects = NULL, module_objects = NULL)

Arguments

name

The name to look up.

custom_objects

A named list of custom objects to look the name up in. Generally, custom_objects is provided by the user.

module_objects

A named list of custom objects to look the name up in. Generally, module_objects is provided by midlevel library implementers.

Value

An instantiable class associated with name, or NULL if no such class exists.

Examples

from_config <- function(cls, config, custom_objects = NULL) {
  if ('my_custom_object_name' %in% names(config)) {
    config$hidden_cls <- get_registered_object(
      config$my_custom_object_name,
      custom_objects = custom_objects)
  }
}

See Also

Other serialization utilities:
deserialize_keras_object()
get_custom_objects()
get_registered_name()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()


Returns the list of input tensors necessary to compute tensor.

Description

Output will always be a list of tensors (potentially with 1 element).

Usage

get_source_inputs(tensor)

Arguments

tensor

The tensor to start from.

Value

List of input tensors.

Example

input <- keras_input(c(3))
output <- input |> layer_dense(4) |> op_multiply(5)
reticulate::py_id(get_source_inputs(output)[[1]]) ==
reticulate::py_id(input)
## [1] TRUE

See Also

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Layer/Model weights as R arrays

Description

Layer/Model weights as R arrays

Usage

get_weights(object, trainable = NA)

set_weights(object, weights)

Arguments

object

Layer or model object

trainable

If NA (the default), all weights are returned. If TRUE, only weights of trainable variables are returned. If FALSE, only weights of non-trainable variables are returned.

weights

Weights as R array

Value

A list of R arrays.

Note

You can access the Layer/Model weights as KerasVariables (which are also backend-native tensors like tf.Variable) at object$weights, object$trainable_weights, or object$non_trainable_weights.

See Also

Other layer methods:
count_params()
get_config()
quantize_weights()
reset_state()


Saves an image stored as an array to a path or file object.

Description

Saves an image stored as an array to a path or file object.

Usage

image_array_save(
  x,
  path,
  data_format = NULL,
  file_format = NULL,
  scale = TRUE,
  ...
)

Arguments

x

An array.

path

Path or file object.

data_format

Image data format, either "channels_first" or "channels_last".

file_format

Optional file format override. If omitted, the format to use is determined from the filename extension. If a file object was used instead of a filename, this parameter should always be used.

scale

Whether to rescale image values to be within ⁠[0, 255]⁠.

...

Additional keyword arguments passed to PIL.Image.save().

Value

Called primarily for side effects. The input x is returned, invisibly, to enable usage with the pipe.

See Also

Other image utils:
image_from_array()
image_load()
image_smart_resize()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Generates a tf.data.Dataset from image files in a directory.

Description

If your directory structure is:

main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg

Then calling image_dataset_from_directory(main_directory, labels = 'inferred') will return a tf.data.Dataset that yields batches of images from the subdirectories class_a and class_b, together with labels 0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).

Supported image formats: .jpeg, .jpg, .png, .bmp, .gif. Animated gifs are truncated to the first frame.
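
The way labels follow from the directory layout can be mimicked in base R. This is a sketch of the idea, not the package's implementation: it recreates the example tree with empty placeholder files in a temporary directory and assigns zero-based labels from the alphanumerically sorted subdirectory names.

```r
# Recreate the example layout with empty placeholder files.
root <- file.path(tempdir(), "main_directory")
for (cls in c("class_a", "class_b")) {
  dir.create(file.path(root, cls), recursive = TRUE, showWarnings = FALSE)
  prefix <- sub("class_", "", cls)
  file.create(file.path(root, cls, sprintf("%s_image_%d.jpg", prefix, 1:2)))
}

# Paths relative to root, e.g. "class_a/a_image_1.jpg", in sorted order.
files <- sort(list.files(root, recursive = TRUE))
class_names <- sort(unique(dirname(files)))

# Zero-based integer labels: class_a -> 0, class_b -> 1.
labels <- match(dirname(files), class_names) - 1L
labels
## [1] 0 0 1 1
```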

Usage

image_dataset_from_directory(
  directory,
  labels = "inferred",
  label_mode = "int",
  class_names = NULL,
  color_mode = "rgb",
  batch_size = 32L,
  image_size = c(256L, 256L),
  shuffle = TRUE,
  seed = NULL,
  validation_split = NULL,
  subset = NULL,
  interpolation = "bilinear",
  follow_links = FALSE,
  crop_to_aspect_ratio = FALSE,
  pad_to_aspect_ratio = FALSE,
  data_format = NULL,
  verbose = TRUE
)

Arguments

directory

Directory where the data is located. If labels is "inferred", it should contain subdirectories, each containing images for a class. Otherwise, the directory structure is ignored.

labels

Either "inferred" (labels are generated from the directory structure), NULL (no labels), or a list/tuple of integer labels of the same size as the number of image files found in the directory. Labels should be sorted according to the alphanumeric order of the image file paths (obtained via os.walk(directory) in Python).

label_mode

String describing the encoding of labels. Options are:

  • "int" means that the labels are encoded as integers (e.g. for sparse_categorical_crossentropy loss).

  • "categorical" means that the labels are encoded as a categorical vector (e.g. for categorical_crossentropy loss).

  • "binary" means that the labels (there can be only 2) are encoded as float32 scalars with values 0 or 1 (e.g. for binary_crossentropy).

  • NULL (no labels).

class_names

Only valid if labels is "inferred". This is the explicit list of class names (must match names of subdirectories). Used to control the order of the classes (otherwise alphanumerical order is used).

color_mode

One of "grayscale", "rgb", "rgba". Whether the images will be converted to have 1, 3, or 4 channels. Defaults to "rgb".

batch_size

Size of the batches of data. Defaults to 32. If NULL, the data will not be batched (the dataset will yield individual samples).

image_size

Size to resize images to after they are read from disk, specified as ⁠(height, width)⁠. Since the pipeline processes batches of images that must all have the same size, this must be provided. Defaults to ⁠(256, 256)⁠.

shuffle

Whether to shuffle the data. Defaults to TRUE. If set to FALSE, sorts the data in alphanumeric order.

seed

Optional random seed for shuffling and transformations.

validation_split

Optional float between 0 and 1, fraction of data to reserve for validation.

subset

Subset of the data to return. One of "training", "validation", or "both". Only used if validation_split is set. When subset = "both", the utility returns a tuple of two datasets (the training and validation datasets respectively).

interpolation

String, the interpolation method used when resizing images. Supports "bilinear", "nearest", "bicubic", "area", "lanczos3", "lanczos5", "gaussian", "mitchellcubic". Defaults to "bilinear".

follow_links

Whether to visit subdirectories pointed to by symlinks. Defaults to FALSE.

crop_to_aspect_ratio

If TRUE, resize the images without aspect ratio distortion. When the original aspect ratio differs from the target aspect ratio, the output image will be cropped so as to return the largest possible window in the image (of size image_size) that matches the target aspect ratio. By default (crop_to_aspect_ratio = FALSE), aspect ratio may not be preserved.

pad_to_aspect_ratio

If TRUE, resize the images without aspect ratio distortion. When the original aspect ratio differs from the target aspect ratio, the output image will be padded so as to return the largest possible window in the image (of size image_size) that matches the target aspect ratio. By default (pad_to_aspect_ratio=FALSE), aspect ratio may not be preserved.

data_format

If NULL, uses config_image_data_format(); otherwise either "channels_last" or "channels_first".

verbose

Whether to display information on the number of classes and the number of files found. Defaults to TRUE.

Value

A tf.data.Dataset object.

  • If label_mode is NULL, it yields float32 tensors of shape ⁠(batch_size, image_size[1], image_size[2], num_channels)⁠, encoding images (see below for rules regarding num_channels).

  • Otherwise, it yields a tuple ⁠(images, labels)⁠, where images has shape ⁠(batch_size, image_size[1], image_size[2], num_channels)⁠, and labels follows the format described below.

Rules regarding labels format:

  • if label_mode is "int", the labels are an int32 tensor of shape ⁠(batch_size,)⁠.

  • if label_mode is "binary", the labels are a float32 tensor of 1s and 0s of shape ⁠(batch_size, 1)⁠.

  • if label_mode is "categorical", the labels are a float32 tensor of shape ⁠(batch_size, num_classes)⁠, representing a one-hot encoding of the class index.
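
The three label layouts can be pictured with plain R objects. The sketch below uses made-up labels for a batch of four samples and two classes; it only illustrates the shapes listed above and does not call the package.

```r
# Integer labels for a batch of 4 samples drawn from 2 classes ("int" mode).
int_labels <- c(0L, 1L, 1L, 0L)
num_classes <- 2L

# "binary": one column of 0/1 values, shape (batch_size, 1).
binary_labels <- matrix(as.numeric(int_labels), ncol = 1)

# "categorical": one-hot rows, shape (batch_size, num_classes).
one_hot_labels <- diag(num_classes)[int_labels + 1L, , drop = FALSE]

dim(binary_labels)
## [1] 4 1
dim(one_hot_labels)
## [1] 4 2
```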

Rules regarding number of channels in the yielded images:

  • if color_mode is "grayscale", there's 1 channel in the image tensors.

  • if color_mode is "rgb", there are 3 channels in the image tensors.

  • if color_mode is "rgba", there are 4 channels in the image tensors.

See Also

Other dataset utils:
audio_dataset_from_directory()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()

Other preprocessing:
image_smart_resize()
text_dataset_from_directory()
timeseries_dataset_from_array()


Converts a 3D array to a PIL Image instance.

Description

Converts a 3D array to a PIL Image instance.

Usage

image_from_array(x, data_format = NULL, scale = TRUE, dtype = NULL)

Arguments

x

Input data, in any form that can be converted to an array.

data_format

Image data format, can be either "channels_first" or "channels_last". Defaults to NULL, in which case the global setting config_image_data_format() is used (unless you changed it, it defaults to "channels_last").

scale

Whether to rescale the image such that minimum and maximum values are 0 and 255 respectively. Defaults to TRUE.

dtype

Dtype to use. NULL means the global setting config_floatx() is used (unless you changed it, it defaults to "float32"). Defaults to NULL.

Value

A PIL Image instance.

Example

img <- array(runif(30000), dim = c(100, 100, 3))
pil_img <- image_from_array(img)
pil_img
## <PIL.Image.Image image mode=RGB size=100x100>
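
The rescaling that the scale argument describes can be pictured as a min-max transform. The base R sketch below is an assumption about that behaviour, not code taken from the package, and rescale_to_255() is a hypothetical helper.

```r
# Hypothetical helper: shift the minimum to 0, then stretch the maximum to 255.
rescale_to_255 <- function(x) {
  x <- x - min(x)
  x_max <- max(x)
  if (x_max != 0) x <- x / x_max  # avoid dividing by zero for constant images
  x * 255
}

# A tiny single-channel "image" with values outside [0, 1].
img <- array(c(-1, 0, 1, 3), dim = c(2, 2, 1))
range(rescale_to_255(img))
## [1]   0 255
```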

See Also

Other image utils:
image_array_save()
image_load()
image_smart_resize()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Loads an image into PIL format.

Description

Loads an image into PIL format.

Usage

image_load(
  path,
  color_mode = "rgb",
  target_size = NULL,
  interpolation = "nearest",
  keep_aspect_ratio = FALSE
)

Arguments

path

Path to image file.

color_mode

One of "grayscale", "rgb", "rgba". Default: "rgb". The desired image format.

target_size

Either NULL (default to original size) or a tuple of integers (img_height, img_width).

interpolation

Interpolation method used to resample the image if the target size is different from that of the loaded image. Supported methods are "nearest", "bilinear", and "bicubic". If PIL version 1.1.3 or newer is installed, "lanczos" is also supported. If PIL version 3.4.0 or newer is installed, "box" and "hamming" are also supported. By default, "nearest" is used.

keep_aspect_ratio

Boolean, whether to resize images to a target size without aspect ratio distortion. The image is cropped in the center with target aspect ratio before resizing.

Value

A PIL Image instance.

Example

image_path <- get_file(origin = "https://www.r-project.org/logo/Rlogo.png")
(image <- image_load(image_path))
## <PIL.Image.Image image mode=RGB size=724x561>

input_arr <- image_to_array(image)
str(input_arr)
##  num [1:561, 1:724, 1:3] 0 0 0 0 0 0 0 0 0 0 ...

input_arr %<>% array_reshape(dim = c(1, dim(input_arr))) # Convert single image to a batch.
model |> predict(input_arr)

See Also

Other image utils:
image_array_save()
image_from_array()
image_smart_resize()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Resize images to a target size without aspect ratio distortion.

Description

Image datasets typically yield images that each have a different size. However, these images need to be batched before they can be processed by Keras layers. To be batched, images need to share the same height and width.

You could simply do, in TF (or JAX equivalent):

size <- c(200, 200)
ds <- ds$map(\(img) tf$image$resize(img, size))

However, if you do this, you distort the aspect ratio of your images, since in general they do not all have the same aspect ratio as size. This is fine in many cases, but not always (e.g. for image generation models this can be a problem).

Note that passing the argument preserve_aspect_ratio = TRUE to tf$image$resize() will preserve the aspect ratio, but at the cost of no longer respecting the provided target size.

This calls for:

size <- c(200, 200)
ds <- ds$map(\(img) image_smart_resize(img, size))

Your output images will actually be ⁠(200, 200)⁠, and will not be distorted. Instead, the parts of the image that do not fit within the target size get cropped out.

The resizing process is:

  1. Take the largest centered crop of the image that has the same aspect ratio as the target size. For instance, if size = c(200, 200) and the input image has size ⁠(340, 500)⁠, we take a crop of ⁠(340, 340)⁠ centered along the width.

  2. Resize the cropped image to the target size. In the example above, we resize the ⁠(340, 340)⁠ crop to ⁠(200, 200)⁠.
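
The first step of the process can be worked through numerically. The sketch below computes the centered crop for the (340, 500) example in base R; centered_crop_box() is a hypothetical helper written for this illustration, and the rounding used (floor) is an assumption rather than a statement about the package internals.

```r
# Hypothetical helper: largest centered crop with the target aspect ratio.
centered_crop_box <- function(height, width, target_height, target_width) {
  # Largest crop height/width that keeps the target aspect ratio,
  # clamped to the image bounds and to at least one pixel.
  crop_height <- max(min(height, floor(width * target_height / target_width)), 1)
  crop_width <- max(min(width, floor(height * target_width / target_height)), 1)
  list(
    size = c(crop_height, crop_width),
    # Zero-based offsets of the crop's top-left corner.
    offset = c(floor((height - crop_height) / 2), floor((width - crop_width) / 2))
  )
}

box <- centered_crop_box(340, 500, target_height = 200, target_width = 200)
box$size    # crop of 340 x 340
box$offset  # 0 rows from the top, 80 columns from the left
```

The crop spans the full height and is centered along the width, leaving 80 columns on each side; the second step would then resize this 340 x 340 crop to 200 x 200.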

Usage

image_smart_resize(
  x,
  size,
  interpolation = "bilinear",
  data_format = "channels_last",
  backend_module = NULL
)

Arguments

x

Input image or batch of images (as a tensor or array). Must be in format ⁠(height, width, channels)⁠ or ⁠(batch_size, height, width, channels)⁠.

size

Tuple of (height, width) integers. Target size.

interpolation

String, interpolation to use for resizing. Supports "bilinear", "nearest", "bicubic", "lanczos3", "lanczos5". Defaults to 'bilinear'.

data_format

"channels_last" or "channels_first".

backend_module

Backend module to use (if different from the default backend).

Value

Array with shape ⁠(size[1], size[2], channels)⁠. If the input image was an array, the output is an array, and if it was a backend-native tensor, the output is a backend-native tensor.

See Also

Other image utils:
image_array_save()
image_from_array()
image_load()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()

Other preprocessing:
image_dataset_from_directory()
text_dataset_from_directory()
timeseries_dataset_from_array()


Converts a PIL Image instance to a 3D array.

Description

Converts a PIL Image instance to a 3D array.

Usage

image_to_array(img, data_format = NULL, dtype = NULL)

Arguments

img

Input PIL Image instance.

data_format

Image data format, can be either "channels_first" or "channels_last". Defaults to NULL, in which case the global setting config_image_data_format() is used (unless you changed it, it defaults to "channels_last").

dtype

Dtype to use. NULL means the global setting config_floatx() is used (unless you changed it, it defaults to "float32").

Value

A 3D array.

Example

image_path <- get_file(origin = "https://www.r-project.org/logo/Rlogo.png")
(img <- image_load(image_path))
## <PIL.Image.Image image mode=RGB size=724x561>

array <- image_to_array(img)
str(array)
##  num [1:561, 1:724, 1:3] 0 0 0 0 0 0 0 0 0 0 ...
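
The difference between the two data_format layouts comes down to the order of the array axes. The base R sketch below converts a small made-up array from "channels_last" (height, width, channels) to "channels_first" (channels, height, width) with aperm(); it does not call the package.

```r
# A made-up 4 x 5 image with 3 channels in "channels_last" layout.
hwc <- array(seq_len(4 * 5 * 3), dim = c(4, 5, 3))

# Move the channel axis to the front to obtain "channels_first" layout.
chw <- aperm(hwc, c(3, 1, 2))
dim(chw)
## [1] 3 4 5

# The same pixel is addressed with the channel index first.
chw[2, 1, 1] == hwc[1, 1, 2]
## [1] TRUE
```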

See Also

Other image utils:
image_array_save()
image_from_array()
image_load()
image_smart_resize()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()

Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()


Initializer that generates tensors with constant values.

Description

Only scalar values are allowed. The constant value provided must be convertible to the dtype requested when calling the initializer.

Usage

initializer_constant(value = 0)

Arguments

value

A numeric scalar.

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_constant(10)
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_constant(10)
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other constant initializers:
initializer_identity()
initializer_ones()
initializer_zeros()

Other initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


The Glorot normal initializer, also called Xavier normal initializer.

Description

Draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out)) where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
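
As a worked instance of the formula, the standard deviation can be computed directly in base R. glorot_normal_stddev() is a hypothetical helper for illustration; the truncation step of the sampling is not reproduced here.

```r
# Hypothetical helper implementing stddev = sqrt(2 / (fan_in + fan_out)).
glorot_normal_stddev <- function(fan_in, fan_out) {
  sqrt(2 / (fan_in + fan_out))
}

# A dense kernel mapping 3 input units to 5 output units.
glorot_normal_stddev(fan_in = 3, fan_out = 5)
## [1] 0.5
```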

Usage

initializer_glorot_normal(seed = NULL)

Arguments

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_glorot_normal()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_glorot_normal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
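
The formula in the description can be checked by hand. The following base R sketch computes the target standard deviation for a hypothetical dense kernel with 784 input units and 64 output units. The variable names (fan_in, fan_out, glorot_stddev) are illustrative only and are not part of keras3.

```r
# Hypothetical dense kernel: 784 input units, 64 output units.
fan_in <- 784
fan_out <- 64

# stddev = sqrt(2 / (fan_in + fan_out)), as given in the description.
glorot_stddev <- sqrt(2 / (fan_in + fan_out))
glorot_stddev  # roughly 0.0486
```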

See Also

Other random initializers:
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


The Glorot uniform initializer, also called Xavier uniform initializer.

Description

Draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)) (fan_in is the number of input units in the weight tensor and fan_out is the number of output units).

Usage

initializer_glorot_uniform(seed = NULL)

Arguments

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_glorot_uniform()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_glorot_uniform()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
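
To make the limit concrete, here is a minimal base R sketch that mimics the sampling rule with runif() for a hypothetical 784 x 64 kernel. It only illustrates the formula in the description and makes no claim about how the backend samples internally; all variable names are illustrative.

```r
set.seed(1)
fan_in <- 784
fan_out <- 64
limit <- sqrt(6 / (fan_in + fan_out))

# Draw a fan_in x fan_out matrix uniformly from [-limit, limit] with base R.
w <- matrix(runif(fan_in * fan_out, min = -limit, max = limit),
            nrow = fan_in, ncol = fan_out)

# All samples fall within the bounds.
all(abs(w) <= limit)

# The variance of a uniform distribution on [-limit, limit] is limit^2 / 3,
# which simplifies to 2 / (fan_in + fan_out).
c(theoretical = limit^2 / 3, empirical = var(as.vector(w)))
```

The theoretical variance, 2 / (fan_in + fan_out), is the square of the stddev quoted for initializer_glorot_normal().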

See Also

Other random initializers:
initializer_glorot_normal()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


He normal initializer.

Description

It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in) where fan_in is the number of input units in the weight tensor.

Usage

initializer_he_normal(seed = NULL)

Arguments

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_he_normal()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_he_normal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
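
As a quick worked example of the formula in the description, the base R sketch below evaluates stddev = sqrt(2 / fan_in) for a few hypothetical layer widths. The variable names are illustrative and not part of keras3.

```r
# Hypothetical numbers of input units.
fan_in <- c(128, 512, 2048)

# stddev = sqrt(2 / fan_in), as given in the description.
he_stddev <- sqrt(2 / fan_in)
data.frame(fan_in = fan_in, stddev = he_stddev)
```

Quadrupling fan_in halves the standard deviation, so wider layers start with proportionally smaller weights.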

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


He uniform variance scaling initializer.

Description

Draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / fan_in) (fan_in is the number of input units in the weight tensor).

Usage

initializer_he_uniform(seed = NULL)

Arguments

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_he_uniform()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_he_uniform()
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Initializer that generates the identity matrix.

Description

Only usable for generating 2D matrices.

Usage

initializer_identity(gain = 1)

Arguments

gain

Multiplicative factor to apply to the identity matrix.

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_identity()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_identity()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
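
For intuition, the values this initializer is described as producing for a square 2D shape can be reproduced in base R. This is a sketch, assuming gain simply scales the identity matrix; gain, n and identity_like are illustrative names.

```r
gain <- 2  # hypothetical multiplicative factor
n <- 3     # hypothetical matrix size

# Base R analogue: the n x n identity matrix scaled by gain.
identity_like <- gain * diag(n)
identity_like
```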

See Also

Other constant initializers:
initializer_constant()
initializer_ones()
initializer_zeros()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Lecun normal initializer.

Description

Initializers allow you to pre-specify an initialization strategy, encoded in the Initializer object, without knowing the shape and dtype of the variable being initialized.

Draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in) where fan_in is the number of input units in the weight tensor.

Usage

initializer_lecun_normal(seed = NULL)

Arguments

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_lecun_normal()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_lecun_normal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Lecun uniform initializer.

Description

Draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(3 / fan_in) (fan_in is the number of input units in the weight tensor).

Usage

initializer_lecun_uniform(seed = NULL)

Arguments

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_lecun_uniform()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_lecun_uniform()
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Initializer that generates tensors initialized to 1.

Description

Also available via the shortcut function ones.

Usage

initializer_ones()

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_ones()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_ones()
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other constant initializers:
initializer_constant()
initializer_identity()
initializer_zeros()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Initializer that generates an orthogonal matrix.

Description

If the shape of the tensor to initialize is two-dimensional, it is initialized with an orthogonal matrix obtained from the QR decomposition of a matrix of random numbers drawn from a normal distribution. If the matrix has fewer rows than columns then the output will have orthogonal rows. Otherwise, the output will have orthogonal columns.

If the shape of the tensor to initialize is more than two-dimensional, a matrix of shape (shape[1] * ... * shape[n - 1], shape[n]) is initialized, where n is the length of the shape vector. The matrix is subsequently reshaped to give a tensor of the desired shape.

Usage

initializer_orthogonal(gain = 1, seed = NULL)

Arguments

gain

Multiplicative factor to apply to the orthogonal matrix.

seed

An integer. Used to make the behavior of the initializer deterministic.

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_orthogonal()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_orthogonal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
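
The QR-based construction in the description can be illustrated in base R. This is only a sketch of the idea for a hypothetical 4 x 3 shape (more rows than columns, so the columns come out orthonormal); the backend implementation may differ in details that are not shown here.

```r
set.seed(42)
n_rows <- 4
n_cols <- 3

# Random normal matrix, then take Q from its QR decomposition.
a <- matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)
q <- qr.Q(qr(a))  # 4 x 3 matrix with orthonormal columns

# t(q) %*% q should be (numerically) the 3 x 3 identity matrix.
round(crossprod(q), 10)
```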

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Random normal initializer.

Description

Draws samples from a normal distribution for given parameters.

Usage

initializer_random_normal(mean = 0, stddev = 0.05, seed = NULL)

Arguments

mean

A numeric scalar. Mean of the random values to generate.

stddev

A numeric scalar. Standard deviation of the random values to generate.

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_random_normal(mean = 0.0, stddev = 1.0)
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_random_normal(mean = 0.0, stddev = 1.0)
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Random uniform initializer.

Description

Draws samples from a uniform distribution for given parameters.

Usage

initializer_random_uniform(minval = -0.05, maxval = 0.05, seed = NULL)

Arguments

minval

A numeric scalar or a scalar keras tensor. Lower bound of the range of random values to generate (inclusive).

maxval

A numeric scalar or a scalar keras tensor. Upper bound of the range of random values to generate (exclusive).

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_random_uniform(minval = 0.0, maxval = 1.0)
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_random_uniform(minval = 0.0, maxval = 1.0)
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_truncated_normal()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()


Initializer that generates a truncated normal distribution.

Description

The generated values are similar to those from initializer_random_normal(), except that values more than two standard deviations from the mean are discarded and redrawn.

Usage

initializer_truncated_normal(mean = 0, stddev = 0.05, seed = NULL)

Arguments

mean

A numeric scalar. Mean of the random values to generate.

stddev

A numeric scalar. Standard deviation of the random values to generate.

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_truncated_normal(mean = 0, stddev = 1)
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_truncated_normal(mean = 0, stddev = 1)
layer <- layer_dense(units = 3, kernel_initializer = initializer)
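
The discard-and-redraw rule from the description can be sketched in base R. The helper truncated_rnorm() below is hypothetical and not part of keras3; it only illustrates the rule and is not a statement about how the backend samples.

```r
set.seed(123)

# Hypothetical helper: draw n values from N(mean, stddev), re-drawing any
# value that lies more than two standard deviations from the mean.
truncated_rnorm <- function(n, mean = 0, stddev = 1) {
  x <- rnorm(n, mean, stddev)
  repeat {
    out_of_range <- abs(x - mean) > 2 * stddev
    if (!any(out_of_range)) break
    x[out_of_range] <- rnorm(sum(out_of_range), mean, stddev)
  }
  x
}

x <- truncated_rnorm(1000, mean = 0, stddev = 0.05)
range(x)  # stays within [-0.1, 0.1]
```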

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_variance_scaling()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_variance_scaling()
initializer_zeros()


Initializer that adapts its scale to the shape of its input tensors.

Description

With distribution = "truncated_normal" or "untruncated_normal", samples are drawn from a truncated/untruncated normal distribution with a mean of zero and a standard deviation (after truncation, if used) stddev = sqrt(scale / n), where n is:

  • number of input units in the weight tensor, if mode = "fan_in"

  • number of output units, if mode = "fan_out"

  • average of the numbers of input and output units, if mode = "fan_avg"

With distribution = "uniform", samples are drawn from a uniform distribution within [-limit, limit], where limit = sqrt(3 * scale / n).

Usage

initializer_variance_scaling(
  scale = 1,
  mode = "fan_in",
  distribution = "truncated_normal",
  seed = NULL
)

Arguments

scale

Scaling factor (positive float).

mode

One of "fan_in", "fan_out", "fan_avg".

distribution

Random distribution to use. One of "truncated_normal", "untruncated_normal", or "uniform".

seed

An integer or instance of random_seed_generator(). Used to make the behavior of the initializer deterministic. Note that an initializer seeded with an integer or NULL (unseeded) will produce the same random values across multiple calls. To get different random values across multiple calls, use as seed an instance of random_seed_generator().

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_variance_scaling(scale = 0.1, mode = 'fan_in',
                                            distribution = 'uniform')
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_variance_scaling(scale = 0.1, mode = 'fan_in',
                                            distribution = 'uniform')
layer <- layer_dense(units = 3, kernel_initializer = initializer)
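
The uniform-limit formula from the description can be evaluated directly. The base R sketch below uses a hypothetical helper, variance_scaling_limit(), which is not part of keras3, and compares its output with the limits quoted in this manual for initializer_glorot_uniform(), initializer_he_uniform() and initializer_lecun_uniform(). It is plain arithmetic on the documented formulas, with illustrative fan sizes.

```r
# Hypothetical helper implementing limit = sqrt(3 * scale / n).
variance_scaling_limit <- function(fan_in, fan_out, scale = 1,
                                   mode = c("fan_in", "fan_out", "fan_avg")) {
  mode <- match.arg(mode)
  n <- switch(mode,
    fan_in  = fan_in,
    fan_out = fan_out,
    fan_avg = (fan_in + fan_out) / 2
  )
  sqrt(3 * scale / n)
}

fan_in <- 784
fan_out <- 64

# scale = 1, mode = "fan_avg" gives the Glorot uniform limit.
all.equal(variance_scaling_limit(fan_in, fan_out, 1, "fan_avg"),
          sqrt(6 / (fan_in + fan_out)))

# scale = 2, mode = "fan_in" gives the He uniform limit.
all.equal(variance_scaling_limit(fan_in, fan_out, 2, "fan_in"),
          sqrt(6 / fan_in))

# scale = 1, mode = "fan_in" gives the Lecun uniform limit.
all.equal(variance_scaling_limit(fan_in, fan_out, 1, "fan_in"),
          sqrt(3 / fan_in))
```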

See Also

Other random initializers:
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_zeros()


Initializer that generates tensors initialized to 0.

Description

Initializer that generates tensors initialized to 0.

Usage

initializer_zeros()

Value

An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.

Examples

# Standalone usage:
initializer <- initializer_zeros()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_zeros()
layer <- layer_dense(units = 3, kernel_initializer = initializer)

See Also

Other constant initializers:
initializer_constant()
initializer_identity()
initializer_ones()

Other initializers:
initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()


Install Keras

Description

This function will install Keras along with a selected backend, including all Python dependencies.

Usage

install_keras(
  envname = "r-keras",
  ...,
  extra_packages = c("scipy", "pandas", "Pillow", "pydot", "ipython",
    "tensorflow_datasets"),
  python_version = ">=3.9,<=3.11",
  backend = c("tensorflow", "jax"),
  gpu = NA,
  restart_session = TRUE
)

Arguments

envname

Name of or path to a Python virtual environment

...

Reserved for future compatibility.

extra_packages

Additional Python packages to install alongside Keras

python_version

Passed on to reticulate::virtualenv_starter()

backend

Which backend(s) to install. Accepted values include "tensorflow", "jax" and "torch"

gpu

Whether to install a GPU-capable version of the backend.

restart_session

Whether to restart the R session after installing (note this will only occur within RStudio).

Value

No return value, called for side effects.

See Also

tensorflow::install_tensorflow()


Main Keras module

Description

The keras module object is the equivalent of reticulate::import("keras") and is provided mainly as a convenience.

Format

An object of class python.builtin.module

Value

the keras Python module


Create a Keras tensor (Functional API input).

Description

A Keras tensor is a symbolic tensor-like object, which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model.

For instance, if a, b and c are Keras tensors, it becomes possible to do: model <- keras_model(inputs = list(a, b), outputs = c)

Usage

keras_input(
  shape = NULL,
  batch_size = NULL,
  dtype = NULL,
  sparse = NULL,
  batch_shape = NULL,
  name = NULL,
  tensor = NULL,
  optional = FALSE
)

Arguments

shape

A shape list (list of integers or NULL objects), not including the batch size. For instance, shape = c(32) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this list can be NULL or NA; NULL/NA elements represent dimensions where the shape is not known and may vary (e.g. sequence length).

batch_size

Optional static batch size (integer).

dtype

The data type expected by the input, as a string (e.g. "float32", "int32"...)

sparse

A boolean specifying whether the expected input will be sparse tensors. Note that, if sparse is FALSE, sparse tensors can still be passed into the input - they will be densified with a default value of 0. This feature is only supported with the TensorFlow backend. Defaults to FALSE.

batch_shape

Optional shape list (list of integers or NULL objects), including the batch size.

name

Optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

tensor

Optional existing tensor to wrap into the Input layer. If set, the layer will use this tensor rather than creating a new placeholder tensor.

optional

Boolean, whether the input is optional or not. An optional input can accept NULL values.

Value

A Keras tensor, which can be passed to the inputs argument of keras_model().

Examples

# This is a logistic regression in Keras
input <- keras_input(shape = c(32))
output <- input |> layer_dense(16, activation = 'softmax')
model <- keras_model(input, output)

See Also

Other model creation:
keras_model()
keras_model_sequential()


Keras Model (Functional API)

Description

A model is a directed acyclic graph of layers.

Usage

keras_model(inputs = NULL, outputs = NULL, ...)

Arguments

inputs

Input tensor(s) (from keras_input())

outputs

Output tensors (from calling layers with inputs)

...

Any additional arguments

Value

A Model instance.

Examples

library(keras3)

# input tensor
inputs <- keras_input(shape = c(784))

# outputs compose input + dense layers
predictions <- inputs |>
  layer_dense(units = 64, activation = 'relu') |>
  layer_dense(units = 64, activation = 'relu') |>
  layer_dense(units = 10, activation = 'softmax')

# create and compile model
model <- keras_model(inputs = inputs, outputs = predictions)
model |> compile(
  optimizer = 'rmsprop',
  loss = 'categorical_crossentropy',
  metrics = c('accuracy')
)

See Also

Other model functions:
get_config()
get_layer()
keras_model_sequential()
pop_layer()
summary.keras.src.models.model.Model()

Other model creation:
keras_input()
keras_model_sequential()


Keras Model composed of a linear stack of layers

Description

Keras Model composed of a linear stack of layers

Usage

keras_model_sequential(
  input_shape = NULL,
  name = NULL,
  ...,
  input_dtype = NULL,
  input_batch_size = NULL,
  input_sparse = NULL,
  input_batch_shape = NULL,
  input_name = NULL,
  input_tensor = NULL,
  input_optional = FALSE,
  trainable = TRUE,
  layers = list()
)

Arguments

input_shape

A shape integer vector, not including the batch size. For instance, input_shape = c(32) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this shape can be NA; NA elements represent dimensions where the shape is not known and may vary (e.g. sequence length).

name

Name of model

...

additional arguments passed on to keras.layers.InputLayer.

input_dtype

The data type expected by the input, as a string (e.g. "float32", "int32"...)

input_batch_size

Optional static batch size (integer).

input_sparse

A boolean specifying whether the expected input will be sparse tensors. Note that, if input_sparse is FALSE, sparse tensors can still be passed into the input - they will be densified with a default value of 0. This feature is only supported with the TensorFlow backend. Defaults to FALSE.

input_batch_shape

An optional way to specify batch_size and input_shape as one argument.

input_name

Optional name string for the input layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

input_tensor

Optional existing tensor to wrap into the InputLayer. If set, the layer will use this tensor rather than creating a new placeholder tensor.

input_optional

Boolean, whether the input is optional or not. An optional input can accept NULL values.

trainable

Boolean, whether the model's variables should be trainable. You can also change the trainable status of a model/layer with freeze_weights() and unfreeze_weights().

layers

List of layers to add to the model.

Value

A Sequential model instance.

Examples

model <- keras_model_sequential(input_shape = c(784))
model |>
  layer_dense(units = 32) |>
  layer_activation('relu') |>
  layer_dense(units = 10) |>
  layer_activation('softmax')

model |> compile(
  optimizer = 'rmsprop',
  loss = 'categorical_crossentropy',
  metrics = c('accuracy')
)

model
## Model: "sequential"
## +---------------------------------+------------------------+---------------+
## | Layer (type)                    | Output Shape           |       Param # |
## +=================================+========================+===============+
## | dense (Dense)                   | (None, 32)             |        25,120 |
## +---------------------------------+------------------------+---------------+
## | activation (Activation)         | (None, 32)             |             0 |
## +---------------------------------+------------------------+---------------+
## | dense_1 (Dense)                 | (None, 10)             |           330 |
## +---------------------------------+------------------------+---------------+
## | activation_1 (Activation)       | (None, 10)             |             0 |
## +---------------------------------+------------------------+---------------+
##  Total params: 25,450 (99.41 KB)
##  Trainable params: 25,450 (99.41 KB)
##  Non-trainable params: 0 (0.00 B)
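
The parameter counts in the summary above can be reproduced by hand. The base R sketch below assumes each dense layer holds one kernel weight per input-unit pair plus one bias per unit, and that each parameter occupies 4 bytes (float32). The helper dense_params() is illustrative and not part of keras3.

```r
# Dense layer parameters = inputs * units (kernel) + units (bias).
dense_params <- function(n_inputs, units) n_inputs * units + units

p1 <- dense_params(784, 32)  # first dense layer:  25120
p2 <- dense_params(32, 10)   # second dense layer: 330
total <- p1 + p2             # 25450

c(dense = p1, dense_1 = p2, total = total)

# Size in KB, assuming 4 bytes per parameter.
round(total * 4 / 1024, 2)  # 99.41
```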

Note

If input_shape is omitted, then the model layer shapes, including the final model output shape, will not be known until the model is built, either by calling the model with an input tensor/array like model(input), (possibly via fit()/evaluate()/predict()), or by explicitly calling model$build(input_shape).

See Also

Other model functions:
get_config()
get_layer()
keras_model()
pop_layer()
summary.keras.src.models.model.Model()

Other model creation:
keras_input()
keras_model()


Define a custom Layer class.

Description

A layer is a callable object that takes as input one or more tensors and that outputs one or more tensors. It involves computation, defined in the call() method, and a state (weight variables). State can be created:

  • in initialize(), for instance via self$add_weight();

  • in the optional build() method, which is invoked by the first call() to the layer, and supplies the shape(s) of the input(s), which may not have been known at initialization time.

Layers are recursively composable: If you assign a Layer instance as an attribute of another Layer, the outer layer will start tracking the weights created by the inner layer. Nested layers should be instantiated in the initialize() method or build() method.

Users will just instantiate a layer and then treat it as a callable.

Usage

Layer(
  classname,
  initialize = NULL,
  call = NULL,
  build = NULL,
  get_config = NULL,
  ...,
  public = list(),
  private = list(),
  inherit = NULL,
  parent_env = parent.frame()
)

Arguments

classname

String, the name of the custom class. (Conventionally, CamelCase).

initialize, call, build, get_config

Recommended methods to implement. See description and details sections.

..., public

Additional methods or public members of the custom class.

private

Named list of R objects (typically, functions) to include in instance private environments. private methods will have all the same symbols in scope as public methods (see section "Symbols in scope"). Each instance will have its own private environment. Any objects in private will be invisible to the Keras framework and the Python runtime.

inherit

What the custom class will subclass. By default, the base Keras Layer class.

parent_env

The R environment that all class methods will have as a grandparent.

Value

A composing layer constructor, with similar behavior to other layer functions like layer_dense(). The first argument of the returned function will be object, which makes it possible to initialize() and call() the layer in one step while composing the layer with the pipe, like

layer_foo <- Layer("Foo", ....)
output <- inputs |> layer_foo()

To only initialize() a layer instance and not call() it, pass a missing or NULL value to object, or pass all arguments to initialize() by name.

layer <- layer_dense(units = 2, activation = "relu")
layer <- layer_dense(NULL, 2, activation = "relu")
layer <- layer_dense(, 2, activation = "relu")

# then you can call() the layer in a separate step
outputs <- inputs |> layer()

Symbols in scope

All R function custom methods (public and private) will have the following symbols in scope:

  • self: The custom class instance.

  • super: The custom class superclass.

  • private: An R environment specific to the class instance. Any objects assigned here are invisible to the Keras framework.

  • ⁠__class__⁠ and as.symbol(classname): the custom class type object.

Attributes

  • name: The name of the layer (string).

  • dtype: Dtype of the layer's weights. Alias of layer$variable_dtype.

  • variable_dtype: Dtype of the layer's weights.

  • compute_dtype: The dtype of the layer's computations. Layers automatically cast inputs to this dtype, which causes the computations and output to also be in this dtype. When mixed precision is used with a keras$mixed_precision$DTypePolicy, this will be different than variable_dtype.

  • trainable_weights: List of variables to be included in backprop.

  • non_trainable_weights: List of variables that should not be included in backprop.

  • weights: The concatenation of the lists trainable_weights and non_trainable_weights (in this order).

  • trainable: Whether the layer should be trained (boolean), i.e. whether its potentially-trainable weights should be returned as part of layer$trainable_weights.

  • input_spec: Optional (list of) InputSpec object(s) specifying the constraints on inputs that can be accepted by the layer.

We recommend that custom Layers implement the following methods:

  • initialize(): Defines custom layer attributes, and creates layer weights that do not depend on input shapes, using add_weight(), or other state.

  • build(input_shape): This method can be used to create weights that depend on the shape(s) of the input(s), using add_weight(), or other state. Calling call() will automatically build the layer (if it has not been built yet) by calling build().

  • call(...): Method called after making sure build() has been called. call() performs the logic of applying the layer to the input arguments. Two reserved arguments you can optionally use in call() are:

    1. training (boolean, whether the call is in inference mode or training mode).

    2. mask (boolean tensor encoding masked timesteps in the input, used e.g. in RNN layers).

    A typical signature for this method is call(inputs); you can optionally add training and mask if the layer needs them.

  • get_config(): Returns a named list containing the configuration used to initialize this layer. If the list names differ from the arguments in initialize(), then override from_config() as well. This method is used when saving the layer or a model that contains this layer.

Examples

Here's a basic example: a layer with two variables, w and b, that returns y <- (x %*% w) + b. It shows how to implement build() and call(). Variables set as attributes of a layer are tracked as weights of the layer (in layer$weights).

layer_simple_dense <- Layer(
  "SimpleDense",
  initialize = function(units = 32) {
    super$initialize()
    self$units <- units
  },

  # Create the state of the layer (weights)
  build = function(input_shape) {
    self$kernel <- self$add_weight(
      shape = shape(tail(input_shape, 1), self$units),
      initializer = "glorot_uniform",
      trainable = TRUE,
      name = "kernel"
    )
    self$bias <- self$add_weight(
      shape = shape(self$units),
      initializer = "zeros",
      trainable = TRUE,
      name = "bias"
    )
  },

  # Defines the computation
  call = function(self, inputs) {
    op_matmul(inputs, self$kernel) + self$bias
  }
)

# Instantiates the layer.
# Supply missing `object` arg to skip invoking `call()` and instead return
# the Layer instance
linear_layer <- layer_simple_dense(, 4)

# This will call `build(input_shape)` and create the weights,
# and then invoke `call()`.
y <- linear_layer(op_ones(c(2, 2)))
stopifnot(length(linear_layer$weights) == 2)

# These weights are trainable, so they're listed in `trainable_weights`:
stopifnot(length(linear_layer$trainable_weights) == 2)
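
The shape bookkeeping in build() and call() can be traced without Keras. The following is a base R sketch, not the layer itself: the kernel value 0.5 is an arbitrary stand-in for the "glorot_uniform" initializer, and sweep() is used because R does not broadcast a bias vector across rows the way the backend does.

```r
# Input with 2 samples and 2 features, like op_ones(c(2, 2)).
x <- matrix(1, nrow = 2, ncol = 2)
units <- 4

# build(): kernel shape is (last input dimension, units); bias shape is (units).
kernel <- matrix(0.5, nrow = ncol(x), ncol = units)  # arbitrary stand-in values
bias   <- rep(0, units)                              # "zeros" initializer

# call(): matrix product plus a bias added to every row.
y <- sweep(x %*% kernel, 2, bias, "+")

# Number of scalars held by the two weights.
n_params <- length(kernel) + length(bias)
```

Here y has shape (2, 4), one row per sample and one column per unit.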

Besides trainable weights, updated via backpropagation during training, layers can also have non-trainable weights. These weights are meant to be updated manually during call(). Here's an example layer that computes the running sum of its inputs:

layer_compute_sum <- Layer(
  classname = "ComputeSum",

  initialize = function(input_dim) {
    super$initialize()

    # Create a non-trainable weight.
    self$total <- self$add_weight(
      shape = shape(),
      initializer = "zeros",
      trainable = FALSE,
      name = "total"
    )
  },

  call = function(inputs) {
    self$total$assign(self$total + op_sum(inputs))
    self$total
  }
)

my_sum <- layer_compute_sum(, 2)
x <- op_ones(c(2, 2))
y <- my_sum(x)

stopifnot(exprs = {
  all.equal(my_sum$weights,               list(my_sum$total))
  all.equal(my_sum$non_trainable_weights, list(my_sum$total))
  all.equal(my_sum$trainable_weights,     list())
})
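
The state-keeping idea behind this layer can be sketched in base R with a closure. This is only an analogy under stated assumptions: make_running_sum() is a hypothetical helper, and the captured variable total plays the role of the non-trainable weight that call() updates with assign().

```r
# Hypothetical stand-in for a layer that keeps a running sum as state.
make_running_sum <- function() {
  total <- 0  # plays the role of the non-trainable `total` weight
  function(inputs) {
    total <<- total + sum(inputs)  # manual state update, as in call()
    total
  }
}

running_sum <- make_running_sum()
x <- matrix(1, nrow = 2, ncol = 2)

first_call  <- running_sum(x)  # state after one call
second_call <- running_sum(x)  # state keeps accumulating
```

Each call adds the sum of its inputs to the stored total, so repeated calls with the same input return increasing values.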

Methods available

  • initialize(...,
               activity_regularizer = NULL,
               trainable = TRUE,
               dtype = NULL,
               autocast = TRUE,
               name = NULL)
    

    Initialize self. This method is typically called from a custom initialize() method. Example:

    layer_my_layer <- Layer("MyLayer",
      initialize = function(units, ..., dtype = NULL, name = NULL) {
        super$initialize(..., dtype = dtype, name = name)
        # .... finish initializing `self` instance
      }
    )
    

    Args:

    • trainable: Boolean, whether the layer's variables should be trainable.

    • name: String name of the layer.

    • dtype: The dtype of the layer's computations and weights. Can also be a keras$DTypePolicy, which allows the computation and weight dtype to differ. Defaults to NULL. NULL means to use config_dtype_policy(), which is a "float32" policy unless set to a different value (via config_set_dtype_policy()).

  • add_loss(loss)
    

    Can be called inside of the call() method to add a scalar loss.

    Example:

    Layer("MyLayer",
      ...
      call = function(x) {
        self$add_loss(op_sum(x))
        x
      }
    )
  • add_metric(...)
    
  • add_variable(...)
    

    Add a weight variable to the layer.

    Alias of add_weight().

  • add_weight(shape = NULL,
               initializer = NULL,
               dtype = NULL,
               trainable = TRUE,
               autocast = TRUE,
               regularizer = NULL,
               constraint = NULL,
               aggregation = 'mean',
               name = NULL)
    

    Add a weight variable to the layer.

    Args:

    • shape: Shape for the variable (as defined by shape()). Must be fully defined (no NA/NULL/-1 entries). Defaults to () (scalar) if unspecified.

    • initializer: Initializer object to use to populate the initial variable value, or string name of a built-in initializer (e.g. "random_normal"). If unspecified, defaults to "glorot_uniform" for floating-point variables and to "zeros" for all other types (e.g. int, bool).

    • dtype: Dtype of the variable to create, e.g. "float32". If unspecified, defaults to the layer's variable dtype (which itself defaults to "float32" if unspecified).

    • trainable: Boolean, whether the variable should be trainable via backprop or whether its updates are managed manually. Defaults to TRUE.

    • autocast: Boolean, whether to autocast the layer's variables when accessing them. Defaults to TRUE.

    • regularizer: Regularizer object to call to apply penalty on the weight. These penalties are summed into the loss function during optimization. Defaults to NULL.

    • constraint: Constraint object to call on the variable after any optimizer update, or string name of a built-in constraint. Defaults to NULL.

    • aggregation: String, one of 'mean', 'sum', 'only_first_replica'. Annotates the variable with the type of multi-replica aggregation to be used for this variable when writing custom data parallel training loops.

    • name: String name of the variable. Useful for debugging purposes.

    Returns:

    A backend tensor, wrapped in a KerasVariable class. The KerasVariable class provides:

    Methods:

    • assign(value)

    • assign_add(value)

    • assign_sub(value)

    • numpy() (calling ⁠as.array(<variable>)⁠ is preferred)

    Properties/Attributes:

    • value

    • dtype

    • ndim

    • shape (calling ⁠shape(<variable>)⁠ is preferred)

    • trainable

  • build(input_shape)
    
  • build_from_config(config)
    

    Builds the layer's states with the supplied config (named list of args).

    By default, this method calls do.call(build, config$input_shape), which creates weights based on the layer's input shape in the supplied config. If your config contains other information needed to load the layer's state, you should override this method.

    Args:

    • config: Named list containing the input shape associated with this layer.

  • call(...)
    

    See description above

  • compute_mask(inputs, previous_mask)
    
  • compute_output_shape(...)
    
  • compute_output_spec(...)
    
  • count_params()
    

    Count the total number of scalars composing the weights.

    Returns: An integer count.

  • get_build_config()
    

    Returns a named list with the layer's input shape.

    This method returns a config (named list) that can be used by build_from_config(config) to create all states (e.g. Variables and Lookup tables) needed by the layer.

    By default, the config only contains the input shape that the layer was built with. If you're writing a custom layer that creates state in an unusual way, you should override this method to make sure this state is already created when Keras attempts to load its value upon model loading.

    Returns: A named list containing the input shape associated with the layer.

  • get_config()
    

    Returns the config of the object.

    An object config is a named list (serializable) containing the information needed to re-instantiate it. The config is expected to be serializable to JSON, and is expected to consist of a (potentially complex, nested) structure of named lists consisting of simple objects such as strings and integers.

  • get_weights()
    

    Return the values of layer$weights as a list of R or NumPy arrays.

  • quantize(mode, type_check = TRUE)
    

    Currently, only the Dense, EinsumDense and Embedding layers support in-place quantization via this quantize() method.

    Example:

    model$quantize("int8") # quantize model in-place
    model |> predict(data) # faster inference
    
  • quantized_build(input_shape, mode)
    
  • quantized_call(...)
    
  • load_own_variables(store)
    

    Loads the state of the layer.

    You can override this method to take full control of how the state of the layer is loaded upon calling load_model().

    Args:

    • store: Named list from which the state of the model will be loaded.

  • save_own_variables(store)
    

    Saves the state of the layer.

    You can override this method to take full control of how the state of the layer is saved upon calling save_model().

    Args:

    • store: Named list where the state of the model will be saved.

  • set_weights(weights)
    

    Sets the values of weights from a list of R or NumPy arrays.

  • stateless_call(trainable_variables, non_trainable_variables,
                   ..., return_losses = FALSE)
    

    Call the layer without any side effects.

    Args:

    • trainable_variables: List of trainable variables of the model.

    • non_trainable_variables: List of non-trainable variables of the model.

    • ...: Positional and named arguments to be passed to call().

    • return_losses: If TRUE, stateless_call() will return the list of losses created during call() as part of its return values.

    Returns: An unnamed list. By default, returns list(outputs, non_trainable_variables). If return_losses = TRUE, then returns list(outputs, non_trainable_variables, losses).

    Note: non_trainable_variables include not only non-trainable weights such as BatchNormalization statistics, but also RNG seed state (if any random operations, such as dropout, are part of the layer), and Metric state (if there are any metrics attached to the layer). These are all elements of state of the layer.

    Example:

    model <- ...
    data <- ...
    trainable_variables <- model$trainable_variables
    non_trainable_variables <- model$non_trainable_variables
    # Call the model with zero side effects
    c(outputs, non_trainable_variables) %<-% model$stateless_call(
        trainable_variables,
        non_trainable_variables,
        data
    )
    # Attach the updated state to the model
    # (until you do this, the model is still in its pre-call state).
    purrr::walk2(
      model$non_trainable_variables, non_trainable_variables,
      \(variable, value) variable$assign(value))
    
  • symbolic_call(...)
    
  • from_config(config)
    

    Creates a layer from its config.

    This is a class method, meaning the R function will not have a self symbol (a class instance) in scope. Use __class__ (or the classname symbol provided when Layer() was called) to resolve the class definition. The default implementation is:

    from_config = function(config) {
      do.call(`__class__`, config)
    }
    

    This method is the reverse of get_config(), capable of instantiating the same layer from the config named list. It does not handle layer connectivity (handled by Network), nor weights (handled by set_weights()).

    Args:

    • config: A named list, typically the output of get_config().

    Returns: A layer instance.
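
The relationship between get_config() and the default from_config() can be sketched in base R. Everything here is a toy stand-in rather than the Keras API: make_scale(), toy_get_config() and toy_from_config() are hypothetical names, and a classed list takes the place of a layer instance. The point is that the config names must match the constructor's arguments for do.call() to rebuild the object.

```r
# Hypothetical constructor standing in for a layer's initialize().
make_scale <- function(factor = 1, name = "scale") {
  structure(list(factor = factor, name = name), class = "toy_scale")
}

# Config: a named list whose names match the constructor arguments.
toy_get_config <- function(obj) list(factor = obj$factor, name = obj$name)

# Mirrors the default from_config(): call the constructor with the config.
toy_from_config <- function(config) do.call(make_scale, config)

original <- make_scale(factor = 3, name = "triple")
config   <- toy_get_config(original)
restored <- toy_from_config(config)
```

If the config used a name that the constructor does not accept, do.call() would fail with an unused-argument error, which is the situation in which from_config() needs to be overridden.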

Readonly properties:

  • compute_dtype The dtype of the computations performed by the layer.

  • dtype Alias of layer$variable_dtype.

  • input_dtype The dtype layer inputs should be converted to.

  • losses List of scalar losses from add_loss(), regularizers and sublayers.

  • metrics List of all metrics.

  • metrics_variables List of all metric variables.

  • non_trainable_variables List of all non-trainable layer state.

    This extends layer$non_trainable_weights to include all state used by the layer including state for metrics and SeedGenerators.

  • non_trainable_weights List of all non-trainable weight variables of the layer.

    These are the weights that should not be updated by the optimizer during training. Unlike layer$non_trainable_variables, this excludes metric state and random seeds.

  • trainable_variables List of all trainable layer state.

    This is equivalent to layer$trainable_weights.

  • trainable_weights List of all trainable weight variables of the layer.

    These are the weights that get updated by the optimizer during training.

  • path The path of the layer.

    If the layer has not been built yet, it will be NULL.

  • quantization_mode The quantization mode of this layer, NULL if not quantized.

  • variable_dtype The dtype of the state (weights) of the layer.

  • variables List of all layer state, including random seeds.

    This extends layer$weights to include all state used by the layer including SeedGenerators.

    Note that metrics variables are not included here, use metrics_variables to visit all the metric variables.

  • weights List of all weight variables of the layer.

    Unlike layer$variables, this excludes metric state and random seeds.

  • input Retrieves the input tensor(s) of a symbolic operation.

    Only returns the tensor(s) corresponding to the first time the operation was called.

    Returns: Input tensor or list of input tensors.

  • output Retrieves the output tensor(s) of a layer.

    Only returns the tensor(s) corresponding to the first time the operation was called.

    Returns: Output tensor or list of output tensors.

Data descriptors (Attributes):

  • dtype_policy

  • input_spec

  • supports_masking Whether this layer supports computing a mask using compute_mask.

  • trainable Settable boolean, whether this layer should be trainable or not.

See Also

Other layers:
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Applies an activation function to an output.

Description

Applies an activation function to an output.

Usage

layer_activation(object, activation, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

activation

Activation function. It could be a callable, or the name of an activation from the ⁠keras3::activation_*⁠ namespace.

...

Base layer keyword arguments, such as name and dtype.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Examples

x <- array(c(-3, -1, 0, 2))
layer <- layer_activation(activation = 'relu')
layer(x)
## tf.Tensor([0. 0. 0. 2.], shape=(4), dtype=float32)

layer <- layer_activation(activation = activation_relu)
layer(x)
## tf.Tensor([0. 0. 0. 2.], shape=(4), dtype=float32)

layer <- layer_activation(activation = op_relu)
layer(x)
## tf.Tensor([0. 0. 0. 2.], shape=(4), dtype=float32)

See Also

Other activation layers:
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()

Other layers:
Layer()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Applies an Exponential Linear Unit function to an output.

Description

Formula:

f(x) = alpha * (exp(x) - 1.) for x < 0
f(x) = x for x >= 0
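
As a quick check of the formula, here is a base R sketch of the same piecewise function. The name elu() is chosen for illustration and is not part of the package; the layer applies the equivalent computation to tensors.

```r
# Element-wise ELU following the formula above.
elu <- function(x, alpha = 1) {
  ifelse(x < 0, alpha * (exp(x) - 1), x)
}

x <- c(-50, -1, 0, 2)
y <- elu(x)               # default alpha = 1
y_half <- elu(x, 0.5)     # smaller alpha scales the negative branch
```

For large negative inputs the result approaches -alpha, while non-negative inputs pass through unchanged.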

Usage

layer_activation_elu(object, alpha = 1, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

alpha

float, slope of negative section. Defaults to 1.0.

...

Base layer keyword arguments, such as name and dtype.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

See Also

Other activation layers:
layer_activation()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()

Other layers:
Layer()
layer_activation()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Leaky version of a Rectified Linear Unit activation layer.

Description

This layer allows a small gradient when the unit is not active.

Formula:

f <- function(x) ifelse(x >= 0, x, negative_slope * x)

Usage

layer_activation_leaky_relu(object, negative_slope = 0.3, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

negative_slope

Float >= 0.0. Negative slope coefficient. Defaults to 0.3.

...

Base layer keyword arguments, such as name and dtype.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Examples

leaky_relu_layer <- layer_activation_leaky_relu(negative_slope = 0.5)
input <- array(c(-10, -5, 0.0, 5, 10))
result <- leaky_relu_layer(input)
as.array(result)
## [1] -5.0 -2.5  0.0  5.0 10.0
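
The same numbers can be reproduced in base R directly from the formula. The function leaky_relu() below is a hypothetical helper for illustration, not a package function.

```r
# Element-wise leaky ReLU: negative inputs are scaled by `negative_slope`.
leaky_relu <- function(x, negative_slope = 0.3) {
  ifelse(x >= 0, x, negative_slope * x)
}

input <- c(-10, -5, 0, 5, 10)
result <- leaky_relu(input, negative_slope = 0.5)
```

With negative_slope = 0.5, the negative inputs are halved and the rest are unchanged, matching the layer output shown above.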

See Also

Other activation layers:
layer_activation()
layer_activation_elu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Parametric Rectified Linear Unit activation layer.

Description

Formula:

f <- function(x) ifelse(x >= 0, x, alpha * x)

where alpha is a learned array with the same shape as x.
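
The difference from the leaky variant is that alpha holds one coefficient per input element. A base R sketch, with alpha fixed to arbitrary illustrative values (the layer learns these values during training) and prelu() as a hypothetical helper name:

```r
# Element-wise PReLU: `alpha` must have the same shape as `x`.
prelu <- function(x, alpha) {
  stopifnot(identical(dim(x), dim(alpha)))
  ifelse(x >= 0, x, alpha * x)
}

x     <- matrix(c(-2, 4, -6, 8), nrow = 2)         # filled column by column
alpha <- matrix(c(0.1, 0.2, 0.5, 0.9), nrow = 2)   # arbitrary fixed values

y <- prelu(x, alpha)
```

Each negative entry is scaled by its own coefficient: -2 by 0.1 and -6 by 0.5, while the positive entries are returned as they are.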

Usage

layer_activation_parametric_relu(
  object,
  alpha_initializer = "Zeros",
  alpha_regularizer = NULL,
  alpha_constraint = NULL,
  shared_axes = NULL,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

alpha_initializer

Initializer function for the weights.

alpha_regularizer

Regularizer for the weights.

alpha_constraint

Constraint for the weights.

shared_axes

The axes along which to share learnable parameters for the activation function. For example, if the incoming feature maps are from a 2D convolution with output shape ⁠(batch, height, width, channels)⁠, and you wish to share parameters across space so that each filter only has one set of parameters, set ⁠shared_axes=[1, 2]⁠.

...

Base layer keyword arguments, such as name and dtype.
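As a rough illustration of what sharing parameters across axes means, the base R sketch below applies the formula to one sample of shape (height, width, channels) with a single alpha per channel, that is, alpha shared across the two spatial axes. The helper prelu_shared() is hypothetical and not part of keras3. It does not call the layer and makes no claim about how the layer numbers its axes.

```r
# Hypothetical helper (not a keras3 function): PReLU on one sample of shape
# (height, width, channels), with one alpha value per channel.
prelu_shared <- function(x, alpha) {
  stopifnot(length(dim(x)) == 3, length(alpha) == dim(x)[3])
  # Expand alpha so that element [h, w, c] equals alpha[c].
  alpha_full <- array(rep(alpha, each = prod(dim(x)[1:2])), dim = dim(x))
  # Same rule as the formula in the Description, applied elementwise.
  ifelse(x >= 0, x, alpha_full * x)
}

x <- array(c(-1, 2, -3, 4, -5, 6, -7, 8), dim = c(2, 2, 2))
alpha <- c(0.1, 0.5)  # one slope per channel
y <- prelu_shared(x, alpha)
```

Negative entries in the first channel are scaled by 0.1 and those in the second channel by 0.5. Without sharing, alpha would hold one value per element of x.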

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

See Also

Other activation layers:
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_relu()
layer_activation_softmax()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Rectified Linear Unit activation function layer.

Description

Formula:

f <- function(x, max_value = Inf, negative_slope = 0, threshold = 0) {
  # with the default arguments this reduces to max(x, 0)
  if (x >= max_value)
    max_value
  else if (threshold <= x && x < max_value)
    x
  else
    negative_slope * (x - threshold)
}

Usage

layer_activation_relu(
  object,
  max_value = NULL,
  negative_slope = 0,
  threshold = 0,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

max_value

Float >= 0. Maximum activation value. NULL means unlimited. Defaults to NULL.

negative_slope

Float >= 0. Negative slope coefficient. Defaults to 0.0.

threshold

Float >= 0. Threshold value for thresholded activation. Defaults to 0.0.

...

Base layer keyword arguments, such as name and dtype.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Examples

relu_layer <- layer_activation_relu(max_value = 10,
                                    negative_slope = 0.5,
                                    threshold = 0)
input <- array(c(-10, -5, 0.0, 5, 10))
result <- relu_layer(input)
as.array(result)
## [1] -5.0 -2.5  0.0  5.0 10.0
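
The output above can be cross-checked against the piecewise definition without calling Keras. The following is a minimal base R sketch: relu_ref() is a hypothetical helper, and Inf stands in for max_value = NULL.

```r
# Hypothetical reference implementation (not a keras3 function).
relu_ref <- function(x, max_value = Inf, negative_slope = 0, threshold = 0) {
  ifelse(
    x >= max_value,
    max_value,                               # saturate at max_value
    ifelse(x >= threshold,
           x,                                # pass through unchanged
           negative_slope * (x - threshold)) # sloped branch below threshold
  )
}

relu_ref(c(-10, -5, 0, 5, 10), max_value = 10, negative_slope = 0.5)
# -5.0 -2.5  0.0  5.0 10.0
```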

See Also

Other activation layers:
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_softmax()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Softmax activation layer.

Description

Formula:

exp_x = exp(x - max(x))
f(x) = exp_x / sum(exp_x)

Usage

layer_activation_softmax(object, axis = -1L, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

axis

Integer, or list of Integers, axis along which the softmax normalization is applied.

...

Base layer keyword arguments, such as name and dtype.

Value

Softmaxed output with the same shape as inputs.

Examples

softmax_layer <- layer_activation_softmax()
input <- op_array(c(1, 2, 1))
softmax_layer(input)
## tf.Tensor([0.21194157 0.5761169  0.21194157], shape=(3), dtype=float32)
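
The formula in the Description can be reproduced in base R to check the values above. This is a sketch for a plain numeric vector. softmax_ref() is a hypothetical helper, and the axis and mask handling of the layer is not modeled.

```r
# Hypothetical reference implementation (not a keras3 function).
softmax_ref <- function(x) {
  exp_x <- exp(x - max(x))  # subtracting max(x) avoids overflow in exp()
  exp_x / sum(exp_x)
}

round(softmax_ref(c(1, 2, 1)), 4)
# 0.2119 0.5761 0.2119

# Large logits give the same result because only differences matter.
softmax_ref(c(1000, 1001, 1000))
```

Without the max(x) shift, exp(1000) overflows to Inf in double precision and the division yields NaN.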

Call Arguments

  • inputs: The inputs (logits) to the softmax layer.

  • mask: A boolean mask of the same shape as inputs. The mask specifies 1 to keep and 0 to mask. Defaults to NULL.

See Also

Other activation layers:
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Layer that applies an update to the cost function based on input activity.

Description

Layer that applies an update to the cost function based on input activity.

Usage

layer_activity_regularization(object, l1 = 0, l2 = 0, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

l1

L1 regularization factor (positive float).

l2

L2 regularization factor (positive float).

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Input Shape

Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.

Output Shape

Same shape as input.
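
This page does not spell out the penalty itself. The sketch below assumes the usual L1/L2 form: l1 times the sum of absolute activations plus l2 times the sum of squared activations. activity_penalty() is a hypothetical helper, and any scaling the layer may apply (for example by batch size) is not modeled.

```r
# Hypothetical helper (not a keras3 function); assumes the standard
# L1/L2 penalty form applied to the layer's input activations.
activity_penalty <- function(x, l1 = 0, l2 = 0) {
  l1 * sum(abs(x)) + l2 * sum(x^2)
}

activity_penalty(c(-1, 2, -3), l1 = 0.01, l2 = 0.1)
# 0.01 * 6 + 0.1 * 14 = 1.46
```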

See Also

Other regularization layers:
layer_alpha_dropout()
layer_dropout()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Performs elementwise addition operation.

Description

It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape).

Usage

layer_add(inputs, ...)

Arguments

inputs

Layers to combine.

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Examples

input_shape <- c(1, 2, 3)
x1 <- op_ones(input_shape)
x2 <- op_ones(input_shape)
layer_add(x1, x2)
## tf.Tensor(
## [[[2. 2. 2.]
##   [2. 2. 2.]]], shape=(1, 2, 3), dtype=float32)

Usage in a Keras model:

input1 <- layer_input(shape = c(16))
x1 <- input1 |> layer_dense(8, activation = 'relu')

input2 <- layer_input(shape = c(32))
x2 <- input2 |> layer_dense(8, activation = 'relu')

# equivalent to `added <- layer_add(list(x1, x2))`
added <- layer_add(x1, x2)
output <- added |> layer_dense(4)

model <- keras_model(inputs = c(input1, input2), outputs = output)

See Also

Other merging layers:
layer_average()
layer_concatenate()
layer_dot()
layer_maximum()
layer_minimum()
layer_multiply()
layer_subtract()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Additive attention layer, a.k.a. Bahdanau-style attention.

Description

Inputs are a list with 2 or 3 elements:

  1. A query tensor of shape ⁠(batch_size, Tq, dim)⁠.

  2. A value tensor of shape ⁠(batch_size, Tv, dim)⁠.

  3. An optional key tensor of shape ⁠(batch_size, Tv, dim)⁠. If none is supplied, value will be used as key.

The calculation follows the steps:

  1. Calculate attention scores using query and key with shape ⁠(batch_size, Tq, Tv)⁠ as a non-linear sum: scores = reduce_sum(tanh(query + key), axis=-1).

  2. Use scores to calculate a softmax distribution with shape ⁠(batch_size, Tq, Tv)⁠.

  3. Use the softmax distribution to create a linear combination of value with shape ⁠(batch_size, Tq, dim)⁠.
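
The three steps above can be sketched in base R for a single batch element, with query of shape (Tq, dim) and key and value of shape (Tv, dim). additive_attention() and softmax_rows() are hypothetical helpers, not keras3 functions. The sketch assumes no learned scale, no masking, and no dropout.

```r
# Row-wise softmax with the usual max shift for numerical stability.
softmax_rows <- function(m) {
  e <- exp(m - apply(m, 1, max))
  e / rowSums(e)
}

# Hypothetical sketch of additive attention for one batch element.
additive_attention <- function(query, value, key = value) {
  Tq <- nrow(query)
  Tv <- nrow(key)
  scores <- matrix(0, Tq, Tv)
  for (i in seq_len(Tq)) {
    for (j in seq_len(Tv)) {
      # Step 1: non-linear sum over the feature axis.
      scores[i, j] <- sum(tanh(query[i, ] + key[j, ]))
    }
  }
  weights <- softmax_rows(scores)               # Step 2: shape (Tq, Tv)
  list(output = weights %*% value,              # Step 3: shape (Tq, dim)
       weights = weights)
}

query <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 2)  # Tq = 2, dim = 3
value <- matrix(1:12, nrow = 4)                             # Tv = 4, dim = 3
res <- additive_attention(query, value)
dim(res$output)   # 2 3
```

Each row of res$weights sums to 1, so every output row is a weighted average of the rows of value.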

Usage

layer_additive_attention(object, use_scale = TRUE, dropout = 0, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

use_scale

If TRUE, will create a scalar variable to scale the attention scores.

dropout

Float between 0 and 1. Fraction of the units to drop for the attention scores. Defaults to 0.0.

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Call Arguments

  • inputs: List of the following tensors:

    • query: Query tensor of shape ⁠(batch_size, Tq, dim)⁠.

    • value: Value tensor of shape ⁠(batch_size, Tv, dim)⁠.

    • key: Optional key tensor of shape ⁠(batch_size, Tv, dim)⁠. If not given, will use value for both key and value, which is the most common case.

  • mask: List of the following tensors:

    • query_mask: A boolean mask tensor of shape ⁠(batch_size, Tq)⁠. If given, the output will be zero at the positions where mask==FALSE.

    • value_mask: A boolean mask tensor of shape ⁠(batch_size, Tv)⁠. If given, will apply the mask such that values at positions where mask==FALSE do not contribute to the result.

  • return_attention_scores: Boolean. If TRUE, returns the attention scores (after masking and softmax) as an additional output argument.

  • training: Boolean indicating whether the layer should behave in training mode (adding dropout) or in inference mode (no dropout).

  • use_causal_mask: Boolean. Set to TRUE for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i. This prevents the flow of information from the future towards the past. Defaults to FALSE.

Output

Attention outputs of shape ⁠(batch_size, Tq, dim)⁠. (Optional) Attention scores after masking and softmax with shape ⁠(batch_size, Tq, Tv)⁠.

See Also

Other attention layers:
layer_attention()
layer_group_query_attention()
layer_multi_head_attention()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Applies Alpha Dropout to the input.

Description

Alpha Dropout is a variant of dropout that keeps the mean and variance of its inputs at their original values, which preserves the self-normalizing property after dropout is applied. It is designed for Scaled Exponential Linear Units (SELU): dropped activations are set to the SELU negative saturation value rather than to zero.
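
The mean- and variance-preserving behavior can be illustrated in base R. The sketch below follows the standard alpha dropout formulation for SELU and is an assumption, not a transcription of the layer's code. alpha_dropout_ref() is a hypothetical helper, and the two constants are the usual SELU alpha and scale values.

```r
# Hypothetical reference sketch (not a keras3 function).
alpha_dropout_ref <- function(x, rate) {
  alpha <- 1.6732632423543772   # SELU alpha (assumed constant)
  scale <- 1.0507009873554805   # SELU scale (assumed constant)
  alpha_p <- -alpha * scale     # negative saturation value of SELU
  keep <- runif(length(x)) >= rate
  # Affine correction chosen so that zero-mean, unit-variance inputs
  # keep mean 0 and variance 1 after dropping.
  a <- ((1 - rate) * (1 + rate * alpha_p^2))^(-0.5)
  b <- -a * alpha_p * rate
  a * ifelse(keep, x, alpha_p) + b
}

set.seed(1)
x <- rnorm(1e5)                       # roughly mean 0, variance 1
y <- alpha_dropout_ref(x, rate = 0.2)
c(mean = mean(y), var = var(y))       # both should stay close to 0 and 1
```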

Usage

layer_alpha_dropout(object, rate, noise_shape = NULL, seed = NULL, ...)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

rate

Float between 0 and 1. The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)).

noise_shape

1D integer tensor representing the shape of the binary alpha dropout mask that will be multiplied with the input. For instance, if your inputs have shape ⁠(batch_size, timesteps, features)⁠ and you want the alpha dropout mask to be the same for all timesteps, you can use ⁠noise_shape = (batch_size, 1, features)⁠.

seed

An integer to use as random seed.

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Call Arguments

  • inputs: Input tensor (of any rank).

  • training: R boolean indicating whether the layer should behave in training mode (adding alpha dropout) or in inference mode (doing nothing).

See Also

Other regularization layers:
layer_activity_regularization()
layer_dropout()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Dot-product attention layer, a.k.a. Luong-style attention.

Description

Inputs are a list with 2 or 3 elements:

  1. A query tensor of shape ⁠(batch_size, Tq, dim)⁠.

  2. A value tensor of shape ⁠(batch_size, Tv, dim)⁠.

  3. An optional key tensor of shape ⁠(batch_size, Tv, dim)⁠. If none is supplied, value will be used as key.

The calculation follows the steps:

  1. Calculate attention scores using query and key with shape ⁠(batch_size, Tq, Tv)⁠.

  2. Use scores to calculate a softmax distribution with shape ⁠(batch_size, Tq, Tv)⁠.

  3. Use the softmax distribution to create a linear combination of value with shape ⁠(batch_size, Tq, dim)⁠.
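
The three steps above can be sketched in base R for a single batch element in the "dot" score mode. dot_attention() and softmax_rows() are hypothetical helpers, not keras3 functions. The sketch assumes no learned scale, no masking, and no dropout.

```r
# Row-wise softmax with the usual max shift for numerical stability.
softmax_rows <- function(m) {
  e <- exp(m - apply(m, 1, max))
  e / rowSums(e)
}

# Hypothetical sketch of dot-product attention for one batch element.
dot_attention <- function(query, value, key = value) {
  scores <- query %*% t(key)        # Step 1: shape (Tq, Tv)
  weights <- softmax_rows(scores)   # Step 2: shape (Tq, Tv)
  list(output = weights %*% value,  # Step 3: shape (Tq, dim)
       weights = weights)
}

query <- diag(2)  # Tq = 2, dim = 2
value <- diag(2)  # Tv = 2, dim = 2, also used as key
res <- dot_attention(query, value)
round(res$output, 4)
```

Each query attends most strongly to the key it matches, so the diagonal entries of res$weights are the largest in their rows.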

Usage

layer_attention(
  object,
  use_scale = FALSE,
  score_mode = "dot",
  dropout = 0,
  seed = NULL,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

use_scale

If TRUE, will create a scalar variable to scale the attention scores.

score_mode

Function to use to compute attention scores, one of ⁠{"dot", "concat"}⁠. "dot" refers to the dot product between the query and key vectors. "concat" refers to the hyperbolic tangent of the concatenation of the query and key vectors.

dropout

Float between 0 and 1. Fraction of the units to drop for the attention scores. Defaults to 0.0.

seed

An integer to use as random seed in case of dropout.

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Call Arguments

  • inputs: List of the following tensors:

    • query: Query tensor of shape ⁠(batch_size, Tq, dim)⁠.

    • value: Value tensor of shape ⁠(batch_size, Tv, dim)⁠.

    • key: Optional key tensor of shape ⁠(batch_size, Tv, dim)⁠. If not given, will use value for both key and value, which is the most common case.

  • mask: List of the following tensors:

    • query_mask: A boolean mask tensor of shape ⁠(batch_size, Tq)⁠. If given, the output will be zero at the positions where mask==FALSE.

    • value_mask: A boolean mask tensor of shape ⁠(batch_size, Tv)⁠. If given, will apply the mask such that values at positions where mask==FALSE do not contribute to the result.

  • return_attention_scores: Boolean. If TRUE, returns the attention scores (after masking and softmax) as an additional output argument.

  • training: Boolean indicating whether the layer should behave in training mode (adding dropout) or in inference mode (no dropout).

  • use_causal_mask: Boolean. Set to TRUE for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i. This prevents the flow of information from the future towards the past. Defaults to FALSE.

Output

Attention outputs of shape ⁠(batch_size, Tq, dim)⁠. (Optional) Attention scores after masking and softmax with shape ⁠(batch_size, Tq, Tv)⁠.

See Also

Other attention layers:
layer_additive_attention()
layer_group_query_attention()
layer_multi_head_attention()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Averages a list of inputs element-wise.

Description

It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape).

Usage

layer_average(inputs, ...)

Arguments

inputs

layers to combine

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Examples

input_shape <- c(1, 2, 3)
x1 <- op_ones(input_shape)
x2 <- op_zeros(input_shape)
layer_average(x1, x2)
## tf.Tensor(
## [[[0.5 0.5 0.5]
##   [0.5 0.5 0.5]]], shape=(1, 2, 3), dtype=float32)

Usage in a Keras model:

input1 <- layer_input(shape = c(16))
x1 <- input1 |> layer_dense(8, activation = 'relu')

input2 <- layer_input(shape = c(32))
x2 <- input2 |> layer_dense(8, activation = 'relu')

averaged <- layer_average(x1, x2)
output <- averaged |> layer_dense(4)

model <- keras_model(inputs = c(input1, input2), outputs = output)

See Also

Other merging layers:
layer_add()
layer_concatenate()
layer_dot()
layer_maximum()
layer_minimum()
layer_multiply()
layer_subtract()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Average pooling for temporal data.

Description

Downsamples the input representation by taking the average value over the window defined by pool_size. The window is shifted by strides. The resulting output when using the "valid" padding option has a shape of: output_shape = floor((input_shape - pool_size) / strides) + 1 (when input_shape >= pool_size)

The resulting output shape when using the "same" padding option is: output_shape = floor((input_shape - 1) / strides) + 1
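The "valid" case can be checked by hand. The sketch below is plain base R, not keras3 code; avg_pool_1d_valid() is a hypothetical helper that assumes a single sequence with one feature and length(x) >= pool_size, and it computes the output length as floor((input_shape - pool_size) / strides) + 1.

```r
# Hypothetical helper: 1D average pooling over a numeric vector
# with "valid" padding (no padding is added).
avg_pool_1d_valid <- function(x, pool_size, strides = pool_size) {
  n_out <- floor((length(x) - pool_size) / strides) + 1
  starts <- (seq_len(n_out) - 1) * strides + 1  # first index of each window
  vapply(starts, function(s) mean(x[s:(s + pool_size - 1)]), numeric(1))
}

x <- c(1, 2, 3, 4, 5)
avg_pool_1d_valid(x, pool_size = 2, strides = 1)  # 1.5 2.5 3.5 4.5
avg_pool_1d_valid(x, pool_size = 2, strides = 2)  # 1.5 3.5
```

Like the layer, the helper defaults strides to pool_size, which gives non-overlapping windows.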

Usage

layer_average_pooling_1d(
  object,
  pool_size,
  strides = NULL,
  padding = "valid",
  data_format = NULL,
  name = NULL,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

pool_size

int, size of the average pooling window.

strides

int or NULL. Specifies how much the pooling window moves for each pooling step. If NULL, it will default to pool_size.

padding

string, either "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.

data_format

string, either "channels_last" or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape ⁠(batch, steps, features)⁠ while "channels_first" corresponds to inputs with shape ⁠(batch, features, steps)⁠. It defaults to the image_data_format value found in your Keras config file at ⁠~/.keras/keras.json⁠. If you never set it, then it will be "channels_last".

name

String, name for the object

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Input Shape

  • If data_format="channels_last": 3D tensor with shape ⁠(batch_size, steps, features)⁠.

  • If data_format="channels_first": 3D tensor with shape ⁠(batch_size, features, steps)⁠.

Output Shape

  • If data_format="channels_last": 3D tensor with shape ⁠(batch_size, downsampled_steps, features)⁠.

  • If data_format="channels_first": 3D tensor with shape ⁠(batch_size, features, downsampled_steps)⁠.

Examples

strides=1 and padding="valid":

x <- op_array(c(1., 2., 3., 4., 5.)) |> op_reshape(c(1, 5, 1))
output <- x |>
  layer_average_pooling_1d(pool_size = 2,
                           strides = 1,
                           padding = "valid")
output
## tf.Tensor(
## [[[1.5]
##   [2.5]
##   [3.5]
##   [4.5]]], shape=(1, 4, 1), dtype=float32)

strides=2 and padding="valid":

x <- op_array(c(1., 2., 3., 4., 5.)) |> op_reshape(c(1, 5, 1))
output <- x |>
  layer_average_pooling_1d(pool_size = 2,
                           strides = 2,
                           padding = "valid")
output
## tf.Tensor(
## [[[1.5]
##   [3.5]]], shape=(1, 2, 1), dtype=float32)

strides=1 and padding="same":

x <- op_array(c(1., 2., 3., 4., 5.)) |> op_reshape(c(1, 5, 1))
output <- x |>
  layer_average_pooling_1d(pool_size = 2,
                           strides = 1,
                           padding = "same")
output
## tf.Tensor(
## [[[1.5]
##   [2.5]
##   [3.5]
##   [4.5]
##   [5. ]]], shape=(1, 5, 1), dtype=float32)

See Also

Other pooling layers:
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Average pooling operation for 2D spatial data.

Description

Downsamples the input along its spatial dimensions (height and width) by taking the average value over an input window (of size defined by pool_size) for each channel of the input. The window is shifted by strides along each dimension.

The resulting output when using the "valid" padding option has a spatial shape (number of rows or columns) of: output_shape = floor((input_shape - pool_size) / strides) + 1 (when input_shape >= pool_size)

The resulting output shape when using the "same" padding option is: output_shape = floor((input_shape - 1) / strides) + 1
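The two shape formulas can be evaluated directly in base R. The helper below, pooled_size(), is a hypothetical name used only for illustration; it is vectorized over the spatial dimensions and does not call keras3.

```r
# Hypothetical helper: spatial output size of a pooling layer,
# computed per dimension from the two formulas.
pooled_size <- function(input_size, pool_size, strides = pool_size,
                        padding = c("valid", "same")) {
  padding <- match.arg(padding)
  if (padding == "valid") {
    stopifnot(all(input_size >= pool_size))
    floor((input_size - pool_size) / strides) + 1
  } else {
    floor((input_size - 1) / strides) + 1
  }
}

pooled_size(c(3, 3), pool_size = c(2, 2), strides = c(1, 1))                    # 2 2
pooled_size(c(3, 4), pool_size = c(2, 2), strides = c(2, 2))                    # 1 2
pooled_size(c(3, 3), pool_size = c(2, 2), strides = c(1, 1), padding = "same")  # 3 3
```

The three calls use the same height, width, pool_size, strides and padding values as the three examples in the Examples section of this page.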

Usage

layer_average_pooling_2d(
  object,
  pool_size,
  strides = NULL,
  padding = "valid",
  data_format = NULL,
  name = NULL,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

pool_size

int or list of 2 integers, factors by which to downscale (dim1, dim2). If only one integer is specified, the same window length will be used for all dimensions.

strides

int or list of 2 integers, or NULL. Strides values. If NULL, it will default to pool_size. If only one int is specified, the same stride size will be used for all dimensions.

padding

string, either "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.

data_format

string, either "channels_last" or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape ⁠(batch, height, width, channels)⁠ while "channels_first" corresponds to inputs with shape ⁠(batch, channels, height, width)⁠. It defaults to the image_data_format value found in your Keras config file at ⁠~/.keras/keras.json⁠. If you never set it, then it will be "channels_last".

name

String, name for the object

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Input Shape

  • If data_format="channels_last": 4D tensor with shape ⁠(batch_size, height, width, channels)⁠.

  • If data_format="channels_first": 4D tensor with shape ⁠(batch_size, channels, height, width)⁠.

Output Shape

  • If data_format="channels_last": 4D tensor with shape ⁠(batch_size, pooled_height, pooled_width, channels)⁠.

  • If data_format="channels_first": 4D tensor with shape ⁠(batch_size, channels, pooled_height, pooled_width)⁠.

Examples

⁠strides=(1, 1)⁠ and padding="valid":

x <- op_array(1:9, "float32") |> op_reshape(c(1, 3, 3, 1))
output <- x |>
  layer_average_pooling_2d(pool_size = c(2, 2),
                           strides = c(1, 1),
                           padding = "valid")
output
## tf.Tensor(
## [[[[3.]
##    [4.]]
##
##   [[6.]
##    [7.]]]], shape=(1, 2, 2, 1), dtype=float32)

⁠strides=(2, 2)⁠ and padding="valid":

x <- op_array(1:12, "float32") |> op_reshape(c(1, 3, 4, 1))
output <- x |>
  layer_average_pooling_2d(pool_size = c(2, 2),
                           strides = c(2, 2),
                           padding = "valid")
output
## tf.Tensor(
## [[[[3.5]
##    [5.5]]]], shape=(1, 1, 2, 1), dtype=float32)

strides=(1, 1) and padding="same":

x <- op_array(1:9, "float32") |> op_reshape(c(1, 3, 3, 1))
output <- x |>
  layer_average_pooling_2d(pool_size = c(2, 2),
                           strides = c(1, 1),
                           padding = "same")
output
## tf.Tensor(
## [[[[3. ]
##    [4. ]
##    [4.5]]
##
##   [[6. ]
##    [7. ]
##    [7.5]]
##
##   [[7.5]
##    [8.5]
##    [9. ]]]], shape=(1, 3, 3, 1), dtype=float32)

See Also

Other pooling layers:
layer_average_pooling_1d()
layer_average_pooling_3d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Average pooling operation for 3D data (spatial or spatio-temporal).

Description

Downsamples the input along its spatial dimensions (depth, height, and width) by taking the average value over an input window (of size defined by pool_size) for each channel of the input. The window is shifted by strides along each dimension.

Usage

layer_average_pooling_3d(
  object,
  pool_size,
  strides = NULL,
  padding = "valid",
  data_format = NULL,
  name = NULL,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

pool_size

int or list of 3 integers, factors by which to downscale (dim1, dim2, dim3). If only one integer is specified, the same window length will be used for all dimensions.

strides

int or list of 3 integers, or NULL. Strides values. If NULL, it will default to pool_size. If only one int is specified, the same stride size will be used for all dimensions.

padding

string, either "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.

data_format

string, either "channels_last" or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape ⁠(batch, spatial_dim1, spatial_dim2, spatial_dim3, channels)⁠ while "channels_first" corresponds to inputs with shape ⁠(batch, channels, spatial_dim1, spatial_dim2, spatial_dim3)⁠. It defaults to the image_data_format value found in your Keras config file at ⁠~/.keras/keras.json⁠. If you never set it, then it will be "channels_last".

name

String, name for the object

...

For forward/backward compatibility.

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Input Shape

  • If data_format="channels_last": 5D tensor with shape: ⁠(batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)⁠

  • If data_format="channels_first": 5D tensor with shape: ⁠(batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)⁠

Output Shape

  • If data_format="channels_last": 5D tensor with shape: ⁠(batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels)⁠

  • If data_format="channels_first": 5D tensor with shape: ⁠(batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3)⁠

Examples

depth <- height <- width <- 30
channels <- 3

inputs <- layer_input(shape = c(depth, height, width, channels))
outputs <- inputs |> layer_average_pooling_3d(pool_size = 3)
outputs # Shape: (batch_size, 10, 10, 10, 3)
## <KerasTensor shape=(None, 10, 10, 10, 3), dtype=float32, sparse=False, name=keras_tensor_1>

See Also

Other pooling layers:
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Layer that normalizes its inputs.

Description

Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1.

Importantly, batch normalization works differently during training and during inference.

During training (i.e. when using fit() or when calling the layer/model with the argument training = TRUE), the layer normalizes its output using the mean and standard deviation of the current batch of inputs. That is to say, for each channel being normalized, the layer returns gamma * (batch - mean(batch)) / sqrt(var(batch) + epsilon) + beta, where:

  • epsilon is a small constant (configurable as part of the constructor arguments).

  • gamma is a learned scaling factor (initialized as 1), which can be disabled by passing scale = FALSE to the constructor.

  • beta is a learned offset factor (initialized as 0), which can be disabled by passing center = FALSE to the constructor.

During inference (i.e. when using evaluate() or predict(), or when calling the layer/model with the argument training = FALSE, which is the default), the layer normalizes its output using a moving average of the mean and standard deviation of the batches it has seen during training. That is to say, it returns gamma * (batch - self$moving_mean) / sqrt(self$moving_var + epsilon) + beta.

self$moving_mean and self$moving_var are non-trainable variables that are updated each time the layer is called in training mode, as such:

  • moving_mean = moving_mean * momentum + mean(batch) * (1 - momentum)

  • moving_var = moving_var * momentum + var(batch) * (1 - momentum)

As such, the layer will only normalize its inputs during inference after having been trained on data whose statistics are similar to those of the inference data.
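The training-mode formula, the moving-statistics update, and the inference-mode formula can be traced with a small base R sketch for a single channel. The helper names batch_norm_train() and batch_norm_infer() are hypothetical, and the sketch assumes the population variance (dividing by n rather than by n - 1 as R's var() does); it is not the layer's actual implementation.

```r
# Hypothetical helpers; base R only, one channel.
batch_norm_train <- function(batch, state, gamma = 1, beta = 0,
                             momentum = 0.99, epsilon = 0.001) {
  mu <- mean(batch)
  sigma2 <- mean((batch - mu)^2)  # population variance (assumption)
  out <- gamma * (batch - mu) / sqrt(sigma2 + epsilon) + beta
  # Exponential moving averages of the batch statistics.
  state$moving_mean <- state$moving_mean * momentum + mu * (1 - momentum)
  state$moving_var <- state$moving_var * momentum + sigma2 * (1 - momentum)
  list(output = out, state = state)
}

batch_norm_infer <- function(batch, state, gamma = 1, beta = 0,
                             epsilon = 0.001) {
  gamma * (batch - state$moving_mean) / sqrt(state$moving_var + epsilon) + beta
}

state <- list(moving_mean = 0, moving_var = 1)  # "zeros" and "ones" initializers
batch <- c(2, 4, 6, 8)                          # mean 5, population variance 5
step <- batch_norm_train(batch, state)
step$state$moving_mean  # 0 * 0.99 + 5 * 0.01 = 0.05
step$state$moving_var   # 1 * 0.99 + 5 * 0.01 = 1.04
batch_norm_infer(batch, step$state)
```

After a single training step the moving statistics are still close to their initial values, so the inference-mode output for the same batch is far from zero mean and unit variance. This is why the moving averages need many training batches before inference-mode normalization becomes effective.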

About setting layer$trainable <- FALSE on a BatchNormalization layer:

The meaning of setting layer$trainable <- FALSE is to freeze the layer, i.e. its internal state will not change during training: its trainable weights will not be updated during fit() or train_on_batch(), and its state updates will not be run.

Usually, this does not necessarily mean that the layer is run in inference mode (which is normally controlled by the training argument that can be passed when calling a layer). "Frozen state" and "inference mode" are two separate concepts.

However, in the case of the BatchNormalization layer, setting layer$trainable <- FALSE means that the layer will subsequently be run in inference mode (meaning that it will use the moving mean and the moving variance to normalize the current batch, rather than using the mean and variance of the current batch).

Note that:

  • Setting trainable on a model containing other layers will recursively set the trainable value of all inner layers.

  • If the value of the trainable attribute is changed after calling compile() on a model, the new value doesn't take effect for this model until compile() is called again.

Usage

layer_batch_normalization(
  object,
  axis = -1L,
  momentum = 0.99,
  epsilon = 0.001,
  center = TRUE,
  scale = TRUE,
  beta_initializer = "zeros",
  gamma_initializer = "ones",
  moving_mean_initializer = "zeros",
  moving_variance_initializer = "ones",
  beta_regularizer = NULL,
  gamma_regularizer = NULL,
  beta_constraint = NULL,
  gamma_constraint = NULL,
  synchronized = FALSE,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

axis

Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format = "channels_first", use axis = 2.

momentum

Momentum for the moving average.

epsilon

Small float added to variance to avoid dividing by zero.

center

If TRUE, add offset of beta to normalized tensor. If FALSE, beta is ignored.

scale

If TRUE, multiply by gamma. If FALSE, gamma is not used. When the next layer is linear this can be disabled since the scaling will be done by the next layer.

beta_initializer

Initializer for the beta weight.

gamma_initializer

Initializer for the gamma weight.

moving_mean_initializer

Initializer for the moving mean.

moving_variance_initializer

Initializer for the moving variance.

beta_regularizer

Optional regularizer for the beta weight.

gamma_regularizer

Optional regularizer for the gamma weight.

beta_constraint

Optional constraint for the beta weight.

gamma_constraint

Optional constraint for the gamma weight.

synchronized

Only applicable with the TensorFlow backend. If TRUE, synchronizes the global batch statistics (mean and variance) for the layer across all devices at each training step in a distributed training strategy. If FALSE, each replica uses its own local batch statistics.

...

Base layer keyword arguments (e.g. name and dtype).

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Call Arguments

  • inputs: Input tensor (of any rank).

  • training: R boolean indicating whether the layer should behave in training mode or in inference mode.

    • training = TRUE: The layer will normalize its inputs using the mean and variance of the current batch of inputs.

    • training = FALSE: The layer will normalize its inputs using the mean and variance of its moving statistics, learned during training.

  • mask: Binary tensor of shape broadcastable to inputs tensor, with TRUE values indicating the positions for which mean and variance should be computed. Masked elements of the current inputs are not taken into account for mean and variance computation during training. Any prior unmasked element values will be taken into account until their momentum expires.

Reference

See Also

Other normalization layers:
layer_group_normalization()
layer_layer_normalization()
layer_spectral_normalization()
layer_unit_normalization()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()


Bidirectional wrapper for RNNs.

Description

Bidirectional wrapper for RNNs.

Usage

layer_bidirectional(
  object,
  layer,
  merge_mode = "concat",
  weights = NULL,
  backward_layer = NULL,
  ...
)

Arguments

object

Object to compose the layer with. A tensor, array, or sequential model.

layer

RNN instance, such as layer_lstm() or layer_gru(). It could also be a Layer() instance that meets the following criteria:

  1. Be a sequence-processing layer (accepts 3D+ inputs).

  2. Have a go_backwards, return_sequences and return_state attribute (with the same semantics as for the RNN class).

  3. Have an input_spec attribute.

  4. Implement serialization via get_config() and from_config().

Note that the recommended way to create new RNN layers is to write a custom RNN cell and use it with layer_rnn(), instead of subclassing with Layer() directly. When return_sequences is TRUE, the output of the masked timestep will be zero regardless of the layer's original zero_output_for_mask value.

merge_mode

Mode by which outputs of the forward and backward RNNs will be combined. One of {"sum", "mul", "concat", "ave", NULL}. If NULL, the outputs are not combined and are returned as a list instead. Defaults to "concat".

weights

see description

backward_layer

Optional RNN or Layer() instance to be used to handle backwards input processing. If backward_layer is not provided, the layer instance passed as the layer argument will be used to generate the backward layer automatically. Note that the provided backward_layer should have properties matching those of the layer argument; in particular, it should have the same values for stateful, return_state, return_sequences, etc. In addition, backward_layer and layer should have different go_backwards argument values. A ValueError will be raised if these requirements are not met.

...

For forward/backward compatibility.
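As a rough illustration of the merge_mode options, the base R sketch below combines two plain matrices standing in for the forward and backward outputs. The helper merge_outputs() is hypothetical and is not part of keras3; "concat" is modeled here as joining along the last axis of a (batch, units) matrix.

```r
# Hypothetical helper mirroring the documented merge_mode choices.
merge_outputs <- function(forward, backward, merge_mode = "concat") {
  if (is.null(merge_mode)) {
    # No merging: return both outputs as a list.
    return(list(forward, backward))
  }
  switch(
    merge_mode,
    sum = forward + backward,
    mul = forward * backward,
    ave = (forward + backward) / 2,
    concat = cbind(forward, backward),  # join along the last (units) axis
    stop("Unknown merge_mode: ", merge_mode)
  )
}

forward <- matrix(1, nrow = 2, ncol = 3)   # stand-in for the forward RNN output
backward <- matrix(2, nrow = 2, ncol = 3)  # stand-in for the backward RNN output

concat_out <- merge_outputs(forward, backward, "concat")
ave_out <- merge_outputs(forward, backward, "ave")
list_out <- merge_outputs(forward, backward, NULL)
```

Only "concat" changes the size of the last axis (here from 3 to 6 units); the other modes keep the shape of a single RNN output.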

Value

The return value depends on the value provided for the first argument. If object is:

  • a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.

  • a keras_input(), then the output tensor from calling layer(input) is returned.

  • NULL or missing, then a Layer instance is returned.

Call Arguments

The call arguments for this layer are the same as those of the wrapped RNN layer. Beware that when the initial_state argument is passed in a call to this layer, the first half of the elements in the initial_state list is passed to the forward RNN call and the second half is passed to the backward RNN call.
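The splitting of initial_state can be pictured with a small base R sketch. The helper split_initial_state() is hypothetical, and character labels stand in for the state tensors; with a wrapped LSTM, each direction would receive two states.

```r
# Hypothetical helper: splits a list of initial states into the halves
# that the forward and backward RNN calls would receive.
split_initial_state <- function(initial_state) {
  n <- length(initial_state)
  stopifnot(n %% 2 == 0)  # both directions need the same number of states
  half <- n / 2
  list(
    forward = initial_state[seq_len(half)],
    backward = initial_state[half + seq_len(half)]
  )
}

# Labels stand in for tensors: two states per direction, as for an LSTM.
initial_state <- list("forward_h", "forward_c", "backward_h", "backward_c")
parts <- split_initial_state(initial_state)
```

Here parts$forward holds the first two elements and parts$backward the last two, so the order of elements in initial_state determines which direction receives which state.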

Note

Instantiating a Bidirectional layer from an existing RNN layer instance will not reuse the weights of that RNN layer instance; the Bidirectional layer will have freshly initialized weights.

Examples

model <- keras_model_sequential(input_shape = c(5, 10)) %>%
  layer_bidirectional(layer_lstm(units = 10, return_sequences = TRUE)) %>%
  layer_bidirectional(layer_lstm(units = 10)) %>%
  layer_dense(5, activation = "softmax")

model %>% compile(loss = "categorical_crossentropy",
                  optimizer = "rmsprop")

# With custom backward layer
forward_layer <- layer_lstm(units = 10, return_sequences = TRUE)
backward_layer <- layer_lstm(units = 10, activation = "relu",
                             return_sequences = TRUE, go_backwards = TRUE)

model <- keras_model_sequential(input_shape = c(5, 10)) %>%
  layer_bidirectional(forward_layer, backward_layer = backward_layer) %>%
  layer_dense(5, activation = "softmax")

model %>% compile(loss = "categorical_crossentropy",
                  optimizer = "rmsprop")

States

A Bidirectional layer instance has a states property, which you can access with layer$states. You can also reset the states using layer$reset_state().

See Also

Other rnn layers:
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_gru()
layer_lstm()
layer_rnn()
layer_simple_rnn()
layer_time_distributed()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()