Title: | R Interface to 'Keras' |
---|---|
Description: | Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices. |
Authors: | Tomasz Kalinowski [aut, cph, cre], Daniel Falbel [ctb, cph], JJ Allaire [aut, cph], François Chollet [aut, cph], Posit Software, PBC [cph, fnd], Google [cph, fnd], Yuan Tang [ctb, cph] , Wouter Van Der Bijl [ctb, cph], Martin Studer [ctb, cph], Sigrid Keydana [ctb] |
Maintainer: | Tomasz Kalinowski <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.2.0.9000 |
Built: | 2024-10-06 05:35:31 UTC |
Source: | https://github.com/rstudio/keras3 |
The exponential linear unit (ELU) with alpha > 0 is defined as:
x if x > 0
alpha * (exp(x) - 1) if x <= 0
ELUs have negative values, which push the mean of the activations closer to zero.
Mean activations that are closer to zero enable faster learning as they bring the gradient closer to the natural gradient. ELUs saturate to a negative value when the argument gets smaller. Saturation means a small derivative which decreases the variation and the information that is propagated to the next layer.
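The saturation behaviour described above can be checked with a small base-R sketch. It assumes the standard ELU definition, alpha * (exp(x) - 1) for non-positive inputs; `elu_sketch` is a hypothetical helper for illustration and is not part of keras3.

```r
# Base-R sketch of ELU (hypothetical helper, not the keras3 implementation).
elu_sketch <- function(x, alpha = 1) {
  # expm1(x) computes exp(x) - 1 accurately for small |x|
  ifelse(x > 0, x, alpha * expm1(x))
}

x <- c(-2, -0.5, 0, 1.5)
elu_out <- elu_sketch(x)
print(elu_out)

# For very negative inputs the output saturates towards -alpha
print(elu_sketch(c(-50, -100), alpha = 2))
```

Positive inputs pass through unchanged, while negative inputs are squashed into the interval from -alpha to 0.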
activation_elu(x, alpha = 1)
x |
Input tensor. |
alpha |
Numeric. See description for details. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
Exponential activation function.
activation_exponential(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
The Gaussian error linear unit (GELU) is defined as:
gelu(x) = x * P(X <= x), where X ~ N(0, 1),
i.e. gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))).
GELU weights inputs by their value, rather than gating inputs by their sign as in ReLU.
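As a quick check that the two forms of the definition agree, here is a base-R sketch. Base R has no built-in `erf`, so the sketch derives one from `pnorm()`; the helper names `gelu_cdf`, `gelu_erf` and `erf` are hypothetical and not part of keras3.

```r
# Base-R sketch of the exact (non-approximate) GELU, written two ways.
gelu_cdf <- function(x) x * pnorm(x)            # x * P(X <= x), X ~ N(0, 1)

erf <- function(z) 2 * pnorm(z * sqrt(2)) - 1   # error function via the normal CDF
gelu_erf <- function(x) 0.5 * x * (1 + erf(x / sqrt(2)))

x <- c(-3, -1, 0, 1, 3)
print(gelu_cdf(x))
print(all.equal(gelu_cdf(x), gelu_erf(x)))
```

Unlike ReLU, small negative inputs are scaled down rather than zeroed, which is the "weighting by value" mentioned above.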
activation_gelu(x, approximate = FALSE)
x |
Input tensor. |
approximate |
A |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
The hard sigmoid activation is defined as:
0 if x <= -3
1 if x >= 3
(x/6) + 0.5 if -3 < x < 3
It's a faster, piecewise linear approximation of the sigmoid activation.
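The piecewise definition can be written as a single clamp. The following base-R sketch is only an illustration of the formula; `hard_sigmoid_sketch` is a hypothetical name, not a keras3 function.

```r
# Base-R sketch: clamp the line x/6 + 0.5 to the interval [0, 1].
hard_sigmoid_sketch <- function(x) pmin(pmax(x / 6 + 0.5, 0), 1)

x <- c(-4, -3, 0, 1.5, 3, 4)
hs_out <- hard_sigmoid_sketch(x)
print(hs_out)

# Compare with the exact logistic sigmoid from base R
print(round(plogis(x), 3))
```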
activation_hard_sigmoid(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
It is defined as:
0 if x < -3
x if x > 3
x * (x + 3) / 6 if -3 <= x <= 3
It's a faster approximation of the silu activation, obtained by replacing the sigmoid in silu with the piecewise linear hard sigmoid.
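A base-R sketch of the three cases follows. It uses the equivalent form x * clamp(x + 3, 0, 6) / 6; `hard_silu_sketch` is a hypothetical helper, not part of keras3.

```r
# Base-R sketch of hard silu / hard swish.
hard_silu_sketch <- function(x) x * pmin(pmax(x + 3, 0), 6) / 6

x <- c(-4, -3, 0, 1, 3, 4)
hsilu_out <- hard_silu_sketch(x)
print(hsilu_out)

# Exact silu for comparison: x * sigmoid(x)
print(round(x * plogis(x), 3))
```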
activation_hard_silu(x)
activation_hard_swish(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Leaky relu activation function.
activation_leaky_relu(x, negative_slope = 0.2)
x |
Input tensor. |
negative_slope |
A |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
A "linear" activation is an identity function: it returns the input, unmodified.
activation_linear(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
Each input vector is handled independently. The axis argument sets which axis of the input the function is applied along.
activation_log_softmax(x, axis = -1L)
x |
Input tensor. |
axis |
Integer, axis along which the softmax is applied. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
It is defined as:
mish(x) = x * tanh(softplus(x))
where softplus is defined as:
softplus(x) = log(exp(x) + 1)
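The two formulas compose directly. The base-R sketch below is an illustration only; `softplus_sketch` and `mish_sketch` are hypothetical helpers, not keras3 functions.

```r
# Base-R sketch of mish built from softplus.
softplus_sketch <- function(x) log1p(exp(x))     # log(exp(x) + 1)
mish_sketch <- function(x) x * tanh(softplus_sketch(x))

x <- c(-5, -1, 0, 1, 20)
mish_out <- mish_sketch(x)
print(mish_out)
```

For large positive inputs softplus(x) approaches x and tanh approaches 1, so mish behaves like the identity there.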
activation_mish(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
With default values, this returns the standard ReLU activation: max(x, 0), the element-wise maximum of 0 and the input tensor.
Modifying default parameters allows you to use non-zero thresholds, change the max value of the activation, and to use a non-zero multiple of the input for values below the threshold.
activation_relu(x, negative_slope = 0, max_value = NULL, threshold = 0)
x |
Input tensor. |
negative_slope |
A |
max_value |
A |
threshold |
A |
A tensor with the same shape and dtype as input x.
x <- c(-10, -5, 0, 5, 10)
activation_relu(x)
## tf.Tensor([ 0. 0. 0. 5. 10.], shape=(5), dtype=float32)
activation_relu(x, negative_slope = 0.5)
## tf.Tensor([-5. -2.5 0. 5. 10. ], shape=(5), dtype=float32)
activation_relu(x, max_value = 5)
## tf.Tensor([0. 0. 0. 5. 5.], shape=(5), dtype=float32)
activation_relu(x, threshold = 5)
## tf.Tensor([-0. -0. 0. 0. 10.], shape=(5), dtype=float32)
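To make the interaction of the three parameters explicit, here is a base-R sketch that reproduces the four outputs shown above. It reflects one reading of the documented behaviour (values must be strictly above `threshold` to pass through, and values at or below it are scaled by `negative_slope` relative to the threshold); `relu_sketch` is a hypothetical helper and is not taken from the keras3 source.

```r
# Base-R sketch of the parameterised ReLU (assumed semantics, see lead-in).
relu_sketch <- function(x, negative_slope = 0, max_value = NULL, threshold = 0) {
  negative_part <- pmax(threshold - x, 0)   # distance below the threshold
  out <- ifelse(x > threshold, x, 0)        # keep values strictly above the threshold
  if (!is.null(max_value)) {
    out <- pmin(pmax(out, 0), max_value)    # clip to [0, max_value]
  }
  out - negative_slope * negative_part
}

x <- c(-10, -5, 0, 5, 10)
print(relu_sketch(x))
print(relu_sketch(x, negative_slope = 0.5))
print(relu_sketch(x, max_value = 5))
print(relu_sketch(x, threshold = 5))
```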
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
It's the ReLU function, but truncated to a maximum value of 6.
activation_relu6(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
The Scaled Exponential Linear Unit (SELU) activation function is defined as:
scale * x if x > 0
scale * alpha * (exp(x) - 1) if x <= 0
where alpha and scale are pre-defined constants (alpha = 1.67326324 and scale = 1.05070098).
Basically, the SELU activation function multiplies scale (> 1) with the output of the activation_elu function to ensure a slope larger than one for positive inputs.
The values of alpha and scale are chosen so that the mean and variance of the inputs are preserved between two consecutive layers as long as the weights are initialized correctly (see initializer_lecun_normal()) and the number of input units is "large enough" (see reference paper for more information).
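The relationship between SELU and ELU can be sketched in base R using the constants quoted above. The names `selu_sketch`, `selu_alpha` and `selu_scale` are hypothetical and only serve this illustration.

```r
# Base-R sketch: SELU is scale times ELU with a fixed alpha.
selu_alpha <- 1.67326324
selu_scale <- 1.05070098

selu_sketch <- function(x) {
  selu_scale * ifelse(x > 0, x, selu_alpha * expm1(x))
}

x <- c(-100, -1, 0, 1, 2)
selu_out <- selu_sketch(x)
print(selu_out)
```

Positive inputs are multiplied by scale, and very negative inputs saturate at -scale * alpha.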
activation_selu(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
To be used together with initializer_lecun_normal().
To be used together with the dropout variant layer_alpha_dropout() (legacy, deprecated).
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
It is defined as: sigmoid(x) = 1 / (1 + exp(-x)).
For small values (<-5), sigmoid returns a value close to zero, and for large values (>5) the result of the function gets close to 1.
Sigmoid is equivalent to a 2-element softmax, where the second element is assumed to be zero. The sigmoid function always returns a value between 0 and 1.
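The equivalence with a 2-element softmax can be verified numerically. The base-R sketch below uses hypothetical helpers (`sigmoid_sketch`, `softmax2`) that are not part of keras3.

```r
# Base-R sketch: sigmoid(x) equals the first element of softmax(c(x, 0)).
sigmoid_sketch <- function(x) 1 / (1 + exp(-x))

softmax2 <- function(v) {
  e <- exp(v - max(v))   # subtract the max for numerical stability
  e / sum(e)
}

x <- 1.3
p <- softmax2(c(x, 0))
print(c(sigmoid = sigmoid_sketch(x), softmax_first = p[1]))

# Behaviour at the extremes mentioned above
print(sigmoid_sketch(c(-5, 0, 5)))
```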
activation_sigmoid(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
It is defined as: swish(x) = x * sigmoid(x).
The Swish (or Silu) activation function is a smooth, non-monotonic function that is unbounded above and bounded below.
activation_silu(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_softmax()
activation_softplus()
activation_softsign()
activation_tanh()
The elements of the output vector are in range [0, 1] and sum to 1.
Each input vector is handled independently. The axis argument sets which axis of the input the function is applied along.
Softmax is often used as the activation for the last layer of a classification network because the result can be interpreted as a probability distribution.
The softmax of each vector x is computed as exp(x) / sum(exp(x)).
The input values are the log-odds of the resulting probability.
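The properties above (each vector handled independently, outputs summing to 1, inputs acting as log-odds) can be illustrated with a base-R sketch that applies softmax along the last axis of a matrix. `softmax_rows` is a hypothetical helper; subtracting the row maximum is a common numerical-stability trick and is an assumption here, not a statement about the keras3 internals.

```r
# Base-R sketch: softmax applied independently to each row of a matrix.
softmax_rows <- function(m) {
  shifted <- m - apply(m, 1, max)   # per-row shift; does not change the result
  e <- exp(shifted)
  e / rowSums(e)
}

m <- rbind(c(1, 2, 3),
           c(1000, 1000, 1000))     # large values would overflow without the shift
p <- softmax_rows(m)
print(p)
print(rowSums(p))

# Differences between inputs are log-odds between the matching outputs
print(log(p[1, 2] / p[1, 1]))
```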
activation_softmax(x, axis = -1L)
x |
Input tensor. |
axis |
Integer, axis along which the softmax is applied. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softplus()
activation_softsign()
activation_tanh()
It is defined as: softplus(x) = log(exp(x) + 1).
activation_softplus(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softsign()
activation_tanh()
Softsign is defined as: softsign(x) = x / (abs(x) + 1).
activation_softsign(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_tanh()
It is defined as: tanh(x) = sinh(x) / cosh(x), i.e. tanh(x) = ((exp(x) - exp(-x)) / (exp(x) + exp(-x))).
activation_tanh(x)
x |
Input tensor. |
A tensor, the result from applying the activation to the input tensor x.
Other activations: activation_elu()
activation_exponential()
activation_gelu()
activation_hard_sigmoid()
activation_leaky_relu()
activation_linear()
activation_log_softmax()
activation_mish()
activation_relu()
activation_relu6()
activation_selu()
activation_sigmoid()
activation_silu()
activation_softmax()
activation_softplus()
activation_softsign()
Create an active property class method
active_property(fn)
fn |
An R function |
fn, with an additional R attribute that will cause fn to be converted to an active property when being converted to a method of a custom subclass.
layer_foo <- Model("Foo", ...,
  metrics = active_property(function() {
    list(self$d_loss_metric, self$g_loss_metric)
  }))
Fits the state of the preprocessing layer to the data being passed.
adapt(object, data, ..., batch_size = NULL, steps = NULL)
object |
Preprocessing layer object |
data |
The data to train on. It can be passed either as a
|
... |
Used for forwards and backwards compatibility. Passed on to the underlying method. |
batch_size |
Integer or |
steps |
Integer or |
After calling adapt on a layer, a preprocessing layer's state will not update during training. In order to make preprocessing layers efficient in any distribution context, they are kept constant with respect to any compiled tf.Graphs that call the layer. This does not affect the layer use when adapting each layer only once, but if you adapt a layer multiple times you will need to take care to re-compile any compiled functions as follows:
If you are adding a preprocessing layer to a Keras model, you need to call compile(model) after each subsequent call to adapt().
If you are calling a preprocessing layer inside tfdatasets::dataset_map(), you should call dataset_map() again on the input Dataset after each adapt().
If you are using a tensorflow::tf_function() directly which calls a preprocessing layer, you need to call tf_function() again on your callable after each subsequent call to adapt().
keras_model() example with multiple adapts:
layer <- layer_normalization(axis = NULL)
adapt(layer, c(0, 2))
model <- keras_model_sequential() |> layer()
predict(model, c(0, 1, 2), verbose = FALSE) # [1] -1 0 1
## [1] -1 0 1
adapt(layer, c(-1, 1))
compile(model) # This is needed to re-compile model.predict!
predict(model, c(0, 1, 2), verbose = FALSE) # [1] 0 1 2
## [1] 0 1 2
tfdatasets example with multiple adapts:
layer <- layer_normalization(axis = NULL)
adapt(layer, c(0, 2))
input_ds <- tfdatasets::range_dataset(0, 3)
normalized_ds <- input_ds |>
  tfdatasets::dataset_map(layer)
str(tfdatasets::iterate(normalized_ds))
## List of 3
## $ :<tf.Tensor: shape=(1), dtype=float32, numpy=array([-1.], dtype=float32)>
## $ :<tf.Tensor: shape=(1), dtype=float32, numpy=array([0.], dtype=float32)>
## $ :<tf.Tensor: shape=(1), dtype=float32, numpy=array([1.], dtype=float32)>
adapt(layer, c(-1, 1))
normalized_ds <- input_ds |>
  tfdatasets::dataset_map(layer) # Re-map over the input dataset.
normalized_ds |>
  tfdatasets::as_array_iterator() |>
  tfdatasets::iterate(simplify = FALSE) |>
  str()
## List of 3
## $ : num [1(1d)] 0
## $ : num [1(1d)] 1
## $ : num [1(1d)] 2
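The numbers in both examples follow from what the adapted state holds. The base-R sketch below mimics that state with a closure, under the assumption that the normalization layer stores the mean and population variance of the adapted data and computes (x - mean) / sqrt(variance), ignoring any small epsilon term. `make_normalizer` is a hypothetical helper and not part of keras3.

```r
# Base-R sketch of an "adaptable" normalizer (assumed behaviour, see lead-in).
make_normalizer <- function() {
  mu <- 0
  sigma2 <- 1
  list(
    adapt = function(data) {
      mu <<- mean(data)
      sigma2 <<- mean((data - mu)^2)   # population variance
      invisible(NULL)
    },
    call = function(x) (x - mu) / sqrt(sigma2)
  )
}

norm <- make_normalizer()
norm$adapt(c(0, 2))
first <- norm$call(c(0, 1, 2))
print(first)

norm$adapt(c(-1, 1))
second <- norm$call(c(0, 1, 2))
print(second)
```

In this plain-R closure the new state is picked up immediately; the re-compile steps described above are needed only because compiled graphs capture the state as a constant.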
Returns object, invisibly.
Instantiates the ConvNeXtBase architecture.
application_convnext_base( include_top = TRUE, include_preprocessing = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "convnext_base" )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
A ConvNet for the 2020s (CVPR 2022)
For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.
Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.
When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.
Instantiates the ConvNeXtLarge architecture.
application_convnext_large( include_top = TRUE, include_preprocessing = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "convnext_large" )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
A ConvNet for the 2020s (CVPR 2022)
For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.
Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.
When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.
Instantiates the ConvNeXtSmall architecture.
application_convnext_small( include_top = TRUE, include_preprocessing = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "convnext_small" )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
A ConvNet for the 2020s (CVPR 2022)
For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.
Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.
When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.
Instantiates the ConvNeXtTiny architecture.
application_convnext_tiny( include_top = TRUE, include_preprocessing = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "convnext_tiny" )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
A ConvNet for the 2020s (CVPR 2022)
For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.
Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.
When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.
Instantiates the ConvNeXtXLarge architecture.
application_convnext_xlarge( include_top = TRUE, include_preprocessing = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "convnext_xlarge" )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
A ConvNet for the 2020s (CVPR 2022)
For image classification use cases, see this page for detailed examples. For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The base, large, and xlarge models were first pre-trained on the ImageNet-21k dataset and then fine-tuned on the ImageNet-1k dataset. The pre-trained parameters of the models were assembled from the official repository. To get a sense of how these parameters were converted to Keras compatible parameters, please refer to this repository.
Each Keras Application expects a specific kind of input preprocessing. For ConvNeXt, preprocessing is included in the model using a Normalization layer. ConvNeXt models expect their inputs to be float or uint8 tensors of pixels with values in the [0-255] range.
When calling the summary() method after instantiating a ConvNeXt model, prefer setting the expand_nested argument of summary() to TRUE to better investigate the instantiated model.
Instantiates the Densenet121 architecture.
application_densenet121( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "densenet121" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Keras model instance.
Densely Connected Convolutional Networks (CVPR 2017)
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is the one specified in your Keras config at ~/.keras/keras.json.
Each Keras Application expects a specific kind of input preprocessing. For DenseNet, call application_preprocess_inputs() on your inputs before passing them to the model.
Instantiates the Densenet169 architecture.
application_densenet169( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "densenet169" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Keras model instance.
Densely Connected Convolutional Networks (CVPR 2017)
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is the one specified in your Keras config at ~/.keras/keras.json.
Each Keras Application expects a specific kind of input preprocessing. For DenseNet, call application_preprocess_inputs() on your inputs before passing them to the model.
Instantiates the Densenet201 architecture.
application_densenet201( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "densenet201" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Keras model instance.
Densely Connected Convolutional Networks (CVPR 2017)
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is the one specified in your Keras config at ~/.keras/keras.json.
Each Keras Application expects a specific kind of input preprocessing. For DenseNet, call application_preprocess_inputs() on your inputs before passing them to the model.
Instantiates the EfficientNetB0 architecture.
application_efficientnet_b0( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb0", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB1 architecture.
application_efficientnet_b1( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb1", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB2 architecture.
application_efficientnet_b2( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb2", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB3 architecture.
application_efficientnet_b3( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb3", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB4 architecture.
application_efficientnet_b4( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb4", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB5 architecture.
application_efficientnet_b5( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb5", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB6 architecture.
application_efficientnet_b6( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb6", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetB7 architecture.
application_efficientnet_b7( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "efficientnetb7", ... )
include_top |
Whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
... |
For forward/backward compatibility. |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNet, input preprocessing is included as part of the model
(as a Rescaling layer), and thus application_preprocess_inputs() is actually a
pass-through function. EfficientNet models expect their inputs to be float
tensors of pixels with values in the [0-255] range.
Instantiates the EfficientNetV2B0 architecture.
application_efficientnet_v2b0( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-b0" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
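Because the expected input range depends on include_preprocessing, a quick range check before calling the model can catch mismatched inputs. The base R sketch below encodes the two documented ranges. The helpers expected_range() and inputs_in_range() are hypothetical and not part of keras3, and the check is only a coarse sanity test: an array with values in [0, 1] would pass under either setting.

```r
# Hypothetical helpers (not part of keras3).
# Documented input range for EfficientNetV2, given include_preprocessing.
expected_range <- function(include_preprocessing = TRUE) {
  if (include_preprocessing) c(0, 255) else c(-1, 1)
}

# TRUE when every value of x lies inside the documented range.
inputs_in_range <- function(x, include_preprocessing = TRUE) {
  rng <- expected_range(include_preprocessing)
  all(x >= rng[1] & x <= rng[2])
}

# Raw pixel values, shaped as (batch, height, width, channels).
raw <- array(c(0, 64, 128, 255), dim = c(1, 2, 2, 1))

inputs_in_range(raw, include_preprocessing = TRUE)   # TRUE
inputs_in_range(raw, include_preprocessing = FALSE)  # FALSE
```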
Instantiates the EfficientNetV2B1 architecture.
application_efficientnet_v2b1( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-b1" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
Instantiates the EfficientNetV2B2 architecture.
application_efficientnet_v2b2( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-b2" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
Instantiates the EfficientNetV2B3 architecture.
application_efficientnet_v2b3( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-b3" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
Instantiates the EfficientNetV2L architecture.
application_efficientnet_v2l( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-l" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
Instantiates the EfficientNetV2M architecture.
application_efficientnet_v2m( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-m" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
Instantiates the EfficientNetV2S architecture.
application_efficientnet_v2s( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", include_preprocessing = TRUE, name = "efficientnetv2-s" )
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor
(i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A string or callable. The activation function to use
on the "top" layer. Ignored unless |
include_preprocessing |
Boolean, whether to include the preprocessing layer at the bottom of the network. |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For EfficientNetV2, by default input preprocessing is included as a part of
the model (as a Rescaling layer), and thus application_preprocess_inputs() is
actually a pass-through function. In this use case, EfficientNetV2 models expect
their inputs to be float tensors of pixels with values in the [0, 255] range.
Preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled
by setting the include_preprocessing argument to FALSE.
With preprocessing disabled, EfficientNetV2 models expect their inputs to be
float tensors of pixels with values in the [-1, 1] range.
Instantiates the Inception-ResNet v2 architecture.
application_inception_resnet_v2( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "inception_resnet_v2" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For InceptionResNetV2, call application_preprocess_inputs() on your inputs
before passing them to the model.
application_preprocess_inputs() will scale input pixels between -1 and 1.
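To make the target range concrete, the base R sketch below applies one linear map that sends [0, 255] to [-1, 1]. The formula x / 127.5 - 1 and the helper scale_to_pm1() are assumptions used for illustration and are not part of keras3. In practice, call application_preprocess_inputs() rather than rescaling by hand.

```r
# Hypothetical helper (not part of keras3): assumed linear map from
# [0, 255] to [-1, 1].
scale_to_pm1 <- function(x) x / 127.5 - 1

# Pixel values shaped as (batch, height, width, channels).
pixels <- array(c(0, 127.5, 255, 51), dim = c(1, 2, 2, 1))
scaled <- scale_to_pm1(pixels)

as.vector(scaled)  # -1.0  0.0  1.0 -0.6
range(scaled)      # -1  1
```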
Instantiates the Inception v3 architecture.
application_inception_v3( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "inception_v3" )
include_top |
Boolean, whether to include the fully-connected
layer at the top, as the last layer of the network.
Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor (i.e. output of |
input_shape |
Optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For InceptionV3, call application_preprocess_inputs() on your inputs
before passing them to the model.
application_preprocess_inputs() will scale input pixels between -1 and 1.
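The pooling argument matters when the model is used as a feature extractor. As an illustration only, the base R sketch below shows what global average and global max pooling do to a channels-last feature map: the spatial dimensions are collapsed, leaving one value per sample and channel. That the two pooling modes correspond to these operations is an assumption based on upstream Keras, and the array features is made-up data.

```r
# Made-up feature map: batch = 2, height = 2, width = 2, channels = 3.
features <- array(seq_len(24), dim = c(2, 2, 2, 3))

# Collapse height and width, keeping the batch (1) and channel (4) dimensions.
global_avg <- apply(features, c(1, 4), mean)
global_max <- apply(features, c(1, 4), max)

dim(global_avg)  # 2 3
global_avg[1, ]  # 4 12 20
global_max[2, ]  # 8 16 24
```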
Instantiates the MobileNet architecture.
application_mobilenet( input_shape = NULL, alpha = 1, depth_multiplier = 1L, dropout = 0.001, include_top = TRUE, weights = "imagenet", input_tensor = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = NULL )
input_shape |
Optional shape tuple, only to be specified if |
alpha |
Controls the width of the network. This is known as the width multiplier in the MobileNet paper.
|
depth_multiplier |
Depth multiplier for depthwise convolution.
This is called the resolution multiplier in the MobileNet paper.
Defaults to |
dropout |
Dropout rate. Defaults to |
include_top |
Boolean, whether to include the fully-connected layer
at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor (i.e. output of |
pooling |
Optional pooling mode for feature extraction when
|
classes |
Optional number of classes to classify images into,
only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For MobileNet, call application_preprocess_inputs() on your inputs
before passing them to the model.
application_preprocess_inputs() will scale input pixels between -1 and 1.
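The effect of the alpha argument (the width multiplier) can be sketched with a little arithmetic: each layer's filter count is scaled by alpha. The base R example below is illustrative only. The helper scale_filters(), the list of base filter counts, and the truncation to an integer are assumptions, and the rounding actually used by the library may differ.

```r
# Hypothetical helper (not part of keras3): scale per-layer filter counts
# by the width multiplier, truncating to an integer.
scale_filters <- function(base_filters, alpha = 1) {
  as.integer(base_filters * alpha)
}

# Assumed base filter counts for a few layers.
base_filters <- c(32, 64, 128, 256, 512, 1024)

scale_filters(base_filters, alpha = 1)     # 32 64 128 256 512 1024
scale_filters(base_filters, alpha = 0.5)   # 16 32 64 128 256 512
scale_filters(base_filters, alpha = 0.75)  # 24 48 96 192 384 768
```

A smaller alpha gives a narrower network with fewer parameters, and a larger one gives a wider network.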
MobileNetV2 is very similar to the original MobileNet, except that it uses inverted residual blocks with bottlenecking features. It has a drastically lower parameter count than the original MobileNet. MobileNets support any input size greater than 32 x 32, with larger image sizes offering better performance.
application_mobilenet_v2( input_shape = NULL, alpha = 1, include_top = TRUE, weights = "imagenet", input_tensor = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = NULL )
input_shape |
Optional shape tuple, only to be specified if |
alpha |
Controls the width of the network. This is known as the width multiplier in the MobileNet paper.
|
include_top |
Boolean, whether to include the fully-connected layer
at the top of the network. Defaults to |
weights |
One of |
input_tensor |
Optional Keras tensor (i.e. output of |
pooling |
Optional pooling mode for feature extraction when
|
classes |
Optional number of classes to classify images into,
only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
This function returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For MobileNetV2, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
Instantiates the MobileNetV3Large architecture.
application_mobilenet_v3_large( input_shape = NULL, alpha = 1, minimalistic = FALSE, include_top = TRUE, weights = "imagenet", input_tensor = NULL, classes = 1000L, pooling = NULL, dropout_rate = 0.2, classifier_activation = "softmax", include_preprocessing = TRUE, name = "MobileNetV3Large" )
input_shape |
Optional shape tuple, to be specified if you would
like to use a model with an input image resolution that is not
|
alpha |
controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for consistency with MobileNetV1 in Keras.
|
minimalistic |
In addition to large and small models, this module also contains so-called minimalistic models. These models have the same per-layer dimensions as MobileNetV3, but they don't use any of the advanced blocks (squeeze-and-excite units, hard-swish, and 5x5 convolutions). While these models are less efficient on CPU, they are much more performant on GPU/DSP. |
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
String, one of |
input_tensor |
Optional Keras tensor (i.e. output of
|
classes |
Integer, optional number of classes to classify images
into, only to be specified if |
pooling |
String, optional pooling mode for feature extraction
when
|
dropout_rate |
fraction of the input units to drop on the last layer. |
classifier_activation |
A |
include_preprocessing |
Boolean, whether to include the preprocessing
layer ( |
name |
The name of the model (string). |
A model instance.
Searching for MobileNetV3 (ICCV 2019)
MACs stands for multiply-accumulate operations (multiply-adds).
Classification Checkpoint | MACs(M) | Parameters(M) | Top1 Accuracy | Pixel1 CPU(ms) |
mobilenet_v3_large_1.0_224 | 217 | 5.4 | 75.6 | 51.2 |
mobilenet_v3_large_0.75_224 | 155 | 4.0 | 73.3 | 39.8 |
mobilenet_v3_large_minimalistic_1.0_224 | 209 | 3.9 | 72.3 | 44.1 |
mobilenet_v3_small_1.0_224 | 66 | 2.9 | 68.1 | 15.8 |
mobilenet_v3_small_0.75_224 | 44 | 2.4 | 65.4 | 12.8 |
mobilenet_v3_small_minimalistic_1.0_224 | 65 | 2.0 | 61.9 | 12.2 |
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For MobileNetV3, by default input preprocessing is included as a part of the model (as a Rescaling layer), and thus application_preprocess_inputs() is actually a pass-through function. In this use case, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [0, 255] range.
At the same time, preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.
inputs: A floating point numpy.array or backend-native tensor, 4D with 3 color channels, with values in the range [0, 255] if include_preprocessing is TRUE and in the range [-1, 1] otherwise.
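The two input conventions above can be summarised as a small decision rule. The following is a sketch in plain R, assuming raw pixels in [0, 255] and the x / 127.5 - 1 mapping for the [-1, 1] case; prepare_for_mobilenet_v3() is a hypothetical helper, not a keras3 function.

```r
# Hypothetical helper (not part of keras3): return pixels in the range the
# model expects, given how the model was built.
prepare_for_mobilenet_v3 <- function(x, include_preprocessing = TRUE) {
  stopifnot(all(x >= 0 & x <= 255))
  if (include_preprocessing) {
    x               # the model rescales internally, so keep [0, 255]
  } else {
    x / 127.5 - 1   # the caller must supply values in [-1, 1]
  }
}

# Fake batch: 1 image, 2 x 2 pixels, 3 channels, values spanning 0..255.
pixels <- array(seq(0, 255, length.out = 12), dim = c(1, 2, 2, 3))

range_with_layer <- range(prepare_for_mobilenet_v3(pixels, TRUE))
range_without_layer <- range(prepare_for_mobilenet_v3(pixels, FALSE))
```

The value passed as include_preprocessing here should match the one used when the model was created with application_mobilenet_v3_large() or application_mobilenet_v3_small().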
Instantiates the MobileNetV3Small architecture.
application_mobilenet_v3_small( input_shape = NULL, alpha = 1, minimalistic = FALSE, include_top = TRUE, weights = "imagenet", input_tensor = NULL, classes = 1000L, pooling = NULL, dropout_rate = 0.2, classifier_activation = "softmax", include_preprocessing = TRUE, name = "MobileNetV3Small" )
input_shape |
Optional shape tuple, to be specified if you would
like to use a model with an input image resolution that is not
|
alpha |
controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for consistency with MobileNetV1 in Keras.
|
minimalistic |
In addition to large and small models, this module also contains so-called minimalistic models. These models have the same per-layer dimensions as MobileNetV3, but they don't use any of the advanced blocks (squeeze-and-excite units, hard-swish, and 5x5 convolutions). While these models are less efficient on CPU, they are much more performant on GPU/DSP. |
include_top |
Boolean, whether to include the fully-connected
layer at the top of the network. Defaults to |
weights |
String, one of |
input_tensor |
Optional Keras tensor (i.e. output of
|
classes |
Integer, optional number of classes to classify images
into, only to be specified if |
pooling |
String, optional pooling mode for feature extraction
when
|
dropout_rate |
fraction of the input units to drop on the last layer. |
classifier_activation |
A |
include_preprocessing |
Boolean, whether to include the preprocessing
layer ( |
name |
The name of the model (string). |
A model instance.
Searching for MobileNetV3 (ICCV 2019)
MACs stands for multiply-accumulate operations (multiply-adds).
Classification Checkpoint | MACs(M) | Parameters(M) | Top1 Accuracy | Pixel1 CPU(ms) |
mobilenet_v3_large_1.0_224 | 217 | 5.4 | 75.6 | 51.2 |
mobilenet_v3_large_0.75_224 | 155 | 4.0 | 73.3 | 39.8 |
mobilenet_v3_large_minimalistic_1.0_224 | 209 | 3.9 | 72.3 | 44.1 |
mobilenet_v3_small_1.0_224 | 66 | 2.9 | 68.1 | 15.8 |
mobilenet_v3_small_0.75_224 | 44 | 2.4 | 65.4 | 12.8 |
mobilenet_v3_small_minimalistic_1.0_224 | 65 | 2.0 | 61.9 | 12.2 |
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For MobileNetV3, by default input preprocessing is included as a part of the model (as a Rescaling layer), and thus application_preprocess_inputs() is actually a pass-through function. In this use case, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [0, 255] range.
At the same time, preprocessing as a part of the model (i.e. the Rescaling layer) can be disabled by setting the include_preprocessing argument to FALSE. With preprocessing disabled, MobileNetV3 models expect their inputs to be float tensors of pixels with values in the [-1, 1] range.
inputs: A floating point numpy.array or backend-native tensor, 4D with 3 color channels, with values in the range [0, 255] if include_preprocessing is TRUE and in the range [-1, 1] otherwise.
Instantiates a NASNet model in ImageNet mode.
application_nasnet_large( input_shape = NULL, include_top = TRUE, weights = "imagenet", input_tensor = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "nasnet_large" )
input_shape |
Optional shape tuple, only to be specified
if |
include_top |
Whether to include the fully-connected layer at the top of the network. |
weights |
|
input_tensor |
Optional Keras tensor (i.e. output of
|
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Keras model instance.
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is the one specified in your Keras config at ~/.keras/keras.json.
Each Keras Application expects a specific kind of input preprocessing.
For NASNet, call application_preprocess_inputs() on your inputs before passing them to the model.
Instantiates a Mobile NASNet model in ImageNet mode.
application_nasnet_mobile( input_shape = NULL, include_top = TRUE, weights = "imagenet", input_tensor = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "nasnet_mobile" )
input_shape |
Optional shape tuple, only to be specified
if |
include_top |
Whether to include the fully-connected layer at the top of the network. |
weights |
|
input_tensor |
Optional Keras tensor (i.e. output of
|
pooling |
Optional pooling mode for feature extraction
when
|
classes |
Optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Keras model instance.
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is the one specified in your Keras config at ~/.keras/keras.json.
Each Keras Application expects a specific kind of input preprocessing.
For NASNet, call application_preprocess_inputs() on your inputs before passing them to the model.
Instantiates the ResNet101 architecture.
application_resnet101( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "resnet101" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor (i.e. output of |
input_shape |
optional shape tuple, only to be specified if |
pooling |
Optional pooling mode for feature extraction when
|
classes |
optional number of classes to classify images into, only to be
specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
Deep Residual Learning for Image Recognition (CVPR 2016)
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.
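The RGB-to-BGR conversion and zero-centering described above can be illustrated in plain R. This is a sketch under stated assumptions: the per-channel ImageNet means (103.939, 116.779, 123.68 in BGR order) are the values commonly used by Keras for this style of preprocessing, and rgb_to_centered_bgr() is a hypothetical helper rather than a keras3 function.

```r
# Assumed ImageNet per-channel means, in BGR order.
imagenet_mean_bgr <- c(103.939, 116.779, 123.68)

# Hypothetical helper: x is a (height, width, 3) array in RGB order with
# values in [0, 255].
rgb_to_centered_bgr <- function(x) {
  stopifnot(length(dim(x)) == 3, dim(x)[3] == 3)
  bgr <- x[, , 3:1, drop = FALSE]        # reverse channel order: RGB -> BGR
  sweep(bgr, 3, imagenet_mean_bgr, "-")  # subtract the mean per channel, no scaling
}

# A 2 x 2 pure red image.
img <- array(0, dim = c(2, 2, 3))
img[, , 1] <- 255

out <- rgb_to_centered_bgr(img)
out[1, 1, ]  # blue, green, red after centering
```

Note that the result is not confined to [-1, 1]: the values stay on the original 0 to 255 scale, shifted by the channel means.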
Instantiates the ResNet101V2 architecture.
application_resnet101_v2( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "resnet101v2" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor (i.e. output of |
input_shape |
optional shape tuple, only to be specified if |
pooling |
Optional pooling mode for feature extraction when
|
classes |
optional number of classes to classify images into, only to be
specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
Identity Mappings in Deep Residual Networks (ECCV 2016)
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
Instantiates the ResNet152 architecture.
application_resnet152( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "resnet152" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor (i.e. output of |
input_shape |
optional shape tuple, only to be specified if |
pooling |
Optional pooling mode for feature extraction when
|
classes |
optional number of classes to classify images into, only to be
specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
Deep Residual Learning for Image Recognition (CVPR 2016)
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.
Instantiates the ResNet152V2 architecture.
application_resnet152_v2( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "resnet152v2" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor (i.e. output of |
input_shape |
optional shape tuple, only to be specified if |
pooling |
Optional pooling mode for feature extraction when
|
classes |
optional number of classes to classify images into, only to be
specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
Identity Mappings in Deep Residual Networks (ECCV 2016)
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
Instantiates the ResNet50 architecture.
application_resnet50( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "resnet50" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor (i.e. output of |
input_shape |
optional shape tuple, only to be specified if |
pooling |
Optional pooling mode for feature extraction when
|
classes |
optional number of classes to classify images into, only to be
specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
Deep Residual Learning for Image Recognition (CVPR 2016)
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.
Instantiates the ResNet50V2 architecture.
application_resnet50_v2( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "resnet50v2" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor (i.e. output of |
input_shape |
optional shape tuple, only to be specified if |
pooling |
Optional pooling mode for feature extraction when
|
classes |
optional number of classes to classify images into, only to be
specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
Identity Mappings in Deep Residual Networks (ECCV 2016)
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
Each Keras Application expects a specific kind of input preprocessing.
For ResNet, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
Instantiates the VGG16 model.
application_vgg16( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "vgg16" )
include_top |
whether to include the 3 fully-connected layers at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A Model instance.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The default input size for this model is 224x224.
Each Keras Application expects a specific kind of input preprocessing.
For VGG16, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.
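For the transfer-learning use case mentioned above, a common pattern is to drop the top classifier and attach a new one. The following is a hedged sketch rather than a recommended recipe: build_vgg16_classifier() is a hypothetical helper, the 224x224x3 input shape follows the default input size stated above, and the use of pooling = "avg", freeze_weights(), layer_dense() and keras_model() reflects how these keras3 functions are commonly combined. The body is wrapped in a function so that nothing is downloaded or built until it is called.

```r
# Hypothetical helper: a frozen VGG16 base with a new softmax head.
# Requires the keras3 package and a working backend when called.
build_vgg16_classifier <- function(num_classes, weights = "imagenet") {
  base <- keras3::application_vgg16(
    include_top = FALSE,              # drop the fully-connected layers
    weights = weights,
    input_shape = c(224L, 224L, 3L),
    pooling = "avg"                   # assumed: global average pooling output
  )
  keras3::freeze_weights(base)        # keep the pre-trained features fixed

  outputs <- base$output |>
    keras3::layer_dense(units = num_classes, activation = "softmax")

  keras3::keras_model(inputs = base$input, outputs = outputs)
}
```

A call such as build_vgg16_classifier(num_classes = 5L) would then return a model whose inputs still need to go through application_preprocess_inputs().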
Instantiates the VGG19 model.
application_vgg19( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "vgg19" )
include_top |
whether to include the 3 fully-connected layers at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The default input size for this model is 224x224.
Each Keras Application expects a specific kind of input preprocessing.
For VGG19, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will convert the input images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.
Instantiates the Xception architecture.
application_xception( include_top = TRUE, weights = "imagenet", input_tensor = NULL, input_shape = NULL, pooling = NULL, classes = 1000L, classifier_activation = "softmax", name = "xception" )
include_top |
whether to include the fully-connected layer at the top of the network. |
weights |
one of |
input_tensor |
optional Keras tensor
(i.e. output of |
input_shape |
optional shape tuple, only to be specified
if |
pooling |
Optional pooling mode for feature extraction
when
|
classes |
optional number of classes to classify images
into, only to be specified if |
classifier_activation |
A |
name |
The name of the model (string). |
A model instance.
For image classification use cases, see this page for detailed examples.
For transfer learning use cases, make sure to read the guide to transfer learning & fine-tuning.
The default input image size for this model is 299x299.
Each Keras Application expects a specific kind of input preprocessing.
For Xception, call application_preprocess_inputs() on your inputs before passing them to the model. application_preprocess_inputs() will scale input pixels between -1 and 1.
Generates a tf.data.Dataset from audio files in a directory.
If your directory structure is:
main_directory/
...class_a/
......a_audio_1.wav
......a_audio_2.wav
...class_b/
......b_audio_1.wav
......b_audio_2.wav
Then calling audio_dataset_from_directory(main_directory, labels = 'inferred') will return a tf.data.Dataset that yields batches of audio files from the subdirectories class_a and class_b, together with labels 0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).
Only .wav files are supported at this time.
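To make the labelling convention concrete, the sketch below rebuilds the example layout in a temporary directory and derives the labels in plain R. It only mimics the behaviour described above and is not the library's implementation; in particular, sorting the subdirectory names to assign 0-based indices is an assumption consistent with class_a mapping to 0 and class_b to 1.

```r
# Build a throwaway directory tree matching the layout above (empty files).
main_directory <- file.path(tempdir(), "audio_label_sketch")
unlink(main_directory, recursive = TRUE)
for (cls in c("class_a", "class_b")) {
  dir.create(file.path(main_directory, cls), recursive = TRUE)
  prefix <- sub("class_", "", cls)
  file.create(file.path(main_directory, cls,
                        paste0(prefix, "_audio_", 1:2, ".wav")))
}

# Class names are the sorted subdirectory names; labels are 0-based indices.
class_names <- sort(list.dirs(main_directory, full.names = FALSE,
                              recursive = FALSE))
files <- list.files(main_directory, pattern = "\\.wav$", recursive = TRUE)
labels <- match(dirname(files), class_names) - 1L

data.frame(file = files, label = labels)
```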
audio_dataset_from_directory( directory, labels = "inferred", label_mode = "int", class_names = NULL, batch_size = 32L, sampling_rate = NULL, output_sequence_length = NULL, ragged = FALSE, shuffle = TRUE, seed = NULL, validation_split = NULL, subset = NULL, follow_links = FALSE, verbose = TRUE )
directory |
Directory where the data is located.
If |
labels |
Either "inferred" (labels are generated from the directory
structure), |
label_mode |
String describing the encoding of
|
class_names |
Only valid if "labels" is |
batch_size |
Size of the batches of data. Default: 32. If |
sampling_rate |
Audio sampling rate (in samples per second). |
output_sequence_length |
Maximum length of an audio sequence. Audio files
longer than this will be truncated to |
ragged |
Whether to return a Ragged dataset (where each sequence has its
own length). Defaults to |
shuffle |
Whether to shuffle the data. Defaults to |
seed |
Optional random seed for shuffling and transformations. |
validation_split |
Optional float between 0 and 1, fraction of data to reserve for validation. |
subset |
Subset of the data to return. One of |
follow_links |
Whether to visit subdirectories pointed to by symlinks.
Defaults to |
verbose |
Whether to display information on the number of classes and
files found. Defaults to |
A tf.data.Dataset object.
If label_mode is NULL, it yields string tensors of shape (batch_size,), containing the contents of a batch of audio files.
Otherwise, it yields a tuple (audio, labels), where audio has shape (batch_size, sequence_length, num_channels) and labels follows the format described below.
Rules regarding labels format:
if label_mode is int, the labels are an int32 tensor of shape (batch_size,).
if label_mode is binary, the labels are a float32 tensor of 1s and 0s of shape (batch_size, 1).
if label_mode is categorical, the labels are a float32 tensor of shape (batch_size, num_classes), representing a one-hot encoding of the class index.
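The three label formats can be pictured with ordinary R vectors and matrices. This is an illustrative sketch only: it uses base R objects in place of tensors, and the class indices are made-up values for a batch of four samples with two classes.

```r
class_index <- c(0L, 1L, 1L, 0L)  # made-up class indices, batch_size = 4
num_classes <- 2L

# label_mode = "int": integer vector of shape (batch_size,)
labels_int <- class_index

# label_mode = "binary": 1s and 0s of shape (batch_size, 1)
labels_binary <- matrix(as.numeric(class_index), ncol = 1)

# label_mode = "categorical": one-hot rows of shape (batch_size, num_classes)
labels_categorical <- diag(num_classes)[class_index + 1L, , drop = FALSE]

dim(labels_binary)
dim(labels_categorical)
```

Each row of labels_categorical contains a single 1 in the column of its class, which is what a one-hot encoding of the class index means.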
Other dataset utils: image_dataset_from_directory()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
Other utils: clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Define a custom Callback class
Callbacks can be passed to keras methods such as fit(), evaluate(), and predict() in order to hook into the various stages of the model training, evaluation, and inference lifecycle.
To create a custom callback, call Callback() and override the method associated with the stage of interest.
Callback( classname, on_epoch_begin = NULL, on_epoch_end = NULL, on_train_begin = NULL, on_train_end = NULL, on_train_batch_begin = NULL, on_train_batch_end = NULL, on_test_begin = NULL, on_test_end = NULL, on_test_batch_begin = NULL, on_test_batch_end = NULL, on_predict_begin = NULL, on_predict_end = NULL, on_predict_batch_begin = NULL, on_predict_batch_end = NULL, ..., public = list(), private = list(), inherit = NULL, parent_env = parent.frame() )
classname |
String, the name of the custom class. (Conventionally, CamelCase). |
on_epoch_begin |
\(epoch, logs = NULL) Called at the start of an epoch. Subclasses should override for any actions to run. This function should only be called during TRAIN mode. Args:
|
on_epoch_end |
\(epoch, logs = NULL) Called at the end of an epoch. Subclasses should override for any actions to run. This function should only be called during TRAIN mode. Args:
|
on_train_begin |
\(logs = NULL) Called at the beginning of training. Subclasses should override for any actions to run. Args:
|
on_train_end |
\(logs = NULL) Called at the end of training. Subclasses should override for any actions to run. Args:
|
on_train_batch_begin |
\(batch, logs = NULL) Called at the beginning of a training batch in Subclasses should override for any actions to run. Note that if the Args:
|
on_train_batch_end |
\(batch, logs = NULL) Called at the end of a training batch in Subclasses should override for any actions to run. Note that if the Args:
|
on_test_begin |
\(logs = NULL) Called at the beginning of evaluation or validation. Subclasses should override for any actions to run. Args:
|
on_test_end |
\(logs = NULL) Called at the end of evaluation or validation. Subclasses should override for any actions to run. Args:
|
on_test_batch_begin |
\(batch, logs = NULL) Called at the beginning of a batch in Also called at the beginning of a validation batch in the Subclasses should override for any actions to run. Note that if the Args:
|
on_test_batch_end |
\(batch, logs = NULL) Called at the end of a batch in Also called at the end of a validation batch in the Subclasses should override for any actions to run. Note that if the Args:
|
on_predict_begin |
\(logs = NULL) Called at the beginning of prediction. Subclasses should override for any actions to run. Args:
|
on_predict_end |
\(logs = NULL) Called at the end of prediction. Subclasses should override for any actions to run. Args:
|
on_predict_batch_begin |
\(batch, logs = NULL) Called at the beginning of a batch in Subclasses should override for any actions to run. Note that if the Args:
|
on_predict_batch_end |
\(batch, logs = NULL) Called at the end of a batch in Subclasses should override for any actions to run. Note that if the Args:
|
... , public
|
Additional methods or public members of the custom class. |
private |
Named list of R objects (typically, functions) to include in
instance private environments. |
inherit |
What the custom class will subclass. By default, the base keras class. |
parent_env |
The R environment that all class methods will have as a grandparent. |
A function that returns the custom Callback instances, similar to the builtin callback functions.
training_finished <- FALSE

callback_mark_finished <- Callback("MarkFinished",
  on_train_end = function(logs = NULL) {
    training_finished <<- TRUE
  }
)

model <- keras_model_sequential(input_shape = c(1)) |>
  layer_dense(1)
model |> compile(loss = 'mean_squared_error')
model |> fit(op_ones(c(1, 1)), op_ones(c(1, 1)),
             callbacks = callback_mark_finished())
stopifnot(isTRUE(training_finished))
All R function custom methods (public and private) will have the following symbols in scope:
self: the Callback instance.
super: the Callback superclass.
private: An R environment specific to the class instance. Any objects defined here will be invisible to the Keras framework.
__class__: the current class type object. This will also be available as an alias symbol, the value supplied to Callback(classname = ).
Attributes (accessible from self$):
params: Named list, Training parameters (e.g. verbosity, batch size, number of epochs, ...).
model: Instance of Model. Reference of the model being trained.
The logs named list that callback methods take as argument will contain keys for quantities relevant to the current batch or epoch (see method-specific docstrings).
All R function custom methods (public and private) will have the following symbols in scope:
self : The custom class instance.
super : The custom class superclass.
private : An R environment specific to the class instance. Any objects assigned here are invisible to the Keras framework.
__class__ and as.symbol(classname) : The custom class type object.
Other callbacks: callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
The callback_backup_and_restore() callback is intended to recover training from an interruption that has happened in the middle of a fit execution, by backing up the training states in a temporary checkpoint file at the end of each epoch. Each backup overwrites the previously written checkpoint file, so at any given time there is at most one such checkpoint file for backup/restoring purposes.
If training restarts before completion, the training state (which includes the model weights and epoch number) is restored to the most recently saved state at the beginning of a new fit run. At the completion of a fit run, the temporary checkpoint file is deleted.
Note that the user is responsible for bringing jobs back after the interruption. This callback is important for the backup and restore mechanism for fault tolerance purposes, and the model to be restored from a previous checkpoint is expected to be the same as the one used to back up. If the user changes arguments passed to compile or fit, the checkpoint saved for fault tolerance can become invalid.
callback_backup_and_restore( backup_dir, save_freq = "epoch", delete_checkpoint = TRUE )
backup_dir |
String, path of directory where to store the data
needed to restore the model. The directory
cannot be reused elsewhere to store other files, e.g. by the
|
save_freq |
|
delete_checkpoint |
Boolean. This |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
callback_interrupting <- new_callback_class(
  "InterruptingCallback",
  on_epoch_begin = function(epoch, logs = NULL) {
    if (epoch == 4) {
      stop('Interrupting!')
    }
  }
)

backup_dir <- tempfile()
callback <- callback_backup_and_restore(backup_dir = backup_dir)
model <- keras_model_sequential() %>%
  layer_dense(10)
model %>% compile(optimizer = optimizer_sgd(), loss = 'mse')

# ensure model is built (i.e., weights are initialized) for
# callback_backup_and_restore()
model(op_ones(c(5, 20))) |> invisible()

tryCatch({
  model %>% fit(x = op_ones(c(5, 20)),
                y = op_zeros(5),
                epochs = 10, batch_size = 1,
                callbacks = list(callback, callback_interrupting()),
                verbose = 0)
}, python.builtin.RuntimeError = function(e) message("Interrupted!"))
## Interrupted!
model$history$epoch
## [1] 0 1 2
# model$history %>% keras3:::to_keras_training_history() %>% as.data.frame() %>% print()

history <- model %>% fit(x = op_ones(c(5, 20)),
                         y = op_zeros(5),
                         epochs = 10, batch_size = 1,
                         callbacks = list(callback),
                         verbose = 0)

# Only 6 more epochs are run, since first training got interrupted at
# zero-indexed epoch 4, second training will continue from 4 to 9.
nrow(as.data.frame(history))
## [1] 10
Other callbacks: Callback()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
Supports all values that can be represented as a string, including 1D iterables such as atomic vectors.
callback_csv_logger(filename, separator = ",", append = FALSE)
filename |
Filename of the CSV file, e.g. |
separator |
String used to separate elements in the CSV file. |
append |
Boolean. |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
csv_logger <- callback_csv_logger('training.log')
model %>% fit(X_train, Y_train, callbacks = list(csv_logger))
Other callbacks: Callback()
callback_backup_and_restore()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
Suppose the goal of training is to minimize the loss. The metric to be monitored would then be 'loss', and mode would be 'min'. A model$fit() training loop checks at the end of every epoch whether the loss is no longer decreasing, taking min_delta and patience into account if applicable. Once the loss is found to be no longer decreasing, model$stop_training is set to TRUE and training terminates.
The quantity to be monitored needs to be available in the logs list. To make it so, pass the loss or metrics at model$compile().
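The stopping rule described above can be sketched in plain R, without Keras. This is a simplified illustration rather than the library's implementation: it assumes mode = 'min', ignores baseline, restore_best_weights and start_from_epoch, and the helper name early_stop_epoch is hypothetical.

```r
# Hypothetical helper (not part of keras3): given a vector of per-epoch
# losses, return the 1-based epoch at which training would stop, or NA
# if every epoch would run. Assumes the monitored quantity is minimized.
early_stop_epoch <- function(losses, min_delta = 0, patience = 0) {
  best <- Inf
  wait <- 0
  for (epoch in seq_along(losses)) {
    if (losses[epoch] < best - min_delta) {
      # Improvement larger than min_delta: remember it, reset the counter.
      best <- losses[epoch]
      wait <- 0
    } else {
      # No qualifying improvement: count the epoch against `patience`.
      wait <- wait + 1
      if (wait >= patience) return(epoch)
    }
  }
  NA_integer_
}

# Made-up loss curve that stalls after the third epoch.
losses <- c(1.0, 0.8, 0.79, 0.8, 0.81, 0.82, 0.5)
early_stop_epoch(losses, patience = 3)
early_stop_epoch(losses, patience = 3, min_delta = 0.05)
```

With min_delta = 0.05, the small step from 0.8 to 0.79 no longer counts as an improvement, so the patience counter starts one epoch earlier.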
callback_early_stopping( monitor = "val_loss", min_delta = 0L, patience = 0L, verbose = 0L, mode = "auto", baseline = NULL, restore_best_weights = FALSE, start_from_epoch = 0L )
monitor |
Quantity to be monitored. Defaults to |
min_delta |
Minimum change in the monitored quantity to qualify as an
improvement, i.e. an absolute change of less than min_delta, will
count as no improvement. Defaults to |
patience |
Number of epochs with no improvement after which training will
be stopped. Defaults to |
verbose |
Verbosity mode, 0 or 1. Mode 0 is silent, and mode 1 displays
messages when the callback takes an action. Defaults to |
mode |
One of |
baseline |
Baseline value for the monitored quantity. If not |
restore_best_weights |
Whether to restore model weights from the epoch
with the best value of the monitored quantity. If |
start_from_epoch |
Number of epochs to wait before starting to monitor
improvement. This allows for a warm-up period in which no
improvement is expected and thus training will not be stopped.
Defaults to |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
callback <- callback_early_stopping(monitor = 'loss', patience = 3)
# This callback will stop the training when there is no improvement in
# the loss for three consecutive epochs.
model <- keras_model_sequential() %>%
  layer_dense(10)
model %>% compile(optimizer = optimizer_sgd(), loss = 'mse')
history <- model %>% fit(x = op_ones(c(5, 20)),
                         y = op_zeros(5),
                         epochs = 10, batch_size = 1,
                         callbacks = list(callback),
                         verbose = 0)
nrow(as.data.frame(history)) # Number of epochs actually run.
## [1] 10
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
This callback is constructed with anonymous functions that will be called at the appropriate time (during Model.{fit | evaluate | predict}). Note that the callback expects positional arguments:
on_epoch_begin and on_epoch_end expect two positional arguments: epoch, logs
on_train_begin and on_train_end expect one positional argument: logs
on_train_batch_begin and on_train_batch_end expect two positional arguments: batch, logs
See the Callback class definition for the full list of functions and their expected arguments.
callback_lambda( on_epoch_begin = NULL, on_epoch_end = NULL, on_train_begin = NULL, on_train_end = NULL, on_train_batch_begin = NULL, on_train_batch_end = NULL, ... )
on_epoch_begin |
called at the beginning of every epoch. |
on_epoch_end |
called at the end of every epoch. |
on_train_begin |
called at the beginning of model training. |
on_train_end |
called at the end of model training. |
on_train_batch_begin |
called at the beginning of every train batch. |
on_train_batch_end |
called at the end of every train batch. |
... |
Any function in |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
# Print the batch number at the beginning of every batch.
batch_print_callback <- callback_lambda(
  on_train_batch_begin = function(batch, logs) {
    print(batch)
  }
)

# Stream the epoch loss to a file in new-line delimited JSON format
# (one valid JSON object per line)
json_log <- file('loss_log.json', open = 'wt')
json_logging_callback <- callback_lambda(
  on_epoch_end = function(epoch, logs) {
    jsonlite::write_json(
      list(epoch = epoch, loss = logs$loss),
      json_log,
      append = TRUE
    )
  },
  on_train_end = function(logs) {
    close(json_log)
  }
)

# Terminate some processes after having finished model training.
processes <- ...
cleanup_callback <- callback_lambda(
  on_train_end = function(logs) {
    for (p in processes) {
      if (is_alive(p)) {
        terminate(p)
      }
    }
  }
)

model %>% fit(
  ...,
  callbacks = list(
    batch_print_callback,
    json_logging_callback,
    cleanup_callback
  )
)
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
At the beginning of every epoch, this callback gets the updated learning rate value from the schedule function provided, passing it the current epoch and current learning rate, and applies the updated learning rate to the optimizer.
callback_learning_rate_scheduler(schedule, verbose = 0L)
schedule |
A function that takes an epoch index (integer, indexed from 0) and current learning rate (float) as inputs and returns a new learning rate as output (float). |
verbose |
Integer. 0: quiet, 1: log update messages. |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
# This function keeps the initial learning rate steady for the first ten epochs
# and decreases it exponentially after that.
scheduler <- function(epoch, lr) {
  if (epoch < 10)
    return(lr)
  else
    return(lr * exp(-0.1))
}

model <- keras_model_sequential() |> layer_dense(units = 10)
model |> compile(optimizer = optimizer_sgd(), loss = 'mse')
model$optimizer$learning_rate |> as.array() |> round(5)
## [1] 0.01
callback <- callback_learning_rate_scheduler(schedule = scheduler)
history <- model |> fit(x = array(runif(100), c(5, 20)),
                        y = array(0, c(5, 1)),
                        epochs = 15, callbacks = list(callback), verbose = 0)
model$optimizer$learning_rate |> as.array() |> round(5)
## [1] 0.00607
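The final value can be checked by hand without training a model. The sketch below applies the same schedule to the starting rate of 0.01 printed earlier, once per zero-indexed epoch; it only reproduces the arithmetic and makes no assumption about how the callback is implemented internally.

```r
# Same schedule as in the example: constant for epochs 0-9, then an
# exponential decay of exp(-0.1) per epoch.
scheduler <- function(epoch, lr) {
  if (epoch < 10) lr else lr * exp(-0.1)
}

lr <- 0.01              # starting learning rate shown in the output above
for (epoch in 0:14) {   # 15 epochs, zero-indexed
  lr <- scheduler(epoch, lr)
}
round(lr, 5)
```

Epochs 10 through 14 each apply one decay step, so the result is 0.01 * exp(-0.5), which rounds to 0.00607.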
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
callback_model_checkpoint() is used in conjunction with training using model |> fit() to save a model or weights (in a checkpoint file) at some interval, so the model or weights can be loaded later to continue the training from the state saved.
A few options this callback provides include:
Whether to only keep the model that has achieved the "best performance" so far, or whether to save the model at the end of every epoch regardless of performance.
Definition of "best"; which quantity to monitor and whether it should be maximized or minimized.
The frequency it should save at. Currently, the callback supports saving at the end of every epoch, or after a fixed number of training batches.
Whether only weights are saved, or the whole model is saved.
callback_model_checkpoint( filepath, monitor = "val_loss", verbose = 0L, save_best_only = FALSE, save_weights_only = FALSE, mode = "auto", save_freq = "epoch", initial_value_threshold = NULL )
filepath |
string, path to save the model file.
|
monitor |
The metric name to monitor. Typically the metrics are set by
the
|
verbose |
Verbosity mode, 0 or 1. Mode 0 is silent, and mode 1 displays messages when the callback takes an action. |
save_best_only |
if |
save_weights_only |
if TRUE, then only the model's weights will be saved
( |
mode |
one of { |
save_freq |
|
initial_value_threshold |
Floating point initial "best" value of the
metric to be monitored. Only applies if |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
model <- keras_model_sequential(input_shape = c(10)) |>
  layer_dense(1, activation = "sigmoid") |>
  compile(loss = "binary_crossentropy", optimizer = "adam",
          metrics = c('accuracy'))

EPOCHS <- 10
checkpoint_filepath <- tempfile('checkpoint-model-', fileext = ".keras")
model_checkpoint_callback <- callback_model_checkpoint(
  filepath = checkpoint_filepath,
  monitor = 'val_accuracy',
  mode = 'max',
  save_best_only = TRUE
)

# Model is saved at the end of every epoch, if it's the best seen so far.
model |> fit(x = random_uniform(c(2, 10)), y = op_ones(2, 1),
             epochs = EPOCHS, validation_split = .5, verbose = 0,
             callbacks = list(model_checkpoint_callback))

# The model considered the best so far can be loaded with:
load_model(checkpoint_filepath)
## Model: "sequential"
## +---------------------------------+------------------------+---------------+
## | Layer (type)                    | Output Shape           |       Param # |
## +=================================+========================+===============+
## | dense (Dense)                   | (None, 1)              |            11 |
## +---------------------------------+------------------------+---------------+
## Total params: 35 (144.00 B)
## Trainable params: 11 (44.00 B)
## Non-trainable params: 0 (0.00 B)
## Optimizer params: 24 (100.00 B)
# Alternatively, one could checkpoint just the model weights:
checkpoint_filepath <- tempfile('checkpoint-', fileext = ".weights.h5")
model_checkpoint_callback <- callback_model_checkpoint(
  filepath = checkpoint_filepath,
  save_weights_only = TRUE,
  monitor = 'val_accuracy',
  mode = 'max',
  save_best_only = TRUE
)

# Model weights are saved at the end of every epoch, if they are the best
# seen so far.
# same as above
model |> fit(x = random_uniform(c(2, 10)), y = op_ones(2, 1),
             epochs = EPOCHS, validation_split = .5, verbose = 0,
             callbacks = list(model_checkpoint_callback))

# The model weights considered the best so far can be loaded with:
model |> load_model_weights(checkpoint_filepath)
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
Models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates. This callback monitors a quantity and if no improvement is seen for a 'patience' number of epochs, the learning rate is reduced.
callback_reduce_lr_on_plateau( monitor = "val_loss", factor = 0.1, patience = 10L, verbose = 0L, mode = "auto", min_delta = 1e-04, cooldown = 0L, min_lr = 0, ... )
monitor |
String. Quantity to be monitored. |
factor |
Float. Factor by which the learning rate will be reduced.
|
patience |
Integer. Number of epochs with no improvement after which learning rate will be reduced. |
verbose |
Integer. 0: quiet, 1: update messages. |
mode |
String. One of |
min_delta |
Float. Threshold for measuring the new optimum, to only focus on significant changes. |
cooldown |
Integer. Number of epochs to wait before resuming normal operation after the learning rate has been reduced. |
min_lr |
Float. Lower bound on the learning rate. |
... |
For forward/backward compatibility. |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
reduce_lr <- callback_reduce_lr_on_plateau(monitor = 'val_loss', factor = 0.2,
                                           patience = 5, min_lr = 0.001)
model %>% fit(x_train, y_train, callbacks = list(reduce_lr))
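To see how factor, patience, cooldown and min_lr interact, the following plain-R simulation may help. It is a simplified sketch under stated assumptions, not the Keras source: it assumes mode = 'min', a starting learning rate of 0.01, and a made-up validation loss that never improves. The helper name simulate_reduce_lr is hypothetical.

```r
# Hypothetical helper (not part of keras3): returns the learning rate in
# effect after each epoch, assuming the monitored quantity is minimized.
simulate_reduce_lr <- function(val_loss, lr, factor = 0.1, patience = 10,
                               min_delta = 1e-4, cooldown = 0, min_lr = 0) {
  best <- Inf
  wait <- 0
  cooldown_left <- 0
  lrs <- numeric(length(val_loss))
  for (epoch in seq_along(val_loss)) {
    if (cooldown_left > 0) {
      # During cooldown, stagnant epochs are not counted.
      cooldown_left <- cooldown_left - 1
      wait <- 0
    }
    if (val_loss[epoch] < best - min_delta) {
      # Significant improvement: record it and reset the counter.
      best <- val_loss[epoch]
      wait <- 0
    } else if (cooldown_left == 0) {
      wait <- wait + 1
      if (wait >= patience) {
        # Reduce the rate, but never below min_lr.
        lr <- max(lr * factor, min_lr)
        cooldown_left <- cooldown
        wait <- 0
      }
    }
    lrs[epoch] <- lr
  }
  lrs
}

# Same settings as the example above, applied to a flat validation loss.
lrs <- simulate_reduce_lr(rep(1, 11), lr = 0.01,
                          factor = 0.2, patience = 5, min_lr = 0.001)
lrs
```

In this run the rate drops from 0.01 to 0.002 after five stagnant epochs; the second reduction would give 0.0004, so it is clipped to the min_lr of 0.001.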
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
Requires the requests library. Events are sent to root + '/publish/epoch/end/' by default. Calls are HTTP POST, with a data argument which is a JSON-encoded named list of event data. If send_as_json = TRUE, the content type of the request will be "application/json". Otherwise the serialized JSON will be sent within a form.
callback_remote_monitor( root = "http://localhost:9000", path = "/publish/epoch/end/", field = "data", headers = NULL, send_as_json = FALSE )
root |
String; root url of the target server. |
path |
String; path relative to |
field |
String; JSON field under which the data will be stored.
The field is used only if the payload is sent within a form
(i.e. when |
headers |
Named list; optional custom HTTP headers. |
send_as_json |
Boolean; whether the request should be
sent as |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_swap_ema_weights()
callback_tensorboard()
callback_terminate_on_nan()
This callback replaces the model's weight values with the values of the optimizer's EMA weights (the exponential moving average of the past model weight values, implementing "Polyak averaging") before model evaluation, and restores the previous weights after evaluation.
The SwapEMAWeights callback is to be used in conjunction with an optimizer that sets use_ema = TRUE.
Note that the weights are swapped in-place in order to save memory. The behavior is undefined if you modify the EMA weights or model weights in other callbacks.
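The averaging and the swap can be illustrated with plain numeric vectors. This is a minimal sketch and not how Keras stores or updates its variables: the momentum of 0.9, the weight values, and the helper names ema_update and swap are assumptions made for the illustration.

```r
# Hypothetical helper: one exponential-moving-average update step.
# `momentum` is an illustrative value, not necessarily the optimizer default.
ema_update <- function(ema, weights, momentum = 0.9) {
  momentum * ema + (1 - momentum) * weights
}

# Made-up weight values after three successive training steps.
weights_history <- list(c(1, 2), c(3, 4), c(5, 6))

# Start the average at the first weights, then fold in each later step.
ema <- weights_history[[1]]
for (w in weights_history[-1]) {
  ema <- ema_update(ema, w)
}

# Swap model and EMA weights before evaluation, and swap back afterwards.
swap <- function(state) list(model = state$ema, ema = state$model)
state <- list(model = weights_history[[3]], ema = ema)
during_eval <- swap(state)
after_eval <- swap(during_eval)
```

During evaluation the model holds the smoothed weights; swapping a second time restores the original state exactly, which is why nothing else should modify either set of weights in between.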
callback_swap_ema_weights(swap_on_epoch = FALSE)
swap_on_epoch |
Whether to perform swapping at |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
# Remember to set `use_ema=TRUE` in the optimizer
optimizer <- optimizer_sgd(use_ema = TRUE)
model |> compile(optimizer = optimizer, loss = ..., metrics = ...)

# Metrics will be computed with EMA weights
model |> fit(X_train, Y_train,
             callbacks = c(callback_swap_ema_weights()))

# If you want to save model checkpoint with EMA weights, you can set
# `swap_on_epoch=TRUE` and place ModelCheckpoint after SwapEMAWeights.
model |> fit(
  X_train, Y_train,
  callbacks = c(
    callback_swap_ema_weights(swap_on_epoch = TRUE),
    callback_model_checkpoint(...)
  )
)
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_tensorboard()
callback_terminate_on_nan()
TensorBoard is a visualization tool provided with TensorFlow. A TensorFlow installation is required to use this callback.
This callback logs events for TensorBoard, including:
Metrics summary plots
Training graph visualization
Weight histograms
Sampled profiling
When used in model |> evaluate() or regular validation, in addition to epoch summaries, a summary is written that records evaluation metrics against model$optimizer$iterations. The metric names are prefixed with evaluation, and model$optimizer$iterations is the step shown in TensorBoard.
If you have installed TensorFlow with pip or reticulate::py_install(), you should be able to launch TensorBoard from the command line:
tensorboard --logdir=path_to_your_logs
or from R with tensorflow::tensorboard().
More information is available in the TensorBoard documentation.
callback_tensorboard( log_dir = "logs", histogram_freq = 0L, write_graph = TRUE, write_images = FALSE, write_steps_per_second = FALSE, update_freq = "epoch", profile_batch = 0L, embeddings_freq = 0L, embeddings_metadata = NULL )
log_dir |
the path of the directory where to save the log files to be
parsed by TensorBoard. e.g.,
|
histogram_freq |
frequency (in epochs) at which to compute weight histograms for the layers of the model. If set to 0, histograms won't be computed. Validation data (or split) must be specified for histogram visualizations. |
write_graph |
(Not supported at this time)
Whether to visualize the graph in TensorBoard.
Note that the log file can become quite large
when |
write_images |
whether to write model weights to visualize as image in TensorBoard. |
write_steps_per_second |
whether to log the training steps per second into TensorBoard. This supports both epoch and batch frequency logging. |
update_freq |
|
profile_batch |
(Not supported at this time) Profile the batch(es) to sample compute characteristics. profile_batch must be a non-negative integer or a pair of integers. A pair of positive integers signifies a range of batches to profile. By default, profiling is disabled. |
embeddings_freq |
frequency (in epochs) at which embedding layers will be visualized. If set to 0, embeddings won't be visualized. |
embeddings_metadata |
Named list which maps embedding layer names to the filename of a file in which to save metadata for the embedding layer. In case the same metadata file is to be used for all embedding layers, a single filename can be passed. |
A Callback instance that can be passed to fit.keras.src.models.model.Model().
tensorboard_callback <- callback_tensorboard(log_dir = "./logs")
model %>% fit(x_train, y_train, epochs = 2, callbacks = list(tensorboard_callback))
# Then run the tensorboard command to view the visualizations.
Custom batch-level summaries in a subclassed Model:
# `tf` is the TensorFlow module, e.g. as exported by the tensorflow R package.
MyModel <- new_model_class("MyModel",
  initialize = function() {
    self$dense <- layer_dense(units = 10)
  },
  call = function(x) {
    outputs <- x |> self$dense()
    tf$summary$histogram('outputs', outputs)
    outputs
  }
)

model <- MyModel()
model |> compile(optimizer = 'sgd', loss = 'mse')

# Make sure to set `update_freq = N` to log a batch-level summary every N
# batches. In addition to any `tf$summary` contained in `model$call()`,
# metrics added in `model |> compile()` will be logged every N batches.
tb_callback <- callback_tensorboard(log_dir = './logs', update_freq = 1)
model |> fit(x_train, y_train, callbacks = list(tb_callback))
Custom batch-level summaries in a Functional API Model:
my_summary <- function(x) {
  tf$summary$histogram('x', x)
  x
}

inputs <- layer_input(10)
outputs <- inputs |>
  layer_dense(10) |>
  layer_lambda(my_summary)

model <- keras_model(inputs, outputs)
model |> compile(optimizer = 'sgd', loss = 'mse')

# Make sure to set `update_freq = N` to log a batch-level summary every N
# batches. In addition to any `tf$summary` contained in `model$call()`,
# metrics added in `model |> compile()` will be logged every N batches.
tb_callback <- callback_tensorboard(log_dir = './logs', update_freq = 1)
model |> fit(x_train, y_train, callbacks = list(tb_callback))
Profiling:
# Profile a single batch, e.g. the 5th batch.
tensorboard_callback <- callback_tensorboard(
  log_dir = './logs', profile_batch = 5)
model |> fit(x_train, y_train, epochs = 2,
             callbacks = list(tensorboard_callback))

# Profile a range of batches, e.g. from 10 to 20.
tensorboard_callback <- callback_tensorboard(
  log_dir = './logs', profile_batch = c(10, 20))
model |> fit(x_train, y_train, epochs = 2,
             callbacks = list(tensorboard_callback))
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_terminate_on_nan()
Callback that terminates training when a NaN loss is encountered.
callback_terminate_on_nan()
A Callback instance that can be passed to fit.keras.src.models.model.Model().
Other callbacks: Callback()
callback_backup_and_restore()
callback_csv_logger()
callback_early_stopping()
callback_lambda()
callback_learning_rate_scheduler()
callback_model_checkpoint()
callback_reduce_lr_on_plateau()
callback_remote_monitor()
callback_swap_ema_weights()
callback_tensorboard()
Keras manages a global state, which it uses to implement the Functional model-building API and to uniquify autogenerated layer names.
If you are creating many models in a loop, this global state will consume an increasing amount of memory over time, and you may want to clear it. Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.
Example 1: calling clear_session() when creating models in a loop
for (i in 1:100) {
  # Without `clear_session()`, each iteration of this loop will
  # slightly increase the size of the global state managed by Keras
  model <- keras_model_sequential()
  for (j in 1:10) {
    model <- model |> layer_dense(units = 10)
  }
}

for (i in 1:100) {
  # With `clear_session()` called at the beginning,
  # Keras starts with a blank state at each iteration
  # and memory consumption is constant over time.
  clear_session()
  model <- keras_model_sequential()
  for (j in 1:10) {
    model <- model |> layer_dense(units = 10)
  }
}
Example 2: resetting the layer name generation counter
layers <- lapply(1:10, \(i) layer_dense(units = 10))

new_layer <- layer_dense(units = 10)
print(new_layer$name)
## [1] "dense_10"
clear_session()
new_layer <- layer_dense(units = 10)
print(new_layer$name)
## [1] "dense"
clear_session(free_memory = TRUE)
free_memory |
Whether to call Python garbage collection.
It's usually a good practice to call it to make sure
memory used by deleted objects is immediately freed.
However, it may take a few seconds to execute, so
when using |
NULL, invisibly. Called for side effects.
Other backend: config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other utils: audio_dataset_from_directory()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Model cloning is similar to calling a model on new inputs, except that it creates new layers (and thus new weights) instead of sharing the weights of the existing layers.
Note that clone_model() will not preserve the uniqueness of shared objects within the model (e.g. a single variable attached to two distinct layers will be restored as two separate variables).
clone_model( model, input_tensors = NULL, clone_function = NULL, call_function = NULL, recursive = FALSE, ... )
model |
Instance of |
input_tensors |
Optional list of input tensors
to build the model upon. If not provided,
new |
clone_function |
Callable with signature |
call_function |
Callable with signature
|
recursive |
Note, This argument can only be used with
Functional models.
Boolean. Whether to recursively clone any Sequential
or Functional models encountered in the original
Sequential/Functional model. If |
... |
For forward/backward compatibility. |
An instance of Model reproducing the behavior of the original model, on top of new input tensors, using newly instantiated weights. The cloned model may behave differently from the original model if a custom clone_function or call_function modifies a layer or layer call.
# Create a test Sequential model.
model <- keras_model_sequential(input_shape = c(728)) |>
  layer_dense(32, activation = 'relu') |>
  layer_dense(1, activation = 'sigmoid')

# Create a copy of the test model (with freshly initialized weights).
new_model <- clone_model(model)
Using a clone_function to make a model deterministic by setting the random seed everywhere:
clone_function <- function(layer) {
  config <- layer$get_config()
  if ("seed" %in% names(config))
    config$seed <- 1337L
  layer$`__class__`$from_config(config)
}

new_model <- clone_model(model, clone_function = clone_function)
Using a call_function to add a Dropout layer after each Dense layer (without recreating new layers):
call_function <- function(layer, ...) {
  out <- layer(...)
  if (inherits(layer, keras$layers$Dense))
    out <- out |> layer_dropout(0.5)
  out
}

inputs <- keras_input(c(728))
outputs <- inputs |>
  layer_dense(32, activation = 'relu') |>
  layer_dense(1, activation = 'sigmoid')
model <- keras_model(inputs, outputs)

new_model <- clone_model(
  model,
  clone_function = function(x) x, # Reuse the same layers.
  call_function = call_function
)
new_model
## Model: "functional_4"
## +-----------------------------------+--------------------------+---------------+
## | Layer (type)                      | Output Shape             |       Param # |
## +===================================+==========================+===============+
## | keras_tensor_8 (InputLayer)       | (None, 728)              |             0 |
## +-----------------------------------+--------------------------+---------------+
## | dense_2 (Dense)                   | (None, 32)               |        23,328 |
## +-----------------------------------+--------------------------+---------------+
## | dropout (Dropout)                 | (None, 32)               |             0 |
## +-----------------------------------+--------------------------+---------------+
## | dense_3 (Dense)                   | (None, 1)                |            33 |
## +-----------------------------------+--------------------------+---------------+
## | dropout_1 (Dropout)               | (None, 1)                |             0 |
## +-----------------------------------+--------------------------+---------------+
## Total params: 23,361 (91.25 KB)
## Trainable params: 23,361 (91.25 KB)
## Non-trainable params: 0 (0.00 B)
Note that subclassed models cannot be cloned by default, since their internal layer structure is not known. To achieve functionality equivalent to clone_model in the case of a subclassed model, simply make sure that the model class implements get_config() (and optionally from_config()), and call:
new_model <- model$`__class__`$from_config(model$get_config())
In the case of a subclassed model, you cannot use a custom clone_function.
Configure a model for training.
## S3 method for class 'keras.src.models.model.Model'
compile(
  object,
  optimizer = "rmsprop",
  loss = NULL,
  metrics = NULL,
  ...,
  loss_weights = NULL,
  weighted_metrics = NULL,
  run_eagerly = FALSE,
  steps_per_execution = 1L,
  jit_compile = "auto",
  auto_scale_loss = TRUE
)
object |
Keras model object |
optimizer |
String (name of optimizer) or optimizer instance. See
|
loss |
Loss function. May be:
A loss function is any callable with the signature
|
metrics |
List of metrics to be evaluated by the model during training and testing. Each of these can be:
Typically you will use
If providing an anonymous R function, you can customize the printed name
during training by assigning |
... |
Additional arguments passed on to the |
loss_weights |
Optional list (named or unnamed) specifying scalar
coefficients (R numerics) to weight the loss contributions of
different model outputs. The loss value that will be minimized
by the model will then be the weighted sum of all individual
losses, weighted by the |
weighted_metrics |
List of metrics to be evaluated and weighted by
|
run_eagerly |
Bool. If |
steps_per_execution |
Int. The number of batches to run
during a single compiled function call. Running multiple
batches inside a single compiled function call can
greatly improve performance on TPUs or small models with a large
R/Python overhead. At most, one full epoch will be run each
execution. If a number larger than the size of the epoch is
passed, the execution will be truncated to the size of the
epoch. Note that if |
jit_compile |
Bool or |
auto_scale_loss |
Bool. If |
This is called primarily for the side effect of modifying object
in-place. The first argument object
is also returned, invisibly, to
enable usage with the pipe.
model |> compile( optimizer = optimizer_adam(learning_rate = 1e-3), loss = loss_binary_crossentropy(), metrics = c(metric_binary_accuracy(), metric_false_negatives()) )
Other model training: evaluate.keras.src.models.model.Model()
predict.keras.src.models.model.Model()
predict_on_batch()
test_on_batch()
train_on_batch()
Publicly accessible method for determining the current backend.
config_backend()
String, the name of the backend Keras is currently using. One of
"tensorflow"
, "torch"
, or "jax"
.
config_backend()
## [1] "tensorflow"
Other config backend: config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other backend: clear_session()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other config: config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
When interactive logging is disabled, Keras sends logs to absl.logging
.
This is the best option when using Keras in a non-interactive
way, such as running a training or inference job on a server.
config_disable_interactive_logging()
No return value, called for side effects.
Other io utils: config_enable_interactive_logging()
config_is_interactive_logging_enabled()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other config: config_backend()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Raw Keras tracebacks (also known as stack traces) involve many internal frames, which can be challenging to read through, while not being actionable for end users. By default, Keras filters internal frames in most exceptions that it raises, to keep tracebacks short, readable, and focused on what's actionable for you (your own code).
See also config_enable_traceback_filtering()
and
config_is_traceback_filtering_enabled()
.
If you have previously disabled traceback filtering via
config_disable_traceback_filtering()
, you can re-enable it via
config_enable_traceback_filtering()
.
config_disable_traceback_filtering()
No return value, called for side effects.
Other traceback utils: config_enable_traceback_filtering()
config_is_traceback_filtering_enabled()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other config: config_backend()
config_disable_interactive_logging()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Returns the current default dtype policy object.
config_dtype_policy()
A DTypePolicy
object.
When interactive logging is enabled, Keras displays logs via stdout. This provides the best experience when using Keras in an interactive environment such as a shell or a notebook.
config_enable_interactive_logging()
No return value, called for side effects.
Other io utils: config_disable_interactive_logging()
config_is_interactive_logging_enabled()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Raw Keras tracebacks (also known as stack traces) involve many internal frames, which can be challenging to read through, while not being actionable for end users. By default, Keras filters internal frames in most exceptions that it raises, to keep tracebacks short, readable, and focused on what's actionable for you (your own code).
See also config_disable_traceback_filtering()
and
config_is_traceback_filtering_enabled()
.
If you have previously disabled traceback filtering via
config_disable_traceback_filtering()
, you can re-enable it via
config_enable_traceback_filtering()
.
config_enable_traceback_filtering()
No return value, called for side effects.
Other traceback utils: config_disable_traceback_filtering()
config_is_traceback_filtering_enabled()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Disables safe mode globally, allowing deserialization of lambdas.
config_enable_unsafe_deserialization()
No return value, called for side effects.
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Return the value of the fuzz factor used in numeric expressions.
config_epsilon()
A float.
config_epsilon()
## [1] 1e-07
Other config backend: config_backend()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other backend: clear_session()
config_backend()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
E.g. 'bfloat16'
, 'float16'
, 'float32'
, 'float64'
.
config_floatx()
String, the current default float type.
keras3::config_floatx()
## [1] "float32"
Other config backend: config_backend()
config_epsilon()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other backend: clear_session()
config_backend()
config_epsilon()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Return the default image data format convention.
config_image_data_format()
A string, either 'channels_first'
or 'channels_last'
.
config_image_data_format()
## [1] "channels_last"
Other config backend: config_backend()
config_epsilon()
config_floatx()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other backend: clear_session()
config_backend()
config_epsilon()
config_floatx()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
To switch between writing logs to stdout and absl.logging
, you may use
config_enable_interactive_logging()
and
config_disable_interactive_logging()
.
config_is_interactive_logging_enabled()
Boolean, TRUE
if interactive logging is enabled,
and FALSE
otherwise.
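The two switches and this query can be exercised together. The following is a minimal sketch, assuming keras3 and a working Python backend are installed; it calls only the three functions documented in this manual and makes no claim about what the log output itself looks like.

```r
library(keras3)

# Route Keras logs to absl.logging, e.g. for a batch job on a server.
config_disable_interactive_logging()
config_is_interactive_logging_enabled()

# Switch back to stdout logging for interactive work.
config_enable_interactive_logging()
config_is_interactive_logging_enabled()
```

Re-enabling at the end leaves the session in the interactive mode described above, so later examples print progress as usual.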
Other io utils: config_disable_interactive_logging()
config_enable_interactive_logging()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Raw Keras tracebacks (also known as stack traces) involve many internal frames, which can be challenging to read through, while not being actionable for end users. By default, Keras filters internal frames in most exceptions that it raises, to keep tracebacks short, readable, and focused on what's actionable for you (your own code).
See also config_enable_traceback_filtering()
and
config_disable_traceback_filtering()
.
If you have previously disabled traceback filtering via
config_disable_traceback_filtering()
, you can re-enable it via
config_enable_traceback_filtering()
.
config_is_traceback_filtering_enabled()
Boolean, TRUE
if traceback filtering is enabled,
and FALSE
otherwise.
Other traceback utils: config_disable_traceback_filtering()
config_enable_traceback_filtering()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Reload the backend (and the Keras package).
config_set_backend(backend)
backend |
String |
Nothing, this function is called for its side effect.
config_set_backend("jax")
Using this function is dangerous and should be done
carefully. Changing the backend will NOT convert
the type of any already-instantiated objects.
Thus, any layers / tensors / etc. already created will no
longer be usable without errors. It is strongly recommended not
to keep around any Keras-originated object instances created
before calling config_set_backend()
.
This includes any function or class instance that uses any Keras
functionality. All such code needs to be re-executed after calling
config_set_backend()
.
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_epsilon()
config_set_floatx()
config_set_image_data_format()
Sets the default dtype policy globally.
config_set_dtype_policy(policy)
policy |
A string or |
No return value, called for side effects.
config_set_dtype_policy("mixed_float16")
Set the value of the fuzz factor used in numeric expressions.
config_set_epsilon(value)
value |
float. New value of epsilon. |
No return value, called for side effects.
config_epsilon()
## [1] 1e-07
config_set_epsilon(1e-5)
config_epsilon()
## [1] 1e-05
# Set it back to the default value.
config_set_epsilon(1e-7)
Other config backend: config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_floatx()
config_set_image_data_format()
Other backend: clear_session()
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_floatx()
config_set_image_data_format()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_floatx()
config_set_image_data_format()
Set the default float dtype.
config_set_floatx(value)
value |
String; |
No return value, called for side effects.
It is not recommended to set this to "float16"
for training,
as this will likely cause numeric stability issues.
Instead, use mixed precision, which leverages
a mix of float16
and float32
. It can be configured by calling
keras3::keras$mixed_precision$set_dtype_policy('mixed_float16')
.
config_floatx()
## [1] "float32"
config_set_floatx('float64')
config_floatx()
## [1] "float64"
# Set it back to float32
config_set_floatx('float32')
ValueError: In case of invalid value.
Other config backend: config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_image_data_format()
Other backend: clear_session()
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_image_data_format()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_image_data_format()
Set the value of the image data format convention.
config_set_image_data_format(data_format)
data_format |
string. |
No return value, called for side effects.
config_image_data_format()
## [1] "channels_last"
# 'channels_last'
keras3::config_set_image_data_format('channels_first')
config_image_data_format()
## [1] "channels_first"
# Set it back to `'channels_last'`
keras3::config_set_image_data_format('channels_last')
Other config backend: config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
Other backend: clear_session()
config_backend()
config_epsilon()
config_floatx()
config_image_data_format()
config_set_epsilon()
config_set_floatx()
Other config: config_backend()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_enable_unsafe_deserialization()
config_epsilon()
config_floatx()
config_image_data_format()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
config_set_backend()
config_set_epsilon()
config_set_floatx()
Constraint class
Base class for weight constraints.
A Constraint()
instance works like a stateless function.
Users who subclass the Constraint
class should override
the call()
method, which takes a single
weight parameter and returns a projected version of that parameter
(e.g. normalized or clipped). Constraints can be used with various Keras
layers via the kernel_constraint
or bias_constraint
arguments.
Here's a simple example of a non-negative weight constraint:
constraint_nonnegative <- Constraint("NonNegative",
  call = function(w) {
    w * op_cast(w >= 0, dtype = w$dtype)
  }
)
weight <- op_convert_to_tensor(c(-1, 1))
constraint_nonnegative()(weight)
## tf.Tensor([-0. 1.], shape=(2), dtype=float32)
Usage in a layer:
layer_dense(units = 4, kernel_constraint = constraint_nonnegative())
## <Dense name=dense, built=False>
## signature: (*args, **kwargs)
Constraint( classname, call = NULL, get_config = NULL, ..., public = list(), private = list(), inherit = NULL, parent_env = parent.frame() )
classname |
String, the name of the custom class. (Conventionally, CamelCase). |
call |
\(w) Applies the constraint to the input weight variable. By default, the input weight variable is not modified. Users should override this method to implement their own projection function. Args:
Returns: Projected variable (by default, returns unmodified inputs). |
get_config |
\() Function that returns a named list of the object config. A constraint config is a named list (JSON-serializable) that can
be used to reinstantiate the same object
(via |
... , public
|
Additional methods or public members of the custom class. |
private |
Named list of R objects (typically, functions) to include in
instance private environments. |
inherit |
What the custom class will subclass. By default, the base keras class. |
parent_env |
The R environment that all class methods will have as a grandparent. |
A function that returns Constraint
instances, similar to the
builtin constraint functions like constraint_maxnorm()
.
All R function custom methods (public and private) will have the following symbols in scope:
self
: The custom class instance.
super
: The custom class superclass.
private
: An R environment specific to the class instance.
Any objects assigned here are invisible to the Keras framework.
__class__
and as.symbol(classname)
: the custom class type object.
Other constraints: constraint_maxnorm()
constraint_minmaxnorm()
constraint_nonneg()
constraint_unitnorm()
Constrains the weights incident to each hidden unit to have a norm less than or equal to a desired value.
constraint_maxnorm(max_value = 2L, axis = 1L)
max_value |
the maximum norm value for the incoming weights. |
axis |
integer, axis along which to calculate weight norms.
For instance, in a |
A Constraint
instance, a callable that can be passed to layer
constructors or used directly by calling it with tensors.
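To make the projection concrete, here is a base-R sketch of what a max-norm constraint does to a 2-D kernel. It is an illustration under stated assumptions, not the keras3 implementation: the helper name `max_norm_project`, the `epsilon` guard, and the choice to treat each column as the weight vector of one unit are all assumptions made for this example.

```r
# Base-R sketch of a max-norm projection; not the keras3 implementation.
# Assumes a 2-D kernel where each column holds the weights of one unit.
max_norm_project <- function(w, max_value = 2, epsilon = 1e-7) {
  norms <- sqrt(colSums(w^2))        # one L2 norm per column
  desired <- pmin(norms, max_value)  # norms above max_value are clipped
  # Rescale each column so its norm becomes the desired norm.
  sweep(w, 2, desired / (epsilon + norms), `*`)
}

w <- matrix(c(3, 4, 0.3, 0.4), nrow = 2)  # column norms: 5 and 0.5
projected <- max_norm_project(w)
sqrt(colSums(projected^2))                # approximately 2 and 0.5
```

Only the first column is rescaled; the second already satisfies the bound and is left essentially unchanged.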
Other constraints: Constraint()
constraint_minmaxnorm()
constraint_nonneg()
constraint_unitnorm()
Constrains the weights incident to each hidden unit to have the norm between a lower bound and an upper bound.
constraint_minmaxnorm(min_value = 0, max_value = 1, rate = 1, axis = 1L)
min_value |
the minimum norm for the incoming weights. |
max_value |
the maximum norm for the incoming weights. |
rate |
rate for enforcing the constraint: weights will be
rescaled to yield
op_clip?
|
axis |
integer, axis along which to calculate weight norms.
For instance, in a |
A Constraint
instance, a callable that can be passed to layer
constructors or used directly by calling it with tensors.
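The effect of `rate` is easiest to see numerically. The base-R sketch below assumes the rule used by upstream Keras, where the target norm is `rate * clip(norm, min_value, max_value) + (1 - rate) * norm`; that formula, the helper name `min_max_norm_project`, and the column-per-unit layout are assumptions for this example rather than statements taken from this page.

```r
# Base-R sketch of a min-max-norm projection with a `rate` blend.
# Assumed rule (from upstream Keras, not from this page):
#   desired = rate * clip(norm, min_value, max_value) + (1 - rate) * norm
min_max_norm_project <- function(w, min_value = 0, max_value = 1,
                                 rate = 1, epsilon = 1e-7) {
  norms <- sqrt(colSums(w^2))                         # one norm per column
  clipped <- pmin(pmax(norms, min_value), max_value)  # norm forced into range
  desired <- rate * clipped + (1 - rate) * norms      # partial move if rate < 1
  sweep(w, 2, desired / (epsilon + norms), `*`)
}

w <- matrix(c(3, 4, 0.3, 0.4), nrow = 2)  # column norms: 5 and 0.5

strict <- min_max_norm_project(w, min_value = 1, max_value = 2, rate = 1)
sqrt(colSums(strict^2))   # approximately 2 and 1

gentle <- min_max_norm_project(w, min_value = 1, max_value = 2, rate = 0.5)
sqrt(colSums(gentle^2))   # approximately 3.5 and 0.75
```

Under this assumption, `rate = 1` enforces the interval in a single step, while smaller values move each norm only part of the way toward it.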
Other constraints: Constraint()
constraint_maxnorm()
constraint_nonneg()
constraint_unitnorm()
Constrains the weights to be non-negative.
constraint_nonneg()
A Constraint
instance, a callable that can be passed to layer
constructors or used directly by calling it with tensors.
Other constraints: Constraint()
constraint_maxnorm()
constraint_minmaxnorm()
constraint_unitnorm()
Constrains the weights incident to each hidden unit to have unit norm.
constraint_unitnorm(axis = 1L)
axis |
integer, axis along which to calculate weight norms.
For instance, in a |
A Constraint
instance, a callable that can be passed to layer
constructors or used directly by calling it with tensors.
Other constraints: Constraint()
constraint_maxnorm()
constraint_minmaxnorm()
constraint_nonneg()
Count the total number of scalars composing the weights.
count_params(object)
object |
Layer or model object |
An integer count
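As a sanity check on what this count means, the arithmetic for dense layers can be reproduced in base R. The helper `dense_param_count` below is hypothetical and exists only for this illustration; the numbers it reproduces are the ones printed in the clone_model() summary earlier in this manual.

```r
# Parameter count of a dense layer: one weight per (input, unit) pair,
# plus one bias per unit when a bias is used.
dense_param_count <- function(input_dim, units, use_bias = TRUE) {
  input_dim * units + if (use_bias) units else 0
}

dense_param_count(728, 32)  # 23328, the dense_2 row of that summary
dense_param_count(32, 1)    # 33, the dense_3 row
dense_param_count(728, 32) + dense_param_count(32, 1)  # 23361 in total
```

Layers without weights, such as dropout, contribute zero, which is why the total equals the sum of the two dense layers.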
Other layer methods: get_config()
get_weights()
quantize_weights()
reset_state()
Custom metric function
custom_metric(name, metric_fn)
name |
name used to show training progress output |
metric_fn |
An R function with signature |
You can provide an arbitrary R function as a custom metric. Note that
the y_true
and y_pred
parameters are tensors, so computations on
them should use op_*
tensor functions.
Use the custom_metric()
function to define a custom metric.
Note that a name ('mean_pred'
) is provided for the custom metric
function: this name is used within training progress output.
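A minimal sketch of the usage described above, assuming keras3 and a Python backend are installed. The metric body (the mean of the predictions) and the small model are illustrative assumptions; only `custom_metric()`, `op_mean()`, and `compile()` are taken from this manual.

```r
library(keras3)

# Custom metric: the mean of the predictions, computed with tensor ops.
metric_mean_pred <- custom_metric("mean_pred", function(y_true, y_pred) {
  op_mean(y_pred)
})

# Hypothetical toy model, only here so that compile() has something to configure.
model <- keras_model_sequential(input_shape = c(10)) |>
  layer_dense(units = 1, activation = "sigmoid")

model |> compile(
  optimizer = optimizer_rmsprop(),
  loss = loss_binary_crossentropy(),
  metrics = list("accuracy", metric_mean_pred)
)
```

During fitting, the metric should then be reported under the name given as the first argument, here `mean_pred`.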
If you want to save and load a model with custom metrics, you should
also call register_keras_serializable()
, or
specify the metric in the call to load_model()
. For example:
load_model("my_model.keras", c('mean_pred' = metric_mean_pred))
.
Alternatively, you can wrap all of your code in a call to
with_custom_object_scope()
which will allow you to refer to the
metric by name just like you do with built in keras metrics.
Alternative ways of supplying custom metrics:
custom_metric():
Arbitrary R function.
metric_mean_wrapper()
: Wrap an arbitrary R function in a Metric
instance.
Create a custom Metric()
subclass.
A callable function with a __name__
attribute.
Other metrics: Metric()
metric_auc()
metric_binary_accuracy()
metric_binary_crossentropy()
metric_binary_focal_crossentropy()
metric_binary_iou()
metric_categorical_accuracy()
metric_categorical_crossentropy()
metric_categorical_focal_crossentropy()
metric_categorical_hinge()
metric_cosine_similarity()
metric_f1_score()
metric_false_negatives()
metric_false_positives()
metric_fbeta_score()
metric_hinge()
metric_huber()
metric_iou()
metric_kl_divergence()
metric_log_cosh()
metric_log_cosh_error()
metric_mean()
metric_mean_absolute_error()
metric_mean_absolute_percentage_error()
metric_mean_iou()
metric_mean_squared_error()
metric_mean_squared_logarithmic_error()
metric_mean_wrapper()
metric_one_hot_iou()
metric_one_hot_mean_iou()
metric_poisson()
metric_precision()
metric_precision_at_recall()
metric_r2_score()
metric_recall()
metric_recall_at_precision()
metric_root_mean_squared_error()
metric_sensitivity_at_specificity()
metric_sparse_categorical_accuracy()
metric_sparse_categorical_crossentropy()
metric_sparse_top_k_categorical_accuracy()
metric_specificity_at_sensitivity()
metric_squared_hinge()
metric_sum()
metric_top_k_categorical_accuracy()
metric_true_negatives()
metric_true_positives()
Dataset taken from the StatLib library which is maintained at Carnegie Mellon University.
dataset_boston_housing( path = "boston_housing.npz", test_split = 0.2, seed = 113L )
path |
Path where to cache the dataset locally (relative to ~/.keras/datasets). |
test_split |
fraction of the data to reserve as test set. |
seed |
Random seed for shuffling the data before computing the test split. |
Lists of training and test data: train$x, train$y, test$x, test$y
.
Samples contain 13 attributes of houses at different locations around the Boston suburbs in the late 1970s. Targets are the median values of the houses at a location (in k$).
Other datasets: dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
dataset_reuters()
Dataset of 50,000 32x32 color training images, labeled over 10 categories, and 10,000 test images.
dataset_cifar10()
Lists of training and test data: train$x, train$y, test$x, test$y
.
The x
data is an array of RGB image data with shape (num_samples, 32, 32, 3)
when the image data format is 'channels_last' (the default).
The y
data is an array of category labels (integers in range 0-9) with
shape (num_samples).
Other datasets: dataset_boston_housing()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
dataset_reuters()
Dataset of 50,000 32x32 color training images, labeled over 100 categories, and 10,000 test images.
dataset_cifar100(label_mode = c("fine", "coarse"))
label_mode |
one of "fine", "coarse". |
Lists of training and test data: train$x, train$y, test$x, test$y
.
The x
data is an array of RGB image data with shape (num_samples, 32, 32, 3) when the image data format is 'channels_last' (the default).
The y
data is an array of category labels with shape (num_samples).
Other datasets: dataset_boston_housing()
dataset_cifar10()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
dataset_reuters()
Dataset of 60,000 28x28 grayscale images of the 10 fashion article classes, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The class labels are encoded as integers from 0-9 which correspond to T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, and Ankle boot.
dataset_fashion_mnist()
Dataset of 60,000 28x28 grayscale images of 10 fashion categories, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The class labels are:
0 - T-shirt/top
1 - Trouser
2 - Pullover
3 - Dress
4 - Coat
5 - Sandal
6 - Shirt
7 - Sneaker
8 - Bag
9 - Ankle boot
Lists of training and test data: train$x, train$y, test$x, test$y
, where
x
is an array of grayscale image data with shape (num_samples, 28, 28) and y
is an array of article labels (integers in range 0-9) with shape (num_samples).
Other datasets: dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_imdb()
dataset_mnist()
dataset_reuters()
Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".
dataset_imdb(
  path = "imdb.npz",
  num_words = NULL,
  skip_top = 0L,
  maxlen = NULL,
  seed = 113L,
  start_char = 1L,
  oov_char = 2L,
  index_from = 3L
)

dataset_imdb_word_index(path = "imdb_word_index.json")
path |
Where to cache the data (relative to |
num_words |
Max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept |
skip_top |
Skip the top N most frequently occurring words (which may not be informative). |
maxlen |
Sequences longer than this will be filtered out. |
seed |
Random seed for sample shuffling. |
start_char |
The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character. |
oov_char |
Words that were cut out because of the |
index_from |
Index actual words with this index and higher. |
As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
Lists of training and test data: train$x, train$y, test$x, test$y.
The x data includes integer sequences. If the num_words argument was specified, the maximum possible index value is num_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen.
The y data includes a set of integer labels (0 or 1).
The dataset_imdb_word_index() function returns a list where the names are words and the values are integers.
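The filtering controlled by num_words and skip_top can be sketched in base R. This is a minimal sketch, not the package's implementation: filter_indices() is a hypothetical helper, and it assumes that indexes outside the kept range are replaced by oov_char rather than dropped.

```r
# Hypothetical helper: keep indexes in [skip_top, num_words), replace the rest
# with oov_char (assumption: out-of-range words are replaced, not removed).
filter_indices <- function(seq, num_words, skip_top = 0L, oov_char = 2L) {
  keep <- seq >= skip_top & seq < num_words
  ifelse(keep, seq, oov_char)
}

# Made-up encoded review, for illustration only.
review <- c(1L, 14L, 22L, 9999L, 10000L, 25000L, 5L)

# "Top 10,000 most common words, but eliminate the top 20".
filtered <- filter_indices(review, num_words = 10000L, skip_top = 20L)
print(filtered)
```

With this rule the largest index that can survive is num_words-1, which matches the return value description above.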
Other datasets: dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_mnist()
dataset_reuters()
Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
dataset_mnist(path = "mnist.npz")
path |
Path where to cache the dataset locally (relative to ~/.keras/datasets). |
Lists of training and test data: train$x, train$y, test$x, test$y, where x is an array of grayscale image data with shape (num_samples, 28, 28) and y is an array of digit labels (integers in range 0-9) with shape (num_samples).
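A common first step with data of this shape is to scale the pixel values and flatten each image into a vector. The sketch below uses a small random array as a stand-in for train$x (an assumption, so that no download is needed) and only base R.

```r
set.seed(1)
n <- 5L

# Random stand-in for train$x: n grayscale images of 28x28 with values 0-255.
x <- array(sample(0:255, n * 28 * 28, replace = TRUE), dim = c(n, 28, 28))

# Scale to [0, 1], then flatten each image into a row of 784 values.
# Setting dim() keeps R's column-major element order within each row.
x_flat <- x / 255
dim(x_flat) <- c(n, 28L * 28L)

print(dim(x_flat))
```

Setting dim() keeps one sample per row, but the order of pixels within a row follows R's column-major layout; a row-major reshape would order them differently.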
Other datasets: dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_reuters()
Dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with dataset_imdb(), each wire is encoded as a sequence of word indexes (same conventions).
dataset_reuters(
  path = "reuters.npz",
  num_words = NULL,
  skip_top = 0L,
  maxlen = NULL,
  test_split = 0.2,
  seed = 113L,
  start_char = 1L,
  oov_char = 2L,
  index_from = 3L
)

dataset_reuters_word_index(path = "reuters_word_index.pkl")
path |
Where to cache the data (relative to |
num_words |
Max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept |
skip_top |
Skip the top N most frequently occurring words (which may not be informative). |
maxlen |
Truncate sequences after this length. |
test_split |
Fraction of the dataset to be used as test data. |
seed |
Random seed for sample shuffling. |
start_char |
The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character. |
oov_char |
Words that were cut out because of the |
index_from |
Index actual words with this index and higher. |
Lists of training and test data: train$x, train$y, test$x, test$y with same format as dataset_imdb(). The dataset_reuters_word_index() function returns a list where the names are words and the values are integers, e.g. word_index[["giraffe"]] might return 1234.
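To show how a word index of this form can be used to turn an encoded sequence back into words, here is a base R sketch. The three-word word_index and the encoded vector are made up for illustration, and the sketch assumes the default index_from = 3L, so that an encoded value v refers to the word whose index is v - index_from.

```r
# Toy word index in the documented form: names are words, values are integers.
word_index <- list(the = 1L, film = 2L, giraffe = 3L)
index_from <- 3L  # assumption: the default offset from the usage above

# Made-up encoded sequence: start_char (1), "the", "giraffe", oov_char (2), "film".
encoded <- c(1L, 4L, 6L, 2L, 5L)

# Reverse lookup table: integer index (as character) -> word.
ranks <- unlist(word_index)
reverse_index <- setNames(names(ranks), ranks)

decode <- function(seq) {
  words <- unname(reverse_index[as.character(seq - index_from)])
  words[is.na(words)] <- "?"  # reserved values (start, oov) have no word
  words
}

decoded <- decode(encoded)
print(decoded)
```

Reserved values such as start_char and oov_char have no entry in the word index, so the sketch prints them as "?".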
Other datasets: dataset_boston_housing()
dataset_cifar10()
dataset_cifar100()
dataset_fashion_mnist()
dataset_imdb()
dataset_mnist()
The config dict is a Python dictionary that consists of a set of key-value pairs, and represents a Keras object, such as an Optimizer, Layer, Metrics, etc. The saving and loading library uses the following keys to record information of a Keras object:
class_name: String. This is the name of the class, as exactly defined in the source code, such as "LossesContainer".
config: Named List. Library-defined or user-defined key-value pairs that store the configuration of the object, as obtained by object$get_config().
module: String. The path of the Python module. Built-in Keras classes expect to have prefix keras.
registered_name: String. The key the class is registered under via the register_keras_serializable(package, name) API. The key has the format of '{package}>{name}', where package and name are the arguments passed to register_keras_serializable(). If name is not provided, it uses the class name. If registered_name successfully resolves to a class (that was registered), the class_name and config values in the config dict will not be used. registered_name is only used for non-built-in classes.
For example, the following config list represents the built-in Adam optimizer with the relevant config:
config <- list(
  class_name = "Adam",
  config = list(
    amsgrad = FALSE,
    beta_1 = 0.8999999761581421,
    beta_2 = 0.9990000128746033,
    epsilon = 1e-07,
    learning_rate = 0.0010000000474974513,
    name = "Adam"
  ),
  module = "keras.optimizers",
  registered_name = NULL
)
# Returns an `Adam` instance identical to the original one.
deserialize_keras_object(config)
## <keras.src.optimizers.adam.Adam object>
If the class does not have an exported Keras namespace, the library tracks it by its module and class_name. For example:
config <- list(
  class_name = "MetricsList",
  config = list(
    ...
  ),
  module = "keras.trainers.compile_utils",
  registered_name = "MetricsList"
)
# Returns a `MetricsList` instance identical to the original one.
deserialize_keras_object(config)
And the following config represents a user-customized MeanSquaredError loss:
# define a custom object
loss_modified_mse <- Loss(
  "ModifiedMeanSquaredError",
  inherit = loss_mean_squared_error)

# register the custom object
register_keras_serializable(loss_modified_mse)

# confirm object is registered
get_custom_objects()
## $`keras3>ModifiedMeanSquaredError`
## <class '<r-namespace:keras3>.ModifiedMeanSquaredError'>
## signature: (
## reduction='sum_over_batch_size',
## name='mean_squared_error',
## dtype=None
## )
get_registered_name(loss_modified_mse)
## [1] "keras3>ModifiedMeanSquaredError"
# now custom object instances can be serialized
full_config <- serialize_keras_object(loss_modified_mse())

# the `config` arguments will be passed to loss_modified_mse()
str(full_config)
## List of 4
## $ module : chr "<r-namespace:keras3>"
## $ class_name : chr "ModifiedMeanSquaredError"
## $ config :List of 2
## ..$ name : chr "mean_squared_error"
## ..$ reduction: chr "sum_over_batch_size"
## $ registered_name: chr "keras3>ModifiedMeanSquaredError"
# and custom object instances can be deserialized
deserialize_keras_object(full_config)
## <<r-namespace:keras3>.ModifiedMeanSquaredError object>
## signature: (y_true, y_pred, sample_weight=None)
# Returns the `ModifiedMeanSquaredError` object
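The '{package}>{name}' key format described above can be illustrated with plain string handling. The helpers make_registered_name() and split_registered_name() are hypothetical names introduced for this sketch; they are not part of the package.

```r
# Hypothetical helper: build a key in the documented '{package}>{name}' format.
make_registered_name <- function(package, name) {
  paste0(package, ">", name)
}

# Hypothetical helper: split a key back into its two parts.
split_registered_name <- function(key) {
  parts <- strsplit(key, ">", fixed = TRUE)[[1]]
  list(package = parts[1], name = parts[2])
}

key <- make_registered_name("keras3", "ModifiedMeanSquaredError")
parts <- split_registered_name(key)
print(key)
```

The resulting key has the same shape as the registered name shown in the output above.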
deserialize_keras_object(config, custom_objects = NULL, safe_mode = TRUE, ...)
config |
Named list describing the object. |
custom_objects |
Named list containing a mapping between custom object names and the corresponding classes or functions. |
safe_mode |
Boolean, whether to disallow unsafe |
... |
For forward/backward compatibility. |
The object described by the config dictionary.
Other serialization utilities: get_custom_objects()
get_registered_name()
get_registered_object()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()
This function returns the loss value and metrics values for the model in test mode.
Computation is done in batches (see the batch_size arg.)
## S3 method for class 'keras.src.models.model.Model'
evaluate(
  object,
  x = NULL,
  y = NULL,
  ...,
  batch_size = NULL,
  verbose = getOption("keras.verbose", default = "auto"),
  sample_weight = NULL,
  steps = NULL,
  callbacks = NULL
)
object |
Keras model object |
x |
Input data. It could be:
|
y |
Target data. Like the input data |
... |
For forward/backward compatibility. |
batch_size |
Integer or |
verbose |
|
sample_weight |
Optional array of weights for the test samples,
used for weighting the loss function. You can either pass a flat
(1D) R array with the same length as the input samples
(1:1 mapping between weights and samples), or in the case of
temporal data, you can pass a 2D array with shape |
steps |
Integer or |
callbacks |
List of |
Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model$metrics_names will give you the display labels for the scalar outputs.
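To make "computation is done in batches" concrete, the base R sketch below splits sample indices into consecutive batches. The sample count and batch size are illustrative values only, and the code shows the general idea rather than what evaluate() does internally.

```r
n_samples <- 1000L   # illustrative number of test samples
batch_size <- 32L    # illustrative batch size

# First index of each batch, then the index range covered by each batch.
starts <- seq(1L, n_samples, by = batch_size)
batches <- lapply(starts, function(s) s:min(s + batch_size - 1L, n_samples))

n_steps <- length(batches)
print(n_steps)
```

The last batch is smaller whenever the number of samples is not a multiple of the batch size, so the number of batches is ceiling(n_samples / batch_size).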
Other model training: compile.keras.src.models.model.Model()
predict.keras.src.models.model.Model()
predict_on_batch()
test_on_batch()
train_on_batch()
Create a TF SavedModel artifact for inference (e.g. via TF-Serving).
Note: This can currently only be used with the TensorFlow or JAX backends.
This method lets you export a model to a lightweight SavedModel artifact that contains the model's forward pass only (its call() method) and can be served via e.g. TF-Serving. The forward pass is registered under the name serve() (see example below).
The original code of the model (including any custom layers you may have used) is no longer necessary to reload the artifact – it is entirely standalone.
## S3 method for class 'keras.src.models.model.Model'
export_savedmodel(object, export_dir_base, ...)
object |
A keras model. |
export_dir_base |
string, file path where to save the artifact. |
... |
For forward/backward compatibility. |
This is called primarily for the side effect of exporting object. The first argument, object, is also returned, invisibly, to enable usage with the pipe.
# Create the artifact
model |> tensorflow::export_savedmodel("path/to/location")

# Later, in a different process/environment...
library(tensorflow)
reloaded_artifact <- tf$saved_model$load("path/to/location")
predictions <- reloaded_artifact$serve(input_data)

# see tfdeploy::serve_savedmodel() for serving a model over a local web api.
Other saving and loading functions: layer_tfsm()
load_model()
load_model_weights()
register_keras_serializable()
save_model()
save_model_config()
save_model_weights()
with_custom_object_scope()
Train a model for a fixed number of epochs (dataset iterations).
## S3 method for class 'keras.src.models.model.Model'
fit(
  object,
  x = NULL,
  y = NULL,
  ...,
  batch_size = NULL,
  epochs = 1L,
  callbacks = NULL,
  validation_split = 0,
  validation_data = NULL,
  shuffle = TRUE,
  class_weight = NULL,
  sample_weight = NULL,
  initial_epoch = 1L,
  steps_per_epoch = NULL,
  validation_steps = NULL,
  validation_batch_size = NULL,
  validation_freq = 1L,
  verbose = getOption("keras.verbose", default = "auto"),
  view_metrics = getOption("keras.view_metrics", default = "auto")
)
object |
Keras model object |
x |
Input data. It could be:
|
y |
Target data. Like the input data |
... |
Additional arguments passed on to the model |
batch_size |
Integer or |
epochs |
Integer. Number of epochs to train the model.
An epoch is an iteration over the entire |
callbacks |
List of |
validation_split |
Float between 0 and 1.
Fraction of the training data to be used as validation data.
The model will set apart this fraction of the training data,
will not train on it, and will evaluate
the loss and any model metrics
on this data at the end of each epoch.
The validation data is selected from the last samples
in the |
validation_data |
Data on which to evaluate
the loss and any model metrics at the end of each epoch.
The model will not be trained on this data. Thus, note the fact
that the validation loss of data provided using
|
shuffle |
Boolean, whether to shuffle the training data
before each epoch. This argument is
ignored when |
class_weight |
Optional named list mapping class indices (integers, 0-based)
to a weight (float) value, used for weighting the loss function
(during training only).
This can be useful to tell the model to
"pay more attention" to samples from
an under-represented class. When |
sample_weight |
Optional array of weights for
the training samples, used for weighting the loss function
(during training only). You can either pass a flat (1D)
array/vector with the same length as the input samples
(1:1 mapping between weights and samples),
or in the case of temporal data,
you can pass a 2D array (matrix) with shape
|
initial_epoch |
Integer. Epoch at which to start training (useful for resuming a previous training run). |
steps_per_epoch |
Integer or |
validation_steps |
Only relevant if |
validation_batch_size |
Integer or |
validation_freq |
Only relevant if validation data is provided.
Specifies how many training epochs to run
before a new validation run is performed,
e.g. |
verbose |
|
view_metrics |
View realtime plot of training metrics (by epoch). The
default ( |
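The validation_split argument described above sets apart the last samples of the training data. A minimal base R sketch of such a split follows; the sample count is illustrative, and the rounding rule (floor of the training share) is an assumption rather than something stated in this reference.

```r
n_samples <- 100L          # illustrative number of training samples
validation_split <- 0.25   # fraction held out for validation

# Assumed rounding rule: the training part gets floor(n * (1 - split)) samples.
split_at <- as.integer(floor(n_samples * (1 - validation_split)))

train_idx <- seq_len(split_at)           # first samples: used for training
val_idx <- (split_at + 1L):n_samples     # last samples: held out for validation

print(range(val_idx))
```

Because the held-out part is taken from the end of the data, ordered data (for example sorted by class) may need to be shuffled by the caller before it is passed in.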
Unpacking behavior for iterator-like inputs:
A common pattern is to pass an iterator-like object such as a tf.data.Dataset or a generator to fit(), which will in fact yield not only features (x) but optionally targets (y) and sample weights (sample_weight). Keras requires that the output of such iterator-likes be unambiguous. The iterator should return a tuple() of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively. Any other type provided will be wrapped in a length-one tuple(), effectively treating everything as x. When yielding named lists, they should still adhere to the top-level tuple structure, e.g. tuple(list(x0 = x0, x = x1), y). Keras will not attempt to separate features, targets, and weights from the keys of a single dict.
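The unpacking rule above can be mimicked in base R. In this sketch as_tuple() is a hypothetical stand-in for tuple() (a list tagged with a marker class) and unpack_batch() is a hypothetical helper; neither is part of the package, and the code only illustrates the rule as described.

```r
# Hypothetical stand-in for tuple(): a list tagged with a marker class.
as_tuple <- function(...) structure(list(...), class = "sketch_tuple")

# Hypothetical helper applying the rule described above.
unpack_batch <- function(batch) {
  # Anything that is not a tuple is wrapped in a length-one tuple (all of it is x).
  if (!inherits(batch, "sketch_tuple")) batch <- as_tuple(batch)
  n <- length(batch)
  if (n < 1L || n > 3L) stop("expected a tuple of length 1, 2, or 3")
  list(
    x = batch[[1]],
    y = if (n >= 2L) batch[[2]] else NULL,
    sample_weight = if (n >= 3L) batch[[3]] else NULL
  )
}

# A tuple of (named list of features, targets).
pair <- unpack_batch(as_tuple(list(x0 = 1, x1 = 2), 0))

# A bare named list is treated entirely as x; its keys are not split up.
single <- unpack_batch(list(a = 1, b = 2))
```

In the second call the names a and b are not interpreted as features and targets, which mirrors the statement that keys of a single dict are never separated.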
A keras_training_history object, which is a named list: list(params = <params>, metrics = <metrics>), with S3 methods print(), plot(), and as.data.frame(). The metrics field is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).
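The list structure described above can be explored with base R. The history object below is a hand-written mock with made-up numbers, not the output of a real training run, and the field names inside params and metrics are assumptions for illustration.

```r
# Mock object with the documented top-level fields (values are made up).
history <- list(
  params = list(epochs = 3L),
  metrics = list(
    loss = c(0.9, 0.6, 0.4),
    val_loss = c(1.0, 0.7, 0.6)
  )
)

# One row per epoch, one column per recorded metric.
df <- data.frame(epoch = seq_along(history$metrics$loss), history$metrics)

# Epoch with the lowest validation loss.
best_epoch <- which.min(history$metrics$val_loss)
print(df)
```

On a real keras_training_history object the provided as.data.frame() method is the more direct route; its exact column layout is not shown in this reference.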
Freeze weights in a model or layer so that they are no longer trainable.
freeze_weights(object, from = NULL, to = NULL, which = NULL)

unfreeze_weights(object, from = NULL, to = NULL, which = NULL)
object |
Keras model or layer object |
from |
Layer instance, layer name, or layer index within model |
to |
Layer instance, layer name, or layer index within model |
which |
layer names, integer positions, layers, logical vector (of
|
The input object with frozen weights is returned, invisibly. Note, object is modified in place, and the return value is only provided to make usage with the pipe convenient.
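The examples below use negative values for from and to (from = -5 and to = -6) to address layers counted from the end. The sketch here shows one way such indices can be resolved to positions; resolve_range() is a hypothetical helper whose behavior is inferred from the example output, not taken from the package source.

```r
# Hypothetical helper: resolve from/to (possibly negative) to layer positions.
# Assumption: a negative index i counts from the end, so -1 is the last layer.
resolve_range <- function(n_layers, from = NULL, to = NULL) {
  norm <- function(i) if (i < 0L) n_layers + i + 1L else i
  from <- if (is.null(from)) 1L else norm(from)
  to <- if (is.null(to)) n_layers else norm(to)
  seq.int(from, to)
}

n_layers <- 19L  # number of layers listed for the VGG16 base in the example

# from = -5 selects the last 5 layers.
frozen_a <- resolve_range(n_layers, from = -5L)

# to = -6 selects everything except the last 5 layers.
trainable_b <- resolve_range(n_layers, to = -6L)

print(frozen_a)
```

Under this reading the two calls in the example select complementary sets of layers, which is why both end with the same layers frozen.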
# instantiate a VGG16 model
conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(150, 150, 3)
)

# freeze its weights
freeze_weights(conv_base)

# Note the "Trainable" column
conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type) | Output Shape | Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer) | (None, 150, 150, 3) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D) | (None, 150, 150, 64) | 1,792 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D) | (None, 150, 150, 64) | 36,928 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D) | (None, 75, 75, 64) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D) | (None, 75, 75, 128) | 73,856 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D) | (None, 75, 75, 128) | 147,584 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D) | (None, 37, 37, 128) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D) | (None, 37, 37, 256) | 295,168 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D) | (None, 18, 18, 256) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D) | (None, 18, 18, 512) | 1,180,160 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D) | (None, 9, 9, 512) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D) | (None, 4, 4, 512) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 14,714,688 (56.13 MB)
## Trainable params: 0 (0.00 B)
## Non-trainable params: 14,714,688 (56.13 MB)
# create a composite model that includes the base + more layers
model <- keras_model_sequential(input_batch_shape = shape(conv_base$input)) |>
  conv_base() |>
  layer_flatten() |>
  layer_dense(units = 256, activation = "relu") |>
  layer_dense(units = 1, activation = "sigmoid")

# compile
model |> compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(learning_rate = 2e-5),
  metrics = c("accuracy")
)

model
## Model: "sequential"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type) | Output Shape | Param # | Trai… |
## +=============================+=======================+============+=======+
## | vgg16 (Functional) | (None, 4, 4, 512) | 14,714,688 | N |
## +-----------------------------+-----------------------+------------+-------+
## | flatten (Flatten) | (None, 8192) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | dense (Dense) | (None, 256) | 2,097,408 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | dense_1 (Dense) | (None, 1) | 257 | Y |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 16,812,353 (64.13 MB)
## Trainable params: 2,097,665 (8.00 MB)
## Non-trainable params: 14,714,688 (56.13 MB)
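The parameter counts in these summaries can be checked by hand. The sketch below assumes 3x3 convolution kernels with one bias per output channel (the usual VGG16 configuration, not stated in the output itself) and one bias per dense unit; the helper names are hypothetical.

```r
# Parameters of a Conv2D layer: one kernel weight per (kernel cell, input
# channel) pair plus one bias, for every output channel.
conv2d_params <- function(kernel, in_ch, out_ch) (kernel * kernel * in_ch + 1) * out_ch

# Parameters of a Dense layer: one weight per input plus one bias, per unit.
dense_params <- function(in_units, out_units) (in_units + 1) * out_units

block1_conv1 <- conv2d_params(3, 3, 64)         # RGB input, 64 filters
block5_conv  <- conv2d_params(3, 512, 512)      # each block5 convolution
dense        <- dense_params(4 * 4 * 512, 256)  # flattened (4, 4, 512) -> 256
dense_1      <- dense_params(256, 1)

trainable_head <- dense + dense_1
total <- 14714688 + trainable_head
print(c(block1_conv1, block5_conv, dense, dense_1, trainable_head, total))
```

These values agree with the Param # column and with the totals printed above, and three block5 convolutions add up to the 7,079,424 trainable parameters reported after unfreezing from "block5_conv1".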
print(model, expand_nested = TRUE)
## Model: "sequential"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type) | Output Shape | Param # | Trai… |
## +=============================+=======================+============+=======+
## | vgg16 (Functional) | (None, 4, 4, 512) | 14,714,688 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > input_layer | (None, 150, 150, 3) | 0 | - |
## | (InputLayer) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block1_conv1 (Conv2D) | (None, 150, 150, 64) | 1,792 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block1_conv2 (Conv2D) | (None, 150, 150, 64) | 36,928 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block1_pool | (None, 75, 75, 64) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block2_conv1 (Conv2D) | (None, 75, 75, 128) | 73,856 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block2_conv2 (Conv2D) | (None, 75, 75, 128) | 147,584 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block2_pool | (None, 37, 37, 128) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_conv1 (Conv2D) | (None, 37, 37, 256) | 295,168 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_conv2 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_conv3 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_pool | (None, 18, 18, 256) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_conv1 (Conv2D) | (None, 18, 18, 512) | 1,180,160 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_conv2 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_conv3 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_pool | (None, 9, 9, 512) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_conv1 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_conv2 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_conv3 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_pool | (None, 4, 4, 512) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | flatten (Flatten) | (None, 8192) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | dense (Dense) | (None, 256) | 2,097,408 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | dense_1 (Dense) | (None, 1) | 257 | Y |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 16,812,353 (64.13 MB)
## Trainable params: 2,097,665 (8.00 MB)
## Non-trainable params: 14,714,688 (56.13 MB)
# unfreeze weights from "block5_conv1" on
unfreeze_weights(conv_base, from = "block5_conv1")

# compile again since we froze or unfroze weights
model |> compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(learning_rate = 2e-5),
  metrics = c("accuracy")
)

conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type) | Output Shape | Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer) | (None, 150, 150, 3) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D) | (None, 150, 150, 64) | 1,792 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D) | (None, 150, 150, 64) | 36,928 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D) | (None, 75, 75, 64) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D) | (None, 75, 75, 128) | 73,856 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D) | (None, 75, 75, 128) | 147,584 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D) | (None, 37, 37, 128) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D) | (None, 37, 37, 256) | 295,168 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D) | (None, 18, 18, 256) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D) | (None, 18, 18, 512) | 1,180,160 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D) | (None, 9, 9, 512) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D) | (None, 4, 4, 512) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 14,714,688 (56.13 MB)
## Trainable params: 7,079,424 (27.01 MB)
## Non-trainable params: 7,635,264 (29.13 MB)
print(model, expand_nested = TRUE)
## Model: "sequential"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type) | Output Shape | Param # | Trai… |
## +=============================+=======================+============+=======+
## | vgg16 (Functional) | (None, 4, 4, 512) | 14,714,688 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | > input_layer | (None, 150, 150, 3) | 0 | - |
## | (InputLayer) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block1_conv1 (Conv2D) | (None, 150, 150, 64) | 1,792 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block1_conv2 (Conv2D) | (None, 150, 150, 64) | 36,928 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block1_pool | (None, 75, 75, 64) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block2_conv1 (Conv2D) | (None, 75, 75, 128) | 73,856 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block2_conv2 (Conv2D) | (None, 75, 75, 128) | 147,584 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block2_pool | (None, 37, 37, 128) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_conv1 (Conv2D) | (None, 37, 37, 256) | 295,168 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_conv2 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_conv3 (Conv2D) | (None, 37, 37, 256) | 590,080 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block3_pool | (None, 18, 18, 256) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_conv1 (Conv2D) | (None, 18, 18, 512) | 1,180,160 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_conv2 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_conv3 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | > block4_pool | (None, 9, 9, 512) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_conv1 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_conv2 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_conv3 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | > block5_pool | (None, 4, 4, 512) | 0 | - |
## | (MaxPooling2D) | | | |
## +-----------------------------+-----------------------+------------+-------+
## | flatten (Flatten) | (None, 8192) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | dense (Dense) | (None, 256) | 2,097,408 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | dense_1 (Dense) | (None, 1) | 257 | Y |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 16,812,353 (64.13 MB)
## Trainable params: 9,177,089 (35.01 MB)
## Non-trainable params: 7,635,264 (29.13 MB)
# freeze only the last 5 layers
freeze_weights(conv_base, from = -5)
conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type) | Output Shape | Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer) | (None, 150, 150, 3) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D) | (None, 150, 150, 64) | 1,792 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D) | (None, 150, 150, 64) | 36,928 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D) | (None, 75, 75, 64) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D) | (None, 75, 75, 128) | 73,856 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D) | (None, 75, 75, 128) | 147,584 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D) | (None, 37, 37, 128) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D) | (None, 37, 37, 256) | 295,168 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D) | (None, 37, 37, 256) | 590,080 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D) | (None, 37, 37, 256) | 590,080 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D) | (None, 18, 18, 256) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D) | (None, 18, 18, 512) | 1,180,160 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D) | (None, 18, 18, 512) | 2,359,808 | Y |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D) | (None, 9, 9, 512) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D) | (None, 9, 9, 512) | 2,359,808 | N |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D) | (None, 4, 4, 512) | 0 | - |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 14,714,688 (56.13 MB)
## Trainable params: 7,635,264 (29.13 MB)
## Non-trainable params: 7,079,424 (27.01 MB)
# freeze only the last 5 layers, a different way
unfreeze_weights(conv_base, to = -6)
conv_base
## Model: "vgg16"
## +-----------------------------+-----------------------+------------+-------+
## | Layer (type)                | Output Shape          |    Param # | Trai… |
## +=============================+=======================+============+=======+
## | input_layer (InputLayer)    | (None, 150, 150, 3)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv1 (Conv2D)       | (None, 150, 150, 64)  |      1,792 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_conv2 (Conv2D)       | (None, 150, 150, 64)  |     36,928 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block1_pool (MaxPooling2D)  | (None, 75, 75, 64)    |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv1 (Conv2D)       | (None, 75, 75, 128)   |     73,856 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_conv2 (Conv2D)       | (None, 75, 75, 128)   |    147,584 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block2_pool (MaxPooling2D)  | (None, 37, 37, 128)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv1 (Conv2D)       | (None, 37, 37, 256)   |    295,168 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv2 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_conv3 (Conv2D)       | (None, 37, 37, 256)   |    590,080 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block3_pool (MaxPooling2D)  | (None, 18, 18, 256)   |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv1 (Conv2D)       | (None, 18, 18, 512)   |  1,180,160 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv2 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_conv3 (Conv2D)       | (None, 18, 18, 512)   |  2,359,808 |   Y   |
## +-----------------------------+-----------------------+------------+-------+
## | block4_pool (MaxPooling2D)  | (None, 9, 9, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv1 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv2 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_conv3 (Conv2D)       | (None, 9, 9, 512)     |  2,359,808 |   N   |
## +-----------------------------+-----------------------+------------+-------+
## | block5_pool (MaxPooling2D)  | (None, 4, 4, 512)     |          0 |   -   |
## +-----------------------------+-----------------------+------------+-------+
## Total params: 14,714,688 (56.13 MB)
## Trainable params: 7,635,264 (29.13 MB)
## Non-trainable params: 7,079,424 (27.01 MB)
# Freeze only layers of a certain type, e.g., BatchNorm layers
batch_norm_layer_class_name <- class(layer_batch_normalization())[1]
is_batch_norm_layer <- function(x) inherits(x, batch_norm_layer_class_name)
model <- application_efficientnet_b0()
freeze_weights(model, which = is_batch_norm_layer)
# print(model)
# equivalent to:
for(layer in model$layers) {
  if(is_batch_norm_layer(layer))
    layer$trainable <- FALSE
  else
    layer$trainable <- TRUE
}
The from and to layer arguments are both inclusive.
When applied to a model, the freeze or unfreeze is a global operation over all layers in the model (i.e. layers not within the specified range will be set to the opposite value, e.g. unfrozen for a call to freeze).
Models must be compiled again after weights are frozen or unfrozen.
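The range semantics described above can be sketched in base R without building a model. The helper below is hypothetical (it is not part of keras3) and assumes that negative indices count from the end, with -1 meaning the last layer. It returns the trainable flag each layer would end up with after an unfreeze call over an inclusive range.

```r
# Hypothetical helper, not part of keras3: which layers are trainable after
# unfreezing the inclusive range [from, to] of a model with `n_layers` layers.
# Assumption: negative indices count from the end (-1 is the last layer).
trainable_after_unfreeze <- function(n_layers, from = 1, to = -1) {
  resolve <- function(i) if (i < 0) n_layers + i + 1 else i
  idx <- seq_len(n_layers)
  # Layers inside the range are unfrozen; every other layer is frozen,
  # because the operation is global over the model.
  idx >= resolve(from) & idx <= resolve(to)
}

# The VGG16 convolutional base printed above has 19 layers.
flags <- trainable_after_unfreeze(19, to = -6)
which(!flags)  # indices of the layers left frozen
```

Under these assumptions the frozen layers are the last five, which matches the comment in the example above (the pooling layers among them carry no weights, so they print as "-").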
A layer config is an object returned from get_config() that contains the configuration of a layer or model. The same layer or model can be reinstantiated later (without its trained weights) from this configuration using from_config(). The config does not include connectivity information, nor the class name (those are handled externally).
get_config(object)
from_config(config, custom_objects = NULL)
object |
Layer or model object |
config |
Object with layer or model configuration |
custom_objects |
list of custom objects needed to instantiate the layer,
e.g., custom layers defined by |
get_config() returns an object with the configuration; from_config() returns a re-instantiation of the object.
Objects returned from get_config() are not serializable via RDS. If you want to save and restore a model across sessions, you can use save_model_config() (for model configuration only, not weights) or save_model() to save the model configuration and weights to the filesystem.
Other model functions: get_layer()
keras_model()
keras_model_sequential()
pop_layer()
summary.keras.src.models.model.Model()
Other layer methods: count_params()
get_weights()
quantize_weights()
reset_state()
Custom objects set using custom_object_scope() are not added to the global list of custom objects, and will not appear in the returned list.
get_custom_objects()
set_custom_objects(objects = named_list(), clear = TRUE)
objects |
A named list of custom objects, as returned by
|
clear |
bool, whether to clear the custom object registry before
populating it with |
An R named list mapping registered names to registered objects.
set_custom_objects() returns the registry values before updating, invisibly.
get_custom_objects()
You can use set_custom_objects() to restore a previous registry state.
# within a function, if you want to temporarily modify the registry,
function() {
  orig_objects <- set_custom_objects(clear = TRUE)
  on.exit(set_custom_objects(orig_objects))
  ## temporarily modify the global registry
  # register_keras_serializable(....)
  # .... <do work>
  # on.exit(), the previous registry state is restored.
}
register_keras_serializable() is preferred over set_custom_objects() for registering new objects.
Other serialization utilities: deserialize_keras_object()
get_registered_name()
get_registered_object()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()
By default the file at the URL origin is downloaded to the cache_dir ~/.keras, placed in the cache_subdir datasets, and given the filename fname. The final location of a file example.txt would therefore be ~/.keras/datasets/example.txt.
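As a minimal base R sketch of how that default location is composed (this only builds the path string; it does not call get_file() or touch the network):

```r
# Sketch of the default destination path described above (not keras3 code).
cache_dir    <- path.expand("~/.keras")  # default cache_dir
cache_subdir <- "datasets"               # default cache_subdir
fname        <- "example.txt"

destination <- file.path(cache_dir, cache_subdir, fname)
destination
```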
Files in .tar, .tar.gz, .tar.bz, and .zip formats can also be extracted.
Passing a hash will verify the file after download. The command line programs shasum and sha256sum can compute the hash.
get_file(
  fname = NULL,
  origin = NULL,
  ...,
  file_hash = NULL,
  cache_subdir = "datasets",
  hash_algorithm = "auto",
  extract = FALSE,
  archive_format = "auto",
  cache_dir = NULL,
  force_download = FALSE
)
fname |
Name of the file. If an absolute path, e.g. |
origin |
Original URL of the file. |
... |
For forward/backward compatibility. |
file_hash |
The expected hash string of the file after download. The sha256 and md5 hash algorithms are both supported. |
cache_subdir |
Subdirectory under the Keras cache dir where the file is
saved. If an absolute path, e.g. |
hash_algorithm |
Select the hash algorithm to verify the file.
options are |
extract |
|
archive_format |
Archive format to try for extracting the file.
Options are |
cache_dir |
Location to store cached files, when |
force_download |
If |
Path to the downloaded file.
** Warning on malicious downloads **
Downloading something from the Internet carries a risk.
NEVER download a file/archive if you do not trust the source.
We recommend that you specify the file_hash argument (if the hash of the source file is known) to make sure that the file you are getting is the one you expect.
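If no published hash is available but you hold a copy of the file that you already trust, one option is to compute the hash yourself. The sketch below uses tools::md5sum() from base R on a throwaway temporary file; the matches_hash() helper is hypothetical and only illustrates the comparison, it is not how get_file() verifies downloads internally.

```r
# Compute a file hash locally so it can later be supplied as `file_hash`.
# Assumption: the file on disk came from a source you already trust.
tmp <- tempfile(fileext = ".txt")
writeLines("hello", tmp)

expected_md5 <- unname(tools::md5sum(tmp))

# Hypothetical helper: TRUE when the file on disk matches the expected hash.
matches_hash <- function(path, expected) {
  identical(unname(tools::md5sum(path)), tolower(expected))
}

matches_hash(tmp, expected_md5)
```

The resulting string is the kind of value the file_hash argument expects for an md5 check.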
path_to_downloaded_file <- get_file(
  origin = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz",
  extract = TRUE
)
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Indices are based on order of horizontal graph traversal (bottom-up) and are 1-based. If name and index are both provided, index will take precedence.
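The lookup rules can be illustrated with a plain named list standing in for a model's layers. Everything in this sketch is hypothetical (pick_layer() is not a keras3 function), and it assumes that an index of -1 refers to the last layer.

```r
# Hypothetical stand-in for a model's layers: a plain named list.
layers <- list(input = "InputLayer", dense_1 = "Dense", dense_2 = "Dense")

# Sketch of the lookup rules: 1-based index, negative values count from the
# end, and `index` wins when both `name` and `index` are supplied.
pick_layer <- function(layers, name = NULL, index = NULL) {
  if (!is.null(index)) {
    if (index < 0) index <- length(layers) + index + 1
    return(names(layers)[index])
  }
  stopifnot(!is.null(name), name %in% names(layers))
  name
}

pick_layer(layers, index = 1)                  # first layer
pick_layer(layers, index = -1)                 # last layer
pick_layer(layers, name = "input", index = 2)  # index takes precedence
```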
get_layer(object, name = NULL, index = NULL)
object |
Keras model object |
name |
String, name of layer. |
index |
Integer, index of layer (1-based). Also valid are negative values, which count from the end of model. |
A layer instance.
Other model functions: get_config()
keras_model()
keras_model_sequential()
pop_layer()
summary.keras.src.models.model.Model()
This function is part of the Keras serialization and deserialization framework. It maps objects to the string names associated with those objects for serialization/deserialization.
get_registered_name(obj)
obj |
The object to look up. |
The name associated with the object, or the default name if the object is not registered.
Other serialization utilities: deserialize_keras_object()
get_custom_objects()
get_registered_object()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()
Returns the class associated with name if it is registered with Keras.
This function is part of the Keras serialization and deserialization framework. It maps strings to the objects associated with them for serialization/deserialization.
get_registered_object(name, custom_objects = NULL, module_objects = NULL)
name |
The name to look up. |
custom_objects |
A named list of custom objects to look the name up in. Generally, custom_objects is provided by the user. |
module_objects |
A named list of custom objects to look the name up in.
Generally, |
An instantiable class associated with name, or NULL if no such class exists.
from_config <- function(cls, config, custom_objects = NULL) {
  if ('my_custom_object_name' %in% names(config)) {
    config$hidden_cls <- get_registered_object(
      config$my_custom_object_name,
      custom_objects = custom_objects
    )
  }
}
Other serialization utilities: deserialize_keras_object()
get_custom_objects()
get_registered_name()
register_keras_serializable()
serialize_keras_object()
with_custom_object_scope()
Returns the list of input tensors necessary to compute tensor.
Output will always be a list of tensors (potentially with 1 element).
get_source_inputs(tensor)
tensor |
The tensor to start from. |
List of input tensors.
input <- keras_input(c(3))
output <- input |> layer_dense(4) |> op_multiply(5)
reticulate::py_id(get_source_inputs(output)[[1]]) == reticulate::py_id(input)
## [1] TRUE
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Layer/Model weights as R arrays
get_weights(object, trainable = NA)
set_weights(object, weights)
object |
Layer or model object |
trainable |
if |
weights |
Weights as R array |
A list of R arrays.
You can access the Layer/Model weights as KerasVariables (which are also backend-native tensors like tf.Variable) at object$weights, object$trainable_weights, or object$non_trainable_weights.
Other layer methods: count_params()
get_config()
quantize_weights()
reset_state()
Saves an image stored as an array to a path or file object.
image_array_save(
  x,
  path,
  data_format = NULL,
  file_format = NULL,
  scale = TRUE,
  ...
)
x |
An array. |
path |
Path or file object. |
data_format |
Image data format, either |
file_format |
Optional file format override. If omitted, the format to use is determined from the filename extension. If a file object was used instead of a filename, this parameter should always be used. |
scale |
Whether to rescale image values to be within |
... |
Additional keyword arguments passed to |
Called primarily for side effects. The input x is returned, invisibly, to enable usage with the pipe.
Other image utils: image_from_array()
image_load()
image_smart_resize()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Generates a tf.data.Dataset from image files in a directory.
If your directory structure is:
main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
Then calling image_dataset_from_directory(main_directory, labels = 'inferred') will return a tf.data.Dataset that yields batches of images from the subdirectories class_a and class_b, together with labels 0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).
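To make the mapping from subdirectories to labels concrete, the base R sketch below recreates the layout in a temporary directory and derives 0-based labels from the folder names. It assumes that class indices follow the sorted subdirectory names, and the files are empty placeholders rather than real images. Nothing here calls image_dataset_from_directory().

```r
# Build the example layout in a temporary directory (placeholder files only).
main_directory <- file.path(tempdir(), "main_directory")
for (cls in c("class_a", "class_b")) {
  dir.create(file.path(main_directory, cls), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(
    main_directory, cls,
    paste0(substr(cls, 7, 7), "_image_", 1:2, ".jpg")
  ))
}

# Assumption: class indices follow the sorted subdirectory names.
class_names <- sort(list.dirs(main_directory, full.names = FALSE, recursive = FALSE))
label_of <- setNames(seq_along(class_names) - 1L, class_names)

# One row per image file with the label its parent folder implies.
files <- list.files(main_directory, recursive = TRUE)
data.frame(file = files, label = unname(label_of[dirname(files)]))
```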
Supported image formats: .jpeg, .jpg, .png, .bmp, .gif. Animated gifs are truncated to the first frame.
image_dataset_from_directory(
  directory,
  labels = "inferred",
  label_mode = "int",
  class_names = NULL,
  color_mode = "rgb",
  batch_size = 32L,
  image_size = c(256L, 256L),
  shuffle = TRUE,
  seed = NULL,
  validation_split = NULL,
  subset = NULL,
  interpolation = "bilinear",
  follow_links = FALSE,
  crop_to_aspect_ratio = FALSE,
  pad_to_aspect_ratio = FALSE,
  data_format = NULL,
  verbose = TRUE
)
directory |
Directory where the data is located.
If |
labels |
Either |
label_mode |
String describing the encoding of
|
class_names |
Only valid if |
color_mode |
One of |
batch_size |
Size of the batches of data. Defaults to 32.
If |
image_size |
Size to resize images to after they are read from disk,
specified as |
shuffle |
Whether to shuffle the data. Defaults to |
seed |
Optional random seed for shuffling and transformations. |
validation_split |
Optional float between 0 and 1, fraction of data to reserve for validation. |
subset |
Subset of the data to return.
One of |
interpolation |
String, the interpolation method used when
resizing images.
Supports |
follow_links |
Whether to visit subdirectories pointed to by symlinks.
Defaults to |
crop_to_aspect_ratio |
If |
pad_to_aspect_ratio |
If |
data_format |
If |
verbose |
Whether to display number information on classes and
number of files found. Defaults to |
A tf.data.Dataset object.
If label_mode is NULL, it yields float32 tensors of shape (batch_size, image_size[1], image_size[2], num_channels), encoding images (see below for rules regarding num_channels).
Otherwise, it yields a tuple (images, labels), where images has shape (batch_size, image_size[1], image_size[2], num_channels), and labels follows the format described below.
Rules regarding labels format:
if label_mode is "int", the labels are an int32 tensor of shape (batch_size,).
if label_mode is "binary", the labels are a float32 tensor of 1s and 0s of shape (batch_size, 1).
if label_mode is "categorical", the labels are a float32 tensor of shape (batch_size, num_classes), representing a one-hot encoding of the class index.
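The three label encodings can be mimicked with ordinary R vectors and matrices. This is only a sketch of the shapes and values for a made-up batch of four images from two classes; the real function returns backend tensors, not R arrays.

```r
# Class indices for a hypothetical batch of four images from two classes.
class_index <- c(0L, 1L, 1L, 0L)
num_classes <- 2L

# label_mode = "int": integer vector of shape (batch_size,)
labels_int <- class_index

# label_mode = "binary": 1s and 0s of shape (batch_size, 1)
labels_binary <- matrix(as.numeric(class_index), ncol = 1)

# label_mode = "categorical": one-hot rows of shape (batch_size, num_classes)
labels_categorical <- diag(num_classes)[class_index + 1L, , drop = FALSE]

dim(labels_binary)
dim(labels_categorical)
labels_categorical
```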
Rules regarding number of channels in the yielded images:
if color_mode is "grayscale", there's 1 channel in the image tensors.
if color_mode is "rgb", there are 3 channels in the image tensors.
if color_mode is "rgba", there are 4 channels in the image tensors.
Other dataset utils: audio_dataset_from_directory()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other preprocessing: image_smart_resize()
text_dataset_from_directory()
timeseries_dataset_from_array()
Converts a 3D array to a PIL Image instance.
image_from_array(x, data_format = NULL, scale = TRUE, dtype = NULL)
x |
Input data, in any form that can be converted to an array. |
data_format |
Image data format, can be either |
scale |
Whether to rescale the image such that minimum and maximum values
are 0 and 255 respectively. Defaults to |
dtype |
Dtype to use. |
A PIL Image instance.
img <- array(runif(30000), dim = c(100, 100, 3))
pil_img <- image_from_array(img)
pil_img
## <PIL.Image.Image image mode=RGB size=100x100>
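The scale argument is described as rescaling the image so that its minimum and maximum become 0 and 255. A base R approximation of that min-max rescaling is sketched below; rescale_0_255() is a hypothetical helper, not the library's implementation, and the guard for constant images is an assumption.

```r
# Sketch of min-max rescaling to [0, 255]; an approximation of what
# `scale = TRUE` is described as doing, not the library's own code.
rescale_0_255 <- function(x) {
  x <- x - min(x)
  x_max <- max(x)
  if (x_max != 0) x <- x / x_max  # guard against constant images
  x * 255
}

img <- array(runif(30000), dim = c(100, 100, 3))
range(rescale_0_255(img))
```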
Other image utils: image_array_save()
image_load()
image_smart_resize()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Loads an image into PIL format.
image_load(
  path,
  color_mode = "rgb",
  target_size = NULL,
  interpolation = "nearest",
  keep_aspect_ratio = FALSE
)
path |
Path to image file. |
color_mode |
One of |
target_size |
Either |
interpolation |
Interpolation method used to resample the image if the
target size is different from that of the loaded image. Supported
methods are |
keep_aspect_ratio |
Boolean, whether to resize images to a target size without aspect ratio distortion. The image is cropped in the center with target aspect ratio before resizing. |
A PIL Image instance.
image_path <- get_file(origin = "https://www.r-project.org/logo/Rlogo.png")
(image <- image_load(image_path))
## <PIL.Image.Image image mode=RGB size=724x561>
input_arr <- image_to_array(image)
str(input_arr)
## num [1:561, 1:724, 1:3] 0 0 0 0 0 0 0 0 0 0 ...
input_arr %<>% array_reshape(dim = c(1, dim(input_arr))) # Convert single image to a batch.
model |> predict(input_arr)
Other image utils: image_array_save()
image_from_array()
image_smart_resize()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Image datasets typically yield images that each have a different size. However, these images need to be batched before they can be processed by Keras layers. To be batched, images need to share the same height and width.
You could simply do, in TF (or JAX equivalent):
size <- c(200, 200)
ds <- ds$map(\(img) tf$image$resize(img, size))
However, if you do this, you distort the aspect ratio of your images, since in general they do not all have the same aspect ratio as size. This is fine in many cases, but not always (e.g. for image generation models this can be a problem).
Note that passing the argument preserve_aspect_ratio = TRUE to tf$image$resize() will preserve the aspect ratio, but at the cost of no longer respecting the provided target size.
This calls for:
size <- c(200, 200)
ds <- ds$map(\(img) image_smart_resize(img, size))
Your output images will actually be (200, 200), and will not be distorted. Instead, the parts of the image that do not fit within the target size get cropped out.
The resizing process is:
Take the largest centered crop of the image that has the same aspect ratio as the target size. For instance, if size = c(200, 200) and the input image has size (340, 500), we take a crop of (340, 340) centered along the width.
Resize the cropped image to the target size. In the example above, we resize the (340, 340) crop to (200, 200).
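The first step, finding the largest centered crop with the target aspect ratio, can be worked through in base R. The helper below is a hypothetical sketch written for this explanation; the rounding with floor() and the 0-based offsets are assumptions, not a statement about how image_smart_resize() computes them.

```r
# Hypothetical helper: the largest centered crop with the target aspect ratio.
# `input_size` and `target_size` are both c(height, width).
centered_crop_box <- function(input_size, target_size) {
  h  <- input_size[1];  w  <- input_size[2]
  th <- target_size[1]; tw <- target_size[2]
  crop_h <- min(h, floor(w * th / tw))
  crop_w <- min(w, floor(h * tw / th))
  list(
    size   = c(crop_h, crop_w),
    # 0-based offsets of the crop's top-left corner within the input
    offset = c(floor((h - crop_h) / 2), floor((w - crop_w) / 2))
  )
}

# The example from the text: a (340, 500) image and a (200, 200) target.
centered_crop_box(c(340, 500), c(200, 200))
```

For this input the crop size is (340, 340) with an offset of 80 pixels along the width, after which the crop would be resized to (200, 200).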
image_smart_resize(
  x,
  size,
  interpolation = "bilinear",
  data_format = "channels_last",
  backend_module = NULL
)
x |
Input image or batch of images (as a tensor or array).
Must be in format |
size |
Tuple of |
interpolation |
String, interpolation to use for resizing.
Supports |
data_format |
|
backend_module |
Backend module to use (if different from the default backend). |
Array with shape (size[1], size[2], channels). If the input image was an array, the output is an array, and if it was a backend-native tensor, the output is a backend-native tensor.
Other image utils: image_array_save()
image_from_array()
image_load()
image_to_array()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other preprocessing: image_dataset_from_directory()
text_dataset_from_directory()
timeseries_dataset_from_array()
Converts a PIL Image instance to a 3D array.
image_to_array(img, data_format = NULL, dtype = NULL)
img |
Input PIL Image instance. |
data_format |
Image data format, can be either |
dtype |
Dtype to use. |
A 3D array.
image_path <- get_file(origin = "https://www.r-project.org/logo/Rlogo.png")
(img <- image_load(image_path))
## <PIL.Image.Image image mode=RGB size=724x561>
array <- image_to_array(img)
str(array)
## num [1:561, 1:724, 1:3] 0 0 0 0 0 0 0 0 0 0 ...
Other image utils: image_array_save()
image_from_array()
image_load()
image_smart_resize()
op_image_affine_transform()
op_image_crop()
op_image_extract_patches()
op_image_hsv_to_rgb()
op_image_map_coordinates()
op_image_pad()
op_image_resize()
op_image_rgb_to_grayscale()
op_image_rgb_to_hsv()
Other utils: audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Only scalar values are allowed. The constant value provided must be convertible to the dtype requested when calling the initializer.
initializer_constant(value = 0)
value |
A numeric scalar. |
An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.
# Standalone usage:
initializer <- initializer_constant(10)
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_constant(10)
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other constant initializers: initializer_identity()
initializer_ones()
initializer_zeros()
Other initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out)) where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
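The formula can be tried out in base R for a dense kernel of shape c(fan_in, fan_out). The sampler below is a simple sketch that assumes "truncated" means draws further than two standard deviations from the mean are redrawn; the library's exact sampling procedure may differ, and both helper functions are hypothetical.

```r
# Sketch: Glorot normal standard deviation for a dense kernel of shape
# c(fan_in, fan_out), plus a simple truncated-normal sampler.
# Assumption: draws beyond two standard deviations are rejected and redrawn.
glorot_normal_stddev <- function(fan_in, fan_out) sqrt(2 / (fan_in + fan_out))

truncated_normal <- function(n, sd, bound = 2) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- rnorm(n, mean = 0, sd = sd)
    out <- c(out, draw[abs(draw) <= bound * sd])
  }
  out[seq_len(n)]
}

set.seed(1)
sd_w <- glorot_normal_stddev(fan_in = 64, fan_out = 32)
w <- matrix(truncated_normal(64 * 32, sd_w), nrow = 64, ncol = 32)
c(stddev = sd_w, max_abs = max(abs(w)))
```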
initializer_glorot_normal(seed = NULL)
seed |
An integer or instance of
|
An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.
# Standalone usage:
initializer <- initializer_glorot_normal()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_glorot_normal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other random initializers: initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)) (fan_in is the number of input units in the weight tensor and fan_out is the number of output units).
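A short base R check of the limit, assuming a 2 x 2 dense kernel so that fan_in and fan_out are both 2. It also shows why the constant is 6: a uniform distribution on [-limit, limit] has variance limit^2 / 3, which here equals 2 / (fan_in + fan_out). The helper name is hypothetical and runif() stands in for the backend's random generator.

```r
# Sketch: Glorot uniform limit and a sample drawn with base R.
glorot_uniform_limit <- function(fan_in, fan_out) sqrt(6 / (fan_in + fan_out))

set.seed(1)
fan_in  <- 2
fan_out <- 2
limit  <- glorot_uniform_limit(fan_in, fan_out)
values <- matrix(runif(fan_in * fan_out, min = -limit, max = limit), nrow = fan_in)

# Variance of Uniform(-limit, limit) is limit^2 / 3.
c(limit = limit, variance = limit^2 / 3, target = 2 / (fan_in + fan_out))
```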
initializer_glorot_uniform(seed = NULL)
seed |
An integer or instance of
|
An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.
# Standalone usage:
initializer <- initializer_glorot_uniform()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_glorot_uniform()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other random initializers: initializer_glorot_normal()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in) where fan_in is the number of input units in the weight tensor.
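Because this standard deviation depends only on fan_in, two dense kernels with the same number of inputs get the same scale regardless of how many outputs they have. The base R sketch below contrasts that with the fan_in + fan_out form of the Glorot formula; the layer sizes are made up for illustration and the helper names are hypothetical.

```r
# Sketch: He normal stddev depends on fan_in only, while the Glorot normal
# stddev also shrinks as fan_out grows.
he_normal_stddev     <- function(fan_in) sqrt(2 / fan_in)
glorot_normal_stddev <- function(fan_in, fan_out) sqrt(2 / (fan_in + fan_out))

fan_in  <- 256
fan_out <- c(10, 1000)  # hypothetical layer widths

cmp <- data.frame(
  fan_out = fan_out,
  he      = he_normal_stddev(fan_in),
  glorot  = glorot_normal_stddev(fan_in, fan_out)
)
cmp
```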
initializer_he_normal(seed = NULL)
seed |
An integer or instance of
|
An Initializer instance that can be passed to layer or variable constructors, or called directly with a shape to return a Tensor.
# Standalone usage:
initializer <- initializer_he_normal()
values <- initializer(shape = c(2, 2))
# Usage in a Keras layer:
initializer <- initializer_he_normal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Draws samples from a uniform distribution within [-limit, limit]
, where
limit = sqrt(6 / fan_in)
(fan_in
is the number of input units in the
weight tensor).
initializer_he_uniform(seed = NULL)
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_he_uniform()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_he_uniform()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
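As a rough illustration of the formula above, the following base R sketch draws a weight matrix from the stated range without using keras3. The helper name `he_uniform_sketch` and the layer sizes are assumptions made for this example; it is not how the package implements the initializer.

```r
# Hypothetical helper (not part of keras3): sample a fan_in x fan_out
# weight matrix uniformly from [-limit, limit], limit = sqrt(6 / fan_in).
he_uniform_sketch <- function(fan_in, fan_out) {
  limit <- sqrt(6 / fan_in)
  matrix(runif(fan_in * fan_out, min = -limit, max = limit),
         nrow = fan_in, ncol = fan_out)
}

set.seed(1)
fan_in <- 64
w <- he_uniform_sketch(fan_in, fan_out = 32)
limit <- sqrt(6 / fan_in)

# Every sampled weight lies inside the stated range.
stopifnot(all(abs(w) <= limit))

# A uniform distribution on [-limit, limit] has variance limit^2 / 3,
# which here equals 2 / fan_in.
stopifnot(isTRUE(all.equal(limit^2 / 3, 2 / fan_in)))
```

The variance 2 / fan_in is the square of the standard deviation that initializer_he_normal() documents as stddev = sqrt(2 / fan_in), so the two initializers target the same spread.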
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Only usable for generating 2D matrices.
initializer_identity(gain = 1)
gain |
Multiplicative factor to apply to the identity matrix. |
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_identity()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_identity()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
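To make the role of gain concrete, here is a base R analogue of what the initializer produces for a 2D shape. The helper `identity_sketch` is hypothetical, and the treatment of non-square shapes (ones on the main diagonal, zeros elsewhere) is an assumption of this sketch rather than something stated above.

```r
# Hypothetical base R analogue (not the keras3 implementation):
# an identity matrix scaled by `gain`, for a 2D shape c(rows, cols).
identity_sketch <- function(shape, gain = 1) {
  stopifnot(length(shape) == 2)  # only 2D shapes are supported
  gain * diag(1, nrow = shape[1], ncol = shape[2])
}

square <- identity_sketch(c(2, 2))
scaled <- identity_sketch(c(2, 3), gain = 0.5)

stopifnot(isTRUE(all.equal(square, diag(2))))
stopifnot(scaled[1, 1] == 0.5, scaled[2, 2] == 0.5, scaled[2, 3] == 0)
```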
Other constant initializers: initializer_constant()
initializer_ones()
initializer_zeros()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Initializers allow you to pre-specify an initialization strategy, encoded in the Initializer object, without knowing the shape and dtype of the variable being initialized.
Draws samples from a truncated normal distribution centered on 0 with
stddev = sqrt(1 / fan_in)
where fan_in
is the number of input units in
the weight tensor.
initializer_lecun_normal(seed = NULL)
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_lecun_normal()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_lecun_normal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Draws samples from a uniform distribution within [-limit, limit]
, where
limit = sqrt(3 / fan_in)
(fan_in
is the number of input units in the
weight tensor).
initializer_lecun_uniform(seed = NULL)
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_lecun_uniform()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_lecun_uniform()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
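One way to read the limit sqrt(3 / fan_in) is that it gives each weight a variance of 1 / fan_in. The base R sketch below checks this numerically for a bias-free dense layer. The layer sizes, the number of inputs, and the assumption that the inputs have zero mean and unit variance are all choices made for this illustration only.

```r
set.seed(123)
fan_in  <- 256
fan_out <- 128
limit   <- sqrt(3 / fan_in)

# Weights drawn uniformly from [-limit, limit]; their variance is
# limit^2 / 3 = 1 / fan_in.
w <- matrix(runif(fan_in * fan_out, -limit, limit), nrow = fan_in)
stopifnot(isTRUE(all.equal(limit^2 / 3, 1 / fan_in)))

# 2000 example inputs with zero mean and unit variance (an assumption).
x <- matrix(rnorm(2000 * fan_in), ncol = fan_in)

# Pre-activations of a bias-free dense layer.
z <- x %*% w

# The pre-activation variance should stay close to the input variance.
var_z <- var(as.vector(z))
stopifnot(abs(var_z - 1) < 0.1)
```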
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Also available via the shortcut function ones
.
initializer_ones()
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_ones()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_ones()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other constant initializers: initializer_constant()
initializer_identity()
initializer_zeros()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
If the shape of the tensor to initialize is two-dimensional, it is initialized with an orthogonal matrix obtained from the QR decomposition of a matrix of random numbers drawn from a normal distribution. If the matrix has fewer rows than columns then the output will have orthogonal rows. Otherwise, the output will have orthogonal columns.
If the shape of the tensor to initialize is more than two-dimensional,
a matrix of shape (shape[1] * ... * shape[n - 1], shape[n])
is initialized, where n
is the length of the shape vector.
The matrix is subsequently reshaped to give a tensor of the desired shape.
initializer_orthogonal(gain = 1, seed = NULL)
gain |
Multiplicative factor to apply to the orthogonal matrix. |
seed |
An integer. Used to make the behavior of the initializer deterministic. |
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_orthogonal()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_orthogonal()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
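The QR-based construction described above can be sketched in base R for the two-dimensional case. The helper `orthogonal_sketch` is hypothetical and omits details a real implementation may add (for example, sign corrections on the Q factor). It only shows how a random normal matrix yields orthonormal rows or columns depending on the shape.

```r
# Hypothetical helper (not the keras3 implementation).
orthogonal_sketch <- function(rows, cols, gain = 1) {
  big   <- max(rows, cols)
  small <- min(rows, cols)
  # Random normal matrix, arranged tall so that Q has orthonormal columns.
  a <- matrix(rnorm(big * small), nrow = big, ncol = small)
  q <- qr.Q(qr(a))              # big x small, orthonormal columns
  if (rows < cols) q <- t(q)    # small x big, orthonormal rows
  gain * q
}

set.seed(7)
tall <- orthogonal_sketch(5, 3)  # more rows than columns
wide <- orthogonal_sketch(3, 5)  # fewer rows than columns

# Orthonormal columns: t(tall) %*% tall is the identity.
stopifnot(isTRUE(all.equal(crossprod(tall), diag(3))))
# Orthonormal rows: wide %*% t(wide) is the identity.
stopifnot(isTRUE(all.equal(tcrossprod(wide), diag(3))))
```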
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Draws samples from a normal distribution for given parameters.
initializer_random_normal(mean = 0, stddev = 0.05, seed = NULL)
mean |
A numeric scalar. Mean of the random values to generate. |
stddev |
A numeric scalar. Standard deviation of the random values to generate. |
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_random_normal(mean = 0.0, stddev = 1.0)
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_random_normal(mean = 0.0, stddev = 1.0)
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
Draws samples from a uniform distribution for given parameters.
initializer_random_uniform(minval = -0.05, maxval = 0.05, seed = NULL)
minval |
A numeric scalar or a scalar keras tensor. Lower bound of the range of random values to generate (inclusive). |
maxval |
A numeric scalar or a scalar keras tensor. Upper bound of the range of random values to generate (exclusive). |
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_random_uniform(minval = 0.0, maxval = 1.0)
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_random_uniform(minval = 0.0, maxval = 1.0)
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_truncated_normal()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_truncated_normal()
initializer_variance_scaling()
initializer_zeros()
The values generated are similar to values from a
RandomNormal
initializer, except that values more
than two standard deviations from the mean are
discarded and re-drawn.
initializer_truncated_normal(mean = 0, stddev = 0.05, seed = NULL)
mean |
A numeric scalar. Mean of the random values to generate. |
stddev |
A numeric scalar. Standard deviation of the random values to generate. |
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_truncated_normal(mean = 0, stddev = 1)
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_truncated_normal(mean = 0, stddev = 1)
layer <- layer_dense(units = 3, kernel_initializer = initializer)
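The "discard and re-draw" rule in the description can be sketched in base R as simple rejection sampling. The helper `truncated_normal_sketch` is hypothetical and only mirrors the behaviour described above. The backend may generate truncated values differently.

```r
# Hypothetical helper: re-draw any value that lies more than two
# standard deviations from the mean until none remain.
truncated_normal_sketch <- function(n, mean = 0, stddev = 0.05) {
  values <- rnorm(n, mean, stddev)
  out_of_range <- abs(values - mean) > 2 * stddev
  while (any(out_of_range)) {
    values[out_of_range] <- rnorm(sum(out_of_range), mean, stddev)
    out_of_range <- abs(values - mean) > 2 * stddev
  }
  values
}

set.seed(2024)
draws <- truncated_normal_sketch(1000, mean = 0, stddev = 1)

# All retained values are within two standard deviations of the mean.
stopifnot(length(draws) == 1000, all(abs(draws) <= 2))
```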
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_variance_scaling()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_variance_scaling()
initializer_zeros()
With distribution = "truncated_normal" or "untruncated_normal"
, samples are
drawn from a truncated/untruncated normal distribution with a mean of zero
and a standard deviation (after truncation, if used) stddev = sqrt(scale / n)
, where n
is:
number of input units in the weight tensor, if mode = "fan_in"
number of output units, if mode = "fan_out"
average of the numbers of input and output units, if mode = "fan_avg"
With distribution = "uniform"
, samples are drawn from a uniform distribution
within [-limit, limit]
, where limit = sqrt(3 * scale / n)
.
initializer_variance_scaling(
  scale = 1,
  mode = "fan_in",
  distribution = "truncated_normal",
  seed = NULL
)
scale |
Scaling factor (positive float). |
mode |
One of |
distribution |
Random distribution to use.
One of |
seed |
An integer or instance of
|
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_variance_scaling(scale = 0.1, mode = 'fan_in', distribution = 'uniform')
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_variance_scaling(scale = 0.1, mode = 'fan_in', distribution = 'uniform')
layer <- layer_dense(units = 3, kernel_initializer = initializer)
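The rules for n, stddev and limit given in the description can be collected into a small base R function. The helper `variance_scaling_spread` is hypothetical and only evaluates the formulas. It draws no samples and ignores any internal correction applied when truncating.

```r
# Hypothetical helper: evaluate the spread documented for
# initializer_variance_scaling() without sampling anything.
variance_scaling_spread <- function(fan_in, fan_out, scale = 1,
                                    mode = c("fan_in", "fan_out", "fan_avg"),
                                    distribution = c("truncated_normal",
                                                     "untruncated_normal",
                                                     "uniform")) {
  mode <- match.arg(mode)
  distribution <- match.arg(distribution)
  n <- switch(mode,
    fan_in  = fan_in,
    fan_out = fan_out,
    fan_avg = (fan_in + fan_out) / 2
  )
  if (distribution == "uniform") {
    c(limit = sqrt(3 * scale / n))
  } else {
    c(stddev = sqrt(scale / n))
  }
}

he_u <- variance_scaling_spread(64, 32, scale = 2, mode = "fan_in",
                                distribution = "uniform")
lecun_n <- variance_scaling_spread(64, 32, scale = 1, mode = "fan_in",
                                   distribution = "truncated_normal")

# scale = 2 with "fan_in" and "uniform" gives limit = sqrt(6 / fan_in).
stopifnot(isTRUE(all.equal(unname(he_u), sqrt(6 / 64))))
# scale = 1 with "fan_in" and a normal gives stddev = sqrt(1 / fan_in).
stopifnot(isTRUE(all.equal(unname(lecun_n), sqrt(1 / 64))))
```

These two settings reproduce the limit documented for initializer_he_uniform() and the standard deviation documented for initializer_lecun_normal().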
Other random initializers: initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_zeros()
Initializer that generates tensors initialized to 0.
initializer_zeros()
An Initializer
instance that can be passed to layer or variable
constructors, or called directly with a shape
to return a Tensor.
# Standalone usage:
initializer <- initializer_zeros()
values <- initializer(shape = c(2, 2))

# Usage in a Keras layer:
initializer <- initializer_zeros()
layer <- layer_dense(units = 3, kernel_initializer = initializer)
Other constant initializers: initializer_constant()
initializer_identity()
initializer_ones()
Other initializers: initializer_constant()
initializer_glorot_normal()
initializer_glorot_uniform()
initializer_he_normal()
initializer_he_uniform()
initializer_identity()
initializer_lecun_normal()
initializer_lecun_uniform()
initializer_ones()
initializer_orthogonal()
initializer_random_normal()
initializer_random_uniform()
initializer_truncated_normal()
initializer_variance_scaling()
This function will install Keras along with a selected backend, including all Python dependencies.
install_keras(
  envname = "r-keras",
  ...,
  extra_packages = c("scipy", "pandas", "Pillow", "pydot", "ipython", "tensorflow_datasets"),
  python_version = ">=3.9,<=3.11",
  backend = c("tensorflow", "jax"),
  gpu = NA,
  restart_session = TRUE
)
envname |
Name of or path to a Python virtual environment |
... |
reserved for future compatibility. |
extra_packages |
Additional Python packages to install alongside Keras |
python_version |
Passed on to |
backend |
Which backend(s) to install. Accepted values include |
gpu |
whether to install a GPU capable version of the backend. |
restart_session |
Whether to restart the R session after installing (note this will only occur within RStudio). |
No return value, called for side effects.
tensorflow::install_tensorflow()
The keras
module object is the equivalent of
reticulate::import("keras")
and provided mainly as a convenience.
An object of class python.builtin.module
the keras Python module
A Keras tensor is a symbolic tensor-like object, which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model.
For instance, if a
, b
and c
are Keras tensors,
it becomes possible to do:
model <- keras_model(inputs = c(a, b), outputs = c)
keras_input(
  shape = NULL,
  batch_size = NULL,
  dtype = NULL,
  sparse = NULL,
  batch_shape = NULL,
  name = NULL,
  tensor = NULL,
  optional = FALSE
)
shape |
A shape list (list of integers or |
batch_size |
Optional static batch size (integer). |
dtype |
The data type expected by the input, as a string
(e.g. |
sparse |
A boolean specifying whether the expected input will be sparse
tensors. Note that, if |
batch_shape |
Optional shape list (list of integers or |
name |
Optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided. |
tensor |
Optional existing tensor to wrap into the |
optional |
Boolean, whether the input is optional or not.
An optional input can accept |
A Keras tensor,
which can be passed to the inputs
argument of keras_model()
.
# This is a logistic regression in Keras
input <- layer_input(shape=c(32))
output <- input |> layer_dense(16, activation='softmax')
model <- keras_model(input, output)
Other model creation: keras_model()
keras_model_sequential()
A model is a directed acyclic graph of layers.
keras_model(inputs = NULL, outputs = NULL, ...)
inputs |
Input tensor(s) (from |
outputs |
Output tensors (from calling layers with |
... |
Any additional arguments |
A Model
instance.
library(keras3)

# input tensor
inputs <- keras_input(shape = c(784))

# outputs compose input + dense layers
predictions <- inputs |>
  layer_dense(units = 64, activation = 'relu') |>
  layer_dense(units = 64, activation = 'relu') |>
  layer_dense(units = 10, activation = 'softmax')

# create and compile model
model <- keras_model(inputs = inputs, outputs = predictions)
model |> compile(
  optimizer = 'rmsprop',
  loss = 'categorical_crossentropy',
  metrics = c('accuracy')
)
Other model functions: get_config()
get_layer()
keras_model_sequential()
pop_layer()
summary.keras.src.models.model.Model()
Other model creation: keras_input()
keras_model_sequential()
Keras Model composed of a linear stack of layers
keras_model_sequential(
  input_shape = NULL,
  name = NULL,
  ...,
  input_dtype = NULL,
  input_batch_size = NULL,
  input_sparse = NULL,
  input_batch_shape = NULL,
  input_name = NULL,
  input_tensor = NULL,
  input_optional = FALSE,
  trainable = TRUE,
  layers = list()
)
input_shape |
A shape integer vector,
not including the batch size.
For instance, |
name |
Name of model |
... |
additional arguments passed on to |
input_dtype |
The data type expected by the input, as a string
(e.g. |
input_batch_size |
Optional static batch size (integer). |
input_sparse |
A boolean specifying whether the expected input will be sparse
tensors. Note that, if |
input_batch_shape |
An optional way to specify |
input_name |
Optional name string for the input layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided. |
input_tensor |
Optional existing tensor to wrap into the |
input_optional |
Boolean, whether the input is optional or not.
An optional input can accept |
trainable |
Boolean, whether the model's variables should be trainable.
You can also change the trainable status of a model/layer with
|
layers |
List of layers to add to the model. |
A Sequential
model instance.
model <- keras_model_sequential(input_shape = c(784))
model |>
  layer_dense(units = 32) |>
  layer_activation('relu') |>
  layer_dense(units = 10) |>
  layer_activation('softmax')

model |> compile(
  optimizer = 'rmsprop',
  loss = 'categorical_crossentropy',
  metrics = c('accuracy')
)

model
## Model: "sequential"
## +---------------------------------+------------------------+---------------+
## | Layer (type)                    | Output Shape           |       Param # |
## +=================================+========================+===============+
## | dense (Dense)                   | (None, 32)             |        25,120 |
## +---------------------------------+------------------------+---------------+
## | activation (Activation)         | (None, 32)             |             0 |
## +---------------------------------+------------------------+---------------+
## | dense_1 (Dense)                 | (None, 10)             |           330 |
## +---------------------------------+------------------------+---------------+
## | activation_1 (Activation)       | (None, 10)             |             0 |
## +---------------------------------+------------------------+---------------+
## Total params: 25,450 (99.41 KB)
## Trainable params: 25,450 (99.41 KB)
## Non-trainable params: 0 (0.00 B)
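The parameter counts in the summary can be checked by hand. The short base R sketch below does the arithmetic. The helper `dense_params` is hypothetical, and the size calculation assumes 4 bytes per parameter (float32 weights).

```r
# Hypothetical helper: parameters of a dense layer with a bias term
# (one weight per input/unit pair, plus one bias per unit).
dense_params <- function(n_inputs, units) n_inputs * units + units

p1 <- dense_params(784, 32)   # first dense layer
p2 <- dense_params(32, 10)    # second dense layer
total <- p1 + p2              # activation layers add no parameters

# Size in KB, assuming 4 bytes per parameter (float32).
size_kb <- total * 4 / 1024

stopifnot(p1 == 25120, p2 == 330, total == 25450)
stopifnot(isTRUE(all.equal(round(size_kb, 2), 99.41)))
```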
If input_shape
is omitted, then the model layer
shapes, including the final model output shape, will not be known until
the model is built, either by calling the model with an input tensor/array
like model(input)
, (possibly via fit()
/evaluate()
/predict()
), or by
explicitly calling model$build(input_shape)
.
Other model functions: get_config()
get_layer()
keras_model()
pop_layer()
summary.keras.src.models.model.Model()
Other model creation: keras_input()
keras_model()
Layer class.
A layer is a callable object that takes as input one or more tensors and
that outputs one or more tensors. It involves computation, defined
in the call()
method, and a state (weight variables). State can be
created:
in initialize()
, for instance via self$add_weight()
;
in the optional build()
method, which is invoked by the first
call()
to the layer, and supplies the shape(s) of the input(s),
which may not have been known at initialization time.
Layers are recursively composable: If you assign a Layer instance as an
attribute of another Layer, the outer layer will start tracking the weights
created by the inner layer. Nested layers should be instantiated in the
initialize()
method or build()
method.
Users will just instantiate a layer and then treat it as a callable.
Layer(
  classname,
  initialize = NULL,
  call = NULL,
  build = NULL,
  get_config = NULL,
  ...,
  public = list(),
  private = list(),
  inherit = NULL,
  parent_env = parent.frame()
)
classname |
String, the name of the custom class. (Conventionally, CamelCase). |
initialize , call , build , get_config
|
Recommended methods to implement. See description and details sections. |
... , public
|
Additional methods or public members of the custom class. |
private |
Named list of R objects (typically, functions) to include in
instance private environments. |
inherit |
What the custom class will subclass. By default, the base keras class. |
parent_env |
The R environment that all class methods will have as a grandparent. |
A composing layer constructor, with similar behavior to other layer
functions like layer_dense()
. The first argument of the returned function
will be object
, which lets you initialize()
and call()
the layer in one
step while composing the layer with the pipe, like
layer_foo <- Layer("Foo", ....)
output <- inputs |> layer_foo()
To only initialize()
a layer instance and not call()
it, pass a missing
or NULL
value to object
, or pass all arguments to initialize()
by name.
layer <- layer_dense(units = 2, activation = "relu")
layer <- layer_dense(NULL, 2, activation = "relu")
layer <- layer_dense(, 2, activation = "relu")

# then you can call() the layer in a separate step
outputs <- inputs |> layer()
All R function custom methods (public and private) will have the following symbols in scope:
self
: The custom class instance.
super
: The custom class superclass.
private
: An R environment specific to the class instance.
Any objects assigned here are invisible to the Keras framework.
__class__
and as.symbol(classname)
: the custom class type object.
name
: The name of the layer (string).
dtype
: Dtype of the layer's weights. Alias of layer$variable_dtype
.
variable_dtype
: Dtype of the layer's weights.
compute_dtype
: The dtype of the layer's computations.
Layers automatically cast inputs to this dtype, which causes
the computations and output to also be in this dtype.
When mixed precision is used with a
keras$mixed_precision$DTypePolicy
, this will be different
than variable_dtype
.
trainable_weights
: List of variables to be included in backprop.
non_trainable_weights
: List of variables that should not be
included in backprop.
weights
: The concatenation of the lists trainable_weights
and
non_trainable_weights
(in this order).
trainable
: Whether the layer should be trained (boolean), i.e.
whether its potentially-trainable weights should be returned
as part of layer$trainable_weights
.
input_spec
: Optional (list of) InputSpec
object(s) specifying the
constraints on inputs that can be accepted by the layer.
We recommend that custom Layer
s implement the following methods:
initialize()
: Defines custom layer attributes, and creates layer weights
that do not depend on input shapes, using add_weight()
,
or other state.
build(input_shape)
: This method can be used to create weights that
depend on the shape(s) of the input(s), using add_weight()
, or other
state. Calling call()
will automatically build the layer
(if it has not been built yet) by calling build()
.
call(...)
: Method called after making
sure build()
has been called. call()
performs the logic of applying
the layer to the input arguments.
Two reserved arguments you can optionally use in call()
are:
training
(boolean, whether the call is in inference mode or
training mode).
mask
(boolean tensor encoding masked timesteps in the input,
used e.g. in RNN layers).
A typical signature for this method is call(inputs)
, and the user
can optionally add training
and mask
if the layer needs them.
get_config()
: Returns a named list containing the configuration
used to initialize this layer. If the list names differ from the arguments
in initialize()
, then override from_config()
as well.
This method is used when saving
the layer or a model that contains this layer.
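To show why the names returned by get_config() should match the arguments of initialize(), here is a base R sketch of a configuration round trip. Both functions are hypothetical stand-ins (plain lists rather than real layers); the point is only that the config can be handed back to the constructor with do.call() when the names line up.

```r
# Hypothetical stand-ins for a layer constructor and its get_config().
new_toy_layer <- function(units = 32, activation = "linear") {
  list(units = units, activation = activation)
}

toy_get_config <- function(layer) {
  # Names match the constructor's arguments exactly.
  list(units = layer$units, activation = layer$activation)
}

original <- new_toy_layer(units = 4, activation = "relu")
config   <- toy_get_config(original)

# Because the names match, the config rebuilds an equivalent object.
restored <- do.call(new_toy_layer, config)
stopifnot(identical(original, restored))
```

If the config used a different name for one of these entries, do.call() would fail with an unused-argument error, which is the situation where a custom from_config() is needed.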
Here's a basic example: a layer with two variables, w
and b
,
that returns y <- (x %*% w) + b
.
It shows how to implement build()
and call()
.
Variables set as attributes of a layer are tracked as weights
of the layer (in layer$weights
).
layer_simple_dense <- Layer(
  "SimpleDense",
  initialize = function(units = 32) {
    super$initialize()
    self$units <- units
  },

  # Create the state of the layer (weights)
  build = function(input_shape) {
    self$kernel <- self$add_weight(
      shape = shape(tail(input_shape, 1), self$units),
      initializer = "glorot_uniform",
      trainable = TRUE,
      name = "kernel"
    )
    self$bias <- self$add_weight(
      shape = shape(self$units),
      initializer = "zeros",
      trainable = TRUE,
      name = "bias"
    )
  },

  # Defines the computation; `self` is already in scope
  call = function(inputs) {
    op_matmul(inputs, self$kernel) + self$bias
  }
)

# Instantiates the layer.
# Supply missing `object` arg to skip invoking `call()` and instead return
# the Layer instance
linear_layer <- layer_simple_dense(, 4)

# This will call `build(input_shape)` and create the weights,
# and then invoke `call()`.
y <- linear_layer(op_ones(c(2, 2)))
stopifnot(length(linear_layer$weights) == 2)

# These weights are trainable, so they're listed in `trainable_weights`:
stopifnot(length(linear_layer$trainable_weights) == 2)
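The shapes involved in this example can be traced with plain base R matrices. In the sketch below, `kernel` is filled with random normal values as a stand-in for the "glorot_uniform" initializer, and all object names are chosen for this illustration only.

```r
x <- matrix(1, nrow = 2, ncol = 2)   # same values as op_ones(c(2, 2))
units <- 4

set.seed(3)
# Kernel shape is (last input dimension, units), as in build() above.
kernel <- matrix(rnorm(ncol(x) * units), nrow = ncol(x), ncol = units)
bias <- rep(0, units)                # the "zeros" initializer

# x %*% kernel has shape (2, 4); the bias is added to every row.
y <- sweep(x %*% kernel, 2, bias, "+")

stopifnot(identical(dim(kernel), c(2L, 4L)))
stopifnot(identical(dim(y), c(2L, 4L)))
```

Because both rows of `x` are identical, both rows of `y` are identical as well.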
Besides trainable weights, updated via backpropagation during training,
layers can also have non-trainable weights. These weights are meant to
be updated manually during call()
. Here's an example layer that computes
the running sum of its inputs:
layer_compute_sum <- Layer(
  classname = "ComputeSum",

  initialize = function(input_dim) {
    super$initialize()

    # Create a non-trainable weight.
    self$total <- self$add_weight(
      shape = shape(),
      initializer = "zeros",
      trainable = FALSE,
      name = "total"
    )
  },

  call = function(inputs) {
    self$total$assign(self$total + op_sum(inputs))
    self$total
  }
)

my_sum <- layer_compute_sum(, 2)
x <- op_ones(c(2, 2))
y <- my_sum(x)

stopifnot(exprs = {
  all.equal(my_sum$weights, list(my_sum$total))
  all.equal(my_sum$non_trainable_weights, list(my_sum$total))
  all.equal(my_sum$trainable_weights, list())
})
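The state kept by this layer behaves like a running total that persists between calls. A base R closure gives the same effect without Keras. The function `make_running_sum` is a hypothetical analogue, not part of the package.

```r
# Hypothetical analogue of the ComputeSum layer: `total` plays the role
# of the non-trainable weight and is updated on every call.
make_running_sum <- function() {
  total <- 0
  function(inputs) {
    total <<- total + sum(inputs)
    total
  }
}

running_sum <- make_running_sum()
x <- matrix(1, nrow = 2, ncol = 2)

first  <- running_sum(x)   # 0 + 4
second <- running_sum(x)   # 4 + 4

stopifnot(first == 4, second == 8)
```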
initialize(..., activity_regularizer = NULL, trainable = TRUE, dtype = NULL, autocast = TRUE, name = NULL)
Initialize self. This method is typically called from a custom initialize()
method.
Example:
layer_my_layer <- Layer("MyLayer",
  initialize = function(units, ..., dtype = NULL, name = NULL) {
    super$initialize(..., dtype = dtype, name = name)
    # .... finish initializing `self` instance
  }
)
Args:
trainable: Boolean, whether the layer's variables should be trainable.
name: String name of the layer.
dtype: The dtype of the layer's computations and weights. Can also be a
keras$DTypePolicy
,
which allows the computation and
weight dtype to differ. Defaults to NULL
. NULL
means to use
config_dtype_policy()
,
which is a "float32"
policy unless set to different value
(via config_set_dtype_policy()
).
add_loss(loss)
Can be called inside of the call()
method to add a scalar loss.
Example:
Layer("MyLayer",
  ...
  call = function(x) {
    self$add_loss(op_sum(x))
    x
  }
)
add_metric(...)
add_variable(...)
Add a weight variable to the layer.
Alias of add_weight()
.
add_weight(shape = NULL, initializer = NULL, dtype = NULL, trainable = TRUE, autocast = TRUE, regularizer = NULL, constraint = NULL, aggregation = 'mean', name = NULL)
Add a weight variable to the layer.
Args:
shape: Shape for the variable (as defined by shape()). Must be fully-defined (no NA/NULL/-1 entries). Defaults to () (scalar) if unspecified.
initializer: Initializer object to use to populate the initial variable value, or string name of a built-in initializer (e.g. "random_normal"). If unspecified, defaults to "glorot_uniform" for floating-point variables and to "zeros" for all other types (e.g. int, bool).
dtype: Dtype of the variable to create, e.g. "float32". If unspecified, defaults to the layer's variable dtype (which itself defaults to "float32" if unspecified).
trainable: Boolean, whether the variable should be trainable via backprop or whether its updates are managed manually. Defaults to TRUE.
autocast: Boolean, whether to autocast the layer's variables when accessing them. Defaults to TRUE.
regularizer: Regularizer object to call to apply a penalty on the weight. These penalties are summed into the loss function during optimization. Defaults to NULL.
constraint: Constraint object to call on the variable after any optimizer update, or string name of a built-in constraint. Defaults to NULL.
aggregation: String, one of 'mean', 'sum', 'only_first_replica'. Annotates the variable with the type of multi-replica aggregation to be used for this variable when writing custom data parallel training loops.
name: String name of the variable. Useful for debugging purposes.
Returns:
A backend tensor, wrapped in a KerasVariable class. The KerasVariable class has
Methods:
assign(value)
assign_add(value)
assign_sub(value)
numpy() (calling as.array(<variable>) is preferred)
Properties/Attributes:
value
dtype
ndim
shape (calling shape(<variable>) is preferred)
trainable
build(input_shape)
build_from_config(config)
Builds the layer's states with the supplied config (named list of args).
By default, this method calls do.call(build, config$input_shape), which creates weights based on the layer's input shape in the supplied config. If your config contains other information needed to load the layer's state, you should override this method.
Args:
config: Named list containing the input shape associated with this layer.
call(...)
See description above
compute_mask(inputs, previous_mask)
compute_output_shape(...)
compute_output_spec(...)
count_params()
Count the total number of scalars composing the weights.
Returns: An integer count.
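As a rough illustration of what this count represents, the base-R sketch below sums the number of elements across a set of weight shapes. The helper count_params_reference() and the example shapes (a 4-by-3 kernel plus a length-3 bias) are hypothetical and are not part of keras3.

```r
# Hypothetical helper: total number of scalars across a list of weight shapes.
count_params_reference <- function(weight_shapes) {
  # prod() of a shape gives the number of elements in that weight;
  # prod(integer(0)) is 1, so a scalar weight counts as one parameter.
  as.integer(sum(vapply(weight_shapes, prod, numeric(1))))
}

# Assumed shapes for a dense layer mapping 4 inputs to 3 units.
shapes <- list(kernel = c(4, 3), bias = 3)
count_params_reference(shapes)  # 4 * 3 + 3 = 15
```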
get_build_config()
Returns a named list with the layer's input shape.
This method returns a config (named list) that can be used by build_from_config(config) to create all states (e.g. Variables and Lookup tables) needed by the layer.
By default, the config only contains the input shape that the layer was built with. If you're writing a custom layer that creates state in an unusual way, you should override this method to make sure this state is already created when Keras attempts to load its value upon model loading.
Returns: A named list containing the input shape associated with the layer.
get_config()
Returns the config of the object.
An object config is a named list (serializable) containing the information needed to re-instantiate it. The config is expected to be serializable to JSON, and is expected to consist of a (potentially complex, nested) structure of named lists consisting of simple objects like strings and ints.
get_weights()
Return the values of layer$weights as a list of R or NumPy arrays.
quantize(mode, type_check = TRUE)
Currently, only the Dense, EinsumDense and Embedding layers support in-place quantization via this quantize() method.
Example:
model$quantize("int8") # quantize model in-place
model |> predict(data) # faster inference
quantized_build(input_shape, mode)
quantized_call(...)
load_own_variables(store)
Loads the state of the layer.
You can override this method to take full control of how the state of the layer is loaded upon calling load_model().
Args:
store: Named list from which the state of the model will be loaded.
save_own_variables(store)
Saves the state of the layer.
You can override this method to take full control of how the state of the layer is saved upon calling save_model().
Args:
store: Named list where the state of the model will be saved.
set_weights(weights)
Sets the values of weights from a list of R or NumPy arrays.
stateless_call(trainable_variables, non_trainable_variables, ..., return_losses = FALSE)
Call the layer without any side effects.
Args:
trainable_variables: List of trainable variables of the model.
non_trainable_variables: List of non-trainable variables of the model.
...: Positional and named arguments to be passed to call().
return_losses: If TRUE, stateless_call() will return the list of losses created during call() as part of its return values.
Returns:
An unnamed list. By default, returns list(outputs, non_trainable_variables). If return_losses = TRUE, then returns list(outputs, non_trainable_variables, losses).
Note: non_trainable_variables include not only non-trainable weights such as BatchNormalization statistics, but also RNG seed state (if there are any random operations part of the layer, such as dropout), and Metric state (if there are any metrics attached to the layer). These are all elements of the layer's state.
Example:
model <- ...
data <- ...
trainable_variables <- model$trainable_variables
non_trainable_variables <- model$non_trainable_variables
# Call the model with zero side effects
c(outputs, non_trainable_variables) %<-% model$stateless_call(
  trainable_variables,
  non_trainable_variables,
  data
)
# Attach the updated state to the model
# (until you do this, the model is still in its pre-call state).
purrr::walk2(
  model$non_trainable_variables,
  non_trainable_variables,
  \(variable, value) variable$assign(value))
symbolic_call(...)
from_config(config)
Creates a layer from its config.
This is a class method, meaning the R function will not have a self symbol (a class instance) in scope. Use __class__ (or the classname symbol provided when the Layer() was constructed) to resolve the class definition.
The default implementation is:
from_config = function(config) {
  do.call(`__class__`, config)
}
This method is the reverse of get_config(), capable of instantiating the same layer from the config named list. It does not handle layer connectivity (handled by Network), nor weights (handled by set_weights()).
Args:
config: A named list, typically the output of get_config().
Returns: A layer instance.
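The config round trip described here can be mimicked in plain R. The sketch below is only an analogy: new_scaler(), get_config_reference() and from_config_reference() are hypothetical stand-ins for a layer class and its methods, and none of them is part of keras3. It shows the same do.call() pattern as the default implementation, where the config is a named list whose names match the constructor's arguments.

```r
# Hypothetical stand-in for a layer class: a constructor taking named arguments.
new_scaler <- function(factor = 1, name = "scaler") {
  structure(list(factor = factor, name = name), class = "scaler")
}

# The config is a named list holding everything needed to rebuild the object.
get_config_reference <- function(obj) {
  list(factor = obj$factor, name = obj$name)
}

# Rebuild by splicing the config into the constructor call.
from_config_reference <- function(config) {
  do.call(new_scaler, config)
}

original <- new_scaler(factor = 2.5, name = "my_scaler")
restored <- from_config_reference(get_config_reference(original))
identical(original, restored)
```

The round trip only works when every constructor argument is captured in the config, which is why a custom get_config() usually has to add the arguments introduced by a custom initialize().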
compute_dtype
The dtype of the computations performed by the layer.
dtype
Alias of layer$variable_dtype.
input_dtype
The dtype layer inputs should be converted to.
losses
List of scalar losses from add_loss(), regularizers and sublayers.
metrics
List of all metrics.
metrics_variables
List of all metric variables.
non_trainable_variables
List of all non-trainable layer state.
This extends layer$non_trainable_weights to include all state used by the layer including state for metrics and SeedGenerators.
non_trainable_weights
List of all non-trainable weight variables of the layer.
These are the weights that should not be updated by the optimizer during training. Unlike layer$non_trainable_variables, this excludes metric state and random seeds.
trainable_variables
List of all trainable layer state.
This is equivalent to layer$trainable_weights.
trainable_weights
List of all trainable weight variables of the layer.
These are the weights that get updated by the optimizer during training.
path
The path of the layer.
If the layer has not been built yet, it will be NULL.
quantization_mode
The quantization mode of this layer, NULL if not quantized.
variable_dtype
The dtype of the state (weights) of the layer.
variables
List of all layer state, including random seeds.
This extends layer$weights to include all state used by the layer including SeedGenerators.
Note that metrics variables are not included here; use metrics_variables to visit all the metric variables.
weights
List of all weight variables of the layer.
Unlike layer$variables, this excludes metric state and random seeds.
input
Retrieves the input tensor(s) of a symbolic operation.
Only returns the tensor(s) corresponding to the first time the operation was called.
Returns: Input tensor or list of input tensors.
output
Retrieves the output tensor(s) of a layer.
Only returns the tensor(s) corresponding to the first time the operation was called.
Returns: Output tensor or list of output tensors.
dtype_policy
input_spec
supports_masking
Whether this layer supports computing a mask using compute_mask.
trainable
Settable boolean, whether this layer should be trainable or not.
Other layers: layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Applies an activation function to an output.
layer_activation(object, activation, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
activation |
Activation function. It could be a callable, or the name of
an activation from the |
... |
Base layer keyword arguments, such as |
The return value depends on the value provided for the first argument. If object is:
a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
a keras_input(), then the output tensor from calling layer(input) is returned.
NULL or missing, then a Layer instance is returned.
x <- array(c(-3, -1, 0, 2))
layer <- layer_activation(activation = 'relu')
layer(x)
## tf.Tensor([0. 0. 0. 2.], shape=(4), dtype=float32)
layer <- layer_activation(activation = activation_relu)
layer(x)
## tf.Tensor([0. 0. 0. 2.], shape=(4), dtype=float32)
layer <- layer_activation(activation = op_relu)
layer(x)
## tf.Tensor([0. 0. 0. 2.], shape=(4), dtype=float32)
Other activation layers: layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
Other layers: Layer()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Formula:
f(x) = alpha * (exp(x) - 1.) for x < 0
f(x) = x for x >= 0
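The piecewise definition above can be checked numerically with a few lines of base R. This is a minimal reference sketch, not the layer's implementation; elu_reference() is a hypothetical name, and expm1() is used only because it computes exp(x) - 1 accurately for small x.

```r
# Hypothetical reference implementation of the ELU formula in base R.
elu_reference <- function(x, alpha = 1) {
  # Identity for x >= 0, alpha * (exp(x) - 1) for x < 0.
  ifelse(x >= 0, x, alpha * expm1(x))
}

x <- c(-5, -1, 0, 2)
elu_reference(x)

# For very negative inputs the output saturates towards -alpha.
elu_reference(-Inf, alpha = 2)
```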
layer_activation_elu(object, alpha = 1, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
alpha |
float, slope of negative section. Defaults to |
... |
Base layer keyword arguments, such as |
The return value depends on the value provided for the first argument. If object is:
a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
a keras_input(), then the output tensor from calling layer(input) is returned.
NULL or missing, then a Layer instance is returned.
Other activation layers: layer_activation()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
Other layers: Layer()
layer_activation()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
This layer allows a small gradient when the unit is not active.
Formula:
f <- function(x) ifelse(x >= 0, x, negative_slope * x)
layer_activation_leaky_relu(object, negative_slope = 0.3, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
negative_slope |
Float >= 0.0. Negative slope coefficient.
Defaults to |
... |
Base layer keyword arguments, such as
|
The return value depends on the value provided for the first argument. If object is:
a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
a keras_input(), then the output tensor from calling layer(input) is returned.
NULL or missing, then a Layer instance is returned.
leaky_relu_layer <- layer_activation_leaky_relu(negative_slope = 0.5)
input <- array(c(-10, -5, 0.0, 5, 10))
result <- leaky_relu_layer(input)
as.array(result)
## [1] -5.0 -2.5 0.0 5.0 10.0
Other activation layers: layer_activation()
layer_activation_elu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Formula:
f <- function(x) ifelse(x >= 0, x, alpha * x)
where alpha is a learned array with the same shape as x.
layer_activation_parametric_relu( object, alpha_initializer = "Zeros", alpha_regularizer = NULL, alpha_constraint = NULL, shared_axes = NULL, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
alpha_initializer |
Initializer function for the weights. |
alpha_regularizer |
Regularizer for the weights. |
alpha_constraint |
Constraint for the weights. |
shared_axes |
The axes along which to share learnable parameters for the
activation function. For example, if the incoming feature maps are
from a 2D convolution with output shape
|
... |
Base layer keyword arguments, such as |
The return value depends on the value provided for the first argument. If object is:
a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
a keras_input(), then the output tensor from calling layer(input) is returned.
NULL or missing, then a Layer instance is returned.
Other activation layers: layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_relu()
layer_activation_softmax()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Formula:
f <- function(x, max_value = Inf, negative_slope = 0, threshold = 0) {
  if (x >= max_value) max_value
  else if (threshold <= x && x < max_value) x
  else negative_slope * (x - threshold)
}
layer_activation_relu( object, max_value = NULL, negative_slope = 0, threshold = 0, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
max_value |
Float >= 0. Maximum activation value. |
negative_slope |
Float >= 0. Negative slope coefficient.
Defaults to |
threshold |
Float >= 0. Threshold value for thresholded activation.
Defaults to |
... |
Base layer keyword arguments, such as |
The return value depends on the value provided for the first argument. If object is:
a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
a keras_input(), then the output tensor from calling layer(input) is returned.
NULL or missing, then a Layer instance is returned.
relu_layer <- layer_activation_relu(max_value = 10, negative_slope = 0.5, threshold = 0)
input <- array(c(-10, -5, 0.0, 5, 10))
result <- relu_layer(input)
as.array(result)
## [1] -5.0 -2.5 0.0 5.0 10.0
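The values in the example above can be reproduced without Keras by a vectorized base-R sketch of the three-branch rule: cap at max_value, pass through between threshold and max_value, and apply negative_slope * (x - threshold) below the threshold. relu_reference() is a hypothetical helper, and Inf stands in for the layer's max_value = NULL default.

```r
# Hypothetical vectorized reference for the thresholded, capped, leaky ReLU.
relu_reference <- function(x, max_value = Inf, negative_slope = 0, threshold = 0) {
  ifelse(
    x >= max_value, max_value,                  # saturate at the cap
    ifelse(
      x >= threshold, x,                        # pass-through region
      negative_slope * (x - threshold)          # leaky region below threshold
    )
  )
}

input <- c(-10, -5, 0, 5, 10)
relu_reference(input, max_value = 10, negative_slope = 0.5, threshold = 0)
# -5.0 -2.5 0.0 5.0 10.0, matching the layer output shown above
```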
Other activation layers: layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_softmax()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Formula:
exp_x = exp(x - max(x))
f(x) = exp_x / sum(exp_x)
layer_activation_softmax(object, axis = -1L, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
axis |
Integer, or list of Integers, axis along which the softmax normalization is applied. |
... |
Base layer keyword arguments, such as |
Softmaxed output with the same shape as inputs.
softmax_layer <- layer_activation_softmax()
input <- op_array(c(1, 2, 1))
softmax_layer(input)
## tf.Tensor([0.21194157 0.5761169 0.21194157], shape=(3), dtype=float32)
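The formula can be verified against this output with a short base-R sketch. softmax_reference() is a hypothetical helper for a single vector, not the layer's implementation. Subtracting max(x) before exponentiating leaves the result unchanged but keeps exp() from overflowing on large logits.

```r
# Hypothetical reference softmax for a single numeric vector.
softmax_reference <- function(x) {
  exp_x <- exp(x - max(x))  # shift by the maximum for numerical stability
  exp_x / sum(exp_x)
}

softmax_reference(c(1, 2, 1))
# approximately 0.2119416 0.5761169 0.2119416

# Shifting all logits by a constant gives the same probabilities,
# and the max-subtraction avoids exp(1000) overflowing to Inf.
softmax_reference(c(1000, 1001, 1000))
```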
inputs: The inputs (logits) to the softmax layer.
mask: A boolean mask of the same shape as inputs. The mask specifies 1 to keep and 0 to mask. Defaults to NULL.
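The effect of the mask can be sketched in base R as follows. This is an illustration under an assumption: masked positions are replaced by -Inf before the softmax, so they receive zero probability. The source does not state how the layer applies the mask internally, and masked_softmax() is a hypothetical helper.

```r
# Hypothetical masked softmax for a single numeric vector.
# mask uses 1 (or TRUE) to keep a position and 0 (or FALSE) to mask it.
masked_softmax <- function(x, mask = NULL) {
  if (!is.null(mask)) {
    # Assumption: masked logits are pushed to -Inf so exp() maps them to 0.
    x <- ifelse(as.logical(mask), x, -Inf)
  }
  exp_x <- exp(x - max(x))
  exp_x / sum(exp_x)  # note: undefined (NaN) if every position is masked
}

# The middle logit is masked out; the remaining two share the probability mass.
masked_softmax(c(1, 2, 1), mask = c(1, 0, 1))
```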
Other activation layers: layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Layer that applies an update to the cost function based on input activity.
layer_activity_regularization(object, l1 = 0, l2 = 0, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
l1 |
L1 regularization factor (positive float). |
l2 |
L2 regularization factor (positive float). |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
Arbitrary. Use the keyword argument input_shape
(tuple of integers, does not include the samples axis)
when using this layer as the first layer in a model.
Same shape as input.
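As a rough illustration of the penalty this layer adds to the cost function, the base-R sketch below computes an L1/L2 term for a small matrix of activations. The helper name `activity_penalty` is hypothetical and not part of keras3, and the sketch assumes the plain form `l1 * sum(|x|) + l2 * sum(x^2)`, ignoring any batch-size normalization the backend may apply.

```r
# Hypothetical helper (not part of keras3): L1/L2 penalty on activations.
# Assumes the plain form l1 * sum(|x|) + l2 * sum(x^2).
activity_penalty <- function(x, l1 = 0, l2 = 0) {
  l1 * sum(abs(x)) + l2 * sum(x^2)
}

# A 2 x 2 matrix of example activations.
acts <- matrix(c(-1, 2, 0.5, -0.5), nrow = 2)

# sum(|x|) = 4 and sum(x^2) = 5.5, so the penalty is 0.01 * 4 + 0.1 * 5.5.
penalty <- activity_penalty(acts, l1 = 0.01, l2 = 0.1)
penalty
```

With both factors left at their default of 0, the penalty is 0 and the layer has no effect on the loss.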
Other regularization layers: layer_alpha_dropout()
layer_dropout()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape).
layer_add(inputs, ...)
inputs |
layers to combine |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
input_shape <- c(1, 2, 3)
x1 <- op_ones(input_shape)
x2 <- op_ones(input_shape)
layer_add(x1, x2)
## tf.Tensor(
## [[[2. 2. 2.]
##   [2. 2. 2.]]], shape=(1, 2, 3), dtype=float32)

Usage in a Keras model:

input1 <- layer_input(shape = c(16))
x1 <- input1 |> layer_dense(8, activation = 'relu')
input2 <- layer_input(shape = c(32))
x2 <- input2 |> layer_dense(8, activation = 'relu')
# equivalent to `added <- layer_add(list(x1, x2))`
added <- layer_add(x1, x2)
output <- added |> layer_dense(4)
model <- keras_model(inputs = c(input1, input2), outputs = output)
Other merging layers: layer_average()
layer_concatenate()
layer_dot()
layer_maximum()
layer_minimum()
layer_multiply()
layer_subtract()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Inputs are a list with 2 or 3 elements:
A query
tensor of shape (batch_size, Tq, dim)
.
A value
tensor of shape (batch_size, Tv, dim)
.
An optional key
tensor of shape (batch_size, Tv, dim)
. If none
supplied, value
will be used as key
.
The calculation follows the steps:
Calculate attention scores using query
and key
with shape
(batch_size, Tq, Tv)
as a non-linear sum
scores = reduce_sum(tanh(query + key), axis=-1)
.
Use scores to calculate a softmax distribution with shape
(batch_size, Tq, Tv)
.
Use the softmax distribution to create a linear combination of value
with shape (batch_size, Tq, dim)
.
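The three steps above can be sketched in base R for a single batch element. This is only an illustration under simplifying assumptions: the helper names `softmax_rows` and `additive_attention` are hypothetical, and the sketch leaves out the learned scale (`use_scale`), dropout, and masking.

```r
# Row-wise softmax; subtracting the row maximum keeps exp() numerically stable.
softmax_rows <- function(m) {
  e <- exp(m - apply(m, 1, max))
  e / rowSums(e)
}

# Hypothetical helper: additive attention for one batch element.
# query has shape (Tq, dim); value and key have shape (Tv, dim).
additive_attention <- function(query, value, key = value) {
  Tq <- nrow(query)
  Tv <- nrow(key)
  scores <- matrix(0, Tq, Tv)
  for (i in seq_len(Tq)) {
    for (j in seq_len(Tv)) {
      # scores = reduce_sum(tanh(query + key), axis = -1)
      scores[i, j] <- sum(tanh(query[i, ] + key[j, ]))
    }
  }
  weights <- softmax_rows(scores)               # (Tq, Tv)
  list(output = weights %*% value, weights = weights)  # output: (Tq, dim)
}

set.seed(1)
q <- matrix(rnorm(2 * 4), nrow = 2)  # Tq = 2, dim = 4
v <- matrix(rnorm(3 * 4), nrow = 3)  # Tv = 3, dim = 4
res <- additive_attention(q, v)
dim(res$output)
```

Each row of `res$weights` sums to 1, so every output row is a convex combination of the rows of `value`.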
layer_additive_attention(object, use_scale = TRUE, dropout = 0, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
use_scale |
If |
dropout |
Float between 0 and 1. Fraction of the units to drop for the
attention scores. Defaults to |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
inputs
: List of the following tensors:
query
: Query tensor of shape (batch_size, Tq, dim)
.
value
: Value tensor of shape (batch_size, Tv, dim)
.
key
: Optional key tensor of shape (batch_size, Tv, dim)
. If
not given, will use value
for both key
and value
, which is
the most common case.
mask
: List of the following tensors:
query_mask
: A boolean mask tensor of shape (batch_size, Tq)
.
If given, the output will be zero at the positions where
mask==FALSE
.
value_mask
: A boolean mask tensor of shape (batch_size, Tv)
.
If given, will apply the mask such that values at positions
where mask==FALSE
do not contribute to the result.
return_attention_scores
: bool, if TRUE
, returns the attention scores
(after masking and softmax) as an additional output argument.
training
: Boolean indicating whether the layer should behave in
training mode (adding dropout) or in inference mode (no dropout).
use_causal_mask
: Boolean. Set to TRUE
for decoder self-attention. Adds
a mask such that position i
cannot attend to positions j > i
.
This prevents the flow of information from the future towards the
past. Defaults to FALSE
.
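The effect of `use_causal_mask` can be illustrated with a small base-R sketch. The helper `causal_mask` is hypothetical, and setting masked scores to `-Inf` before the softmax is an assumption made for illustration, not a description of the backend implementation.

```r
# Hypothetical helper: logical mask of shape (Tq, Tv) that is TRUE where
# position i may attend to position j, that is, where j <= i.
causal_mask <- function(Tq, Tv) {
  outer(seq_len(Tq), seq_len(Tv), function(i, j) j <= i)
}

mask <- causal_mask(3, 3)

# Start from uniform scores and block the "future" positions.
scores <- matrix(0, 3, 3)
scores[!mask] <- -Inf

# Row-wise softmax: masked positions receive zero weight.
weights <- exp(scores) / rowSums(exp(scores))
weights
```

Row `i` of `weights` spreads its mass only over positions `1..i`, so the first query attends only to the first value.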
Attention outputs of shape (batch_size, Tq, dim)
.
(Optional) Attention scores after masking and softmax with shape
(batch_size, Tq, Tv)
.
Other attention layers: layer_attention()
layer_group_query_attention()
layer_multi_head_attention()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Alpha Dropout is a Dropout
that keeps mean and variance of inputs
to their original values, in order to ensure the self-normalizing property
even after this dropout.
Alpha Dropout fits well to Scaled Exponential Linear Units (SELU) by
randomly setting activations to the negative saturation value.
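To make the mean- and variance-preserving property concrete, here is a base-R sketch of the alpha dropout transformation as described in the self-normalizing networks formulation. It is an illustration under stated assumptions, not the keras3 implementation: the helper `alpha_dropout` is hypothetical, and the constants assume the standard SELU parameters and zero-mean, unit-variance inputs.

```r
# Hypothetical helper: alpha dropout on a numeric vector or array.
alpha_dropout <- function(x, rate) {
  # Negative saturation value of SELU (-lambda * alpha).
  alpha_p <- -1.7580993408473766
  # Keep each element with probability 1 - rate.
  keep <- runif(length(x)) >= rate
  # Affine correction chosen so that mean 0 and variance 1 are preserved.
  a <- ((1 - rate) * (1 + rate * alpha_p^2))^(-0.5)
  b <- -a * alpha_p * rate
  out <- x
  out[!keep] <- alpha_p  # dropped units are set to the saturation value
  a * out + b
}

set.seed(42)
x <- rnorm(1e5)                  # zero mean, unit variance
y <- alpha_dropout(x, rate = 0.2)
c(mean = mean(y), var = var(y))
```

Unlike ordinary dropout, dropped units are not zeroed; they are moved to the saturation value and the whole tensor is then rescaled and shifted, which keeps the sample mean close to 0 and the variance close to 1.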
layer_alpha_dropout(object, rate, noise_shape = NULL, seed = NULL, ...)
object |
Object to compose the layer with. A tensor, array, or sequential model. |
rate |
Float between 0 and 1. The multiplicative noise will have
standard deviation |
noise_shape |
1D integer tensor representing the shape of the
binary alpha dropout mask that will be multiplied with the input.
For instance, if your inputs have shape
|
seed |
An integer to use as random seed. |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
inputs
: Input tensor (of any rank).
training
: R boolean indicating whether the layer should behave in
training mode (adding alpha dropout) or in inference mode
(doing nothing).
Other regularization layers: layer_activity_regularization()
layer_dropout()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Inputs are a list with 2 or 3 elements:
A query
tensor of shape (batch_size, Tq, dim)
.
A value
tensor of shape (batch_size, Tv, dim)
.
An optional key
tensor of shape (batch_size, Tv, dim)
. If none
supplied, value
will be used as a key
.
The calculation follows the steps:
Calculate attention scores using query
and key
with shape
(batch_size, Tq, Tv)
.
Use scores to calculate a softmax distribution with shape
(batch_size, Tq, Tv)
.
Use the softmax distribution to create a linear combination of value
with shape (batch_size, Tq, dim)
.
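For the default `score_mode = "dot"`, the steps above reduce to a few matrix operations. The base-R sketch below handles a single batch element; the helper names `softmax_rows` and `dot_attention` are hypothetical, and scaling, dropout, and masking are left out.

```r
# Row-wise softmax; subtracting the row maximum keeps exp() numerically stable.
softmax_rows <- function(m) {
  e <- exp(m - apply(m, 1, max))
  e / rowSums(e)
}

# Hypothetical helper: dot-product attention for one batch element.
# query has shape (Tq, dim); value and key have shape (Tv, dim).
dot_attention <- function(query, value, key = value) {
  scores <- query %*% t(key)        # (Tq, Tv)
  weights <- softmax_rows(scores)   # each row sums to 1
  list(output = weights %*% value, weights = weights)  # output: (Tq, dim)
}

# With an all-zero query every score is 0, so the weights are uniform
# and each output row is the column mean of `value`.
q <- matrix(0, nrow = 2, ncol = 3)
v <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
res <- dot_attention(q, v)
res$output
```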
layer_attention( object, use_scale = FALSE, score_mode = "dot", dropout = 0, seed = NULL, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
use_scale |
If |
score_mode |
Function to use to compute attention scores, one of
|
dropout |
Float between 0 and 1. Fraction of the units to drop for the
attention scores. Defaults to |
seed |
An integer to use as random seed in case of |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
inputs
: List of the following tensors:
query
: Query tensor of shape (batch_size, Tq, dim)
.
value
: Value tensor of shape (batch_size, Tv, dim)
.
key
: Optional key tensor of shape (batch_size, Tv, dim)
. If
not given, will use value
for both key
and value
, which is
the most common case.
mask
: List of the following tensors:
query_mask
: A boolean mask tensor of shape (batch_size, Tq)
.
If given, the output will be zero at the positions where
mask==FALSE
.
value_mask
: A boolean mask tensor of shape (batch_size, Tv)
.
If given, will apply the mask such that values at positions
where mask==FALSE
do not contribute to the result.
return_attention_scores
: bool, if TRUE
, returns the attention scores
(after masking and softmax) as an additional output argument.
training
: Boolean indicating whether the layer should behave in
training mode (adding dropout) or in inference mode (no dropout).
use_causal_mask
: Boolean. Set to TRUE
for decoder self-attention. Adds
a mask such that position i
cannot attend to positions j > i
.
This prevents the flow of information from the future towards the
past. Defaults to FALSE
.
Attention outputs of shape (batch_size, Tq, dim)
.
(Optional) Attention scores after masking and softmax with shape
(batch_size, Tq, Tv)
.
Other attention layers: layer_additive_attention()
layer_group_query_attention()
layer_multi_head_attention()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
It takes as input a list of tensors, all of the same shape, and returns a single tensor (also of the same shape).
layer_average(inputs, ...)
inputs |
layers to combine |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
input_shape <- c(1, 2, 3)
x1 <- op_ones(input_shape)
x2 <- op_zeros(input_shape)
layer_average(x1, x2)
## tf.Tensor(
## [[[0.5 0.5 0.5]
##   [0.5 0.5 0.5]]], shape=(1, 2, 3), dtype=float32)

Usage in a Keras model:

input1 <- layer_input(shape = c(16))
x1 <- input1 |> layer_dense(8, activation = 'relu')
input2 <- layer_input(shape = c(32))
x2 <- input2 |> layer_dense(8, activation = 'relu')
added <- layer_average(x1, x2)
output <- added |> layer_dense(4)
model <- keras_model(inputs = c(input1, input2), outputs = output)
Other merging layers: layer_add()
layer_concatenate()
layer_dot()
layer_maximum()
layer_minimum()
layer_multiply()
layer_subtract()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Downsamples the input representation by taking the average value over the
window defined by pool_size
. The window is shifted by strides
. The
resulting output when using "valid" padding option has a shape of:
output_shape = (input_shape - pool_size + 1) / strides
The resulting output shape when using the "same" padding option is:
output_shape = input_shape / strides
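The windowing described above can be reproduced in base R for the "valid" case. The helper `avg_pool_1d_valid` is hypothetical and works on a plain numeric vector (one sample, one feature); it is meant only to show how the window and the stride interact.

```r
# Hypothetical helper: 1D average pooling with "valid" padding.
avg_pool_1d_valid <- function(x, pool_size, strides) {
  # Window start positions; only windows that fit entirely are used.
  starts <- seq(1, length(x) - pool_size + 1, by = strides)
  vapply(starts, function(s) mean(x[s:(s + pool_size - 1)]), numeric(1))
}

x <- c(1, 2, 3, 4, 5)
pooled_s1 <- avg_pool_1d_valid(x, pool_size = 2, strides = 1)  # 1.5 2.5 3.5 4.5
pooled_s2 <- avg_pool_1d_valid(x, pool_size = 2, strides = 2)  # 1.5 3.5
```

With `strides = 2` the last element of `x` is never covered by a full window, so it does not contribute to the output.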
layer_average_pooling_1d( object, pool_size, strides = NULL, padding = "valid", data_format = NULL, name = NULL, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
pool_size |
int, size of the average pooling window. |
strides |
int or |
padding |
string, either |
data_format |
string, either |
name |
String, name for the object |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
If data_format="channels_last"
:
3D tensor with shape (batch_size, steps, features)
.
If data_format="channels_first"
:
3D tensor with shape (batch_size, features, steps)
.
If data_format="channels_last"
:
3D tensor with shape (batch_size, downsampled_steps, features)
.
If data_format="channels_first"
:
3D tensor with shape (batch_size, features, downsampled_steps)
.
strides=1
and padding="valid"
:
x <- op_array(c(1., 2., 3., 4., 5.)) |> op_reshape(c(1, 5, 1))
output <- x |>
  layer_average_pooling_1d(pool_size = 2, strides = 1, padding = "valid")
output
## tf.Tensor(
## [[[1.5]
##   [2.5]
##   [3.5]
##   [4.5]]], shape=(1, 4, 1), dtype=float32)
strides=2
and padding="valid"
:
x <- op_array(c(1., 2., 3., 4., 5.)) |> op_reshape(c(1, 5, 1))
output <- x |>
  layer_average_pooling_1d(pool_size = 2, strides = 2, padding = "valid")
output
## tf.Tensor(
## [[[1.5]
##   [3.5]]], shape=(1, 2, 1), dtype=float32)
strides=1
and padding="same"
:
x <- op_array(c(1., 2., 3., 4., 5.)) |> op_reshape(c(1, 5, 1))
output <- x |>
  layer_average_pooling_1d(pool_size = 2, strides = 1, padding = "same")
output
## tf.Tensor(
## [[[1.5]
##   [2.5]
##   [3.5]
##   [4.5]
##   [5. ]]], shape=(1, 5, 1), dtype=float32)
Other pooling layers: layer_average_pooling_2d()
layer_average_pooling_3d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Downsamples the input along its spatial dimensions (height and width)
by taking the average value over an input window
(of size defined by pool_size
) for each channel of the input.
The window is shifted by strides
along each dimension.
The resulting output when using the "valid"
padding option has a spatial
shape (number of rows or columns) of:
output_shape = math.floor((input_shape - pool_size) / strides) + 1
(when input_shape >= pool_size
)
The resulting output shape when using the "same"
padding option is:
output_shape = math.floor((input_shape - 1) / strides) + 1
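The two formulas above translate directly into base R. The helper `pooled_size` below is hypothetical and only computes the spatial size along each pooled dimension; it accepts vectors, so height and width can be passed together.

```r
# Hypothetical helper: spatial output size of a pooling layer.
pooled_size <- function(input_size, pool_size, strides,
                        padding = c("valid", "same")) {
  padding <- match.arg(padding)
  if (padding == "valid") {
    # The "valid" formula only applies when the window fits in the input.
    stopifnot(all(input_size >= pool_size))
    floor((input_size - pool_size) / strides) + 1
  } else {
    floor((input_size - 1) / strides) + 1
  }
}

valid_s1 <- pooled_size(c(3, 3), pool_size = 2, strides = 1, padding = "valid")  # 2 2
valid_s2 <- pooled_size(c(3, 4), pool_size = 2, strides = 2, padding = "valid")  # 1 2
same_s1  <- pooled_size(c(3, 3), pool_size = 2, strides = 1, padding = "same")   # 3 3
```

With "same" padding and a stride of 1 the spatial size is unchanged, whereas "valid" padding shrinks each dimension by `pool_size - 1`.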
layer_average_pooling_2d( object, pool_size, strides = NULL, padding = "valid", data_format = NULL, name = NULL, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
pool_size |
int or list of 2 integers, factors by which to downscale (dim1, dim2). If only one integer is specified, the same window length will be used for all dimensions. |
strides |
int or list of 2 integers, or |
padding |
string, either |
data_format |
string, either |
name |
String, name for the object |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
If data_format="channels_last"
:
4D tensor with shape (batch_size, height, width, channels)
.
If data_format="channels_first"
:
4D tensor with shape (batch_size, channels, height, width)
.
If data_format="channels_last"
:
4D tensor with shape
(batch_size, pooled_height, pooled_width, channels)
.
If data_format="channels_first"
:
4D tensor with shape
(batch_size, channels, pooled_height, pooled_width)
.
strides=(1, 1)
and padding="valid"
:
x <- op_array(1:9, "float32") |> op_reshape(c(1, 3, 3, 1))
output <- x |>
  layer_average_pooling_2d(pool_size = c(2, 2), strides = c(1, 1), padding = "valid")
output
## tf.Tensor(
## [[[[3.]
##    [4.]]
##
##   [[6.]
##    [7.]]]], shape=(1, 2, 2, 1), dtype=float32)
strides=(2, 2)
and padding="valid"
:
x <- op_array(1:12, "float32") |> op_reshape(c(1, 3, 4, 1))
output <- x |>
  layer_average_pooling_2d(pool_size = c(2, 2), strides = c(2, 2), padding = "valid")
output
## tf.Tensor(
## [[[[3.5]
##    [5.5]]]], shape=(1, 1, 2, 1), dtype=float32)
strides=(1, 1)
and padding="same"
:
x <- op_array(1:9, "float32") |> op_reshape(c(1, 3, 3, 1))
output <- x |>
  layer_average_pooling_2d(pool_size = c(2, 2), strides = c(1, 1), padding = "same")
output
## tf.Tensor(
## [[[[3. ]
##    [4. ]
##    [4.5]]
##
##   [[6. ]
##    [7. ]
##    [7.5]]
##
##   [[7.5]
##    [8.5]
##    [9. ]]]], shape=(1, 3, 3, 1), dtype=float32)
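To make the numbers in the examples above easier to verify by hand, here is a minimal base R sketch of average pooling over a single-channel matrix. The function `avg_pool_2d()` is hypothetical and is not how keras3 implements the layer. It assumes that with "same" padding any extra padding is placed at the end of each dimension and that padded positions are excluded from the average, which is consistent with the printed outputs.

```r
# Hypothetical reference implementation (not keras3 code): average pooling of
# one single-channel image stored as a matrix.
avg_pool_2d <- function(m, pool = c(2, 2), strides = c(1, 1), padding = "valid") {
  in_dim <- dim(m)
  if (padding == "valid") {
    out_dim <- floor((in_dim - pool) / strides) + 1
    pad_total <- c(0, 0)
  } else {
    out_dim <- floor((in_dim - 1) / strides) + 1
    # Assumption: total padding per dimension, with the odd remainder at the end.
    pad_total <- pmax((out_dim - 1) * strides + pool - in_dim, 0)
  }
  pad_before <- pad_total %/% 2
  out <- matrix(NA_real_, out_dim[1], out_dim[2])
  for (i in seq_len(out_dim[1])) {
    for (j in seq_len(out_dim[2])) {
      r0 <- (i - 1) * strides[1] - pad_before[1] + 1
      c0 <- (j - 1) * strides[2] - pad_before[2] + 1
      rows <- r0:(r0 + pool[1] - 1)
      cols <- c0:(c0 + pool[2] - 1)
      # Keep only positions inside the input, so padding does not enter the mean.
      rows <- rows[rows >= 1 & rows <= in_dim[1]]
      cols <- cols[cols >= 1 & cols <= in_dim[2]]
      out[i, j] <- mean(m[rows, cols])
    }
  }
  out
}

# Row-major fill, matching op_reshape() on 1:9
m <- matrix(1:9, nrow = 3, byrow = TRUE)
avg_pool_2d(m, padding = "valid")
avg_pool_2d(m, padding = "same")
```

The two calls reproduce the 2 x 2 and 3 x 3 results shown for the first and third examples.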
Other pooling layers: layer_average_pooling_1d()
layer_average_pooling_3d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Downsamples the input along its spatial dimensions (depth, height, and
width) by taking the average value over an input window (of size defined by
pool_size
) for each channel of the input. The window is shifted by
strides
along each dimension.
layer_average_pooling_3d( object, pool_size, strides = NULL, padding = "valid", data_format = NULL, name = NULL, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
pool_size |
int or list of 3 integers, factors by which to downscale (dim1, dim2, dim3). If only one integer is specified, the same window length will be used for all dimensions. |
strides |
int or list of 3 integers, or |
padding |
string, either |
data_format |
string, either |
name |
String, name for the object |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
If data_format="channels_last"
:
5D tensor with shape:
(batch_size, spatial_dim1, spatial_dim2, spatial_dim3, channels)
If data_format="channels_first"
:
5D tensor with shape:
(batch_size, channels, spatial_dim1, spatial_dim2, spatial_dim3)
If data_format="channels_last"
:
5D tensor with shape:
(batch_size, pooled_dim1, pooled_dim2, pooled_dim3, channels)
If data_format="channels_first"
:
5D tensor with shape:
(batch_size, channels, pooled_dim1, pooled_dim2, pooled_dim3)
depth <- height <- width <- 30
channels <- 3
inputs <- layer_input(shape = c(depth, height, width, channels))
outputs <- inputs |> layer_average_pooling_3d(pool_size = 3)
outputs # Shape: (batch_size, 10, 10, 10, 3)
## <KerasTensor shape=(None, 10, 10, 10, 3), dtype=float32, sparse=False, name=keras_tensor_1>
Other pooling layers: layer_average_pooling_1d()
layer_average_pooling_2d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1.
Importantly, batch normalization works differently during training and during inference.
During training (i.e. when using fit()
or when calling the layer/model
with the argument training = TRUE
), the layer normalizes its output using
the mean and standard deviation of the current batch of inputs. That is to
say, for each channel being normalized, the layer returns
gamma * (batch - mean(batch)) / sqrt(var(batch) + epsilon) + beta
, where:
epsilon
is a small constant (configurable as part of the constructor
arguments)
gamma
is a learned scaling factor (initialized as 1), which
can be disabled by passing scale = FALSE
to the constructor.
beta
is a learned offset factor (initialized as 0), which
can be disabled by passing center = FALSE
to the constructor.
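The training-mode formula above can be sketched in base R for a single channel. The helper `batch_norm_train()` is hypothetical and not part of keras3. It assumes that `var(batch)` in the formula means the population variance (dividing by n), which differs from base R's `var()`, which divides by n - 1.

```r
# Hypothetical helper (not keras3 code): training-mode normalization of one channel.
batch_norm_train <- function(batch, gamma = 1, beta = 0, epsilon = 0.001) {
  mu <- mean(batch)
  # Assumption: population variance, not the sample variance returned by var().
  sigma2 <- mean((batch - mu)^2)
  gamma * (batch - mu) / sqrt(sigma2 + epsilon) + beta
}

x <- c(1, 2, 3, 4)
y <- batch_norm_train(x)
mean(y)          # close to 0
sqrt(mean(y^2))  # close to 1, slightly below because of epsilon
```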
During inference (i.e. when using evaluate()
or predict()
or when
calling the layer/model with the argument training = FALSE
(which is the
default)), the layer normalizes its output using a moving average of the
mean and standard deviation of the batches it has seen during training. That
is to say, it returns
gamma * (batch - self$moving_mean) / sqrt(self$moving_var+epsilon) + beta
.
self$moving_mean
and self$moving_var
are non-trainable variables that
are updated each time the layer is called in training mode, as such:
moving_mean = moving_mean * momentum + mean(batch) * (1 - momentum)
moving_var = moving_var * momentum + var(batch) * (1 - momentum)
As such, the layer will only normalize its inputs during inference after having been trained on data whose statistics are similar to those of the inference data.
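The update rule and the inference-mode formula can be illustrated together with a short base R sketch. Both helper names are hypothetical and not part of keras3. The starting values 0 and 1 mirror the "zeros" and "ones" initializers in the usage signature, and the batch variance is assumed to be the population variance.

```r
# Hypothetical helper: one training-step update of the moving statistics.
update_moving_stats <- function(state, batch, momentum = 0.99) {
  batch_mean <- mean(batch)
  batch_var <- mean((batch - batch_mean)^2)  # assumption: population variance
  list(
    moving_mean = state$moving_mean * momentum + batch_mean * (1 - momentum),
    moving_var  = state$moving_var  * momentum + batch_var  * (1 - momentum)
  )
}

# Hypothetical helper: inference-mode normalization using the stored statistics.
batch_norm_infer <- function(batch, state, gamma = 1, beta = 0, epsilon = 0.001) {
  gamma * (batch - state$moving_mean) / sqrt(state$moving_var + epsilon) + beta
}

state <- list(moving_mean = 0, moving_var = 1)
for (step in 1:1000) {
  state <- update_moving_stats(state, c(9, 10, 11))
}
state$moving_mean                  # has moved from 0 towards the batch mean of 10
batch_norm_infer(c(9, 10, 11), state)
```

With `momentum = 0.99` the moving statistics change slowly, which is why many training steps are needed before inference-mode outputs resemble training-mode outputs.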
About setting layer$trainable <- FALSE
on a BatchNormalization
layer:
The meaning of setting layer$trainable <- FALSE
is to freeze the layer,
i.e. its internal state will not change during training:
its trainable weights will not be updated
during fit()
or train_on_batch()
, and its state updates will not be run.
In general, this does not necessarily mean that the layer is run in inference
mode (which is normally controlled by the training
argument that can
be passed when calling a layer). "Frozen state" and "inference mode"
are two separate concepts.
However, in the case of the BatchNormalization
layer, setting
trainable <- FALSE
on the layer means that the layer will be
subsequently run in inference mode (meaning that it will use
the moving mean and the moving variance to normalize the current batch,
rather than using the mean and variance of the current batch).
Note that:
Setting trainable
on a model containing other layers will recursively
set the trainable
value of all inner layers.
If the value of the trainable
attribute is changed after calling
compile()
on a model, the new value doesn't take effect for this model
until compile()
is called again.
layer_batch_normalization( object, axis = -1L, momentum = 0.99, epsilon = 0.001, center = TRUE, scale = TRUE, beta_initializer = "zeros", gamma_initializer = "ones", moving_mean_initializer = "zeros", moving_variance_initializer = "ones", beta_regularizer = NULL, gamma_regularizer = NULL, beta_constraint = NULL, gamma_constraint = NULL, synchronized = FALSE, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
axis |
Integer, the axis that should be normalized
(typically the features axis). For instance, after a |
momentum |
Momentum for the moving average. |
epsilon |
Small float added to variance to avoid dividing by zero. |
center |
If |
scale |
If |
beta_initializer |
Initializer for the beta weight. |
gamma_initializer |
Initializer for the gamma weight. |
moving_mean_initializer |
Initializer for the moving mean. |
moving_variance_initializer |
Initializer for the moving variance. |
beta_regularizer |
Optional regularizer for the beta weight. |
gamma_regularizer |
Optional regularizer for the gamma weight. |
beta_constraint |
Optional constraint for the beta weight. |
gamma_constraint |
Optional constraint for the gamma weight. |
synchronized |
Only applicable with the TensorFlow backend.
If |
... |
Base layer keyword arguments (e.g. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
inputs
: Input tensor (of any rank).
training
: R boolean indicating whether the layer should behave in
training mode or in inference mode.
training = TRUE
: The layer will normalize its inputs using
the mean and variance of the current batch of inputs.
training = FALSE
: The layer will normalize its inputs using
the mean and variance of its moving statistics, learned during
training.
mask
: Binary tensor of shape broadcastable to inputs
tensor, with
TRUE
values indicating the positions for which mean and variance
should be computed. Masked elements of the current inputs are not
taken into account for mean and variance computation during
training. Any prior unmasked element values will be taken into
account until their momentum expires.
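As a rough illustration of the `mask` argument, the base R sketch below computes the batch mean and variance from unmasked positions only. The helper `masked_moments()` is hypothetical and simplifies the behaviour to a flat vector. It is not the keras3 implementation and assumes population variance.

```r
# Hypothetical helper (not keras3 code): moments over positions where mask is TRUE.
masked_moments <- function(x, mask) {
  kept <- x[mask]
  mu <- mean(kept)
  list(mean = mu, var = mean((kept - mu)^2))
}

# The masked value 100 does not influence the statistics.
masked_moments(c(1, 2, 3, 100), mask = c(TRUE, TRUE, TRUE, FALSE))
```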
Other normalization layers: layer_group_normalization()
layer_layer_normalization()
layer_spectral_normalization()
layer_unit_normalization()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_random_brightness()
layer_random_contrast()
layer_random_crop()
layer_random_flip()
layer_random_rotation()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Bidirectional wrapper for RNNs.
layer_bidirectional( object, layer, merge_mode = "concat", weights = NULL, backward_layer = NULL, ... )
object |
Object to compose the layer with. A tensor, array, or sequential model. |
layer |
|
merge_mode |
Mode by which outputs of the forward and backward RNNs
will be combined. One of |
weights |
see description |
backward_layer |
Optional |
... |
For forward/backward compatibility. |
The return value depends on the value provided for the first argument.
If object
is:
a keras_model_sequential()
, then the layer is added to the sequential model
(which is modified in place). To enable piping, the sequential model is also
returned, invisibly.
a keras_input()
, then the output tensor from calling layer(input)
is returned.
NULL
or missing, then a Layer
instance is returned.
The call arguments for this layer are the same as those of the
wrapped RNN layer. Beware that when passing the initial_state
argument during the call of this layer, the first half in the
list of elements in the initial_state
list will be passed to
the forward RNN call and the last half in the list of elements
will be passed to the backward RNN call.
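The splitting of `initial_state` described above can be pictured with a small base R sketch. The helper `split_initial_state()` is hypothetical and only illustrates the first-half/last-half convention on a plain list. The placeholder strings stand in for state tensors.

```r
# Hypothetical helper (not keras3 code): split a state list between the two RNNs.
split_initial_state <- function(initial_state) {
  n <- length(initial_state)
  stopifnot(n %% 2 == 0)
  half <- n / 2
  list(
    forward  = initial_state[seq_len(half)],         # first half: forward RNN
    backward = initial_state[half + seq_len(half)]   # last half: backward RNN
  )
}

# For an LSTM, each direction expects two states (hidden and cell).
states <- list("h_forward", "c_forward", "h_backward", "c_backward")
split_initial_state(states)
```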
Instantiating a Bidirectional
layer from an existing RNN layer
instance will not reuse the weights state of the RNN layer instance – the
Bidirectional
layer will have freshly initialized weights.
model <- keras_model_sequential(input_shape = c(5, 10)) %>%
  layer_bidirectional(layer_lstm(units = 10, return_sequences = TRUE)) %>%
  layer_bidirectional(layer_lstm(units = 10)) %>%
  layer_dense(5, activation = "softmax")
model %>% compile(loss = "categorical_crossentropy", optimizer = "rmsprop")

# With custom backward layer
forward_layer <- layer_lstm(units = 10, return_sequences = TRUE)
backward_layer <- layer_lstm(units = 10, activation = "relu",
                             return_sequences = TRUE, go_backwards = TRUE)
model <- keras_model_sequential(input_shape = c(5, 10)) %>%
  layer_bidirectional(forward_layer, backward_layer = backward_layer) %>%
  layer_dense(5, activation = "softmax")
model %>% compile(loss = "categorical_crossentropy", optimizer = "rmsprop")
A Bidirectional
layer instance has property states
, which you can access
with layer$states
. You can also reset states using reset_state().
Other rnn layers: layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_gru()
layer_lstm()
layer_rnn()
layer_simple_rnn()
layer_time_distributed()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()
Other layers: Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_pooling_1d()
layer_max_pooling_2d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_multi_head_attention()
layer_multiply()