---
title: Save, serialize, and export models
authors: Neel Kovelamudi, Francois Chollet
date-created: 2023/06/14
last-modified: 2023/06/30
description: Complete guide to saving, serializing, and exporting models.
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Save, serialize, and export models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Introduction

A Keras model consists of multiple components:

- The architecture, or configuration, which specifies what layers the model
  contains, and how they're connected.
- A set of weight values (the "state of the model").
- An optimizer (defined by compiling the model).
- A set of losses and metrics (defined by compiling the model).

The Keras API saves all of these pieces together in a unified format,
marked by the `.keras` extension. This is a zip archive consisting of the
following:

- A JSON-based configuration file (`config.json`): Records of model, layer, and
  other trackables' configuration.
- An H5-based state file, such as `model.weights.h5` (for the whole model),
  with directory keys for layers and their weights.
- A metadata file in JSON, storing things such as the current Keras version.

Let's take a look at how this works.

## How to save and load a model

If you only have 10 seconds to read this guide, here's what you need to know.

**Saving a Keras model:**

```r
# Get model (Sequential, Functional Model, or Model subclass)
model <- ...

# The filename needs to end with the .keras extension
model |> save_model('path/to/location.keras')
```

**Loading the model back:**

```r
model <- load_model('path/to/location.keras')
```

Now, let's look at the details.

## Setup

``` r
library(keras3)
```

## Saving

This section is about saving an entire model to a single file.
The file will include:

- The model's architecture/config
- The model's weight values (which were learned during training)
- The model's compilation information (if `compile()` was called)
- The optimizer and its state, if any (this enables you to restart training
  where you left off)

#### APIs

You can save a model with `save_model()`.
You can load it back with `load_model()`.

The only supported format in Keras 3 is the "Keras v3" format,
which uses the `.keras` extension.

**Example:**

``` r
get_model <- function() {
  # Create a simple model.
  inputs <- keras_input(shape(32))
  outputs <- inputs |> layer_dense(1)
  model <- keras_model(inputs, outputs)
  model |> compile(optimizer = optimizer_adam(), loss = "mean_squared_error")
  model
}

model <- get_model()

# Train the model.
test_input <- random_uniform(c(128, 32))
test_target <- random_uniform(c(128, 1))
model |> fit(test_input, test_target)

# Calling `save_model("my_model.keras")` creates a zip archive `my_model.keras`.
model |> save_model("my_model.keras")

# It can be used to reconstruct the model identically.
reconstructed_model <- load_model("my_model.keras")

# Let's check:
stopifnot(all.equal(
  model |> predict(test_input),
  reconstructed_model |> predict(test_input)
))
```

### Custom objects

This section covers the basic workflows for handling custom layers, functions, and
models in Keras saving and reloading.

When saving a model that includes custom objects, such as a subclassed Layer,
you **must** define a `get_config()` method on the object class.
If the arguments passed to the constructor (`initialize()` method) of the custom
object aren't simple base types (ints, strings, etc.), then you **must** also
explicitly deserialize these arguments in the `from_config()` class method.

Like this:

``` r
layer_custom <- Layer(
  "CustomLayer",
  initialize = function(sublayer, ...) {
    super$initialize(...)
    self$sublayer <- sublayer
  },
  call = function(x) {
    self$sublayer(x)
  },
  get_config = function() {
    base_config <- super$get_config()
    config <- list(
      sublayer = serialize_keras_object(self$sublayer)
    )
    c(base_config, config)
  },
  from_config = function(cls, config) {
    # Rebuild the sublayer from its serialized config, then drop that entry
    # so `sublayer` is not passed to the constructor twice.
    sublayer <- deserialize_keras_object(config$sublayer)
    config$sublayer <- NULL
    cls(sublayer, !!!config)
  }
)
```

Please see the [Defining the config methods section](#config_methods) for more
details and examples.

The saved `.keras` file is lightweight and does not store the code for custom
objects. Therefore, to reload the model, `load_model()` requires access to the
definition of any custom objects used through one of the following methods:

1. Registering custom objects **(preferred)**,
2. Passing custom objects directly when loading, or
3. Using a custom object scope

Below are examples of each workflow:

#### Registering custom objects (**preferred**)

This is the preferred method, as custom object registration greatly simplifies saving and
loading code. Calling `register_keras_serializable()` on a custom object registers the
object globally in a master list, allowing Keras to recognize the object when loading the
model.

Let's create a custom model involving both a custom layer and a custom activation
function to demonstrate this.

**Example:**

``` r
# Clear all previously registered custom objects
set_custom_objects(clear = TRUE)
```

```
## named list()
```

``` r
layer_custom <- Layer(
  "CustomLayer",
  initialize = function(self, factor) {
    super$initialize()
    self$factor <- factor
  },

  call = function(self, x) {
    x * self$factor
  },

  get_config = function(self) {
    list(factor = self$factor)
  }
)

# Upon registration, you can optionally specify a package or a name.
# If left blank, the package defaults to "Custom" and the name defaults to
# the class name.
register_keras_serializable(layer_custom, package = "MyLayers")

custom_fn <- keras3:::py_func2(function(x) x^2, name = "custom_fn", convert = TRUE)

register_keras_serializable(custom_fn, name = "custom_fn", package = "my_package")

# Create the model.
get_model <- function() {
  inputs <- keras_input(shape(4))
  mid <- inputs |> layer_custom(0.5)
  outputs <- mid |> layer_dense(1, activation = custom_fn)
  model <- keras_model(inputs, outputs)
  model |> compile(optimizer = "rmsprop", loss = "mean_squared_error")
  model
}

# Train the model.
train_model <- function(model) {
  input <- random_uniform(c(4, 4))
  target <- random_uniform(c(4, 1))
  model |> fit(input, target, verbose = FALSE, epochs = 1)
  model
}

test_input <- random_uniform(c(4, 4))
test_target <- random_uniform(c(4, 1))

model <- get_model() |> train_model()
model |> save_model("custom_model.keras", overwrite = TRUE)

# Now, we can simply load without worrying about our custom objects.
reconstructed_model <- load_model("custom_model.keras")

# Let's check:
stopifnot(all.equal(
  model |> predict(test_input, verbose = FALSE),
  reconstructed_model |> predict(test_input, verbose = FALSE)
))
```

#### Passing custom objects to `load_model()`

``` r
model <- get_model() |> train_model()

# Calling `save_model("custom_model.keras")` creates a zip archive
# `custom_model.keras`.
model |> save_model("custom_model.keras", overwrite = TRUE)

# Upon loading, pass a named list containing the custom objects used in the
# `custom_objects` argument of `load_model()`.
reconstructed_model <- load_model(
  "custom_model.keras",
  custom_objects = list(CustomLayer = layer_custom,
                        custom_fn = custom_fn)
)

# Let's check:
stopifnot(all.equal(
  model |> predict(test_input, verbose = FALSE),
  reconstructed_model |> predict(test_input, verbose = FALSE)
))
```

#### Using a custom object scope

Any code within the custom object scope will be able to recognize the custom objects
passed to the scope argument.
Therefore, loading the model within the scope will allow the
loading of our custom objects.

**Example:**

``` r
model <- get_model() |> train_model()
model |> save_model("custom_model.keras", overwrite = TRUE)

# Pass the named list of custom objects to a custom object scope and place
# the `load_model()` call within the scope.
custom_objects <- list(CustomLayer = layer_custom, custom_fn = custom_fn)

with_custom_object_scope(custom_objects, {
  reconstructed_model <- load_model("custom_model.keras")
})

# Let's check:
stopifnot(all.equal(
  model |> predict(test_input, verbose = FALSE),
  reconstructed_model |> predict(test_input, verbose = FALSE)
))
```

### Model serialization

This section is about saving only the model's configuration, without its state.
The model's configuration (or architecture) specifies what layers the model
contains, and how these layers are connected. If you have the configuration of a model,
then the model can be created with a freshly initialized state (no weights or compilation
information).

#### APIs

The following serialization APIs are available:

- `clone_model(model)`: make a (randomly initialized) copy of a model.
- `get_config()` and `from_config()`: retrieve the configuration of a layer or model, and recreate
  a model instance from its config, respectively.
- `model_to_json()` and `model_from_json()`: similar, but as JSON strings.
- `serialize_keras_object()`: retrieve the configuration of any arbitrary Keras object.
- `deserialize_keras_object()`: recreate an object instance from its configuration.

#### In-memory model cloning

You can do in-memory cloning of a model via `clone_model()`.
This is equivalent to getting the config then recreating the model from its config
(so it does not preserve compilation information or layer weight values).

**Example:**

``` r
new_model <- clone_model(model)
```

#### `get_config()` and `from_config()`

Calling `get_config(model)` or `get_config(layer)` will return a named list containing
the configuration of the model or layer, respectively. You should define `get_config()`
to contain the arguments needed for the `initialize()` method of the model or layer. At loading time,
the `from_config(config)` method will then call `initialize()` with these arguments to
reconstruct the model or layer.

**Layer example:**

``` r
layer <- layer_dense(, 3, activation = "relu")
layer_config <- get_config(layer)
str(layer_config)
```

```
## List of 12
##  $ name              : chr "dense_4"
##  $ trainable         : logi TRUE
##  $ dtype             :List of 4
##   ..$ module         : chr "keras"
##   ..$ class_name     : chr "DTypePolicy"
##   ..$ config         :List of 1
##   .. ..$ name: chr "float32"
##   ..$ registered_name: NULL
##  $ units             : int 3
##  $ activation        : chr "relu"
##  $ use_bias          : logi TRUE
##  $ kernel_initializer:List of 4
##   ..$ module         : chr "keras.initializers"
##   ..$ class_name     : chr "GlorotUniform"
##   ..$ config         :List of 1
##   .. ..$ seed: NULL
##   ..$ registered_name: NULL
##  $ bias_initializer  :List of 4
##   ..$ module         : chr "keras.initializers"
##   ..$ class_name     : chr "Zeros"
##   ..$ config         : Named list()
##   ..$ registered_name: NULL
##  $ kernel_regularizer: NULL
##  $ bias_regularizer  : NULL
##  $ kernel_constraint : NULL
##  $ bias_constraint   : NULL
##  - attr(*, "__class__")=
```

Now let's reconstruct the layer using the `from_config()` method:

``` r
new_layer <- from_config(layer_config)
```

**Sequential model example:**

``` r
model <- keras_model_sequential(input_shape = c(32)) |>
  layer_dense(1)
config <- get_config(model)
new_model <- from_config(config)
```

**Functional model example:**

``` r
inputs <- keras_input(c(32))
outputs <- inputs |> layer_dense(1)
model <- keras_model(inputs, outputs)
config <- get_config(model)
new_model <- from_config(config)
```

#### `save_model_config()` and `load_model_config()`

This is similar to `get_config` / `from_config`, except it turns the model
into a JSON file, which can then be loaded without the original model class.
It is also specific to models; it isn't meant for layers.

**Example:**

``` r
model <- keras_model_sequential(input_shape = c(32)) |>
  layer_dense(1)
save_model_config(model, "model_config.json")
new_model <- load_model_config("model_config.json")
```

``` r
unlink("model_config.json")
```

#### Arbitrary object serialization and deserialization

The `serialize_keras_object()` and `deserialize_keras_object()` APIs are general-purpose
APIs that can be used to serialize or deserialize any Keras object and any custom object.
They are at the foundation of saving model architecture and are behind all
`serialize()`/`deserialize()` calls in Keras.

**Example**:

``` r
my_reg <- regularizer_l1(0.005)
config <- serialize_keras_object(my_reg)
str(config)
```

```
## List of 4
##  $ module         : chr "keras.regularizers"
##  $ class_name     : chr "L1"
##  $ config         :List of 1
##   ..$ l1: num 0.005
##  $ registered_name: NULL
```

Note the serialization format containing all the necessary information for proper
reconstruction:

- `module` containing the name of the Keras module or other identifying module the object
  comes from.
- `class_name` containing the name of the object's class.
- `config` with all the information needed to reconstruct the object.
- `registered_name` for custom objects. See [here](#custom_object_serialization).

Now we can reconstruct the regularizer.

``` r
new_reg <- deserialize_keras_object(config)
new_reg
```

```
##
##  signature: (x)
```

### Model weights saving

You can choose to only save & load a model's weights. This can be useful if:

- You only need the model for inference: in this case you won't need to
  restart training, so you don't need the compilation information or optimizer state.
- You are doing transfer learning: in this case you will be training a new model
  reusing the state of a prior model, so you don't need the compilation
  information of the prior model.

#### APIs for in-memory weight transfer

Weights can be copied between different objects by using `get_weights()`
and `set_weights()`:

* `get_weights()`: Returns a list of arrays of weight values.
* `set_weights(weights)`: Sets the model/layer weights to the values provided
  (as arrays).

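
As a minimal sketch of what these two functions exchange (assuming keras3 and a working backend are installed; the layer sizes are arbitrary values chosen for illustration), the list returned by `get_weights()` for a model with a single dense layer holds the kernel followed by the bias. Because the entries are plain R arrays, they can be modified and written back with `set_weights()`:

``` r
library(keras3)

# A tiny model: 4 input features -> 2 units (sizes chosen only for illustration).
model <- keras_model_sequential(input_shape = 4) |>
  layer_dense(2)

weights <- get_weights(model)

# One array per weight variable: the kernel, then the bias.
length(weights)
lapply(weights, dim)

# The arrays are ordinary R arrays, so they can be edited and written back.
zeroed <- lapply(weights, function(w) w * 0)
set_weights(model, zeroed)
```

The order of the list matters: `set_weights()` expects arrays in the same order, and with the same shapes, as `get_weights()` returns them.
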

Examples:

***Transferring weights from one layer to another, in memory***

``` r
create_layer <- function() {
  layer <- layer_dense(, 64, activation = "relu", name = "dense_2")
  layer$build(shape(NA, 784))
  layer
}

layer_1 <- create_layer()
layer_2 <- create_layer()

# Copy weights from layer 1 to layer 2
layer_2 |> set_weights(get_weights(layer_1))
```

***Transferring weights from one model to another model with a compatible architecture, in memory***

``` r
# Create a simple functional model
inputs <- keras_input(shape = c(784), name = "digits")
outputs <- inputs |>
  layer_dense(64, activation = "relu", name = "dense_1") |>
  layer_dense(64, activation = "relu", name = "dense_2") |>
  layer_dense(10, name = "predictions")
functional_model <- keras_model(inputs = inputs, outputs = outputs,
                                name = "3_layer_mlp")

# Define a subclassed model with the same architecture
SubclassedModel <- new_model_class(
  "SubclassedModel",
  initialize = function(output_dim, name = NULL) {
    super$initialize(name = name)
    self$output_dim <- output_dim |> as.integer()
    self$dense_1 <- layer_dense(, 64, activation = "relu",
                                name = "dense_1")
    self$dense_2 <- layer_dense(, 64, activation = "relu",
                                name = "dense_2")
    self$dense_3 <- layer_dense(, self$output_dim,
                                name = "predictions")
  },

  call = function(inputs) {
    inputs |>
      self$dense_1() |>
      self$dense_2() |>
      self$dense_3()
  },

  get_config = function(self) {
    list(output_dim = self$output_dim,
         name = self$name)
  }
)

subclassed_model <- SubclassedModel(10)

# Call the subclassed model once to create the weights.
subclassed_model(op_ones(c(1, 784))) |> invisible()

# Copy weights from functional_model to subclassed_model.
set_weights(subclassed_model, get_weights(functional_model))

stopifnot(all.equal(
  get_weights(functional_model),
  get_weights(subclassed_model)
))
```

***The case of stateless layers***

Because stateless layers do not change the order or number of weights,
models can have compatible architectures even if there are extra/missing
stateless layers.

``` r
input <- keras_input(shape = c(784), name = "digits")
output <- input |>
  layer_dense(64, activation = "relu", name = "dense_1") |>
  layer_dense(64, activation = "relu", name = "dense_2") |>
  layer_dense(10, name = "predictions")
# Build the model from the tensors defined just above.
functional_model <- keras_model(input, output,
                                name = "3_layer_mlp")

input <- keras_input(shape = c(784), name = "digits")
output <- input |>
  layer_dense(64, activation = "relu", name = "dense_1") |>
  layer_dense(64, activation = "relu", name = "dense_2") |>
  # Add a dropout layer, which does not contain any weights.
  layer_dropout(0.5) |>
  layer_dense(10, name = "predictions")

functional_model_with_dropout <-
  keras_model(input, output, name = "3_layer_mlp")

set_weights(functional_model_with_dropout,
            get_weights(functional_model))
```

#### APIs for saving weights to disk & loading them back

Weights can be saved to disk by calling `save_model_weights(filepath)` and
loaded back with `load_model_weights(filepath)`.
The filename should end in `.weights.h5`.

**Example:**

``` r
sequential_model <- keras_model_sequential(input_shape = c(784),
                                           input_name = "digits") |>
  layer_dense(64, activation = "relu", name = "dense_1") |>
  layer_dense(64, activation = "relu", name = "dense_2") |>
  layer_dense(10, name = "predictions")
sequential_model |> save_model_weights("my_model.weights.h5")
sequential_model |> load_model_weights("my_model.weights.h5")
```

Note that using `freeze_weights()` may change the ordering of the output of
`get_weights(layer)` when the model contains nested layers.

##### **Transfer learning example**

When loading pretrained weights from a weights file, it is recommended to load
the weights into the original checkpointed model, and then extract
the desired weights/layers into a new model.

**Example:**

``` r
create_functional_model <- function() {
  inputs <- keras_input(shape = c(784), name = "digits")
  outputs <- inputs |>
    layer_dense(64, activation = "relu", name = "dense_1") |>
    layer_dense(64, activation = "relu", name = "dense_2") |>
    layer_dense(10, name = "predictions")
  keras_model(inputs, outputs, name = "3_layer_mlp")
}

functional_model <- create_functional_model()
functional_model |> save_model_weights("pretrained.weights.h5")

# In a separate program:
pretrained_model <- create_functional_model()
pretrained_model |> load_model_weights("pretrained.weights.h5")

# Create a new model by extracting layers from the original model:
extracted_layers <- pretrained_model$layers |> head(-1)
model <- keras_model_sequential(layers = extracted_layers) |>
  layer_dense(5, name = "dense_3")
summary(model)
```

```
## Model: "sequential_4"
## ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
## ┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
## ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
## │ dense_1 (Dense)                 │ (None, 64)             │        50,240 │
## ├─────────────────────────────────┼────────────────────────┼───────────────┤
## │ dense_2 (Dense)                 │ (None, 64)             │         4,160 │
## ├─────────────────────────────────┼────────────────────────┼───────────────┤
## │ dense_3 (Dense)                 │ (None, 5)              │           325 │
## └─────────────────────────────────┴────────────────────────┴───────────────┘
##  Total params: 54,725 (213.77 KB)
##  Trainable params: 54,725 (213.77 KB)
##  Non-trainable params: 0 (0.00 B)
```

### Appendix: Handling custom objects

#### Defining the config methods

Specifications:

* `get_config()` should return a JSON-serializable named list in order to be
  compatible with the Keras architecture and model-saving APIs.
* `from_config(config)` (a class method) should return a new layer or model
  object that is created from the config.
  The default implementation returns `do.call(cls, config)`.

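
To make this contract concrete, here is a minimal base-R sketch (no Keras involved) of the round trip the two methods implement. The `make_scaler()` constructor and its `factor` and `name` arguments are hypothetical stand-ins for a class and its `initialize()` arguments; the point is only that a config made of plain values can be passed straight back to the constructor with `do.call()`.

``` r
# Hypothetical stand-in for a class constructor: it stores its arguments
# and exposes a get_config() that returns them as a named list.
make_scaler <- function(factor, name = "scaler") {
  list(
    factor = factor,
    name = name,
    get_config = function() list(factor = factor, name = name)
  )
}

original <- make_scaler(factor = 0.5, name = "half")

# The config holds only plain values (a number and a string).
config <- original$get_config()
str(config)

# Default-style from_config(): call the constructor with the config entries.
restored <- do.call(make_scaler, config)

# The restored object reports the same config as the original.
stopifnot(identical(restored$get_config(), config))
```

This only works because every config entry is something the constructor can consume directly; once an entry is itself a serialized object, it has to be deserialized before the constructor is called.
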

**NOTE**: If all your constructor arguments are already serializable, e.g. strings and
ints, or non-custom Keras objects, overriding `from_config()` is not necessary. However,
for more complex objects such as layers or models passed to `initialize()`, deserialization
must be handled explicitly, either in `initialize()` itself or by overriding the
`from_config()` method.

**Example:**

``` r
layer_my_dense <- register_keras_serializable(
  package = "MyLayers", name = "KernelMult",
  object = Layer(
    "MyDense",
    initialize = function(units,
                          ...,
                          kernel_regularizer = NULL,
                          kernel_initializer = NULL,
                          nested_model = NULL) {
      super$initialize(...)
      self$hidden_units <- units
      self$kernel_regularizer <- kernel_regularizer
      self$kernel_initializer <- kernel_initializer
      self$nested_model <- nested_model
    },
    get_config = function() {
      config <- super$get_config()
      # Update the config with the custom layer's parameters
      config <- modifyList(
        config,
        list(
          units = self$hidden_units,
          kernel_regularizer = self$kernel_regularizer,
          kernel_initializer = self$kernel_initializer,
          nested_model = self$nested_model
        )
      )
      config
    },
    build = function(input_shape) {
      input_units <- tail(input_shape, 1)
      self$kernel <- self$add_weight(
        name = "kernel",
        shape = shape(input_units, self$hidden_units),
        regularizer = self$kernel_regularizer,
        initializer = self$kernel_initializer
      )
    },
    call = function(inputs) {
      op_matmul(inputs, self$kernel)
    }
  )
)

layer <- layer_my_dense(units = 16,
                        kernel_regularizer = "l1",
                        kernel_initializer = "ones")
layer3 <- layer_my_dense(units = 64, nested_model = layer)

config <- serialize_keras_object(layer3)
str(config)
```

```
## List of 4
##  $ module         : chr ""
##  $ class_name     : chr "MyDense"
##  $ config         :List of 5
##   ..$ name        : chr "my_dense_1"
##   ..$ trainable   : logi TRUE
##   ..$ dtype       :List of 4
##   .. ..$ module         : chr "keras"
##   .. ..$ class_name     : chr "DTypePolicy"
##   .. ..$ config         :List of 1
##   .. .. ..$ name: chr "float32"
##   .. ..$ registered_name: NULL
##   ..$ units       : num 64
##   ..$ nested_model:List of 4
##   .. ..$ module         : chr ""
##   .. ..$ class_name     : chr "MyDense"
##   .. ..$ config         :List of 6
##   .. .. ..$ name              : chr "my_dense"
##   .. .. ..$ trainable         : logi TRUE
##   .. .. ..$ dtype             :List of 4
##   .. .. .. ..$ module         : chr "keras"
##   .. .. .. ..$ class_name     : chr "DTypePolicy"
##   .. .. .. ..$ config         :List of 1
##   .. .. .. .. ..$ name: chr "float32"
##   .. .. .. ..$ registered_name: NULL
##   .. .. ..$ units             : num 16
##   .. .. ..$ kernel_regularizer: chr "l1"
##   .. .. ..$ kernel_initializer: chr "ones"
##   .. ..$ registered_name: chr "MyLayers>KernelMult"
##  $ registered_name: chr "MyLayers>KernelMult"
```

``` r
new_layer <- deserialize_keras_object(config)
new_layer
```

```
##
##  signature: (*args, **kwargs)
```

Note that overriding `from_config` is unnecessary above for `MyDense` because
`hidden_units`, `kernel_initializer`, and `kernel_regularizer` are ints, strings, and a
built-in Keras object, respectively. This means that the default `from_config`
implementation of `cls(!!!config)` will work as intended.

For more complex objects, such as layers and models passed to `initialize()`,
you must explicitly deserialize these objects. Let's take a look at an example
of a model where a `from_config` override is necessary.

**Example:**

``` r
`%||%` <- \(x, y) if(is.null(x)) y else x
layer_custom_model <- register_keras_serializable(
  package = "ComplexModels",
  object = Layer(
    "CustomModel",
    initialize = function(first_layer, second_layer = NULL, ...) {
      super$initialize(...)
      self$first_layer <- first_layer
      self$second_layer <- second_layer %||% layer_dense(, 8)
    },

    get_config = function() {
      config <- super$get_config()
      config <- modifyList(config, list(
        first_layer = self$first_layer,
        second_layer = self$second_layer
      ))
      config
    },

    from_config = function(config) {
      config$first_layer %<>% deserialize_keras_object()
      config$second_layer %<>% deserialize_keras_object()
      # note that the class is available in methods under the classname symbol,
      # (`CustomModel` for this class), and also under the symbol `__class__`
      cls(!!!config)
      # CustomModel(!!!config)
    },
    call = function(self, inputs) {
      inputs |>
        self$first_layer() |>
        self$second_layer()
    }
  )
)

# Let's make our first layer the custom layer from the previous example (MyDense)
inputs <- keras_input(c(32))
outputs <- inputs |> layer_custom_model(first_layer = layer)
model <- keras_model(inputs, outputs)

config <- get_config(model)
new_model <- from_config(config)
```

#### How custom objects are serialized

The serialization format has a special key for custom objects registered via
`register_keras_serializable()`. This `registered_name` key allows for easy
retrieval at loading/deserialization time while also allowing users to add custom naming.

Let's take a look at the config from serializing the custom layer `MyDense` we defined
above.

**Example**:

``` r
layer <- layer_my_dense(
  units = 16,
  kernel_regularizer = regularizer_l1_l2(l1 = 1e-5, l2 = 1e-4),
  kernel_initializer = "ones"
)
config <- serialize_keras_object(layer)
str(config)
```

```
## List of 4
##  $ module         : chr ""
##  $ class_name     : chr "MyDense"
##  $ config         :List of 6
##   ..$ name              : chr "my_dense_2"
##   ..$ trainable         : logi TRUE
##   ..$ dtype             :List of 4
##   .. ..$ module         : chr "keras"
##   .. ..$ class_name     : chr "DTypePolicy"
##   .. ..$ config         :List of 1
##   .. .. ..$ name: chr "float32"
##   .. ..$ registered_name: NULL
##   ..$ units             : num 16
##   ..$ kernel_regularizer:List of 4
##   .. ..$ module         : chr "keras.regularizers"
##   .. ..$ class_name     : chr "L1L2"
##   .. ..$ config         :List of 2
##   .. .. ..$ l1: num 1e-05
##   .. .. ..$ l2: num 1e-04
##   .. ..$ registered_name: NULL
##   ..$ kernel_initializer: chr "ones"
##  $ registered_name: chr "MyLayers>KernelMult"
```

As shown, the `registered_name` key contains the lookup information for the Keras master
list, including the package `MyLayers` and the custom name `KernelMult` that we gave when
calling `register_keras_serializable()`. Take a look again at the custom
class definition/registration [here](#registration_example).

Note that the `class_name` key contains the original name of the class, allowing for
proper re-initialization in `from_config`.

Additionally, note that the `module` key is `NULL` since this is a custom object.
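
As a small illustration of the naming scheme seen in the output above, the `registered_name` string joins the package and the custom name with `>`. The base-R sketch below splits such a string back into its two parts. Note that `split_registered_name()` is a hypothetical helper written for this guide, not a Keras function; Keras performs its own lookup internally.

``` r
# Hypothetical helper (not part of keras3): split "package>name" into parts.
split_registered_name <- function(registered_name) {
  parts <- strsplit(registered_name, ">", fixed = TRUE)[[1]]
  list(package = parts[1], name = parts[2])
}

key <- split_registered_name("MyLayers>KernelMult")
key$package
key$name
```
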