How to Save and Load Models in TensorFlow

Key Insights

  • TensorFlow’s SavedModel format is the recommended choice for production deployments, offering full model serialization including architecture, weights, and computation graphs, while HDF5 remains useful for simpler Keras-only workflows.
  • Use ModelCheckpoint callbacks during training to automatically save your best models based on validation metrics, preventing loss of progress from crashes and enabling easy recovery of optimal weights.
  • Custom layers, losses, and metrics require explicit registration via custom_objects dictionaries or proper implementation of get_config() methods to ensure models can be successfully loaded after saving.

Introduction to Model Persistence

Saving and loading models is fundamental to any serious machine learning workflow. You don’t want to retrain a model every time you need to make predictions, and you certainly don’t want to lose hours of training progress because your process crashed. Model persistence enables production deployment, transfer learning, and collaborative development.

TensorFlow provides two primary serialization formats: SavedModel and HDF5. SavedModel is TensorFlow’s native format and the recommended choice for most use cases. It saves everything—architecture, weights, training configuration, and even the optimizer state. HDF5 is Keras’s legacy format, simpler but less comprehensive. Understanding both formats and when to use each will save you considerable frustration down the road.

Saving Models During and After Training

The simplest way to save a complete Keras model is using the save() method. This works for both SavedModel and HDF5 formats depending on the file extension you provide:

import tensorflow as tf
from tensorflow import keras

# Build a simple model
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model (assumes x_train and y_train are already defined)
model.fit(x_train, y_train, epochs=5, validation_split=0.2)

# Save the complete model
model.save('my_model')  # SavedModel format
model.save('my_model.h5')  # HDF5 format

For long training runs, you should save checkpoints periodically. The ModelCheckpoint callback automates this process and can save only the best model based on a monitored metric:

from tensorflow.keras.callbacks import ModelCheckpoint

# Save the best model based on validation loss
checkpoint = ModelCheckpoint(
    filepath='best_model.h5',
    monitor='val_loss',
    save_best_only=True,
    mode='min',
    verbose=1
)

# Save checkpoints every epoch
checkpoint_all = ModelCheckpoint(
    filepath='model_epoch_{epoch:02d}.h5',
    save_freq='epoch',
    verbose=1
)

model.fit(x_train, y_train, 
          epochs=50, 
          validation_split=0.2,
          callbacks=[checkpoint, checkpoint_all])

Sometimes you only need to save the weights, not the entire architecture. This is useful when you’re experimenting with different architectures but want to transfer learned weights:

# Save only weights
model.save_weights('my_model_weights.h5')

# Later, rebuild the model architecture and load weights
new_model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])
new_model.load_weights('my_model_weights.h5')

SavedModel Format

SavedModel is TensorFlow’s language-neutral, recoverable serialization format. It’s the recommended format for production because it saves everything needed to restore your model completely, and it’s the format TensorFlow Serving consumes for deployment.

When you save a model using SavedModel format, TensorFlow creates a directory structure:

model.save('my_model')

# This creates:
# my_model/
#   assets/
#   variables/
#     variables.data-00000-of-00001
#     variables.index
#   saved_model.pb

The saved_model.pb file contains the model architecture and training configuration. The variables/ directory holds the model weights. The assets/ directory stores auxiliary files like vocabularies for text models.

Loading a SavedModel is straightforward:

# Load the model
loaded_model = tf.keras.models.load_model('my_model')

# Verify it works
predictions = loaded_model.predict(x_test[:5])
print(predictions)

# Continue training if needed
loaded_model.fit(x_train, y_train, epochs=2)

You can inspect a SavedModel’s signatures and operations using the CLI:

saved_model_cli show --dir my_model --all

This is invaluable for debugging production deployment issues.

HDF5 Format for Keras Models

HDF5 is a hierarchical data format originally developed for scientific computing. Keras adopted it early on, and it remains useful for pure Keras workflows. It’s simpler than SavedModel—just a single file—but less comprehensive.

# Save to HDF5
model.save('my_model.h5')

# Load from HDF5
loaded_model = tf.keras.models.load_model('my_model.h5')

# Verify predictions match
original_predictions = model.predict(x_test[:5])
loaded_predictions = loaded_model.predict(x_test[:5])

import numpy as np
assert np.allclose(original_predictions, loaded_predictions)

The main advantage of HDF5 is portability—it’s a single file you can easily move around. The disadvantage is it’s Keras-specific and doesn’t support some TensorFlow features as completely as SavedModel.

When working with custom objects (layers, losses, metrics), you need to pass them during loading:

# Define a custom loss
def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

model.compile(optimizer='adam', loss=custom_loss)
model.fit(x_train, y_train, epochs=5)
model.save('model_with_custom_loss.h5')

# Load with custom objects
loaded_model = tf.keras.models.load_model(
    'model_with_custom_loss.h5',
    custom_objects={'custom_loss': custom_loss}
)

Saving and Loading Custom Models

Custom layers and models require additional consideration: TensorFlow needs to know how to reconstruct your custom components. The proper approach is implementing get_config() (and from_config() when construction needs special handling—the default from_config() simply calls the constructor with the saved config):

class CustomDense(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomDense, self).__init__(**kwargs)
        self.units = units
    
    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer='random_normal',
            trainable=True
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            trainable=True
        )
    
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
    
    def get_config(self):
        config = super(CustomDense, self).get_config()
        config.update({'units': self.units})
        return config

# Build model with custom layer
model = keras.Sequential([
    CustomDense(64, input_shape=(784,)),
    keras.layers.Activation('relu'),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(x_train, y_train, epochs=5)

# Save and load
model.save('custom_model.h5')
loaded_model = tf.keras.models.load_model(
    'custom_model.h5',
    custom_objects={'CustomDense': CustomDense}
)
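On recent TF 2.x releases, an alternative to passing custom_objects at load time is registering the class for serialization up front. A minimal sketch of the same CustomDense layer with this approach—the 'MyLayers' package name is an arbitrary label, not something the original code uses:

```python
import tensorflow as tf
from tensorflow import keras

# Registering the class makes it discoverable by name during loading,
# so custom_objects is no longer required. 'MyLayers' is arbitrary.
@keras.utils.register_keras_serializable(package='MyLayers')
class CustomDense(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super().get_config()
        config.update({'units': self.units})
        return config
```

With the decorator in place, tf.keras.models.load_model() can resolve the layer by its registered name without any custom_objects dictionary.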

For models with custom training loops using tf.GradientTape, save the model architecture and weights separately, then reconstruct:

# Save
model.save_weights('custom_training_weights.h5')

# Rebuild and load
new_model = build_model()  # Your model building function
new_model.load_weights('custom_training_weights.h5')
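If your custom loop also needs to resume with optimizer state intact (momentum, learning-rate schedules, and so on), tf.train.Checkpoint can capture both the model and the optimizer. A sketch under that assumption—the tiny model here is a stand-in for your own build_model():

```python
import tensorflow as tf

# tf.train.Checkpoint tracks model *and* optimizer state, so a custom
# training loop can resume exactly where it stopped.
model = tf.keras.Sequential([tf.keras.Input(shape=(8,)),
                             tf.keras.layers.Dense(4)])
optimizer = tf.keras.optimizers.Adam()

ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
manager = tf.train.CheckpointManager(ckpt, directory='./tf_ckpts',
                                     max_to_keep=3)

save_path = manager.save()               # call periodically inside the loop
ckpt.restore(manager.latest_checkpoint)  # call once before resuming
```

CheckpointManager keeps only the newest checkpoints (max_to_keep), which prevents long runs from filling the disk.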

Best Practices and Common Pitfalls

Always use versioned names for your saved models. This prevents accidental overwrites and enables easy rollback:

import datetime

timestamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
model_path = f'models/model_{timestamp}'
model.save(model_path)
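Because the "%Y%m%d-%H%M%S" timestamp format sorts lexicographically, picking the most recent model for loading or rollback is a one-liner. A self-contained sketch—the directory names below are illustrative:

```python
import tempfile
from pathlib import Path

# Timestamped names sort lexicographically, so the newest model is
# simply the last entry in sorted order. Names here are illustrative.
base = Path(tempfile.mkdtemp())
for name in ['model_20231231-235959',
             'model_20240101-120000',
             'model_20240315-090000']:
    (base / name).mkdir()

latest = sorted(base.glob('model_*'))[-1]
print(latest.name)  # model_20240315-090000
```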

Verify that loaded models produce identical predictions to the original:

import numpy as np

# Create test data
test_input = x_test[:10]

# Get predictions from original model
original_predictions = model.predict(test_input)

# Save and load
model.save('verification_model')
loaded_model = tf.keras.models.load_model('verification_model')

# Compare predictions
loaded_predictions = loaded_model.predict(test_input)

# Should be nearly identical (allowing for floating point precision)
difference = np.abs(original_predictions - loaded_predictions)
max_difference = np.max(difference)
print(f"Maximum prediction difference: {max_difference}")

assert max_difference < 1e-6, "Predictions don't match!"

The most common error when loading models is ValueError: Unknown layer or ValueError: Unknown loss function. This happens when you forget to register custom objects:

# Wrong - will fail
loaded_model = tf.keras.models.load_model('my_model.h5')

# Correct - register custom objects
loaded_model = tf.keras.models.load_model(
    'my_model.h5',
    custom_objects={
        'CustomLayer': CustomLayer,
        'custom_loss': custom_loss,
        'custom_metric': custom_metric
    }
)

For production deployments, prefer SavedModel format. It’s more robust and works seamlessly with TensorFlow Serving. Use HDF5 for quick experiments, model sharing with collaborators, or when you need a single portable file.
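As a deployment sketch, a SavedModel directory can be served with the official TensorFlow Serving Docker image. The paths and model name here are placeholders; note that Serving expects a numeric version subdirectory, which the bind target supplies:

```shell
# Serve the SavedModel at ./my_model as version 1 of "my_model".
# REST endpoint becomes http://localhost:8501/v1/models/my_model
docker run -p 8501:8501 \
  --mount type=bind,source=$(pwd)/my_model,target=/models/my_model/1 \
  -e MODEL_NAME=my_model -t tensorflow/serving
```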

Conclusion

Model persistence is non-negotiable for production machine learning. Use SavedModel format as your default choice—it’s comprehensive, production-ready, and the format TensorFlow actively develops. Resort to HDF5 when you need simplicity or backward compatibility.

Implement ModelCheckpoint callbacks in all training runs to protect against data loss and automatically capture your best models. For custom components, properly implement serialization methods or maintain a registry of custom objects. Always verify that loaded models produce identical predictions to originals before deploying to production.

The few minutes spent implementing proper model saving will save you hours of retraining and debugging down the line.
