How to Use TensorBoard in TensorFlow
Key Insights
- TensorBoard transforms ML debugging from guesswork into data-driven analysis by visualizing metrics, model architecture, and training dynamics in real-time
- Proper log directory organization is critical—use separate subdirectories for each experiment run to enable meaningful comparisons across hyperparameter configurations
- Beyond basic metrics tracking, TensorBoard’s histogram and embedding visualizations reveal weight distribution problems and high-dimensional data patterns that are impossible to spot in raw logs
Introduction to TensorBoard
TensorBoard is TensorFlow’s built-in visualization toolkit that turns opaque training processes into observable, debuggable workflows. When you’re training neural networks, you’re essentially flying blind without visualization—scrolling through console logs of loss values tells you almost nothing about what’s actually happening inside your model.
The reality is that modern deep learning involves managing dozens of hyperparameters, monitoring multiple metrics across training and validation sets, and understanding how your model’s internal representations evolve. TensorBoard addresses this by providing interactive dashboards for metrics, graphs, distributions, and more. It’s not optional tooling for serious ML work—it’s essential infrastructure.
Setting Up TensorBoard
TensorBoard comes bundled with TensorFlow, so if you have TensorFlow installed, you already have TensorBoard. The core concept is simple: your training code writes summary data to a log directory, and TensorBoard reads from that directory to generate visualizations.
The log directory structure matters. Use a parent directory like logs/ with subdirectories for each training run. This organization enables experiment comparison and prevents data from different runs from mixing.
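A typical layout looks like this (run names illustrative):

```text
logs/
  fit/
    20240101-120000/   # first run
    20240102-093000/   # second run, compared side-by-side in the UI
```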
import tensorflow as tf
from datetime import datetime

# Create a timestamped log directory so each run stays separate
log_dir = "logs/fit/" + datetime.now().strftime("%Y%m%d-%H%M%S")

# Create TensorBoard callback
tensorboard_callback = tf.keras.callbacks.TensorBoard(
    log_dir=log_dir,
    histogram_freq=1,     # Log weight histograms every epoch
    write_graph=True,     # Visualize the model graph
    update_freq='epoch'   # Update metrics after each epoch
)

# Simple model for demonstration
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train with TensorBoard callback
# (x_train, y_train, x_val, y_val assumed loaded, e.g. from MNIST)
model.fit(
    x_train, y_train,
    epochs=10,
    validation_data=(x_val, y_val),
    callbacks=[tensorboard_callback]
)
Launch TensorBoard from your terminal:
tensorboard --logdir logs/fit
Navigate to http://localhost:6006 to access the dashboard. TensorBoard automatically refreshes as new data is written.
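If you work in a Jupyter or Colab notebook, TensorBoard can also be embedded directly in a cell instead of launched from a terminal:

```text
%load_ext tensorboard
%tensorboard --logdir logs/fit
```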
Tracking Metrics and Scalars
The Scalars dashboard is where you’ll spend most of your time. It displays training and validation metrics over time, making it immediately obvious whether your model is learning, overfitting, or stuck.
The TensorBoard callback automatically logs metrics defined in model.compile(), but you can also log custom scalars for anything you want to track—learning rates, gradient norms, custom loss components, or domain-specific metrics.
import tensorflow as tf
from datetime import datetime

# Custom training loop with manual scalar logging
log_dir = "logs/custom/" + datetime.now().strftime("%Y%m%d-%H%M%S")
summary_writer = tf.summary.create_file_writer(log_dir)

model = create_model()  # assumes a model-building helper defined elsewhere
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

# train_dataset is assumed to be a tf.data.Dataset of (features, labels) batches
global_step = 0
for epoch in range(10):
    for x_batch, y_batch in train_dataset:
        with tf.GradientTape() as tape:
            predictions = model(x_batch, training=True)
            loss = loss_fn(y_batch, predictions)
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        # Log custom scalars at the current global step
        with summary_writer.as_default():
            tf.summary.scalar('loss', loss, step=global_step)
            tf.summary.scalar('learning_rate', optimizer.learning_rate, step=global_step)
            # Log gradient norms to detect exploding/vanishing gradients
            gradient_norm = tf.linalg.global_norm(gradients)
            tf.summary.scalar('gradient_norm', gradient_norm, step=global_step)
        global_step += 1
This approach gives you fine-grained control over what gets logged and when. Monitoring gradient norms, for instance, can reveal training instabilities long before they manifest as diverging loss values.
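If the logged gradient norm does spike, a common remedy is global-norm clipping before applying the update. A minimal sketch with toy gradients (not part of the training loop above):

```python
import tensorflow as tf

# Toy gradients: one well-behaved tensor, one exploding tensor
gradients = [tf.constant([0.5, -0.5]), tf.constant([100.0, -200.0])]

# Rescale all gradients together so their combined global norm is at most 1.0
clipped, global_norm = tf.clip_by_global_norm(gradients, clip_norm=1.0)

print(float(global_norm))                     # norm before clipping (large)
print(float(tf.linalg.global_norm(clipped)))  # at most 1.0 after clipping
```

In a real loop you would clip between `tape.gradient(...)` and `optimizer.apply_gradients(...)`, and log both the raw and clipped norms.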
Visualizing Model Architecture
The Graphs tab visualizes your model’s computational graph, showing how data flows through layers and operations. This is invaluable for debugging model architecture issues—mismatched tensor shapes, unintended layer connections, or inefficient operation sequences become immediately visible.
For Keras models, the graph is automatically logged when you set write_graph=True in the TensorBoard callback. For custom models or training loops, you need to trace the graph explicitly:
# Log model graph for a custom model or training loop
@tf.function
def trace_model(x):
    return model(x)

# Create a sample input matching the model's expected shape
sample_input = tf.random.normal([1, 784])

# Trace and export the graph (summary_writer from the previous example)
tf.summary.trace_on(graph=True)
trace_model(sample_input)
with summary_writer.as_default():
    tf.summary.trace_export(
        name="model_trace",
        step=0
    )
Use the graph visualization to compare different architectures. If you’re deciding between ResNet and DenseNet variants, log both graphs to the same parent directory but different subdirectories, then compare them side-by-side in TensorBoard.
Advanced Visualizations
Beyond scalars and graphs, TensorBoard offers specialized visualizations that reveal deeper model behavior.
Histograms show how weight and activation distributions change during training. Weights that don’t change suggest dead neurons; weights that explode indicate instability.
# Log weight histograms and sample predictions
tensorboard_callback = tf.keras.callbacks.TensorBoard(
    log_dir=log_dir,
    histogram_freq=1  # Log histograms every epoch
)

# Custom callback for logging images
class ImageLoggingCallback(tf.keras.callbacks.Callback):
    def __init__(self, log_dir, validation_data):
        super().__init__()
        self.log_dir = log_dir
        self.validation_data = validation_data
        self.file_writer = tf.summary.create_file_writer(log_dir + '/images')

    def on_epoch_end(self, epoch, logs=None):
        images, labels = self.validation_data
        # Predictions are computed so they could be drawn onto the images
        # (e.g. with matplotlib) before logging
        predictions = self.model.predict(images[:25])
        with self.file_writer.as_default():
            # Images must be rank-4: [batch, height, width, channels]
            tf.summary.image(
                "validation_predictions",
                images[:25],
                max_outputs=25,
                step=epoch
            )

# Use both callbacks
model.fit(
    x_train, y_train,
    epochs=10,
    validation_data=(x_val, y_val),
    callbacks=[
        tensorboard_callback,
        ImageLoggingCallback(log_dir, (x_val, y_val))
    ]
)
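Histograms can also be written manually, which helps when you only care about a few specific tensors rather than every layer the callback logs. A sketch (the directory name is an assumption):

```python
import tensorflow as tf

summary_writer = tf.summary.create_file_writer("logs/histograms")

def log_weight_histograms(model, step):
    """Write one histogram per trainable variable at the given step."""
    with summary_writer.as_default():
        for var in model.trainable_variables:
            # Sanitize the variable name (':' is not valid in summary tags)
            tf.summary.histogram(var.name.replace(':', '_'), var, step=step)

model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
model.build(input_shape=(None, 3))  # create the weights before logging them

log_weight_histograms(model, step=0)
summary_writer.flush()  # make sure the event file is written to disk
```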
The Embedding Projector maps high-dimensional representations into 2D/3D space using t-SNE, UMAP, or PCA. This is crucial for understanding how your model represents data internally.
import os
import numpy as np
from tensorboard.plugins import projector

# Create embedding visualization
embedding_log_dir = os.path.join(log_dir, 'embeddings')
os.makedirs(embedding_log_dir, exist_ok=True)

# Get embeddings from the second-to-last layer of the model
embedding_model = tf.keras.Model(
    inputs=model.input,
    outputs=model.layers[-2].output
)
embeddings = embedding_model.predict(x_test[:1000])

# Save embeddings as tab-separated values
np.savetxt(
    os.path.join(embedding_log_dir, 'embeddings.tsv'),
    embeddings,
    delimiter='\t'
)

# Configure the projector to read the TSV files
config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_path = 'embeddings.tsv'
embedding.metadata_path = 'metadata.tsv'  # one label per row, written separately
projector.visualize_embeddings(embedding_log_dir, config)
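The metadata.tsv referenced above is not created automatically: it is a plain text file with one label per line, in the same row order as embeddings.tsv, which the projector uses to color and label points. A sketch assuming integer class labels (the stand-in y_test values are illustrative):

```python
import os

embedding_log_dir = "logs/embeddings"  # same directory as embeddings.tsv
os.makedirs(embedding_log_dir, exist_ok=True)

y_test = [3, 1, 4, 1, 5]  # stand-in labels, one per embedded example

# One label per line, matching the row order of embeddings.tsv
with open(os.path.join(embedding_log_dir, 'metadata.tsv'), 'w') as f:
    for label in y_test:
        f.write(f"{label}\n")
```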
Comparing Experiments and Hyperparameter Tuning
The real power of TensorBoard emerges when comparing multiple experiments. Organize runs by hyperparameter configurations to identify optimal settings.
from tensorboard.plugins.hparams import api as hp

# Define hyperparameters
HP_NUM_UNITS = hp.HParam('num_units', hp.Discrete([64, 128, 256]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.1, 0.5))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd']))
METRIC_ACCURACY = 'accuracy'

# Configure HParams
with tf.summary.create_file_writer('logs/hparam_tuning').as_default():
    hp.hparams_config(
        hparams=[HP_NUM_UNITS, HP_DROPOUT, HP_OPTIMIZER],
        metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')]
    )

# Run one experiment per hyperparameter combination
def train_model(hparams, run_dir):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(hparams[HP_NUM_UNITS], activation='relu'),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(
        optimizer=hparams[HP_OPTIMIZER],
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    tensorboard_callback = tf.keras.callbacks.TensorBoard(run_dir)
    hparams_callback = hp.KerasCallback(run_dir, hparams)  # logs hparams for this run
    model.fit(
        x_train, y_train,
        epochs=5,
        validation_data=(x_val, y_val),
        callbacks=[tensorboard_callback, hparams_callback]
    )

# Grid search over hyperparameters
session_num = 0
for num_units in HP_NUM_UNITS.domain.values:
    # A RealInterval can't be enumerated, so sample a few dropout rates from it
    for dropout_rate in [0.1, 0.3, 0.5]:
        for optimizer in HP_OPTIMIZER.domain.values:
            hparams = {
                HP_NUM_UNITS: num_units,
                HP_DROPOUT: dropout_rate,
                HP_OPTIMIZER: optimizer
            }
            run_name = f"run-{session_num}"
            train_model(hparams, 'logs/hparam_tuning/' + run_name)
            session_num += 1
The HParams dashboard provides parallel coordinates plots and scatter plots to visualize relationships between hyperparameters and performance metrics.
Best Practices and Tips
Organize logs hierarchically. Use logs/experiment_name/run_timestamp to keep experiments separate but comparable. Delete old logs regularly to prevent TensorBoard from slowing down.
Don’t log too frequently. Logging every batch creates massive log files and slows training. Log once per epoch for most metrics, or use update_freq='batch' only during debugging.
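One way to follow this rule without editing code in two places is to gate the logging frequency on a debug flag (a sketch; the flag name is an assumption):

```python
import tensorflow as tf

DEBUG = False  # flip to True only while diagnosing a training problem

tensorboard_callback = tf.keras.callbacks.TensorBoard(
    log_dir="logs/fit",
    # Per-batch updates generate large event files and slow training;
    # per-epoch is enough for routine runs
    update_freq='batch' if DEBUG else 'epoch'
)
```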
Use descriptive run names. Instead of timestamps alone, include hyperparameter info: logs/lr_0.001_batch_32_dropout_0.2/.
Monitor multiple metrics. Track both training and validation metrics for every important quantity. The gap between them reveals overfitting immediately.
Leverage profiling. TensorBoard’s Profile tab identifies performance bottlenecks in your training pipeline. Enable it periodically, not continuously, as profiling adds overhead.
Version your experiments. When you change model architecture or data preprocessing, create a new parent log directory. This maintains a clear experimental history.
TensorBoard transforms ML development from an art into an engineering discipline. Use it religiously, organize your logs carefully, and let the visualizations guide your debugging and optimization decisions.