The Neptune-PyTorch integration simplifies tracking PyTorch experiments with Neptune by automatically logging model internals, including activations, gradients, and parameters.
```bash
pip install -U neptune-pytorch
```

- Neptune 3.x: Requires a Neptune 3.x account. See the Getting Started Guide for setup instructions.
- Python 3.10+: Minimum Python version requirement
- PyTorch 1.11+: For tensor operations and model support
- NumPy 1.20+: For numerical computations
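To confirm your environment meets these requirements, you can print the installed versions (a quick sanity check):

```python
import sys
import numpy
import torch

# Verify the environment against the minimum requirements
print("Python:", sys.version.split()[0])  # needs 3.10+
print("PyTorch:", torch.__version__)      # needs 1.11+
print("NumPy:", numpy.__version__)        # needs 1.20+
```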
The quickstart example below logs the following data to Neptune:
- Model architecture: Visual diagram and summary of the neural network
- Training metrics: Loss curves and epoch progress
- Layer activations: Mean, norm, and histogram statistics for each tracked layer
- Gradient analysis: Gradient statistics to detect vanishing/exploding gradients
- Parameter tracking: Weight and bias distributions over time
```python
import torch
import torch.nn as nn
import torch.optim as optim
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

# Initialize Neptune run
run = Run(project="your-project/experiment-tracking")

# Create your PyTorch model
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Initialize Neptune logger with model tracking
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
    base_namespace="mnist_classification",  # Organizes all metrics under this folder
    log_model_diagram=True,  # Generates model architecture diagram
)

# Training setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
num_epochs = 5  # Adjust to your needs
# train_loader: your DataLoader yielding (data, target) batches

# Training loop with comprehensive tracking
for epoch in range(num_epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        # Global step so values don't collide across epochs
        global_step = epoch * len(train_loader) + batch_idx

        # Forward pass
        output = model(data)
        loss = criterion(output, target)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Log training metrics to Neptune
        run.log_metrics(
            data={
                f"{neptune_logger.base_namespace}/batch/loss": loss.item(),
                f"{neptune_logger.base_namespace}/epoch": epoch,
            },
            step=global_step,
        )

        # Track model internals every 10 steps
        if batch_idx % 10 == 0:
            neptune_logger.log_model_internals(
                step=global_step,
                prefix="train",
                track_activations=True,  # Monitor activation patterns
                track_gradients=True,    # Track gradient flow
                track_parameters=True,   # Log parameter statistics
            )
```

The example below demonstrates the following additional features:
- Layer filtering: Only track Conv2d and Linear layers (reduces overhead)
- Custom statistics: Use a reduced set (mean, norm, hist) instead of all 8 statistics
- Phase-specific tracking: Different tracking strategies for train/validation
- Frequency control: Track every 20 steps in training, every 50 in validation
```python
import torch
import torch.nn as nn
import torch.optim as optim
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

# Initialize Neptune run
run = Run(project="your-project/advanced-tracking")

# Create a more complex model (e.g., CNN for image classification)
class CNNModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNNModel()

# Advanced Neptune logger configuration
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
    base_namespace="cnn_experiment",  # Custom organization folder
    track_layers=[nn.Conv2d, nn.Linear],  # Only track conv and linear layers
    tensor_stats=["mean", "norm", "hist"],  # Limit statistics to reduce overhead
    log_model_diagram=True,  # Log model summary and diagram
)

# Training setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
num_epochs = 5  # Adjust to your needs
# train_loader / val_loader: your DataLoaders yielding (data, target) batches

# Training with phase-specific tracking
for epoch in range(num_epochs):
    # Training phase - comprehensive tracking
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # ... your training code ...
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Track everything during training
        if batch_idx % 20 == 0:  # Every 20 steps
            neptune_logger.log_model_internals(
                step=epoch * len(train_loader) + batch_idx,
                prefix="train",
                track_activations=True,  # Monitor activation patterns
                track_gradients=True,    # Track gradient flow
                track_parameters=True,   # Log parameter statistics
            )

    # Validation phase - lightweight tracking
    model.eval()
    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(val_loader):
            # ... your validation code ...
            output = model(data)
            val_loss = criterion(output, target)

            # Only track activations during validation (faster)
            if batch_idx % 50 == 0:  # Every 50 steps
                neptune_logger.log_model_internals(
                    step=epoch * len(val_loader) + batch_idx,
                    prefix="validation",
                    track_activations=True,  # Monitor activation patterns
                    track_gradients=False,   # Skip gradients (no backward pass)
                    track_parameters=False,  # Skip parameters (expensive)
                )
```

- Layer activations: Track activation patterns across all layers with 8 different statistics
- Gradient analysis: Monitor gradient flow and detect vanishing/exploding gradients
- Parameter tracking: Log parameter statistics and distributions for model analysis
- Custom statistics: Choose from mean, std, norm, min, max, var, abs_mean, and hist
- Layer filtering: Track only specific layer types (Conv2d, Linear, etc.)
- Phase organization: Separate tracking for training/validation phases with custom prefixes
- Custom namespaces: Organize experiments with custom folder structures
- Model architecture: Automatic model diagram generation with torchviz
- Distribution histograms: 50-bin histograms for all tracked tensors
- Real-time monitoring: Live tracking during training with Neptune
- Comparative analysis: Easy comparison across experiments and runs
- Minimal setup: Simple integration with existing code
- PyTorch native: Works with existing PyTorch workflows
Since parameter logging can be expensive for large models, you can control the frequency explicitly:
```python
# num_steps: your total number of training steps
for step in range(num_steps):
    # ... training code ...

    # Log lightweight metrics every step
    neptune_logger.log_model_internals(
        step=step,
        track_activations=True,
        track_gradients=True,
        track_parameters=False,  # Skip expensive parameter logging
    )

    # Log expensive parameters less frequently
    if step % 100 == 0:
        neptune_logger.log_model_internals(
            step=step,
            track_activations=False,
            track_gradients=False,
            track_parameters=True,
        )
```

The integration organizes all logged data under a clear, hierarchical, and customizable namespace structure:
```
{base_namespace}/                        # Optional custom top-level folder
├── batch/
│   └── loss                             # Training loss per batch (logged by the user)
└── model/
    ├── summary                          # Model architecture (if log_model_diagram=True)
    └── internals/                       # Model internals tracking
        └── {prefix}/                    # Optional prefix (e.g., "train", "validation")
            ├── activations/             # Layer activations
            │   └── {layer_name}/
            │       ├── mean             # Mean activation value
            │       ├── std              # Standard deviation
            │       ├── norm             # L2 norm
            │       ├── min              # Minimum value
            │       ├── max              # Maximum value
            │       ├── var              # Variance
            │       ├── abs_mean         # Mean of absolute values
            │       └── hist             # Histogram (50 bins)
            ├── gradients/               # Layer gradients
            │   └── {layer_name}/
            │       └── {statistic}      # Same statistics as activations
            └── parameters/              # Model parameters
                └── {layer_name}/
                    └── {statistic}      # Same statistics as activations
```
Example namespaces:

With `base_namespace="my_experiment"`:

- `my_experiment/batch/loss` - Training loss
- `my_experiment/model/summary` - Model architecture
- `my_experiment/model/internals/activations/conv/1/mean` - Mean activation (no prefix)
- `my_experiment/model/internals/train/activations/conv/1/mean` - Mean activation (with "train" prefix)
- `my_experiment/model/internals/validation/gradients/linear1/norm` - L2 norm of gradients (with "validation" prefix)

With `base_namespace=None`:

- `batch/loss` - Training loss
- `model/summary` - Model architecture
- `model/internals/activations/conv/1/mean` - Mean activation (no prefix)
- `model/internals/train/activations/conv/1/mean` - Mean activation (with "train" prefix)
- `model/internals/validation/gradients/linear1/norm` - L2 norm of gradients (with "validation" prefix)
Layer name handling:

- Dots in layer names are automatically replaced with forward slashes for proper namespace organization
- Example: `seq_model.0.weight` becomes `seq_model/0/weight` in the namespace
- Example: `module.submodule.layer` becomes `module/submodule/layer` in the namespace
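A minimal sketch of this mapping (illustrative only; the integration performs the substitution internally):

```python
def to_namespace_path(layer_name: str) -> str:
    """Map a PyTorch parameter name to a Neptune namespace path."""
    return layer_name.replace(".", "/")

assert to_namespace_path("seq_model.0.weight") == "seq_model/0/weight"
assert to_namespace_path("module.submodule.layer") == "module/submodule/layer"
```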
Available statistics: `mean`, `std`, `norm`, `min`, `max`, `var`, `abs_mean`, `hist`
```python
NeptuneLogger(
    run: Run,
    model: torch.nn.Module,
    base_namespace: Optional[str] = None,
    track_layers: Optional[List[Type[nn.Module]]] = None,
    tensor_stats: Optional[List[TensorStatType]] = None,
    log_model_diagram: bool = False,
)
```

Parameters:

- `run`: Neptune run object for logging
- `model`: PyTorch model to track
- `base_namespace`: Optional top-level folder for organization (default: `None`)
- `track_layers`: List of layer types to track (default: `None` = all layers)
- `tensor_stats`: Statistics to compute (default: `["mean", "norm", "hist"]`)
- `log_model_diagram`: Log the model summary and diagram (default: `False`)
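For example, a logger created with only the required arguments tracks every layer with the default statistics (a minimal sketch; the project name is a placeholder):

```python
import torch.nn as nn
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

run = Run(project="your-workspace/your-project")  # placeholder project name
model = nn.Linear(16, 4)

# Defaults: all layers tracked, tensor_stats=["mean", "norm", "hist"], no diagram
neptune_logger = NeptuneLogger(run=run, model=model)
```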
```python
log_model_internals(
    step: int,
    track_activations: bool = True,
    track_gradients: bool = True,
    track_parameters: bool = False,
    prefix: Optional[str] = None,
)
```

Parameters:

- `step`: Current training step for logging
- `track_activations`: Track layer activations (default: `True`)
- `track_gradients`: Track layer gradients (default: `True`)
- `track_parameters`: Track model parameters (default: `False`)
- `prefix`: Optional phase identifier (e.g., "train", "validation")
| Statistic | Description | Use case |
|---|---|---|
| `mean` | Mean value | Monitor activation levels |
| `std` | Standard deviation | Detect activation variance |
| `norm` | L2 norm | Monitor gradient/activation magnitude |
| `min` | Minimum value | Detect dead neurons |
| `max` | Maximum value | Detect saturation |
| `var` | Variance | Monitor activation spread |
| `abs_mean` | Mean of absolute values | Monitor activation strength |
| `hist` | 50-bin histogram | Visualize distributions |
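For reference, here is a sketch of how these statistics correspond to plain PyTorch operations (illustrative only; the integration computes them internally):

```python
import torch

t = torch.randn(32, 128)  # e.g., an activation tensor

stats = {
    "mean": t.mean().item(),            # mean value
    "std": t.std().item(),              # standard deviation
    "norm": t.norm(p=2).item(),         # L2 norm
    "min": t.min().item(),              # minimum value
    "max": t.max().item(),              # maximum value
    "var": t.var().item(),              # variance
    "abs_mean": t.abs().mean().item(),  # mean of absolute values
}
hist = torch.histc(t, bins=50)          # 50-bin histogram counts
```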
Contributions to neptune-pytorch are welcome. Here's how you can help:
- Found a bug? Open an issue
- Include Python version, PyTorch version, and error traceback
- Provide a minimal reproducible example
- Have an idea? Create a feature request
- Describe the use case and expected behavior
- Check existing issues first to avoid duplicates
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and add tests
- Run tests: `pytest tests/`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to remote: `git push origin feature/amazing-feature`
- Open a Pull Request
- 🔧 Troubleshooting: Common Issues Guide
- 🎫 Support Portal: Reach out to us
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Made with ❤️ by the Neptune team