Neuron SDK Release - September 18, 2025

@suneelnj suneelnj released this 19 Sep 22:33
· 6 commits to master since this release
49ee15c

AWS Neuron SDK 2.26.0 adds support for PyTorch 2.8 and JAX 0.6.2, along with Python 3.11, and introduces inference improvements on Trainium2 (Trn2). This release includes expanded model support, enhanced parallelism features, new Neuron Kernel Interface (NKI) APIs, and improved development tools for optimization and profiling.

Inference Updates
NxD Inference - Model support expands with beta releases of Llama 4 Scout and Maverick variants on Trn2. The FLUX.1-dev image generation models are now available in beta on Trn2 instances.

Expert parallelism is now supported in beta, enabling MoE expert distribution across multiple NeuronCores. This release introduces on-device forward pipeline execution in beta and adds sequence parallelism in MoE routers for model deployment flexibility.
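To illustrate what distributing MoE experts across cores means, here is a minimal plain-Python sketch. It is not the NxD Inference API: the function names `assign_experts` and `dispatch`, the contiguous-block placement, and the example sizes are all assumptions made for illustration.

```python
from collections import defaultdict


def assign_experts(num_experts: int, num_cores: int) -> dict[int, list[int]]:
    """Hypothetical contiguous-block placement: each core owns an equal slice of experts."""
    if num_experts % num_cores != 0:
        raise ValueError("this sketch assumes experts divide evenly across cores")
    per_core = num_experts // num_cores
    return {
        core: list(range(core * per_core, (core + 1) * per_core))
        for core in range(num_cores)
    }


def dispatch(token_to_expert: list[int], placement: dict[int, list[int]]) -> dict[int, list[int]]:
    """Group token indices by the core that owns each token's routed expert."""
    owner = {expert: core for core, experts in placement.items() for expert in experts}
    tokens_per_core = defaultdict(list)
    for token_index, expert in enumerate(token_to_expert):
        tokens_per_core[owner[expert]].append(token_index)
    return dict(tokens_per_core)


# Assumed example: 8 experts spread over 4 cores, 5 tokens already routed by the router.
placement = assign_experts(num_experts=8, num_cores=4)
routed = dispatch([0, 7, 3, 2, 5], placement)
```

In this sketch each core only runs the experts it owns, so tokens must be sent to the owning core before the expert computation and gathered afterwards. A real deployment would handle that exchange with collective operations.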

Neuron Kernel Interface (NKI)
New APIs enable additional optimization capabilities:
  • gelu_apprx_sigmoid: GELU activation with sigmoid approximation
  • select_reduce: Selective element copying with maximum reduction
  • sequence_bounds: Sequence bounds computation
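For readers unfamiliar with the sigmoid approximation of GELU, the following plain-Python sketch compares it with the exact erf-based form. It does not call NKI. The scale constant 1.702 is the commonly cited value for this approximation and is an assumption here, since the release notes do not state which constant NKI uses. The function names are hypothetical.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid_approx(x: float, scale: float = 1.702) -> float:
    """Sigmoid approximation: x * sigmoid(scale * x). The scale value is assumed."""
    return x / (1.0 + math.exp(-scale * x))


# Largest absolute gap between the two forms on a small grid over [-4, 4].
grid = [i / 10.0 for i in range(-40, 41)]
max_gap = max(abs(gelu_exact(x) - gelu_sigmoid_approx(x)) for x in grid)
```

The approximation trades a small amount of accuracy for a cheaper activation, since it replaces the error function with a single sigmoid.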

API enhancements include:

  • tile_size: Added total_available_sbuf_size field
  • dma_transpose: Added axes parameter for 4D transpose
  • activation: Added gelu_apprx_sigmoid operation
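As a rough illustration of what an axes permutation means for a 4D transpose, the sketch below permutes a row-major 4D array stored as a flat list. It is a pure-Python model of the index arithmetic, not the `dma_transpose` signature. The helper name `transpose4d`, its arguments, and the NCHW to NHWC example are assumptions.

```python
from itertools import product


def transpose4d(flat, shape, axes):
    """Permute a row-major 4D array; output axis k is taken from source axis axes[k]."""
    if sorted(axes) != [0, 1, 2, 3]:
        raise ValueError("axes must be a permutation of (0, 1, 2, 3)")
    new_shape = tuple(shape[a] for a in axes)
    # Row-major strides of the source layout.
    strides = [1, 1, 1, 1]
    for i in range(2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out = []
    for idx in product(*(range(n) for n in new_shape)):
        # Map each output index back to its flat source offset.
        src = sum(idx[k] * strides[axes[k]] for k in range(4))
        out.append(flat[src])
    return out, new_shape


# Assumed example: an NCHW tensor of shape (1, 2, 2, 2) rearranged to NHWC.
data = list(range(8))
nhwc, nhwc_shape = transpose4d(data, (1, 2, 2, 2), (0, 2, 3, 1))
```

The convention used here matches the familiar array-library one, where the new shape is the source shape reordered by `axes`. Whether NKI uses the same convention should be checked against the NKI API reference.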

Developer Tools
Neuron Profiler improvements include the ability to select multiple semaphores at once to correlate pending activity with semaphore waits and increments. System profile grouping now uses a global NeuronCore ID instead of a process-local ID, which gives visibility across distributed workloads. The Profiler also warns when events are dropped because of limited buffer space.

The ncom-test utility adds State Buffer support on Trn2 for collective operations, including all-reduce, all-gather, and reduce-scatter. Error reporting now includes messages for invalid all-to-all collective sizes to help developers identify and resolve issues.

Deep Learning AMI and Containers
The Deep Learning AMI now supports PyTorch 2.8 on Amazon Linux 2023 and Ubuntu 22.04. Container updates include PyTorch 2.8.0 and Python 3.11 across all DLCs. The transformers-neuronx environment and package have been removed from the PyTorch inference DLAMI and DLC.

Component release highlights
These component release notes contain details on specific new and improved features, as well as breaking changes, bug fixes, and known issues for that component area of the Neuron SDK.