AWS adds support for NIXL with EFA to accelerate LLM inference at scale
19 March 2026 @ 6:00 pm
AWS announces support for the NVIDIA Inference Xfer Library (NIXL) with Elastic Fabric Adapter (EFA) to accelerate disaggregated large language model (LLM) inference on Amazon EC2. This integration enhances disaggregated inference serving through three key improvements: increased KV-cache throughput, reduced inter-token latency, and optimized KV-cache memory utilization. NIXL with EFA enables high-throughput, low-latency KV-cache transfer between prefill and decode nodes, as well as efficient KV-cache movement between storage layers. NIXL is interoperable with all EFA-enabled EC2 instances and integrates natively with frameworks including NVIDIA Dynamo, SGLang, and vLLM. Combined, NIXL with EFA enables flexible integration with your EC2 instance and framework of choice, providing performant disaggregated inference at scale. AWS supports NIXL version 1.0.0 or higher with EFA installer version 1.47.0 or higher on all EFA-enabled EC2 instance types in all AWS Regions.
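Because support spans every EFA-enabled instance type, one practical first step is confirming which instance types in your Region expose EFA. A minimal boto3 sketch using the documented "network-info.efa-supported" filter of the DescribeInstanceTypes API (the script itself is illustrative, not part of the announcement):

```python
import boto3

# Enumerate EFA-enabled EC2 instance types in the current Region.
ec2 = boto3.client("ec2")
paginator = ec2.get_paginator("describe_instance_types")

efa_types = []
for page in paginator.paginate(
    Filters=[{"Name": "network-info.efa-supported", "Values": ["true"]}]
):
    efa_types.extend(t["InstanceType"] for t in page["InstanceTypes"])

print(sorted(efa_types))
```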
Amazon EC2 C8gn instances are now available in additional regions
19 March 2026 @ 5:00 pm
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C8gn instances, powered by the latest-generation AWS Graviton4 processors, are available in the AWS Asia Pacific (Jakarta), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), South America (Sao Paulo), and Europe (Zurich) Regions. The new instances provide up to 30% better compute performance than Graviton3-based Amazon EC2 C7gn instances. Amazon EC2 C8gn instances feature the latest 6th generation AWS Nitro Cards and offer up to 600 Gbps of network bandwidth, the highest among network-optimized EC2 instances.
Take advantage of the enhanced networking capabilities of C8gn to scale performance and throughput while optimizing the cost of running network-intensive workloads such as network virtual appliances, data analytics, and CPU-based artificial intelligence and machine learning (AI/ML) inference. For increased scalability, C8gn instances offer instance sizes up to 48xlarge.
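To check whether C8gn sizes have reached a particular Region, you can query the DescribeInstanceTypeOfferings API per Region. A small boto3 sketch (the Region code shown is Jakarta's; instance-type filters accept wildcards):

```python
import boto3

# List the C8gn sizes offered in Asia Pacific (Jakarta).
ec2 = boto3.client("ec2", region_name="ap-southeast-3")
resp = ec2.describe_instance_type_offerings(
    LocationType="region",
    Filters=[{"Name": "instance-type", "Values": ["c8gn.*"]}],
)
print(sorted(o["InstanceType"] for o in resp["InstanceTypeOfferings"]))
```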
AWS Lambda now supports Availability Zone metadata
19 March 2026 @ 4:36 pm
AWS Lambda now provides Availability Zone (AZ) metadata through a new metadata endpoint in the Lambda execution environment. With this capability, developers can determine the AZ ID (e.g., use1-az1) of the AZ their Lambda function is running in, enabling them to build functions that make AZ-aware routing decisions, such as preferring same-AZ endpoints for downstream services to reduce cross-AZ latency. This capability also enables operators to implement AZ-aware resilience patterns like AZ-specific fault injection testing.
Lambda automatically provisions and maintains execution environments ready to serve function invocations across multiple AZs within an AWS Region to provide high availability and fault tolerance without any additional configuration or management overhead for customers. As development teams scale their serverless applications, their functions often need to interact with other AWS services.
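The announcement does not spell out the endpoint's path or response shape, so the sketch below treats both as assumptions to be verified against the Lambda documentation. The pattern it illustrates is the one described above: resolve the AZ ID once at init, then use it for same-AZ routing decisions.

```python
import json
import os
import urllib.request

# AWS_LAMBDA_RUNTIME_API is the documented host:port of the in-environment
# runtime API. The metadata path and the "availabilityZoneId" field below
# are assumptions for illustration; check the Lambda docs for the real contract.
_RUNTIME_API = os.environ["AWS_LAMBDA_RUNTIME_API"]
_METADATA_URL = f"http://{_RUNTIME_API}/metadata"  # assumed path

def _get_az_id() -> str:
    with urllib.request.urlopen(_METADATA_URL, timeout=1) as resp:
        return json.load(resp)["availabilityZoneId"]  # e.g. "use1-az1"

# Resolved once per execution environment (init phase), not per invocation.
AZ_ID = _get_az_id()

def handler(event, context):
    # Illustrative routing decision: prefer a same-AZ endpoint when one exists.
    endpoints = {
        "use1-az1": "svc.use1-az1.internal",  # hypothetical service hosts
        "use1-az2": "svc.use1-az2.internal",
    }
    target = endpoints.get(AZ_ID, "svc.internal")
    return {"az_id": AZ_ID, "target": target}
```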
Amazon RDS Custom for SQL Server adds ability to view and schedule new operating system updates
19 March 2026 @ 7:00 am
Amazon RDS Custom for SQL Server now provides customers with the ability to view and schedule new operating system (OS) updates for RDS provided engine versions (RPEV). With RPEV, RDS Custom provides a SQL Server version pre-installed on an Amazon Machine Image (AMI). When new OS updates are available for RPEV, customers can view upcoming updates, apply them immediately, or schedule them for the next maintenance window using RDS Custom APIs. To view available OS updates, customers can use the describe-pending-maintenance-actions API, or subscribe to RDS-EVENT-0230 to receive an alert when new updates become available for their database instance.
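The same actions are available from the SDKs. A boto3 sketch of the view-then-schedule flow (the instance ARN and action name are placeholders; use the Action values the describe call returns):

```python
import boto3

rds = boto3.client("rds")

# List pending maintenance actions, including OS updates for RPEV instances.
pending = rds.describe_pending_maintenance_actions()
for resource in pending["PendingMaintenanceActions"]:
    print(resource["ResourceIdentifier"])
    for action in resource["PendingMaintenanceActionDetails"]:
        print("  ", action["Action"], action.get("Description", ""))

# Opt an update into the next maintenance window instead of applying it now.
rds.apply_pending_maintenance_action(
    ResourceIdentifier="arn:aws:rds:us-east-1:123456789012:db:my-custom-sqlserver",  # placeholder
    ApplyAction="os-upgrade",        # use an Action value returned above
    OptInType="next-maintenance",    # or "immediate" to apply right away
)
```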
MiniMax M2.5 and GLM 5 models now available on Amazon Bedrock
18 March 2026 @ 9:07 pm
Amazon Bedrock expands model selection for customers by adding support for GLM 5 and MiniMax M2.5. GLM 5 is a frontier-class, general-purpose large language model optimized for complex systems engineering and long-horizon agentic tasks. It builds on the GLM 4.5 agent-centric lineage and is designed to support multi-step reasoning, math (including AIME-style benchmarks), advanced coding, and tool-augmented workflows, with long-context support suitable for sophisticated agents and enterprise applications. MiniMax M2.5 is an agent-native frontier model trained explicitly to reason efficiently, decompose tasks optimally, and complete complex workflows under real-world time and cost constraints. It achieves task completion speeds comparable to or faster than leading proprietary frontier models by combining high inference throughput with reinforcement learning focused on token-efficient reasoning and better decision-making in agentic scaffolds.
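Both models are reachable through Bedrock's unified Converse API. A minimal boto3 sketch (the model ID below is hypothetical; look up the actual identifier in the Bedrock console or via the ListFoundationModels API):

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

response = bedrock.converse(
    modelId="minimax.m2-5-v1:0",  # hypothetical ID; verify in the Bedrock console
    messages=[
        {
            "role": "user",
            "content": [{"text": "Plan the steps to migrate a nightly cron job to EventBridge Scheduler."}],
        }
    ],
    inferenceConfig={"maxTokens": 1024, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```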
Amazon EC2 High Memory U7i-6TB instances now available in Asia Pacific (Malaysia)
18 March 2026 @ 7:00 pm
Amazon EC2 High Memory U7i instances with 6TB of memory (u7i-6tb.112xlarge) are now available in the AWS Asia Pacific (Malaysia) Region. U7i instances are part of the AWS 7th generation of EC2 instances and are powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids). U7i-6tb instances offer 6TiB of DDR5 memory, enabling customers to scale transaction processing throughput in a fast-growing data environment.
U7i-6tb instances deliver 448 vCPUs with up to 100 Gbps of Amazon EBS bandwidth for faster data loading and backups, 100 Gbps of network bandwidth, and ENA Express. U7i instances are ideal for customers using mission-critical in-memory databases like SAP HANA, Oracle, and SQL Server.
To learn more about U7i instances, visit the High Memory instances page.
Amazon ECR now supports pull through cache for Chainguard
18 March 2026 @ 6:00 pm
Amazon Elastic Container Registry (Amazon ECR) pull through cache now supports Chainguard's registry as an upstream source. With today's release, customers benefit from the security and availability of Amazon ECR for private Chainguard images. As customers scale their use of Chainguard images, keeping them synchronized with Chainguard's registry becomes increasingly important. With ECR's pull through cache feature, customers can keep Chainguard images in sync without additional workflows or tools to manage. Amazon ECR's pull through cache supports frequent registry syncs, helping to keep container images sourced from Chainguard up to date. Customers can also apply ECR features such as image scanning and lifecycle policies to their cached Chainguard images. The pull through cache for Chainguard is available in all AWS Regions where Amazon ECR pull through cache is supported. To get started, see the Amazon ECR documentation.
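Setting up a cache rule is a single API call. A boto3 sketch, assuming Chainguard's registry at cgr.dev as the upstream and a Secrets Manager secret holding the pull credentials (the ARN is a placeholder; ECR requires the secret name to start with "ecr-pullthroughcache/"):

```python
import boto3

ecr = boto3.client("ecr")

ecr.create_pull_through_cache_rule(
    ecrRepositoryPrefix="chainguard",  # local prefix for cached images
    upstreamRegistryUrl="cgr.dev",     # Chainguard's registry
    credentialArn=(
        "arn:aws:secretsmanager:us-east-1:123456789012:"
        "secret:ecr-pullthroughcache/chainguard-AbCdEf"  # placeholder ARN
    ),
)
# Images are then pulled (and cached on first pull) via:
#   <account-id>.dkr.ecr.<region>.amazonaws.com/chainguard/<image>:<tag>
```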
NVIDIA Nemotron 3 Super now available on Amazon Bedrock
18 March 2026 @ 6:00 pm
Amazon Bedrock now supports NVIDIA Nemotron 3 Super, an open hybrid Mixture-of-Experts (MoE) model designed for complex multi-agent applications. Built for agentic workloads, Nemotron 3 Super delivers fast, cost-efficient inference, enabling AI agents to maintain focus and accuracy across long, multi-step tasks without losing context. Fully open with weights, datasets, and recipes, the model supports easy customization and secure deployment, making it well-suited for enterprises, startups, and individual developers building multi-agent workflows and advanced reasoning applications.
Amazon Bedrock gives customers access to Nemotron 3 Super through a single, fully managed API — with no infrastructure to provision or models to host. Bedrock's serverless inference, built-in security controls, and compatibility with OpenAI API specifications make it easy to integrate Nemotron 3 Super into existing workflows and deploy at production scale with confidence.
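The OpenAI compatibility mentioned above means existing OpenAI-SDK code can be pointed at Bedrock with little change. A sketch using the openai Python package; the base URL pattern, API key mechanism, and model ID are all assumptions to verify against the Bedrock documentation:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-east-1.amazonaws.com/openai/v1",  # assumed pattern
    api_key="<bedrock-api-key>",  # placeholder credential
)

resp = client.chat.completions.create(
    model="nvidia.nemotron-3-super-v1:0",  # hypothetical ID; verify in the console
    messages=[
        {"role": "user", "content": "Summarize the tradeoffs of MoE models for multi-agent workloads."}
    ],
)
print(resp.choices[0].message.content)
```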
Amazon S3 Access Grants are now available in the AWS Asia Pacific (New Zealand) Region
18 March 2026 @ 5:17 pm
You can now create Amazon S3 Access Grants in the AWS Asia Pacific (New Zealand) Region.
Amazon S3 Access Grants map identities in directories such as Microsoft Entra ID, or AWS Identity and Access Management (IAM) principals, to datasets in S3. This helps you manage data permissions at scale by automatically granting S3 access to end users based on their corporate identity.
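The grant-then-exchange flow looks like this in boto3. The Region code for New Zealand, the account ID, bucket, and role ARN are placeholders; the "default" location ID covers all of s3:// once registered:

```python
import boto3

s3control = boto3.client("s3control", region_name="ap-southeast-6")  # assumed NZ Region code
ACCOUNT_ID = "123456789012"  # placeholder

# Grant an IAM principal read access to a prefix within a registered location.
s3control.create_access_grant(
    AccountId=ACCOUNT_ID,
    AccessGrantsLocationId="default",
    AccessGrantsLocationConfiguration={"S3SubPrefix": "amzn-s3-demo-bucket/reports/*"},
    Grantee={
        "GranteeType": "IAM",
        "GranteeIdentifier": f"arn:aws:iam::{ACCOUNT_ID}:role/analytics",  # placeholder role
    },
    Permission="READ",
)

# End users exchange their identity for short-lived, grant-scoped credentials.
creds = s3control.get_data_access(
    AccountId=ACCOUNT_ID,
    Target="s3://amzn-s3-demo-bucket/reports/*",
    Permission="READ",
)["Credentials"]
```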
Visit the AWS Region Table for complete regional availability information. To learn more about Amazon S3 Access Grants, visit our product page.
Amazon EC2 M6in and M6idn instances are now available in Europe (London) Region
18 March 2026 @ 4:39 pm
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M6in and M6idn instances are available in the AWS Europe (London) Region. These sixth-generation network-optimized instances, powered by 3rd Generation Intel Xeon Scalable processors and built on the AWS Nitro System, deliver up to 200 Gbps of network bandwidth, 2x the network bandwidth of comparable fifth-generation instances. Customers can use M6in and M6idn instances to scale the performance and throughput of network-intensive workloads such as high-performance file systems, distributed web-scale in-memory caches, caching fleets, real-time big data analytics, and telco applications such as 5G User Plane Function.
M6in and M6idn instances are available in 10 different instance sizes including metal, offering up to 128 vCPUs and 512 GiB of memory. They deliver up to 100 Gbps of Amazon EBS bandwidth.