The post NVIDIA Enhances AI Scalability with NIM Operator 3.0.0 Release appeared on BitcoinEthereumNews.com.

NVIDIA Enhances AI Scalability with NIM Operator 3.0.0 Release

2025/09/11 14:46


Darius Baruo
Sep 10, 2025 17:33

NVIDIA’s NIM Operator 3.0.0 introduces advanced features for scalable AI inference, enhancing Kubernetes deployments with multi-LLM and multi-node capabilities, and efficient GPU utilization.





NVIDIA has unveiled the latest iteration of its NIM Operator, version 3.0.0, aimed at bolstering the scalability and efficiency of AI inference deployments. This release, as detailed in a recent NVIDIA blog post, introduces a suite of enhancements designed to optimize the deployment and management of AI inference pipelines within Kubernetes environments.

Advanced Deployment Capabilities

The NIM Operator 3.0.0 facilitates the deployment of NVIDIA NIM microservices, which serve the latest large language models (LLMs) and multimodal AI models across reasoning, retrieval, vision, and speech domains. The update adds multi-LLM compatibility, which allows diverse models with custom weights from various sources to be deployed, and multi-node capabilities, which address the challenges of deploying massive LLMs across multiple GPUs and nodes.

Collaboration with Red Hat

An important facet of this release is NVIDIA’s collaboration with Red Hat, which has enhanced the NIM Operator’s deployment on KServe. This integration leverages KServe lifecycle management, simplifying scalable NIM deployments and offering features such as model caching and NeMo Guardrails, which are essential for building trusted AI systems.

Efficient GPU Utilization

The release also brings Kubernetes Dynamic Resource Allocation (DRA) to the NIM Operator. DRA simplifies GPU management by letting users define GPU device classes and request resources that match specific workload requirements. The feature is currently in technology preview and is designed to support full GPU and Multi-Instance GPU (MIG) allocation, as well as GPU sharing through time slicing.
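As a rough illustration of the DRA request model, the sketch below builds a Kubernetes ResourceClaim manifest as a plain Python dictionary. It is not taken from NVIDIA's blog post or documentation: the function name, the claim name, and the device class `gpu.example.com` are hypothetical, and the field layout assumes the `resource.k8s.io/v1beta1` version of the Kubernetes API, which differs in other Kubernetes releases.

```python
import json


def build_gpu_resource_claim(name: str, device_class: str, count: int = 1) -> dict:
    """Return an illustrative ResourceClaim manifest as a plain dict.

    Assumption: the layout follows resource.k8s.io/v1beta1; check the
    API version served by your cluster before using anything like this.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "apiVersion": "resource.k8s.io/v1beta1",  # assumption, varies by cluster
        "kind": "ResourceClaim",
        "metadata": {"name": name},
        "spec": {
            "devices": {
                "requests": [
                    {
                        "name": "gpu",
                        # Hypothetical device class; a real cluster exposes
                        # the classes installed by its DRA driver.
                        "deviceClassName": device_class,
                        "allocationMode": "ExactCount",
                        "count": count,
                    }
                ]
            }
        },
    }


# Example: request two devices from the hypothetical class.
claim = build_gpu_resource_claim("nim-gpu-claim", "gpu.example.com", count=2)
print(json.dumps(claim, indent=2))
```

The printed JSON could be saved to a file and applied with standard Kubernetes tooling. How the NIM Operator references such claims from its own resources is not described in the article, so that wiring is left out here.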

Seamless Integration with KServe

NVIDIA’s NIM Operator 3.0.0 supports both raw and serverless deployments on KServe, enhancing inference service management through intelligent caching and NeMo microservices support. This integration aims to reduce inference time and autoscaling latency, thereby facilitating faster and more responsive AI deployments.

Overall, the NIM Operator 3.0.0 is a significant step forward in NVIDIA’s efforts to streamline AI workflows. By automating deployment, scaling, and lifecycle management, the operator enables enterprise teams to more easily adopt and scale AI applications, aligning with NVIDIA’s broader AI Enterprise initiatives.



Source: https://blockchain.news/news/nvidia-enhances-ai-scalability-nim-operator-3-0-0

