NVIDIA NIM Microservices Revolutionize AI Model Deployment
Delivered as optimized containers, NVIDIA NIM microservices are designed to accelerate AI application development for businesses of all sizes, paving the way for rapid production and deployment of AI technologies. The set of microservices can be used to build and deploy AI solutions across speech AI, data retrieval, digital biology, digital humans, simulation, and large language models (LLMs), according to the NVIDIA Technical Blog.
Speech and Translation NIM Microservices
The latest NIM microservices for speech and translation enable organizations to integrate advanced multilingual speech and translation capabilities into their conversational applications. These include automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT), catering to diverse industry needs.
Parakeet ASR
The Parakeet ASR-CTC-1.1B-EnUS ASR model, with 1.1 billion parameters, provides record-setting English language transcription capabilities. It delivers exceptional accuracy and robustness, adeptly handling diverse speech patterns and noise levels, enabling businesses to advance their voice-based services.
FastPitch-HiFiGAN TTS
FastPitch-HiFiGAN-EN integrates FastPitch and HiFiGAN models to generate high-fidelity audio from text. It enables businesses to create natural-sounding voices, elevating user engagement and delivering immersive experiences.
Megatron NMT
The Megatron 1B-En32 is a powerful NMT model excelling in real-time translation across multiple languages, facilitating seamless multilingual communication. It enables organizations to extend their global reach and engage diverse audiences.
Retrieval NIM Microservices
The latest NVIDIA NeMo Retriever NIM microservices help developers efficiently fetch the best proprietary data to generate knowledgeable responses for their AI applications. NeMo Retriever enables organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses using retrieval-augmented generation (RAG).
Embedding QA E5
The NVIDIA NeMo Retriever QA E5 embedding model is optimized for text question-answering retrieval. It transforms textual information into dense vector representations, crucial for a text retrieval system.
Embedding QA Mistral 7B
The NVIDIA NeMo Retriever QA Mistral 7B embedding model is a multilingual community base model fine-tuned for high-accuracy question-answering. This model is suitable for users building a question-and-answer application over a large text corpus.
Snowflake Arctic Embed
Snowflake Arctic Embed is a suite of text embedding models for high-quality retrieval, optimized for performance. These models are ready for commercial use, free of charge, and have achieved state-of-the-art performance on the MTEB/BEIR leaderboard.
Reranking QA Mistral 4B
The NVIDIA NeMo Retriever QA Mistral 4B reranking model provides a logit score representing document relevance to a query. It improves the overall accuracy of text retrieval systems, often deployed in combination with embedding models.
Digital Biology NIM Microservices
In healthcare and life sciences, NVIDIA NIM microservices are transforming digital biology. These AI tools empower pharmaceutical companies, biotechnology, and healthcare facilities to expedite innovation and deliver life-saving medicine to patients.
MolMIM
MolMIM is a transformer-based model for controlled small molecule generation, optimizing and sampling molecules for improved values of desired scoring functions. It can be deployed in the cloud or on-premises for computational drug discovery workflows.
DiffDock
NVIDIA DiffDock NIM microservice is built for high-performance, scalable molecular docking. It predicts up to 7x more poses per second compared to baseline models, reducing the cost of computational drug discovery workflows.
LLM NIM Microservices
New NVIDIA NIM microservices for LLMs offer unprecedented performance and accuracy across various applications and languages.
Llama 3.1 8B and 70B
The Llama 3.1 8B and 70B models provide cutting-edge text generation and language understanding capabilities, serving as powerful tools for creating engaging and informative content. Deploying Llama 3.1 8B NIM on NVIDIA H100 data center GPUs can achieve up to 2.5x tokens per second for content generation.
Llama 3.1 405B
Llama 3.1 405B is the largest openly available model for various use cases, including synthetic data generation. The Llama 3.1 405B NIM microservice can be downloaded and run anywhere from the NVIDIA API catalog.
Simulation NIM Microservices
New NVIDIA USD NIM microservices offer the ability to leverage generative AI copilots and agents to develop Universal Scene Description (OpenUSD) tools that accelerate the creation of 3D worlds.
USD Code
USD Code is a state-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.
USD Search
USD Search provides AI-powered search for OpenUSD data, 3D models, images, and assets using text- or image-based inputs.
USD Validate
USD Validate enables verifying compatibility of OpenUSD assets with instant RTX render and rule-based validation.
Video Conferencing NIM Microservices
NVIDIA Maxine simplifies the deployment of AI features that enhance audio, video, and augmented reality effects for video conferencing and telepresence.
Maxine Audio2Face-2D
Maxine Audio2Face-2D animates a 2D image in real time using speech audio. It enables head pose animation for natural delivery and can be coupled with chatbot output or translated speech.
Eye Contact
NVIDIA Maxine Eye Contact NIM microservice uses AI to apply a filter to the user’s webcam feed in real time, redirecting their eye gaze toward the camera to improve, augment, and enhance the user experience.
Accelerate AI Application Development
NVIDIA NIM streamlines the creation of complex AI applications by enabling the integration of specialized microservices across domains. Using NIM microservices, organizations can bypass the complexities of building AI models from scratch, saving time and resources. This allows for the assembly of customized AI solutions that meet specific business needs.
For example, a company can combine ACE NIM microservices, including speech recognition, with LLM NIM microservices to create digital humans for personalized customer service across industries such as healthcare, finance, and retail.
NIM microservices can also be integrated into supply chain management systems, combining cuOpt NIM microservice for route optimization with NeMo Retriever NIM microservices for retrieval-augmented generation and LLM NIM microservices for business communication.
Get Started
NVIDIA NIM empowers enterprises to fully harness AI, accelerating innovation, maintaining a competitive edge, and delivering superior customer experiences. Explore the latest AI models available with NIM microservices and discover how these powerful tools can transform your business.
Image source: Shutterstock