Nvidia has announced a new demo application called Chat with RTX that allows users to personalize an LLM with their own content, such as documents, notes, and videos.
The application leverages Retrieval Augmented Generation (RAG), TensorRT-LLM, and RTX acceleration to allow users to query a custom chatbot and receive contextual responses quickly and securely.
The chatbot runs locally on a Windows RTX PC or workstation, providing additional data protection compared with a typical cloud-based chatbot.
Chat with RTX supports various file formats, including text, PDF, doc/docx, and XML. Users can simply point the application to the appropriate folders, and it will load the files into the library.
Users can also specify the URL of a YouTube playlist, and the application will load the transcripts of the videos in that playlist and make them chattable. Google Bard offers a similar feature, but it requires a Google account and processes the data in Google's cloud; Chat with RTX processes the transcripts locally.
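To make the RAG flow described above concrete, here is a minimal, purely conceptual sketch of retrieval over local document chunks. It uses a toy bag-of-words similarity in place of a real embedding model, and none of the names here come from Nvidia's implementation; Chat with RTX actually uses TensorRT-LLM and GPU-accelerated embeddings.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A real RAG pipeline would use a neural embedding model instead.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank document chunks by similarity to the query and keep the top k;
    # these chunks would then be prepended to the LLM prompt as context.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical chunks, as if loaded from a user's local document folder.
chunks = [
    "Chat with RTX runs locally on a Windows RTX PC.",
    "TensorRT-LLM accelerates inference on RTX GPUs.",
    "The demo supports text, PDF, doc/docx, and XML files.",
]
print(retrieve("what files does the demo support", chunks, k=1))
```

Because retrieval and generation both happen on the local machine, documents and transcripts never need to leave the PC, which is the privacy advantage the article highlights.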
You can register here to be notified when Chat with RTX is available.
Developers can get started right away
The Chat with RTX Tech Demo is based on the TensorRT-LLM RAG Developer Reference Project available on GitHub. According to Nvidia, developers can use this reference to build and deploy their RAG-based applications for RTX accelerated by TensorRT-LLM.
In addition to Chat with RTX, Nvidia also introduced RTX Remix at CES, a platform for creating RTX remasters of classic games, which will be available in beta in January, and Nvidia ACE Microservices, which provides games with intelligent and dynamic digital avatars based on generative AI.
Nvidia has also released TensorRT acceleration for Stable Diffusion XL (SDXL) Turbo and Latent Consistency Models (LCM), which is expected to deliver up to a 60 percent performance boost. An updated version of the Stable Diffusion WebUI TensorRT extension with improved support for SDXL, SDXL Turbo, and LCM Low-Rank Adaptation (LoRA) is now available.