

The Hottest 3 Letters in Generative AI Right Now Are Not LLM

Using RAG, or retrieval augmented generation, can help boost your generative AI work. [Adobe Stock | Studio Science]

How to take generative AI prompts to the next level with retrieval augmented generation, or RAG.

In 2023, Canada-based Algo Communications found itself facing a challenge. The company was poised for rapid growth, but it couldn’t train customer service representatives (CSRs) quickly enough to keep up with its expansion. To tackle this challenge, the company turned to a novel solution: generative AI.

Algo adopted a large language model (LLM) to help onboard new CSRs faster. To train them to answer complex customer questions with accuracy and fluency, Algo knew it needed something more robust than an off-the-shelf LLM, which is typically trained on the public internet and lacks the specific business context needed to answer questions accurately. Enter retrieval augmented generation, better known simply as RAG.

By now, many of us have used an LLM through chat apps like OpenAI’s ChatGPT or Google’s Gemini (formerly Bard) to help write an email or craft clever social media copy. But getting the best results isn’t always easy, especially if you haven’t nailed the fine art and science of crafting a great prompt.

Here’s why: An AI model is only as good as what it’s taught. For it to thrive, it needs the proper context and reams of factual data — and not generic information. An off-the-shelf LLM is not always up to date, nor will it have trustworthy access to your data or understand your customer relationships. That’s where RAG can help. 

RAG is an AI technique that allows companies to automatically embed their most current and relevant proprietary data directly into their LLM prompt. And we’re not just talking about structured data like a spreadsheet or a relational database. We mean retrieving all available data, including unstructured data: emails, PDFs, chat logs, social media posts, and other types of information that could lead to a better AI output.  


How does retrieval augmented generation work? 

In a nutshell, RAG helps companies retrieve and use their data from various internal sources for better generative AI results. Because the source material comes from your own trusted data, it helps reduce or even eliminate hallucinations and other incorrect outputs. Bottom line: You can trust the responses to be relevant and accurate.

To achieve this improved accuracy, RAG works in conjunction with a specialized type of database — called a vector database — to store data in a numeric format that makes sense for AI, and retrieve it when prompted. 

“RAG can’t do its job without the vector database doing its job,” said Ryan Schellack, director of AI product marketing at Salesforce. “The two go hand in hand. When you see a company talk about supporting retrieval augmented generation, they are at minimum supporting two things: a vector store for storing information, and then some type of machine-learning search mechanism designed to work against that type of data.” 

Working in tandem with a vector database, RAG can be a powerful tool for generating better LLM outputs, but it’s not a silver bullet. Users must still understand the fundamentals of writing a clear prompt.

Faster response times to complex questions

After adding a tremendous amount of unstructured data to its vector database, including chat logs and two years of email history, Algo Communications started testing the technology in December 2023 with a few of its CSRs. They worked on a small sample set: about 10% of the company’s product base. It took about two months for the CSRs to get comfortable with the tool. During implementation, company leadership was excited to see CSRs gain greater confidence answering in-depth questions with the assistance of RAG. At that point, the company began rolling RAG out more widely.


“Exploring RAG helped us understand we were going to be able to bring in so much more data,” said Ryan Zoehner, vice president, commercial operations for Algo Communications. “It was going to allow us to break down a lot of those really complex answers and deliver five- and six-part responses in a way that customers knew [there] was someone technically savvy answering them.”

In just two months after adding RAG, Algo’s customer service team was able to more quickly and efficiently complete cases, which helped them move on to new inquiries 67% faster. RAG now touches 60% of its products and will continue to expand. The company also started adding new chat logs and conversations into the database, reinforcing its solution with even more relevant context. Using RAG has also allowed Algo to cut its onboarding time in half, enabling it to grow faster. 

“RAG is making us more efficient,” Zoehner said. “It’s making our employees happier with their job and is helping us onboard everything faster. What has made this different from everything else we tried to do with LLMs is it allowed us to keep our brand, our identity, and the ethos of who we are as a company.”

With RAG providing Algo’s CSRs with an AI assist, the team has been able to dedicate more time to adding a human touch to customer interactions. 

“It allows our team to spend that extra little bit on making sure the response is landing the right way,” Zoehner said. “That humanity allows us to bring our brand across everything. It also gives us quality assurance across the board.”

Ari Bendersky Contributing Editor

Ari Bendersky is a Chicago-based lifestyle journalist who has contributed to a number of leading publications including the New York Times, The Wall Street Journal magazine, Men's Journal, RollingStone.com and many more. He has written for brands as wide-ranging as Ace Hardware to Grassroots Cannabis and is a lead contributor to the Salesforce 360 Blog. He is also the co-host of the Overserved podcast, featuring long-form conversations with food and beverage personalities.
