Your LLM Gets Its Data From Where??
Salesforce Marketing Cloud
MARCH 20, 2024
Wikipedia, where anyone can write and edit an entry, is a major data source. It’s estimated that Wikipedia makes up between 3% and 5% of the scraped data used to train off-the-shelf LLMs.

Corpus data

Corpus data includes written or spoken data from books, newspapers, articles, websites (including blogs), academic papers, and more.