Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies

Four things that matter in the AI hype cycle

It’s been almost one year since a new breed of artificial intelligence took the world by storm. The capabilities of these new generative AI tools, most of which are powered by large language models (LLMs), forced every company and employee to rethink how they work. Was this new technology a threat to their job or a tool that would amplify their productivity? If you don’t figure out how to make the most of GenAI, are you going to get outclassed by your peers?

This paradigm shift placed a dual burden on engineering and technical leaders. First, there’s the internal demand to understand how your organization is going to adopt these new tools and what you need to do to avoid falling behind your competitors. Second, if you’re selling software and services to other companies, you’re going to find that many have paused spending on new tools while they sort out exactly what their approach should be to the GenAI era.

There is a ton of hype, and it can be exhausting trying to figure out where to direct your resources. Before you can dive into the details of what to do with the answers or art your GenAI is creating, you need a robust foundation to ensure it’s operating well. To help, we’ve come up with four key areas you’ll need to understand to make the most of the time and resources you invest.

  • Vector Databases
  • Embedding Models
  • Retrieval Augmented Generation
  • Knowledge Bases

These are almost certain to be fundamental pieces of your AI stack, so read on to learn more about the four pillars needed for effectively adding GenAI to your organization.

Vector Databases

To make use of a large language model, you’re going to need to vectorize your data. That means the text you feed into the model is going to be reduced to arrays of numbers, and those numbers are going to be plotted as a vector on a map, albeit one with hundreds or thousands of dimensions. Finding similar text is reduced to finding the distance between two vectors. This allows you to move from the old-fashioned approach of lexical keyword search (typing a few terms and getting back results that share those keywords) to semantic search: typing a query in natural language and getting back a response that understands a coding question about Python is probably referring to the programming language and not the large snake.
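
To make “finding the distance between two vectors” concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The three-dimensional vectors are invented for illustration; real embeddings come from a model and are far longer.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings, made up for this example.
python_language = [0.9, 0.1, 0.0]
coding_question = [0.8, 0.2, 0.1]
python_snake = [0.1, 0.2, 0.9]

# Higher similarity means the vectors point in a more similar direction.
sim_language = cosine_similarity(coding_question, python_language)
sim_snake = cosine_similarity(coding_question, python_snake)
```

With these made-up numbers, the coding question scores much closer to the programming language than to the snake, which is the behavior semantic search relies on.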

“Traditional data structures, typically organized in structured tables, often fall short of capturing the complexity of the real world,” says Weaviate’s Philip Vollet. “Enter vector embeddings. These embeddings capture features and representations of data, enabling machines to understand, abstract, and compute on that data in sophisticated ways.”

How do you choose the right vector database? In some cases, it may depend on the tech stack your team is already using. Stack Overflow went with Weaviate in part because it allowed us to continue using PySpark, which was the initial choice for our OverflowAI efforts. On the other hand, you may have a database provider, like MongoDB, which has been serving you well. MongoDB now supports vectors as part of its OLTP database, making it easy to integrate with your existing deployments. Expect this to be standard for database providers in the future. As Louis Brady, VP of Engineering at Rockset, explained, most companies will find that a hybrid approach combining a vector database with their existing system offers the most flexibility and the best results.

Embedding Models

How do you get your data into the vector database in a way that accurately organizes it by the content? For that, you’ll need an embedding model. This is the software system which will take your text and convert it to the array of numbers you store in the vector database. There are a lot to choose from, and they vary greatly in cost and complexity. For this article, we’ll focus on embedding models that work with text, although embedding models can also be used to organize information about other types of media, like images or songs.
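
To show the shape of what an embedding model does (text in, fixed-length array of numbers out), here is a toy stand-in. It is not a trained model: `toy_embed` is a hypothetical helper that hashes word counts into buckets, so it only captures shared words, not meaning. A real embedding model would replace this function but keep roughly the same interface.

```python
import hashlib
import math
import re

def toy_embed(text, dimensions=16):
    """Map text to a fixed-length, unit-length list of floats.

    Stand-in only: feature hashing of word counts, NOT a trained model.
    """
    vector = [0.0] * dimensions
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Use a stable hash so the same word always lands in the same bucket.
        digest = hashlib.md5(word.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dimensions
        vector[bucket] += 1.0
    length = math.sqrt(sum(v * v for v in vector))
    if length == 0:
        return vector  # empty text: return the all-zero vector
    return [v / length for v in vector]

embedding = toy_embed("How do I reverse a list in Python?")
```

The output is the kind of array you would store in a vector database alongside a reference to the original text.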

As Dale Markowitz wrote on the Google Cloud blog, “If you’d like to embed text–i.e. to do text search or similarity search on text–you’re in luck. There are tons and tons of pre-trained text embeddings free and easily available.” One example is the Universal Sentence Encoder, which “encodes text into high-dimensional vectors that can be used for text classification, semantic similarity, clustering, and other natural language tasks.” With just a few lines of Python code, you can prepare your data for a GenAI chatbot-style interface. If you want to take things a step further, Dale also has a great tutorial on how to prototype a language-powered app using nothing more than Google Sheets and a plugin called Semantic Reactor.

You’ll need to evaluate the tradeoffs between the time and cost of putting huge amounts of text into your embedding model and how thinly you slice the text, which is usually chunked into sections like chapters, pages, paragraphs, sentences, or even individual words. The other tradeoff is the precision of the embedding model: how many decimal places to keep for each number in a vector, since every extra digit of precision takes up more storage. Over thousands of vectors for millions of tokens, this adds up. You can use techniques like quantization to shrink the model down, but it’s best to consider the amount of data and degree of detail you’re looking for before you choose which embedding method is right for you.
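
Both tradeoffs can be sketched in a few lines. The example below chunks a sample text at two granularities, estimates raw storage at two precisions, and applies simple symmetric int8 quantization to a stored vector. That is one of several quantization schemes, applied here to the vectors themselves as an assumption. The corpus size (1 million chunks) and dimension count (768) are also assumed figures, and all function names are hypothetical.

```python
import re

def chunk_paragraphs(text):
    """Split on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def chunk_sentences(text):
    """Naive split on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def storage_bytes(num_vectors, dimensions, bytes_per_value):
    """Raw storage for the vector values alone (no index overhead)."""
    return num_vectors * dimensions * bytes_per_value

def quantize_int8(vector):
    """Scale floats into integers in [-127, 127]; return (ints, scale)."""
    max_abs = max(abs(v) for v in vector)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(v / scale) for v in vector], scale

def dequantize_int8(quantized, scale):
    """Approximate the original floats from the stored integers."""
    return [q * scale for q in quantized]

sample = ("Vectors are arrays of numbers. Similar text has nearby vectors.\n\n"
          "Chunk size is a tradeoff. Smaller chunks mean more vectors.")
paragraphs = chunk_paragraphs(sample)  # coarser: fewer vectors to store
sentences = chunk_sentences(sample)    # finer: more vectors to store

# Assumed figures for illustration: 1 million chunks, 768 dimensions.
float32_size = storage_bytes(1_000_000, 768, 4)  # 4 bytes per float32
int8_size = storage_bytes(1_000_000, 768, 1)     # 1 byte per int8

original = [0.12, -0.5, 0.33, 0.0]
quantized, scale = quantize_int8(original)
restored = dequantize_int8(quantized, scale)     # close to, not equal to, original
```

Under these assumptions, dropping from 32-bit floats to 8-bit integers cuts raw vector storage to a quarter, at the cost of a small rounding error in each value.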

Retrieval Augmented Generation (RAG)

Big AI models read the internet to gain knowledge. That means they know the earth is round…and they also know that it’s flat.

One of the main problems with large language models like the ones behind ChatGPT is that they were trained on a massive set of text from across the internet. That means they’ve read a lot about how the earth is round, and also a lot about how the earth is flat. The model isn’t trained to understand which of these assertions is correct, only to estimate the probability that a certain response will be a good match for the query the user enters. It can also blend those inputs into a new, statistically probable output, which is where hallucinations can occur. The result may match neither original claim, which is why checking sources is so important.

With RAG, you get a few benefits. First, you can limit the dataset the model searches, meaning the model hopefully won’t be drawing on inaccurate data. Second, you can ask the model to cite its sources, allowing you to verify its answer against the ground truth. At Stack Overflow, that might mean restricting queries to just the questions on our site with an accepted answer. When a user asks a question, the system first searches for Q&A posts that are a good match. That’s the retrieval part of this equation. A hidden prompt then instructs the model to do the following: synthesize a short answer for the user based on the answers you found that were validated by our community, then provide the short summary along with links to the three posts that were the best match for the user’s search.
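
The retrieve-then-prompt flow described above might be sketched like this. Everything here is hypothetical: the posts, URLs, field names, and prompt wording are invented, and word overlap stands in for the vector similarity search a real system would use. The final step, sending the prompt to a model, is left out because it depends on whichever LLM API you use.

```python
import re

# Hypothetical posts; titles, URLs, and fields are invented for illustration.
POSTS = [
    {"url": "https://example.com/q/1", "title": "How to reverse a list in Python",
     "answer": "Use reversed(items) or items[::-1].", "accepted": True},
    {"url": "https://example.com/q/2", "title": "Reverse a string in Python",
     "answer": "Use text[::-1].", "accepted": True},
    {"url": "https://example.com/q/3", "title": "Sort a list of dictionaries in Python",
     "answer": "Use sorted(items, key=...).", "accepted": True},
    {"url": "https://example.com/q/4", "title": "Reverse a list in Python without slicing",
     "answer": "Swap elements in a loop.", "accepted": False},
    {"url": "https://example.com/q/5", "title": "Center a div with CSS",
     "answer": "Use flexbox.", "accepted": True},
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, posts, top_k=3):
    """Return up to top_k accepted posts, ranked by words shared with the question.

    Word overlap is a stand-in for vector similarity search.
    """
    query_words = tokenize(question)
    scored = [(len(query_words & tokenize(p["title"])), p)
              for p in posts if p["accepted"]]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]

def build_prompt(question, retrieved):
    """Assemble the hidden prompt: instructions, numbered sources, question."""
    sources = "\n".join(
        f"[{i}] {p['title']}: {p['answer']} ({p['url']})"
        for i, p in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Write a short summary and cite the sources by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

question = "How do I reverse a list in Python?"
matches = retrieve(question, POSTS)
prompt = build_prompt(question, matches)
```

Filtering on `accepted` before ranking is what keeps unvalidated content out of the prompt, and including the URLs is what lets the user check the summary against the original posts.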

A third benefit of RAG is that it allows you to keep the data the model is using fresh. Training a large model is costly. Many of the popular models available today are based on training data that ended months or even years ago. Ask one of them a question about something that happened after that cutoff, and it will happily hallucinate a convincing response, even though it has no actual information to work with. RAG allows you to point the model at a specific dataset, one that you can keep up to date without having to retrain the entire model.

RAG means the user still gets the benefit of working with an LLM. They can ask questions using natural language and get back a summary that synthesizes the most relevant information from a vast data store. At the same time, drawing on a predefined data set helps to reduce hallucinations and gives the user links to the ground truth, so they can easily check the model’s output against something generated by humans.

Knowledge Bases

As mentioned in the previous section, RAG can constrain the text your model is drawing on when generating its response. Ideally, that means you’re giving it accurate data, not just a random sampling of things it’s read on the internet. One of the most important laws of training an AI model is that data quality matters. The old saying “garbage in, garbage out” holds very true for your LLM. Feed it low-quality or poorly organized text, and the results will be equally uninspiring.

At Stack Overflow, we kind of lucked out on the data quality issue. Question and answer is the format being adopted by most LLMs used inside organizations, and our dataset was already built that way. Our Q&A couplets can show us which information is accurate and which still lacks sufficient confidence, based on the number of votes an answer has received or whether a question has an accepted answer. Votes can also be used to determine which of three similar answers is the most widely used and thus the most valuable. Last but not least, tags allow the system to better understand how different information in your dataset is related.
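
As a rough illustration of how votes, accepted answers, and tags could serve as quality signals, here is a small sketch. The `Answer` class, the `min_votes` threshold, and the rule of ranking accepted answers ahead of higher-voted ones are all assumptions made for this example, not a description of how Stack Overflow actually scores content.

```python
from dataclasses import dataclass, field

@dataclass
class Answer:
    body: str
    votes: int
    accepted: bool = False
    tags: set = field(default_factory=set)

def confidence_key(answer):
    """Sort key: accepted answers first, then by vote count (assumed rule)."""
    return (answer.accepted, answer.votes)

def best_answer(answers, min_votes=1):
    """Pick the most trusted answer, or None if nothing clears the bar."""
    trusted = [a for a in answers if a.accepted or a.votes >= min_votes]
    return max(trusted, key=confidence_key) if trusted else None

def related_by_tag(answers, tag):
    """Return the answers that share a tag."""
    return [a for a in answers if tag in a.tags]

# Hypothetical answers to the same question.
answers = [
    Answer("Use items[::-1].", votes=42, tags={"python", "list"}),
    Answer("Use reversed(items).", votes=17, accepted=True, tags={"python", "list"}),
    Answer("Write a loop.", votes=0, tags={"python"}),
]

best = best_answer(answers)
list_related = related_by_tag(answers, "list")
```

Whether an accepted answer should outrank a more heavily upvoted one is a design choice; swapping the order of the tuple in `confidence_key` flips that priority.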

Learn more about how Stack Overflow for Teams helps the world’s top companies share knowledge and build their foundation for an AI future.


Category: News
October 24, 2023
Tags: art

    Tiatra LLC.

    Tiatra, LLC, based in the Washington, DC metropolitan area, proudly serves federal government agencies, organizations that work with the government and other commercial businesses and organizations. Tiatra specializes in a broad range of information technology (IT) development and management services incorporating solid engineering, attention to client needs, and meeting or exceeding any security parameters required. Our small yet innovative company is structured with a full complement of the necessary technical experts, working with hands-on management, to provide a high level of service and competitive pricing for your systems and engineering requirements.

    Tiatra, LLC
    Copyright 2016. All rights reserved.