Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies
AI isn’t failing, people are failing with AI

In 2018, Google released the AI model BERT, forever changing how machines understand context in language. BERT, short for Bidirectional Encoder Representations from Transformers, solved a long-standing problem in natural language understanding. Before BERT, researchers needed multiple bespoke models (and datasets) to capture the different contextual meanings of human language. BERT demonstrated that one model could process contextual meaning across multiple languages (via mBERT).

While BERT became a fundamental building block in natural language processing (NLP), its impact on how we interact with computers came from its application. That change did not come from shoehorning the technology into existing solutions. It stemmed from an applied understanding of how BERT functioned, how it would evolve, and how it could solve domain-specific challenges. I know because my team at Google used BERT to create responsive search ads. Our work transformed online text advertising.

Our success in applying BERT wasn’t simply because we had TPUs or more resources than our competitors. (Those things helped, undoubtedly.) Our advantage was my team’s domain expertise and close collaboration with the Google Research team, who created the model. Collectively, we could envision how to apply BERT to fundamentally reshape advertising because we understood:

  1. How the model operates, including its strengths, weaknesses and dependencies.
  2. The specific industry problems and operational challenges we were solving for.
  3. The way we’d tune the model and create a system around it, at scale.

This framework remains pertinent today, as business leaders seek to understand how to build and deliver scaled impact with large language models (LLMs).

Weighing the pros and cons of models

As critics and pundits debate the efficacy of generative AI, numerous studies underscore a similar finding: people are unsure of how to use LLMs. To address these uncertainties, leaders need to ensure that their organization’s decision-makers understand, at a high level, how these models function and can be applied.

That technical understanding mattered when we applied BERT. It matters even more now, because while BERT required domain expertise to deploy effectively, today’s LLMs make it dangerously easy to deploy them poorly without realizing it. That is likely why so many AI projects never proceed past pilots, as McKinsey reported.

BERT’s success underscored the power of pre-training and fine-tuning a model on a large dataset to enhance token-level semantic context within NLP. But where BERT zigged, OpenAI’s Generative Pre-trained Transformer (GPT) zagged. Unlike BERT’s encoder-only architecture, GPTs use a decoder-only architecture, trained to predict the next token in a sequence. BERT was trained on billions of tokens; today’s GPTs are trained on trillions. The more tokens these LLMs were trained on, the more capabilities they gained.
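The next-token objective the GPT family trains on can be illustrated with a deliberately tiny sketch: a bigram counter standing in for a transformer. This assumes nothing about any real model's internals; it only shows what "predict the next token" means.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the trillions of tokens GPTs train on.
corpus = "the model predicts the next token the model learns patterns".split()

# Count bigram frequencies: how often each token follows the previous one.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the continuation seen most often after `token` in training."""
    return bigrams[token].most_common(1)[0][0]

print(predict_next("the"))  # prints "model": the most frequent continuation
```

A real LLM replaces the frequency table with a learned probability distribution over its whole vocabulary, but the objective is the same: emit the likeliest continuation of the sequence.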

Early applications of these LLMs have centered on their ability to generate creative, fluent and coherent writing, coding and imagery. This “creativity” reflects the probabilistic patterns that these models have learned from their massive training datasets. But this same predictive output can be disadvantageous when these models need to be deployed in factual, deterministic environments.

Deep learning researchers have long argued that imposing hard rules, in the tradition of symbolic reasoning, would inhibit a model’s abilities. Structured to predict an output even when the model is uncertain, LLMs have demonstrated a tendency to hallucinate. Hallucinations are not a bug, but a feature: they’re inherent to how these models operate, and without guardrails they should limit which applications you pursue. After all, mistakes in consequential industries like healthcare, finance and legal can be catastrophic.
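One common guardrail pattern is a grounding check: refuse to pass along claims that cannot be verified against a source document. The sketch below (the function name and the escalation convention are my own, hypothetical choices, not any product's API) flags any number in a generated answer that never appears in the source and routes it to human review instead of trusting the model.

```python
import re

def grounded_or_escalate(answer: str, source: str):
    """Pass the answer only if every number it states appears in the source;
    otherwise escalate to human review. Hypothetical guardrail sketch."""
    unsupported = [n for n in re.findall(r"\d+(?:\.\d+)?", answer) if n not in source]
    return ("escalate", unsupported) if unsupported else ("pass", [])

source = "The invoice total is 4200 USD, due in 30 days."
print(grounded_or_escalate("Total is 4200 USD due in 30 days.", source))  # pass
print(grounded_or_escalate("Total is 4500 USD.", source))  # escalate: 4500 unsupported
```

Production guardrails are far richer (citation checking, retrieval-grounded generation, confidence thresholds), but the principle is the same: the model proposes, a deterministic check disposes.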

This historical comparison, while surface-level, still illustrates how the model’s underlying functions impact its outputs and applications. Understanding these architectures solves half the problem. The other half is building a strategy to collect and scale data within your specific domain.

AI’s scaling problem is a domain problem

Healthcare, finance and legal are all industries with ample data and capital to spend. While each sector presents distinct challenges, they all have found success with AI.

Hospitals epitomize the opportunities and obstacles within the healthcare industry. The average hospital generates 50 petabytes of data annually, enough tokens to train a sophisticated model. But 97 percent of this data goes unused: it consists of unstructured clinical notes and radiology reports, documents redacted for HIPAA compliance, and data siloed or managed under regulatory scrutiny. When you can access it cleanly, as illustrated by AI detection of tumors in radiology images, a measurable impact is possible. When you can’t, your pilot stalls.

Finance presents different tradeoffs. Transaction data is generally well-structured and high-volume, enabling novel applications in fraud detection and customer service automation. But these applications are limited by LLMs’ high error rates on multi-step numerical reasoning, a fundamental misalignment for calculation-intensive work.

The pervasiveness of hallucinations spotted in trial briefs and documents has sown distrust in AI within the legal industry. The ABA’s 2024 Legal Technology Survey found 75% of attorneys cite accuracy concerns as a primary barrier to their AI adoption. But outside the courtroom, there are many applications of AI that could radically reshape legal work, including managing contracts, conducting compliance and risk assessments and protecting intellectual property. These use cases play to LLMs’ strengths: handling unstructured data, pattern recognition, information extraction and data structure analysis.

Your development framework to scale data

The distinction between courtroom risk and contract risk is exactly the advantage we identified at Ironclad. Every business function generates contracts that include information on renewal dates, payment terms, obligations and counterparty details. Training models on this data is a strategy that leverages LLMs’ strengths with minimal risk.
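The kind of field extraction this strategy relies on can be sketched without any model at all. The contract snippet, field names and patterns below are illustrative assumptions of mine; a production system would substitute an LLM with an output schema plus human review, but the target fields are the same.

```python
import re

# Illustrative contract snippet; field names and patterns are assumptions.
CONTRACT = ("This Agreement renews on 2026-01-01 between Acme Corp and "
            "Beta LLC. Payment terms: Net 30.")

patterns = {
    "renewal_date": r"renews on (\d{4}-\d{2}-\d{2})",
    "payment_terms": r"Payment terms:\s*(Net \d+)",
}
extracted = {field: (m.group(1) if (m := re.search(p, CONTRACT)) else None)
             for field, p in patterns.items()}
print(extracted)  # {'renewal_date': '2026-01-01', 'payment_terms': 'Net 30'}
```

The point of the LLM is to handle the long tail of third-party paper where no fixed pattern works; the point of the schema is that every output lands in a verifiable, structured field.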

Compared to legal briefs and documents, where incorrect AI summaries have massive implications, the risk of applying AI to contracts is lower. Our approach at Ironclad is modeled after the automotive industry’s safety framework for deploying autonomous vehicles, the SAE J3016. This standard distinguishes between systems in which humans retain responsibility (Levels 1-2, driver assistance) and those in which machines become accountable (Levels 3-5, automated driving).

Applied to enterprise AI, this risk-based framework clarifies deployment roadmaps and boundaries. We chose to develop our Intake Agent, which extracts contract data from third-party papers, and our Conversational Search Agents, which enable natural language querying of documents, because we saw them as “Level 1-2” applications with low adoption barriers and associated risks. Verifying a contract autonomously, while plausible with today’s LLMs, might seem innocuous, but unsupervised verification could be very risky. There’s a plausible scenario in which an agent could overlook litigation between the two engaged parties and autonomously renew a contract because it can’t access the tort proceedings data.

Using a risk-based framework can help determine where and how to build your AI applications by answering the question: what’s the likelihood we’ll get this right, and what happens if we’re wrong? This calculation, not competitive pressure, should then determine your deployment sequencing.
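That question can be made concrete as a crude expected-value calculation. The function and numbers below are purely illustrative assumptions, not Ironclad's actual methodology:

```python
def deployment_priority(p_correct: float, cost_if_wrong: float,
                        value_if_right: float) -> float:
    """Expected value of automating a task: reward when right minus
    penalty when wrong. Real assessments also weigh regulatory,
    reputational and adoption factors."""
    return p_correct * value_if_right - (1 - p_correct) * cost_if_wrong

# A low-stakes search assistant clears the bar at 95% accuracy;
# autonomous contract renewal at the same accuracy is deeply negative.
print(deployment_priority(0.95, cost_if_wrong=10, value_if_right=5))
print(deployment_priority(0.95, cost_if_wrong=100_000, value_if_right=50))
```

The same accuracy figure produces opposite deployment decisions once the downside is priced in, which is why sequencing by risk rather than by competitive pressure matters.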

Build a foundation, create transformation

During my twenty-plus years of working with AI, I’ve found that the most significant determinant of a technology’s impact on a business is the organization’s fluency with it.

As planning ensues, organizations trying to determine how and where to apply generative AI to their business first need to look inward and ask themselves:

  1. Do you understand the technology’s actual mechanisms? Not marketing promises, but fundamental architecture. LLMs perform statistical pattern matching, not logical reasoning. They require extensive general training followed by domain-specific fine-tuning. Without this understanding, you’ll deploy them for impossible tasks.
  2. How does the industry you’re serving share, collect and store data? What data will you use to tune and fine-tune your models, if any? LLMs are foundational because they require sophisticated pre-training, followed by fine-tuning, to generate meaningful, domain-specific outputs.
  3. Do you have a framework, such as a risk-based model, to prioritize, deploy and assess which products to develop? Start with low-risk, human-supervised applications. Conduct evaluations and build feedback loops. Expand systematically as reliability proves out.

Over my career, I’ve watched people overestimate what AI can do this quarter while underestimating what it will do this decade. Technology evolves. Industries are reshaped. But transformations are created by businesses that do the foundational work.

This article is published as part of the Foundry Expert Contributor Network.
February 23, 2026