Getting (and keeping) NLP models safely in production

Getting natural language processing (NLP) models into production is a lot like buying a car. In both cases, you set your parameters for the desired outcome, test several approaches, likely retest them, and the minute you drive off the lot, value starts to plummet. Like a car, an NLP or AI-enabled product delivers many benefits, but the maintenance never stops; at least, it shouldn't if the product is to function properly over time.

While productionizing AI is hard enough, ensuring the accuracy of models down the line in a real-world environment can present even bigger governance challenges. A model's accuracy begins to degrade the moment it hits the market, because the predictable research environment it was trained in behaves differently from real life, just as the highway is a different environment from the dealership lot.

This is called concept drift: when the variables a model relies on change, the concept it learned may no longer hold. Drift is nothing new in the field of AI and machine learning (ML), but it continues to challenge users. It's also a contributing factor as to why, despite huge investments in AI and NLP in recent years, only around 13% of data science projects actually make it into production (VentureBeat).
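Drift can be made visible before it silently erodes accuracy. Below is a minimal sketch in Python of one common approach, comparing the distribution of a model input or score in production against its training-time baseline with a two-sample Kolmogorov-Smirnov test; the data, names, and threshold are illustrative assumptions, not from the article.

    import numpy as np
    from scipy.stats import ks_2samp

    def drift_detected(baseline_scores, live_scores, alpha=0.01):
        # A low p-value suggests the live distribution no longer matches
        # the distribution the model was trained on.
        statistic, p_value = ks_2samp(baseline_scores, live_scores)
        return p_value < alpha

    rng = np.random.default_rng(0)
    baseline = rng.normal(0.0, 1.0, 5000)    # scores at training time
    live = rng.normal(0.4, 1.0, 5000)        # scores after the world shifted
    print(drift_detected(baseline, live))    # True: the distribution has moved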

So what does it take to move products safely from research to production? And, arguably just as important, what does it take to keep them performing accurately in production as conditions change? There are a few considerations enterprises should keep in mind to make sure their AI investments actually see the light of day.

Getting AI models into production

Model governance is a key component in productionizing NLP initiatives and a common reason so many products remain projects. Model governance covers how a company tracks the activity, access, and behavior of models in a given production environment. Monitoring these is essential to mitigate risk, troubleshoot, and maintain compliance. The concept is well understood across the global AI community, but it's also a thorn in its side.

Data from the 2021 NLP Industry Survey showed that high-accuracy tools that are easy to tune and customize were a top priority among respondents. Tech leaders echoed this, noting that accuracy, followed by production readiness and scalability, is vital when evaluating NLP solutions. Constant tuning is key to models performing accurately over time, but it's also the biggest challenge practitioners face.

NLP projects typically involve pipelines, in which the results of a previous task or a pre-trained model feed downstream steps. Models often need to be tuned and customized for their specific domains and applications. For example, a healthcare model trained on academic papers and medical journals will not perform the same when used by a media company to identify fake news.
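To make the domain-mismatch point concrete, here is a hedged sketch using scikit-learn (the article names no toolkit; the corpora and labels are hypothetical). A text classifier fit on one domain's vocabulary has little signal to offer on another's:

    from sklearn.pipeline import Pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Tiny stand-in for a medical-journal training corpus.
    medical_texts = [
        "randomized controlled trial shows reduced mortality",
        "patient cohort exhibited elevated glucose levels",
    ]
    medical_labels = [1, 0]

    clf = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("model", LogisticRegression()),
    ])
    clf.fit(medical_texts, medical_labels)

    # Out-of-domain input: almost none of its vocabulary appeared in
    # training, so the prediction rests on the intercept alone and
    # carries no real signal.
    print(clf.predict(["viral headline spreads misinformation online"]))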

Better searchability and collaboration within the AI community will play a key role in standardizing model governance practices. This includes storing modeling assets in a searchable catalog: notebooks, datasets, resulting measurements, hyperparameters, and other metadata. Enabling reproducibility and sharing of experiments across data science teams will likewise benefit those trying to get their projects to production grade.
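The article does not prescribe tooling, but experiment trackers are one common way to build such a catalog. A minimal sketch with MLflow, with all run names, parameters, and metrics invented for illustration:

    from pathlib import Path
    import mlflow

    with mlflow.start_run(run_name="clinical-ner-candidate-7"):
        # Hyperparameters, so the run can be reproduced and compared.
        mlflow.log_param("learning_rate", 3e-5)
        mlflow.log_param("train_corpus", "emr_notes_v3")
        # Resulting measurements.
        mlflow.log_metric("f1", 0.91)
        # Notebooks, datasets, and other artifacts stored alongside the run.
        Path("run_notes.txt").write_text("tokenizer vocabulary pinned to v3")
        mlflow.log_artifact("run_notes.txt")
        # Free-form metadata that makes the catalog searchable.
        mlflow.set_tag("stage", "experiment")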

More tactically, rigorous testing and retesting is the best way to ensure models behave the same in production as they do in research, two very different environments. Versioning models that have advanced beyond an experiment to a release candidate, testing those candidates for accuracy, bias, and stability, and validating models before launching them in new geographies or populations are practices every team should exercise.
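What such a release-candidate gate might look like in practice is sketched below; the metric names and thresholds are assumptions chosen for illustration, since acceptable accuracy, bias, and stability levels are application-specific:

    def ready_for_release(metrics: dict) -> bool:
        checks = {
            # Accuracy on the held-out evaluation set.
            "accuracy": metrics["accuracy"] >= 0.90,
            # Bias: worst performance gap across subgroups stays in tolerance.
            "subgroup_gap": metrics["max_subgroup_gap"] <= 0.05,
            # Stability: small input perturbations barely move predictions.
            "perturbation_delta": metrics["perturbation_delta"] <= 0.02,
        }
        for name, passed in checks.items():
            print(f"{name}: {'pass' if passed else 'FAIL'}")
        return all(checks.values())

    candidate = {"accuracy": 0.93, "max_subgroup_gap": 0.04,
                 "perturbation_delta": 0.01}
    print(ready_for_release(candidate))  # True: safe to version as a release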

As with any software launch, security and compliance should be baked into the strategy from the start, and AI projects are no different. Role-based access control, an approval workflow for model releases, and storage of all the metadata needed for a full audit trail are among the security measures necessary for a model to be considered production-ready.
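As one possible shape for those controls, here is a small sketch combining role-based approval with an audit-trail record; the roles, fields, and two-approver rule are illustrative assumptions:

    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    APPROVER_ROLES = {"ml_lead", "compliance_officer"}

    @dataclass
    class ReleaseRequest:
        model_id: str
        version: str
        approvals: list = field(default_factory=list)
        audit_log: list = field(default_factory=list)  # full audit trail

        def approve(self, user: str, role: str):
            # Role-based access control: only designated roles may approve.
            if role not in APPROVER_ROLES:
                raise PermissionError(f"role {role!r} may not approve releases")
            self.approvals.append(user)
            self.audit_log.append(
                (datetime.now(timezone.utc).isoformat(), user, role, "approved")
            )

        def can_release(self) -> bool:
            # Require two distinct approvers before the model ships.
            return len(set(self.approvals)) >= 2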

These practices can significantly improve the chances of AI projects moving from ideation to production. More importantly, they help set the foundation for practices that should be applied once a product is customer-ready.

Keeping AI models in production

Back to the car analogy: there's no definitive “check engine” light for AI in production, so data teams need to monitor their models constantly. Unlike in traditional software projects, data scientists and engineers should stay on the project even after the model is deployed.
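Since that light doesn't exist off the shelf, teams usually build one. A minimal sketch: a rolling window over recent labeled outcomes that flags the model when accuracy sags (the window size and threshold below are illustrative):

    from collections import deque

    class ModelMonitor:
        def __init__(self, window=500, accuracy_floor=0.85):
            self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = wrong
            self.accuracy_floor = accuracy_floor

        def record(self, prediction, actual):
            self.outcomes.append(int(prediction == actual))

        def check_engine_light(self) -> bool:
            if len(self.outcomes) < self.outcomes.maxlen:
                return False  # not enough evidence to judge yet
            accuracy = sum(self.outcomes) / len(self.outcomes)
            return accuracy < self.accuracy_floor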

From an operational standpoint, this requires more resources, both in human capital and in cost, which may be why so many organizations fail to do it. The pressure to keep up with the pace of business and move on to the next thing also factors in, but perhaps the biggest oversight is that even IT leaders don't expect model degradation to be a problem.

In healthcare, for example, a model can analyze electronic medical records (EMRs) to predict a patient's likelihood of needing an emergency C-section based on risk factors such as obesity, smoking or drug use, and other determinants of health. If the patient is deemed high-risk, their practitioner may ask them to come in earlier or more frequently to reduce pregnancy complications.

The expectation is that these risk factors remain constant over time, and while many of them do, the patient is less predictable. Did they quit smoking? Were they diagnosed with gestational diabetes? There are also nuances in the way the clinician asks a question and records the answer in the hospital record that could result in different outcomes.

This becomes even trickier when you consider the NLP tools most practitioners are using. A majority (83%) of respondents to the aforementioned survey said they used at least one of the following NLP cloud services: AWS Comprehend, Azure Text Analytics, Google Cloud Natural Language AI, or IBM Watson NLU. While the popularity and accessibility of cloud services are obvious, tech leaders cited difficulty tuning models and cost as major challenges. In other words, even experts grapple with maintaining the accuracy of models in production.

Another problem is that it simply takes time to see when something's amiss, and how long that takes can vary significantly. Amazon might update a fraud-detection algorithm and mistakenly block customers in the process; within hours, maybe even minutes, customer service emails will point to an issue. In healthcare, it can take months to gather enough data on a certain condition to see that a model has degraded.

Essentially, keeping models accurate requires the same rigor of testing, automated retrain pipelines, and measurement that was applied before the model was deployed. When dealing with AI and ML models in production, it's more prudent to expect problems than to expect optimal performance several months out.
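Tying the threads together, a retrain loop might reuse the pre-launch evaluation to decide whether an automatically retrained candidate should replace the production model. Every name below is an assumption; the article prescribes the rigor, not the tooling:

    def retrain_if_degraded(degraded: bool, train_fn, eval_fn, current_score):
        if not degraded:
            return None  # model still healthy; no action needed
        candidate = train_fn()                # retrain on fresh data
        candidate_score = eval_fn(candidate)  # same pre-launch test battery
        # Promote only if the retrained model actually beats production.
        return candidate if candidate_score > current_score else None

    new_model = retrain_if_degraded(
        degraded=True,
        train_fn=lambda: "model_v2",     # stand-in for a real training job
        eval_fn=lambda model: 0.92,      # stand-in for the evaluation suite
        current_score=0.88,
    )
    print(new_model)  # "model_v2" ships because it beat the current model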

When you consider all the work it takes to get models into production and keep them there safely, it's understandable why 87% of data projects never make it to market. Despite this, 93% of tech leaders indicated that their NLP budgets grew by 10-30% compared to last year (Gradient Flow). It's encouraging to see growing investment in NLP technology, but it's all for naught if businesses don't account for the expertise, time, and continual updating required to deploy successful NLP projects.

