Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies
UL’s leap into the genAI evaluation business raises key questions

UL Solutions, part of the UL enterprise that grew out of Underwriters Laboratories, on Monday jumped into the crowded genAI third-party evaluation service market, joining Stanford University and Microsoft, among many others, but with a more customized approach. The UL team will be asking questions as well as analyzing code. 

Some analysts and others in the AI space have questioned how reliable and precise such an effort can be. Would the workers handling the value-add genAI code know the answers to those questions? More cynically, would those workers (or contractors) answer every question honestly, or would they be more likely to tell the UL team what they think it wants to hear, in order to secure the most favorable rating?

Another issue — and this applies to all of the third-party genAI evaluation efforts — is that the foundation model is off limits. But when evaluating questions about bias, accuracy, and related topics, the foundation model absolutely colors the results. 

Forrester VP/principal analyst Brandon Purcell said the foundation model is a critical element of these kinds of evaluations.

“Even if you get perfect answers to 20,000 questions, an AI is going to have a black box in the middle of it. These foundation models are opaque,” Purcell said, raising questions about the value of the analysis if the foundation model is off-limits.

Another analyst, Paul Smith-Goodson, VP/principal analyst for Moor Insights & Strategy, agreed with the need to factor in the foundation model in any safety or reliability analysis.

“To evaluate layers on top of that, they are also going to have to evaluate how it interacts with the model below it,” Smith-Goodson said. “I think it’s very difficult, very complicated to evaluate the safety aspects of AI because there are so many models. How can they take all of that into consideration? I just don’t know how they are going to do all of that.”

Still some benefits

Despite the potential limitations, Forrester’s Purcell said there are still some benefits to what UL is planning.

“I do think that this is meaningful, but it’s also the best we can do right now, given that there is no transparency,” Purcell said. “This is largely in service of marketing, trying to help companies bridge the trust gap. They really need to bridge that trust gap.”

Assaf Melochna, president of AI vendor Aquant, said good third-party evaluations of genAI code are “increasingly needed, given the hype and explosion of new AI solutions over the past year.”

That said, Melochna stressed that this might be beyond the realistic capabilities of a third party.

“I’ve worked with UL in the past and respect the quality of their work, [but] this rating system risks oversimplifying the complexity of enterprise needs. The UL certification might provide a baseline level of trust, helpful in filtering out weaker solutions, but it shouldn’t replace the in-depth evaluations required to find AI tools that solve specific operational challenges,” Melochna said. “The UL rating should be treated as one piece of the puzzle, not the whole picture. CIOs will need to pair these ratings with their own due diligence to ensure that AI solutions align with business goals.”

[Image: Sample UL Verified Mark for AI Model Transparency]

UL’s new offering “assesses AI model transparency, which is the ability to understand how an AI system makes decisions and produces specific results,” said the company’s news release. “By examining key areas such as data management, model development, security, deployment, and ethical considerations, the benchmark provides a clear and objective rating of an AI system’s transparency and trustworthiness that results in a marketing claim verification.

“A UL Verified Mark for AI Model Transparency may be displayed on products achieving a rating. Systems are awarded a score between 0 and 100 points, with higher scores indicating greater transparency. A score of 50 or less is considered ‘not rated,’ indicating significant transparency issues. Scores between 51 and 60 are rated as Silver, reflecting moderate transparency. Scores between 61 and 70 are rated as Gold, indicating high transparency. Scores between 71 and 80 are rated as Platinum, reflecting very high transparency. Scores of 81 and above are rated as Diamond, indicating exceptional transparency.”
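The published thresholds map scores to tiers mechanically, which can be captured in a short lookup. A minimal sketch (the function name and Python rendering are illustrative, not part of UL's materials):

```python
def transparency_tier(score: int) -> str:
    """Map a UL AI Model Transparency score (0-100) to its published tier."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 50:
        return "Not rated"   # significant transparency issues
    if score <= 60:
        return "Silver"      # moderate transparency
    if score <= 70:
        return "Gold"        # high transparency
    if score <= 80:
        return "Platinum"    # very high transparency
    return "Diamond"         # exceptional transparency (81+)
```

Note that every boundary is inclusive at the top of its band, so a 60 is still Silver and a 61 is Gold.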

Jason Chan, the UL Solutions VP for data and innovation, said in a CIO interview that the company chose to only evaluate what the enterprise adds on top of the foundation model. 

“Any AI model by design is not transparent. How did it learn all of that stuff? Can you explain how the model works? What was included in the training set of data? How well has it removed data bias?” Chan said. “This is a big undertaking: a couple of months for us to complete the assessment.”

When asked what kind of pricing an enterprise could expect, Chan said it would be based on the enterprise’s needs, the size of the various AI models, and what needs to be evaluated. He declined to offer any guidance on specific pricing.

“How much data is involved and how big is your model? The pricing model is twofold: the survey evaluation and the initial assessment,” he said. “There are also elements of licensing fees so that you can continue to be kept up to date. We need to reassess every 12 months because of data drifting, as the models start to deviate from the original intent.”
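Chan does not describe how UL measures that drift, but one common way such deviation is quantified is the population stability index (PSI), which compares how a distribution of inputs has shifted between two points in time. A minimal sketch under that assumption (the metric choice, bin proportions, and the 0.2 rule of thumb are illustrative, not UL's methodology):

```python
import math

def population_stability_index(expected, actual, eps=1e-6):
    """Compare two lists of bin proportions (each summing to ~1.0).

    PSI = sum((a - e) * ln(a / e)) over bins; values above roughly 0.2
    are a common rule of thumb for significant distribution drift.
    """
    psi = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # guard against empty bins
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi

# Example: proportions of model inputs falling into 4 feature bins
# at the initial assessment vs. 12 months later.
baseline = [0.25, 0.25, 0.25, 0.25]
current = [0.40, 0.30, 0.20, 0.10]
print(round(population_stability_index(baseline, current), 3))  # → 0.228
```

A result above the 0.2 threshold would suggest the model's inputs have deviated enough from the original assessment conditions to warrant the kind of reassessment Chan describes.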


Read More from This Article: UL’s leap into the genAI evaluation business raises key questions

Category: News · October 16, 2024


    Tiatra LLC.

    Tiatra, LLC, based in the Washington, DC metropolitan area, proudly serves federal government agencies, organizations that work with the government, and other commercial businesses and organizations. Tiatra specializes in a broad range of information technology (IT) development and management services, incorporating solid engineering, attention to client needs, and meeting or exceeding any required security parameters. Our small yet innovative company pairs a full complement of technical experts with hands-on management to provide a high level of service and competitive pricing for your systems and engineering requirements.


    Tiatra, LLC
    Copyright 2016. All rights reserved.