Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies

NASA accelerates science with gen AI-powered search

When you generate and collect as much data as the US National Aeronautics and Space Administration (NASA) does, finding just the right dataset for a research project can be a problem.

With seven operating centers, nine research facilities, and more than 18,000 staff, the agency continually generates an overwhelming amount of data, which it stores in more than 30 science data repositories across five topical areas: astrophysics, heliophysics, biological and physical sciences, earth science, and planetary science. Overall, the agency houses more than 88,000 datasets and 715,000 documents across 128 data sources. Its earth science data alone is expected to hit 250 petabytes by 2025. Given such complexity, scientists need more than just domain expertise to navigate it all.

“It requires researchers to know which repository to go to and what that repository has,” says Kaylin Bugbee, NASA data scientist at Marshall Space Flight Center in Huntsville, Ala. “You have to be both science literate and data literate.”

In 2019, NASA’s Science Mission Directorate (SMD) released a report, based on a series of interviews with scientists, that made it clear they needed a centralized search capability to find the data they required. The SMD’s mission is to engage with the US science community, sponsor scientific research, and use aircraft, balloon, and spaceflight programs for investigations in Earth orbit, in the Solar System, and beyond. Recognizing that giving scientists and researchers access to its data was fundamental to its purpose, SMD responded to the report by developing its Open Source Science Initiative (OSSI), an effort to make publicly funded scientific research transparent, inclusive, accessible, and reproducible. The mission of the OSSI is a commitment to the open sharing of software, data, and knowledge (including algorithms, papers, documents, and ancillary information) as early as possible in the scientific process.

“It really came from the scientists and scientific community, and it also aligns with our broader SMD priority of enabling interdisciplinary science,” Bugbee says. “That’s where new discoveries are made.”

To facilitate that mission, the agency is now turning to a combination of neural nets and generative AI to put those vast amounts of data at scientists’ fingertips. 

Restoring order

A key element of OSSI is the Science Discovery Engine (SDE), a centralized search and discovery capability for all of NASA’s open science data and information, powered by Sinequa’s enterprise search platform.

“Until the SDE was created, you couldn’t go to a single place to search for our open data and documentation,” Bugbee says. “Now it serves as a single search capability for our open science data.”

New York-based Sinequa, which got its start more than two decades ago with a semantic search engine, focuses on leveraging AI and large language models (LLMs) to deliver contextual search information. It has since integrated Microsoft’s Azure OpenAI Service with its own neural search capabilities to power the platform.

Specifically, Sinequa’s neural search capability uses a combination of keyword and vector search to discover information, while its GPT summarizes the information gathered into rapidly digestible and reusable formats. It also allows scientists to use natural language to ask deeper questions and refine the search or the response. The SDE understands nearly 9,000 different scientific terms, with that number expected to grow as the AI learns.
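The idea of combining keyword and vector search can be illustrated with a minimal sketch. This is not Sinequa's implementation, which the article does not describe in detail. The sample documents, the function names, and the `alpha` weighting parameter are all hypothetical, and character-trigram counts stand in for the neural embeddings a real system would use.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; the texts are invented for illustration only.
DOCS = {
    "d1": "solar wind measurements from heliophysics missions",
    "d2": "sea surface temperature earth science dataset",
    "d3": "ocean temperature records from satellite observations",
}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def keyword_score(query, doc):
    """Number of query terms that appear verbatim in the document."""
    doc_terms = set(tokenize(doc))
    return sum(1 for term in tokenize(query) if term in doc_terms)


def embed(text):
    """Toy stand-in for a neural embedding: character-trigram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query, docs, alpha=0.5):
    """Rank documents by a weighted blend of keyword and vector scores.

    alpha is an assumed tuning knob: 1.0 means keyword only,
    0.0 means vector similarity only.
    """
    q_vec = embed(query)
    kw = {doc_id: keyword_score(query, text) for doc_id, text in docs.items()}
    max_kw = max(kw.values()) or 1  # avoid dividing by zero when nothing matches
    scored = {
        doc_id: alpha * kw[doc_id] / max_kw
        + (1 - alpha) * cosine(q_vec, embed(text))
        for doc_id, text in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    print(hybrid_search("ocean temperature", DOCS))
```

In this sketch, the keyword side rewards exact term matches, while the vector side can still surface documents that share no exact query term but are similar in form. A production system would replace `embed` with a trained language model and likely use a more robust score-fusion method.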

Bugbee and her interdisciplinary team, which includes scientists with expertise in data stewardship and informatics, as well as developers and AI and ML experts, worked closely with stakeholders to understand their needs, and also with NASA’s Office of the CIO and Sinequa to build a proof of concept.

“They helped us set up the environment we needed,” she explains. “We had to have an open capability, so we had some special architectural needs.”

Bugbee says one of her team’s biggest challenges in getting everything up and running was how dispersed content was across the NASA ecosystem. Her team spent about a year trying to understand the information landscape, the data, and the metadata schemas.

“All of the contextual information that really brings richness to the data — things like code and GitHub, or algorithm documentation that describes how the data was developed — that kind of content is spread over a number of web pages and it’s been an effort to curate and identify where all those things reside,” she says.

Cleared for launch

Bugbee is no stranger to data management and data stewardship. She cut her teeth in the field working to improve metadata quality in Data.gov and on President Obama’s Climate Data Initiative. But working on the SDE really drove home the importance of a good curation workflow: the processes for principled and controlled data creation, maintenance, and management.

“If I could go back in time, I’d have a more robust curation workflow built in from the beginning,” she says. “We kind of used the out-of-the-box approach to start and it worked for a time, but to really get the results we wanted, we needed that curation workflow.”

While the SDE is still in beta, Bugbee says her team has received a great deal of positive feedback from scientists to date, and the plan is to deliver a more fully operational system later this year. Already the team has implemented a new user interface that allows users to filter by topics before they begin their search.

Aerospace and Defense Industry, Artificial Intelligence, CIO, Data Center Management, Data Management, Data Mining, Data Quality, Data Scientist, Digital Transformation, Generative AI, IT Leadership
Source: News

Category: News
January 15, 2024
Tags: art

    Tiatra LLC.

    Tiatra, LLC, based in the Washington, DC metropolitan area, proudly serves federal government agencies, organizations that work with the government and other commercial businesses and organizations. Tiatra specializes in a broad range of information technology (IT) development and management services incorporating solid engineering, attention to client needs, and meeting or exceeding any security parameters required. Our small yet innovative company is structured with a full complement of the necessary technical experts, working with hands-on management, to provide a high level of service and competitive pricing for your systems and engineering requirements.


    Tiatra, LLC
    Copyright 2016. All rights reserved.