The hype around generative AI was unavoidable last year, and despite all the distraction, unproven benefits, and potential pitfalls, Dana-Farber Cancer Institute CIO Naomi Lenane didn’t want to ban the technology outright. It was undeniably groundbreaking, and there were interesting things the institute’s employees might be able to do with it.
But allowing free, unfettered use of the public gen AI platforms was not an option. So DFCI took three main steps to deploy gen AI in a controlled way: it set up a governance framework, built an internal tool that was safe for employees to use, and developed a process for vetting gen AI embedded in third-party systems.
Proactive governance
The governance framework came first.
“More than a year ago, there was the massive hype everywhere you turned,” says Lenane. “So, as an organization, we started talking to our chief legal counsel and senior clinical leaders and some operational leaders. Can our staff use this? What is our guidance to them?”
These leaders and other stakeholders came together to form a gen AI governance community that included the institute’s privacy officer, business leaders, communications, HR, and members of the clinical side — “Everyone from physicians to philanthropy,” she says.
The first step was to issue a statement across the organization: staff could use tools expressly approved for clinical work or anything else, but not the free public tools. From the beginning, users quickly started coming up with gen AI use cases.
“Most departments had specific problems they were trying to solve, or an example of work that could be exponentially more efficient,” she says. “Actual end users were coming from quality and safety, philanthropy, human resources saying, ‘We want to synthesize seven different job descriptions into one,’ with clear use cases. Some of these I hadn’t thought about.”
These were tasks that gen AI could do faster, more efficiently, and, in some cases, even better or more eloquently. “I was pleasantly surprised that people had solid examples of how to make jobs better,” she says. “No one walked in saying, ‘If we do this, I can cut FTEs.’ But they could get more done, and spend more time on human-centered tasks instead of rewriting job descriptions or proposals.”
The governance group developed a training program for employees who wanted to use gen AI, and created privacy and security policies. At the same time, the Institute’s AI enablement group began working on a gen AI project called GPT4DFCI that would run within the organization’s network and look at how to work with vendors adding gen AI to their systems.
Internal development
GPT4DFCI was designed to be used for non-clinical purposes, says Lenane, and was first tested with users last year, with full release at the end of 2023 and beginning of 2024. Now it’s used throughout the organization, including within IT. For example, people are encouraged to use it for documentation, since that’s something many tech people don’t like or want to do, says Lenane. People use it for general research, too. “We encourage people to use it if needed to understand something they don’t know,” she says. “But we’re not sanctioning it or encouraging it yet across the board as a way to code faster.” Instead, GPT4DFCI, based on OpenAI’s GPT-4 Turbo and hosted within the institute’s private cloud on Azure so no data is leaked back to OpenAI, is more of an improved search engine to help people better understand something.
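The article doesn’t detail GPT4DFCI’s implementation, but the hosting pattern described here, a GPT-4 Turbo deployment reached through the institute’s own Azure endpoint rather than OpenAI’s public API, looks roughly like the minimal Python sketch below. The endpoint, key, and deployment name are hypothetical placeholders.

```python
# Minimal sketch of calling a privately hosted Azure OpenAI deployment.
# Endpoint, API key, and deployment name are hypothetical placeholders.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # the organization's own endpoint
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

# Requests go to the organization's Azure deployment, not api.openai.com,
# so prompts and completions stay inside its cloud tenancy.
response = client.chat.completions.create(
    model="gpt-4-turbo-internal",  # hypothetical deployment name
    messages=[
        {"role": "system", "content": "You are an internal assistant."},
        {"role": "user", "content": "Explain retrieval-augmented generation briefly."},
    ],
)
print(response.choices[0].message.content)
```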
Lenane herself uses it to help rewrite emails or documents. “If I’m trying to explain something to my boss, the CFO, I sometimes take the technical paragraphs and use the generative AI tool to make sure it’s worded well or to make it clearer,” she says. She also uses GPT4DFCI, which is augmented with additional AI models for content filtering and auditing, to update or merge job descriptions.
“I used it the other day to announce someone’s promotion,” she adds. “But it was a little too flowery for me. The whole department would have known I hadn’t written it, so I definitely made some edits.”
Gen AI as an ally
DFCI has about 5,000 direct employees and another 5,000 who are leased to it through sponsors or other contractual arrangements, or who are employees of vendors providing services such as housekeeping. And today, Lenane says around 700 people are actively using GPT4DFCI.
“We have it open and available, and people need to sign up to use it after going through some required training,” she says. “After all, in order to get good content out of the tool, you have to ask questions correctly and think about what you put into it to get a solid answer.” And while the governance committee has stipulated that GPT4DFCI can’t be used for clinical purposes, the tool has been reviewed by the privacy and information security teams for safety and efficacy.
This is a useful approach, she says, for other organizations that want to give their people tools to be more efficient without putting the organization at greater risk.
There’s also a Dana-Farber front end, so employees aren’t interacting with the OpenAI chatbot directly. That allows the institute to track usage and queries, and get an overall view of how people use the tool. It allows for security, compliance, PII checks, and other guardrails to be built around it.
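The piece doesn’t describe how those guardrails are built, but a minimal sketch of what a front-end screening layer might do, assuming simple pattern-based PII checks and an audit log (both hypothetical, not DFCI’s actual controls), could look like this:

```python
# Hypothetical guardrail layer: log each query for auditing and block
# obvious PII before it reaches the model. Patterns are illustrative only.
import logging
import re

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("genai.audit")

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US Social Security number
    re.compile(r"\bMRN[:\s]*\d{6,}\b", re.I),    # medical record number
]

def screen_and_log(user_id: str, prompt: str) -> str:
    """Return the prompt if it passes PII screening; otherwise raise."""
    for pattern in PII_PATTERNS:
        if pattern.search(prompt):
            audit_log.warning("Blocked query from %s: possible PII", user_id)
            raise ValueError("Prompt appears to contain PII and was blocked.")
    audit_log.info("Accepted query from %s (%d chars)", user_id, len(prompt))
    return prompt
```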
Some compliance concerns are taken care of as well since GPT4DFCI runs on Azure, a HIPAA-compliant cloud environment, says Renato Umeton, director of AI operations and data science services at Dana-Farber.
The obligation to protect patient privacy and data under HIPAA precluded the institute from using public gen AI services like ChatGPT, he says. And training an LLM from scratch was cost prohibitive.
“While on-prem, open-source LLMs were considered, they’d require significant investment in infrastructure and might not offer the same level of versatility as commercial models like GPT-4,” he says.
When selecting which specific commercial LLM to use, the Institute looked at benchmarks from LMSYS Arena. “GPT-4 consistently emerged as the superior model group,” he adds. “No fine-tuning was applied to the general version accessible to all users. However, we’ve successfully implemented retrieval-augmented generation techniques in various projects.”
Retrieval-augmented generation, or RAG, uses a vector database or other knowledge repository to provide additional context to individual queries, allowing for more accurate and customized results.
“For instance, collaborating with Cornell, we leveraged RAG to synthesize hundreds of manuscripts, significantly informing the forthcoming PathML toolkit release,” Umeton says.
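As a rough, self-contained illustration of the technique, the sketch below retrieves the passages most similar to a query and prepends them to the prompt; it substitutes TF-IDF similarity for the learned embeddings and vector database a production system would use:

```python
# Minimal RAG sketch: retrieve relevant passages, then prepend them to the
# prompt so the model answers from that context. TF-IDF stands in for a
# real vector database and embedding model to keep the example self-contained.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "PathML is a toolkit for computational pathology workflows.",
    "Retrieval-augmented generation grounds model output in retrieved text.",
    "HIPAA governs the handling of protected health information.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]
    return [documents[i] for i in scores.argsort()[::-1][:k]]

query = "What is retrieval-augmented generation?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` would then be sent to the model, e.g. the Azure deployment above.
```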
GPT4DFCI’s API is available to initial technical testers, and many are exploring RAG techniques for reliable information extraction and aggregation, he says. People from operations, basic research and clinical research are using it to explore their own use cases of interest while helping further enhance the GPT4DFCI API. And over the past few months, the GPT4DFCI API discussion channel in Teams has grown from three developers to over a hundred, he says. But to ensure information safety, the application was built within the institution’s Azure perimeter, and security practices included dedicated IP addresses, VPN routing, service-to-service authentication, and HTTPS enforcement.
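The article names these controls without describing their implementation. As one hedged example, service-to-service authentication on Azure is commonly done with the OAuth2 client-credentials flow, roughly as sketched below; the tenant, client, scope, and URL are hypothetical placeholders:

```python
# Hedged sketch of service-to-service authentication via the OAuth2
# client-credentials flow with MSAL. All identifiers are hypothetical.
import os
import msal
import requests

app = msal.ConfidentialClientApplication(
    os.environ["CLIENT_ID"],
    authority=f"https://login.microsoftonline.com/{os.environ['TENANT_ID']}",
    client_credential=os.environ["CLIENT_SECRET"],
)

# App-only token: one service authenticating to another, no user involved.
# (Error handling omitted; a real client checks for "access_token" in the result.)
token = app.acquire_token_for_client(scopes=["api://gpt4dfci-example/.default"])

resp = requests.post(
    "https://genai.internal.example/api/chat",  # HTTPS enforced; hypothetical URL
    headers={"Authorization": f"Bearer {token['access_token']}"},
    json={"prompt": "Summarize this abstract ..."},
    timeout=30,
)
resp.raise_for_status()
```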
“Content filtering was also implemented to minimize harmful content and report non-complying users,” he says.
The DFCI AI governance committee also issues responsible use policies specifically for GPT4DFCI.
“As an example of a policy guardrail, all members of the workforce are required to take responsibility for their work products irrespective of the means of generation,” he adds. So people can only use GPT4DFCI for results they can personally verify, and they must watch out for biased or incomplete information. Disclosure of tool usage is also required in most use cases.
Umeton likened the use of gen AI in hands-on clinical care to that of a new drug. Drugs require clinical trials to assess safety and efficacy, he says. “We should use a similar clinical trial framework to assess all aspects of clinical AI.” But the institute was able to focus its resources on other applications, such as research and operations. These are areas where AI can deliver high return on investment with comparatively lower risk, adds Umeton.
Vendor management
Gen AI is more than just choosing between a free, insecure, public chatbot and a private, controlled one. For many companies, an employee’s main interaction with gen AI will come through the enterprise and productivity software they’re already using. Dana-Farber was no exception.
“All the HR vendors out there, and ours as well, were either releasing or preparing to release gen AI integrations,” says Lenane. For example, the AIs could review documentation or create draft messages.
“There were also business vendors, like the Microsoft Copilot-type solutions,” she says. “Can we use this to make us a prettier PowerPoint, or to rewrite an email to be more clear or less scientific?”
In some cases, the gen AI features are free upgrades to already-approved software systems from vendors with which Dana-Farber has existing contracts.
“If there’s an extra cost, we might be piloting a tool to understand if it’s worth the extra cost and to see what people will do with it,” she adds. The third-party software piloting process began earlier this year, and a Copilot pilot is currently underway to test how well the tool helps people write emails.
The issue of gen AI add-ons to third-party software is something Dana-Farber is looking at carefully. It now falls under the governance group, which ensures key stakeholders, including the supply chain leader, technology leaders, and the privacy officer, are involved in AI-related vendor decisions. For example, a vendor contract might need new language added to cover gen AI use cases. And there are some specific requirements these tools need to comply with.
“We feel strongly that anything generated by AI needs to have a human review, and many of the vendors are building it that way,” she says. “The harder part is the tools that someone purchases outside the process.”
To cover this eventuality, the governance committee is also involved in providing education to the organization.
“We can’t lose rights to our IP, or allow these vendors to use our content to learn for other tools or products,” she says. “These are hard concepts to try to share across 10,000 people, but we’re trying to get there.”
Having leadership teams represented on the governance committee helps get the word out that any gen AI tool needs to be reviewed by legal and information security. “Do we understand what they’re doing with our data?” she says.
Looking ahead
Umeton says executive sponsorship and a multidisciplinary governance committee were critical factors in the institute’s gen AI deployment. For GPT4DFCI, a careful rollout process was also key, starting with a small group of advanced users and gradually expanding access.
“GPT4DFCI has empowered our workforce by providing the equivalent of an exceptional intern capable of drafting quality work output based on online information, yet requiring oversight because they have zero years of work experience,” he says.
Next, the institute will create training courses on prompt engineering techniques and the ethics of using gen AI. On the governance side, Dana-Farber will continue to refine its AI policies, communicate with the community, review usage, discuss ethical considerations, and stay updated on external regulations and industry learnings.
Staying updated isn’t always easy, he admits. With the rapid development in AI, including multimodal models and agents, the influx of information can be overwhelming. Umeton himself has a bot that monitors news outlets and social media feeds and compiles a weekly digest for his review.
“The advent of ChatGPT was just the beginning,” he says. “With advancements in latency reduction and video stream processing, we’re approaching a future where AI agents could become ubiquitous in our physical world.”
These agents could assist with daily tasks and productivity, he says. But what’s most important to the business is designing and deploying AI to help with its clinical and research operation. “Our ultimate professional mandate is to support Dana-Farber’s mission of reducing the burden of cancer in the world.”