Module 03 · Lesson 03
Tool and Environment Selection — Operationalizing the Matrix
Reading time: 20 minutes
Track: Governance & Compliance · REQUIRED before role specialization
Prerequisites: Module 03 · Lessons 01 and 02
What this lesson does
You can classify data. You know the general matrix of what data can go into what kind of environment. This lesson translates that into specific, actionable decisions about which tools to use for which work.
By the end of this lesson, you'll be able to:
- Match specific AI tools to specific biotech work scenarios
- Recognize the trade-offs between capability and data sovereignty
- Identify what's typically already approved in a biotech environment versus what requires special review
- Make defensible tool-selection decisions under time pressure
This lesson is the most "operational" in Module 03. It deals with specific products and configurations. Some of this content will date — frontier AI products evolve faster than guidance documents do. The principles will last; the specific product names may shift.
01 · The current biotech AI tooling landscape
As of late 2025 / early 2026, the AI tooling landscape relevant to biotech sorts into roughly six categories.
Category 1 — Frontier consumer AI
The default tools most professionals use: Claude.ai (consumer), ChatGPT (consumer/Plus), Gemini (consumer). High capability, easy access, no enterprise data-handling guarantees.
Appropriate for: Tier 1 (public data) only. Personal use. Brainstorming with no proprietary content.
Inappropriate for: Anything internal, confidential, restricted, or prohibited. Most biotech work product.
Cost: $0-$30 per user per month for consumer tiers.
Category 2 — Enterprise frontier AI
The same frontier models, but in enterprise environments with data-handling contracts: Claude for Work / Claude Enterprise, ChatGPT Enterprise, Gemini Enterprise (Google Workspace integration), Microsoft 365 Copilot.
Appropriate for: Tier 2 (internal) and most Tier 3 (confidential) work. Routine biotech use.
Inappropriate for: Tier 4 unless your specific enterprise tier provides additional guarantees (zero retention, dedicated environment, BAA where required).
Cost: $25-$60 per user per month, plus admin overhead. This is where most biotechs land for general AI deployment.
Category 3 — Zero-retention / dedicated-tenant enterprise AI
A subset of enterprise offerings with stronger guarantees: dedicated VPCs, zero retention of prompts and outputs, BAA-eligible configurations, audit logs accessible to your security team. Examples: Claude on AWS Bedrock with private endpoints, Azure OpenAI Service in dedicated configurations, Google Cloud Vertex AI with enterprise controls.
Appropriate for: Sensitive Tier 3 and many Tier 4 use cases. The "right" tier for clinical operations work, regulatory drafting, BD/financial work.
Inappropriate for: Anything still on the prohibited list (Lesson 01). The strong environment doesn't override prohibitions.
Cost: Variable, typically $5K-$50K+ per month at organizational scale. Requires IT integration.
Category 4 — On-premises / open-source AI
AI models running entirely on infrastructure you control. Open-source models like Llama, Mistral, Falcon. Deployed on your own GPU infrastructure or in your cloud tenant. Data never leaves your environment.
Appropriate for: Tier 4 use cases where data sovereignty is required. Tier 5 use cases where they can be unblocked by the deployment model (rare; many bright lines apply regardless of environment).
Trade-off: Capability lag. Open-source models are typically 12-18 months behind frontier models in quality. For complex biotech reasoning, this matters. For routine document processing, it may not.
Cost: High up-front (infrastructure, MLOps capability) but low marginal cost per use.
Category 5 — AI-enabled SaaS
Tools that have AI baked in: meeting transcription with AI summaries (Otter, Fireflies), CRM with AI features (Salesforce Einstein, HubSpot), document management with AI (M365, Google Workspace AI features). The AI is one feature among many.
Appropriate for: Depends entirely on the specific tool and configuration. Read the data-handling terms carefully. Many of these tools route data through third-party AI providers that you didn't choose explicitly.
Risk: This is where "invisible AI" lives. People use the tools without realizing they're AI tools and without classifying the data accordingly. The handoff problem from Lesson 02 lives here.
Cost: Often bundled into the SaaS pricing.
Category 6 — Custom internal AI
AI integrations your company builds itself: a Claude or GPT API connected to your internal data via MCP servers, RAG systems, or custom workflows. The model is enterprise-grade but the integration is yours.
Appropriate for: Use cases where your data and your workflows need integration that off-the-shelf tools don't provide. Usually appropriate for the same tier as the underlying model environment (the API call is to a Tier-3-appropriate or Tier-4-appropriate provider).
Risk: The integration layer can introduce vulnerabilities. A well-designed MCP server is safe. A poorly designed one can expose data in ways the underlying model wouldn't.
Cost: Engineering effort to build, low marginal cost to operate.
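The six categories can be held as a small lookup table. The sketch below is one possible reading, not an official mapping: the `max_tier` ceilings are an assumption drawn from the "Appropriate for" lines above, and they flatten qualifiers such as "most Tier 3" and "many Tier 4". The names `ToolCategory`, `candidate_categories`, and `needs_case_review` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolCategory:
    number: int
    name: str
    max_tier: Optional[int]  # None = depends on the specific tool and configuration


# Ceilings are an assumed reading of the "Appropriate for" lines in this lesson.
# They ignore qualifiers like "most" and "many", so case-level approval still applies.
CATEGORIES = [
    ToolCategory(1, "Frontier consumer AI", 1),
    ToolCategory(2, "Enterprise frontier AI", 3),
    ToolCategory(3, "Zero-retention / dedicated-tenant enterprise AI", 4),
    ToolCategory(4, "On-premises / open-source AI", 4),
    ToolCategory(5, "AI-enabled SaaS", None),
    ToolCategory(6, "Custom internal AI", None),
]


def candidate_categories(tier: int) -> list[str]:
    """Return names of categories whose ceiling covers the given data tier."""
    return [
        c.name
        for c in CATEGORIES
        if c.max_tier is not None and c.max_tier >= tier
    ]


def needs_case_review() -> list[str]:
    """Return categories whose tier depends on the specific tool and configuration."""
    return [c.name for c in CATEGORIES if c.max_tier is None]


print(candidate_categories(4))
print(needs_case_review())
```

Tier 5 returns an empty list by construction, which matches the rule that a strong environment does not override prohibitions.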
02 · The biotech-typical stack
Most mid-size biotechs (50-500 people) end up with a stack that looks roughly like this:
General-purpose AI for the workforce:
- Claude Enterprise or ChatGPT Enterprise as the primary tool
- Microsoft 365 Copilot or Google Workspace AI for office productivity integration
Restricted-data AI for sensitive functions:
- AWS Bedrock, Azure OpenAI, or Google Vertex with private endpoints for clinical/regulatory functions
- Sometimes a specialized clinical AI platform (e.g., for safety case processing)
Specialty tools:
- Coding assistants for the data science / IT teams (Cursor, GitHub Copilot, Claude Code)
- Domain-specific tools where they exist (clinical NLP, biomarker analysis, etc.)
What's typically NOT in the stack:
- Consumer ChatGPT or free Claude as official channels (though people use them informally — and Module 03 Lesson 04 covers that)
- Open-source on-prem AI (most mid-size biotechs don't have the infrastructure)
What "approved" means in practice:
- IT/Security has reviewed the tool
- There's a contract with data-handling terms
- It appears on the official "approved tools" list
- Use cases above a certain sensitivity may still require specific approval
If your company hasn't published an approved-tools list, ask for one. If they can't produce one, that's the first AI governance work that needs to happen at your company. (Don't be the person who builds it from scratch unless that's your job — but flag the gap.)
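If it helps to see what an approved-tools list entry might capture, here is a minimal sketch built from the four "approved" criteria above. The field names, the `approval_status` function, and the example entry are all hypothetical; a real list would follow whatever format your IT/Security team uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedToolEntry:
    name: str
    security_reviewed: bool      # IT/Security has reviewed the tool
    contract_in_place: bool      # a contract with data-handling terms exists
    on_approved_list: bool       # appears on the official approved-tools list
    pre_approved_max_tier: int   # above this tier, specific approval is still needed


def approval_status(entry: ApprovedToolEntry, tier: int) -> str:
    """Return 'approved', 'needs specific approval', or 'not approved'."""
    baseline = (
        entry.security_reviewed
        and entry.contract_in_place
        and entry.on_approved_list
    )
    if not baseline:
        return "not approved"
    if tier > entry.pre_approved_max_tier:
        return "needs specific approval"
    return "approved"


# Hypothetical entry, for illustration only.
example = ApprovedToolEntry(
    name="Example Enterprise Assistant",
    security_reviewed=True,
    contract_in_place=True,
    on_approved_list=True,
    pre_approved_max_tier=3,
)

print(approval_status(example, 2))  # approved
print(approval_status(example, 4))  # needs specific approval
```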
03 · The selection decision tree
For any given task, use this decision tree:
Step 1: Classify the data. (Lesson 02 framework.)
Step 2: Identify the tier-appropriate tools available to you.
- Tier 1 → any approved tool, including consumer where allowed
- Tier 2 → enterprise tier of your primary AI provider
- Tier 3 → enterprise tier with explicit data-use guarantees
- Tier 4 → zero-retention / dedicated / on-prem only, with specific approval
- Tier 5 → no AI tool (use other approaches)
Step 3: Within tier-appropriate options, select for task fit.
- Long-document reasoning, regulatory drafting → frontier model (Claude Opus, GPT-5, Gemini Pro)
- Fast routine drafting → smaller model (Claude Sonnet, GPT-5-mini)
- Code/analysis → coding-specialized tool or frontier model
- Document collaboration → AI integrated with your document tooling
- Workflow automation → API-based integration
Step 4: Verify the specific integration is approved.
For tools you haven't used before in your environment, check with IT/Security/Compliance. Don't assume "Claude Enterprise" means "every Claude Enterprise integration." The specifics matter.
Step 5: Confirm session settings.
Some tools have per-session settings that affect data handling: toggling training opt-in or opt-out, choosing a model, selecting an environment. Make sure your settings match the tier you're working in.
This whole tree takes about 30 seconds of thought after a few weeks of practice. The first few times it'll feel heavy. Push through.
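Steps 2 through 5 can be sketched as a short function. This is an illustration under stated assumptions, not a policy engine: it assumes Step 1 (classification) is already done, the task keys and the `select_tool` signature are hypothetical, and the integration names in the usage lines are placeholders.

```python
# Step 2 lookup: tier -> environment, copied from the list in this section.
ENVIRONMENT_BY_TIER = {
    1: "any approved tool, including consumer where allowed",
    2: "enterprise tier of your primary AI provider",
    3: "enterprise tier with explicit data-use guarantees",
    4: "zero-retention / dedicated / on-prem only, with specific approval",
    5: None,  # no AI tool
}

# Step 3 lookup: task -> kind of tool. The keys are hypothetical labels.
MODEL_CLASS_BY_TASK = {
    "long_document_reasoning": "frontier model",
    "regulatory_drafting": "frontier model",
    "fast_routine_drafting": "smaller model",
    "code_analysis": "coding-specialized tool or frontier model",
    "document_collaboration": "AI integrated with your document tooling",
    "workflow_automation": "API-based integration",
}


def select_tool(tier, task, integration, approved_integrations, settings_confirmed):
    """Walk Steps 2-5 for an already-classified task and return a decision string."""
    environment = ENVIRONMENT_BY_TIER[tier]          # Step 2
    if environment is None:
        return "STOP: Tier 5, use a non-AI approach"
    model_class = MODEL_CLASS_BY_TASK[task]          # Step 3
    if integration not in approved_integrations:     # Step 4
        return f"HOLD: check '{integration}' with IT/Security/Compliance"
    if not settings_confirmed:                       # Step 5
        return "HOLD: confirm session settings match the tier"
    return f"PROCEED: {model_class} in {environment}"


approved = {"enterprise-chat"}  # placeholder integration name
print(select_tool(2, "fast_routine_drafting", "enterprise-chat", approved, True))
print(select_tool(2, "fast_routine_drafting", "new-plugin", approved, True))
```

The function returns early at the first failed step, which mirrors how the tree is meant to be used: a failure at any step means stop or escalate rather than continue.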
04 · Common selection mistakes
The five most frequent tool-selection errors:
Mistake 1 — Using consumer for confidential work
Someone is on the go, on a personal device, has an idea, opens consumer ChatGPT. The data they paste is confidential. They don't realize they just exposed it.
The fix: Make enterprise AI as easy to access as consumer AI. If your enterprise tool isn't available on mobile or isn't responsive, people will route around it. This is an IT/policy problem, not a discipline problem — discipline doesn't survive friction.
Mistake 2 — Using the wrong model within the right tool
Someone is in Claude Enterprise, but they default to Haiku (small/fast) for a complex regulatory drafting task. The output is mediocre. They blame "AI."
The fix: Match model size to stakes. Know which models are available to you and when to use which. This is partly tooling design (does the interface make it easy to switch?) and partly habit.
Mistake 3 — Not knowing what AI underlies an integration
Someone uses an AI-enabled SaaS tool without knowing which AI provider powers the AI features. The data goes somewhere the user didn't explicitly authorize.
The fix: Before using any AI feature in any SaaS tool, ask: which AI? In what environment? Find out before you use it on anything non-public.
Mistake 4 — Stale assumptions about what's approved
Approved tools change. A tool that was approved last year may have been retired or restricted. A tool that wasn't approved last year may be approved now.
The fix: Re-check the approved-tools list periodically. Subscribe to updates if your IT team publishes them.
Mistake 5 — Treating "enterprise" as automatically sufficient for all data classes
"Claude Enterprise" or "ChatGPT Enterprise" sounds enterprise. People assume that means "fine for everything." It doesn't — enterprise tiers vary in their data-handling guarantees, and even the best enterprise tier may not be appropriate for some Tier 4 data.
The fix: Read the data-handling terms of the specific enterprise tier you have. Know what's covered and what isn't.
05 · Mobile and remote considerations
A practical reality: AI usage increasingly happens on mobile devices and outside corporate networks. This creates specific risks.
Mobile-specific issues
- Personal devices accessing work AI: Your enterprise AI may or may not be installed on personal devices. If it's not, people will reach for consumer alternatives.
- Mobile keyboards with AI suggestions: Some keyboard apps send typed content to AI services for suggestions. If you type confidential content with one of these, the keyboard is the leak.
- Voice assistants: Siri, Google Assistant, Alexa, etc. — these may transcribe and process voice through AI services. Be aware of what they're hearing during work conversations.
Remote-specific issues
- Home network is not corporate network: Some enterprise AI integrations only work on corporate VPN. Check before assuming.
- Family/personal accounts logged into work browser: It's easy to accidentally use your personal ChatGPT account while working on work content.
- Shared devices: A laptop your spouse also uses may have personal AI tool logins active.
The work-from-anywhere defaults
- Use enterprise AI tools via approved access methods
- Don't fall back to consumer alternatives because they're more convenient
- Treat any device you don't fully control as potentially compromised for confidential AI work
- If you can't access the right tool, do the work later, not now in a wrong tool
This sounds restrictive. It is. The cost of getting this wrong is severe enough to justify the restriction.
06 · The "specialty tool" question
You'll periodically be sold on specialty AI tools — products that promise specific capabilities for biotech use cases. "AI for clinical narrative generation." "AI for protocol design." "AI for safety signal detection."
Some are excellent. Some are wrappers around frontier AI with extra branding. Many are something in between.
When evaluating a specialty tool:
Ask about the underlying AI:
- Which foundation model(s) does this use?
- Is it the latest model or an older one?
- Does the vendor get access to your prompts and outputs?
- Is data used for training (theirs or yours)?
- What are the retention policies?
Ask about the differentiation:
- What does this tool do that I couldn't get from a well-prompted frontier model?
- Is the differentiation the data (proprietary corpora, validated content), the workflow (pre-built templates, integrations), or the model (specialized training)?
- Is the differentiation worth the lock-in?
Ask about the integration:
- How does this tool fit with my existing AI environment?
- Does it create a new data path I need to govern separately?
- Can I export my data and prompts if I want to leave?
Specialty tools can be great fits or expensive mistakes. The questions above help separate them.
07 · Documentation expectations
A piece you'll see throughout Module 03: documenting what AI tool was used, when, and for what.
This is not optional in regulated functions. For most biotech work, you should be able to answer (later, if asked):
- What AI tool did I use for this output?
- What model / version?
- When was the interaction (date / approximate time)?
- What data did I provide?
- What did the AI produce?
- What did I do to verify or revise?
Some companies require this in formal logs. Some embed it in document version history. Some keep it informal. The specific format matters less than the habit.
The lesson here is: assume someone will ask later. Build documentation habits that survive that scrutiny.
Module 03 Lesson 05 develops the audit-trail discipline at depth. For now, just notice that tool selection isn't only about choosing the right tool — it's also about being able to demonstrate later that the choice was right.
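The six questions above map naturally onto a small record. The sketch below shows one way to capture them; the `AIUsageRecord` fields, the `make_record` helper, and the sample values are hypothetical, and your company's required log format (if it has one) takes precedence.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class AIUsageRecord:
    tool: str             # What AI tool did I use?
    model_version: str    # What model / version?
    timestamp: str        # When was the interaction?
    data_provided: str    # What data did I provide?
    output_summary: str   # What did the AI produce?
    verification: str     # What did I do to verify or revise?


def make_record(tool, model_version, data_provided, output_summary,
                verification, when=None):
    """Build a record, stamping the current UTC time unless one is given."""
    when = when or datetime.now(timezone.utc)
    return AIUsageRecord(
        tool=tool,
        model_version=model_version,
        timestamp=when.isoformat(timespec="minutes"),
        data_provided=data_provided,
        output_summary=output_summary,
        verification=verification,
    )


# Placeholder values, for illustration only.
record = make_record(
    tool="Example Enterprise Assistant",
    model_version="example-model-v1",
    data_provided="Ten public abstracts (Tier 1)",
    output_summary="Draft literature summary, two pages",
    verification="Checked every citation against the source abstracts",
)
print(json.dumps(asdict(record), indent=2))
```

A summary of the data and output is usually enough; pasting confidential content into the log itself would create a second copy to govern.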
08 · A scenario-based walkthrough
Three realistic scenarios, with appropriate tool selections.
Scenario 1 — Drafting an internal memo summarizing recent literature
Data class: The literature is Tier 1 (public). Your synthesis is Tier 2 (internal). Net: Tier 2.
Appropriate tools: Any approved enterprise AI. Claude Enterprise, ChatGPT Enterprise, M365 Copilot, etc.
Considerations: Model selection — use a frontier model if the synthesis is complex. Use a smaller model if it's a basic literature roundup.
Documentation: Note which model was used in the memo's version history.
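The "Net: Tier 2" result in Scenario 1 follows a highest-tier-wins rule. That rule is an assumption here, inferred from the scenario rather than quoted from the Lesson 02 framework, and `net_tier` is a hypothetical helper:

```python
def net_tier(component_tiers):
    """Return the highest tier among the components; it governs the whole task."""
    if not component_tiers:
        raise ValueError("classify at least one component")
    return max(component_tiers.values())


memo = {"published literature": 1, "your internal synthesis": 2}
print(net_tier(memo))  # 2
```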
Scenario 2 — Drafting a section of an IND submission
Data class: Pre-submission regulatory content. Tier 4 (Restricted), arguably bright-line Tier 5 in some companies' policies.
Appropriate tools: Zero-retention enterprise (Bedrock with private endpoint, dedicated Azure OpenAI, etc.) ONLY IF your company has explicitly approved AI use for regulatory drafting. Many companies haven't.
Default if approval is unclear: Don't use AI. Draft manually. Escalate to seek approval if you think AI use is appropriate for the workflow.
Documentation: Formal logging required if AI is used. Note model, environment, date, what was provided, what was produced, what verification was done.
Scenario 3 — Brainstorming a competitive response strategy
Data class: The brainstorm itself contains internal strategic thinking. Tier 3 (Confidential).
Appropriate tools: Enterprise AI with explicit data-handling guarantees (zero retention preferred).
Considerations: Avoid pasting non-public competitive intelligence. Stay at the conceptual level. The brainstorm doesn't require pasting your strategic plan into the prompt; it requires asking good strategic questions.
Documentation: Light — note in the resulting strategy doc that AI was used in brainstorming.
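These three scenarios illustrate the patterns behind most biotech AI use cases: routine internal work, restricted regulatory work, and confidential strategy work. Internalize them as templates.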
09 · Knowledge check
Three questions to lock in this lesson.
Q1. You need to draft a section of a pre-IND briefing document. Your company has Claude Enterprise but hasn't published specific approval for AI use in regulatory drafting. What's the right action?
a) Use Claude Enterprise (it's an approved enterprise tool)
b) Use consumer Claude (it's faster)
c) Default to drafting manually; escalate to seek approval for AI use in this workflow if you believe AI use would be appropriate
d) Use an on-premises open-source model
Q2. Which is the most accurate statement about "enterprise" AI tools?
a) Enterprise AI is automatically sufficient for any biotech data class
b) Enterprise AI tiers vary in their data-handling guarantees, and the appropriate tier depends on the data class; not all enterprise offerings are equivalent
c) Enterprise AI is more expensive but functionally identical to consumer AI
d) Enterprise AI should only be used for restricted data
Q3. What's the single highest-leverage practice for reducing risky shadow-AI usage in an organization?
a) Block consumer AI tools at the network level
b) Mandate AI training for all employees
c) Make the approved enterprise AI tools as easy to access (mobile, performance, integrations) as the consumer alternatives; friction drives people to unapproved tools
d) Audit AI usage logs monthly
Answers: Q1: c · Q2: b · Q3: c
10 · What's next
Two more lessons in Module 03:
- Lesson 04 · Decision Frameworks for Gray Zones — the genuinely hard judgment calls
- Lesson 05 · Audit Trails and Documentation — what to record and how
After Module 03 you'll be cleared to begin your role-specific path module.
End of Lesson 03.