HaiPhai.AI Fluency for Biotech

Capstone — Your Computational Biologist's AI Playbook

Module 04B · Lesson 05

Reading time: 18 minutes (plus 90-minute workshop)
Track: Role Path (Computational Biology) · Capstone
Prerequisites: Module 04B · Lessons 01-04


What this lesson does

You've finished four lessons covering AI for computational biology. This capstone turns what you learned into a personal playbook you'll start using this week.

By the end of this lesson, you'll have produced:

  1. A personal prompt library — 10-15 prompts for your common work
  2. A workflow map for your top 3-5 recurring projects
  3. A verification checklist tuned to computational work
  4. A 30-day plan to internalize the capability

This follows the same structure as Module 04A's capstone, customized for computational work. Set aside 90 minutes.


01 · Why a personal playbook matters (revisited for comp bio)

Computational biologists are particularly susceptible to one failure mode: thinking that because we work in code, we'll naturally remember our workflows.

We don't. Unless you write them down, your effective prompts get lost. The good workflow you developed for a project six months ago isn't available for the new project starting today.

The playbook is the bridge from episodic mastery to compounded capability. In a year, you'll wonder why you didn't build it sooner.


02 · Part One — Your prompt library

Build a file (Markdown, OneNote, Notion, whatever you'll use). First section: prompt library.

Structure for each entry

## [Prompt name]

**Use case:** [When this is appropriate]
**Status:** [Draft / Tested / Reliable]
**Last used:** [Date]

**Prompt:**
[Full text]

**Common refinements:**
- [Refinement 1]
- [Refinement 2]

**Verification:**
- [What to check after running]
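
If you keep the library in a Markdown file, one optional way to keep every entry consistent is to hold entries as small records and render them. The following is a minimal sketch, not part of the course material: the `PromptEntry` class and its field names are hypothetical and simply mirror the template fields above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

VALID_STATUSES = {"Draft", "Tested", "Reliable"}


@dataclass
class PromptEntry:
    """One prompt-library entry (hypothetical structure mirroring the template)."""
    name: str
    use_case: str
    prompt: str
    status: str = "Draft"
    last_used: Optional[date] = None
    refinements: List[str] = field(default_factory=list)
    verification: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the entry in the same layout as the template."""
        if self.status not in VALID_STATUSES:
            raise ValueError(f"Unknown status: {self.status}")
        last_used = self.last_used.isoformat() if self.last_used else "never"
        lines = [
            f"## {self.name}",
            "",
            f"**Use case:** {self.use_case}",
            f"**Status:** {self.status}",
            f"**Last used:** {last_used}",
            "",
            "**Prompt:**",
            self.prompt,
            "",
            "**Common refinements:**",
            *[f"- {item}" for item in self.refinements],
            "",
            "**Verification:**",
            *[f"- {item}" for item in self.verification],
        ]
        return "\n".join(lines)


# Illustrative entry; the wording is made up for the example.
entry = PromptEntry(
    name="Code debugging",
    use_case="A script fails and the traceback alone does not explain why",
    prompt="Senior software engineer with deep experience in [language/domain]. ...",
    refinements=["Ask for the minimal fix only"],
    verification=["Rerun the failing case after applying the fix"],
)
markdown = entry.to_markdown()
```

Rendering from a record is a design convenience only. A hand-edited Markdown file works just as well if you will actually maintain it.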

The 10-15 prompts to start with

Build these now.

Prompt 1 — Code generation from specification

From Lesson 02. Customize for your typical languages and libraries.

Prompt 2 — Code debugging

Senior software engineer with deep experience in [language/domain]. I have an error in the following code. The error message is [error]. Diagnose the root cause and propose a fix.

Address:
1. What's the actual root cause (not just the symptom)?
2. Why is this code producing this error?
3. What's the minimal fix?
4. Are there related issues this code might also have?

Be specific. Don't paper over the bug with a workaround.

[Code attached]

Prompt 3 — Code review and refactoring

Senior engineer reviewing the following code. Provide:
1. Specific issues (correctness, performance, readability, maintainability)
2. Suggested refactor with reasoning
3. Test cases I should add
4. Code smells worth addressing now vs. tolerable for now

[Code attached]

Prompt 4 — Explain unfamiliar code

From Lesson 02.

Prompt 5 — Methodology consultation

From Lesson 03. Customize for your domain.

Prompt 6 — Stress test before finalizing

From Lesson 03.

Prompt 7 — Visualization choice and code

Senior data visualization expert. I have data with [structure] and want to communicate [message] to [audience].

Provide:
1. 2-3 chart types that could work, with trade-offs
2. Recommendation with reasoning
3. Code to produce the recommended chart in [Python/R] with [library]
4. Specific design choices (color, scale, annotations) and why

[Data description]

Prompt 8 — Plain-language summary

From Lesson 04.

Prompt 9 — Structured report drafting

From Lesson 04.

Prompt 10 — Presentation outline

From Lesson 04.

Prompt 11 — Pre-meeting prep

Senior computational scientist preparing for a 30-minute discussion with [audience]. The topic is [analysis]. Key findings: [bullets].

Generate:
1. Single sentence to lead with
2. 3-5 likely questions
3. Concise answers to each
4. The one thing they should take away

Prompt 12 — Pipeline architecture

Senior bioinformatician designing a [pipeline type]. Walk me through architecture for [setup: data type, scale, deliverables].

Cover:
1. Tool choices with justification
2. Pipeline orchestration approach
3. Containerization strategy
4. Reproducibility requirements
5. Failure handling
6. Performance considerations

Prompt 13 — Method comparison

Senior [domain] biologist. I'm choosing between [Method A] and [Method B] for [task]. For my specific context [details], compare the methods on:
1. Statistical assumptions and how they hold for my data
2. Strengths and limitations
3. Implementation effort
4. Standard practice in the field
5. Your recommendation with reasoning

Prompt 14 — Adversarial review of analysis

Senior biostatistician reviewing the following analysis. Identify:
1. Claims that exceed what the data supports
2. Methodological choices that might be questioned
3. Alternative explanations for the findings
4. Specific revisions to make the analysis more defensible

Be specific. Don't manufacture issues if the analysis is solid.

[Analysis details]

Prompt 15 — Documentation generation

Senior bioinformatician documenting code/pipeline for handoff to other team members. Generate documentation that covers:

1. Purpose (what it does and why)
2. Inputs (format, location, requirements)
3. Outputs (format, location, interpretation)
4. Dependencies (software, versions)
5. How to run
6. How to validate the run worked
7. Known issues and gotchas
8. Maintenance notes

Audience: bioinformaticians joining the team next month.

[Code/pipeline]
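
The prompts above rely on bracketed placeholders such as [language/domain] and [error]. A small helper can fill them in and refuse to produce a prompt that still has a placeholder left in it. This is a sketch under that assumption; `fill_prompt` is a hypothetical name, not a function from any course tooling.

```python
import re
from typing import Dict

# Matches a single [placeholder] with no nested brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_prompt(template: str, values: Dict[str, str]) -> str:
    """Replace [placeholder] spans in a template with supplied values.

    Raises ValueError naming any placeholder left unfilled. Substituted
    values are not rescanned, so pasted code containing brackets is safe.
    """
    missing = []

    def substitute(match):
        key = match.group(1)
        if key in values:
            return values[key]
        missing.append(key)
        return match.group(0)

    filled = PLACEHOLDER.sub(substitute, template)
    if missing:
        raise ValueError(f"Unfilled placeholders: {sorted(set(missing))}")
    return filled


template = (
    "Senior software engineer with deep experience in [language/domain]. "
    "I have an error in the following code. The error message is [error]."
)
prompt = fill_prompt(
    template,
    {"language/domain": "Python for single-cell analysis", "error": "KeyError: 'gene_id'"},
)
```

Failing loudly on a missing value is deliberate: a prompt sent with a literal "[error]" in it tends to produce a generic answer that looks plausible.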

03 · Part Two — Your workflow map

Identify the 3-5 recurring projects in your work. For each, map the workflow.

Example workflow: Standard RNA-seq analysis project

## Workflow: Bulk RNA-seq analysis project

**Trigger:** Bench team has sequencing data and biological question
**Output:** Analysis report, code, figures, recommendations
**Time:** 1-2 weeks with AI (vs. 2-4 weeks without)

**Steps:**

1. Project scoping (manual, 1-2 hours)
   - Meet with bench team
   - Define biological question and decision being supported
   - Confirm data availability and quality

2. Pipeline setup (Prompt #12 for architecture, then Prompt #1 for code)
   - Define pipeline architecture
   - Implement or adapt existing pipeline
   - Set up data flow and configuration

3. Data QC (manual + standard pipeline)
   - Sample QC
   - Sequencing QC
   - Library complexity check
   - Flag any issues to bench team

4. Primary analysis (Prompt #1 for code, Prompt #5 for methodology)
   - Differential expression
   - Pathway analysis
   - Cell-type-specific analyses if relevant

5. Verification (manual, Prompt #14 for adversarial review)
   - Spot-check key results against raw data
   - Verify methodology choices
   - Adversarial review

6. Visualization (Prompt #7)
   - Generate figures for report
   - Iterate based on what they show

7. Communication (Prompts #8, #9, #10)
   - Slack summary for team lead
   - Structured report for team
   - Presentation for program team

8. Documentation (Prompt #15)
   - Code documentation
   - Methodology document
   - Reproducibility info
   - Save all prompts and outputs

**Verification gates:**
- Step 3: data quality acceptable before primary analysis
- Step 5: methodology and results verified before communication
- Step 7: report reviewed before circulation

**Common variants:**
- Small study (n<10 per group): more conservative methodology, more uncertainty in communication
- Time-course study: different statistical approach, longitudinal visualization
- Multi-condition: more complex contrasts, additional QC for confounding

Build 3-5 of these for your most common project types. After about a month of use, following them becomes second nature.
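
A workflow map can also be kept as data, which makes it easy to check that every step points at a prompt that exists in your library and that at least one verification gate is defined. The sketch below encodes the RNA-seq example above; the `Step` class, the `validate_workflow` function, and the assumption of a 15-prompt library are illustrative choices, not a prescribed format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    number: int
    name: str
    prompts: List[int] = field(default_factory=list)  # prompt-library numbers
    gate: str = ""  # condition that must hold before moving on; empty if none


def validate_workflow(steps: List[Step], library_size: int = 15) -> List[str]:
    """Return a list of problems; an empty list means the map is consistent."""
    problems = []
    if [s.number for s in steps] != list(range(1, len(steps) + 1)):
        problems.append("Steps are not numbered 1..n in order")
    for s in steps:
        for p in s.prompts:
            if not 1 <= p <= library_size:
                problems.append(f"Step {s.number} references unknown prompt #{p}")
    if not any(s.gate for s in steps):
        problems.append("No verification gates defined")
    return problems


rna_seq = [
    Step(1, "Project scoping"),
    Step(2, "Pipeline setup", prompts=[12, 1]),
    Step(3, "Data QC", gate="data quality acceptable before primary analysis"),
    Step(4, "Primary analysis", prompts=[1, 5]),
    Step(5, "Verification", prompts=[14],
         gate="methodology and results verified before communication"),
    Step(6, "Visualization", prompts=[7]),
    Step(7, "Communication", prompts=[8, 9, 10],
         gate="report reviewed before circulation"),
    Step(8, "Documentation", prompts=[15]),
]

problems = validate_workflow(rna_seq)  # empty list for the map above
```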


04 · Part Three — Your verification checklist

Tailored for computational work:

## Pre-finalization verification checklist (computational biology)

### Code
- [ ] Read carefully before running
- [ ] Tested on known case
- [ ] Tested on edge cases
- [ ] Dependencies pinned
- [ ] Random seeds set
- [ ] Reproducibility verified (rerun produces same output)

### Statistical analysis
- [ ] Test choice appropriate and justified
- [ ] Assumptions checked
- [ ] Multiple comparisons handled correctly
- [ ] Effect sizes reported, not just p-values
- [ ] Statistical significance distinguished from biological meaningfulness

### Data handling
- [ ] Sample sizes correct at every step
- [ ] Groups correctly assigned
- [ ] Missing data handled deliberately (and documented)
- [ ] Outliers handled deliberately (and documented)

### Visualization
- [ ] Charts accurately represent underlying data
- [ ] Error bars represent what's intended (SD/SEM/CI)
- [ ] Axes scaled appropriately
- [ ] No misleading aspect ratios

### Communication
- [ ] Conclusions calibrated to what data supports
- [ ] Limitations stated honestly
- [ ] Plain-language summary readable by non-computational colleagues
- [ ] Voice reworked from the AI default so the text reads like your own writing

### Documentation
- [ ] AI use documented (tool, model, date, what was generated)
- [ ] Code archived with version info
- [ ] Methodology decisions documented with reasoning
- [ ] Reproducibility info complete

Adapt this. Use it mechanically for high-stakes work, mentally for routine work.
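
Two of the Code items, "Random seeds set" and "Reproducibility verified (rerun produces same output)", can be checked mechanically. The sketch below uses only the standard library: `run_analysis` is a hypothetical stand-in for your own stochastic step (here a toy bootstrap), and the check simply hashes the results of two seeded runs and compares them.

```python
import hashlib
import random
from typing import List


def run_analysis(seed: int) -> List[float]:
    """Hypothetical stand-in for a stochastic analysis step (a toy bootstrap)."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    values = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    resample_means = []
    for _ in range(50):
        resample = [rng.choice(values) for _ in values]
        resample_means.append(sum(resample) / len(resample))
    return resample_means


def fingerprint(results: List[float]) -> str:
    """Hash the results; repr() round-trips floats exactly in Python 3."""
    payload = ",".join(repr(x) for x in results).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = fingerprint(run_analysis(seed=42))
second = fingerprint(run_analysis(seed=42))
other = fingerprint(run_analysis(seed=43))

reproducible = first == second      # same seed: fingerprints should match
seed_matters = first != other       # different seed: fingerprints should differ
```

The same pattern applies if your analysis uses NumPy: pass a seeded generator (for example from `numpy.random.default_rng(seed)`) into the function instead of relying on global state. Exact hash equality is a strict test. Across machines or library versions, floating-point results can differ slightly, so a tolerance-based comparison may be more appropriate there.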


05 · Part Four — Your 30-day plan

Week 1 — Library building

Build the 10-15 prompts. Test each on real work. Save to library.

Week 2 — Workflow mapping

Build 3-5 workflow maps. Use them on real projects.

Week 3 — Verification habituation

Run the checklist on every AI-assisted work product. Note what it catches.

Week 4 — Integration and refinement

Refine the playbook based on three weeks of practice. Address the day-30 wall — the temptation to stop investing.

After 30 days, you stop noticing the playbook because its practices have become habits.
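
For the Week 3 habit of running the checklist on every work product, a short script can list whatever is still unchecked. This sketch assumes the checklist is kept as Markdown with `### ` section headings and `- [ ]` / `- [x]` items, as in Section 04; the function name `unchecked_items` is hypothetical.

```python
import re
from typing import Dict, List

# Matches "- [ ] text" (open) or "- [x] text" (done).
ITEM = re.compile(r"^- \[( |x|X)\] (.+)$")


def unchecked_items(checklist_md: str) -> Dict[str, List[str]]:
    """Group unchecked '- [ ]' items under their '### ' section heading."""
    open_items: Dict[str, List[str]] = {}
    section = "Unsectioned"
    for raw in checklist_md.splitlines():
        line = raw.strip()
        if line.startswith("### "):
            section = line[4:].strip()
            continue
        match = ITEM.match(line)
        if match and match.group(1) == " ":
            open_items.setdefault(section, []).append(match.group(2))
    return open_items


example = """\
### Code
- [x] Read carefully before running
- [ ] Dependencies pinned
### Visualization
- [x] Axes scaled appropriately
"""
remaining = unchecked_items(example)
# remaining == {"Code": ["Dependencies pinned"]}
```

Saving the `remaining` output alongside each deliverable gives you a record of what the checklist caught, which is the note-taking Week 3 asks for.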


06 · The capstone exercise

Spend 90 minutes building your playbook now:

  • 30 min: prompt library (Section 02)
  • 30 min: workflow maps (Section 03)
  • 15 min: verification checklist (Section 04)
  • 15 min: 30-day plan (Section 05)

Version 1 will be worse than version 5. Version 5 only exists if you build version 1 today.


07 · End of Module 04B

You've finished the computational biology role path.

You can now:

  1. Use AI to draft code that you verify confidently
  2. Apply structured discipline to methodology choices
  3. Translate computational work for the rest of your organization
  4. Maintain a personal playbook that compounds your capability

Path forward:

  • Advanced modules (05-10): Connectors, Agent Design, Skills, Operating Model, Cross-Functional, Capstone
  • Cross-role learning if your work spans multiple paths
  • Practice — the playbook compounds only if used

The 30-day plan turns understanding into capability. The next 60-90 days harden it. The next 12-18 months distinguish you.

