Elanco Knowledge Systems · Child 03 of 3 / ElancoGPT · Applied AI · 2023–2024

A decision surface, not a chatbot.

ElancoGPT used OpenAI GPT-3.5 + Google PaLM 2 behind a federated retrieval layer. The interesting design problem wasn't the model — it was the output shape. Every response renders as four typed blocks: fact, inference, risk, action. Each block is independently citable, copyable, and attachable to downstream artifacts.

Applied AI · Federated LLM · RAG · Structured output · Enterprise
Company: Elanco Animal Health
Timeline: Apr 2023 – Aug 2024 · 17 mo
Role: Lead Product Designer · AI surface design
Stack: OpenAI GPT-3.5 · Google PaLM 2 · custom retrieval · EDS
Adoption: 71% weekly active across eligible employees (18 mo)
Surfaces: Web app · Salesforce embed · MS Teams bot

01 / Journey · Ask → retrieve → render → act

A query is the start of a workflow, not the end of one.

The journey shows what happens between a salesperson asking a question and a customer receiving a structured answer. Each lane has a specific commitment.

Stages: Ask · Retrieve · Render · Verify · Act

Salesperson: "What's the dosing for senior dogs?" Receives fact / infer / risk / action. Reads provenance citations. Copies action block into email.
Retrieval layer: Federates OpenAI + PaLM + corpus. Returns ranked sources. Surfaces ambiguity for human review.
Vet affairs: Audits flagged outputs weekly. Updates corpus when gaps appear.
Customer (clinic): Receives sourced answer in email. Forwards or escalates.

Query lifecycle
A field salesperson asks a clinical-positioning question.
01 · Ask

The query starts in plain language.

A field salesperson types: "What's the dosing protocol for Galliprant in senior dogs, and what's the LFT risk?" No prompt engineering. No keyword formatting. The interface meets them where they already think.

Input: natural language · multi-question allowed

02 · Retrieve

Federated retrieval over the regulatory + clinical corpus.

Ranked sources surface from FDA labels, internal clinical guidance, and approved customer-facing materials. Provenance is mandatory. A query without source-grounding never reaches the rendering layer.

Sources: 3 corpora · top-k = 8 · confidence threshold 0.72
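
The retrieval gate described here can be sketched in a few lines. This is a minimal illustration, not the production retrieval layer: the `Source` fields, the corpus names, and the `federate` function are hypothetical, and it assumes that scores are comparable across corpora and that the 0.72 threshold applies to each source's score. Only the top-k of 8 and the 0.72 threshold come from the text.

```python
from dataclasses import dataclass

TOP_K = 8              # from the text: top-k = 8
MIN_CONFIDENCE = 0.72  # from the text: confidence threshold 0.72


@dataclass(frozen=True)
class Source:
    corpus: str   # hypothetical corpus name, e.g. "fda_labels"
    doc_id: str   # hypothetical document identifier
    score: float  # retrieval confidence in [0, 1] (assumed comparable across corpora)


class UngroundedQueryError(Exception):
    """Raised when no source clears the confidence threshold."""


def federate(results_by_corpus: dict[str, list[Source]],
             top_k: int = TOP_K,
             min_confidence: float = MIN_CONFIDENCE) -> list[Source]:
    """Merge per-corpus hits, drop weak ones, and keep the top_k best."""
    merged = [
        source
        for hits in results_by_corpus.values()
        for source in hits
        if source.score >= min_confidence
    ]
    if not merged:
        # Provenance is mandatory: stop here rather than reach the rendering layer.
        raise UngroundedQueryError("no source met the confidence threshold")
    return sorted(merged, key=lambda s: s.score, reverse=True)[:top_k]
```

Raising an exception, rather than returning an empty list, makes the "never reaches the rendering layer" rule hard to bypass by accident: a caller has to handle the ungrounded case explicitly.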

03 · Render

Output is structured as four typed blocks.

Fact / inference / risk / action. Each block has a confidence indicator, each is separately citable, each is copyable to a downstream artifact. The AI never returns "an answer." It returns a structured argument.

Pattern: VerticalSemanticCard (from EDS)
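
One way to picture the four-block contract is as a small typed data model. The sketch below is illustrative only: `BlockKind`, `Block`, and `StructuredResponse` are hypothetical names, not the EDS `VerticalSemanticCard` API, and the rule of exactly one block per kind is an assumption drawn from the "four typed blocks" description.

```python
from dataclasses import dataclass
from enum import Enum


class BlockKind(Enum):
    FACT = "fact"
    INFERENCE = "inference"
    RISK = "risk"
    ACTION = "action"


@dataclass(frozen=True)
class Block:
    kind: BlockKind
    text: str
    confidence: float  # drives the per-block confidence indicator
    citation: str      # provenance label shown with the block


@dataclass(frozen=True)
class StructuredResponse:
    blocks: tuple[Block, ...]

    def __post_init__(self) -> None:
        kinds = [b.kind for b in self.blocks]
        # Assumption: exactly one block per kind, four in total.
        if len(kinds) != len(BlockKind) or set(kinds) != set(BlockKind):
            raise ValueError("expected one block each of fact, inference, risk, action")
        if any(not b.citation.strip() for b in self.blocks):
            raise ValueError("every block must carry a citation")

    def block(self, kind: BlockKind) -> Block:
        """Return a single block so it can be cited or copied on its own."""
        return next(b for b in self.blocks if b.kind is kind)
```

Validating at construction time means a response that is missing a block or a citation cannot exist as a value, which mirrors the idea that the interface never returns a bare "answer."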

04 · Verify

Every claim links to its source. Confidence is visible.

Low-confidence outputs surface a warning chip and route to a vet-affairs audit queue. 100% of responses ship with provenance by design. Salespeople can defend any claim they pass to a customer.

Audit rate: ~4% of weekly volume · escalation: vet affairs
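
The routing rule in this step might look like the sketch below. It is a hedged illustration: `AuditQueue`, `needs_warning`, and `verify` are hypothetical names, the cutoff is supplied by the caller because the text does not say which value triggers the warning chip, and treating one weak block as enough to flag the whole response is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AuditQueue:
    """Stand-in for the vet-affairs review queue (hypothetical)."""
    pending: list[str] = field(default_factory=list)

    def enqueue(self, response_id: str) -> None:
        self.pending.append(response_id)


def needs_warning(block_confidences: list[float], cutoff: float) -> bool:
    """Assumption: a single block below the cutoff flags the whole response."""
    return any(c < cutoff for c in block_confidences)


def verify(response_id: str,
           block_confidences: list[float],
           queue: AuditQueue,
           cutoff: float) -> bool:
    """Return True when the warning chip should show; flagged responses are queued."""
    flagged = needs_warning(block_confidences, cutoff)
    if flagged:
        queue.enqueue(response_id)
    return flagged
```

Keeping the chip decision and the queueing in one function means the two cannot drift apart: anything the salesperson sees flagged is also something vet affairs will review.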

05 · Act

Each block becomes a downstream artifact.

The Action block becomes the body of a customer email. The Fact block becomes a regulatory citation. The Risk block goes into the clinical-decision log. Structure travels with the content all the way to the customer.

Downstream artifacts created from outputs: ~240/week
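
The block-to-artifact mapping can be written down as a small lookup. The sketch is illustrative, and the artifact names and the `to_artifact` function are hypothetical. The text names destinations for the Action, Fact, and Risk blocks only, so the sketch returns `None` for inference blocks rather than guess at one.

```python
from typing import Optional

# Destinations named in the text; the identifiers themselves are hypothetical.
ARTIFACT_FOR_KIND = {
    "action": "customer_email_body",
    "fact": "regulatory_citation",
    "risk": "clinical_decision_log",
}


def to_artifact(kind: str, text: str, citation: str) -> Optional[dict]:
    """Wrap a block as a downstream artifact, keeping its citation attached."""
    target = ARTIFACT_FOR_KIND.get(kind)
    if target is None:
        # No destination is stated for this kind (e.g. "inference").
        return None
    return {"artifact": target, "kind": kind, "body": text, "citation": citation}
```

Carrying `kind` and `citation` inside the artifact is what lets structure travel with the content: the email, citation, or log entry still records which block it came from and what grounds it.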

02 / Output shape · Live example

What a Galliprant query returns.

FACT: Galliprant (grapiprant) is approved for osteoarthritis pain in dogs ≥ 9 months old at 2 mg/kg PO once daily. (FDA label · NADA 141-455)
INFER: For this customer's clinic mix (mostly senior dogs), dosing simplicity is the most likely conversion lever vs. competitor NSAIDs. (conf 0.84)
RISK: Not recommended in dogs with severe hepatic impairment. Confirm baseline LFTs before recommending substitution. (Clinical guidance · 2023-08)
ACTION: Send the clinic-specific dosing card + LFT-baseline reminder template attached. (→ create email draft)

Every block has provenance. Every block is independently exportable. The user controls which blocks travel to the customer.

03 / Artifacts · Image slots reserved

Wireframes, screens, retrieval diagrams.

04 / Outcomes · 18 mo post-launch

Adopted globally. Time-to-answer collapsed.

Weekly active users: 71% · Eligible commercial + vet org
Time-to-answer: −68% · vs pre-tool baseline (same query class)
Source citation rate: 100% · Every block ships with provenance
Artifacts / week: 240+ · Emails · regulatory notes · briefs

05 / Reflection · Defend / redo
What I'd defend

Structured output, every time.

A chatbot that returns paragraphs is the lowest-value AI surface in an enterprise context. Fact / inference / risk / action made the AI a decision surface, not a reading surface.

What I'd do differently

Ship the audit queue with v1.

Vet-affairs audit ran on a spreadsheet for the first six months. A purpose-built queue could have surfaced corpus gaps faster.