Usability Study · UW HCDE · 2020

Lab software under real operational pressure, studied across four countries.

OpenELIS is the open-source laboratory information system used by high-volume labs in lower-resource settings. The study examined how usability issues compound when infrastructure, staffing, and workflow conditions vary site to site — and which interventions would matter most.

Healthcare · Open source · International field research · Mixed methods

Organization: OpenELIS × UW HCDE
Timeline: Sept – Dec 2020
Role: Lead Researcher · Moderator
Team: 4 researchers · 2 advisors
Participants: n = 18 across 4 countries
Method: Field interviews · Task-based usability · Synthesis

01 / Brief · Why a study, why now

Premise

OpenELIS serves high-volume laboratories in lower-resource settings, where software has to support accuracy, speed, and training under real operational pressure.

The system had been deployed in over a dozen countries, but the team didn't have a comparative picture of how usability issues showed up across sites. Single-site reports were rich but anecdotal; bug trackers captured failures but not friction. The study filled that gap.

Sites studied

Haiti: n = 5 · Port-au-Prince · HIV / TB workflow load; intermittent connectivity.

Côte d'Ivoire: n = 4 · Abidjan · Francophone interface; multi-shift handoff focus.

Mauritius: n = 4 · Port Louis · Mid-volume regional reference lab; reporting workflow.

Vietnam: n = 5 · Hanoi + Hai Phong · Highest device-density site; barcode + sample throughput.

02 / Method · How the study ran

Mixed methods, structured for cross-site comparison.

The study deliberately staged contextual interviews before usability sessions so that the moderation script could be calibrated against each site's actual workflow vocabulary. A structured but human protocol kept four moderators consistent without sacrificing probing depth.

Contextual interviews · n = 18
Site-by-site framing: physical layout, staffing model, where OpenELIS sits in the lab workflow, training history, language of use.

Task analysis · 6 flows
Six representative workflows (accessioning, result entry, batch reporting, patient lookup, audit export, error correction), broken down to step level.

Moderated usability sessions · n = 16
Single-task and chained-task scenarios using a sandboxed test server. Think-aloud + post-task questionnaire. 60-min sessions, dedicated note-taker.

Synthesis · 142 issues
Issues coded at session level, aggregated by site and by workflow stage. Severity (1–4) × frequency (% sessions affected) drove final ranking.

Validation pass · n = 7
Top 12 issues taken back to a subset of participants for confirmation. Two issues reweighted; one dropped from recommendations.
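The roll-up from session-level codes to per-issue frequency can be sketched in a few lines of Python. This is an illustration, not the study's actual tooling. The record layout (`issue`, `session`, `site`, `stage`), the issue labels, and the sample rows are all hypothetical. It assumes frequency means the share of the 16 usability sessions in which an issue was seen at least once.

```python
from collections import defaultdict

TOTAL_SESSIONS = 16  # moderated usability sessions reported in the study

# Hypothetical coded observations: one row per sighting of an issue in a session.
observations = [
    {"issue": "I-A", "session": 1, "site": "Haiti", "stage": "result entry"},
    {"issue": "I-A", "session": 2, "site": "Haiti", "stage": "result entry"},
    {"issue": "I-A", "session": 2, "site": "Haiti", "stage": "result entry"},  # repeat sighting
    {"issue": "I-A", "session": 9, "site": "Vietnam", "stage": "result entry"},
    {"issue": "I-B", "session": 1, "site": "Haiti", "stage": "audit export"},
    {"issue": "I-B", "session": 5, "site": "Mauritius", "stage": "audit export"},
]


def aggregate(rows, total_sessions=TOTAL_SESSIONS):
    """Roll session-level codes up to one summary per issue."""
    sessions = defaultdict(set)  # issue -> sessions where it was seen
    sites = defaultdict(set)     # issue -> sites where it was seen
    stage = {}                   # issue -> workflow stage
    for row in rows:
        sessions[row["issue"]].add(row["session"])
        sites[row["issue"]].add(row["site"])
        stage[row["issue"]] = row["stage"]
    return {
        issue: {
            "stage": stage[issue],
            # Sets ignore repeat sightings, so each session counts once.
            "frequency_pct": 100 * len(seen) / total_sessions,
            "site_count": len(sites[issue]),
        }
        for issue, seen in sessions.items()
    }


summary = aggregate(observations)
for issue, stats in summary.items():
    print(issue, stats)
```

Counting sessions rather than raw sightings keeps one talkative participant from inflating an issue's frequency.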

03 / Findings · From scattered issues to patterns

142 issues compressed into 7 system patterns.

Synthesis after the third site revealed that most issues weren't site-specific — they were workflow-stage specific. The bubble cluster below shows issues grouped by emergent theme, scaled by frequency.

Issue clusters · bubble = frequency · force-directed

Fig 1 — 142 issues clustered into 7 themes. Bubble area = frequency across sessions. Color = severity weighting.

FACT: Result entry and audit export accounted for 63% of high-severity issues despite being only 22% of session time.

INFERENCE: The system was over-tuned to data capture and under-tuned to data verification. Lab work is fundamentally an audit practice; the software treated it as a data-entry one.

ACTION: Top recommendation set focused on confirmation, error recovery, and audit visibility rather than raw entry speed, which was already adequate.

04 / Friction · Ranked, severity-weighted

The top twelve, by composite friction score.

Friction score = severity (1–4) × frequency (% sessions affected) × site count (1–4). The top three issues alone accounted for 41% of the session time lost across all four countries.
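As a worked example of the composite score, here is a minimal Python sketch. The issue labels and numbers are illustrative only, not data from the study. It also assumes frequency is entered as a percentage (0–100) rather than a fraction, which changes the scale of the score but not the ranking.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    label: str            # hypothetical identifier, not a study issue number
    severity: int         # 1-4
    frequency_pct: float  # % of sessions affected, 0-100 (assumed scale)
    site_count: int       # number of sites where the issue appeared, 1-4

    def friction(self) -> float:
        """Severity x frequency x site count, as defined in the text."""
        if not 1 <= self.severity <= 4:
            raise ValueError("severity must be between 1 and 4")
        if not 0 <= self.frequency_pct <= 100:
            raise ValueError("frequency_pct must be between 0 and 100")
        if not 1 <= self.site_count <= 4:
            raise ValueError("site_count must be between 1 and 4")
        return self.severity * self.frequency_pct * self.site_count


# Illustrative values only.
issues = [
    Issue("unconfirmed delete", severity=4, frequency_pct=50.0, site_count=4),
    Issue("ambiguous translation", severity=3, frequency_pct=25.0, site_count=1),
    Issue("audit lookup detour", severity=3, frequency_pct=75.0, site_count=3),
]

# Highest composite friction first.
ranked = sorted(issues, key=Issue.friction, reverse=True)
for issue in ranked:
    print(f"{issue.label}: {issue.friction():.0f}")
```

Multiplying by site count rewards reach: in this toy data, the single-site issue ranks last even though its severity matches the three-site one.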

Top 12 issues · sorted by composite friction · interactive

Fig 2 — Bars scale to max score. Critical issues (sev 4) in oxblood; sev 2–3 in graphite.

05 / Matrix · Severity × frequency

The two-by-two that drove prioritization.

Issues plotted by severity (y) and frequency (x). The top-right quadrant — high frequency, high severity — became the must-fix set for the next deployment cycle.

Severity × frequency · 142 issues · scatter

Quadrants: Must-fix · Investigate · Document only · Defer

Fig 3 — Bubble size = site reach. Annotated issues are the four prioritized in the recommendation set.
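The quadrant assignment can be expressed as a small function. This is a sketch under loud assumptions. The cutoffs (severity ≥ 3, frequency ≥ 50%) are hypothetical, not the thresholds used in the study. Only the Must-fix quadrant (high frequency, high severity) is defined in the text; the placement of the other three labels is my guess.

```python
def quadrant(severity, frequency_pct, severity_cutoff=3, frequency_cutoff=50.0):
    """Place an issue in one of the four quadrants of the two-by-two.

    Both cutoffs are hypothetical defaults chosen for illustration.
    """
    high_severity = severity >= severity_cutoff
    high_frequency = frequency_pct >= frequency_cutoff
    if high_severity and high_frequency:
        return "Must-fix"        # stated in the text: high frequency, high severity
    if high_severity:
        return "Investigate"     # assumed: severe but rare
    if high_frequency:
        return "Document only"   # assumed: common but minor
    return "Defer"               # assumed: rare and minor


# Illustrative inputs only.
for sev, freq in [(4, 70.0), (4, 10.0), (1, 80.0), (2, 5.0)]:
    print(sev, freq, quadrant(sev, freq))
```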

06 / Recommendations · What we sent back

Six recommendations, ordered by expected impact.

The report distinguished interface changes (immediate) from workflow changes (next release) from training-material changes (parallel). Each recommendation was tied to specific issues from the matrix.

R · 01

Add explicit confirmation to high-cost destructive actions.

Delete-sample and re-accession actions in the current build proceeded without confirmation. In 11 of 16 sessions, participants triggered one of these by mistake and had no clear recovery path. A two-step confirm + 30-second undo eliminates the failure mode entirely.

Interface · immediate · ties to #001, #004, #007
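To make the two-step confirm and undo window in R · 01 concrete, here is a minimal Python sketch of the interaction logic. It is not OpenELIS code. The class name, method names, and sample IDs are hypothetical, and the injectable clock exists only so the timing can be exercised without waiting.

```python
import time


class UndoableDelete:
    """Two-step destructive action with a timed undo window (illustrative)."""

    def __init__(self, undo_window_s=30.0, clock=time.monotonic):
        self.undo_window_s = undo_window_s
        self._clock = clock
        self._pending = {}  # sample_id -> time the delete was confirmed

    def request(self, sample_id):
        """Step 1: nothing is deleted yet; return the prompt to show."""
        return f"Delete sample {sample_id}? Confirm to continue."

    def confirm(self, sample_id):
        """Step 2: mark the sample deleted but keep it recoverable."""
        self._pending[sample_id] = self._clock()

    def undo(self, sample_id):
        """Restore the sample if the window is still open. True on success."""
        confirmed_at = self._pending.get(sample_id)
        if confirmed_at is None:
            return False
        if self._clock() - confirmed_at > self.undo_window_s:
            return False
        del self._pending[sample_id]
        return True

    def purge_expired(self):
        """Finalize deletions whose undo window has passed; return their ids."""
        now = self._clock()
        expired = [s for s, t in self._pending.items()
                   if now - t > self.undo_window_s]
        for sample_id in expired:
            del self._pending[sample_id]
        return expired


# Usage with a fake clock so the 30-second window can be simulated.
fake_now = [0.0]
deleter = UndoableDelete(clock=lambda: fake_now[0])
print(deleter.request("S-1"))
deleter.confirm("S-1")
fake_now[0] = 10.0
print(deleter.undo("S-1"))  # inside the window
```

Deferring the permanent delete until the window closes is what gives a mistaken action a recovery path.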

R · 02

Restructure batch-result entry as a verification flow.

The current screen optimizes for keystroke count. Participants consistently treated it as a verification step — re-reading the printout, cross-checking — but the layout encouraged keyboarding past the very fields that were most error-prone. Reorder columns, surface prior values, add inline mismatch highlighting.

Workflow · next release · cluster T-03

R · 03

Make audit history visible from the result detail view.

Finding "who changed what when" required leaving the working context, pulling a separate audit report, and matching by timestamp. Three of four sites had developed informal workarounds (paper logbooks) that this fix would replace.

Interface · next release · ties to #022–#031

R · 04

Standardize the French translation of error and confirmation strings.

Côte d'Ivoire participants encountered three distinct French translations of the same destructive-action confirmation. Two of the three could be read as the opposite of the intended meaning. A short translation review cycle resolves this without a code change.

Localization · immediate · ties to #034, #035

R · 05

Document the offline-to-online sync behavior in the operator manual.

All sites had intermittent connectivity. None of the participants could correctly describe what happened to data captured during an outage. The behavior is well-defined in the codebase but invisible to operators. Documentation alone closes most of the perceived risk.

Training · parallel · cluster T-06

R · 06

Add a "what changed" panel to the release notes shipped to deployment partners.

A meta-recommendation: deployment partners reported they were often surprised by behavior changes between releases. A short standardized panel would let trainers prepare site staff before a rollout.

Operational · ongoing · cross-cluster

07 / Outcomes · What changed

Three of six recommendations were implemented in the following release cycle.

The deployment team prioritized the confirmation + audit + translation fixes (R-01, R-03, R-04) for immediate inclusion. The verification-flow restructure (R-02) became a roadmap item; the documentation pass (R-05) shipped alongside the next operator manual update.

Issues prioritized: 12 · From 142 surfaced · 6 became formal recommendations

Implemented in next cycle: 3/6 · R-01, R-03, R-04 · ship time ≤ 90 days

Site coverage: 4 · Haiti · Côte d'Ivoire · Mauritius · Vietnam

OpenELIS supports laboratories involved in testing for conditions like COVID-19 and HIV. Usability improvements aren't abstract; they affect daily work for the people relying on the system, and ultimately for the people those labs serve.

08 / Reflection · In retrospect

What I'd defend, what I'd redo.

What I'd do differently.

Run a pilot earlier. An earlier pilot, ideally with international participants, would have helped refine task wording, post-task questions, timing, and the cultural shape of feedback before the full study began.

What I'd defend.

A structured but human interview script made the sessions stronger. It created consistency across moderators while leaving room to probe, clarify, and respond naturally.

What changed for me.

This study deepened my appreciation for well-documented qualitative and quantitative analysis. Ranking issues by frequency and severity made the recommendations much more credible and actionable.

Why it mattered.

OpenELIS supports clinics and laboratories doing essential infectious-disease work. The methodological discipline wasn't academic posturing — it was the bridge between what we observed and what could actually ship.
