1 Executive Summary
This report presents a preliminary comparability analysis of the Virridy Lume tryptophan-like fluorescence (TLF) sensor in support of Track A of the EPA Alternate Test Procedure (ATP) program: a Colorado freshwater track targeting Regulation 93 / 303(d) compliance monitoring on Boulder Creek. Boulder Creek is a CDPHE-listed impaired waterbody with an E. coli geometric mean threshold of 126 CFU/100 mL. The regulatory progression for Track A is: (1) Boulder Creek facility-specific (limited-use) ATP under 40 CFR 136.5 → (2) Colorado state-wide expansion (additional CO sites + CDPHE recognition) → (3) Nationwide freshwater ATP under 40 CFR 136.4. All 856 paired observations in this report use IDEXX Colilert as the freshwater reference, consistent with Boulder’s existing Reg 93 monitoring program. A separate parallel Track B, in partnership with the American Shore & Beach Preservation Association (ASBPA), targets enterococci at coastal ocean beaches in other states using Enterolert as the reference method; that track is reported separately.
The analysis demonstrates strong agreement between the Lume and Colilert across multiple evaluation frameworks:
| Evaluation | Key Metric | Value | Sample Size |
|---|---|---|---|
| Continuous regression (Boulder Creek) | R² / MAPE | 0.67 / 7.12% | n = 38 |
| 3-class categorical (<10, 10–100, >100 MPN) | Bal. Accuracy / Kappa | 95% / 0.84 | n = 334 |
| Binary classification (threshold = 1 CFU/100 mL) | Accuracy / Kappa | 91% / 0.82 | n = 361 |
| Binary classification (threshold = 10 CFU/100 mL) | Accuracy / Kappa | 92% / 0.84 | n = 361 |
| Chlorine residual detection (binary) | Accuracy / Kappa | 85% / 0.70 | n = 66 |
Over 75% of the Lume’s continuous predictions fall within the analytical uncertainty bounds of the Colilert reference method. Cohen’s kappa values of 0.82–0.84 indicate “almost perfect” agreement on the Landis & Koch scale for all primary classification tasks. The Lume demonstrates higher reproducibility than culture-based methods (14% RPD vs. ≥26% for Colilert duplicates).
2 Method Description
2.1 Test Method (Virridy Lume)
| Parameter | Specification |
|---|---|
| Measurement principle | Tryptophan-like fluorescence (TLF) at 275 nm excitation / 340 nm emission, with multivariate linear regression: log₁₀(CFU/100 mL) = β₀ + β₁·TLF + β₂·Turbidity + β₃·Temperature |
| Sensor unit | Sensor 50031 (Lume V1.2) |
| Output | Continuous E. coli concentration estimate (CFU or MPN/100 mL) and categorical risk classification |
| Concurrent parameters | Turbidity (NTU), Temperature (°C) — used as regression model correction inputs |
| TLF detection limit | 0.05 ppb (tryptophan in DI water) |
| E. coli detection limit | ~10 CFU/100 mL (correlated, in wastewater effluent) |
| Response time | 60 seconds per measurement |
| Regression model | Multivariate linear regression with fixed coefficients. Features: TLF intensity, turbidity (NTU), temperature (°C). No site-specific calibration. |
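The regression form in the table above can be written out as a short function. This is a minimal sketch only: the report does not publish the fitted coefficients or the units of TLF intensity, so the `LumeCoefficients` class, the function name, and every numeric coefficient below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LumeCoefficients:
    """Fixed regression coefficients (placeholders, NOT the Lume's real values)."""
    beta0: float  # intercept
    beta1: float  # slope per unit of TLF intensity
    beta2: float  # slope per NTU of turbidity
    beta3: float  # slope per degree Celsius


def predict_cfu_per_100ml(tlf, turbidity_ntu, temperature_c, coef):
    """Evaluate log10(CFU/100 mL) = b0 + b1*TLF + b2*Turbidity + b3*Temperature."""
    log10_cfu = (
        coef.beta0
        + coef.beta1 * tlf
        + coef.beta2 * turbidity_ntu
        + coef.beta3 * temperature_c
    )
    # Back-transform from log10 space to a concentration estimate.
    return 10.0 ** log10_cfu


# Hypothetical coefficients, chosen only so that the example runs.
EXAMPLE = LumeCoefficients(beta0=1.0, beta1=0.02, beta2=0.01, beta3=0.005)

estimate = predict_cfu_per_100ml(
    tlf=40.0, turbidity_ntu=5.0, temperature_c=12.0, coef=EXAMPLE
)
print(f"Estimated E. coli: {estimate:.1f} CFU/100 mL")
```

Because the model is linear in log₁₀ space, a fixed change in any input multiplies the concentration estimate by a constant factor rather than adding a constant amount.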
2.2 Reference Method (IDEXX Colilert)
| Parameter | Specification |
|---|---|
| Method | IDEXX Colilert / Quanti-Tray® |
| Regulatory status | Approved under 40 CFR Part 136 for E. coli enumeration |
| Output | Most Probable Number (MPN) or Colony Forming Units (CFU) per 100 mL |
| Incubation | 18–22 hours at 35°C |
| Known precision | ≥26% relative percent difference between duplicate samples (literature; Kenya groundwater study) |
2.3 Study Location
| Parameter | Value |
|---|---|
| Waterbody | Boulder Creek, Boulder, Colorado |
| Water type | Freshwater surface water (ambient and influenced by municipal WWTP effluent) |
| Concentration range observed | <1 to >400 CFU/100 mL |
| Regulatory context | Colorado Regulation 93 / 303(d) compliance monitoring — City of Boulder Utilities (6 monitoring locations on Boulder Creek, E. coli geometric mean threshold 126 CFU/100 mL). Coastal/beach recreational water monitoring via ASBPA is a longer-term goal. |
| Reference methods | IDEXX Colilert (E. coli, freshwater) & IDEXX Enterolert (enterococci, marine/coastal) — dual-indicator approach per EPA 2012 RWQC |
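The 126 CFU/100 mL criterion in the table above is a geometric mean, not a single-sample limit. The sketch below shows one way such a comparison could be computed. The sample values are hypothetical, and the averaging window and the handling of non-detects are not specified in this report, so both are left to the caller.

```python
import math

GEOMEAN_THRESHOLD = 126.0  # CFU/100 mL, from the regulatory context above


def geometric_mean(values):
    """Geometric mean via the mean of natural logs; every value must be > 0."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs one or more positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def exceeds_threshold(values, threshold=GEOMEAN_THRESHOLD):
    """True when the geometric mean of the samples is above the threshold."""
    return geometric_mean(values) > threshold


# Hypothetical E. coli results (CFU/100 mL) from one assessment window.
samples = [60.0, 95.0, 150.0, 300.0, 110.0]

print(f"Geometric mean: {geometric_mean(samples):.1f} CFU/100 mL")
print(f"Exceeds {GEOMEAN_THRESHOLD:.0f} CFU/100 mL: {exceeds_threshold(samples)}")
```

A geometric mean damps the influence of a single high count, which is why one sample above 126 CFU/100 mL does not by itself produce an exceedance.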
3 Data Overview
The existing dataset comprises several complementary analyses, all using IDEXX Colilert as the freshwater reference method (E. coli). This consistency is important: every observation in this report is a direct Lume-vs-Colilert comparison against the same Part 136-approved method used for recreational water quality assessment on Boulder Creek. Future coastal/marine validation through the ASBPA partnership will add Enterolert (enterococci) paired data, completing the dual-indicator framework required by EPA’s 2012 RWQC.
| Analysis | n | Type | Source |
|---|---|---|---|
| Continuous regression (Boulder Creek field) | 38 | Paired field observations: Lume continuous estimate vs. Colilert grab sample | Boulder Creek, Sensor 50031. 31 inside Colilert uncertainty bounds, 7 outside. |
| Three-class categorical classification | 334 | Categorical bins: <10, 10–100, >100 MPN/100 mL | Laboratory validation with Colilert across controlled concentration ranges. |
| Binary classification (drinking water) | 361 | Binary at 1 and 10 CFU/100 mL thresholds | Chlorinated and unchlorinated drinking water supplies, all paired with Colilert. |
| Continuous (chlorinated vs. unchlorinated) | 57 | Paired scatter: 38 pre-chlorinated + 19 post-chlorinated | Drinking water, pre- and post-chlorination points. |
| Chlorine residual binary detection | 66 | Binary: chlorine present (>0 ppm) vs. absent | Supplementary analysis — not a primary ATP analyte but demonstrates multi-parameter capability. |
| Total paired observations | 856 | Across all analyses. Individual samples may appear in multiple evaluation frameworks. | |
4 Continuous Regression Analysis
Direct comparison of Lume E. coli concentration estimates against Colilert laboratory results for 38 paired observations on Boulder Creek. The Colilert analytical uncertainty (±30%) is shown as horizontal error bars on each point.
4.1 Summary Statistics
| Statistic | Value | Interpretation |
|---|---|---|
| Coefficient of determination (R²) | 0.67 | 67% of variance in Colilert results explained by the Lume estimate. Strong for a field-deployed real-time sensor vs. a 24-hour culture method. |
| Mean absolute percentage error (MAPE, log-scale) | 7.12% | Average prediction error in log-transformed concentration space. Low relative to the inherent variability of Colilert. |
| Predictions within Colilert uncertainty (±30%) | 81.6% | 31 of 38 predictions fall within the reference method’s own analytical uncertainty bounds. |
| Sample size | n = 38 | Paired field observations, test dataset (not used for model training). |
| Concentration range (observed) | 20–400 CFU/100 mL | Spans over one order of magnitude. Full ATP study will target <1 to >1,000 CFU/100 mL across 6 sites. |
4.2 Paired Data
Complete paired dataset (observed Colilert vs. predicted Lume), sorted by observed concentration:
| # | Observed (Colilert, CFU/100 mL) | Predicted (Lume, CFU/100 mL) | Within ±30%? |
|---|---|---|---|
| 1 | 20 | 30 | No |
| 2 | 25 | 43 | No |
| 3 | 50 | 45 | Yes |
| 4 | 55 | 28 | No |
| 5 | 55 | 40 | Yes |
| 6 | 60 | 48 | Yes |
| 7 | 60 | 50 | Yes |
| 8 | 65 | 55 | Yes |
| 9 | 70 | 55 | Yes |
| 10 | 70 | 65 | Yes |
| 11 | 75 | 65 | Yes |
| 12 | 80 | 55 | No |
| 13 | 80 | 70 | Yes |
| 14 | 85 | 80 | Yes |
| 15 | 90 | 75 | Yes |
| 16 | 90 | 80 | Yes |
| 17 | 90 | 130 | No |
| 18 | 95 | 115 | Yes |
| 19 | 95 | 120 | Yes |
| 20 | 95 | 135 | No |
| 21 | 100 | 110 | Yes |
| 22 | 100 | 115 | Yes |
| 23 | 100 | 130 | Yes |
| 24 | 100 | 130 | Yes |
| 25 | 105 | 120 | Yes |
| 26 | 110 | 115 | Yes |
| 27 | 150 | 120 | Yes |
| 28 | 200 | 175 | Yes |
| 29 | 210 | 195 | Yes |
| 30 | 250 | 230 | Yes |
| 31 | 250 | 75 | No (but see note) |
| 32 | 280 | 340 | Yes |
| 33 | 300 | 350 | Yes |
| 34 | 300 | 370 | Yes |
| 35 | 320 | 290 | Yes |
| 36 | 350 | 350 | Yes |
| 37 | 380 | 360 | Yes |
| 38 | 400 | 305 | Yes |
Note: Sample #31 (obs=250, pred=75) represents a significant outlier. In the ATP study, such cases will be investigated for potential sampling errors, sensor fouling, or genuine environmental transients.
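The "Within ±30%?" column can be recomputed from the paired values. The sketch below assumes the bound is symmetric around the observed Colilert value and inclusive at exactly ±30%; the report does not state how boundary cases are treated, so that choice is an assumption.

```python
# (observed Colilert, predicted Lume) in CFU/100 mL, transcribed from Section 4.2.
PAIRS = [
    (20, 30), (25, 43), (50, 45), (55, 28), (55, 40), (60, 48), (60, 50),
    (65, 55), (70, 55), (70, 65), (75, 65), (80, 55), (80, 70), (85, 80),
    (90, 75), (90, 80), (90, 130), (95, 115), (95, 120), (95, 135),
    (100, 110), (100, 115), (100, 130), (100, 130), (105, 120), (110, 115),
    (150, 120), (200, 175), (210, 195), (250, 230), (250, 75), (280, 340),
    (300, 350), (300, 370), (320, 290), (350, 350), (380, 360), (400, 305),
]


def within_bounds(observed, predicted, tolerance=0.30):
    """True when the prediction lies within +/- tolerance of the observed value."""
    return abs(predicted - observed) / observed <= tolerance


inside = sum(within_bounds(obs, pred) for obs, pred in PAIRS)
share = 100.0 * inside / len(PAIRS)
print(f"{inside} of {len(PAIRS)} predictions ({share:.1f}%) within ±30%")

# List the pairs that fall outside the bounds for follow-up.
for row, (obs, pred) in enumerate(PAIRS, start=1):
    if not within_bounds(obs, pred):
        print(f"  #{row}: observed {obs}, predicted {pred}")
```

Run as written, the sketch counts 31 of 38 pairs inside the bounds, which matches the 81.6% reported in Section 4.1.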
5 Three-Class Categorical Classification
Classification of E. coli concentrations into three management-relevant categories: <10, 10–100, and >100 MPN/100 mL, all paired with Colilert reference measurements. These bins align with EPA recreational water quality criteria thresholds for freshwater beaches and rivers.
5.1 Overall Performance
| Metric | Value | Interpretation |
|---|---|---|
| Overall accuracy | 92.2% | 308 of 334 correctly classified. |
| Balanced accuracy | 95.0% | Average per-class recall, unaffected by class imbalance. |
| Cohen’s kappa | 0.84 | “Almost perfect” agreement (Landis & Koch, 1977). |
| Total sample size | n = 334 | Laboratory validation samples, all paired with Colilert. |
5.2 Per-Class Performance
| Class (MPN/100 mL) | Observed (n) | Sensitivity (Recall) | Precision (PPV) | Misclassification Direction |
|---|---|---|---|---|
| <10 | 202 | 99.5% | 94.8% | 11 predicted <10 were actually 10–100 (false negatives). 1 observed <10 predicted as 10–100. |
| 10–100 | 129 | 80.6% | 99.0% | 11 misclassified as <10, 14 as >100. The >100 misclassifications are conservative (over-reports risk). |
| >100 | 3 | 100% | 17.6% | All 3 observed >100 correctly detected. Low precision reflects conservative over-prediction from the 10–100 class. Zero false negatives at this critical threshold. |
5.3 Raw Confusion Matrix
| | Observed <10 | Observed 10–100 | Observed >100 | Row Total (Predicted) |
|---|---|---|---|---|
| Predicted <10 | 201 | 11 | 0 | 212 |
| Predicted 10–100 | 1 | 104 | 0 | 105 |
| Predicted >100 | 0 | 14 | 3 | 17 |
| Column Total (Observed) | 202 | 129 | 3 | 334 |
Key finding: The model exhibits a conservative bias. It overestimates the contamination category (15 samples predicted one class too high) more often than it underestimates it (11 samples, all observed at 10–100 but predicted <10). This is desirable for public health protection. There are zero false negatives at the >100 threshold; at the 10 MPN/100 mL boundary there are 11 false negatives and 1 false positive.
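The headline figures for this section follow directly from the confusion matrix in Section 5.3. The sketch below recomputes overall accuracy, Cohen's kappa, per-class recall and precision, and the over- versus under-prediction counts; the function names are illustrative and are not taken from the study's analysis code.

```python
LABELS = ["<10", "10-100", ">100"]

# Rows = predicted class, columns = observed class (Section 5.3).
MATRIX = [
    [201, 11, 0],
    [1, 104, 0],
    [0, 14, 3],
]


def accuracy(m):
    total = sum(sum(row) for row in m)
    return sum(m[k][k] for k in range(len(m))) / total


def cohens_kappa(m):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in m)
    predicted_totals = [sum(row) for row in m]
    observed_totals = [sum(col) for col in zip(*m)]
    p_o = accuracy(m)
    p_e = sum(p * o for p, o in zip(predicted_totals, observed_totals)) / total ** 2
    return (p_o - p_e) / (1 - p_e)


def recall(m, k):
    """Share of observed class k that was predicted as class k."""
    return m[k][k] / sum(row[k] for row in m)


def precision(m, k):
    """Share of class-k predictions that were observed as class k."""
    return m[k][k] / sum(m[k])


size = len(MATRIX)
# Predicted class index above observed class index = over-prediction.
over = sum(MATRIX[i][j] for i in range(size) for j in range(size) if i > j)
under = sum(MATRIX[i][j] for i in range(size) for j in range(size) if i < j)

print(f"accuracy = {accuracy(MATRIX):.3f}, kappa = {cohens_kappa(MATRIX):.2f}")
for k, label in enumerate(LABELS):
    print(f"{label}: recall {recall(MATRIX, k):.3f}, precision {precision(MATRIX, k):.3f}")
print(f"over-predicted: {over}, under-predicted: {under}")
```

Keeping the orientation explicit (rows predicted, columns observed) matters here, because swapping it exchanges recall and precision for every class.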
6 Binary Classification — Drinking Water Thresholds
Binary classification of E. coli at two regulatory thresholds relevant to drinking water safety: 1 CFU/100 mL and 10 CFU/100 mL. All samples paired with Colilert reference measurements across both chlorinated and unchlorinated water supplies.
6.1 Performance at 1 CFU/100 mL Threshold
| Metric | Value | Derivation |
|---|---|---|
| Overall accuracy | 90.9% | (149 + 179) / 361 |
| Balanced accuracy | 90.7% | Mean of sensitivity and specificity |
| Cohen’s kappa | 0.82 | “Almost perfect” agreement |
| Sensitivity (true positive rate) | 94.7% | 179 / (179 + 10): correctly detects ≥1 CFU |
| Specificity (true negative rate) | 86.6% | 149 / (149 + 23) — correctly classifies <1 CFU |
| Positive predictive value (PPV) | 88.6% | 179 / (179 + 23) |
| Negative predictive value (NPV) | 93.7% | 149 / (149 + 10) |
| False negatives (share of all samples) | 2.8% | 10 / 361; contamination at or above the threshold that the Lume missed |
| False positives (share of all samples) | 6.4% | 23 / 361; false alarms (conservative direction) |
6.2 Performance at 10 CFU/100 mL Threshold
| Metric | Value | Derivation |
|---|---|---|
| Overall accuracy | 92.0% | (169 + 163) / 361 |
| Balanced accuracy | 92.1% | Mean of sensitivity and specificity |
| Cohen’s kappa | 0.84 | “Almost perfect” agreement |
| Sensitivity (true positive rate) | 95.3% | 163 / (163 + 8): correctly detects ≥10 CFU |
| Specificity (true negative rate) | 88.9% | 169 / (169 + 21): correctly classifies <10 CFU |
| Positive predictive value (PPV) | 88.6% | 163 / (163 + 21) |
| Negative predictive value (NPV) | 95.5% | 169 / (169 + 8) |
| False negatives (share of all samples) | 2.2% | 8 / 361; contamination at or above the threshold that the Lume missed |
| False positives (share of all samples) | 5.8% | 21 / 361; false alarms (conservative direction) |
6.3 Raw Confusion Matrices
| Threshold = 1 CFU/100 mL | Observed <1 | Observed ≥1 | Total |
|---|---|---|---|
| Predicted <1 | 149 | 10 | 159 |
| Predicted ≥1 | 23 | 179 | 202 |
| Total | 172 | 189 | 361 |

| Threshold = 10 CFU/100 mL | Observed <10 | Observed ≥10 | Total |
|---|---|---|---|
| Predicted <10 | 169 | 8 | 177 |
| Predicted ≥10 | 21 | 163 | 184 |
| Total | 190 | 171 | 361 |
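Each rate in Sections 6.1 and 6.2 is a ratio of cells from the two matrices above. The sketch below derives them from the raw counts; the `BinaryCounts` class is an illustrative helper, not part of the study's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    tn: int  # observed below threshold, predicted below
    fp: int  # observed below threshold, predicted at or above
    fn: int  # observed at or above threshold, predicted below
    tp: int  # observed at or above threshold, predicted at or above

    @property
    def n(self):
        return self.tn + self.fp + self.fn + self.tp

    def accuracy(self):
        return (self.tp + self.tn) / self.n

    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    def specificity(self):
        return self.tn / (self.tn + self.fp)

    def ppv(self):
        return self.tp / (self.tp + self.fp)

    def npv(self):
        return self.tn / (self.tn + self.fn)

    def kappa(self):
        """Cohen's kappa: observed agreement corrected for chance agreement."""
        predicted_pos = self.tp + self.fp
        observed_pos = self.tp + self.fn
        p_e = (
            predicted_pos * observed_pos
            + (self.n - predicted_pos) * (self.n - observed_pos)
        ) / self.n ** 2
        return (self.accuracy() - p_e) / (1 - p_e)


AT_1_CFU = BinaryCounts(tn=149, fp=23, fn=10, tp=179)
AT_10_CFU = BinaryCounts(tn=169, fp=21, fn=8, tp=163)

for name, c in (("1 CFU/100 mL", AT_1_CFU), ("10 CFU/100 mL", AT_10_CFU)):
    print(
        f"{name}: acc {c.accuracy():.3f}, sens {c.sensitivity():.3f}, "
        f"spec {c.specificity():.3f}, PPV {c.ppv():.3f}, NPV {c.npv():.3f}, "
        f"kappa {c.kappa():.2f}"
    )
```

Working from the four cell counts rather than from rounded percentages keeps every derived metric consistent with the others.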
7 Chlorination Effects on Sensor Performance
Analysis of Lume performance across pre-chlorinated (untreated) and post-chlorinated (treated) drinking water samples. This is relevant to the ATP because it demonstrates sensor behavior across a treatment boundary that fundamentally changes the relationship between TLF and viable E. coli.
7.1 Pre-Chlorinated Performance
| Metric | Value |
|---|---|
| Sample size | n = 38 |
| Observed concentration range | 3–200 CFU/100 mL |
| Predicted concentration range | 0.15–800 CFU/100 mL |
| Qualitative agreement | Strong positive correlation. Points cluster around 1:1 line. One significant outlier (obs=200, pred=800). |
7.2 Post-Chlorinated Performance
| Metric | Value |
|---|---|
| Sample size | n = 19 |
| Observed concentration | ~0.1 CFU/100 mL (all below detection) |
| Predicted concentration range | 0.05–6.0 CFU/100 mL |
| Interpretation | Chlorination inactivates E. coli but does not immediately eliminate TLF signal from cellular material. The Lume slightly over-predicts in post-chlorinated water, which is the conservative (protective) direction. Most predictions remain below 1 CFU/100 mL. |
ATP implication: For recreational water monitoring applications (the primary ATP target), chlorinated effluent near swim beaches and river access points is a relevant matrix. The data shows the Lume performs well for pre-treatment assessment. Post-chlorination overestimation is expected and conservative. The method documentation should specify expected behavior in chlorinated matrices.
8 Chlorine Residual Detection
Supplementary analysis: the Lume can also detect the presence of chlorine residual as a binary classification. While not a primary ATP target, this demonstrates the sensor’s multi-parameter capability and its potential for treatment process monitoring.
8.1 Performance Summary
| Metric | Value | Derivation |
|---|---|---|
| Overall accuracy | 84.8% | (29 + 27) / 66 |
| Balanced accuracy | 84.8% | Mean of sensitivity and specificity |
| Cohen’s kappa | 0.70 | “Substantial” agreement (Landis & Koch) |
| Sensitivity (detects chlorine present) | 84.4% | 27 / (27 + 5) |
| Specificity (detects chlorine absent) | 85.3% | 29 / (29 + 5) |
| Sample size | n = 66 | 29 true negatives + 27 true positives + 5 false positives + 5 false negatives = 66 (34 chlorine-absent, 32 chlorine-present) |
9 Method Precision
Comparison of measurement precision between the Lume (TLF) and culture-based methods (Colilert). Precision is a critical element of the ATP evaluation—an alternate method must demonstrate comparable or superior precision to the reference method.
| Precision Metric | Lume (TLF) | Culture-Based | Source |
|---|---|---|---|
| Duplicate relative percent difference (RPD) | 14% | ≥26% | Kenya groundwater study (Sorensen et al., 2018). Average RPD of duplicate measurements. |
The Lume demonstrates nearly 2× better precision than culture-based duplicate measurements. This has important implications for the ATP comparability analysis:
- Some apparent disagreement between the Lume and Colilert reflects Colilert’s own imprecision, not sensor error.
- The Colilert ±30% analytical uncertainty bounds used in Figure 4.1 are derived from this known reference method variability.
- The ATP statistical analysis should include a formal estimate of reference method variability to contextualize apparent discrepancies.
- The planned ATP study (WS 2) includes duplicate Colilert grabs every 10th sample to generate a site-specific reference method precision estimate for Boulder Creek.
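For the duplicate grabs described above, relative percent difference can be computed as follows. This sketch assumes the conventional definition (absolute difference divided by the pair mean); the report does not state the formula used in the cited study, and the duplicate values below are hypothetical.

```python
def relative_percent_difference(a, b):
    """RPD = |a - b| / mean(a, b) * 100 for one duplicate pair."""
    pair_mean = (a + b) / 2.0
    if pair_mean == 0:
        return 0.0  # both results are zero, so there is no difference to report
    return abs(a - b) / pair_mean * 100.0


def mean_rpd(duplicate_pairs):
    """Average RPD across a set of duplicate pairs."""
    rpds = [relative_percent_difference(a, b) for a, b in duplicate_pairs]
    return sum(rpds) / len(rpds)


# Hypothetical duplicate Colilert results (MPN/100 mL), not study data.
colilert_duplicates = [(100.0, 130.0), (48.0, 60.0), (210.0, 180.0)]

for a, b in colilert_duplicates:
    print(f"{a:.0f} vs {b:.0f}: RPD = {relative_percent_difference(a, b):.1f}%")
print(f"Mean RPD = {mean_rpd(colilert_duplicates):.1f}%")
```

Dividing by the pair mean rather than by either result makes the statistic symmetric, so it does not matter which grab is treated as the original.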
Three-Way Method Comparison: Lume vs. Colilert vs. Membrane Filtration
A paired comparison of 153 samples analyzed by both EPA-approved methods (Colilert and membrane filtration) reveals that the two accepted reference methods disagree with each other more than the Lume disagrees with Colilert:
| Comparison | R² | Agreement |
|---|---|---|
| Colilert vs. MF (both EPA-approved) | 0.584 | 37.9% within 2×; 72.5% categorical at 126 CFU |
| Lume vs. Colilert (bench) | 0.77–0.84 | >75% within Colilert uncertainty |
| Lume vs. Colilert (Boulder Creek) | 0.67 | >75% within Colilert uncertainty; 7% MAPE |
MF yielded a median 2.2× higher count than Colilert, with replicate RPDs of 43.5% (Colilert) and 57.9% (MF). The Lume achieves better quantitative agreement with Colilert than membrane filtration does — while providing continuous, real-time data at a fraction of the cost.
Three-way method comparison across regression (top), Bland-Altman agreement (middle), and categorical classification (bottom). The Lume achieves better agreement with Colilert (R² = 0.861, κ = 0.63) than MF does (R² = 0.584, κ = 0.33). The Lume model was trained on Colilert; the Lume vs. MF column (right) demonstrates generalization to an independent EPA-approved reference method.
Source: Knopp et al. (2026), “Advancing continuous in-situ quantification of microbial contamination in environmental waters using tryptophan-like fluorescence,” under revision at Water Research.
Formal MDL study: A 40 CFR Part 136 Appendix B method detection limit study has not yet been performed under the standardized EPA protocol. This is a required deliverable for the ATP application and is planned as Workstream 4 (see Statistical & Data Analysis on the ATP Overview page).
10 Conclusions & Regulatory Readiness
10.1 What the Existing Data Demonstrates
| | Finding | Evidence |
|---|---|---|
| ✓ | Strong quantitative agreement with Colilert on Boulder Creek | R² = 0.67, MAPE = 7.12%, 81.6% of predictions within Colilert uncertainty. (Section 4) |
| ✓ | Excellent categorical classification at management-relevant thresholds | 92% accuracy, 95% balanced accuracy, kappa = 0.84 across three bins. (Section 5) |
| ✓ | Reliable binary detection at low regulatory thresholds | 91–92% accuracy at 1 and 10 CFU/100 mL. Kappa 0.82–0.84. (Section 6) |
| ✓ | Conservative error direction (protects public health) | Over-predictions outnumber under-predictions in each E. coli classification analysis: 15 vs. 11 in the three-class matrix, and 23 vs. 10 and 21 vs. 8 in the binary matrices. Post-chlorination estimates also err high. (Sections 5, 6, 7) |
| ✓ | Superior precision vs. reference method | 14% RPD vs. ≥26% for culture-based duplicates. (Section 9) |
| ✓ | Characterized behavior across chlorinated/unchlorinated matrices | Strong pre-chlorination performance. Conservative post-chlorination behavior documented. (Section 7) |
10.2 Gaps to Address in the ATP Study
| | Gap | How the Planned Study Addresses It |
|---|---|---|
| ● | Limited sample size for continuous regression (n=38) | 6 sites × 52+ weeks = 400–600 paired observations over 12–18 months. 10–15× more data. |
| ● | Narrow concentration range (20–400 CFU/100 mL) | 6 diverse recreational monitoring sites span upstream reference (<1 CFU) through WWTP-influenced (>1,000 CFU during events). Coastal/beach sites via ASBPA partnership will expand range further. |
| ● | No formal MDL study (40 CFR Part 136 Appendix B) | Planned as Workstream 4, Task 4.1. Laboratory and field MDL determination. |
| ● | No Appendix H formal comparability analysis | Planned as Workstream 4, Task 4.3. Will include equivalence testing (TOST), Bland-Altman, regression analysis. |
| ● | Seasonal coverage incomplete | 12–18 month deployment captures all seasons including spring runoff and winter low-flow. |
| ● | Single-operator data (Virridy) | Boulder staff will be trained to operate sensors. The limited-use application can rest on single-operator data; the nationwide application will require multi-laboratory expansion. |
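One of the planned comparability analyses named above is Bland-Altman agreement. The sketch below shows the basic calculation on log₁₀-transformed concentrations. Working on the log scale and using 1.96 standard deviations for the limits of agreement are assumptions, not the study's stated protocol; the five example pairs are rows 28–30, 32, and 33 of Section 4.2.

```python
import math
import statistics


def bland_altman_log10(reference, test):
    """Return (bias, lower limit, upper limit) of log10(test) - log10(reference)."""
    if len(reference) != len(test) or len(reference) < 2:
        raise ValueError("need two or more paired values of equal length")
    diffs = [math.log10(t) - math.log10(r) for r, t in zip(reference, test)]
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


colilert = [200.0, 210.0, 250.0, 280.0, 300.0]  # observed, CFU/100 mL
lume = [175.0, 195.0, 230.0, 340.0, 350.0]      # predicted, CFU/100 mL

bias, lower, upper = bland_altman_log10(colilert, lume)
print(f"bias = {bias:+.3f} log10 units")
print(f"limits of agreement = [{lower:+.3f}, {upper:+.3f}] log10 units")
# A bias b in log10 units corresponds to a multiplicative factor of 10**b.
print(f"multiplicative bias = {10 ** bias:.2f}x")
```

A bias near zero with narrow limits indicates that the two methods differ by a small, consistent ratio across the concentration range.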
10.3 Assessment
The existing Boulder Creek / Colilert dataset provides a strong preliminary case that the Lume sensor produces results comparable to the approved Colilert reference method for freshwater microbial monitoring. The data is sufficient to support the EPA pre-submission consultation and to demonstrate technical feasibility to CDPHE and EPA Region 8 for Track A Phase 1: a Boulder Creek facility-specific ATP under 40 CFR 136.5 for Reg 93 / 303(d) compliance monitoring. The planned 12–18 month validation study at Boulder’s 6 monitoring sites will generate the regulatory-grade dataset for that limited-use submission. Track A Phase 2 will expand to additional Colorado sites and pursue CDPHE state-wide recognition, building toward Track A Phase 3: a nationwide freshwater ATP under 40 CFR 136.4. In parallel, Track B (ASBPA coastal) will provide Enterolert (enterococci) data from ocean beach sites in other states, eventually feeding a combined or separate nationwide coastal ATP submission.
