
Reproducibility & Statistical Rigor

Every result is deterministic and every claim is verifiable. All analyses use seed=42, Bonferroni and BH-FDR multiple-testing correction, and 10,000 bootstrap iterations.
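The two corrections named above can be sketched in a few lines. This is a minimal standard-library illustration, not the pipeline's own code: the function names `bonferroni` and `bh_fdr` and the example p-values are hypothetical.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def bh_fdr(pvals):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Hypothetical p-values, for illustration only (not results from this page).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
print("Bonferroni:", bonferroni(example_p))
print("BH-FDR q:  ", bh_fdr(example_p))
```

Bonferroni controls the family-wise error rate and is the stricter of the two; BH-FDR controls the expected share of false discoveries, which is why a comparison can pass q < 0.05 while failing the Bonferroni cut.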

Deterministic Pipeline

The entire pipeline is bitwise-reproducible. Same image → same 16 features, every time, on any machine.

Random seed: 42 (fixed)
Images analyzed: 1,407 (Zenodo 10301912)
Organoids (N): 64 (4 cell lines)
Median IoU (segmentation): 0.788

Reproducibility guarantee: All random operations (bootstrap resampling, cross-validation splits) use numpy seed=42. The segmentation pipeline (Otsu thresholding + morphological operations) is entirely deterministic. Results can be independently verified by downloading source data from Zenodo 10301912 and running the pipeline via Docker.
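To illustrate why a thresholding step of this kind involves no randomness, here is a minimal pure-Python sketch of Otsu's method on integer intensities. It is an illustration under stated assumptions, not the pipeline's implementation: the function name `otsu_threshold`, the 256-level range, and the toy pixel values are all hypothetical, and the morphological operations are not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    `pixels` is a flat iterable of integer intensities in [0, levels).
    Pixels with value <= t form class 0; the rest form class 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # pixel count of class 0
    sum0 = 0  # intensity sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Proportional to the between-class variance.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:  # strict '>' keeps the lowest t on ties
            best_t, best_score = t, score
    return best_t


# Toy bimodal "image": dark background near 10, bright object near 205.
toy = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
print(otsu_threshold(toy))
```

The threshold depends only on the intensity histogram, so the same image always yields the same value; no seed is involved at this stage.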

Effect Size Forest Plot

All 27 morphological comparisons (9 features × 3 disease clones vs. wildtype) with 95% bootstrap confidence intervals (10,000 iterations). Faded rows are non-significant (BH-FDR q ≥ 0.05).

Clones: A1A (TUBA1A), B2A (TUBB2A), TH2 (TH Deficient)

23/27 significant (BH-FDR q < 0.05) • *** p < .001 • ** p < .01 • * p < .05

Rank-biserial effect size (r) per feature and clone, plotted on an axis from -1.0 to +1.0 with 95% bootstrap CIs (point estimates listed):

Area: A1A +0.94***, B2A -0.52*, TH2 +1.00***
Perimeter: A1A +0.90***, B2A -0.55*, TH2 +0.99***
Equiv. Diameter: A1A +0.95***, B2A -0.70**, TH2 +1.00***
Major Axis: A1A +0.87***, B2A -0.50*, TH2 +0.98***
Minor Axis: A1A +0.94***, B2A -0.77***, TH2 +1.00***
Circularity: A1A +0.46*, B2A -0.66**, TH2 +0.51*
Solidity: A1A +0.59**, B2A -0.67**, TH2 +0.63**
Eccentricity: A1A -0.38, B2A +0.48*, TH2 -0.29
Elongation: A1A +0.41, B2A -0.57*, TH2 +0.31

Effect size: rank-biserial correlation (r). Positive r = disease clone > wildtype; negative r = disease clone < wildtype. CIs via percentile bootstrap with 10,000 resamples (seed=42).
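The effect size and interval described above can be sketched as follows. This is a minimal illustration, not the pipeline's code: it uses Python's `random.Random(42)` rather than the NumPy generator the page mentions, resamples each group independently (an assumption), and the function names and sample values are hypothetical.

```python
import random


def rank_biserial(disease, wildtype):
    """Rank-biserial r = (wins - losses) / (n1 * n2); ties count as neither.

    Positive r means disease values tend to exceed wildtype values.
    """
    wins = sum(1 for d in disease for w in wildtype if d > w)
    losses = sum(1 for d in disease for w in wildtype if d < w)
    return (wins - losses) / (len(disease) * len(wildtype))


def bootstrap_ci(disease, wildtype, n_boot=10_000, alpha=0.05, seed=42):
    """Percentile bootstrap CI, resampling each group with replacement."""
    rng = random.Random(seed)  # fixed seed makes the interval reproducible
    stats = []
    for _ in range(n_boot):
        d = rng.choices(disease, k=len(disease))
        w = rng.choices(wildtype, k=len(wildtype))
        stats.append(rank_biserial(d, w))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Hypothetical organoid-level values, for illustration only.
disease = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5]
wildtype = [4.2, 5.0, 4.8, 5.6, 4.4, 4.9]
r = rank_biserial(disease, wildtype)
low, high = bootstrap_ci(disease, wildtype)
print(f"r = {r:+.2f}, 95% CI [{low:+.2f}, {high:+.2f}]")
```

When the two groups do not overlap at all, r is exactly ±1.00 and every resample gives the same value, so the interval collapses to a point.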

Jackknife Stability Analysis

Leave-one-out jackknife on all 23 significant comparisons. Every effect survives removal of any single organoid: zero sign changes across 736 jackknife subsamples.

Highly stable: 23/23 (std < 0.05, no sign changes)
Sign changes: 0 (across 736 jackknife subsamples)
Max jackknife std: < 0.04 (all comparisons highly stable)

Effect size stability per comparison (jackknife range)

Area: A1A ±0.011, B2A ±0.033, TH2 ±0.000
Perimeter: A1A ±0.014, B2A ±0.033, TH2 ±0.002
Circularity: A1A ±0.034, B2A ±0.027, TH2 ±0.032
Eccentricity: B2A ±0.036
Solidity: A1A ±0.031, B2A ±0.026, TH2 ±0.029
Elongation: B2A ±0.033
Equiv. Diameter: A1A ±0.009, B2A ±0.026, TH2 ±0.000
Major Axis: A1A ±0.019, B2A ±0.036, TH2 ±0.004
Minor Axis: A1A ±0.012, B2A ±0.023, TH2 ±0.000
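A leave-one-out jackknife of this kind can be sketched as below. Note that 736 = 23 comparisons × 32 subsamples, which is what one would get with 32 organoids per comparison (for example 16 per group, an inference rather than something the page states). The sketch assumes "std" means the plain standard deviation of the leave-one-out estimates and "range" means half of max minus min; both definitions, and all names and values, are assumptions for illustration.

```python
import statistics


def rank_biserial(disease, wildtype):
    """Rank-biserial r = (wins - losses) / (n1 * n2)."""
    wins = sum(1 for d in disease for w in wildtype if d > w)
    losses = sum(1 for d in disease for w in wildtype if d < w)
    return (wins - losses) / (len(disease) * len(wildtype))


def jackknife(disease, wildtype):
    """Recompute r with each organoid left out once (n1 + n2 subsamples)."""
    estimates = []
    for i in range(len(disease)):
        estimates.append(rank_biserial(disease[:i] + disease[i + 1:], wildtype))
    for j in range(len(wildtype)):
        estimates.append(rank_biserial(disease, wildtype[:j] + wildtype[j + 1:]))
    full = rank_biserial(disease, wildtype)
    return {
        "n_subsamples": len(estimates),
        "std": statistics.pstdev(estimates),
        "half_range": (max(estimates) - min(estimates)) / 2,
        # A sign change is a leave-one-out estimate with the opposite sign.
        "sign_changes": sum(1 for e in estimates if e * full < 0),
    }


# Hypothetical values, for illustration only.
disease = [5.0, 6.0, 7.0, 8.0]
wildtype = [1.0, 2.0, 3.0, 6.5]
print(jackknife(disease, wildtype))
```

A comparison whose groups are fully separated gives r = ±1.00 in every subsample, which matches the ±0.000 ranges listed for several TH2 comparisons.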

Batch Effects: Negligible

ICC(1) analysis across 2 independent lab preparations (LabA, LabB) shows negligible lab-to-lab variability. Average ICC = 0.012 (max: 0.026).

ICC(1) per Feature

Major Axis: 2.6%
Perimeter: 2.2%
Equiv. Diameter: 2.2%
Area: 2.0%
Minor Axis: 1.9%
Circularity: 0.0%
Eccentricity: 0.0%
Solidity: 0.0%
Elongation: 0.0%

% of variance attributable to lab. All < 5% (negligible). 0/9 features show significant lab effects.

Average lab variance (ICC): 1.2%

On average, 98.8% of total variance is not attributable to lab, so batch artifacts contribute little. Clone effects are 7–14× larger than lab effects in mixed-effects models.
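For reference, ICC(1) can be computed from a one-way ANOVA decomposition, as in the sketch below. This is a generic textbook formulation, not necessarily the exact estimator behind the numbers above; truncating negative estimates to 0 is an assumption of this sketch, and the function name and sample values are hypothetical.

```python
def icc1(groups):
    """One-way random-effects ICC(1) from a list of per-lab value lists.

    ICC(1) = (MSB - MSW) / (MSB + (k0 - 1) * MSW), where k0 is the
    effective group size (the common group size when labs are balanced).
    Negative estimates are truncated to 0 (assumption of this sketch).
    """
    g = len(groups)
    n = sum(len(x) for x in groups)
    grand = sum(sum(x) for x in groups) / n
    means = [sum(x) / len(x) for x in groups]
    ssb = sum(len(x) * (m - grand) ** 2 for x, m in zip(groups, means))
    ssw = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)
    msb = ssb / (g - 1)   # between-lab mean square
    msw = ssw / (n - g)   # within-lab mean square
    k0 = (n - sum(len(x) ** 2 for x in groups) / n) / (g - 1)
    denom = msb + (k0 - 1) * msw
    if denom == 0:
        return 0.0
    return max(0.0, (msb - msw) / denom)


# Hypothetical feature values from two labs, for illustration only.
lab_a = [1.0, 2.0, 3.0]
lab_b = [3.0, 4.0, 5.0]
print(f"ICC(1) = {icc1([lab_a, lab_b]):.3f}")
```

The value is the share of total variance attributable to lab membership, so an ICC of 0.012 corresponds to the 1.2% average reported above.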

Mixed-Effects Model

Formula: area ~ clone + (1|lab)

A1A effect: +264,309 μm² (p < 10⁻¹⁶)

B2A effect: -155,646 μm² (p = 2.1×10⁻⁶)

TH2 effect: +518,528 μm² (p < 10⁻⁵⁵)

QC Threshold Robustness

Results are robust to the quality control (QC) threshold. Even under strict QC (retaining only 14% of images), all significant comparisons persist.

Standard QC (Sharpness ≥ 1.0, SNR ≥ 1.0)
Images retained: 1,361/1,407 (96.7%)
Organoids retained: 64/64
Significant results: 27/48

Strict QC (Sharpness ≥ 3.0)
Images retained: 196/1,407 (13.9%)
Organoids retained: 64/64
Significant results: 27/48

Strict QC removes 86% of images but retains all 64 organoids and all 27 significant comparisons. This indicates that the results reflect biological signal rather than image quality artifacts.
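The threshold comparison amounts to filtering images and then counting what survives at the image and organoid level. The sketch below shows that bookkeeping under stated assumptions: the `ImageQC` record, its field names, and the sample values are hypothetical, and since the page lists only a sharpness cutoff for strict QC, the SNR check is made optional.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageQC:
    organoid_id: str  # hypothetical identifier
    sharpness: float
    snr: float


def apply_qc(images, min_sharpness, min_snr=None):
    """Keep images at or above the thresholds; min_snr=None skips the SNR check."""
    kept = [
        im for im in images
        if im.sharpness >= min_sharpness
        and (min_snr is None or im.snr >= min_snr)
    ]
    organoids = {im.organoid_id for im in kept}
    return kept, organoids


# Hypothetical images, for illustration only.
images = [
    ImageQC("o1", 3.5, 2.0),
    ImageQC("o1", 0.5, 2.0),
    ImageQC("o2", 3.1, 0.8),
    ImageQC("o2", 1.2, 1.5),
]
for name, kwargs in [
    ("standard", {"min_sharpness": 1.0, "min_snr": 1.0}),
    ("strict", {"min_sharpness": 3.0}),
]:
    kept, organoids = apply_qc(images, **kwargs)
    print(f"{name}: {len(kept)}/{len(images)} images, {len(organoids)} organoids")
```

An organoid stays in the analysis as long as at least one of its images passes, which is how a filter can discard most images yet retain every organoid.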

Feature Independence & Correlation Structure

Spearman rank correlation matrix (N=64 organoids) reveals 3 distinct feature clusters matching the 3 PCA components that capture 99.4% of variance.

Size: Area, Perim., Eq.Dia., Major, Minor → PC1-Size (71.9%)

Roundness: Circ., Solid. → PC3-Roundness (8.2%)

Asymmetry: Ecc., Elong. → PC2-Asymmetry (19.4%)

Spearman ρ (color scale from -1.0 to +1.0)

             Area   Perim.    Circ.     Ecc.   Solid.   Elong.  Eq.Dia.    Major    Minor
Area         1.00     0.99     0.58    -0.44     0.64     0.50     0.99     0.98     0.98
Perim.       0.99     1.00     0.55    -0.43     0.61     0.49     0.99     0.99     0.98
Circ.        0.58     0.55     1.00    -0.52     0.98     0.56     0.61     0.52     0.64
Ecc.        -0.44    -0.43    -0.52     1.00    -0.47    -0.99    -0.48    -0.32    -0.57
Solid.       0.64     0.61     0.98    -0.47     1.00     0.53     0.67     0.59     0.69
Elong.       0.50     0.49     0.56    -0.99     0.53     1.00     0.54     0.38     0.63
Eq.Dia.      0.99     0.99     0.61    -0.48     0.67     0.54     1.00     0.97     0.99
Major        0.98     0.99     0.52    -0.32     0.59     0.38     0.97     1.00     0.94
Minor        0.98     0.98     0.64    -0.57     0.69     0.63     0.99     0.94     1.00


Hierarchical clustering (complete linkage, distance = 1 - |ρ|, cut at 0.2). Features within clusters are highly correlated (|ρ| > 0.8); features in different clusters are more weakly correlated (|ρ| ≤ 0.69 in the matrix above) rather than strictly independent. This supports the PCA-based multi-modal integration strategy on the Multi-Modal page.
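The clustering step can be reproduced from the correlation matrix alone. The sketch below is a small standard-library implementation of complete-linkage agglomerative clustering with distance 1 - |ρ|, stopped at 0.2; it is an illustration rather than the pipeline's code (which may well use a library routine), and the function name is hypothetical. The ρ values are transcribed from the matrix on this page, using its symmetry.

```python
LABELS = ["Area", "Perim.", "Circ.", "Ecc.", "Solid.",
          "Elong.", "Eq.Dia.", "Major", "Minor"]

# Spearman rho, rows and columns in LABELS order (symmetric matrix).
RHO = [
    [1.00, 0.99, 0.58, -0.44, 0.64, 0.50, 0.99, 0.98, 0.98],
    [0.99, 1.00, 0.55, -0.43, 0.61, 0.49, 0.99, 0.99, 0.98],
    [0.58, 0.55, 1.00, -0.52, 0.98, 0.56, 0.61, 0.52, 0.64],
    [-0.44, -0.43, -0.52, 1.00, -0.47, -0.99, -0.48, -0.32, -0.57],
    [0.64, 0.61, 0.98, -0.47, 1.00, 0.53, 0.67, 0.59, 0.69],
    [0.50, 0.49, 0.56, -0.99, 0.53, 1.00, 0.54, 0.38, 0.63],
    [0.99, 0.99, 0.61, -0.48, 0.67, 0.54, 1.00, 0.97, 0.99],
    [0.98, 0.99, 0.52, -0.32, 0.59, 0.38, 0.97, 1.00, 0.94],
    [0.98, 0.98, 0.64, -0.57, 0.69, 0.63, 0.99, 0.94, 1.00],
]


def complete_linkage_clusters(rho, cut):
    """Agglomerative clustering with distance 1 - |rho| and complete linkage.

    Merging stops once the closest pair of clusters is farther apart than `cut`.
    """
    clusters = [[i] for i in range(len(rho))]

    def dist(a, b):
        # Complete linkage: distance between the two least similar members.
        return max(1 - abs(rho[i][j]) for i in a for j in b)

    while len(clusters) > 1:
        d, a, b = min(
            (dist(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > cut:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


for members in complete_linkage_clusters(RHO, cut=0.2):
    print(sorted(LABELS[i] for i in members))
```

With these values the procedure yields the three groups named above (size, roundness, asymmetry): the largest within-group distance is 0.06, while the smallest distance between features in different groups is 0.31, so any cut between those two values gives the same partition.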

FAIR Data Principles

All data supporting ConductScreen is Findable, Accessible, Interoperable, and Reusable.

Source Data: Zenodo 10301912 (public, DOI-persistent)
Live Verification: screen.conductscience.com (all analyses live)
CSV Export: 1,407 images × 16 features downloadable
Docker Container: Exact pipeline reproduction, seed=42
