Training Datasets · De-Identified Corpora · 07:30:11, Sun, May 24
AI Operations
Datasets
Curated, de-identified imaging corpora used for training and validation. Lineage tracked end-to-end.
- Total datasets: 48
- Total volume: 94.2 TB
- Annotated images: 12.4M
- IRB approvals: 32
Active datasets
| Dataset | Modality | Studies | Size | Label coverage | Split (train/val/test) | License |
|---|---|---|---|---|---|---|
| Chest CT Curated v8 (`ds-chest-ct-v8`) | CT | 412,884 | 18.4 TB | 98.2% | 80/10/10 | BAA + IRB |
| Wrist X-Ray v4 (`ds-wrist-xr-v4`) | XR | 84,212 | 412 GB | 99.1% | 75/15/10 | BAA + IRB |
| Brain MRI Multi-site v6 (`ds-brain-mr-v6`) | MR | 94,118 | 8.2 TB | 96.8% | 80/10/10 | BAA + IRB |
| Mammography Screening v3 (`ds-mammo-v3`) | MG | 208,402 | 3.1 TB | 99.4% | 80/10/10 | BAA + IRB |
| Abdomen CT Segmentation v2 (`ds-abd-ct-v2`) | CT | 32,508 | 2.8 TB | 94.0% | 70/15/15 | BAA + IRB |
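The split column gives percentage ratios only. As a rough illustration of what those ratios mean in study counts, and of one common way to keep an assignment stable across re-runs, here is a minimal Python sketch. It is not the pipeline behind these datasets: the function names `split_counts` and `assign_split`, the largest-remainder rounding, and the SHA-256 hashing scheme are all assumptions made for the example.

```python
import hashlib


def split_counts(n_studies: int, ratios: tuple[int, ...] = (80, 10, 10)) -> tuple[int, ...]:
    """Turn integer percentage ratios into counts that sum exactly to n_studies.

    Uses largest-remainder rounding (an assumption; the page does not say
    how fractional studies are resolved).
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")
    scaled = [n_studies * r for r in ratios]
    counts = [s // 100 for s in scaled]      # floor of each share
    remainders = [s % 100 for s in scaled]   # fractional part, in hundredths
    leftover = n_studies - sum(counts)       # studies not yet placed
    # Give one extra study to the splits with the largest fractional parts.
    order = sorted(range(len(ratios)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return tuple(counts)


def assign_split(
    study_id: str,
    ratios: tuple[int, ...] = (80, 10, 10),
    names: tuple[str, ...] = ("train", "val", "test"),
) -> str:
    """Map a (hypothetical) study identifier to a split deterministically.

    The identifier is hashed into one of 100 buckets, so the same ID always
    lands in the same split regardless of processing order.
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")
    digest = hashlib.sha256(study_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    upper = 0
    for name, ratio in zip(names, ratios):
        upper += ratio
        if bucket < upper:
            return name
    raise AssertionError("unreachable when ratios sum to 100")


# Chest CT Curated v8: 412,884 studies at 80/10/10
chest_ct = split_counts(412_884, (80, 10, 10))  # -> (330307, 41289, 41288)
```

In practice, hashing a patient-level identifier rather than a study-level one is the usual choice, so that all studies from one patient stay in the same split; whether these corpora do that is not stated here.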
Demographics balance · Chest CT v8
- Gender: Female 48%, Male 51%, Non-binary 1%
- Age: <40 yrs 18%, 40-65 yrs 52%, 65+ yrs 30%
- Ethnicity: Asian 24%, Black 18%, Hispanic 21%, White 33%, Other 4%
Lineage · last 90 days
- Today: ds-chest-ct-v8 → FractureNet v4.2.1 training
- May 18: Added 4,212 studies from Singapore General
- May 12: Re-labeled 1,801 priors after RADPEER session
- May 04: Compliance scan: 0 PHI leaks, attestation #2401 signed
- Apr 28: Bias audit: gender parity within 4 pp, ethnicity within 6 pp
- Apr 19: IRB amendment approved (UCSF #2024-3082)
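
The Apr 28 bias-audit entry reports gaps in percentage points (pp). As a sketch of how such a figure could be checked against the Chest CT v8 demographics panel, the Python below computes the Female/Male gap and offers a generic deviation helper. The helper names and the reading of "parity" as the Female-versus-Male difference are assumptions; the audit's actual method is not described on this page.

```python
def pairwise_gap_pp(shares: dict[str, float], a: str, b: str) -> float:
    """Absolute difference between two group shares, in percentage points."""
    return abs(shares[a] - shares[b])


def max_deviation_pp(observed: dict[str, float], reference: dict[str, float]) -> float:
    """Largest absolute gap between observed shares and a reference distribution.

    The reference must be supplied by the caller; none is given on this page.
    """
    return max(abs(observed[group] - reference[group]) for group in reference)


# Gender shares from the Chest CT v8 demographics panel (percent).
gender = {"Female": 48, "Male": 51, "Non-binary": 1}

gender_gap = pairwise_gap_pp(gender, "Female", "Male")
within_threshold = gender_gap <= 4  # threshold taken from the audit entry
```

The ethnicity figure cannot be reproduced the same way from this page alone: the listed shares differ from one another by more than 6 pp, so that result presumably compares against a reference population, which `max_deviation_pp` would need as its second argument.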