Commit f59fd9c

jam-sudo and claude committed
feat(data): expand platinum to 176 drugs via OpenFDA extraction
OpenFDA API mining: 5,000 labels → 177 extractions → 29 new validated drugs
Quality filters: dose ≥ 1mg, Cmax/dose ∈ [1e-6, 0.5], SMILES available
Holdout expanded: 71 → 100 drugs (29 truly unseen OpenFDA drugs)
Removed 7 bad extractions (dose=0, impossible Cmax/dose, known prodrugs)

Honest performance on expanded holdout:
  Tier 1 ALL (99 drugs): AAFE 2.903 [2.34, 3.67]
  Tier 2 Mechanistic AD (90 drugs): AAFE 2.563 [2.11, 3.18]

The expanded set gives MORE HONEST generalization estimate:
- Previous Tier 2 (62 drugs): AAFE 2.329 (curated, selection bias)
- Expanded Tier 2 (90 drugs): AAFE 2.563 (includes automated extraction)
- Delta +0.234 reflects genuinely harder unseen drugs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
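The quality filters named in the commit message (dose ≥ 1 mg, Cmax/dose within [1e-6, 0.5], SMILES available) could be expressed roughly as below. This is a minimal sketch, not the repository's code: the `Extraction` record, its field names, and the Cmax units are assumptions made for illustration, since the commit does not show the extraction schema.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the commit message.
MIN_DOSE_MG = 1.0
CMAX_PER_DOSE_RANGE = (1e-6, 0.5)


@dataclass
class Extraction:
    """Hypothetical record for one label extraction (field names are assumed)."""
    name: str
    dose_mg: float
    cmax: float            # units assumed to match whatever the ratio bounds expect
    smiles: Optional[str]  # None or "" when no structure was found


def passes_quality_filters(rec: Extraction) -> bool:
    """Return True if the record satisfies all three filters."""
    # Checking the dose first also rules out division by zero below.
    if rec.dose_mg < MIN_DOSE_MG:
        return False
    if not rec.smiles:
        return False
    lo, hi = CMAX_PER_DOSE_RANGE
    return lo <= rec.cmax / rec.dose_mg <= hi
```

A record with `dose_mg=0` fails on the first check, which corresponds to one of the "bad extraction" classes the message mentions.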
1 parent bd16f9e commit f59fd9c
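The AAFE figures quoted in the commit message are presumably the absolute average fold error, conventionally 10 raised to the mean of |log10(predicted/observed)|. The commit does not show how the project computes it, so the following is only a sketch of that standard definition; the function name and argument layout are assumptions.

```python
import math
from typing import Sequence


def aafe(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Absolute average fold error: 10 ** mean(|log10(pred / obs)|).

    A value of 1.0 means perfect agreement; 2.0 means predictions are
    off by a factor of two on average (in either direction).
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(math.log10(p / o)) for p, o in zip(predicted, observed))
    return 10 ** (total / len(predicted))
```

Because the error is averaged in log space, a tenfold overprediction and a tenfold underprediction contribute equally rather than cancelling.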

5 files changed

Lines changed: 1731 additions & 4 deletions

data/clinical/holdout_split.json

Lines changed: 31 additions & 2 deletions
@@ -148,11 +148,40 @@
   "vilazodone",
   "vonoprazan",
   "zolpidem",
-  "zonisamide"
+  "zonisamide",
+  "indomethacin",
+  "ranolazine",
+  "sildenafil",
+  "pravastatin",
+  "penicillamine",
+  "carbinoxamine",
+  "atovaquone",
+  "febuxostat",
+  "levofloxacin",
+  "sumatriptan",
+  "fluvoxamine",
+  "acamprosate",
+  "abiraterone",
+  "pindolol",
+  "amantadine",
+  "budesonide",
+  "alvimopan",
+  "dalfampridine",
+  "nilotinib",
+  "ponatinib",
+  "clomipramine",
+  "bexagliflozin",
+  "mefenamic acid",
+  "rifabutin",
+  "fesoterodine",
+  "ulipristal",
+  "ramelteon",
+  "ketoconazole",
+  "selegiline"
   ],
   "metadata": {
   "n_train": 76,
-  "n_holdout": 71,
+  "n_holdout": 100,
   "split_method": "murcko_generic_scaffold_stratified",
   "seed": 42,
   "n_scaffolds": 101,
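Since this hunk edits the drug list and the `n_holdout` count by hand in two separate places, a small consistency check may be worth running on the file. The sketch below is an assumption-laden illustration: the diff does not show the name of the key that holds the list, so `list_key="holdout"` is hypothetical and should be adjusted to the real schema.

```python
def check_split(split: dict, list_key: str = "holdout") -> None:
    """Raise ValueError if the holdout list and its metadata disagree.

    `list_key` is a guess at the JSON key holding the drug names.
    """
    drugs = split[list_key]
    expected = split["metadata"]["n_holdout"]
    if len(drugs) != expected:
        raise ValueError(
            f"metadata says n_holdout={expected}, list has {len(drugs)} entries"
        )
    if len(set(drugs)) != len(drugs):
        raise ValueError("holdout list contains duplicate drug names")
```

In practice the dict would come from `json.load` on `data/clinical/holdout_split.json`. For this commit the expected arithmetic is 71 + 29 = 100 holdout drugs, and 76 training + 100 holdout = 176 in total, matching the commit title.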
