Generative models enable large-scale exploration of chemical space; however, achieving controllable generation that balances novelty with preservation of bioactive features remains challenging. In this study, we introduce a controllable framework based on the VeGA architecture, leveraging SMARTS-RX functional descriptors. We present two distinct SMILES-based generative models: VeGA-RX, conditioned on semantic tokens for flexible exploration, and VeGA-SCX, which integrates topological guidance via Bemis-Murcko scaffolds for high-precision generation. We show that sampling temperature modulates chemical quality, with lower temperatures (T = 0.6) reducing structural alerts and improving drug-likeness without additional reinforcement learning (RL). Retrospective benchmarking on five pharmacological targets using a strict leakage-safe holdout strategy reveals a clear functional dichotomy: VeGA-SCX maximizes the recovery of bioactive chemotypes (e.g., >90% on mTORC1), while VeGA-RX effectively navigates complex chemical spaces with superior creativity. Comparative analysis against LDMol, a state-of-the-art text-to-molecule diffusion model, confirms the advantages of our autoregressive approach, which delivers higher chemical validity, stricter adherence to constraints, and significantly lower computational latency. In addition, we validate the framework in data-scarce scenarios targeting DCAF1 and WRN helicase, prioritizing candidate ligands exhibiting stable computational binding modes supported by docking and molecular dynamics simulations. Overall, this framework provides a controllable and interpretable strategy for precision-oriented generative chemistry. The complete workflow is open-source and available on GitHub.
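The temperature effect mentioned above can be illustrated with a minimal sketch of temperature-scaled sampling over next-token logits. This is a generic illustration, not the VeGA implementation: the function names `temperature_softmax` and `sample_token` and the example logits are hypothetical, and only the value T = 0.6 comes from the abstract.

```python
import math
import random


def temperature_softmax(logits, temperature):
    """Convert raw logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for three candidate SMILES tokens (illustrative values only).
logits = [2.0, 1.0, 0.1]
p_low = temperature_softmax(logits, 0.6)  # T = 0.6, the setting cited in the abstract
p_ref = temperature_softmax(logits, 1.0)  # unscaled reference distribution
token = sample_token(logits, 0.6, random.Random(0))
```

Lowering the temperature below 1 concentrates probability mass on the highest-scoring tokens, so sampling becomes more conservative. This is consistent with the reported trade-off between chemical quality and exploration.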
VeGA-RX and VeGA-SCX: Controllable SMARTS-Guided Generative Transformers for Precision-Driven De Novo Drug Design / Delre, P.; Bellofatto, G.; Lavecchia, A. - In: JOURNAL OF CHEMICAL INFORMATION AND MODELING. - ISSN 1549-9596. - 66:9(2026), pp. 5189-5205. [10.1021/acs.jcim.6c00535]
VeGA-RX and VeGA-SCX: Controllable SMARTS-Guided Generative Transformers for Precision-Driven De Novo Drug Design
Delre P. (first author); Bellofatto G. (second author); Lavecchia A. (last author)
2026


