Abstract
Proteolysis-targeting chimeras (PROTACs) have emerged as effective tools to selectively degrade disease-related proteins by using the ubiquitin-proteasome system. Developing PROTACs involves extensive tests and trials to explore the vast chemical space. To accelerate this process, we propose a novel deep generative model for the rational design of PROTACs in a low-resource setting, which is then guided to sample PROTACs with optimal pharmacokinetics through deep reinforcement learning. Applying this method to the bromodomain-containing protein 4 target protein, we generated 5,000 compounds that were further filtered through machine learning-based classifiers and physics-driven simulations. As a proof of concept, we identified, synthesized and experimentally tested six candidate bromodomain-containing protein 4-degrading PROTACs, of which three were validated by cell-based assays and western blot analysis. One lead candidate was further tested and demonstrated favourable pharmacokinetics in mice. This combination of deep learning and molecular simulations may facilitate rational PROTAC design and optimization.
Data availability
All data used in this paper are publicly available and can be accessed at http://cadd.zju.edu.cn/protacdb/ for the PROTAC-DB dataset, https://zinc15.docking.org/ for the ZINC dataset and https://www.rcsb.org for the protein crystal structure. Source data are provided with this paper.
Code availability
A demo, instructions and code for PROTAC-RL are available at https://github.com/biomed-AI/PROTAC-RL.
Acknowledgements
This study has been supported by the National Key R&D Program of China (2020YFB0204803, Y.Y.), National Natural Science Foundation of China (61772566, Y.Y.) and Guangdong Key Field R&D Plan (2019B020228001, Y.Y.; 2018B010109006, Y.Y.). We thank R. Hu, W. Lu, L. Shi and J. Zhang for helpful discussions.
Author information
Contributions
S.Z. and Y.Y. contributed the concept and experimental design. S.Z., Y.T. and C.L. contributed the code implementation. Z.W., X.S. and Y.T. developed the molecular simulations. S.Z. and Q.Z. contributed to the wet-lab experiment design. Y.Y., S.Z. and Y.T. wrote the manuscript. H.C. participated in the discussion and revision of the manuscript. All authors contributed to the interpretation of the results. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
S.Z., Z.W., C.L., Z.Z. and X.S. work directly or indirectly for Galixir. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Guowei Wei and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Supplementary information
Supplementary text, Figs. 1–9, Tables 1–6 and chemical synthesis and analytical data.
Source data
Source Data Fig. 4
Unprocessed western blots for Fig. 4d–f.
About this article
Cite this article
Zheng, S., Tan, Y., Wang, Z. et al. Accelerated rational PROTAC design via deep learning and molecular simulations. Nat Mach Intell 4, 739–748 (2022). https://doi.org/10.1038/s42256-022-00527-y
DOI: https://doi.org/10.1038/s42256-022-00527-y