Glycoinformatics
Glycoinformatics is a field of bioinformatics that pertains to the study of carbohydrates involved in protein post-translational modification. It broadly includes (but is not restricted to) database, software, and algorithm development for the study of carbohydrate structures, glycoconjugates, enzymatic carbohydrate synthesis and degradation, as well as carbohydrate interactions. Conventional usage of the term does not currently include the treatment of carbohydrates from the better-known nutritive aspect.
Issues to consider
Even though glycosylation is the most common form of protein modification and produces highly complex carbohydrate structures, bioinformatics support for the glycome remains poorly developed.[2][3]
Unlike proteins and nucleic acids, which are linear, carbohydrates are often branched and extremely complex.[4] For instance, just four sugars can be strung together to form more than 5 million different types of carbohydrates,[5] and nine different sugars may be assembled into 15 million possible four-sugar chains.[6]
In addition, the number of simple sugars that make up glycans is greater than the number of nucleotides that make up DNA or RNA. It is therefore more computationally expensive to evaluate glycan structures.[7]
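The growth in the number of possible structures can be illustrated with a short calculation. The sketch below is a deliberately simplified model and not the method used in the cited sources: it counts only linear chains, and it assumes that each glycosidic bond can take one of two anomeric configurations and one of four linkage positions. The function name and these parameter values are illustrative assumptions. Because branching and ring forms are ignored, the result does not reproduce the figures quoted above; it only shows how quickly the count outgrows that of a nucleic acid of the same length.

```python
def linear_sequences(alphabet_size: int, length: int, bond_variants: int = 1) -> int:
    """Count linear chains of `length` residues drawn from `alphabet_size`
    monomers, where each of the (length - 1) bonds between neighbouring
    residues can take `bond_variants` distinct forms."""
    return alphabet_size ** length * bond_variants ** (length - 1)


# Nucleic acid: 4 nucleotides, a single kind of backbone bond.
dna_tetramers = linear_sequences(alphabet_size=4, length=4)

# Carbohydrate (assumed model): 9 monosaccharides, and each bond is one of
# 2 anomers (alpha/beta) x 4 linkage positions = 8 variants.
sugar_tetramers = linear_sequences(alphabet_size=9, length=4, bond_variants=2 * 4)

print(dna_tetramers)    # 256
print(sugar_tetramers)  # 3359232
```

Even without branching, the assumed linkage variants alone multiply the count by 8³ = 512 for a four-residue chain.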
One of the main constraints in glycoinformatics is the difficulty of representing sugars in sequence form, especially because of their branching nature.[6] Owing to the lack of a genetic blueprint, carbohydrates do not have a "fixed" sequence. Instead, the sequence is largely determined by the presence of a variety of enzymes, their kinetic differences, and variations in the biosynthetic micro-environment of the cells. This complicates analysis and makes the carbohydrate structure of interest harder to reproduce experimentally.[8] It is for this reason that carbohydrates are often considered "information poor" molecules.
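The representation problem described above can be made concrete with a toy example. The sketch below stores a glycan as a tree and writes it out as a bracketed string, loosely in the spirit of condensed text notations. It is not an implementation of any official format, and the `Residue` class, the `linearize` function, and the rule of sorting branches by linkage label are assumptions made for illustration. The example structure is the common N-glycan core (three mannoses on two N-acetylglucosamines).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Residue:
    name: str                      # monosaccharide name, e.g. "Man"
    linkage: str = ""              # bond to the parent, e.g. "a1-3"; empty for the root
    children: List["Residue"] = field(default_factory=list)


def linearize(node: Residue) -> str:
    """Write the tree rooted at `node` as a bracketed string, root residue last."""
    # Sort the children so that the same tree always yields the same string,
    # regardless of the order in which the branches were entered.
    kids = sorted(node.children, key=lambda c: (c.linkage, c.name))
    parts = []
    for i, child in enumerate(kids):
        text = linearize(child)
        # The first child continues the main chain; the others become bracketed branches.
        parts.append(text if i == 0 else f"[{text}]")
    suffix = f"({node.linkage})" if node.linkage else ""
    return "".join(parts) + node.name + suffix


def n_glycan_core(branches: List[Residue]) -> Residue:
    """Build the N-glycan core with the given mannose branches."""
    branching_man = Residue("Man", "b1-4", branches)
    return Residue("GlcNAc", children=[Residue("GlcNAc", "b1-4", [branching_man])])


core = n_glycan_core([Residue("Man", "a1-3"), Residue("Man", "a1-6")])
core_swapped = n_glycan_core([Residue("Man", "a1-6"), Residue("Man", "a1-3")])

print(linearize(core))
# Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc
print(linearize(core) == linearize(core_swapped))
# True
```

Without the sorting step, the two input orders would produce two different strings for the same molecule, which is one reason a flat sequence is harder to define for glycans than for proteins or nucleic acids.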
Databases
Table of major glyco-databases.[9][10]
Database | Description | URL |
---|---|---|
GlycomeDB (outdated) | Portal for glycan structures that have been integrated from several of the major glycan-related databases. | http://www.glycome-db.org |
GLYCOSCIENCES.de | One of the earliest databases of glycan structure data; also includes NMR data and literature references. | https://web.archive.org/web/20180521104202/http://www.glycosciences.de/ |
Consortium for Functional Glycomics (CFG) | Glycan structures, glycan binding affinity data, glycan profiling data from MALDI-TOF analysis, knock-out mouse phenotype data and glyco-enzyme expression data. | http://www.functionalglycomics.org |
Japanese Consortium for Glycobiology and Glycotechnology Database (JCGGDB) | A comprehensive database portal for major glyco-related databases in Japan, including mass spectral data of glycan profiles, lectin array data, glycoprotein data, glycogene information including disease information, etc. | http://jcggdb.jp |
KEGG GLYCAN | Glycan structures and their pathway data, including glycogene information as organized by KEGG ORTHOLOGY. | http://www.genome.jp/kegg/glycan/ |
GlyConnect | Curated glycomic and glycoproteomic structural and site information based on data published in scientific journals.[11] | https://glyconnect.expasy.org |
UniCarb-DB | Curated tandem MS spectra with associated glycan structures.[12] | https://unicarb-db.expasy.org |
UniCarb-DR | Public repository for tandem MS spectra with associated glycan structures, supporting MIRAGE-compatible submission of glycomic data to supplement glycomic publications.[13] | https://unicarb-dr.glycosmos.org |
GlyGen | Retrieves information from multiple international data sources and integrates and harmonizes data for glycoconjugates and carbohydrates. The web portal provides an easy starting point for users to search for information regarding protein glycosylation, glycan occurrence, glycosylation in diseases, etc. | https://www.glygen.org/ |
Carbohydrate Structure Database (CSDB) | Curated structural, bibliographic, taxonomical, NMR and other data on carbohydrates from prokaryotes, plants, and fungi. | http://csdb.glycoscience.ru |
GlyTouCan | An international repository that assigns unique accession numbers to glycan structures.[14] | https://glytoucan.org/ |
References
- Dervilly-Pinel G, et al. (2004). Carbohydrate Polymers 55:171–177.
- Helenius A, Aebi M (2001). Intracellular functions of N-linked glycans. Science 291:2364–2369.
- Kikuchi N, et al. (2005). Bioinformatics 21:1717–1718. http://bioinformatics.oxfordjournals.org/cgi/content/full/21/8/1717
- Seeberger PH (2005). Nature 437:1239.
- Service RF (2001). Science 291:805–806. http://www.sciencemag.org/cgi/content/full/291/5505/805a
- Dove A (2001). Nature Biotechnology 19:913–917. http://www.columbia.edu/cu/biology/courses/w3034/LACpapers/bittersweetNatBiot01.pdf Archived 2010-06-29 at the Wayback Machine.
- von der Lieth CW, et al. (2011). EUROCarbDB: An open-access platform for glycoinformatics. Glycobiology 21(4):493–502.
- Lutteke T. (2012). The use of glycoinformatics in glycochemistry. Beilstein J. Org. Chem. 8:915–929. doi:10.3762/bjoc.8.104
- Aoki-Kinoshita KF. (2011). Introduction to Glycoinformatics And Computational Applications. Beilstein-Institut. (PDF 1.57 MB)
- Egorova K.S., Toukach Ph.V. (2018). Glycoinformatics: bridging isolated islands in the sea of data. Angewandte Chemie International Edition 57:14986–14990. doi:10.1002/anie.201803576
- Alocci, Davide; Mariethoz, Julien; Gastaldello, Alessandra; Gasteiger, Elisabeth; Karlsson, Niclas G.; Kolarich, Daniel; Packer, Nicolle H.; Lisacek, Frédérique (February 2019). "GlyConnect: Glycoproteomics Goes Visual, Interactive, and Analytical". Journal of Proteome Research. 18 (2): 664–677. doi:10.1021/acs.jproteome.8b00766. hdl:10072/382780. ISSN 1535-3893. PMID 30574787. S2CID 58570954.
- Lisacek, Frederique; Mariethoz, Julien; Alocci, Davide; Rudd, Pauline M.; Abrahams, Jodie L.; Campbell, Matthew P.; Packer, Nicolle H.; Ståhle, Jonas; Widmalm, Göran (2017), Lauc, Gordan; Wuhrer, Manfred (eds.), "Databases and Associated Tools for Glycomics and Glycoproteomics", High-Throughput Glycomics and Glycoproteomics, vol. 1503, New York, NY: Springer New York, pp. 235–264, doi:10.1007/978-1-4939-6493-2_18, ISBN 978-1-4939-6491-8, PMID 27743371, retrieved 2021-02-05
- Rojas-Macias, Miguel A.; Mariethoz, Julien; Andersson, Peter; Jin, Chunsheng; Venkatakrishnan, Vignesh; Aoki, Nobuyuki P.; Shinmachi, Daisuke; Ashwood, Christopher; Madunic, Katarina; Zhang, Tao; Miller, Rebecca L. (December 2019). "Towards a standardized bioinformatics infrastructure for N- and O-glycomics". Nature Communications. 10 (1): 3275. Bibcode:2019NatCo..10.3275R. doi:10.1038/s41467-019-11131-x. ISSN 2041-1723. PMC 6796180. PMID 31332201.
- Tiemeyer, Michael; Aoki, Kazuhiro; Paulson, James; Cummings, Richard D.; York, William S.; Karlsson, Niclas G.; Lisacek, Frederique; Packer, Nicolle H.; Campbell, Matthew P.; Aoki, Nobuyuki P.; Fujita, Akihiro (2017). "GlyTouCan: an accessible glycan structure repository". Glycobiology. 27 (10): 915–919. doi:10.1093/glycob/cwx066. ISSN 1460-2423. PMC 5881658. PMID 28922742.