Abstract
Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well-characterized protein, or appear as a contaminant. Regardless of its source, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. Its application to characterizing the crystal structure of an unknown protein purified from snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and the validation of sequence assignments in cryo-EM protein structures.
Original language | English |
---|---|
Pages (from-to) | 86-97 |
Number of pages | 12 |
Journal | IUCrJ |
Volume | 9 |
DOIs | |
State | Published - 1 Jan 2022 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2022.
Keywords
- SIMBAD
- bioinformatics
- cryo-EM
- findMySequence
- neural networks
- protein sequences
- protein structures
- structure determination