The WFDC1 gene is frequently down-regulated or lost in prostate cancer, and the encoded protein, ps20, has been implicated in epithelial cell behaviour and angiogenesis. However, ps20 remains largely uncharacterised with respect to its structure and interacting partners. This study characterised the evolution, functionality and structural characteristics of WFDC1/ps20 using phylogenetic reconstruction and other computational approaches. Bayesian phylogenetic analyses suggested that ps20 appeared in a common ancestor of deuterostomes and protostomes. The rate of evolutionary change within the coding regions of vertebrate WFDC1 genes and the conservation of synteny in mammals differed from those of other vertebrate clades, indicating possible functional diversity among ps20 homologues. A gene set enrichment analysis of the genes around WFDC1 (conserved synteny) showed functional relationships between the WFDC1, CDH13, CRISPLD2, IRF8 and TFPI2 genes. The molecular evolution of ps20 has been driven by purifying selection, particularly in the segments corresponding to exons 3 and 4, which encode the most conserved regions of the protein. A co-evolution analysis showed that residues within these regions co-vary with each other during the evolution of ps20. These results show that the regions corresponding to exons 3 and 4 are ps20-specific structure-function modules. Homology modelling of the exon 2-encoded polypeptide, followed by dynamics calculations using a Gaussian network model, showed that residues with high conformational flexibility are part of a loop region involved in protein-protein recognition, as suggested by its similarity to other serine protease inhibitors. Residues C96, R94, L105, and C66 are critical for the integrity and functionality of this ps20 region.
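The Gaussian network model step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `gnm_fluctuations`, the 7.0 Å contact cutoff and the use of Cα coordinates alone are assumptions made for illustration. The model builds a Kirchhoff (connectivity) matrix from residue contacts, and the diagonal of its pseudo-inverse gives relative mean-square fluctuations, with larger values indicating more flexible residues.

```python
import numpy as np


def gnm_fluctuations(coords, cutoff=7.0):
    """Relative mean-square fluctuations from a Gaussian network model.

    coords: (N, 3) array of C-alpha coordinates in angstroms.
    cutoff: contact distance in angstroms (7.0 is an assumed, commonly used value).
    """
    coords = np.asarray(coords, dtype=float)

    # Pairwise distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Contact map: residues within the cutoff, excluding self-contacts.
    contact = dist <= cutoff
    np.fill_diagonal(contact, False)

    # Kirchhoff matrix: -1 for each contact, contact count on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    # Pseudo-inverse via eigendecomposition, skipping the zero mode(s).
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    keep = eigvals > 1e-8
    inv = (eigvecs[:, keep] / eigvals[keep]) @ eigvecs[:, keep].T

    # Diagonal entries are proportional to per-residue fluctuations.
    return np.diag(inv)


if __name__ == "__main__":
    # Toy input: five pseudo-residues on a straight line, 3.8 angstroms apart.
    toy = [[3.8 * i, 0.0, 0.0] for i in range(5)]
    print(gnm_fluctuations(toy))
```

In this toy chain the terminal positions fluctuate more than the central one, which mirrors how loop or terminal residues with few contacts stand out as flexible in a real model. Applying the sketch to a protein would require Cα coordinates extracted from a structure or homology model.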
Bibliographical note
Funding Information:
This work was supported by grants from the São Paulo State Research Funding Foundation (FAPESP; grant number 2009/16150-6) and the National Research and Technology Council (CNPq). The authors are grateful to the Centro Nacional de Processamento de Alto Desempenho em São Paulo (CENAPAD-SP) and the Laboratório de Genômica e Proteômica da Universidade Estadual de Campinas (UNICAMP) for access to their computational resources.
- Gaussian network model
- Homology modelling
- Phylogenetic analysis
- Purifying selection