"Trigonometric B-spline"

Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions

We accounted for the circular nature of the angular data by using trigonometric splines, which was more efficient than the triangulation technique. This general framework also provides comprehensive machinery for clustering, model assessment, or data modeling for groups of protein backbone angles. Specifically, the estimated angular density corresponding to a protein structure has a basis expansion whose coefficients can be used as input to a clustering algorithm. Most existing protein classification techniques rely on sequence and 3D structure comparison, classifying proteins by (dis)similarity scores obtained from pairwise alignments. In contrast, the proposed method is an alignment-free procedure that yields a vector of coefficients (i.e., features) for each structure (density), and these features can be used directly to classify the proteins. The same framework can also be used to assess clustering models for other groups of circular data. We also developed a Shiny web application (available at https://pscde-t.shinyapps.io/PSCDE-T/) that the research community can use to reproduce the results in this paper and to estimate Ramachandran distributions collectively.
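
The idea of using basis-expansion coefficients as alignment-free features can be illustrated with a minimal sketch. It rests on several assumptions: instead of the trigonometric B-splines used in the paper, it substitutes a plain truncated Fourier (trigonometric polynomial) basis on the torus, for which the coefficients are simply the empirical trigonometric moments of the (phi, psi) pairs. The function names `fourier_features`, `sample_structure`, and `distance` are hypothetical, and the "structures" are synthetic toy samples, not real protein data.

```python
import cmath
import math
import random


def fourier_features(angles, order=2):
    """Empirical trigonometric moments of (phi, psi) pairs given in radians.

    Returns a flat list holding the real and imaginary parts of
    c_jk = mean(exp(-i * (j*phi + k*psi))) for -order <= j, k <= order.
    Assumption: a truncated Fourier basis stands in for the paper's
    trigonometric B-spline basis.
    """
    n = len(angles)
    feats = []
    for j in range(-order, order + 1):
        for k in range(-order, order + 1):
            c = sum(cmath.exp(-1j * (j * phi + k * psi)) for phi, psi in angles) / n
            feats.extend([c.real, c.imag])
    return feats


def sample_structure(center, spread, n, rng):
    """Hypothetical toy data: n angle pairs scattered around one center."""
    return [
        (rng.gauss(center[0], spread), rng.gauss(center[1], spread))
        for _ in range(n)
    ]


def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)
# Illustrative centers only: roughly helix-like and sheet-like regions.
helix_center = (math.radians(-60), math.radians(-45))
sheet_center = (math.radians(-120), math.radians(130))

helix_a = fourier_features(sample_structure(helix_center, 0.3, 300, rng))
helix_b = fourier_features(sample_structure(helix_center, 0.3, 300, rng))
sheet_a = fourier_features(sample_structure(sheet_center, 0.3, 300, rng))

d_same = distance(helix_a, helix_b)   # two samples from the same region
d_diff = distance(helix_a, sheet_a)   # samples from different regions
print("same-region distance:", d_same)
print("cross-region distance:", d_diff)
```

Because every basis function is 2π-periodic in both angles, the features respect the circular nature of the data without any explicit wrapping, and each structure is reduced to a fixed-length vector that any standard clustering or classification routine could consume without pairwise alignment.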