Abstract
The main idea of the S-curve diagram is to assign a distinct angle value α_j to each nucleotide or to each amino acid, and then to accumulate the values cos α_j and sin α_j along the sequence to construct a curve that is in strict one-to-one correspondence with the biological sequence. The S-curve diagram is also shown to be free of degeneracy, so it addresses both the degeneracy problem of graphical representations and the problem of visualizing biological sequence data. In addition, a new measure for comparing biological sequences, the degree of similarity, is proposed on the basis of the S-curve diagram. Specifically, a straight line is first fitted to the S-curve by the least squares method; then, using the point-to-line distance formula, the average of the sum of squared distances between the S-curve and the fitted line is calculated; finally, the similarity of the biological sequences is expressed by this degree of similarity. The experimental results show that the S-curve diagram represents biological sequences well in a Cartesian coordinate system and reveals the mutation points of a biological sequence, and that the degree of similarity is an advantageous measure for comparing sequences.
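The construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the angle values in `ANGLES`, the function names, and the use of an ordinary least-squares fit of y on x are all assumptions made for illustration, since the abstract does not give the actual angle assignment or the exact normalization of the degree of similarity.

```python
import math

# Hypothetical angle assignment in radians (an assumption; the abstract does
# not state the actual values). All angles lie in (-pi/2, pi/2), so cos > 0
# and the x coordinate increases strictly along the sequence.
ANGLES = {
    "A": -math.pi / 3,
    "G": -math.pi / 6,
    "C": math.pi / 6,
    "T": math.pi / 3,
}


def s_curve(seq):
    """Return the points (x_j, y_j), where x_j and y_j are the running
    sums of cos(alpha_i) and sin(alpha_i) for i = 1..j."""
    x = y = 0.0
    points = []
    for base in seq:
        alpha = ANGLES[base]
        x += math.cos(alpha)
        y += math.sin(alpha)
        points.append((x, y))
    return points


def fit_line(points):
    """Ordinary least-squares fit y = k*x + b; returns (k, b)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    k = sxy / sxx
    return k, mean_y - k * mean_x


def mean_squared_distance(seq):
    """Average squared point-to-line distance between the S-curve of seq
    and its fitted line, using d = |k*x - y + b| / sqrt(k^2 + 1)."""
    points = s_curve(seq)
    k, b = fit_line(points)
    total = sum((k * px - py + b) ** 2 / (k * k + 1) for px, py in points)
    return total / len(points)


if __name__ == "__main__":
    for seq in ("ACGTACGT", "ACGTTCGT"):
        print(seq, mean_squared_distance(seq))
```

Under these assumptions, each sequence is reduced to a single number that can then be compared across sequences; how the paper turns such values into its final degree of similarity is not specified in the abstract, so that step is left out here.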