Figure. Comparison of CDD and DaliLite alignments for an all-β protein pair from the superfamily cd. The structure-based sequence alignments made by CDD (A) and DaliLite (B) for two immunoglobulin proteins. The conserved cysteine pairs are colored in white. Otherwise the same as in Figure . For this pair, all methods except VAST agreed with DaliLite, whereas VAST agreed with CDD. DaliLite achieved . and . for fcar, fcar and fcar, respectively.

BMC Bioinformatics (page number not for citation purposes)

Figure. Sequence similarity (fraction of identical pairs) dependence of Fcar in the root node set. The alignments were grouped into sequence similarity bins of size . and then the alignments within each bin were grouped according to their CD name for averaging. The average Fcar values are shown with the scale on the left y-axis: open symbols, Fcar; closed symbols, Fcar. The x-axis shows the midpoint of each sequence similarity bin. The histogram (grey bars) shows the number of superfamilies in each bin with the scale on the right y-axis.

families. However, every method gives alignment accuracies that vary greatly over different protein pairs and over different superfamilies. The box plots in Figure give the distribution of Fcar and Fcar values over the CDD superfamilies for each method. DaliLite has the narrowest distribution of Fcar values with the highest mean and median, while CE has the widest distribution with the lowest mean and median. All methods give Fcar values less than . for a number of superfamilies and completely fail for at least one superfamily.
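The binning and averaging described in the sequence similarity figure caption can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that "grouped according to CD name for averaging" means Fcar is first averaged within each superfamily in a bin and that these per-superfamily means are then averaged. The function name, the record layout, and the bin size used in the usage lines are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def binned_superfamily_average(records, bin_size):
    """Average Fcar per bin, weighting every superfamily equally.

    records  : iterable of (cd_name, x, fcar) tuples, where x is the
               binned quantity (e.g. sequence similarity) of one alignment.
               This layout is an assumption made for illustration.
    bin_size : width of each bin along x.

    Returns a list of (bin_midpoint, mean_fcar, n_superfamilies) tuples,
    ordered by bin. n_superfamilies corresponds to the histogram bars.
    """
    # bin index -> CD name -> list of Fcar values for that superfamily
    per_bin = defaultdict(lambda: defaultdict(list))
    for cd_name, x, fcar in records:
        index = int(x // bin_size)
        per_bin[index][cd_name].append(fcar)

    rows = []
    for index in sorted(per_bin):
        # Stage 1: one mean per superfamily inside this bin.
        cd_means = [mean(values) for values in per_bin[index].values()]
        # Stage 2: mean over the superfamilies present in the bin.
        midpoint = (index + 0.5) * bin_size
        rows.append((midpoint, mean(cd_means), len(cd_means)))
    return rows


# Hypothetical usage with made-up values and a made-up bin size.
example = [
    ("cdA", 0.10, 0.8),
    ("cdA", 0.20, 0.6),
    ("cdB", 0.15, 0.2),
    ("cdB", 0.30, 1.0),
]
for midpoint, avg, count in binned_superfamily_average(example, 0.25):
    print(midpoint, round(avg, 3), count)
```

Averaging within each superfamily first keeps a superfamily with many pairs from dominating a bin. Whether the published curves were weighted in exactly this way is an assumption of this sketch.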
The distribution for Fcar is much tighter in comparison. The existence of superfamilies for which different methods give a zero Fcar value raises the possibility that the results deviate systematically from human curation for some superfamilies. In order to identify such superfamilies, averages of Fcar values were calculated over all methods for each superfamily. Figure shows the method-averaged Fcar and Fcar values for superfamilies sorted in order of increasing Fcar value. The distribution of the method-averaged Fcar values over the superfamilies follows an exponential decay except for five superfamilies with the lowest method-averaged Fcar values (see inset of Figure ). These superfamilies are listed in Table . All the methods give low Fcar values for these five superfamilies (Figure ). Included in Figure are the RMSD values averaged for each superfamily. They typically decrease as the Fcar

Figure. RMSD dependence of Fcar in the root node set. The structure pairs were superposed using the reference alignments to calculate the RMSDs. The test alignments were grouped into RMSD bins of size . and then the alignments within each bin were grouped according to their CD name for averaging. The average Fcar values are shown with the scale on the left y-axis: open symbols, Fcar; closed symbols, Fcar. The x-axis shows the midpoint of each RMSD bin. All structure pairs with RMSD greater than . were collected in the last bin. The histogram (grey bars) shows the number of superfamilies in each bin with the scale on the right y-axis.

Table : The largest CDD superfamily and the superfamilies for which all programs score poorly
Name    SCOP class    Pairs    Subfamilies    Description in CDD
cd cd cd cd cdf a.