ISSN: 2644-1373
Nada M Doleib1,2*
Received: March 08, 2021 Published: March 15, 2021
Corresponding author: Nada M Doleib, Department of Biology, Faculty of Sciences and Arts, Khulais, University of Jeddah, Saudi Arabia
DOI: 10.32474/LOJPCR.2021.02.000141
The SARS-CoV-2 Nucleocapsid protein is considered a vaccine target. The viral nucleocapsid protein is also a potential antiviral drug target, serving multiple critical functions during the viral life cycle. However, structural information on SARS-CoV-2 from different geographical locations is an important factor in producing a global vaccine for the pandemic. The present investigation analyses the similarities and differences of the SARS-CoV-2 Nucleocapsid protein from various geographical locations, considering certain protein parameters as well as the alignment of the amino acid sequences of the protein.
Keywords: Nucleocapsid protein; SARS-CoV-2; geographical locations; bioinformatics
SARS-CoV-2 is a single-stranded, positive-sense RNA virus with a 30 kb genome, one of the largest among RNA viruses [1]. The viral envelope of SARS-CoV-2 contains many proteins, such as the spike protein and glycoprotein [2]. Another protein, associated with the viral genome, is the nucleocapsid protein, which is responsible for protecting the virus from the host cell environment [3]. The SARS-CoV-2 spike protein is being used as the principal antigen target in vaccine development. However, the multifaceted molecular details of viral entry may create obstacles for the vaccine response [4]. The nucleocapsid (N) protein is a significant antigen for coronaviruses, which contributes to RNA packaging and virus particle release [5]. After infection, the N protein enters the host cell along with the viral RNA to facilitate its replication and to process the assembly and release of the virus. SARS-CoV N comprises two distinct RNA-binding domains (the N-terminal domain and the C-terminal domain) connected by a poorly structured linker region. Owing to their positively charged amino acids, the SARS-CoV N-terminal and C-terminal domains have been documented to bind to the viral RNA genome [6]. Serological diagnosis has shown that the specific antibodies to the N protein in the serum of SARS patients have a higher sensitivity and longer persistence than those to other structural SARS-CoV proteins. In addition, anti-N antibodies have been observed at an early stage of infection with high specificity. Thus, any information obtained from the study of this protein, whether in vivo or in vitro, will improve our understanding of COVID-19 and enable us to develop better biological agents for the treatment or diagnosis of the disease [7]. It is becoming clearer how important this protein is to the multiple stages of the viral life cycle.
This study provides valuable and timely insights specific to the SARS-CoV-2 N protein, a vaccine target that has some distinct advantages over other possible SARS-CoV-2 antigens, with respect to geographical location. Because of the conservation of the N-protein sequence and the increased awareness of its genetics and biochemistry across different geographical locations, the SARS-CoV-2 N protein should be considered as a global vaccine candidate. The present investigation studies the similarities and differences of the Nucleocapsid protein of SARS-CoV-2 from different geographical locations, considering some protein parameters as well as the alignment of the amino acid sequences of the protein.
Amino acid sequences for the Nucleocapsid protein of SARS-CoV-2 from different geographical locations were obtained from the National Center for Biotechnology Information (NCBI) database. The accession numbers of the corresponding database entries and the isolation sources are listed in Table 1. Multiple sequence alignment of the Nucleocapsid protein of SARS-CoV-2 was performed to find the evolutionary relationships between sequences from different geographical locations and to identify shared patterns. Sequences were aligned using Jalview software version 2.8 [8]. Phylogenetic trees were constructed by the neighbour-joining method using MEGA [9]. A distance matrix was generated using the Jones-Taylor-Thornton model [10]. Consensus data from the amino acid multiple sequence alignment were represented graphically as a sequence logo, according to the method described by Tom Schneider and Mike Stephens [11].
Table 1: Accession numbers of NCBI entries for Nucleocapsid protein of SARS-CoV-2 from different geographical locations.
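The sequence logo representation mentioned above can be illustrated with a minimal Python sketch. It assumes the usual definition in which the stack height at a position is log2(20) minus the Shannon entropy of the residues in that column, and each letter's height is its relative frequency times the stack height. The small-sample correction is omitted, gaps are ignored, and the function names and the toy alignment are hypothetical; this is not the software used in the study.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the observed residue frequencies in this column.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_seqs):
    """Per column: (stack height in bits, {residue: letter height in bits})."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for i in range(length):
        column = [s[i] for s in aligned_seqs]
        info = column_information(column)
        residues = [c for c in column if c != "-"]
        counts = Counter(residues)
        # Each letter is scaled by its relative frequency in the column.
        letters = {r: (n / len(residues)) * info for r, n in counts.items()}
        result.append((info, letters))
    return result


if __name__ == "__main__":
    # Hypothetical toy alignment, NOT the sequences listed in Table 1.
    toy = ["MSDNG", "MSDNG", "MSENG", "MTDNG"]
    for pos, (bits, letters) in enumerate(logo_heights(toy), start=1):
        print(pos, round(bits, 3), letters)
```

A fully conserved column reaches the maximum stack height of log2(20) bits, while a column with mixed residues yields a shorter stack, which is how conservation is read off the logo.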
*aa is arrangement of amino acids in a protein.
The amino acid composition and molecular weight were estimated using the Isoelectric Point Calculator (IPC), a web service and standalone program for the accurate estimation of protein and peptide characteristics [12].
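As a rough illustration of what such an estimate involves, the following Python sketch computes percentage composition and an average molecular weight by summing standard average residue masses and adding one water molecule. It is an assumption-laden sketch, not the IPC implementation: the function names are hypothetical, only the 20 standard residues are supported, and the example peptide is made up.

```python
from collections import Counter

# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def composition(seq):
    """Percentage of each residue type in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}


def molecular_weight(seq):
    """Average molecular weight in daltons; raises KeyError on non-standard residues."""
    seq = seq.upper()
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    peptide = "MSDNGPQ"  # hypothetical short peptide, for illustration only
    print(composition(peptide))
    print(round(molecular_weight(peptide), 2))
```

Because the weight is a simple sum over residues, sequences of different length differ in molecular weight even when their percentage composition is similar.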
The solvent content of the Nucleocapsid protein was estimated from the currently available sequences. The fraction of the crystal volume occupied by solvent was calculated according to Matthews [13].
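The Matthews calculation can be sketched in Python as follows. The sketch assumes the commonly used form of the relation, in which the Matthews coefficient is the unit cell volume divided by the total protein mass in the cell, and the solvent fraction is 1 - 1.23 / V_M (the constant 1.23 corresponds to a typical protein partial specific volume of about 0.74 cm³/g). The function names and the unit cell values in the example are hypothetical and are not taken from this study.

```python
def matthews_coefficient(cell_volume_a3, molecules_in_cell, mol_weight_da):
    """Crystal volume per unit of protein molecular weight, in cubic angstroms per dalton."""
    if molecules_in_cell <= 0 or mol_weight_da <= 0:
        raise ValueError("molecule count and molecular weight must be positive")
    return cell_volume_a3 / (molecules_in_cell * mol_weight_da)


def solvent_fraction(v_m, protein_factor=1.23):
    """Fraction of the crystal volume occupied by solvent (assumed relation 1 - 1.23 / V_M)."""
    if v_m <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_factor / v_m


if __name__ == "__main__":
    # Hypothetical unit cell: 500,000 cubic angstroms, 4 molecules of 46 kDa each.
    v_m = matthews_coefficient(500_000.0, 4, 46_000.0)
    print(round(v_m, 3), round(100.0 * solvent_fraction(v_m), 1))
```

Under this relation the solvent fraction grows with the Matthews coefficient, so a larger crystal volume per dalton of protein implies a more hydrated crystal.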
Table 2 displays the chemical composition of the Nucleocapsid protein of SARS-CoV-2 from different geographic locations. The present data revealed that the amino acid composition of the 13 tested sequences showed high similarity, except for the protein sequence from Spain. In Figure 1, the overall height of each stack indicates the amino acid sequence conservation at that position of the protein, while the height of the symbols within the stack indicates the relative frequency of each amino acid at that position. In general, the sequence logo of the Nucleocapsid protein shows that the sequences from different locations barely differ from each other and are highly similar. For phylogeny estimation of the Nucleocapsid protein of SARS-CoV-2 from different geographical locations, a neighbour-joining tree was constructed using MEGA 10.1.8. The multiple sequence alignments are depicted in Figure 2, and the constructed phylogenetic tree is depicted in Figure 3. The neighbour-joining tree compares the amino acid sequences of the Nucleocapsid protein from different locations. The phylogenetic analysis yields two clusters (Figure 3). Cluster 1 comprises France, Australia, China, India, South Africa, Morocco, United Kingdom and Spain, with Spain considered the main root. Cluster 2 comprises USA, Canada, Argentina, Saudi Arabia and Nigeria, with Canada as the main root. Figure 4 shows the Matthews coefficient: the value equals 0.68 for all locations except Spain, where it reaches 0.85. Figure 5 shows the solvent content of the protein crystals, which was near 80% for all geographical locations; the value calculated for Spain was the lowest, at 44%.
As shown in Table 3, the molecular weight of the Nucleocapsid protein of SARS-CoV-2 from different geographic locations ranges from 36759.93 to 45696.7 Da, exhibiting low variability.
Figure 1: Multiple sequence alignment of Nucleocapsid protein of SARS-CoV-2 from different geographic locations.
The amino acid sequences were retrieved from the NCBI (National Center for Biotechnology Information) database.
Table 2: Chemical composition of the Nucleocapsid protein of SARS-CoV-2 from different geographic locations.
Figure 2: Representation of a scoring matrix constructed for a multiple sequence alignment of Nucleocapsid protein
of SARS-CoV-2 from different geographic locations. Each of the 20 amino acids commonly found in proteins is
assigned a score for each position in the sequence according to the frequency with which it occurs in the original
alignment.
Figure 3: Phylogenetic analysis of Nucleocapsid protein sequences of SARS-CoV-2 from different geographic
locations. Bootstrap percentage values obtained from 100 resamplings of the data set are given at the nodes of the
tree. The tree was calculated by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances
estimated using the Jones-Taylor-Thornton model.
Table 3: Molecular weight of Nucleocapsid protein of SARS-CoV-2 from different geographic locations.
Figure 4: Matthews coefficient for Nucleocapsid protein of SARS-CoV-2 from different geographical locations.
Figure 5: Solvent content for Nucleocapsid protein of SARS-CoV-2 from different geographical locations.
The current study demonstrated that the Nucleocapsid protein
sequences from different locations share a common primary amino
acid composition, which resulted in a uniform sequence alignment.
Previous reports stated that alignments are a powerful way to
compare related protein sequences. They can be used to capture
various facts about the aligned sequences, such as common
evolutionary descent or shared structural function [14]. The
composition data obtained for the SARS-CoV-2 Nucleocapsid protein
revealed high similarity across geographic locations, with mild
variability in amino acid content. The study compared amino acid
distributions between different locations. Looking at overall
amino acid frequencies, by and large the frequencies in all
investigated sequences mirrored each other, except for the
sequence obtained from Spain. The biggest differences arose in
all amino acids except histidine, isoleucine, tryptophan and
valine. According to Jackson et al. [15], there are several
patterns of sequence variation that are consistently seen in
natural proteins.
One of the first steps in a macromolecular structure determination
of a protein is to determine the number of molecules in the
crystallographic asymmetric unit. The crystal volume per unit of
protein molecular weight is known as the Matthews coefficient. A
substantial fraction of the protein crystal volume is occupied by
solvent, ranging from 27% to 78%, with the most common value being
about 43% [16]. The Matthews coefficient and solvent content were
calculated for the Nucleocapsid protein of SARS-CoV-2 from
different geographic locations and gave almost the same results
for all locations except Spain. Water plays an important role in
the structure of biomolecules and often influences protein
function [17]. Water molecules not only affect protein folding,
but also mediate biological processes such as enzymatic reactions
and molecular recognition. Information about the fraction of water
(solvent) plays a significant role in the X-ray structure
determination process [18]. The present results revealed that the
solvent content of the Nucleocapsid protein of SARS-CoV-2 from
different geographical locations was almost the same except for
Spain.
In the current study, the composition and alignment of the
Nucleocapsid protein of SARS-CoV-2 were investigated for
different locations. The sequences were highly similar to each
other and showed a uniform sequence alignment with mild
variability in amino acid content. The crystal volume per unit of
protein molecular weight of the Nucleocapsid protein of SARS-CoV-2
was almost the same for all locations except Spain.
Conflicts of Interest: The author declares no conflict of interest.