Functional studies of the coronavirus nonstructural proteins

Coronaviruses, including SARS-CoV, SARS-CoV-2, and MERS-CoV, have caused contagious and fatal respiratory diseases in humans worldwide. Notably, coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, spread rapidly in early 2020 and became a global pandemic. The nonstructural proteins of coronaviruses are critical components of the viral replication machinery. They function in viral RNA transcription and replication, as well as in counteracting host innate immunity. Studies of these proteins have not only revealed their essential roles during viral infection but also aid the design of novel drugs targeting the viral replication and immune evasion machinery. In this review, we summarize the functional studies of each nonstructural protein and compare the similarities and differences between nonstructural proteins from different coronaviruses.

The genome of coronaviruses is a nonsegmented, positive-sense, single-stranded RNA of about 28-32 kb in length (17), which encodes six major open reading frames (ORFs) and a variable number of accessory genes (18). The first two major ORFs (ORF1a, ORF1ab) are the replicase genes, and the other four encode the viral structural proteins that comprise the essential protein components of the coronavirus virions, namely the spike surface glycoprotein (S), envelope protein (E), matrix protein (M), and nucleocapsid protein (N) (14,19,20).

Nsp1: the immune repressor
Nsp1 is the most 5'-proximal nonstructural protein of β-CoVs. It is released by PLpro at conserved proteolytic sites from the polyprotein encoded by ORF1ab (63-65) (reviewed in Table 1). Owing to the severe outbreak of SARS in 2003, SARS-CoV nsp1 is among the most extensively studied. SARS-CoV nsp1 is 20 kDa in size and is distributed in the cytoplasm when transiently expressed in 293 cells (66). It was first observed that transient expression of SARS-CoV nsp1 strongly inhibited IFN-β mRNA accumulation during Sendai virus infection and promoted degradation of overexpressed exogenous mRNA and host endogenous mRNA, leading to an overall decrease in protein synthesis (66). Similar results were then obtained from studies on other β-CoVs, including mouse hepatitis virus (MHV) and bat coronavirus strains (Rm1, 133, and HKU9-1), and on an α-CoV, human coronavirus 229E (HCoV-229E) (67,68). These findings suggest that host translation arrest is a common feature of coronavirus infections.
The nsp1-mediated translation inhibition can be reproduced in a cell-free translation system (69). SARS-CoV nsp1 was demonstrated to bind and inactivate the 40S ribosomal subunit, resulting in translational inhibition. Meanwhile, nsp1 bound to the 40S ribosomal subunit could recruit a cellular endonuclease to mediate mRNA cleavage in the 5' untranslated region (5'-UTR) (70). Subsequently, the 5'-truncated host mRNAs were degraded by the host 5'-3' exonuclease Xrn1 (71). Interestingly, protein translation under the control of the internal ribosome entry site (IRES) from hepatitis C virus or cricket paralysis virus, but not encephalomyocarditis virus, could escape SARS-CoV nsp1-mediated RNA cleavage, possibly due to different requirements for translation initiation factors in forming the 48S initiation complex with the 40S subunit (69). A SARS-CoV nsp1 mutant with substitutions of two positively charged amino acids (R124A/K125A) loses the target mRNA/viral RNA binding and mRNA cleavage functions but retains the ability to inhibit translation (72). MERS-CoV nsp1 contains the same RK motif. In MERS-CoV, however, this RK motif is not involved in binding to mRNAs but is required for RNA cleavage. Instead, R13 on the first alpha-helix of MERS-CoV nsp1, which is missing from SARS-CoV nsp1, is essential for mRNA binding (73).
Another SARS-CoV nsp1 mutant, K164A/H165A, was unable to bind to the 40S subunit and lost the ability to interfere with host gene expression (69). When the same mutations (K164A/H165A) were introduced into a SARS-CoV infectious clone, the recovered virus was replication-incompetent and unable to suppress innate immune responses or degrade host mRNA (66). A similar study was performed with MHV. MHV-nsp1-Δ99, which lacks 99 nucleotides in the nsp1 coding region essential for host translation arrest, could not replicate well in wild-type mice (67). Moreover, the MHV-nsp1-Δ99 mutant regained wild-type (wt) replication levels when infecting mice deficient in type I interferon (IFN-I) recognition, highlighting the role of nsp1 in counteracting IFN-I (72). Notably, SARS-CoV-2 nsp1 shares a high protein sequence identity of 84.44% with SARS-CoV nsp1, including R124/K125 for mRNA binding (red-colored) and cleavage (asterisked, Figure 1) and K164/H165 for translation shutoff (green-colored, Figure 1), suggesting that SARS-CoV-2 nsp1 likely plays a similar role in counteracting host immune responses.
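The sequence identity quoted for the two nsp1 proteins is a percentage of matching positions in a pairwise alignment. The following is a minimal sketch of that calculation, assuming two pre-aligned sequences of equal length with "-" marking gaps. The function name and the toy peptides are illustrative and are not taken from the cited work, and tools differ in how they count gap columns, so the exact convention behind the 84.44% figure is not implied here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap (one common convention).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy example with made-up peptides (not real nsp1 sequences):
# 11 gap-free columns, 10 of which match.
toy_identity = percent_identity("MESLVPGF-NEK", "MESLVLGFANEK")
```

In practice the aligned strings would come from an alignment program; this sketch only covers the counting step.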
SARS-CoV genomic RNA and subgenomic RNAs are resistant to nsp1-induced RNA cleavage (70). The resistance element was mapped to the 5'-end leader sequence of SARS-CoV RNAs, which contains two important nucleotides, AU, at the very 5' terminus, followed by AUUA. Coincidentally, the SARS-CoV-2 genome also starts with the same two nucleotides, AU, followed by UAAA. Whether SARS-CoV-2 avoids nsp1-mediated RNA cleavage through the same sequence element remains to be answered.

Nsp2: fine-tuner of replication
Nsp2 is the most variable nonstructural protein among coronaviruses (74). A comparative analysis of the protein sequences of SARS-CoV-2 and SARS-CoV showed 61 amino acid substitutions in nsp2 between these two viruses (20). Given the sequence variability of nsp2 among coronaviruses, it was speculated that the nsp2 protein coevolved with its hosts to acquire host-specific functions and modulate infection (75). A study utilizing nsp2-deleted MHV or SARS-CoV recombinant clones showed that nsp2 is required for optimal viral replication (30). When nsp2 was deleted from the viruses, the viral titer and the rate of viral RNA synthesis were moderately reduced compared to the wild-type virus. Nsp2 has been shown to localize to the viral TRCs (43). However, deletion of nsp2 from MHV did not affect the morphology or subcellular localization of the TRCs (30). Importantly, nsp2 expressed from other genomic loci could not rescue the replication deficiency, indicating that the function of nsp2 in viral growth depends on its correct genomic locus between nsp1 and nsp3 (76). Nsp2 is known to form an nsp2-nsp3 proteolytic intermediate (30). Nsp2 may regulate protease cleavage in the form of nsp2-nsp3, thus fine-tuning viral replication.

Table 1. Proteolysis sites of the ORF1a/ORF1ab polyproteins. The polyproteins are cleaved by the nsp3 papain-like protease and the nsp5 3C-like protease, resulting in the release of sixteen nonstructural proteins. The position of the scissors represents the cleavage site.

Nsp3: the scaffold protein and protease
Nsp3 is the largest of the nonstructural proteins of β-CoVs. It is cleaved from ORF1a/ORF1ab by the papain-like protease (PL2pro) domain located within nsp3 itself. Nsp3 mediates genome replication/transcription (77-80) and pathogenesis (81). With its large size and complex domain organization, nsp3 interacts with other nonstructural proteins (77,78), structural proteins (79,80), and host proteins (62), serving as a scaffold during viral infection (33). Ubl1 is the first domain, located at the N-terminus of β-CoV nsp3. The nuclear magnetic resonance (NMR) structure of Ubl1 showed that the domain is structurally similar to ubiquitin-like proteins, although two additional helices (a 3₁₀ helix and an α helix) make the core structure more oval than globular compared to human ubiquitin or ISG15 (58,82). Ubiquitination and ISGylation are associated with host regulation of innate antiviral responses (83-86), but the role of ubiquitin mimicry by Ubl1 (as well as Ubl2) is currently unknown. It is reasonable to speculate that the ubiquitin-like domains of nsp3 could bridge the protease function of nsp3 to the ubiquitination machinery in the cell and interfere with host antiviral immunity. The Ubl1 domain of nsp3 was shown to bind predominantly to the single-stranded trinucleotide RNA sequence AUA, as mass spectrometry analysis of recombinant SARS-CoV Ubl1 purified from E. coli revealed unique co-purified RNA fragments (58). Notably, both SARS-CoV and SARS-CoV-2 have an AU-rich 5'-UTR, or even an AU-rich 5' terminus, in their genomic and subgenomic RNAs (20,70). Whether this coincidence has a functional role remains to be tested. In addition to the studies on SARS-CoV, the Ubl1 domain of MHV nsp3 was found to bind to the viral N protein (80). This interaction is essential for N protein-mediated enhancement of viral infectivity (79).
The HVR domain is located C-terminal to the Ubl1 domain. The HVR, also known as the acidic domain, is rich in the negatively charged amino acids aspartic acid (Asp/D) and glutamic acid (Glu/E). As its name indicates, it is the most variable region found in nsp3. The amino acid sequence identity between the SARS-CoV and SARS-CoV-2 HVRs is 47.14%, much lower than the 76.6% overall nsp3 amino acid sequence identity. The HVR is intrinsically disordered in SARS-CoV (58) and MHV (57). The same feature is also observed in SARS-CoV-2 as well as in three highly similar bat coronavirus isolates, BatCoV RaTG13 (accession no. MN996532) (9), Bat SL-CoV ZC45 (accession no. MG772933), and Bat SL-CoV ZXC21 (accession no. MG772934) (63), which show high nucleotide and protein identity to SARS-CoV-2 (Figure 3A). There are 45 consensus amino acids in the HVR among SARS-CoV-2 and these three bat viruses (Figure 3B), of which 48.9% are Asp/Glu. In the nonconsensus regions, the Asp/Glu percentage is 20%, 20.4%, 15.2%, and 8.3% for SARS-CoV-2, BatCoV RaTG13, Bat SL-CoV ZC45, and Bat SL-CoV ZXC21, respectively, much lower than that of the consensus sequence. This difference in Asp/Glu percentage between consensus and nonconsensus regions suggests that the negatively charged amino acids may have a function in viral replication that was selected during viral evolution. The exact role of the HVR in the viral life cycle is currently unknown, although studies on MHV suggest that the HVR is dispensable for viral infection in vitro (79). Following the HVR is macrodomain I (MacI, previously known as the X domain). Macrodomains are evolutionarily conserved domains that are ubiquitous in prokaryotes and eukaryotes. Three decades ago, bioinformatic analyses identified that members of the Coronaviridae, Togaviridae, and Hepeviridae families encode this conserved domain of unknown function, to which the name X domain was given (29,59,87-89).
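The Asp/Glu percentages reported for the consensus and nonconsensus HVR positions are simple residue fractions. A minimal sketch of that calculation follows, assuming one-letter amino acid codes with "-" for alignment gaps. The function name and the toy sequence are hypothetical. The text does not state the count of acidic consensus residues; 22 of 45 is used below only because it is the one integer count that rounds to the quoted 48.9%.

```python
ACIDIC = {"D", "E"}  # aspartic acid and glutamic acid, one-letter codes


def acidic_percentage(sequence: str) -> float:
    """Percentage of Asp/Glu residues in a sequence, ignoring alignment gaps."""
    residues = [r for r in sequence.upper() if r != "-"]
    if not residues:
        return 0.0
    return 100.0 * sum(r in ACIDIC for r in residues) / len(residues)


# Toy stand-in for the 45 consensus positions: 22 acidic residues (inferred,
# not stated in the text) and 23 alanines as neutral filler.
toy_consensus = "D" * 12 + "E" * 10 + "A" * 23
consensus_pct = acidic_percentage(toy_consensus)
```

Applying the same function to the nonconsensus columns of each virus would give the per-virus values compared in the paragraph above.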
Protein crystallography studies on the macrodomains of SARS-CoV (90,91), MERS-CoV (54), and other coronaviruses (55,56,90) showed a three-layered alpha/beta/alpha core fold similar to the C-terminal nonhistone region of MacroH2A, a variant of human histone H2A (92). Macrodomains of SARS-CoV and some other coronaviruses possess in vitro ADP-ribose-1″-phosphate phosphatase (ADRP) activity (90,91), de-mono-ADP-ribosylation (deMARylation) activity (36), and de-poly-ADP-ribosylation (dePARylation) activity (93). Studies using a series of mutations in SARS-CoV and MHV showed that the ADRP, deMARylation, and dePARylation activities of MacI are essential for viral virulence in vivo through suppression of the innate immune responses (36,81,91). The mutated sites are conserved between SARS-CoV and SARS-CoV-2 (29) (Figure 2, boxed in red).
The SARS-CoV MacII+MacIII+DPUP region forms what was previously recognized as the SARS-unique domain (SUD), although additional reports on betacoronavirus genome sequences suggest that this domain is not unique to SARS-CoV (94). MacII is the second macrodomain, located on the C-terminal side of MacI, and is dispensable for SARS-CoV replicon replication, while the third macrodomain (MacIII) is required for SARS-CoV replication (32). MacIII binds to the G-quadruplexes formed by the quadruplex-forming G-rich sequences (QGRS) located in the nsp2 and nsp12 coding regions (95). The MacII-III region also preferentially binds to oligo(G) strings, which are present in the 3'-UTRs of human mRNAs encoding defense-related genes (96). These RNA-binding features are possibly essential for viral replication. DPUP is the domain that follows MacIII. SARS-CoV lacking this domain displays reduced viral RNA levels even though the virus is still viable (32). Although the DPUP of SARS-CoV and MHV resembles a frataxin-like structure (95), which may be involved in controlling cellular oxidative stress (97,98), the exact role of DPUP in viral infection is currently unknown.

Figure 3. The HVR region of SARS-CoV-2 and its hypothetical ancestors is intrinsically disordered.
A. The degree of disorder is shown graphically based on IUPred2A analysis (226). A score above 0.5 is considered disordered. B. The alignment of the HVR region shows a high percentage of negatively charged amino acids among the conserved residues.
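The disorder call described in the Figure 3 caption (an IUPred2A score above 0.5) amounts to thresholding a per-residue score profile. The sketch below shows one way to turn such a profile into disordered segments. The function name and the toy scores are made up for illustration, and the code does not run IUPred2A itself; it assumes the scores have already been obtained.

```python
from typing import List, Tuple

DISORDER_THRESHOLD = 0.5  # cutoff stated in the Figure 3 caption


def disordered_segments(
    scores: List[float], threshold: float = DISORDER_THRESHOLD
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of runs scoring above threshold."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a disordered run begins here
        elif score <= threshold and start is not None:
            segments.append((start, i))  # the run ended at the previous residue
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # run extends to the last residue
    return segments


# Made-up per-residue scores, not real IUPred2A output.
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.9, 0.8]
toy_segments = disordered_segments(toy_scores)
```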
The NAB domain, which exists only in betacoronaviruses (94), forms flexibly extended linkers between the PL2pro domain and the following domains of nsp3 (52). The NAB domain can bind to RNA, especially GGG repeats (52), similar to the RNA recognition pattern of MacIII (95). The betacoronavirus-specific marker (βSM) domain follows the NAB domain within nsp3. SARS-CoV βSM is intrinsically disordered (82), and its role in the viral life cycle is currently unknown.
Downstream of the βSM domain is the transmembrane region, which contains two transmembrane domains (TM1/2) and one luminal loop domain (3Ecto) (94). Subcellular localization analysis of truncated SARS-CoV nsp3 mutants revealed that TM1/2 and the luminal 3Ecto domain are essential for the recruitment of nsp4 to discrete ER loci (78). The luminal 3Ecto domain of nsp3, possibly by forming a disulfide bond, was proposed to interact with the luminal domains of nsp4 to "zipper" the ER membrane and induce discrete membrane formation (78). This membrane modification was recognized as the first step in forming the ER-derived viral replication organelles (28). The nsp3-nsp4 interaction of MERS-CoV also leads to the zippering of ER membranes and the subsequent formation of double-membrane vesicles (DMVs) (107). The AH1+Y1 and CoV-Y domains form the C-terminal portion of nsp3, which faces the cytosol. AH1 encodes a predicted transmembrane domain that was shown to be a cytosolic region in SARS-CoV and MHV (42). Currently, the functions of these domains are less well understood than those of the N-terminal portion of nsp3.

Nsp4: the DMV builder
Coronavirus nsp4 is an integral membrane protein with four transmembrane domains (42). In partnership with nsp3, it plays an essential role in the formation of the membranous structure of TRCs (27,107,108). SARS-CoV or MERS-CoV nsp4 and nsp3 each localized to the reticular ER membrane when expressed separately (78,107). However, when nsp4 and nsp3 were coexpressed, distinct perinuclear loci representing stacked double ER membranes were observed (78,107). Such membrane rearrangements represent the critical step in TRC formation. The N-terminal portion of nsp4, including the first transmembrane domain and the first luminal loop region between TM1 and TM2, is required for this membrane rearrangement (78). The C-terminal TM4 and the cytosolic part of nsp4 are dispensable for both the formation of SARS-CoV-induced aggregated ER loci (78) and the efficient growth of MHV (27).
The first luminal region of SARS-CoV nsp4 could interact with the nsp3 luminal 3Ecto domain to bring two ER membranes into close proximity (109). This region was predicted to be glycosylated in various coronaviruses (108). When the glycosylation sites of this region were mutated in MHV, viral growth was reduced and DMV formation was deficient (108). Two amino acid changes (H120N/F121L) near the SARS-CoV nsp4 glycosylation site (N131) abolished the nsp4-nsp3 interaction and also led to reduced genome replication and virus production (109). Like SARS-CoV, the emerging SARS-CoV-2 contains the same sites, including both the nsp3-interacting H120/F121 and the N131 glycosylation site (Figure 4).
Nsp5, the 3C-like protease (3CLpro), is a cysteine protease with a chymotrypsin-like fold and is often referred to as the main protease (Mpro). Like PL2pro, 3CLpro is essential for nonstructural protein processing, cleaving at 11 sites downstream of the nsp4 coding region (Table 1). SARS-CoV nsp5 consists of an N-terminal domain with proteolytic activity and a C-terminal domain that contains five alpha-helices (22,51,110,111). SARS-CoV 3CLpro has at least three forms: an inactive monomer (22,112), an active homodimer (22,51,111,112), and a highly active homooctamer (110). Beyond polyprotein processing, porcine deltacoronavirus (PDCoV) nsp5 cleaves signal transducer and activator of transcription 2 (STAT2) at two locations with a glutamine (Q) residue at the P1 position, leading to inhibition of the transcription of IFN-stimulated genes (39). Deltacoronavirus nsp5 also targets the NF-κB essential modulator (NEMO) for degradation and suppresses type I IFN production (37,38). Thus, coronavirus nsp5 assists viral infection by proteolytically releasing nsp4-16 and suppresses innate immune responses by digesting essential components of the immune signal transduction pathways.

Nsp6: forming DMVs and activating autophagy
Coronavirus nsp6 is a transmembrane protein with six transmembrane domains (77). When expressed alone, it localized to the ER and induced the generation of DFCP1 (double FYVE domain-containing protein 1)-positive early autophagosomes, or omegasomes. Such structures can mature into autophagosomes that are capable of delivering LC3 for lysosomal degradation (113). Autophagy is not required for either coronavirus replication or antiviral responses in vitro. Knockout of ATG5 or ATG7, essential genes in the autophagy pathway, does not affect replication of the betacoronavirus MHV (114,115). ATG5 silencing in Vero cells or treatment with wortmannin, a class III PI3K inhibitor, also does not affect replication of infectious bronchitis virus (IBV), a gammacoronavirus (113). Although induction of autophagy is not required for coronavirus genome replication, nsp6 plays a vital role in the viral life cycle. Coronaviruses encode three nonstructural proteins with transmembrane domains: nsp3, nsp4, and nsp6. While nsp3+nsp4 only produces aggregated zippered ER structures or maze-like bodies, expression of nsp6 in addition to nsp3+nsp4 leads to the formation of DMVs resembling the authentic membranous structures of TRCs (77). Two HCoV-229E mutants, each containing a single amino acid mutation in nsp6, are resistant to the antiviral drug K22 and show partial recovery from the drug-related loss of DMVs. These mutations affected progeny infectivity, suggesting that nsp6 is critical for the viral life cycle (61). Although nsp6 expression induces autophagosome-like DMVs, coronaviruses do not require autophagy for viral replication. Nsp6, or the coronavirus more generally, likely recruits some host proteins shared with the autophagy pathway for viral DMV production. However, this speculation needs further investigation.

Nsp7+Nsp8: the RdRp cofactor
Coronavirus nsp7 and nsp8 are essential for viral survival (116). The crystal structure of SARS-CoV nsp7 with nsp8 is a hollow, cylinder-like supercomplex formed by two asymmetric units. Each unit includes four nsp7 and four nsp8 molecules (26). A channel is apparent in the nsp7+nsp8 supercomplex (26). The channel is mainly formed by the bridging of the four long N-terminal helices of nsp8, whose structure resembles the "shaft" of a "golf club" (26). Mutations of the positively charged amino acids in this "shaft" region significantly reduce the dsRNA-binding ability of the supercomplex, while mutations of positively charged amino acids in nsp7 near the channel do not (26).
Unlike nsp12, nsp8 is a noncanonical RNA-dependent RNA polymerase (RdRp) that does not contain the conserved RdRp motif (117). SARS-CoV nsp8 could initiate the synthesis of short oligonucleotides (<6 nt) at an internal template cytidine at least two nucleotides from the 3'-end (117). A later study misinterpreted this internal initiation of primer synthesis by nsp8 as de novo initiation (118). In that study, the authors also reported that nsp8 has primer extension activity (118). The association of nsp8 with nsp7 was shown to enhance the thermal stability (117) and primer extension activity of nsp8 (118). Thus, SARS-CoV nsp8, together with nsp7, provides RNA primers complementary to internal regions of the viral genomic RNA for viral replication, which also requires the "main" RdRp nsp12 (119). The nsp7+nsp8 complex of feline coronavirus (FCoV), an alphacoronavirus, is a 2:1 heterotrimer containing two conformationally different nsp7 molecules and one nsp8 molecule. Two heterotrimers can bind to each other through an nsp8-nsp8 interaction to form a heterohexamer (120). This nsp7+nsp8 complex is also capable of synthesizing short oligonucleotides (120). Similarly, the nsp7-10 polyprotein of the alphacoronavirus HCoV-229E has this noncanonical RdRp activity (120).
However, because the nsp8 primase initiates internally, the model combining primer synthesis by nsp8 with the primer-dependent RdRp activity of nsp12 still could not provide mechanistic insight into viral RNA synthesis at the 5'-end. On the other hand, a later study on the nsp7+nsp8+nsp12 complex showed that this protein complex possesses both de novo initiation and primer extension RdRp activities. In that work, the nsp7+nsp8 complex lacked de novo initiation activity, suggesting that this activity is mediated by nsp12 (119). Furthermore, the primase activity of nsp8 was not observed (119). Recently, a single-particle cryo-electron microscopy structure of nsp12 with nsp7 and nsp8 showed a heterodimer of nsp7+nsp8 as well as a second nsp8 subunit binding to the N-terminal region of nsp12 (44). This structure supports the biochemically established de novo initiation activity of the nsp7+nsp8+nsp12 complex (119), in which nsp7+nsp8 does not mediate RNA primer synthesis or form the higher-order oligomer (26,117).
In SARS-CoV-infected Vero cells, nsp8 can be detected in two forms, a 22 kDa full-length protein and a ~15 kDa version (65). The latter was confirmed to be an N-terminally truncated version (nsp8C) by western blotting with an antibody that recognizes only the C-terminal part (121). Nsp7+nsp8C forms a structure that can fuse into the nsp7+nsp8 hexadecamer and was proposed to help the virus switch from replication to genome assembly (121).

Nsp9: the dimer-forming RNA-binding protein
Nsp9 is a ~12 kDa proteolytic cleavage product of pp1a that has nucleic acid-binding activity (45,122). It preferentially binds to single-stranded RNA (45,122,123). A biotin pull-down assay showed that IBV nsp9 preferentially interacts with the 3'-UTR of the positive-strand viral RNA (124,125). Nsp9 can interact with itself as well as with the noncanonical RdRp nsp8 (124-127). Like most coronavirus nonstructural proteins, it localizes to the viral TRCs (65,128).
The crystal structure of the nsp9 monomer revealed a cone-shaped N-terminal β-barrel composed of seven β-strands and a C-terminal α-helix, a fold that is conserved among alpha-, beta-, and gamma-CoVs (45,122,123,129,130). However, its dimerization varies among different coronaviruses. SARS-CoV, IBV, and porcine deltacoronavirus (PDCoV) nsp9 were reported to form a "parallel helix-dimer" structure that is stabilized by hydrophobic interactions between the two C-terminal α-helices. The PDCoV nsp9 dimer is slightly different in that it also requires the N-terminal extended finger motif to stabilize the dimer structure. In addition, SARS-CoV nsp9 can form a "sheet-dimer" structure through interactions between β-strand five of both subunits (45). HCoV-229E nsp9 forms an "anti-parallel helix-dimer" that requires an interaction between two α-helices oriented in opposite directions, with a disulfide bond between the two nsp9 subunits (123). Porcine epidemic diarrhea virus (PEDV) nsp9 forms two possible dimer structures, resembling the "parallel helix-dimer" and the "sheet-dimer", stabilized by a disulfide linkage (129).
Despite the variety of dimer structures, dimer formation could enhance nucleic acid binding and viral replication (123,129-131). Mutations of the protein-protein interaction motif GXXXG on the C-terminal α-helix of SARS-CoV nsp9 disrupted dimer formation and significantly decreased RNA binding by nsp9. The corresponding mutations in the SARS-CoV genome were either lethal to the virus or reverted to the wt amino acid coding (131). Similarly, the G98D mutant of IBV nsp9 significantly destabilized the homodimer and abolished RNA-binding activity. The virus carrying this mutation was deficient in subgenomic RNA transcription as well as viral growth. Interestingly, the IBV nsp9 mutation I95N had almost no effect on RNA-binding activity but moderately destabilized dimer formation, while the virus carrying this mutation had severe growth defects (132). Nsp9 dimerization may therefore have essential roles in replication beyond RNA binding.

Nsp10: cofactor in viral replication
Nsp10 is a zinc finger protein that contains two zinc finger domains conserved among coronaviruses (49,50,133). Several oligomeric forms have been reported for nsp10. MHV nsp10 appears monomeric in reducing SDS-PAGE and gel filtration analyses. However, it also forms ~80 kDa and ~19600 kDa protein complexes when supplemented with zinc ions in a dynamic light scattering assay (133). SARS-CoV nsp10 was reported to form a dimer in solution, as analyzed by gel filtration (50). A simultaneous report on SARS-CoV nsp10 also revealed a dodecameric structure (49). Currently, no evidence has confirmed the biological relevance of nsp10 oligomerization in viral replication.
Nsp10 is essential for coronavirus infection. A temperature-sensitive mutation of nsp10 (Q65E) significantly inhibited MHV RNA synthesis at the nonpermissive temperature (134). Furthermore, a reverse-genetics study examined 16 nsp10 mutants of an MHV clone, of which eight were viable but displayed attenuated viral growth, while the other eight were nonviable (135). One of the nsp10 mutants (D47A/H48A) was studied in depth and had subtle effects on nsp4-10/11 polyprotein processing.
During viral replication, nsp10 enhances the enzymatic activities of other replication proteins (136-138). SARS-CoV nsp10 interacts with the exoribonuclease domain of nsp14, resulting in significantly increased exoribonuclease activity (136). Mutations in MHV nsp10 (R80A/E82A) led to increased sensitivity of the virus to RNA mutagen treatment (25), suggesting the involvement of nsp10 in the coronavirus proofreading function, which relies on the exoribonuclease activity of nsp14. A heterodimeric complex structure was also identified for nsp10/nsp16 (137). Nsp10 interacts with the nsp16 S-adenosyl-L-methionine (SAM)-binding pocket and stimulates the association of both the methyl donor SAM and the capped RNA acceptor with nsp16 (137), thus activating nsp16 to methylate the coronaviral mRNA cap at the 2'-O position (139).
In addition to its role as a part of the viral TRCs, nsp10 is involved in the development of viral cytopathic effects. SARS-CoV nsp10 interacts with the human NADH 4L subunit and cytochrome oxidase II and alters the activity of the NADH-cytochrome (140). Through these interactions, nsp10 impaired the oxidoreductase system and induced depolarization of the mitochondrial inner membrane (140).

Nsp11: small peptide with unknown function
Nsp11 is a small peptide located at the C-terminus of ORF1a. A three-stemmed mRNA pseudoknot containing a typical hepta-nucleotide sequence UUUAAAC is situated in the nsp11 coding region (141). This RNA structure results in a programmed -1 ribosomal frameshift, which leads to the production of ORF1ab (141,142).
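The effect of the programmed -1 ribosomal frameshift on the reading frame can be illustrated with a short sketch. It assumes the standard picture of -1 frameshifting, in which the heptanucleotide is read as U UUA AAC in the zero frame and, after the ribosome slips back by one nucleotide, decoding continues from the last nucleotide of the site. The function name is hypothetical, the nucleotides flanking UUUAAAC in the example are illustrative, and the pseudoknot that stimulates the shift is not modeled.

```python
from typing import List, Tuple

SLIPPERY_SITE = "UUUAAAC"  # heptanucleotide sequence named in the text


def minus_one_frameshift(mrna: str, site: str = SLIPPERY_SITE) -> Tuple[List[str], List[str]]:
    """Return (zero-frame codons through the site, -1 frame codons after the shift).

    Assumes the zero frame starts at index 0 of `mrna`.
    """
    pos = mrna.find(site)
    if pos == -1:
        raise ValueError("slippery site not found")
    # In the zero frame the heptamer is read as X XXY YYZ, so a codon
    # boundary must fall one nucleotide into the site.
    if (pos + 1) % 3 != 0:
        raise ValueError("slippery site is not in the X_XXY_YYZ phase")
    site_end = pos + len(site)  # index of the first nucleotide after the site
    zero_frame = [mrna[i:i + 3] for i in range(0, site_end, 3)]
    # After slipping back one nucleotide, the last base of the site is read
    # again as the first base of the next codon; only full codons are kept.
    minus_one = [mrna[i:i + 3] for i in range(site_end - 1, len(mrna) - 2, 3)]
    return zero_frame, minus_one


# Illustrative sequence built around the heptamer.
zero_codons, shifted_codons = minus_one_frameshift("GCUUUAAACGGGUUUGCGG")
```

Without the shift, the ribosome would continue in the zero frame and read the short nsp11 coding region to the end of ORF1a; with the shift, decoding continues in the ORF1b frame.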
A proximity-labeling experiment identified nearly all nonstructural proteins, except nsp11, in the microenvironment of MHV replication complexes (128). SARS-CoV nsp11 does not interact with other nonstructural proteins in a mammalian two-hybrid assay (124). Currently, no function has been identified for nsp11. According to these data, nsp11 is likely not a member of the replication complexes. However, the exact role of nsp11 remains to be explored.

Nsp12: the main RdRp
Nsp12 is the first nonstructural protein encoded by ORF1b and functions as the primary RNA-dependent RNA polymerase of coronaviruses. Nsp12 is at the center of the viral TRCs, which participate in both the synthesis of new full-length genomic RNA and the discontinuous transcription of subgenomic RNAs (18,143-146).
Coronavirus nsp12 contains two main functional domains. The C-terminal portion of nsp12 is the canonical RdRp domain, which resembles a cupped right hand with the fingers, palm, and thumb holding the template RNA (44,147). The palm subdomain is the catalytic core and contains a conserved SDD motif in the active site. As in other positive-strand RNA viruses, mutations in the SDD motif abolish RdRp activity (148). Aside from the C-terminal polymerase domain, nsp12 also contains a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) domain, which is unique to the Nidovirales (44,147). Nsp9 of the arterivirus equine arteritis virus (EAV) is the homolog of coronavirus nsp12. The NiRAN domain of EAV nsp9 can be nucleotidylated, as a phosphoamide bond can form between the protein and a GTP or UTP molecule (149). Single-particle cryo-EM imaging shows a structure in which the NiRAN domain of nsp12 interacts with an nsp7+nsp8 heterodimer as well as a second nsp8 (44). The interaction with the nsp7 and nsp8 cofactors appears to help stabilize the nsp12 RNA-binding region and to extend the RNA-binding surface (44). Genetic studies also supported an essential role for the nucleotidylating activity of nsp12 in EAV and SARS-CoV replication (149); however, the exact function of this activity is still unknown.
The RdRp activity of SARS-CoV nsp12 was investigated by several groups after the SARS epidemic of 2002-2003. Early studies using recombinant nsp12 showed a primer-dependent RNA polymerase activity (118,119,150). This primer-dependent RdRp activity of nsp12 was proposed to work with the nsp8 primase in viral genome synthesis (118,150). However, because nsp8 lacks de novo initiation activity, this model left a gap in the viral replication cycle, as it could not explain how the virus maintains its 5'-end. A biochemical assay using recombinant nsp12 and the nsp7/nsp8 cofactors showed that nsp12 is capable of de novo initiation (119), while nsp8 primase activity could not be detected. The cryo-EM structure of this complex supported the de novo initiation activity of the nsp12 polymerase. The structure showed that the proposed active site of the nsp8 primase could not fit into the nsp12 RNA synthesis pocket (44), further supporting the biochemical finding that nsp8 does not have primase activity.

Nsp13: the helicase
Helicases are enzymes that unwind double-stranded DNA or RNA (151,152). RNA viruses encode RNA helicases (153) or recruit host alternatives (154-156) to promote their genome replication and viral gene expression. Apart from these "pro-viral" functions, RNA helicases are also involved in host antiviral responses (157). Specifically, the innate immune systems of both animal and plant hosts encode RNA helicases that recognize and respond to foreign double-stranded RNA in the cytoplasm (157-160).
Coronavirus nsp13 contains a C-terminal helicase domain that belongs to the superfamily 1 helicases (152). At the N-terminus of nsp13 is a zinc-binding domain (ZBD), which is conserved among members of the Nidovirales (142,161,162). Nsp13 exhibits both RNA and DNA duplex unwinding activities in vitro, as shown by biochemical studies of recombinant nsp13 from HCoV-229E (163,164) and SARS-CoV (165,166). Nsp13 unwinds its substrates in the 5'-3' direction using the energy generated from hydrolysis of NTPs and dNTPs, most efficiently with ATP, dATP, and GTP. Transient kinetic analysis showed that SARS-CoV nsp13 unwinds nucleic acid in discrete steps of 9.3 bp each, at a catalytic rate of 30 steps per second (167). Moreover, the unwinding activity can be enhanced 2-fold by nsp12 through the nsp12-nsp13 interaction (167). Nsp13 preferentially binds to a 5'-overhang and processes such duplexes with higher activity (168). The ZBD is essential for helicase activity, and replacement of conserved ZBD Cys and His residues disrupted the ATPase activity of HCoV-229E nsp13 (161). In addition to the NTPase and dNTPase activities, an RNA 5'-triphosphatase activity was discovered for HCoV-229E and SARS-CoV nsp13, which may catalyze the first step in the formation of the 5'-cap structure of viral RNAs (164,165). The NTPase activity of arterivirus nsp10, the homolog of coronavirus nsp13, is essential for viral survival (169). In addition, a mutation (A335V) in the RNA-binding channel of MHV nsp13 decreased viral replication both in vitro and in vivo (170).
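The step size and stepping rate quoted for SARS-CoV nsp13 imply an overall unwinding rate, which the following sketch works out. It is an idealization that assumes uninterrupted, consecutive steps with no pausing or dissociation; the function names are hypothetical, and only the two constants come from the text.

```python
STEP_SIZE_BP = 9.3       # base pairs unwound per step (value quoted in the text)
STEPS_PER_SECOND = 30.0  # stepping rate (value quoted in the text)


def unwinding_rate(step_size_bp: float = STEP_SIZE_BP,
                   steps_per_second: float = STEPS_PER_SECOND) -> float:
    """Idealized unwinding rate in base pairs per second."""
    return step_size_bp * steps_per_second


def time_to_unwind(duplex_bp: float, rate_bp_per_s: float) -> float:
    """Seconds needed to unwind a duplex of the given length at a constant rate."""
    if rate_bp_per_s <= 0:
        raise ValueError("rate must be positive")
    return duplex_bp / rate_bp_per_s


rate = unwinding_rate()  # roughly 279 bp per second under these assumptions
```

How the reported 2-fold enhancement by nsp12 maps onto step size or stepping rate is not specified in the text, so it is left out of the sketch.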

Nsp14: dual-functional RNA modifier
CoV nsp14 has dual functions in viral RNA processing (142). The N-terminal domain of nsp14 is a 3'-5' exonuclease (ExoN) (19) that belongs to the DEDD superfamily of exonucleases. The exonuclease acts on both ssRNA and dsRNA but cannot hydrolyze DNA or ribose-2'-O-methylated RNA substrates in vitro (171). ExoN activity is stimulated more than 35-fold by interaction with nsp10, and the resulting complex can remove a single mismatched nucleotide from the 3'-end of a newly synthesized RNA strand (172). Coronavirus nsp14 ExoN activity is part of the RNA proofreading machinery during viral replication: mutant MHV or SARS-CoV deficient in ExoN activity displays a higher mutation rate and lower replication fidelity (173,174). Moreover, ExoN is linked to the host innate immune response. MHV lacking ExoN activity showed increased sensitivity to cellular pretreatment with IFN-β (175). In contrast, TGEV nsp14 is probably responsible for inducing IFN-β production through its interaction with the cellular RNA helicase DDX1 (176).
The C-terminus of nsp14 carries guanosine N7-methyltransferase (MTase) activity (177). Overexpression of SARS-CoV or TGEV nsp14 in a yeast null mutant of the mRNA guanine 7-methyltransferase abd1 rescued the growth-deficient phenotype (177). Nsp14 specifically methylates GTP, dGTP, or the inverted guanosine attached to the 5'-end of RNA (178). When a point mutation in the MTase domain (D331A) was introduced into a SARS-CoV replicon carrying a luciferase reporter, luciferase activity dropped to 10% and subgenomic RNA accumulation dropped to 19% of the wt level (177). The defect in viral transcription and gene expression is likely due to instability of the viral RNA produced by the mutant.
Nsp15 forms a hexamer (46,48,184) and depends on manganese as a cofactor for its ribonuclease activity (185)(186)(187). Nsp15 can process both ssRNA and dsRNA, but not DNA (185). Blocking the 5'- or 3'-ends of substrate RNAs did not prevent RNA degradation, suggesting that the activity is directed mainly toward the internal (endo) portion of the RNA substrate (185,188). Mass spectrometry analysis of RNA products digested by SARS-CoV nsp15 revealed that the major cleavage site is 3' of uridylate, and that cleavage can also occur 3' of cytidylate in favored sequence contexts (189). The hexameric form of SARS-CoV nsp15 was found to be responsible for RNA binding (189). Other studies on MHV nsp15, however, showed that the monomeric form has a higher RNA-binding affinity and similar ribonuclease activity (47).
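The cleavage preference described above can be illustrated with a toy digestion model. This is a deliberately simplified sketch, not a description of the assays in the cited studies: the function name and example sequence are hypothetical, the model assumes complete cleavage 3' of every uridylate, and it ignores the context-dependent cytidylate sites, RNA structure, and cofactor requirements.

```python
from typing import List


def cleave_after_uridylate(rna: str) -> List[str]:
    """Split an RNA sequence 3' of every U (toy model of complete digestion).

    Assumption: every uridylate is cleaved; cytidylate sites and sequence
    context are ignored.
    """
    rna = rna.upper()
    fragments = []
    start = 0
    for index, base in enumerate(rna):
        if base == "U":
            # Cut immediately 3' of the uridylate, so U ends the fragment.
            fragments.append(rna[start:index + 1])
            start = index + 1
    if start < len(rna):
        # Keep any remaining 3' sequence that contains no further U.
        fragments.append(rna[start:])
    return fragments


if __name__ == "__main__":
    # Hypothetical substrate chosen only for illustration.
    print(cleave_after_uridylate("GAUCCUAAG"))  # ['GAU', 'CCU', 'AAG']
```

Each fragment except possibly the last ends in U, which mirrors the observation that the major cleavage products terminate at uridylate.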
Early genetic studies that mutated key amino acids in the MHV nsp15 catalytic pocket found that nsp15 deficiency decreased RNA replication and viral growth in cell culture (190). A breakthrough in understanding its unique function came a decade later, when viral replication was tested in mouse macrophages and in vivo (24,191). The nsp15 EndoU-deficient mutant MHV stimulated an early accumulation of cytosolic dsRNA during infection, which led to robust induction of IFN-I and PKR-mediated apoptosis, and exhibited impaired viral growth (24,191). Moreover, infection by the mutant virus was restricted in primary cells and in vivo, and the virus could not spread efficiently (24). Thus, the endoribonuclease activity of nsp15 promotes digestion of excess viral dsRNA at the replication sites and mediates viral evasion of host dsRNA-mediated innate immunity at the early stage of infection (186).
SARS-CoV or MERS-CoV nsp16 mutants carrying mutations in the conserved KDKE motif were strongly attenuated in vitro and in vivo (34,194). As is common among RNA viruses that counteract innate immunity (192), the cap-1 type of modification helps coronaviruses evade the RNA recognition machinery and the antiviral responses mediated by IFN-I (34,194). Viruses defective in 2'-O-MTase activity showed increased sensitivity to IFN-I treatment compared with the wt virus (34,194). The host cytoplasmic RNA sensor Mda5 was shown to recognize the viral transcripts produced by the nsp16 mutant virus, because in the absence of Mda5 the replication and virulence of the mutant virus were restored (34,194).

Nonstructural proteins are useful drug targets
Phylogenetic studies and serological evidence show that the human-infecting betacoronaviruses, including the highly pathogenic SARS-CoV, SARS-CoV-2, and MERS-CoV, have animal origins (9,63,(195)(196)(197). Bats have been identified as the natural reservoirs for human coronaviruses (198)(199)(200). Bats usually do not display signs of disease when infected with coronaviruses and have evolved an immune system that allows virus propagation (201,202). Because increasing human activity and global warming are changing bat habitats, the emergence of new zoonotic coronaviral diseases is very likely (199)(200)(201). This consensus demands the development of novel anticoronaviral medicines. The nonstructural proteins and the viral replication processes of coronaviruses have been shown to be potential antiviral drug targets (203).
The coronaviral nsp12 RdRp serves as an important drug target. Nucleotide analogs can compete directly with the nucleotide substrates of RdRp, halting the polymerase reaction and disrupting viral replication (147). A recent report showed that remdesivir, an adenosine analog, can efficiently inhibit viral infection in SARS-CoV-2-sensitive Huh-7 cells (219). The first SARS-CoV-2 patient in the United States was administered remdesivir under a "compassionate use" protocol, showed improved clinical condition within about 24 hours, and was eventually discharged from the hospital, suggesting possible efficacy of remdesivir against the coronavirus in this individual case (8). Remdesivir and several other RdRp inhibitors are currently in clinical trials for the treatment of COVID-19 (213,214).
The helicase domain of nsp13 has also shown promise as a potential target. SSYA10-001, a 1,2,4-triazole derivative, can block the unwinding activity of nsp13 in a non-competitive manner (220), while myricetin and scutellarein suppress the ATPase activity of nsp13 (221). Furthermore, the adamantane-derived bananins can inhibit both the ATPase and helicase activities of nsp13 and decrease viral infection in cell culture (222). However, none of these drug candidates had entered clinical trials by the end of February 2020 (214).

Summary
Significant progress has been made in understanding the coronavirus nonstructural proteins, especially since the SARS epidemic in 2003. Most studies used cultured cells to investigate coronaviral infection, but transgenic humanized mouse models have also played critical roles in dissecting viral pathogenesis and in aiding drug discovery (223)(224)(225). These studies of coronaviral nonstructural proteins have provided in-depth knowledge of how the viruses establish infection and will continue to aid the discovery of new drugs effective against coronaviruses.