<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "JATS-journalpublishing1.dtd">
<article xml:lang="en" article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Exploration of Immunology</journal-id>
<journal-title-group>
<journal-title>Exploration of Immunology</journal-title>
</journal-title-group>
<issn pub-type="epub">2768-6655</issn>
<publisher>
<publisher-name>Open Exploration</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">10033</article-id>
<article-id pub-id-type="doi">10.37349/ei.2021.00003</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Original Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title><italic>In silico</italic> investigation of binding affinities between human leukocyte antigen class I molecules and SARS-CoV-2 virus spike and ORF1ab proteins</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1087-4126</contrib-id>
<name>
<surname>Charonis</surname>
<given-names>Spyros A.</given-names>
</name>
<xref ref-type="aff" rid="AFF1"><sup>1</sup></xref>
<xref ref-type="aff" rid="AFF2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-5088-3986</contrib-id>
<name>
<surname>Tsilibary</surname>
<given-names>Effie-Photini</given-names>
</name>
<xref ref-type="aff" rid="AFF1"><sup>1</sup></xref>
<xref ref-type="aff" rid="AFF2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-4412-725X</contrib-id>
<name>
<surname>Georgopoulos</surname>
<given-names>Apostolos P.</given-names>
</name>
<xref ref-type="aff" rid="AFF1"><sup>1</sup></xref>
<xref ref-type="aff" rid="AFF2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="C1"><sup>&#x0002A;</sup></xref>
</contrib>
<contrib contrib-type="academic-editor">
<name>
<surname>Mehra</surname>
<given-names>Narinder K.</given-names>
</name>
</contrib>
<aff id="AFF1"><label>1</label>Brain Sciences Center, Department of Veterans Affairs Health Care System, Minneapolis, MN 55417, USA</aff>
<aff id="AFF2"><label>2</label>Department of Neuroscience, University of Minnesota Medical School, Minneapolis, MN 55455, USA</aff>
<aff id="AFF3">the Indian Council of Medical Research (ICMR), India</aff>
</contrib-group>
<author-notes>
<corresp id="C1"><label>&#x0002A;</label><bold>Correspondence:</bold> Apostolos P. Georgopoulos, Brain Sciences Center, Department of Veterans Affairs Health Care System, 1 Veterans Drive, Minneapolis, MN 55417, USA; Department of Neuroscience, University of Minnesota Medical School, 321 Church St SE, Minneapolis, MN 55455, USA. <email>omega@umn.edu</email></corresp>
</author-notes>
<pub-date pub-type="ppub">
<year>2021</year>
</pub-date>
<pub-date pub-type="epub">
<day>30</day>
<month>04</month>
<year>2021</year>
</pub-date>
<volume>1</volume>
<fpage>16</fpage>
<lpage>26</lpage>
<history>
<date date-type="received">
<day>11</day>
<month>01</month>
<year>2021</year></date>
<date date-type="accepted">
<day>19</day>
<month>02</month>
<year>2021</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; The Author(s) 2021.</copyright-statement>
<copyright-year>2021</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This is an Open Access article licensed under a Creative Commons Attribution 4.0 International License (<ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link>), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.</license-p></license>
</permissions>
<abstract>
<sec><title>Aim:</title>
<p>The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes coronavirus disease 2019, a global pandemic. There is hence an urgent need for effective approaches to understand the mechanisms of viral interaction with immune cells that lead to viral elimination and subsequent long-term immunity. The first, immediate response to the viral infection involves mobilization of native immunity and human leukocyte antigen (HLA) class I mechanisms to kill infected cells and eliminate the virus. The second line of defense involves the activation of the HLA class II system for the production of antibodies against the virus, which add to the elimination of the virus and prevent future infections. In a previous study, the relations between the SARS-CoV-2 spike glycoprotein (S protein) and HLA class II alleles were investigated; here we report on the relations of the S protein and the open reading frame 1ab (ORF1ab) of SARS-CoV-2 to HLA class I alleles.</p>
</sec>
<sec><title>Methods:</title>
<p>An <italic>in silico</italic> sliding window approach was used to determine exhaustively the binding affinities of linear epitopes of 10 amino acid length (10-mers) to each of 61 common (global frequency &#x2265; 0.01) HLA class I molecules (17, 24 and 20 from gene loci <italic>A</italic>, <italic>B</italic> and <italic>C</italic>, respectively). A total of 8,354 epitopes were analyzed; 1,263 from the S protein and 7,091 from ORF1ab.</p>
</sec>
<sec><title>Results:</title>
<p>HLA-<italic>A</italic> genes were the most effective at binding SARS-CoV-2 epitopes for both spike and ORF1ab proteins. Good binding affinities were found for all three genes and were distributed throughout the length of the S protein and ORF1ab polyprotein sequence.</p>
</sec>
<sec><title>Conclusions:</title>
<p>Common HLA class I molecules, as a population, are very well suited to binding with high affinity to SARS-CoV-2 spike and ORF1ab proteins and hence should be effective in aiding the early elimination of the virus.</p>
</sec>
</abstract>
<kwd-group>
<kwd>ORF1ab</kwd>
<kwd>SARS-CoV-2</kwd>
<kwd>SARS-CoV-2 spike glycoprotein</kwd>
<kwd>human leukocyte antigen class I</kwd>
<kwd><italic>in silico</italic> investigation</kwd>
</kwd-group></article-meta>
</front>
<body>
<sec id="s1"><title>Introduction</title>
<p>Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes coronavirus disease 2019 (COVID-19), a disease that has now become a global pandemic. The steps following infection include a first phase during which native immunity mechanisms and human leukocyte antigen (HLA) class I molecules (of the <italic>A</italic>, <italic>B</italic>, <italic>C</italic> genes) combat the virus in the cells it has entered, where viral proteins are fragmented via proteasomal cleavage into 9&#x2013;13 amino acid (AA) fragments. HLA class I genes code for cell-surface glycoproteins that are expressed on nucleated cells and present those antigen peptides on the cell surface to CD8<sup>&#x0002B;</sup> cytotoxic T cells to signal cell destruction, thus eliminating infected cells. In this way, HLA class I restricted processing and presentation alerts the immune system to any infectious processes unfolding intracellularly and provides potential targets for a cytotoxic T cell response. During the second phase of the response to the viral infection, HLA class II molecules initiate the production of specific antibodies against the virus. HLA class II molecules (of the HLA<italic>-DR</italic>, <italic>-DQ</italic> and <italic>-DP</italic> genes) are expressed on professional antigen-presenting cells (e.g., macrophages, dendritic cells) and present endocytosed extracellular antigen peptides to CD4<sup>&#x0002B;</sup> T cells to promote B-cell mediated antibody production and immune memory. The HLA region on chromosome 6 is the most highly polymorphic in the human genome, resulting in considerable individual and population variability in HLA composition and reflecting the long evolutionary history of exposure to, elimination of, and ultimate protection from various pathogens &#x0005B;<xref ref-type="bibr" rid="B1">1</xref>, <xref ref-type="bibr" rid="B2">2</xref>&#x0005D;.</p>
<p>In a previous study &#x0005B;<xref ref-type="bibr" rid="B3">3</xref>&#x0005D;, we reported on the relations between SARS-CoV-2 virus spike glycoprotein (S protein) and 66 common HLA class II alleles, investigated using an <italic>in silico</italic> approach &#x0005B;<xref ref-type="bibr" rid="B4">4</xref>&#x0005D; by assessing the binding affinity of epitopes of the S protein to these most commonly occurring alleles (frequency &#x2265; 0.01). In the current study, we employed the same sliding epitope window methodology to exhaustively scan the entire S protein and open reading frame 1ab (ORF1ab) protein and determine the binding affinity of each <italic>n</italic>-mer (<italic>n</italic> &#x0003D; 10) AA epitope to 61 common HLA class I alleles.</p>
</sec>
<sec id="s2"><title>Materials and methods</title>
<p>The main objective of this study was to exhaustively assess the binding affinities of HLA class I molecules to the SARS-CoV-2 S protein and the ORF1ab polyprotein. For that purpose, we assessed the binding affinities of 61 common class I alleles as described below.</p>
<sec><title>HLA alleles</title>
<p>For this study, we selected the more frequent alleles of classical HLA class I genes (<italic>A</italic>, <italic>B</italic>, <italic>C</italic>), namely all alleles with global frequencies &#x2265; 0.01 (<italic>n</italic> &#x0003D; 61 total), an arbitrary but reasonable threshold. For that purpose, we obtained an Estimation of Global Allele Frequencies by querying the website <ext-link ext-link-type="uri" xlink:href="http://www.allelefrequencies.net">http://www.allelefrequencies.net</ext-link> &#x0005B;<xref ref-type="bibr" rid="B5">5</xref>&#x0005D;. The alleles with frequencies &#x2265; 0.01 that we used are given in <xref ref-type="table" rid="T1">Table 1</xref>. They comprised 17, 24 and 20 alleles of <italic>A</italic>, <italic>B</italic> and <italic>C</italic> genes, respectively.</p>
<table-wrap id="T1" position="float"><label>Table 1.</label><caption><p>HLA class I alleles used, ordered by their global frequencies in descending order (see text for details)</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th colspan="2" align="left" valign="top">Gene <italic>A</italic></th>
<th colspan="2" align="left" valign="top">Gene <italic>B</italic></th>
<th colspan="2" align="left" valign="top">Gene <italic>C</italic></th>
</tr>
<tr>
<th align="left" valign="top">Allele</th>
<th align="left" valign="top">Frequency</th>
<th align="left" valign="top">Allele</th>
<th align="left" valign="top">Frequency</th>
<th align="left" valign="top">Allele</th>
<th align="left" valign="top">Frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:01</italic></td>
<td align="left" valign="top">0.19257</td>
<td align="left" valign="top"><italic>B&#x0002A;07:02</italic></td>
<td align="left" valign="top">0.08271</td>
<td align="left" valign="top"><italic>C&#x0002A;04:01</italic></td>
<td align="left" valign="top">0.13265</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;01:01</italic></td>
<td align="left" valign="top">0.10933</td>
<td align="left" valign="top"><italic>B&#x0002A;08:01</italic></td>
<td align="left" valign="top">0.06555</td>
<td align="left" valign="top"><italic>C&#x0002A;07:02</italic></td>
<td align="left" valign="top">0.12442</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;24:02</italic></td>
<td align="left" valign="top">0.09998</td>
<td align="left" valign="top"><italic>B&#x0002A;35:01</italic></td>
<td align="left" valign="top">0.05815</td>
<td align="left" valign="top"><italic>C&#x0002A;07:01</italic></td>
<td align="left" valign="top">0.12235</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;03:01</italic></td>
<td align="left" valign="top">0.09324</td>
<td align="left" valign="top"><italic>B&#x0002A;44:02</italic></td>
<td align="left" valign="top">0.05244</td>
<td align="left" valign="top"><italic>C&#x0002A;06:02</italic></td>
<td align="left" valign="top">0.08360</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;11:01</italic></td>
<td align="left" valign="top">0.07282</td>
<td align="left" valign="top"><italic>B&#x0002A;44:03</italic></td>
<td align="left" valign="top">0.04972</td>
<td align="left" valign="top"><italic>C&#x0002A;04:43</italic></td>
<td align="left" valign="top">0.08053</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;23:01</italic></td>
<td align="left" valign="top">0.03373</td>
<td align="left" valign="top"><italic>B&#x0002A;51:01</italic></td>
<td align="left" valign="top">0.04963</td>
<td align="left" valign="top"><italic>C&#x0002A;03:04</italic></td>
<td align="left" valign="top">0.06754</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;68:01</italic></td>
<td align="left" valign="top">0.03305</td>
<td align="left" valign="top"><italic>B&#x0002A;40:01</italic></td>
<td align="left" valign="top">0.04177</td>
<td align="left" valign="top"><italic>C&#x0002A;05:01</italic></td>
<td align="left" valign="top">0.05914</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;26:01</italic></td>
<td align="left" valign="top">0.03199</td>
<td align="left" valign="top"><italic>B&#x0002A;15:01</italic></td>
<td align="left" valign="top">0.03966</td>
<td align="left" valign="top"><italic>C&#x0002A;01:02</italic></td>
<td align="left" valign="top">0.04659</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;29:02</italic></td>
<td align="left" valign="top">0.02847</td>
<td align="left" valign="top"><italic>B&#x0002A;18:01</italic></td>
<td align="left" valign="top">0.03814</td>
<td align="left" valign="top"><italic>C&#x0002A;02:02</italic></td>
<td align="left" valign="top">0.04371</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;32:01</italic></td>
<td align="left" valign="top">0.02784</td>
<td align="left" valign="top"><italic>B&#x0002A;14:02</italic></td>
<td align="left" valign="top">0.02643</td>
<td align="left" valign="top"><italic>C&#x0002A;16:01</italic></td>
<td align="left" valign="top">0.04129</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;31:01</italic></td>
<td align="left" valign="top">0.02769</td>
<td align="left" valign="top"><italic>B&#x0002A;57:01</italic></td>
<td align="left" valign="top">0.02553</td>
<td align="left" valign="top"><italic>C&#x0002A;12:03</italic></td>
<td align="left" valign="top">0.03938</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;68:02</italic></td>
<td align="left" valign="top">0.01941</td>
<td align="left" valign="top"><italic>B&#x0002A;53:01</italic></td>
<td align="left" valign="top">0.02361</td>
<td align="left" valign="top"><italic>C&#x0002A;08:02</italic></td>
<td align="left" valign="top">0.03282</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:06</italic></td>
<td align="left" valign="top">0.01365</td>
<td align="left" valign="top"><italic>B&#x0002A;58:01</italic></td>
<td align="left" valign="top">0.02326</td>
<td align="left" valign="top"><italic>C&#x0002A;15:02</italic></td>
<td align="left" valign="top">0.02997</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;33:01</italic></td>
<td align="left" valign="top">0.01282</td>
<td align="left" valign="top"><italic>B&#x0002A;27:05</italic></td>
<td align="left" valign="top">0.02240</td>
<td align="left" valign="top"><italic>C&#x0002A;17:01</italic></td>
<td align="left" valign="top">0.01957</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;25:01</italic></td>
<td align="left" valign="top">0.01255</td>
<td align="left" valign="top"><italic>B&#x0002A;13:02</italic></td>
<td align="left" valign="top">0.02061</td>
<td align="left" valign="top"><italic>C&#x0002A;14:02</italic></td>
<td align="left" valign="top">0.01949</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:02</italic></td>
<td align="left" valign="top">0.01097</td>
<td align="left" valign="top"><italic>B&#x0002A;40:02</italic></td>
<td align="left" valign="top">0.01981</td>
<td align="left" valign="top"><italic>C&#x0002A;08:01</italic></td>
<td align="left" valign="top">0.01833</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:05</italic></td>
<td align="left" valign="top">0.01065</td>
<td align="left" valign="top"><italic>B&#x0002A;38:01</italic></td>
<td align="left" valign="top">0.01872</td>
<td align="left" valign="top"><italic>C&#x0002A;12:02</italic></td>
<td align="left" valign="top">0.01678</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;49:01</italic></td>
<td align="left" valign="top">0.01844</td>
<td align="left" valign="top"><italic>C&#x0002A;03:02</italic></td>
<td align="left" valign="top">0.01492</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;35:03</italic></td>
<td align="left" valign="top">0.01644</td>
<td align="left" valign="top"><italic>C&#x0002A;07:04</italic></td>
<td align="left" valign="top">0.01196</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;45:01</italic></td>
<td align="left" valign="top">0.01291</td>
<td align="left" valign="top"><italic>C&#x0002A;17:03</italic></td>
<td align="left" valign="top">0.01037</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;15:03</italic></td>
<td align="left" valign="top">0.01247</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;46:01</italic></td>
<td align="left" valign="top">0.01204</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;37:01</italic></td>
<td align="left" valign="top">0.01181</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"><italic>B&#x0002A;39:01</italic></td>
<td align="left" valign="top">0.01011</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec><title>SARS-CoV-2 S protein</title>
<p>The AA sequence of the SARS-CoV-2 S protein (&#x201C;glycoprotein&#x201D;) was retrieved from the UniProtKB database &#x0005B;<xref ref-type="bibr" rid="B6">6</xref>&#x0005D;. It consists of 1,273 AA residues. As mentioned above, the main objective of this study was to exhaustively assess the binding affinities of HLA class I molecules to the SARS-CoV-2 S protein. For that purpose, we used a sliding epitope window approach &#x0005B;<xref ref-type="bibr" rid="B3">3</xref>, <xref ref-type="bibr" rid="B4">4</xref>&#x0005D; to partition the sequence of the S protein into subsequences of all possible consecutive 10-mers (e.g., residues S1-S10, S2-S11, &#x2026;, S1263-S1273) that covered the entire sequence length (1,273 AA). The method is illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float"><label>Figure 1.</label><caption><p>A sample of the sliding window approach for the SARS-CoV-2 S protein (see text for details)</p></caption><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="10033-g001.tif"/></fig>
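<p>The partitioning described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names <monospace>sliding_windows</monospace> and <monospace>to_fasta</monospace>, the record-header format and the toy sequence are all hypothetical. A full enumeration of windows of length <italic>k</italic> over <italic>L</italic> residues yields <italic>L</italic> &#x2212; <italic>k</italic> &#x0002B; 1 subsequences; the subsequence counts reported in this article are taken as given and are not recomputed by the sketch.</p>
<code language="python" xml:space="preserve"><![CDATA[
```python
from typing import Iterator, Tuple


def sliding_windows(sequence: str, k: int = 10) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, peptide) for every consecutive window of length k.

    Positions are 1-based and inclusive, matching the residue numbering
    style used in the text (e.g., S1-S10, S2-S11, ...).
    """
    for i in range(len(sequence) - k + 1):
        yield i + 1, i + k, sequence[i:i + k]


def to_fasta(sequence: str, k: int = 10, prefix: str = "S") -> str:
    """Format every window as a FASTA record (the header format is hypothetical)."""
    records = [
        f">{prefix}{start}-{prefix}{end}\n{peptide}"
        for start, end, peptide in sliding_windows(sequence, k)
    ]
    return "\n".join(records)


if __name__ == "__main__":
    toy_sequence = "ACDEFGHIKLMNPQ"  # arbitrary 14-residue toy string, not viral data
    print(to_fasta(toy_sequence))
```
]]></code>
<p>For the 14-residue toy string, the sketch emits five records, the first headed S1-S10 and the last S5-S14.</p>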
</sec>
<sec><title>ORF1ab</title>
<p>The complete AA sequence of the ORF1ab polyprotein was retrieved on February 2, 2021 from the NCBI SARS-CoV-2 data hub resource (URL: <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/sars-cov-2/">https://www.ncbi.nlm.nih.gov/sars-cov-2/</ext-link>). The retrieved sequence (GenBank Acc. number: QQX03240) was matched by filtering the search for release date (most recent sequence) and protein (ORF1ab polyprotein). The ORF1ab encodes numerous viral proteins including a leader protein, non-structural protein (nsp)2-9, 3C-like proteinase, RNA-dependent RNA polymerase, helicase, exonuclease, endoRNAse and methyltransferase. The ORF1ab polyprotein consists of 7,101 AA residues. The sliding epitope window approach illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref> was applied to analyze the binding affinity of all possible linear epitopes.</p>
<p>Specifically, a set of 10-AA-length subsequences (considered in this analysis as putative epitopes) was generated (number of subsequences &#x0003D; length of the protein sequence - 10) and FASTA-formatted for input. The number of subsequences was hence 1,273 &#x2013; 10 &#x0003D; 1,263 for the S protein and 7,101 &#x2013; 10 &#x0003D; 7,091 for the ORF1ab polyprotein (&#x201C;polyprotein&#x201D;). The FASTA-formatted subsequences were then queried in the Immune Epitope Database (IEDB) (<ext-link ext-link-type="uri" xlink:href="http://www.iedb.org/">http://www.iedb.org/</ext-link>) &#x0005B;<xref ref-type="bibr" rid="B7">7</xref>&#x0005D; in order to determine their binding affinity to a specific HLA class I molecule. Binding affinity predictions were obtained using the NetMHCpan eluted ligands (EL) method &#x0005B;<xref ref-type="bibr" rid="B8">8</xref>&#x0005D;. For each 10-mer, a binding affinity score was predicted and reported as a percentile rank by comparing the peptide&#x2019;s score against the scores of 5 million random 10-mers (not limited to any one species or other taxonomic rank) selected from the SwissProt database &#x0005B;<xref ref-type="bibr" rid="B7">7</xref>&#x0005D;. For each gene and protein, all alleles and epitopes were entered as a single query and, thus, the same set of 5 million random 10-mers was employed to rank all queried alleles. Smaller percentile ranks indicate higher binding affinity. Next, the lowest (minimum) percentile rank (LPR) for each allele and 10-mer of the S protein and ORF1ab polyprotein was retrieved and retained (N &#x0003D; 1,263 epitopes &#x00D7; 61 alleles &#x0003D; 77,043 values for glycoprotein; N &#x0003D; 7,091 epitopes &#x00D7; 61 alleles &#x0003D; 432,551 for ORF1ab polyprotein). Finally, for various analyses (see below) we employed a conservative threshold of LPR &#x0003D; 1 and performed analyses on the percentage of cases with LPR &#x003C; 1 (&#x201C;good&#x201D; binding affinities).</p>
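<p>The percentile-rank idea can be illustrated with a short Python sketch. It assumes that a higher raw prediction score means stronger predicted binding and that the rank is the percentage of background peptides scoring strictly higher; the function names, the handling of ties and the toy background scores are assumptions made for illustration, and the actual NetMHCpan EL computation served by the IEDB may differ in detail.</p>
<code language="python" xml:space="preserve"><![CDATA[
```python
from bisect import bisect_right
from typing import Sequence


def percentile_rank(score: float, background_sorted: Sequence[float]) -> float:
    """Percentage of background scores strictly greater than `score`.

    `background_sorted` must be sorted in ascending order. A smaller rank
    means the peptide outscores more of the background, i.e. stronger
    predicted binding under the assumptions stated in the lead-in.
    """
    n = len(background_sorted)
    n_higher = n - bisect_right(background_sorted, score)
    return 100.0 * n_higher / n


def is_good_binder(rank: float, threshold: float = 1.0) -> bool:
    """Apply the LPR < 1 criterion described in the text."""
    return rank < threshold


if __name__ == "__main__":
    # Made-up background scores standing in for the random 10-mers.
    background = list(range(1000))
    for score in (994.5, 500, -1):
        rank = percentile_rank(score, background)
        print(score, rank, is_good_binder(rank))
```
]]></code>
<p>Because the same background set is used for every queried allele, ranks computed this way are comparable across alleles, which is what makes a single threshold such as LPR &#x003C; 1 meaningful.</p>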
</sec>
<sec><title>Data analysis</title>
<p>Initially, we analyzed the data to assess the effect of protein, gene, and protein &#x00D7; gene interaction on the percentage of cases with LPR &#x003C; 1, where this percentage was calculated for each allele across all epitopes of each protein. Subsequently, for each epitope, we calculated the percentage of alleles for which LPR &#x003C; 1 in order to evaluate the distribution of allele affinity across the sequence of the two proteins. Standard statistical methods were employed in these analyses, including analysis of variance (ANOVA) and linear regression, using the IBM-SPSS statistical package (version 27).</p>
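<p>The two summaries described above (per allele across epitopes, and per epitope across alleles) can be sketched as follows. This is an illustrative Python sketch, not the SPSS analysis itself; the function names, the dictionary layout and the LPR values are made up for the example. The functions return fractions, which can be multiplied by 100 to express percentages.</p>
<code language="python" xml:space="preserve"><![CDATA[
```python
from typing import Dict, List

LPR_THRESHOLD = 1.0  # "good" binding is LPR < 1, as defined in the text


def fraction_good_per_allele(lpr: Dict[str, List[float]]) -> Dict[str, float]:
    """For each allele, the fraction of epitopes with LPR below the threshold."""
    return {
        allele: sum(value < LPR_THRESHOLD for value in values) / len(values)
        for allele, values in lpr.items()
    }


def fraction_good_per_epitope(lpr: Dict[str, List[float]]) -> List[float]:
    """For each epitope position, the fraction of alleles with LPR below the threshold."""
    rows = list(lpr.values())
    n_alleles = len(rows)
    return [
        sum(value < LPR_THRESHOLD for value in column) / n_alleles
        for column in zip(*rows)
    ]


if __name__ == "__main__":
    # Made-up LPR values: 3 hypothetical alleles x 4 epitopes.
    toy_lpr = {
        "allele_1": [0.4, 2.0, 15.0, 0.9],
        "allele_2": [3.0, 0.2, 40.0, 5.0],
        "allele_3": [0.8, 7.0, 60.0, 1.0],
    }
    print(fraction_good_per_allele(toy_lpr))
    print(fraction_good_per_epitope(toy_lpr))
```
]]></code>
<p>Note that the comparison is strict, so a value of exactly 1 does not count as good binding in this sketch.</p>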
</sec>
</sec>
<sec id="s3"><title>Results</title>
<sec><title>Overall estimated affinities</title>
<p>For the S protein, there were 981/77,043 (1.273&#x00025;) cases with LPR &#x003C; 1, and for the ORF1ab polyprotein 5,767/432,551 (1.333&#x00025;). These proportions did not differ significantly (Wald H0 test of two proportions, two-sided <italic>P</italic> &#x0003D; 0.180). All values for which LPR &#x003C; 1 for each allele (with associated epitope sequences) are given in <xref ref-type="supplementary-material" rid="SUP1">Supplemental Table A</xref> for the S protein and <xref ref-type="supplementary-material" rid="SUP1">Supplemental Table B</xref> for the ORF1ab polyprotein.</p>
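<p>The comparison of the two proportions can be checked with a short Python sketch. It assumes that the &#x201C;Wald H0&#x201D; test denotes the two-proportion <italic>z</italic>-test whose standard error uses the pooled proportion under the null hypothesis; the function name is hypothetical, and the sketch is not the SPSS procedure itself.</p>
<code language="python" xml:space="preserve"><![CDATA[
```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Return (z, two-sided P) using the pooled proportion for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value


if __name__ == "__main__":
    # Counts reported in the text: S protein first, then ORF1ab polyprotein.
    z, p_value = two_proportion_z_test(981, 77_043, 5_767, 432_551)
    print(f"z = {z:.3f}, two-sided P = {p_value:.3f}")
```
]]></code>
<p>With the counts reported above, this sketch gives <italic>z</italic> of about 1.34 and a two-sided <italic>P</italic> of about 0.18, in line with the reported value.</p>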
</sec>
<sec><title>Affinities of individual alleles</title>
<p>The percentage of LPR &#x003C; 1 (across all epitopes of a protein) for each allele and protein, and their average are shown in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap id="T2" position="float"><label>Table 2.</label><caption><p>Percentages of LPR &#x003C; 1 (across all epitopes of a protein) for all alleles studied, ranked from highest to lowest</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Allele</th>
<th align="left" valign="top">Spike</th>
<th align="left" valign="top">ORF1ab</th>
<th align="left" valign="top">Average</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;29:02</italic></td>
<td align="left" valign="top">0.0493</td>
<td align="left" valign="top">0.0497</td>
<td align="left" valign="top">0.0495</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;01:01</italic></td>
<td align="left" valign="top">0.0441</td>
<td align="left" valign="top">0.0442</td>
<td align="left" valign="top">0.0441</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;26:01</italic></td>
<td align="left" valign="top">0.0415</td>
<td align="left" valign="top">0.0418</td>
<td align="left" valign="top">0.0417</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;25:01</italic></td>
<td align="left" valign="top">0.0428</td>
<td align="left" valign="top">0.0398</td>
<td align="left" valign="top">0.0413</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;24:02</italic></td>
<td align="left" valign="top">0.0402</td>
<td align="left" valign="top">0.0363</td>
<td align="left" valign="top">0.0383</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;23:01</italic></td>
<td align="left" valign="top">0.0415</td>
<td align="left" valign="top">0.0347</td>
<td align="left" valign="top">0.0381</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;15:01</italic></td>
<td align="left" valign="top">0.0350</td>
<td align="left" valign="top">0.0412</td>
<td align="left" valign="top">0.0381</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;11:01</italic></td>
<td align="left" valign="top">0.0337</td>
<td align="left" valign="top">0.0375</td>
<td align="left" valign="top">0.0356</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;03:01</italic></td>
<td align="left" valign="top">0.0286</td>
<td align="left" valign="top">0.0386</td>
<td align="left" valign="top">0.0336</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;69:01</italic></td>
<td align="left" valign="top">0.0299</td>
<td align="left" valign="top">0.0356</td>
<td align="left" valign="top">0.0327</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;46:01</italic></td>
<td align="left" valign="top">0.0312</td>
<td align="left" valign="top">0.0340</td>
<td align="left" valign="top">0.0326</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;53:01</italic></td>
<td align="left" valign="top">0.0324</td>
<td align="left" valign="top">0.0280</td>
<td align="left" valign="top">0.0302</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:02</italic></td>
<td align="left" valign="top">0.0247</td>
<td align="left" valign="top">0.0331</td>
<td align="left" valign="top">0.0289</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;35:01</italic></td>
<td align="left" valign="top">0.0286</td>
<td align="left" valign="top">0.0268</td>
<td align="left" valign="top">0.0277</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;57:01</italic></td>
<td align="left" valign="top">0.0221</td>
<td align="left" valign="top">0.0331</td>
<td align="left" valign="top">0.0276</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;58:01</italic></td>
<td align="left" valign="top">0.0208</td>
<td align="left" valign="top">0.0319</td>
<td align="left" valign="top">0.0263</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;68:01</italic></td>
<td align="left" valign="top">0.0221</td>
<td align="left" valign="top">0.0291</td>
<td align="left" valign="top">0.0256</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;14:02</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0277</td>
<td align="left" valign="top">0.0256</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;44:03</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0270</td>
<td align="left" valign="top">0.0252</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;44:02</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0266</td>
<td align="left" valign="top">0.0250</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;32:01</italic></td>
<td align="left" valign="top">0.0208</td>
<td align="left" valign="top">0.0275</td>
<td align="left" valign="top">0.0241</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:05</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0238</td>
<td align="left" valign="top">0.0236</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;07:02</italic></td>
<td align="left" valign="top">0.0247</td>
<td align="left" valign="top">0.0217</td>
<td align="left" valign="top">0.0232</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:01</italic></td>
<td align="left" valign="top">0.0195</td>
<td align="left" valign="top">0.0266</td>
<td align="left" valign="top">0.0230</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;31:01</italic></td>
<td align="left" valign="top">0.0260</td>
<td align="left" valign="top">0.0197</td>
<td align="left" valign="top">0.0228</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;35:03</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0222</td>
<td align="left" valign="top">0.0228</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;33:01</italic></td>
<td align="left" valign="top">0.0221</td>
<td align="left" valign="top">0.0213</td>
<td align="left" valign="top">0.0217</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;49:01</italic></td>
<td align="left" valign="top">0.0221</td>
<td align="left" valign="top">0.0206</td>
<td align="left" valign="top">0.0213</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;15:03</italic></td>
<td align="left" valign="top">0.0195</td>
<td align="left" valign="top">0.0231</td>
<td align="left" valign="top">0.0213</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A&#x0002A;02:06</italic></td>
<td align="left" valign="top">0.0208</td>
<td align="left" valign="top">0.0217</td>
<td align="left" valign="top">0.0212</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;40:01</italic></td>
<td align="left" valign="top">0.0169</td>
<td align="left" valign="top">0.0254</td>
<td align="left" valign="top">0.0212</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;45:01</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0185</td>
<td align="left" valign="top">0.0209</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;27:05</italic></td>
<td align="left" valign="top">0.0234</td>
<td align="left" valign="top">0.0183</td>
<td align="left" valign="top">0.0208</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;40:02</italic></td>
<td align="left" valign="top">0.0182</td>
<td align="left" valign="top">0.0208</td>
<td align="left" valign="top">0.0195</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;51:01</italic></td>
<td align="left" valign="top">0.0195</td>
<td align="left" valign="top">0.0192</td>
<td align="left" valign="top">0.0193</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;38:01</italic></td>
<td align="left" valign="top">0.0182</td>
<td align="left" valign="top">0.0199</td>
<td align="left" valign="top">0.0190</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;02:02</italic></td>
<td align="left" valign="top">0.0182</td>
<td align="left" valign="top">0.0183</td>
<td align="left" valign="top">0.0182</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;03:02</italic></td>
<td align="left" valign="top">0.0195</td>
<td align="left" valign="top">0.0160</td>
<td align="left" valign="top">0.0177</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;13:02</italic></td>
<td align="left" valign="top">0.0182</td>
<td align="left" valign="top">0.0160</td>
<td align="left" valign="top">0.0171</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;07:02</italic></td>
<td align="left" valign="top">0.0195</td>
<td align="left" valign="top">0.0146</td>
<td align="left" valign="top">0.0170</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;01:02</italic></td>
<td align="left" valign="top">0.0182</td>
<td align="left" valign="top">0.0106</td>
<td align="left" valign="top">0.0144</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;18:01</italic></td>
<td align="left" valign="top">0.0130</td>
<td align="left" valign="top">0.0153</td>
<td align="left" valign="top">0.0141</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;12:02</italic></td>
<td align="left" valign="top">0.0169</td>
<td align="left" valign="top">0.0113</td>
<td align="left" valign="top">0.0141</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;17:01</italic></td>
<td align="left" valign="top">0.0143</td>
<td align="left" valign="top">0.0125</td>
<td align="left" valign="top">0.0134</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;17:03</italic></td>
<td align="left" valign="top">0.0143</td>
<td align="left" valign="top">0.0125</td>
<td align="left" valign="top">0.0134</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;39:01</italic></td>
<td align="left" valign="top">0.0091</td>
<td align="left" valign="top">0.0169</td>
<td align="left" valign="top">0.0130</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;05:01</italic></td>
<td align="left" valign="top">0.0078</td>
<td align="left" valign="top">0.0169</td>
<td align="left" valign="top">0.0123</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;08:02</italic></td>
<td align="left" valign="top">0.0091</td>
<td align="left" valign="top">0.0139</td>
<td align="left" valign="top">0.0115</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;07:01</italic></td>
<td align="left" valign="top">0.0130</td>
<td align="left" valign="top">0.0088</td>
<td align="left" valign="top">0.0109</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;04:01</italic></td>
<td align="left" valign="top">0.0104</td>
<td align="left" valign="top">0.0113</td>
<td align="left" valign="top">0.0109</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;04:43</italic></td>
<td align="left" valign="top">0.0104</td>
<td align="left" valign="top">0.0113</td>
<td align="left" valign="top">0.0109</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;15:02</italic></td>
<td align="left" valign="top">0.0104</td>
<td align="left" valign="top">0.0095</td>
<td align="left" valign="top">0.0099</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;03:04</italic></td>
<td align="left" valign="top">0.0078</td>
<td align="left" valign="top">0.0120</td>
<td align="left" valign="top">0.0099</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;08:01</italic></td>
<td align="left" valign="top">0.0091</td>
<td align="left" valign="top">0.0104</td>
<td align="left" valign="top">0.0097</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;37:01</italic></td>
<td align="left" valign="top">0.0091</td>
<td align="left" valign="top">0.0097</td>
<td align="left" valign="top">0.0094</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;16:01</italic></td>
<td align="left" valign="top">0.0104</td>
<td align="left" valign="top">0.0081</td>
<td align="left" valign="top">0.0092</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;12:03</italic></td>
<td align="left" valign="top">0.0104</td>
<td align="left" valign="top">0.0065</td>
<td align="left" valign="top">0.0084</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;07:04</italic></td>
<td align="left" valign="top">0.0091</td>
<td align="left" valign="top">0.0069</td>
<td align="left" valign="top">0.0080</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;06:02</italic></td>
<td align="left" valign="top">0.0078</td>
<td align="left" valign="top">0.0053</td>
<td align="left" valign="top">0.0066</td>
</tr>
<tr>
<td align="left" valign="top"><italic>C&#x0002A;08:01</italic></td>
<td align="left" valign="top">0.0052</td>
<td align="left" valign="top">0.0062</td>
<td align="left" valign="top">0.0057</td>
</tr>
<tr>
<td align="left" valign="top"><italic>B&#x0002A;14:02</italic></td>
<td align="left" valign="top">0.0026</td>
<td align="left" valign="top">0.0058</td>
<td align="left" valign="top">0.0042</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec><title>Effect of protein and gene</title>
<p>For each allele, the percentage of LPR &#x003C; 1 across all protein epitopes was calculated and the effect of protein, gene and their interaction was evaluated using an ANOVA where protein and gene were fixed factors. The effect of protein was not statistically significant &#x0005B;<xref ref-type="fig" rid="F2">Figure 2</xref>; <italic>F</italic><sub>(1,116)</sub> &#x0003D; 0.476, <italic>P</italic> &#x0003D; 0.476&#x0005D; but the effect of gene was highly statistically significant &#x0005B;<xref ref-type="fig" rid="F3">Figure 3</xref>; <italic>F</italic><sub>(2,116)</sub> &#x0003D; 61.599, <italic>P</italic> &#x0003D; 5.90 &#x00D7; 10<sup>&#x2212;19</sup>&#x0005D;. Specifically, the percentage of LPR &#x003C; 1 was significantly higher in gene <italic>A</italic> than in gene <italic>B</italic> (<italic>P</italic> &#x0003D; 3.84 &#x00D7; 10<sup>&#x2212;9</sup>, ANOVA) and gene <italic>C</italic> (<italic>P</italic> &#x0003D; 6.07 &#x00D7; 10<sup>&#x2212;20</sup>), and higher in gene <italic>B</italic> than in gene <italic>C</italic> (<italic>P</italic> &#x0003D; 3.33 &#x00D7; 10<sup>&#x2212;7</sup>). Finally, the protein &#x00D7; gene interaction term was not statistically significant &#x0005B;<xref ref-type="fig" rid="F4">Figure 4</xref>; <italic>F</italic><sub>(2,116)</sub> &#x0003D; 0.397, <italic>P</italic> &#x0003D; 0.673&#x0005D;.</p>
<fig id="F2" position="float"><label>Figure 2.</label><caption><p>Mean percentages &#x0005B;&#x00B1; 95&#x00025; confidence interval (CI)&#x0005D; of alleles with LPR &#x003C; 1 for the two proteins. See text for details of statistical comparisons</p></caption><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="10033-g002.tif"/></fig>
<fig id="F3" position="float"><label>Figure 3.</label><caption><p>Mean percentages (&#x00B1; 95&#x00025; CI) of alleles with LPR &#x003C; 1 for the 3 genes. See text for details of statistical comparisons</p></caption><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="10033-g003.tif"/></fig>
<fig id="F4" position="float"><label>Figure 4.</label><caption><p>Mean percentages (&#x00B1; 95&#x00025; CI) of alleles with LPR &#x003C; 1 for the 3 genes and 2 proteins. See text for details of statistical comparisons</p></caption><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="10033-g004.tif"/></fig>
</sec>
<sec><title>Distribution of affinities across protein sequences</title>
<p>For each epitope tested, the percentage of alleles with LPR &#x003C; 1 was calculated and plotted along the sequence of the S protein (<xref ref-type="fig" rid="F5">Figure 5</xref>) and ORF1ab polyprotein (<xref ref-type="fig" rid="F6">Figure 6</xref>). It was observed that high affinity scores exist throughout the entire protein sequence for both the S protein and ORF1ab.</p>
<fig id="F5" position="float"><label>Figure 5.</label><caption><p>Distribution of alleles with LPR &#x003C; 1 for an epitope along the S protein sequence</p></caption><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="10033-g005.tif"/></fig>
<fig id="F6" position="float"><label>Figure 6.</label><caption><p>Distribution of alleles with LPR &#x003C; 1 for an epitope along the ORF1ab polyprotein sequence</p></caption><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="10033-g006.tif"/></fig>
</sec>
<sec><title>Application to individuals</title>
<p>Each individual carries 6 HLA class I alleles, 2 from each of the 3 classical genes (<italic>A</italic>, <italic>B</italic>, <italic>C</italic>). The average &#x201C;goodness&#x201D; of HLA affinities (LPR &#x003C; 1 across epitopes) for each allele is given in <xref ref-type="table" rid="T2">Table 2</xref>, from which the average LPR &#x003C; 1 percent across the 6 alleles carried by an individual can be calculated. (Since this study was focused on 61 common alleles, the LPR &#x003C; 1 value for other alleles will need to be calculated.)</p>
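<p>The per-individual calculation described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the function name <monospace>individual_score</monospace> and the example genotype are hypothetical, and the lookup holds the last-column values of <xref ref-type="table" rid="T2">Table 2</xref> for six alleles only.</p>
<preformat>
```python
# Hypothetical helper: average percent LPR below 1 over the alleles one person carries.
# The lookup holds the last-column (two-protein average) values of Table 2
# for six alleles only; a real lookup would cover every allele of interest.
PERCENT_LPR = {
    "A*02:01": 0.0230,
    "A*31:01": 0.0228,
    "B*07:02": 0.0232,
    "B*40:01": 0.0212,
    "C*07:02": 0.0170,
    "C*04:01": 0.0109,
}


def individual_score(genotype, table=PERCENT_LPR):
    """Return the mean table value over the six alleles in `genotype`."""
    if len(genotype) != 6:
        raise ValueError("expected 6 alleles (2 each for genes A, B and C)")
    missing = [a for a in genotype if a not in table]
    if missing:
        # Alleles outside the lookup need their own value computed first.
        raise KeyError(f"no value available for: {missing}")
    return sum(table[a] for a in genotype) / len(genotype)


# Illustrative genotype (an assumption, not a subject from the study).
genotype = ["A*02:01", "A*31:01", "B*07:02", "B*40:01", "C*07:02", "C*04:01"]
print(round(individual_score(genotype), 6))
```
</preformat>
<p>A homozygous individual would simply list the same allele twice, so that it is counted for both copies.</p>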
</sec>
<sec><title>Application to populations</title>
<p>The overall &#x201C;goodness&#x201D; of HLA class I affinities for a population comprising a set of specific alleles can be calculated as the mean, across alleles, of the product of the percent LPR &#x003C; 1 value (<xref ref-type="table" rid="T2">Table 2</xref>) and the corresponding allele frequency (<xref ref-type="table" rid="T1">Table 1</xref>). (Since this study was focused on 61 common alleles, the LPR &#x003C; 1 value for other alleles will need to be calculated for specific populations.) For a particular gene, the formula is:
<disp-formula id="FD1"><label>(1)</label><mml:math id="m1" display='block'><mml:mrow><mml:msup><mml:mi>Q</mml:mi><mml:mi>P</mml:mi></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msubsup><mml:mi>f</mml:mi><mml:mi>i</mml:mi><mml:mi>P</mml:mi></mml:msubsup></mml:mrow></mml:mstyle></mml:mrow></mml:math></disp-formula>
where <italic>Q<sup>P</sup></italic> is the overall affinity for population <italic>P</italic> consisting of <italic>N</italic> alleles <italic>i</italic> of that gene, with percent LPR &#x003C; 1 <italic>a<sub>i</sub></italic> and frequency in the population <inline-formula><mml:math id="m2" display='inline'><mml:mrow><mml:msubsup><mml:mi>f</mml:mi><mml:mi>i</mml:mi><mml:mi>P</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>. For the global set of alleles analyzed in this study, and using the allele frequencies from <xref ref-type="table" rid="T1">Table 1</xref> and the average (of the two proteins) percent LPR &#x003C; 1 (last column of <xref ref-type="table" rid="T2">Table 2</xref>), we obtained the following <italic>Q<sup>P</sup></italic> for the 3 genes. For genes <italic>A</italic>, <italic>B</italic> and <italic>C</italic>, <italic>Q<sup>P</sup></italic> &#x0003D; 0.001571, 0.000678, and 0.000608, respectively.</p>
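<p>As a minimal sketch of Equation (1), the following Python function computes <italic>Q<sup>P</sup></italic> for one gene from two mappings keyed by allele. The function name <monospace>q_population</monospace> is hypothetical, and the frequencies in the example are placeholders chosen for illustration, not the values of <xref ref-type="table" rid="T1">Table 1</xref>; only the percent LPR &#x003C; 1 values are taken from <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<preformat>
```python
def q_population(percent_lpr, frequency):
    """Equation (1): (1/N) * sum of a_i * f_i over the N alleles of one gene."""
    alleles = sorted(percent_lpr)
    if sorted(frequency) != alleles:
        raise ValueError("both mappings must cover the same alleles")
    n = len(alleles)
    if n == 0:
        raise ValueError("need at least one allele")
    return sum(percent_lpr[a] * frequency[a] for a in alleles) / n


# a_i: last-column values of Table 2 for three gene C alleles.
a = {"C*07:02": 0.0170, "C*04:01": 0.0109, "C*06:02": 0.0066}
# f_i: placeholder frequencies for illustration only, NOT the Table 1 values.
f = {"C*07:02": 0.10, "C*04:01": 0.12, "C*06:02": 0.08}
print(q_population(a, f))
```
</preformat>
<p>Substituting the allele frequencies of a specific population for the placeholder values gives the population-specific measure for that gene.</p>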
</sec>
</sec>
<sec id="s4"><title>Discussion</title>
<p>In the current study, we investigated the relations between two important proteins of the SARS-CoV-2 proteome (the S protein, which is the main antigenic molecule of the virus &#x0005B;<xref ref-type="bibr" rid="B9">9</xref>&#x0005D;, and the ORF1ab polyprotein), and 61 common HLA class I alleles. There were three major aims in this <italic>in silico</italic> study, as follows. First, we sought to quantify the binding affinity of suitable fragments (linear epitopes) of the S protein and ORF1ab polyprotein to the 61 HLA class I alleles in an exhaustive manner using the sliding epitope window approach &#x0005B;<xref ref-type="bibr" rid="B3">3</xref>, <xref ref-type="bibr" rid="B4">4</xref>&#x0005D;; second, we aimed to assess differences in allele binding affinities between the two proteins, among the three HLA class I genes, and their interaction; and third, we sought to assess the distribution of binding affinities across the sequences of the two proteins. The results obtained were clear-cut. First, all 61 alleles exhibited good binding affinities to epitopes of both the spike and ORF1ab proteins; second, the percentage of good binding affinities (LPR &#x003C; 1) did not differ significantly between the two proteins (<xref ref-type="fig" rid="F2">Figure 2</xref>) but did so among the genes, with gene <italic>A</italic> having the best binding performance, followed by gene <italic>B</italic> and gene <italic>C</italic> (<italic>A</italic> &#x003E; <italic>B</italic> &#x003E; <italic>C</italic>) (<xref ref-type="fig" rid="F3">Figure 3</xref>); third, there was no significant interaction between proteins and genes (ANOVA), indicating a similar effect of the genes in both proteins (<xref ref-type="fig" rid="F4">Figure 4</xref>); and fourth, good binding affinities were distributed throughout the two protein sequences (<xref ref-type="fig" rid="F5">Figures 5</xref> and <xref ref-type="fig" rid="F6">6</xref>). 
These findings indicate that the HLA class I system is well suited to contribute effectively to the elimination of SARS-CoV-2 at the early phase of its entry to the body in otherwise immunocompetent individuals. The fact that good binding affinities are distributed throughout both protein sequences provides for robustness in the fight against the virus, since it makes it highly likely that linear epitopes of the proteins can bind to class I molecules irrespective of where the glycoprotein is cleaved by cell proteases.</p>
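<p>The sliding epitope window approach mentioned above amounts to enumerating every contiguous fragment of a fixed length, shifting the window by one residue at a time, so that a protein of <italic>L</italic> residues yields <italic>L</italic> &#x2212; 9 overlapping 10-mers. A minimal sketch in Python follows; the function name <monospace>sliding_epitopes</monospace> is hypothetical and the input is a short toy string used only for illustration, not a full protein sequence from this study.</p>
<preformat>
```python
def sliding_epitopes(sequence, length=10):
    """Return every contiguous window of `length` residues, shifted one residue at a time."""
    if 1 > length:
        raise ValueError("length must be a positive integer")
    count = len(sequence) - length + 1  # number of windows; not positive if too short
    return [sequence[i:i + length] for i in range(max(count, 0))]


# Toy 14-residue string (illustration only).
toy = "MFVFLVLLPLVSSQ"
for epitope in sliding_epitopes(toy):
    print(epitope)
```
</preformat>
<p>Each fragment produced this way would then be submitted to the binding affinity predictor for every allele of interest.</p>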
<p>With respect to the ORF1ab polyprotein, it should be mentioned that CD8<sup>&#x0002B;</sup> T cells recognize peptides presented by class I HLA surface receptors, and such peptides most commonly originate from cytosolic viral proteins following proteasomal degradation by host intracellular antigen processing pathways. Since T cells do not recognize surface antigens, in contrast to B cells and the antibodies that are produced and secreted by them, ORF1ab provides intracellular peptides that originate from essential enzymes and are thus more conserved. Proteins encoded by SARS-CoV-2 ORF1ab are more conserved among coronaviruses relative to the S protein, which has lower homology &#x0005B;<xref ref-type="bibr" rid="B10">10</xref>, <xref ref-type="bibr" rid="B11">11</xref>&#x0005D;. ORF1ab was included in this study because it is the largest viral gene in the <italic>Coronaviridae</italic> family and encodes both non-structural and accessory proteins that are less susceptible to evolutionary pressure and hence mutational changes, in contrast to the S protein.</p>
<p>Nguyen and colleagues &#x0005B;<xref ref-type="bibr" rid="B12">12</xref>&#x0005D; have published a similar <italic>in silico</italic> analysis of viral peptide binding affinities to HLA<italic>-A</italic>, <italic>-B</italic> and <italic>-C</italic> loci. In their study, the entire SARS-CoV-2 proteome was investigated, whereas our study focused on the ORF1ab polyprotein and S protein, which together comprise a considerably smaller number of potential <italic>n</italic>-mer epitopes. Furthermore, Nguyen and colleagues used sequence homology to identify sequences in SARS-CoV-2 that are highly conserved in other common coronaviruses and could function as epitopes. Here, we did not specifically investigate sequence homology of the S protein or ORF1ab polyprotein with other coronaviruses as this has been documented in previous studies &#x0005B;<xref ref-type="bibr" rid="B10">10</xref>, <xref ref-type="bibr" rid="B11">11</xref>&#x0005D;. Interestingly, Nguyen and colleagues &#x0005B;<xref ref-type="bibr" rid="B12">12</xref>&#x0005D; report that the HLA<italic>-A</italic> and HLA<italic>-C</italic> alleles exhibit the highest and lowest capacities to present SARS-CoV-2 antigens, respectively. This ranking is corroborated in our study (<xref ref-type="fig" rid="F3">Figure 3</xref>), which focused on multiple proteomic elements (the S protein and the ensemble of intracellular proteins encoded in ORF1ab) as opposed to the entire viral proteome. With respect to specific alleles with the highest and lowest binding affinities to SARS-CoV-2 antigens, our results differ from those of Nguyen et al. &#x0005B;<xref ref-type="bibr" rid="B12">12</xref>&#x0005D;, who report that <italic>B&#x0002A;15:03</italic> is a better binder than <italic>B&#x0002A;46:01</italic>. 
This difference could be due to the smaller number of <italic>n</italic>-mers that we used (8,354 10-mer peptides that linearly partitioned the ORF1ab polyprotein and S protein <italic>vs.</italic> 32,257 peptide samples from the full SARS-CoV-2 proteome used by Nguyen and colleagues &#x0005B;<xref ref-type="bibr" rid="B12">12</xref>&#x0005D;).</p>
<p>Nelde and colleagues &#x0005B;<xref ref-type="bibr" rid="B13">13</xref>&#x0005D; recently published a study in which they identified HLA class I and HLA-DR binding peptides using experimental and computational methods. Their prediction workflow incorporates the NetMHCpan algorithm which was used in our study and reports immunogenic SARS-CoV-2-derived class I T cell epitopes. Among these are two 9-mers, LTDEMIAQY (HLA<italic>-A&#x0002A;01</italic>) and QYIKWPWYI (HLA<italic>-A&#x0002A;24</italic>), both of which have very high binding affinity scores in our analysis as well. The authors &#x0005B;<xref ref-type="bibr" rid="B13">13</xref>&#x0005D; used ELISA-type assays with <italic>in vitro</italic> amplified T cells from patients convalescing from SARS-CoV-2 infection and donors who were never exposed to the virus. Nelde et al. &#x0005B;<xref ref-type="bibr" rid="B13">13</xref>&#x0005D; reported that 29&#x00025; of SARS-CoV-2-derived HLA class I binding peptides were validated as naturally occurring T cell epitopes, showing that NetMHCpan is useful as a predictive algorithm for class I immunogenicity.</p>
<p>To the best of our knowledge, there have been only a limited number of published studies (by Nguyen et al. &#x0005B;<xref ref-type="bibr" rid="B12">12</xref>&#x0005D; and Nelde et al. &#x0005B;<xref ref-type="bibr" rid="B13">13</xref>&#x0005D;) that investigated viral protein binding affinity across a wide range of HLA alleles on a per-allele basis using predictive algorithms. Our present study builds on previous work &#x0005B;<xref ref-type="bibr" rid="B3">3</xref>, <xref ref-type="bibr" rid="B4">4</xref>&#x0005D; and examines the relationship between two viral proteins of the SARS-CoV-2 proteome and HLA class I antigen presentation, with the results discussed above. Finally, we devised a metric (<italic>Q<sup>P</sup></italic>) to calculate the overall binding affinity of a gene in a specific population by taking into account both the goodness of binding affinities of alleles of a gene and their frequency of occurrence in the population. Using the global allele frequencies of HLA class I alleles (<xref ref-type="table" rid="T1">Table 1</xref>), we found that the HLA class I <italic>A</italic> gene had the best overall binding performance, followed by genes <italic>B</italic> and <italic>C</italic>. It is possible that this result could differ for different populations (e.g., ethnicities, locations on Earth, etc.), thus providing a population-specific measure of how well the HLA class I makeup of that population could contribute to SARS-CoV-2 elimination at the early stage of fighting the viral infection.</p>
<p>We acknowledge, however, an important limitation to our work. This study was performed exclusively <italic>in silico</italic> and thus the data presented are subject to any constraints that may affect HLA binding affinity predictive tools (NetMHCpan EL in this case). Although important to consider, this limitation is mitigated by the fact that NetMHCpan and other binding affinity prediction algorithms included in the IEDB suite are trained using experimental datasets. In the absence of clinical and patient data regarding actual COVID-19 cases, we are unable to assess the impact of disease-modifying risk factors such as age and clinical comorbidities &#x0005B;<xref ref-type="bibr" rid="B14">14</xref>, <xref ref-type="bibr" rid="B15">15</xref>&#x0005D; on HLA class I-conferred neutralization of the SARS-CoV-2 virus. In spite of this limitation, we believe that computational studies centered on binding affinity prediction can offer important insights as to how HLA genotype affects viral susceptibility, which can hopefully guide vaccination strategies.</p>
</sec>
</body>
<back>
<glossary><title>Abbreviations</title>
<def-list>
<def-item><term>AA:</term><def><p> amino acid</p></def></def-item>
<def-item><term>ANOVA:</term><def><p> analysis of variance</p></def></def-item>
<def-item><term>CI:</term><def><p> confidence interval</p></def></def-item>
<def-item><term>COVID-19:</term><def><p> coronavirus disease 2019</p></def></def-item>
<def-item><term>HLA:</term><def><p> human leukocyte antigen</p></def></def-item>
<def-item><term>LPR:</term><def><p> lowest (minimum) percentile rank</p></def></def-item>
<def-item><term>ORF1ab:</term><def><p> open reading frame 1ab</p></def></def-item>
<def-item><term>S protein:</term><def><p> spike glycoprotein</p></def></def-item>
<def-item><term>SARS-CoV-2:</term><def><p> severe acute respiratory syndrome coronavirus 2</p></def></def-item>
</def-list>
</glossary>
<sec id="s5"><title>Supplementary materials</title>
<p>The supplementary materials for this article are available at: <ext-link ext-link-type="uri" xlink:href="https://www.explorationpub.com/uploads/Article/file/10033_sup_1.pdf">https://www.explorationpub.com/uploads/Article/file/10033_sup_1.pdf</ext-link>.</p>
</sec>
<sec id="s6"><title>Declarations</title>
<sec><title>Author contributions</title>
<p>SAC and APG contributed to data analysis; SAC, EPT and APG contributed to writing the manuscript.</p>
</sec>
<sec><title>Conflicts of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec><title>Ethical approval</title>
<p>Not applicable.</p>
</sec>
<sec><title>Consent to participate</title>
<p>Not applicable.</p>
</sec>
<sec><title>Consent to publication</title>
<p>Not applicable.</p>
</sec>
<sec><title>Availability of data and materials</title>
<p>Not applicable.</p>
</sec>
<sec><title>Funding</title>
<p>Partial funding for this study was provided by the University of Minnesota (the American Legion Brain Sciences Chair) and the U.S. Department of Veterans Affairs. The sponsors had no role in the current study design, analysis or interpretation, or in the writing of this paper. The contents do not represent the views of the U.S. Department of Veterans Affairs or the United States Government.</p>
</sec>
<sec><title>Copyright</title>
<p>&#x00A9; The Author(s) 2021.</p>
</sec>
</sec>
<ref-list><title>References</title>
<ref id="B1"><label>1.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sanchez-Mazas</surname><given-names>A</given-names></name><name><surname>Lema&#x00EE;tre</surname><given-names>JF</given-names></name><name><surname>Currat</surname><given-names>M.</given-names></name></person-group> <article-title>Distinct evolutionary strategies of human leucocyte antigen loci in pathogen-rich environments</article-title>. <source>Philos Trans R Soc Lond B Biol Sci.</source> <year>2012</year>;<volume>367</volume>:<fpage>830</fpage>&#x02013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1098/rstb.2011.0312</pub-id></mixed-citation></ref>
<ref id="B2"><label>2.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Salamon</surname><given-names>H</given-names></name><name><surname>Klitz</surname><given-names>W</given-names></name><name><surname>Easteal</surname><given-names>S</given-names></name><name><surname>Gao</surname><given-names>X</given-names></name><name><surname>Erlich</surname><given-names>HA</given-names></name><name><surname>Fernandez-Vi&#x00F1;a</surname><given-names>M</given-names></name><etal/></person-group> <article-title>Evolution of HLA class II molecules: allelic and amino acid site variability across populations</article-title>. <source>Genetics.</source> <year>1999</year>;<volume>152</volume>:<fpage>393</fpage>&#x02013;<lpage>400</lpage>. <pub-id pub-id-type="pmid">10224269</pub-id> <pub-id pub-id-type="pmcid">PMC1460587</pub-id></mixed-citation></ref>
<ref id="B3"><label>3.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Charonis</surname><given-names>S</given-names></name><name><surname>Tsilibary</surname><given-names>EP</given-names></name><name><surname>Georgopoulos</surname><given-names>AP.</given-names></name></person-group> <article-title>SARS-CoV-2 virus and Human Leukocyte Antigen (HLA) Class II: investigation of binding affinities for COVID-19 protection and vaccine development</article-title>. <source>J Immunological Sci.</source> <year>2020</year>;<volume>4</volume>:<fpage>12</fpage>&#x02013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.29245/2578-3009/2020/4.1198</pub-id></mixed-citation></ref>
<ref id="B4"><label>4.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Charonis</surname><given-names>S</given-names></name><name><surname>James</surname><given-names>LM</given-names></name><name><surname>Georgopoulos</surname><given-names>AP.</given-names></name></person-group> <article-title><italic>In silico</italic> assessment of binding affinities of three dementia-protective Human Leukocyte Antigen (HLA) alleles to nine human herpes virus antigens</article-title>. <source>Curr Res Transl Med.</source> <year>2020</year>;<volume>68</volume>:<fpage>211</fpage>&#x02013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1016/j.retram.2020.06.002</pub-id> <pub-id pub-id-type="pmid">32624427</pub-id></mixed-citation></ref>
<ref id="B5"><label>5.</label><mixed-citation publication-type="web"><person-group person-group-type="author"><collab>The allele frequency database &#x0005B;Internet&#x0005D;.</collab></person-group> <article-title>Liverpool: Royal Liverpool University Hospital</article-title>; c2019 &#x0005B;updated 2019 Jul 10; cited 2020 Jun 16&#x0005D;. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.allelefrequencies.net/">http://www.allelefrequencies.net/</ext-link></mixed-citation></ref>
<ref id="B6"><label>6.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><collab>The UniProt Consortium.</collab></person-group> <article-title>UniProt: a worldwide hub of protein knowledge</article-title>. <source>Nucleic Acids Res.</source> <year>2019</year>;<volume>47</volume>:<fpage>D506</fpage>&#x02013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gky1049</pub-id> <pub-id pub-id-type="pmid">30395287</pub-id> <pub-id pub-id-type="pmcid">PMC6323992</pub-id></mixed-citation></ref>
<ref id="B7"><label>7.</label><mixed-citation publication-type="web"><person-group person-group-type="author"><collab>Immune epitope database and analysis &#x0005B;Internet&#x0005D;.</collab></person-group> <article-title>Bethesda: the National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID)</article-title>; c2020 &#x0005B;cited 2020 Jun 16&#x0005D;. Available from: <ext-link ext-link-type="uri" xlink:href="http://www.iedb.org">http://www.iedb.org</ext-link></mixed-citation></ref>
<ref id="B8"><label>8.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reynisson</surname><given-names>B</given-names></name><name><surname>Alvarez</surname><given-names>B</given-names></name><name><surname>Paul</surname><given-names>S</given-names></name><name><surname>Peters</surname><given-names>B</given-names></name><name><surname>Nielsen</surname><given-names>M.</given-names></name></person-group> <article-title>NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data</article-title>. <source>Nucleic Acids Res.</source> <year>2020</year>;<volume>48</volume>:<fpage>W449</fpage>&#x02013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkaa379</pub-id> <pub-id pub-id-type="pmid">32406916</pub-id> <pub-id pub-id-type="pmcid">PMC7319546</pub-id></mixed-citation></ref>
<ref id="B9"><label>9.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Duan</surname><given-names>L</given-names></name><name><surname>Zheng</surname><given-names>Q</given-names></name><name><surname>Zhang</surname><given-names>H</given-names></name><name><surname>Niu</surname><given-names>Y</given-names></name><name><surname>Lou</surname><given-names>Y</given-names></name><name><surname>Wang</surname><given-names>H.</given-names></name></person-group> <article-title>The SARS-CoV-2 spike glycoprotein biosynthesis, structure, function, and antigenicity: implications for the design of spike-based vaccine immunogens</article-title>. <source>Front Immunol.</source> <year>2020</year>;<volume>11</volume>:<fpage>576622</fpage>. <pub-id pub-id-type="doi">10.3389/fimmu.2020.576622</pub-id> <pub-id pub-id-type="pmid">33117378</pub-id> <pub-id pub-id-type="pmcid">PMC7575906</pub-id></mixed-citation></ref>
<ref id="B10"><label>10.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grifoni</surname><given-names>A</given-names></name><name><surname>Sidney</surname><given-names>J</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Scheuermann</surname><given-names>RH</given-names></name><name><surname>Peters</surname><given-names>B</given-names></name><name><surname>Sette</surname><given-names>A.</given-names></name></person-group> <article-title>A sequence homology and bioinformatic approach can predict candidate targets for immune responses to SARS-CoV-2</article-title>. <source>Cell Host Microbe.</source> <year>2020</year>;<volume>27</volume>:<fpage>671</fpage>&#x02013;<lpage>80.e2</lpage>. <pub-id pub-id-type="doi">10.1016/j.chom.2020.03.002</pub-id> <pub-id pub-id-type="pmid">32183941</pub-id> <pub-id pub-id-type="pmcid">PMC7142693</pub-id></mixed-citation></ref>
<ref id="B11"><label>11.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ahmed</surname><given-names>SF</given-names></name><name><surname>Quadeer</surname><given-names>AA</given-names></name><name><surname>McKay</surname><given-names>MR.</given-names></name></person-group> <article-title>Preliminary identification of potential vaccine targets for the COVID-19 coronavirus (SARS-CoV-2) based on SARS-CoV immunological studies</article-title>. <source>Viruses.</source> <year>2020</year>;<volume>12</volume>:<fpage>254</fpage>. <pub-id pub-id-type="doi">10.3390/v12030254</pub-id></mixed-citation></ref>
<ref id="B12"><label>12.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nguyen</surname><given-names>A</given-names></name><name><surname>David</surname><given-names>JK</given-names></name><name><surname>Maden</surname><given-names>SK</given-names></name><name><surname>Wood</surname><given-names>MA</given-names></name><name><surname>Weeder</surname><given-names>BR</given-names></name><name><surname>Nellore</surname><given-names>A</given-names></name><etal/></person-group> <article-title>Human leukocyte antigen susceptibility map for severe acute respiratory syndrome coronavirus 2</article-title>. <source>J Virol.</source> <year>2020</year>;<volume>94</volume>:<fpage>e00510</fpage>&#x02013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1128/JVI.00510-20</pub-id> <pub-id pub-id-type="pmid">32303592</pub-id> <pub-id pub-id-type="pmcid">PMC7307149</pub-id></mixed-citation></ref>
<ref id="B13"><label>13.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nelde</surname><given-names>A</given-names></name><name><surname>Bilich</surname><given-names>T</given-names></name><name><surname>Heitmann</surname><given-names>JS</given-names></name><name><surname>Maringer</surname><given-names>Y</given-names></name><name><surname>Salih</surname><given-names>HR</given-names></name><name><surname>Roerden</surname><given-names>M</given-names></name><etal/></person-group> <article-title>SARS-CoV-2-derived peptides define heterologous and COVID-19-induced T cell recognition</article-title>. <source>Nat Immunol.</source> <year>2021</year>;<volume>22</volume>:<fpage>74</fpage>&#x02013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1038/s41590-020-00808-x</pub-id> <pub-id pub-id-type="pmid">32999467</pub-id></mixed-citation></ref>
<ref id="B14"><label>14.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jain</surname><given-names>V</given-names></name><name><surname>Yuan</surname><given-names>JM.</given-names></name></person-group> <article-title>Predictive symptoms and comorbidities for severe COVID-19 and intensive care unit admission: a systematic review and meta-analysis</article-title>. <source>Int J Public Health.</source> <year>2020</year>;<volume>65</volume>:<fpage>533</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1007/s00038-020-01390-7</pub-id> <pub-id pub-id-type="pmid">32451563</pub-id> <pub-id pub-id-type="pmcid">PMC7246302</pub-id></mixed-citation></ref>
<ref id="B15"><label>15.</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Guan</surname><given-names>WJ</given-names></name><name><surname>Liang</surname><given-names>WH</given-names></name><name><surname>Zhao</surname><given-names>Y</given-names></name><name><surname>Liang</surname><given-names>HR</given-names></name><name><surname>Chen</surname><given-names>ZS</given-names></name><name><surname>Li</surname><given-names>YM</given-names></name><etal/><collab>China Medical Treatment Expert Group for COVID-19.</collab></person-group> <article-title>Comorbidity and its impact on 1,590 patients with COVID-19 in China: a nationwide analysis</article-title>. <source>Eur Respir J.</source> <year>2020</year>;<volume>55</volume>:<fpage>2000547</fpage>. <pub-id pub-id-type="doi">10.1183/13993003.00547-2020</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>