Rational design of novel phenol ether derivatives as non-covalent proteasome inhibitors through 3D-QSAR, molecular docking and ADMET prediction

Introduction
The ubiquitin-proteasome system (UPS) is a major pathway for the selective degradation of proteins in eukaryotic cells [1]. The UPS is involved in many biological processes, such as cell cycle progression, transcription repair, cell proliferation, differentiation, and apoptosis. Thus, the UPS plays an important role in transcription, protein degradation, and protein stability in eukaryotic cells. It is an intracellular non-lysosomal protein degradation pathway composed of ubiquitin-activating enzyme (E1), ubiquitin-conjugating enzyme (E2), ubiquitin ligase (E3), and the proteasome. Most substrates of this pathway can be recognized and degraded by the 26S proteasome only after they have been labeled with ubiquitin by the enzymes E1-E3, i.e., ubiquitinated [2,3]. The 26S proteasome specifically degrades target proteins into polypeptides of 7-9 amino acid residues in an ATP-dependent manner [4]. The 26S proteasome mainly contains a 20S core particle (20S proteasome) and a 19S regulatory particle. The 20S proteasome consists of two α rings and two β rings, each of which is made up of seven subunits (α1-α7 and β1-β7, respectively). Among them, the β1, β2, and β5 subunits carry the active sites for protein degradation and are responsible for the caspase-like (C-L), trypsin-like (T-L), and chymotrypsin-like (ChT-L) activities of the proteasome, respectively. Each of these three β-subunits has a catalytic site that cleaves the peptide bond using the nucleophilic γ-hydroxyl group of its N-terminal threonine. According to the composition of the β1, β2, and β5 subunits, 20S proteasomes can be divided into immune 20S proteasomes (β1i, β2i, and β5i) and constitutive 20S proteasomes (β1c, β2c, and β5c). Different 20S proteasome inhibitors act on different subunit types and therefore have different pharmacological effects [5,6]. The UPS controls most biological functions within cellular mechanisms and handles 80% to 90% of intracellular protein degradation; therefore, proteasomes have become attractive targets for the treatment of inflammatory, autoimmune, and neoplastic diseases [7].
It has been reported that the level and activity of proteasomes are higher than 90% in malignant tumors compared to normal cells, providing a survival advantage for tumor cells to continue proliferating [8]. Therefore, tumor cells show greater sensitivity to proteasome inhibitors [9]. In 2003, bortezomib (the first-generation drug) became the first proteasome inhibitor approved by the Food and Drug Administration (FDA) for the treatment of multiple myeloma and mantle cell lymphoma, which clinically validated the proteasome as an oncology therapeutic target [10]. It was followed by carfilzomib (the second-generation drug) [11] and ixazomib [12], which were also approved by the FDA. These major 20S proteasome inhibitors reported in the literature are covalent inhibitors, which carry highly reactive and unstable chemical groups [13]. They generally react covalently and irreversibly with the proteolytic sites of the proteasome, resulting in permanent blockage of the proteasome [14]. In addition, the reactive head groups bind nonspecifically to the active center of the proteasome and to many off-target enzymes, and this off-target activity can translate into serious side effects. These properties may be the main cause of the side effects, acquired resistance, and unsatisfactory pharmacokinetic (PK) properties of covalent proteasome inhibitors [15][16][17]. Hence, in recent years scientists have turned their attention to non-covalent proteasome inhibitors with lower toxicity. Although non-covalent proteasome inhibitors have been studied less than covalent inhibitors, they provide a promising alternative mechanism for inhibiting proteases [18]. Their potential advantages are high selectivity, moderate reactivity, and reduced instability, which may reduce some side effects [19]. For these reasons, it is essential to continue the design and study of novel non-covalent proteasome inhibitors.
The purpose of this paper is to use computer-aided drug design to study the structure-activity relationship (SAR) of a series of non-covalent proteasome inhibitors reported by Yu et al. [20], so as to find their common backbones and carry out structural modifications. A large number of novel compounds were enumerated on the core skeleton, various physicochemical properties of the new molecules were predicted and analyzed, and novel drug molecules with anticancer activity were then screened out. In this study, quantitative SAR (QSAR) models were constructed with the SYBYL-X software to relate the structural characteristics of non-covalent proteasome inhibitors to their biological activities. These models provided theoretical guidance for the design of new compounds. Molecular docking was then applied to analyze the binding mode and stability of the inhibitors in the active pocket of the proteasome. Moreover, the absorption, distribution, metabolism, excretion, and toxicology (ADMET) and drug-like properties of the new non-covalent proteasome inhibitors with higher selectivity were also evaluated. As an effective, rapid, and economical tool, computer-aided drug design has been widely used in the development of new drugs. It not only improves the hit rate of drug candidates but also provides new strategies for novel drug design [21]. For example, Xu et al. [22] discovered a new set of non-covalent proteasome inhibitors through a fragment-based approach to drug design (i.e., docking calculations). Li et al. [23] obtained 2,167 compounds by virtually screening the SPECS database using non-covalent docking and a 20S proteasome-based pharmacophore model, and finally selected two hit compounds after molecular dynamics simulations.

Dataset selection and biological activities
A data set of twenty-eight phenol ether derivatives acting as non-covalent proteasome inhibitors was collected from a single literature source to build the QSAR models [20]. These compounds were randomly divided into two groups. The training set of 22 compounds was used to generate the quantitative models. The test set of 6 compounds was used to verify the reliability of the resulting models. The half maximal inhibitory concentration (IC50) of each compound was known and was converted to the pIC50 value [pIC50 = -log10(IC50), with IC50 in mol/L] as the dependent variable for QSAR modeling. The pIC50 values of the 28 phenol ether derivatives covered a wide range from 5.229 to 7.310, which supplied extensive and homogeneous data for QSAR analysis. The backbones of the non-covalent proteasome inhibitors fall into two series, shown in Figure 1. The backbone of the first series (compounds 1-16) is shown in Figure 1A, and the backbone of the second series (compounds 17-28) is shown in Figure 1B. The chemical structures of the non-covalent proteasome inhibitors and their biological activity values are shown in Table 1.
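The activity conversion and the random 22/6 split described above can be sketched in a few lines of Python. This is only an illustration: the function names and the random seed are assumptions, and the split produced here is not the split used in the study.

```python
import math
import random


def pic50_from_nM(ic50_nM: float) -> float:
    """Convert an IC50 given in nmol/L to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nM <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nM * 1e-9)


def random_split(ids, n_test, seed=0):
    """Randomly split compound IDs into (training, test) lists.

    The seed is arbitrary (hypothetical); the paper does not report one.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


# 28 compounds, 6 of them held out as the test set
train_ids, test_ids = random_split(range(1, 29), n_test=6)

print(round(pic50_from_nM(49), 3))    # compound 20, IC50 = 49 nmol/L
print(round(pic50_from_nM(5907), 3))  # compound 15, IC50 = 5,907 nmol/L
print(len(train_ids), len(test_ids))
```

The two IC50 values are the ones quoted in the text for the most and least active compounds, and they reproduce the reported pIC50 range of 5.229 to 7.310.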

Minimization and molecular alignment
The three-dimensional (3D) structures of the 28 phenol ether derivatives were constructed in the Sketch module of the SYBYL-X 2.0 software package (Tripos, St. Louis, USA) [24]. Each molecule was minimized to obtain its lowest-energy conformation. Energy minimization of all compounds was performed using the Tripos force field with the Powell method. The Gasteiger-Hückel method was used to calculate the atomic charges. To obtain stable conformations, the maximum number of iterations was set to 10,000 and the energy convergence gradient was set to 0.005 kcal/(mol·Å) [25][26][27]. All other parameters were kept at the system defaults.
Molecular alignment is a basic step in establishing a reliable 3D-QSAR model, and its quality directly affects the predictive ability of the constructed model [28,29]. To obtain the best QSAR model, we selected the most active compound, compound 20 (IC50 = 49 nmol/L), as the template molecule for molecular alignment [30,31]. The chemical structure of the template molecule is shown in Figure 2A, where the red part represents the common skeleton. The result of the alignment based on the common substructure is shown in Figure 2B.

CoMFA and CoMSIA models
Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) are two widely applied 3D-QSAR tools [26]. These methods use descriptors of 3D structures to analyze the different contributions of the steric (S), electrostatic (E), hydrophobic (H), H-bond donor (HD), and H-bond acceptor (HA) fields [32,33]. These descriptors are directly related to the atomic properties of the compounds and the spatial geometry of the molecules. The potentials are characterized by their position, their extension in space, and their intensity. The CoMFA and CoMSIA models were built using the QSAR option of the SYBYL-X 2.0 software with the default parameters.
To establish the CoMFA model, the non-covalent proteasome inhibitors were placed in a spatial grid. A 3D cubic box with a grid spacing of 2.0 Å in the X, Y, and Z directions was generated with the default values. The spatial lattice links the 3D structure, the bioactivity data, and the molecular potential energy, thus providing valuable information for molecular modifications. The S and E fields were calculated at each grid point with the Tripos force field using an sp3-hybridized carbon atom probe with a van der Waals radius of 1.52 Å and a charge of +1.0. In addition, the default S and E energy cutoffs were both set to 30 kcal/mol to avoid infinite energy values inside a compound [34,35].
CoMSIA used the same molecular alignment as CoMFA; however, the results of CoMSIA were more robust than those of CoMFA. The CoMSIA model is not limited to the S and E fields but also covers the H, HD, and HA fields, which compensates for some shortcomings of the CoMFA model. The CoMSIA model was calculated using an sp 1 hybridized carbon atom with a radius of 1.0 Å and a net charge of +1.0 as the probe atom. The column filter and attenuation factor were set to 2.0 kJ/mol and 0.3, respectively, to reduce noise and speed up the analysis [36]. All other parameters were kept at the system defaults.
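The role of the attenuation factor can be illustrated with a minimal sketch of a Gaussian-type similarity index of the form A = -Σ w_probe · w_i · exp(-α r²), which is the functional form generally associated with CoMSIA. The function name, the atom tuples, and the property values below are hypothetical; this is not the SYBYL-X implementation, only an illustration of why the field stays finite at every grid point without energy cutoffs.

```python
import math


def comsia_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Gaussian similarity index at one grid point.

    atoms: iterable of (x, y, z, w), with coordinates in Å and w the atomic
           property value for the field of interest (hypothetical values).
    alpha: attenuation factor (0.3 in the text).
    """
    gx, gy, gz = grid_point
    total = 0.0
    for x, y, z, w in atoms:
        r2 = (x - gx) ** 2 + (y - gy) ** 2 + (z - gz) ** 2
        # The Gaussian term is bounded even when r = 0, unlike 1/r potentials
        total += probe_value * w * math.exp(-alpha * r2)
    return -total


# One hypothetical atom with property value 1.0, probed at 0 Å and at 2 Å
atom = [(0.0, 0.0, 0.0, 1.0)]
print(comsia_index((0.0, 0.0, 0.0), atom))
print(comsia_index((2.0, 0.0, 0.0), atom))
```

A larger α makes the contribution of each atom decay faster with distance, so the index reflects a more local environment of the grid point.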

Model performance and validation
The partial least-squares (PLS) method is a multiple regression analysis method based on linear transformation [37]. For the 3D-QSAR model, the inhibitor activity was used as the dependent variable and the descriptors of the potential fields as independent variables. In the cross-validation process, the optimum number of components (ONC) and the cross-validation correlation coefficient (Q²) were calculated by the leave-one-out (LOO) method. The non-cross-validation correlation coefficient (r²), the standard error of estimate (SEE), and the F-test value (F) were then obtained in the non-cross-validation analysis. A reliable 3D-QSAR model should have a low SEE and high values of Q², r², and F. These statistical parameters reflect whether the constructed model is reliable and robust [38,39].
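The LOO procedure behind Q² can be sketched as follows, using the common definition Q² = 1 - PRESS/SS. A one-descriptor least-squares line stands in for the PLS model purely to keep the example self-contained; the helper names are assumptions, and some packages compute the reference sum of squares slightly differently.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; a simple stand-in for a PLS model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def loo_q2(xs, ys, fit=fit_line):
    """Leave-one-out Q2 = 1 - PRESS / SS.

    Each compound is left out once, the model is refitted on the rest,
    and the left-out activity is predicted.
    """
    press = 0.0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - model(xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss


# Hypothetical descriptor values and activities, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.3, 5.9, 6.2, 6.8, 7.3]
print(loo_q2(x, y))
```

In a real workflow the `fit` argument would wrap a PLS regression, and the calculation would be repeated for each candidate number of components to find the ONC.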
External validation is also a crucial step because it assesses the predictive accuracy of the 3D-QSAR model on compounds outside the training set. A favorable model should have a high predicted external validation correlation coefficient (r²pred > 0.6) [40]. According to Roy et al. [41] and Mitra et al. [42], four other statistical parameters also need to meet certain rules to ensure the robustness of the model. These parameters are defined as follows.
R² is the squared correlation coefficient between the experimental and predicted bioactivities:
$$R^2 = \frac{\left[\sum \left(Y_{test} - \bar{Y}_{test}\right)\left(Y_{pred(test)} - \bar{Y}_{pred(test)}\right)\right]^2}{\sum \left(Y_{test} - \bar{Y}_{test}\right)^2 \sum \left(Y_{pred(test)} - \bar{Y}_{pred(test)}\right)^2}$$
K is the slope of the regression line through the origin for the experimental versus predicted values:
$$K = \frac{\sum Y_{test}\, Y_{pred(test)}}{\sum Y_{pred(test)}^2}$$
R0² is the squared correlation coefficient of the experimental versus predicted values for the regression through the origin:
$$R_0^2 = 1 - \frac{\sum \left(Y_{test} - K\, Y_{pred(test)}\right)^2}{\sum \left(Y_{test} - \bar{Y}_{test}\right)^2}$$
Rm² is another criterion for the external predictability of the model:
$$R_m^2 = R^2 \left(1 - \sqrt{\left|R^2 - R_0^2\right|}\right)$$
where $Y_{test}$, $\bar{Y}_{test}$, $Y_{pred(test)}$, and $\bar{Y}_{pred(test)}$ represent the experimental, average experimental, predicted, and average predicted bioactivity values in the test set, respectively.
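The four external-validation statistics can be computed for any test set with a short script. The sketch below assumes the commonly used definitions (squared Pearson correlation for R², slope through the origin for K, and Rm² = R²(1 - sqrt|R² - R0²|)); the function name and the example activities are hypothetical, and variants of the R0² definition exist in the literature.

```python
import math


def external_validation(y_exp, y_pred):
    """Return R2, K, R0^2, and Rm^2 for experimental vs. predicted activities."""
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    mean_pred = sum(y_pred) / n
    cov = sum((a - mean_exp) * (b - mean_pred) for a, b in zip(y_exp, y_pred))
    ss_exp = sum((a - mean_exp) ** 2 for a in y_exp)
    ss_pred = sum((b - mean_pred) ** 2 for b in y_pred)

    r2 = cov ** 2 / (ss_exp * ss_pred)  # squared correlation coefficient
    # Slope of the regression line forced through the origin
    k = sum(a * b for a, b in zip(y_exp, y_pred)) / sum(b * b for b in y_pred)
    # Determination coefficient of the through-origin regression
    r02 = 1.0 - sum((a - k * b) ** 2 for a, b in zip(y_exp, y_pred)) / ss_exp
    rm2 = r2 * (1.0 - math.sqrt(abs(r2 - r02)))
    return {"R2": r2, "K": k, "R02": r02, "Rm2": rm2}


# Hypothetical pIC50 values for a six-compound test set
experimental = [5.4, 5.9, 6.3, 6.6, 6.9, 7.2]
predicted = [5.5, 5.8, 6.4, 6.5, 7.0, 7.1]
for name, value in external_validation(experimental, predicted).items():
    print(f"{name} = {value:.3f}")
```

Commonly quoted acceptance thresholds are R² > 0.6, 0.85 ≤ K ≤ 1.15, (R² - R0²)/R² < 0.1, and Rm² > 0.5; these cutoffs are stated here as an assumption and should be checked against the cited references.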

Molecular docking
Molecular docking is a general method for simulating ligand-protein interactions that provides vital information about a ligand at a particular binding site, such as the overall geometric conformation of the ligand and its microscopic interactions within the pocket [43]. In this study, the Autodock software was used for the docking process [44,45]. Since the ligand in the crystal structure 3MG6 [protein data bank (PDB) ID] is selective for the β5 (ChT-L) site of the 20S core particle of the proteasome, our designed molecules are expected to share the same mechanism of selective inhibition of the β5 site [46,47]. Therefore, molecular docking calculations were performed to elucidate the binding modes of the new series of non-covalent proteasome inhibitors with high potency and selectivity for the 20S β5-subunit. The X-ray crystal structure of the proteasome (PDB ID: 3MG6) was downloaded from the PDB (https://www.rcsb.org/). The docking process involved two core operations. The first was the pretreatment of the protein, including the removal of water molecules and the original ligand, the repair of missing loop regions, the addition of missing atoms and polar hydrogens, and the calculation of charges. The compounds were then docked into the active pocket of the protein, which was defined by a 45 × 45 × 45 box centered on the centroid of the crystal ligand with the default grid spacing of 0.375 Å [48]. All other parameters were kept at their default values. Finally, ten different conformations were generated for each compound. The docking results were analyzed using the PyMol software.
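The box definition can be made concrete with a small sketch that computes the ligand centroid and the resulting box limits. It assumes that "45" counts grid intervals per axis, so that each edge is 45 × 0.375 Å = 16.875 Å; if the value were instead an edge length in Å, the box would be considerably larger. The function name and the coordinates are hypothetical.

```python
def grid_box(ligand_coords, n_intervals=45, spacing=0.375):
    """Centroid-centered cubic box around a ligand.

    ligand_coords: list of (x, y, z) atom positions in Å.
    n_intervals:   grid intervals per axis (assumed meaning of "45").
    spacing:       grid spacing in Å.
    Returns (center, edge_length, lower_corner, upper_corner).
    """
    n = len(ligand_coords)
    center = tuple(sum(atom[i] for atom in ligand_coords) / n for i in range(3))
    edge = n_intervals * spacing
    half = edge / 2.0
    lower = tuple(c - half for c in center)
    upper = tuple(c + half for c in center)
    return center, edge, lower, upper


# Hypothetical heavy-atom coordinates of a co-crystallized ligand
coords = [(10.0, 20.0, 30.0), (12.0, 22.0, 31.0), (11.0, 21.0, 32.0)]
center, edge, lower, upper = grid_box(coords)
print(center, edge)
```

Checking that every docked pose lies inside `lower` and `upper` is a quick sanity test that the box actually covers the binding pocket.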

Prediction of ADMET and drug-likeness
PK and toxicology are important components of preclinical drug research. PK describes the absorption, distribution, metabolism, and excretion of drugs, which are the processes a drug undergoes in the human body. Most candidate drugs fail during development because of poor PK properties or excessive toxicity [49,50]. Therefore, a comprehensive ADMET evaluation of candidate compounds helps to improve the success rate of new drug research and development [51]. In this study, the ADMET and drug-like properties of the newly designed phenol ether derivatives were obtained from the pkCSM online server (http://biosig.unimelb.edu.au/pkcsm/prediction) [52] and the SwissADME web tool (http://www.swissadme.ch) [53]. The predicted ADMET data provide theoretical support for further experimental verification.

Statistical results of CoMFA and CoMSIA
To obtain the best statistical model, we combined different fields to build a wide variety of models, calculated their statistical data, and finally selected the best QSAR model. The results of the CoMFA model and of the CoMSIA models built from different combinations of fields are shown in Table S1. The Q² and r² values of a good 3D-QSAR model should be greater than 0.5 and 0.9, respectively. For the CoMFA-SE model (built using both the S and E fields), the PLS analysis gave Q², ONC, r², SEE, and F values of 0.574, 10, 0.999, 0.026, and 1,084.404, respectively, suggesting good internal validation capability. The predicted external validation correlation coefficient r²pred was 0.755. The contributions of the S and E fields were 47.1% and 52.9%, respectively. All these statistical data indicated that this model was reliable and robust. Therefore, CoMFA-SE was selected as the final CoMFA model.
The CoMSIA-HAD model had the highest Q² value (Q² = 0.647) of all the CoMSIA models, but it was not chosen as the final CoMSIA model, because the E field consistently carried a high weight across the CoMSIA series. For example, the contribution of E reached 32.3% in the CoMSIA-SEHAD model, which includes all fields. Therefore, the importance of E could not be ignored in the CoMSIA model. In contrast, the contribution of HD decreased as the number of fields increased; for instance, it contributed the least (8.1%) in the CoMSIA-SEHAD model. The HD field was therefore not considered. Meanwhile, the CoMSIA-SEHA model had the highest Q² value among these five models (CoMSIA-SEHD, CoMSIA-SEHA, CoMSIA-SEDA, CoMSIA-SHDA, CoMSIA-EHAD); hence CoMSIA-SEHA was selected as the CoMSIA model. The CoMSIA-SEHA model had a Q² value of 0.584, an r² value of 0.989, a low SEE value of 0.077, and an F value of 172.183, showing that this model was reliable. The r²pred value was 0.921, suggesting that this model had a strong external predictive capability. These statistical results demonstrated that the CoMSIA-SEHA model was robust. The contributions of S, E, H, and HA were 12.6%, 35.2%, 27.6%, and 24.5%, respectively, showing that E played a vital role in this model. The statistical parameters of the 3D-QSAR models after PLS analysis are shown in Table 2.
To test the predictive ability of the QSAR models, the test set of six compounds was used for external validation. The r²pred values of the CoMFA-SE (hereafter referred to as CoMFA) and CoMSIA-SEHA (hereafter referred to as CoMSIA) models were both greater than 0.6, indicating that the established QSAR models had a decent external predictive ability. The relevant statistical results, together with the four other statistical parameters calculated from the formulas of Roy et al. [41] and Mitra et al. [42], are shown in Table 2. The CoMFA and CoMSIA models had R² values of 0.684 and 0.897, K values of 0.996 and 1.005, (R² - R0²)/R² values of -0.459 and -0.110, and Rm² values of 0.559 and 0.614, respectively. All four statistical parameters met the rules of Roy et al. [41] and Mitra et al. [42], which indicated that the models passed external validation. The above results suggested that the CoMFA and CoMSIA models had good external predictive capacity.
The bioactivity values predicted by the QSAR models for the training set and the test set are shown in Table 1. The scatter plots of experimental versus predicted pIC50 values are shown in Figure 3. The red triangles and blue dots lie close to the straight line (Y = X), which means that the experimental values of all molecules are almost consistent with the predicted values. All these data clearly indicated the excellent stability and high predictive power of the 3D-QSAR models.

Contour map analysis
The contour maps of the CoMFA and CoMSIA models make it possible to visualize how the physical and chemical properties of a compound affect its bioactivity, which guides the design of novel molecules. In this study, we mainly discussed the 3D-QSAR in four potential fields. The favorable and unfavorable contribution levels of all fields were set to the defaults of 80% and 20%, respectively. To better explain the contour maps, the most active inhibitor (compound 20) is labeled as shown in Figure 2A.
In the S contour maps, green blocks indicate regions where introducing bulky groups would improve the bioactivity, while yellow regions indicate the opposite. The S fields for CoMFA and CoMSIA are shown in Figures 4 and 5, respectively. As shown in Figure 4A, a large yellow block was displayed near the R3 area, while a medium-sized green contour existed in the R1 region. In Figure 5A, compound 20 was likewise surrounded by a large yellow contour in the R3 region and a relatively small green block in the R1 region, which was basically consistent with the CoMFA model analysis. It is worth mentioning that a large yellow contour appeared at the position opposite R3 in the CoMSIA model, but not in the CoMFA model. Overall, the S fields gave three key clues for enhancing the activity of non-covalent proteasome inhibitors: a bulky substituent at R1 of compound 20, and small groups at R3 and at the position opposite R3.
In the E field contour maps, shown in Figures 4B and 5B for CoMFA and CoMSIA, respectively, red and blue blocks indicate regions where negative and positive groups, respectively, could strengthen the inhibition of the compound. A medium-sized blue contour and a large blue contour appeared at R1 and R2 in Figure 4B, which suggested that replacing the original groups with positive substituents would increase the bioactivity. In the R3 region, there was a medium-sized red contour, indicating that a negative group here was beneficial to the inhibition of the compound. The CoMSIA E field was similar to that of the CoMFA model, so the details are not discussed.
For the CoMSIA model, the H field contour map is shown in Figure 5C. Yellow blocks indicate regions where introducing hydrophobic groups can enhance the activity of compounds, while magenta blocks indicate regions where introducing hydrophilic groups helps to increase bioactivity. Figure 5C shows a yellow contour embedded in the R1 area, indicating that a hydrophobic group here was advantageous. There was a huge magenta block in the R3 area, showing that a hydrophilic group here was favorable.
In the HA field of the CoMSIA model shown in Figure 5D, magenta blocks indicate regions where introducing HA groups increases the activity of the compound, while red blocks indicate regions where HA groups are unfavorable. There was a medium-sized red block in the R2 area and a large red contour embedded in the R3 area, indicating that HA groups were disfavored in the R2 and R3 regions and that HD groups there were advantageous for improving the bioactivity of the compound.

Design of novel molecules
By analyzing the 3D-QSAR of a series of non-covalent proteasome inhibitors, some key clues about structural modifications to improve bioactivity were obtained: 1) The R1 area was mainly surrounded by the green contour in the S fields of the CoMFA and CoMSIA models, the blue block in the E field, and the yellow block in the H field of the CoMSIA model. These results showed that the addition of bulky, positive, or hydrophobic groups in the R1 region was beneficial. 2) The R2 region was mainly surrounded by a red contour in the HA field of the CoMSIA model, indicating that the introduction of an HD group here was helpful. 3) The R3 area showed color blocks in all four fields, namely the yellow block in the S field, the red contour in the E field, the magenta block in the H field, and the red contour in the HA field, which indicated that the addition of a small, negatively charged, hydrophilic, or HD group in this area was favorable. The main SAR information of the non-covalent proteasome inhibitors is illustrated in Figure 6. Based on the above clues, a series of novel non-covalent proteasome inhibitors was designed and evaluated using the most active compound, compound 20, as the template (Figure 6). Finally, twenty-four new inhibitors (compounds D01-D24) with higher predicted activity than the template were screened out. The chemical structures of these novel inhibitors and their pIC50 values predicted by the constructed QSAR models are shown in Table 3.

Molecular docking analysis
To better explain the docking results, three representative compounds were selected for detailed description. The results of docking are illustrated in Figure 7. Figure 7A, B, and C show how the least active compound 15, the most active compound 20, and the newly designed compound with the highest predicted activity, D24, bind to the receptor protein (PDB ID: 3MG6), respectively. Their molecular docking scores were -6.3 kcal/mol (compound 15), -8.3 kcal/mol (compound 20), and -8.9 kcal/mol (compound D24). The lower (more negative) the docking score, the stronger the binding of the ligand to the receptor [54]. The ranking of the docking scores is therefore consistent with the ranking of the biological activities.
The binding mode of compound 15, the least active molecule in the dataset, is shown in Figure 7A. This inhibitor formed several interactions with the proteasome. The oxygen atom of compound 15 near the R2 region formed a hydrogen bond with alanine 22 (-O … HN, 3.0 Å). The oxygen atom was also bound to asparagine 24 through a hydrogen bond (-O … HN, 3.1 Å). In addition, there was a π-π stacking interaction between tryptophan 25 and the benzene ring of compound 15. Several hydrophobic interactions in the binding site were also found.
The docking result of compound 20 (the most active inhibitor), however, showed stronger interactions in the binding pocket, which could explain why the activity of compound 20 was higher than that of compound 15. As can be seen from Figure 7B, the oxygen atom near the R2 region of compound 20 (in the same position as in compound 15) formed a hydrogen bond with tryptophan 25 (-O … HN, 3.4 Å). Second, the NH group of compound 20 near the R3 region interacted with alanine 27 (-O … HN, 3.8 Å). Third, the oxygen atom was bound to glycine 128, and there were other hydrogen bond interactions with key residues such as serine 112, serine 118, and aspartic acid 114. Compared with compound 15, compound 20 had more hydrogen bond interactions with the receptor, which would be vital for the binding stability of the inhibitor in the active site. Meanwhile, compound 20 showed the same π-π stacking interaction as compound 15. In addition, the hydrophobic interactions between compound 20 and key residues such as tryptophan 25, histidine 98, and aspartic acid 114 further enhanced the bioactivity. These combined interactions indicated that compound 20 (IC50 = 49 nmol/L) was more stable in the active pocket of the receptor and more active than compound 15 (IC50 = 5,907 nmol/L), which is consistent with the experimental result.
To examine whether the newly designed non-covalent proteasome inhibitors had higher biological activity than the template molecule (compound 20), molecular docking was applied to study their binding modes with the receptor. The newly designed compound D24, which had the highest predicted activity, was taken as an example. In Figure 7C, compound D24 formed the same hydrogen bond interaction as compound 20 with the same residue (tryptophan 25), but the hydrogen bond distance differed. For compound D24, the oxygen atom was far away from the surrounding residue, which was unfavorable for ligand-protein interactions. Notably, there was a stronger hydrogen bond between the NH group of compound D24 (at the R2 position) and phenylalanine 113 (-O … HN, 2.8 Å), which made the ligand-protein binding more stable. Additionally, the pyridazine ring of compound D24 formed another stable hydrogen bond with serine 118 (-O … HN, 2.9 Å). Compound D24 also formed further hydrogen bond interactions with arginine 125 and phenylalanine 113. Overall, compound D24 had shorter hydrogen bond distances than the template molecule, so it could bind more tightly to the receptor and have better inhibitory activity. In addition, all three representative compounds formed π-π stacking with the indole ring of tryptophan 25, which implied that it may be a key residue in the active pocket. Hydrophobic interactions were also present for compound D24 and enhanced its binding to the protein. In summary, these results indicated that compound D24 had high stability at the binding site and, consequently, a high inhibitory activity.

Results of ADMET and drug-likeness
ADMET is a complex process in which each step may have a great impact on the effectiveness and safety of a drug, so it is an important index for evaluating the drug properties of compounds during new drug development and clinical use. Therefore, an online predictive evaluation of the newly designed compounds (D01-D24) was performed. The ADMET prediction results for the newly designed non-covalent proteasome inhibitors are shown in Table 4. The results for drug-likeness and synthetic accessibility are shown in Table 3. With the exception of compounds D14 and D21, the predicted intestinal absorption of the newly designed compounds (D01-D24) is greater than 30%, which indicates that they should be well absorbed by the human intestine. In particular, compound D01 has the highest predicted intestinal absorption (92.849%), suggesting that the novel non-covalent proteasome inhibitors have good absorption capacity as oral drugs. The enzymatic metabolism of the newly designed compounds was assessed by whether they are substrates or inhibitors of cytochrome P450 (CYP) enzymes. Among the different enzymes of the CYP family in human cells, CYP2D6 and CYP3A4 are two major isoforms responsible for drug metabolism. As can be seen from Table 4, none of the designed compounds is a substrate or inhibitor of CYP2D6. Compounds D01, D02, D05, D06, and D10 are substrates of CYP3A4, suggesting that they can be metabolized by CYP3A4. The other compounds are not CYP3A4 inhibitors, meaning that they may not affect normal metabolism. In terms of drug clearance, the predicted values indicate that all newly designed compounds could be cleared by combined hepatorenal clearance. Candidate drugs should be non-toxic to humans, and the AMES test is a short-term bacterial test that determines whether a chemical has mutagenic potential. None of the designed compounds is predicted to be AMES-positive or to induce skin sensitization. However, the designed compounds are predicted to be hepatotoxic, which may impair the normal function of the liver. According to the results of the ADMET prediction, we may theoretically assume that the newly designed molecules have good PK properties.
Drug-likeness qualitatively assesses the likelihood of a compound becoming an oral drug from the perspective of bioavailability. As shown in Table 3, all novel compounds fulfill the Lipinski rule, which means that they meet the criteria for oral drugs. In addition, the newly designed compounds D01-D24 have synthetic accessibility scores in the range of 2.61-4.01, suggesting that they are not difficult to synthesize. These results could provide researchers with valuable information on drug synthesis and screening.
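The Lipinski check mentioned above can be expressed as a simple filter. The sketch below uses the classical rule-of-five thresholds (molecular weight ≤ 500, logP ≤ 5, at most 5 H-bond donors, at most 10 H-bond acceptors); web tools may use a different logP estimator or cutoff, and the descriptor values in the example are hypothetical rather than taken from Table 3.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Return the list of violated rule-of-five criteria (classical thresholds)."""
    rules = {
        "molecular weight <= 500": mol_weight <= 500,
        "logP <= 5": logp <= 5,
        "H-bond donors <= 5": h_donors <= 5,
        "H-bond acceptors <= 10": h_acceptors <= 10,
    }
    return [name for name, satisfied in rules.items() if not satisfied]


def passes_lipinski(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """A compound is usually accepted with at most one violation."""
    violations = lipinski_violations(mol_weight, logp, h_donors, h_acceptors)
    return len(violations) <= max_violations


# Hypothetical descriptor values for one designed compound
print(lipinski_violations(452.5, 3.4, 2, 6))
print(passes_lipinski(452.5, 3.4, 2, 6))
```

Returning the list of violated criteria, rather than a bare yes/no, makes it easier to see which property would need to change during structural optimization.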

Discussion
A growing body of research suggests that proteasome inhibitors could be valuable drugs for the treatment of neoplastic diseases [55]. Most of the 20S proteasome inhibitors reported in the literature are peptide-based compounds with C-terminal electrophilic warheads that form covalent adducts with the active site Thr1Oγ. However, this covalent mode of action, together with the high reactivity of the compounds, can lead to off-target interactions [56]. To overcome these shortcomings, researchers have looked for inhibitors with different mechanisms of action, including non-covalent proteasome inhibitors. Non-covalent inhibitors may exhibit unique advantages over covalent inhibitors. For example, non-covalent inhibitors are usually reversible and reduce toxicity because their proteasome inhibition is non-cumulative [57,58]. Off-target effects can be limited by non-covalent inhibition of regulatory enzyme activity, thereby reducing drug side effects [59]. Thus, there is a recent tendency to identify non-covalent inhibitors, mainly including peptides, pseudopeptides, and some organic compounds [60]. Compared with covalent inhibitors, our rationally designed novel phenol ether derivatives selectively inhibit the β5 site of the proteasome as non-covalent inhibitors, and they may provide an alternative mechanism for proteasome inhibition.
In this work, computer-aided drug design methods, including QSAR modeling, molecular docking, ADMET prediction, and drug-likeness prediction, were used to systematically study the theoretical SARs of 28 phenol ether derivatives in order to design novel non-covalent proteasome inhibitors for tumor treatment. The optimal QSAR models were established (CoMFA-SE: Q² = 0.574, r² = 0.999, r²pred = 0.755; CoMSIA-SEHA: Q² = 0.584, r² = 0.989, r²pred = 0.921). The results showed that the models were robust and could be used to predict the bioactivity of the newly designed compounds. By comprehensively analyzing the contour maps of the CoMFA and CoMSIA models, some key structural modifications that can significantly enhance the molecular activity were found: 1) introducing bulky, positive, or hydrophobic groups in the R1 region; 2) introducing an HD group in the R2 area; 3) introducing small, electronegative, hydrophilic, or HD groups in the R3 region. Based on this SAR information, we designed, predicted, evaluated, and finally screened 24 novel non-covalent proteasome inhibitors (D01-D24). The docking results of compound 15 (the least active molecule), compound 20 (the most active molecule), and compound D24 with the proteasome (PDB ID: 3MG6) showed that compound D24 had high stability in the binding site of the protein. In addition, the ADMET and drug-likeness predictions indicated that the newly designed compounds had reasonable PK and drug-likeness properties. In general, all newly designed non-covalent proteasome inhibitors had good predicted bioactivity, binding stability, ADMET, and drug-likeness characteristics.
In our study, the SAR of phenol ether derivatives as non-covalent proteasome inhibitors has been extensively explored. The computational studies show a non-covalent binding mode. The work also provides a new chemical template for non-covalent proteasome inhibitors, offering insights for future structural modification and synthesis of more efficient and selective proteasome inhibitors with improved potency and subunit selectivity. The accuracy of these predictions still needs to be verified by experiments in the future. This study could lead to the discovery of anti-tumor drugs with higher inhibitory effects.

Figure 2. The common substructure and the result of alignment. (A) Compound 20 was used as a template for the alignment. The common substructure was marked in red color; (B) the alignment diagram of the training set

Figure 3. The scatter plots of experimental versus predicted pIC50 for CoMFA (A) and CoMSIA (B) models

Figure 4. The CoMFA contour maps of template molecule 20. (A) The S field; (B) the E field

Figure 5. The CoMSIA contour maps of template compound 20. (A) The S field; (B) the E field; (C) the H field; (D) the HA field

Figure 7. Docking results of three representative compounds in the binding site of the proteasome. (A) Least active compound 15; (B) most active compound 20; (C) new designed compound D24. Yellow solid line: hydrogen bond; green dotted line: π-π stacking; gray dotted line: H interaction

Table 1. The structures and activities of non-covalent proteasome inhibitors

Table 2. Statistical results of optimal CoMFA and CoMSIA models

Table 3. The structures and activities of newly designed non-covalent proteasome inhibitors

Table 4. ADMET properties of novel designed compounds