Intrinsically Disordered Proteins (IDPs): Experimental and Computational Approaches in Drug Discovery

Intrinsically disordered proteins (IDPs) are proteins that usually do not adopt well-defined native structures when isolated in solution under physiological conditions. Numerous IDPs are closely associated with human diseases such as Parkinson's disease, Alzheimer's disease, and diabetes. IDPs have recently gained importance as drug targets for these major diseases. This review introduces experimental and computational approaches to IDPs in drug discovery.


Introduction
Soluble proteins may consist of globular and non-globular units. Globular units are composed of regular secondary structure elements, whereas non-globular units are composed of disordered, unstructured, and flexible regions without regular secondary structure elements [1]. It is now well established that many functionally important protein segments occur outside globular units [2,3]. Intrinsically disordered proteins (IDPs), also known as intrinsically unstructured, unfolded, natively disordered, or rheomorphic proteins, lack a well-defined 3D structure and exhibit a multitude of conformations that change dynamically over time and across the population [4]. Many proteins cannot be over-expressed, purified, or crystallized. Native conformational disorder of proteins is one of the main obstacles facing structural biology analyses. Moreover, in structural genomics initiatives, it is becoming increasingly important to identify IDPs during the target selection process [5,6]. No commonly agreed definition of IDPs exists. "IDPs can be either completely unfolded or comprise both folded and unfolded segments" [7]. Completely unfolded proteins carry out their function by means of regions that lack a specific 3D structure and exist as an ensemble of flexible molecules, whereas in partially unfolded proteins only localized regions lack organized structure.

Experimental Determination of Intrinsically Disordered Proteins
IDPs are observed indirectly with a variety of experimental methods, such as X-ray crystallography, NMR, Raman spectroscopy, CD spectroscopy, and hydrodynamic measurements.

X-Ray Crystallography:
Disorder leads to missing electron density in protein structures determined by X-ray crystallography. Two types of disorder have been recognized: static and dynamic [8]. Disorder is static when individual molecules are rigid but adopt different conformations from one another. It is dynamic when molecules are flexible and fluctuate between various conformations. If a region exists as an ensemble of φ and ψ angles, whether static or dynamic, it is intrinsically disordered. The major disadvantage of X-ray crystallography is that additional experiments are required to establish whether missing electron density reflects a wobbly domain, intrinsic disorder, or technical difficulties [9].
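As an illustration of how missing electron density shows up in deposited structures, the following minimal Python sketch lists residues flagged as unobserved in a legacy PDB-format file. It assumes the file follows the wwPDB convention of reporting such residues in REMARK 465 records; the function names and the sample text are hypothetical and are not taken from any of the cited works. A hit only indicates that a residue was not located in the experiment, so, as noted above, it does not by itself distinguish intrinsic disorder from a wobbly domain or technical difficulties.

```python
def missing_residues(pdb_text):
    """Return (chain, residue_number, residue_name) tuples listed in REMARK 465.

    Assumption: data rows look like 'REMARK 465     MET A     1', optionally
    preceded by a model number. Rows with insertion codes are skipped here.
    """
    missing = []
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        # Drop a leading model number if present (multi-model entries).
        if len(fields) == 4 and fields[0].isdigit():
            fields = fields[1:]
        # Explanatory header rows do not have exactly three fields.
        if len(fields) != 3:
            continue
        name, chain, number = fields
        try:
            missing.append((chain, int(number), name))
        except ValueError:
            continue  # e.g. the column-title row or an insertion code
    return missing


def contiguous_segments(missing):
    """Merge consecutive residue numbers into (chain, start, end) segments."""
    segments = []
    for chain, number, _ in sorted(missing):
        if segments and segments[-1][0] == chain and segments[-1][2] == number - 1:
            segments[-1][2] = number
        else:
            segments.append([chain, number, number])
    return [tuple(segment) for segment in segments]


# Hypothetical excerpt of a PDB-format header, for illustration only.
SAMPLE = """\
REMARK 465 MISSING RESIDUES
REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE
REMARK 465 EXPERIMENT.
REMARK 465   M RES C SSSEQI
REMARK 465     MET A     1
REMARK 465     GLY A     2
REMARK 465     SER A    57
ATOM      1  N   ALA A   3      11.104   6.134  -6.504  1.00  0.00           N
"""

if __name__ == "__main__":
    for chain, start, end in contiguous_segments(missing_residues(SAMPLE)):
        print(f"chain {chain}: residues {start}-{end} not observed")
```

Long contiguous segments are the usual candidates for follow-up by an orthogonal method such as NMR or protease digestion.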

Nuclear Magnetic Resonance Spectroscopy (NMR):
Protein three-dimensional structures can be determined in solution by NMR. Under favourable circumstances, NMR provides motional information on a residue-by-residue basis by means of a variety of different isotopic labelling and pulse sequence experiments [10].

Circular Dichroism (CD) Spectroscopy:
Circular dichroism also provides structural information for proteins in solution. Combined use of near- and far-UV CD gives semiquantitative information. However, CD does not provide clear information for proteins that contain both ordered and disordered regions [11].

Small-Angle Scattering of X-rays (SAXS):
SAXS is a powerful method for characterizing both ordered and disordered proteins in solution. It allows quantitative characterization of the conformational polydispersity of completely or partially disordered macromolecules, including multi-domain proteins with flexible linkers and IDPs [12].

Protease Digestion:
Protease digestion is a well-recognized method that gives insight into protein structure and flexibility, and it is particularly useful in combination with other methods. Combined with X-ray diffraction, it helps to determine whether a region of missing electron density is due to a wobbly domain or to intrinsic disorder. It is also useful when coupled with CD spectra, which lack position-specific information. Finally, the combination of proteolysis and mass spectrometry for fragment identification can indicate the presence of intrinsically disordered regions [13].

Stokes Radius Determination:
Random coil disorder has also been detected by methods that yield the Stokes radius, such as small-angle X-ray scattering or size exclusion chromatography [14].
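To make the reasoning behind a Stokes radius measurement concrete, the sketch below compares a hydrodynamic radius with the values expected for a compact and a fully unfolded chain of the same length. It rests on two assumptions that are not part of this review: the Stokes-Einstein relation, R_h = k_B T / (6 π η D), and the empirical scaling relations commonly attributed to Wilkins et al. (1999), R_h ≈ 4.75 N^0.29 Å for folded proteins and R_h ≈ 2.21 N^0.57 Å for denatured proteins. The function names and example values are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_radius_angstrom(diffusion_m2_s, temperature_k=293.15, viscosity_pa_s=1.002e-3):
    """Stokes-Einstein radius in angstroms from a translational diffusion coefficient.

    Defaults assume water at 20 degrees C (viscosity about 1.002 mPa*s).
    """
    radius_m = K_B * temperature_k / (6 * math.pi * viscosity_pa_s * diffusion_m2_s)
    return radius_m * 1e10


def expected_radii_angstrom(n_residues):
    """Empirical (folded, unfolded) hydrodynamic radii for a chain of n_residues.

    Assumption: scaling relations as reported by Wilkins et al. (1999).
    """
    folded = 4.75 * n_residues ** 0.29
    unfolded = 2.21 * n_residues ** 0.57
    return folded, unfolded


def compaction_factor(measured_angstrom, n_residues):
    """1.0 for a radius matching the folded estimate, 0.0 for the unfolded one."""
    folded, unfolded = expected_radii_angstrom(n_residues)
    return (unfolded - measured_angstrom) / (unfolded - folded)


if __name__ == "__main__":
    n = 140  # hypothetical chain length
    measured = stokes_radius_angstrom(1.0e-10)  # hypothetical D in m^2/s
    folded, unfolded = expected_radii_angstrom(n)
    print(f"measured {measured:.1f} A, folded {folded:.1f} A, unfolded {unfolded:.1f} A")
    print(f"compaction factor {compaction_factor(measured, n):.2f}")
```

A measured radius close to the unfolded estimate is consistent with random coil behaviour, but the scaling relations are empirical averages and should be treated as a rough guide only.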

Biological Importance of Intrinsically Disordered Proteins
Natively unstructured proteins participate in many biological processes and commonly occur in cell signalling pathways, DNA transcription and replication, and protein translation [15]. IDPs are important for understanding protein function as well as protein folding pathways [16]. More than 180 such proteins are known, including Tau, prions, Bcl-2, p53, 4E-BP1, eIF1A, and HMG proteins [17]. They are thought to become ordered when bound to another molecule (e.g. the CREB-CBP complex), although little is understood about the cellular and structural meaning of disorder. Because of their abnormal aggregation patterns, they are involved in major diseases such as Parkinson's and Alzheimer's disease [18].

Intrinsically Disordered Proteins as Drug Targets
Recent studies suggest that IDPs are involved in various diseases and can therefore serve as viable drug targets. Strategies for targeting IDPs include the development of small molecules that (i) bind directly to the disordered ensemble, (ii) bind to a binding partner and either inhibit IDP binding or stabilize the bound state, or (iii) affect modification pathways [19]. These strategies aim to target conformational transitions and regulatory elements in addition to modulating post-translational modifications (PTMs). Existing drug design strategies for IDPs include the development of small-molecule and peptide inhibitors, arresting order-to-disorder transitions, targeting regulatory elements, and modulating post-translational modifications [20].

A. DisProt:
DisProt is a manually curated database that provides information about proteins that lack a fixed 3D structure in their putatively native states, either entirely or in part. The database contains experimentally characterized IDPs and includes functional information for many of the IDPs and regions [21]. In its first public release in February 2004, DisProt contained 154 proteins (190 disordered regions); by June 2018 the database contained 803 proteins (2167 disordered regions). The database can be accessed at http://www.disprot.org.

B. Database of Disordered Protein Prediction (D2P2):
This database contains all protein sequences from 1765 complete proteomes, on which a battery of disorder predictors and their variants (VL-XT, VSL2b, PrDOS, PV2, ESpritz and IUPred) was run [22]. The database can be accessed at http://d2p2.pro/.

C. MobiDB:
MobiDB (http://mobidb.bio.unipd.it/) is a database of intrinsically disordered and mobile proteins. It covers different flavours of disorder in protein structures for all UniProt sequences. The database features three levels of annotation: manually curated, indirect and predicted. Manually curated data is extracted from the DisProt database. Indirect data is inferred from features of PDB structures that are considered an indication of intrinsic disorder. Predicted data contains disorder annotations produced by various predictors such as ESpritz, IUPred, DisEMBL, GlobPlot, VSL2b and JRONN [23].

[...] of structural information on IDP and denatured protein ensembles based on Nuclear Magnetic Resonance (NMR) and Small-angle X-ray Scattering (SAXS) data [25].

Computational Tools for IDP Prediction
Various computational predictors have been developed for predicting protein conformational disorder. Since there is no unique definition of IDPs, each predictor has its own definition and algorithm, and some exist in several versions of the same basic algorithm. Some predictors are based on datasets of ordered and disordered proteins; others are based on physicochemical trends and observations. All the predictors mentioned here are available as web servers (Table 1).
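As a concrete example of a predictor based on physicochemical trends, the following Python sketch implements a simple charge-hydropathy classification. It is an illustration only and is not one of the web servers referred to above. It assumes the Kyte-Doolittle hydropathy scale rescaled to the range 0 to 1, counts only Lys/Arg and Asp/Glu as charged (histidine is treated as neutral), and uses the boundary line commonly attributed to Uversky and co-workers, <R> = 2.785 <H> - 1.151, where <R> is the mean net charge and <H> the mean hydropathy. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standard_residues(sequence):
    """Keep only the 20 standard residues; non-standard symbols are ignored."""
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return residues


def mean_hydropathy(sequence):
    """Mean Kyte-Doolittle hydropathy, rescaled so that -4.5 -> 0 and 4.5 -> 1."""
    residues = _standard_residues(sequence)
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in residues) / len(residues)


def mean_net_charge(sequence):
    """Absolute net charge per residue (Lys/Arg = +1, Asp/Glu = -1)."""
    residues = _standard_residues(sequence)
    positive = sum(aa in "KR" for aa in residues)
    negative = sum(aa in "DE" for aa in residues)
    return abs(positive - negative) / len(residues)


def charge_hydropathy_call(sequence):
    """Return 'disordered' if the sequence lies above the boundary line, else 'ordered'."""
    boundary = 2.785 * mean_hydropathy(sequence) - 1.151
    return "disordered" if mean_net_charge(sequence) > boundary else "ordered"


if __name__ == "__main__":
    for seq in ("EEEEEEEEEE", "LLLLLLVVVVIIII"):  # toy sequences, not real proteins
        print(seq, charge_hydropathy_call(seq))
```

Such a whole-sequence call gives no per-residue information, which is one reason the predictors differ in their definitions and outputs.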

Computational Drug Design Tools for IDPs
Experimental procedures provide a wealth of useful data, which computational approaches need in order to make simulations as realistic as possible. For IDPs, however, where dynamics is so important, computation has clear advantages over experiment, as it can in principle describe dynamics completely and characterize the full conformational energy landscape of proteins. Docking tools applicable to IDPs include pepATTRACT [26] and Rosetta-FlexPepDock [27] for computational peptide-protein docking, IDP-LZerD [28] for docking of disordered protein interactions, AnchorDock [29] for blind anchor-driven peptide docking, and HADDOCK [30] for docking of protein-protein and protein-peptide complexes. Molecular dynamics packages include CHARMM [31], a molecular simulation and modelling program; Amber, a biomolecular simulation program; and GROMACS [32], a molecular simulation and modelling program for proteins, lipids and nucleic acids.