Understanding the molecular details of drug-target protein interactions is a critical component of the drug discovery process in the modern pharmaceutical industry. We have put in place a comprehensive set of highly integrated biochemical and biophysical assay methods to better characterize the target protein and its interactions with inhibitors. These techniques enable us to identify chemical instability (oxidation, deamidation, etc.), proteolytic or chemical degradation, post-translational modification, and physical instability such as surface denaturation, soluble aggregation, and precipitation. More complete routine characterization of the protein of interest informs the development of better formulations, as we are able to quantitate the effects of formulation components on protein stability.
A critical component of our biophysical arsenal is the ability to investigate a protein’s thermodynamic properties by isothermal titration calorimetry (ITC) and differential scanning calorimetry (DSC). Both techniques are based on well-established fundamental principles. ITC provides thermodynamic data used to confirm the binding model, and quantifies the binding enthalpy (ΔH), entropy (TΔS), free energy (ΔG), and association constant (Ka). DSC measures the heat changes associated with thermal denaturation of the target protein and has been used extensively to study protein folding and unfolding. The degree of stabilization conferred by compound binding is often related to the affinity of the interaction. The thermodynamics associated with unfolding (changes in Tm, ΔCp, and ΔH) reveal information about the properties of compound binding [1,2]. Classically, both measurements required a significant amount of protein and significant operator time, as the instrumentation required manual operation. These factors limited wide application of these measurements. Recently, MicroCal LLC introduced two new instruments, the AutoITC and AutoDSC, with significantly improved sensitivity, greatly reduced sample consumption, robotic automation, and user-friendly software packages that greatly simplify experimental set-up and data analysis. In this paper, we present examples of disciplines in small molecule drug discovery where this new instrumentation has allowed us to improve the efficiency of our processes.
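The ITC quantities named above are linked by two standard relations, ΔG = -RT ln Ka and ΔG = ΔH - TΔS, so a fitted Ka and ΔH are enough to obtain the other two. The following is a minimal sketch of that arithmetic, not the instrument software's procedure; the function name and the example inhibitor values are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def binding_thermodynamics(ka_per_molar, delta_h_kj_per_mol, temp_kelvin=298.15):
    """Derive dG, T*dS and Kd from an ITC-fitted Ka and dH.

    Uses dG = -R*T*ln(Ka) and dG = dH - T*dS, hence T*dS = dH - dG.
    Energies are returned in kJ/mol, Kd in mol/L.
    """
    delta_g = -R * temp_kelvin * math.log(ka_per_molar) / 1000.0
    t_delta_s = delta_h_kj_per_mol - delta_g
    return {"dG": delta_g, "TdS": t_delta_s, "Kd": 1.0 / ka_per_molar}


# Hypothetical inhibitor (illustrative values only): Ka = 1e8 M^-1, dH = -50 kJ/mol
result = binding_thermodynamics(1e8, -50.0)
print({key: round(value, 10) for key, value in result.items()})
```

With these made-up inputs the binding is enthalpy driven: ΔG is about -45.7 kJ/mol and TΔS is slightly unfavorable.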
Recombinant protein construct design and expression for drug discovery at Exelixis
Protein construct design for expression is not yet a predictable process. Thanks to the rapid development of genome science, DNA manipulation is no longer a limiting factor. To ensure successful expression of our target proteins without multiple rounds of optimization, we take a highly parallel approach, preparing 20-30 constructs for a given protein simultaneously. This ensures a high probability of success, but adds the complication of requiring unwieldy numbers of experiments to optimize expression of all constructs. Approaches to prioritizing these constructs are discussed in the next section.
The first step in designing expression constructs is an extensive search for literature precedents, particularly in the Protein Data Bank (www.pdb.org). A deposited structure is a good indicator of expression feasibility, as structural biology usually requires both high quality and large quantities of protein. For novel or poorly studied targets, domain analysis is performed based on information from our in-house expression database and structural modeling (particularly in the areas of boundary analysis and insertion sequences). A number of random combinations of N-terminal and C-terminal boundaries derived from homologous or orthologous proteins are also included in the initial design panel. In our experience, even minor changes at the termini can result in dramatic differences in protein expression yield and/or protein stability.
To facilitate identification and purification of the target protein, cleavable fusion tags are usually incorporated in the construct design. The poly-His tag is our preferred affinity tag. The tag is short and purification by metal chelating chromatography is inexpensive and scalable. For proteins with solubility issues, several approaches may be taken. Large protein tags such as GST and MBP are used with great caution, as frequently the tag helps drag partially folded protein into solution. Co-expression is another alternative when the protein requires a protein ligand to form a stable complex. For certain classes of proteins, it is necessary to include a small molecule ligand during protein production to maintain protein stability.
Finally, hydrophobic surface residues can be mutated based on modeling from orthologous proteins. All of the constructs are engineered so that the affinity tags can be subsequently removed by treatment with highly specific proteases. Target protein expression can be toxic to the host cell, resulting in reduced yield or cell death. For biophysical measurements and structural purposes, activity attenuating mutations can be introduced to reduce the toxic effects on the host cell. These mutations are carefully chosen to be distant from the site of inhibitor binding. In the case of kinases, these mutations can often increase the yield 5-10 fold. E. coli and BEVS expression systems are our workhorses for intracellular protein expression.
Screening and optimizing recombinant protein
After generating multiple constructs for a target protein, prioritizing constructs based on their key biophysical properties is essential. We have developed and implemented three criteria for prioritizing constructs for optimization and scale-up.
The first criterion is soluble expression yield after the first round of expression optimization. Protein expression yield in E. coli can be affected by multiple factors such as medium composition, growth conditions, induction conditions (inducer concentration and temperature), harvest time, etc. In the BEVS system, factors such as multiplicity of infection (MOI), infection time, agitation, sparging rate, and harvest time can significantly impact the expression yield. At the small-scale expression testing stage, the key expression parameters are tested in parallel to ensure comparability of the results. Protein yield ultimately dictates the production costs and purification resources required for scale-up. Therefore, accurate assessment of yield and understanding key expression parameters for the target of interest are crucial.
The next criterion for construct prioritization is protein solubility testing at high concentration (typically >3 mg/ml is required for crystallization optimization). The purified protein is concentrated in a variety of buffers and evaluated by measuring soluble aggregation and precipitation at different concentrations. The homogeneity of the soluble fraction is assessed by both dynamic light scattering and static light scattering. Monodispersity of the protein solution is one of the key determining factors for successful crystallization.
The final criterion is thermal stability (high Tm) of the construct as measured by DSC. While protein yield is not always a good predictor of protein stability, both stability against aggregation in solution (monodispersity) and thermal stability are excellent predictors of successful crystallization.
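The three criteria above can be combined into a simple filter-then-rank step. The sketch below is only an illustration of that idea: the record fields, the construct names, the measurements, and the minimum-yield cut-off are all hypothetical, and only the >3 mg/ml solubility threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Construct:
    name: str
    soluble_yield_mg_per_l: float      # criterion 1: soluble expression yield
    max_soluble_conc_mg_per_ml: float  # criterion 2: solubility at high concentration
    monodisperse: bool                 # criterion 2: light-scattering homogeneity
    tm_celsius: float                  # criterion 3: thermal stability by DSC


def prioritize(constructs, min_yield=1.0, min_conc=3.0):
    """Keep constructs passing yield and solubility checks, then rank by Tm.

    min_yield is an assumed cut-off; min_conc follows the >3 mg/ml guideline.
    """
    passing = [
        c for c in constructs
        if c.soluble_yield_mg_per_l >= min_yield
        and c.max_soluble_conc_mg_per_ml > min_conc
        and c.monodisperse
    ]
    return sorted(passing, key=lambda c: c.tm_celsius, reverse=True)


# Hypothetical panel of four boundary variants (made-up numbers)
panel = [
    Construct("res1-300", 12.0, 8.0, True, 52.1),
    Construct("res15-300", 4.0, 10.0, True, 57.4),
    Construct("res1-320", 20.0, 2.0, True, 49.0),    # fails solubility
    Construct("res15-320", 0.3, 6.0, False, 55.0),   # fails yield and homogeneity
]
ranked = prioritize(panel)
print([c.name for c in ranked])
```

Note that in this made-up panel the highest-yielding construct is dropped, which mirrors the point that yield alone is not a good predictor of stability.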
Affinity tag effects on protein stability
Affinity tags fused to the protein of interest are widely used in recombinant protein expression. In addition to the ease of protein identification by ELISA and western immunoblots, the tag greatly facilitates the purification process. The initial purification no longer relies on the individual protein properties, but rather on the nature of the tag. Affinity tags have transformed the purification process to allow parallel, high throughput, automatable small scale processing of multiple samples. Nevertheless, the effect of the tag on the protein is not always benign.
In many cases, we have observed that even the addition of a small 6xHis tag can reduce protein stability. Sometimes, the effect can be dramatic, shifting the Tm by 10°C or more. DSC can readily be used to measure the stability difference between tagged and tag-free protein (after tag removal). DSC can also detect the thermal transition peak of the expressed protein domain when large tags such as GST or MBP are used. Both GST and MBP have discrete, high Tm values, which are frequently much higher than the Tm of our target protein. The detection of a second Tm from the fused target protein provides a good indication of the presence of independently folded domains. It is important to mention that for the majority of protein domains that failed to be solubly expressed with a poly-His tag (insoluble aggregation, protein found in the pellet after lysis), the use of larger protein tags may appear to promote soluble expression. However, on closer study, the protein of interest is often found not to be functional or correctly folded, and it precipitates immediately upon tag cleavage. Therefore, one should be cautious when using these tags.
Protein integrity assessment
A distinct thermal transition is a good indication of uniformly folded protein. It reflects a population that responds to increased temperature and unfolds by the same pathway. However, one cannot conclude that this protein is folded into a functional form, as opposed to having adopted a conformation that is kinetically favored or is trapped in a local energy minimum. We use DSC and ITC to confirm protein function by measuring its ability to interact with inhibitors, natural substrates, or ligands. The inhibitors used may be generic for a certain class of enzymes, or highly specific compounds made in-house or available commercially. When high affinity substrates or inhibitors are bound to the protein, the protein is stabilized against thermal denaturation and the Tm is increased in DSC analysis. The DSC assay is robust, very consistent, and easy to assemble. Compound solubility generally does not present a significant issue. The quantitative binding energy can also be obtained by ITC if desired. This has the added benefit of determining the stoichiometry of binding, which in turn confirms the purity of the protein preparation. Furthermore, the relative binding affinities of substrate and product provide useful information for development of activity assays, as they reveal the potential for product inhibition of the reaction catalyzed by the enzyme.
Protein formulation optimization
Formulation of proteins to increase solubility and stability is an integral part of our protein chemistry efforts. DSC allows direct study of the impact of buffer component changes on protein stability. With a better understanding of the intrinsic protein folding characteristics, co-solvent effects, and thermal sensitivity, we can better design the large-scale process, speed assay development and crystallization trials, and improve product storage. For example, to optimize solvent effects on protein freeze-thaw stability, protein samples are subjected to multiple freeze-thaw cycles and then assessed by DSC. The integrity of proteins that require extended assay times or elevated temperatures can be confirmed by DSC at various time points of an assay condition simulation. Free energy (derived from ITC analysis and the Gibbs-Helmholtz equation) plotted as a function of temperature can serve as a guideline to define a good working temperature range. While DSC allows us to find good additives, it also allows us to remove redundant co-solvents. This is a critical step in simplifying the initial sample delivered for crystallization trials.
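As an illustration of the free-energy-versus-temperature curve mentioned above, the sketch below evaluates one common integrated form of the Gibbs-Helmholtz equation, ΔG(T) = ΔHm(1 - T/Tm) + ΔCp[(T - Tm) - T ln(T/Tm)]. It assumes two-state unfolding and a temperature-independent ΔCp, which may not hold for every protein; the function name and the parameter values are hypothetical and not taken from our data.

```python
import math


def unfolding_free_energy(temp_k, tm_k, delta_h_m, delta_cp):
    """Free energy of unfolding (kJ/mol) at temp_k.

    tm_k: melting temperature (K); delta_h_m: unfolding enthalpy at Tm (kJ/mol);
    delta_cp: heat capacity change on unfolding (kJ/(mol*K)), assumed constant.
    """
    return (delta_h_m * (1.0 - temp_k / tm_k)
            + delta_cp * ((temp_k - tm_k) - temp_k * math.log(temp_k / tm_k)))


# Hypothetical protein (illustrative values): Tm = 330 K, dHm = 400 kJ/mol, dCp = 8 kJ/(mol*K)
TM, DH, DCP = 330.0, 400.0, 8.0

# Scan 273.15 K to Tm in 0.5 K steps and locate the temperature of maximum stability
temps = [273.15 + 0.5 * i for i in range(114)]
curve = [(t, unfolding_free_energy(t, TM, DH, DCP)) for t in temps]
t_max, dg_max = max(curve, key=lambda point: point[1])
print(f"maximum stability near {t_max:.2f} K, dG = {dg_max:.1f} kJ/mol")
```

A working range could then be read off as the temperatures where ΔG stays above a chosen margin; that margin is a judgment call and is not specified here.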
For one of our projects, literature conditions indicated the need for an extensive array of co-solvents to maintain the stability of the protein and these co-solvents were found to prohibit the formation of crystals in our trials. We used DSC to dissect the requirements for each additive and identified the dominant ones. Removal of the non-essential components resulted in a robust process for crystallization.
DSC applications in protein purification and characterization
One topic that has not yet been given enough attention is that recombinant proteins can bind to host cell ligands (small molecules or macromolecules) tightly enough to remain bound throughout the purification process. Some phenomena are well known, such as proteins with high pI binding to nucleic acids from the host cells. These events can be detected simply by measuring the UV280/UV260 ratio. We have also observed that our target proteins can be tightly complexed with host proteins, which can be easily revealed by various conventional methods: multi-angle light scattering, gel filtration, SDS-PAGE, mass spectrometry, etc. In the case of recombinant proteins tightly but reversibly bound with small molecules, these changes are often not easily detected by conventional methods.
However, the tight binding of small molecules can significantly increase the thermal stability of the protein. DSC is a sensitive instrument for detection of these energetic changes. On one of our projects, a high resolution ion exchange separation indicated that the protein preparation was heterogeneous, even after extensive purification. All three peaks were identical based on SDS-PAGE and electrospray mass spectrometry. DSC analysis of each peak indicated that two of the peaks had a higher Tm. Re-chromatography of these peaks showed slow conversion to the lowest Tm form. The higher Tm peaks turned out to have tightly bound small molecule ligands from the host cells. These small molecules were later identified by mass spectrometry after extraction from the crude protein preparation. With this knowledge, we were able to devise a way to remove these ligands, resulting in increased recovery of a homogeneous protein preparation for crystallization.
Microcalorimetry provides orthogonal information for understanding drug binding
Traditionally, information regarding compound inhibition has been derived from activity based assays. These assays typically provide an IC50, or Ki. The quality of these measurements can be compromised by several experimental artifacts.
Reactive impurities in the compound of interest can reduce the concentration of active protein. This is a frequent event in pharmaceutical small molecule screening [3], although careful analysis with adjusted mathematical models can be applied to restore the quality of the data. Other situations, such as enzymes with very high Km, enzymes with slow turnover rates, or enzymes with stability issues, can also lead to erroneous affinity calculations. When the enzyme concentration needed to obtain an experimental readout is much higher than the Ki for compound inhibition, the equations used to determine Ki are no longer valid.
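The last point can be made concrete with the standard tight-binding (Morrison) quadratic, which accounts for inhibitor depletion by the enzyme. The sketch below is a textbook illustration rather than the analysis used in our assays; it works with an apparent Ki so that substrate competition is left out, and the concentrations are hypothetical.

```python
import math


def fractional_activity(e_total, i_total, ki_app):
    """Remaining activity v/v0 from the Morrison quadratic.

    All three arguments share one concentration unit (e.g. nM).
    """
    s = e_total + i_total + ki_app
    bound = (s - math.sqrt(s * s - 4.0 * e_total * i_total)) / 2.0  # [EI]
    return 1.0 - bound / e_total


def ic50(e_total, ki_app):
    """For this model half-inhibition occurs at Ki_app + [E]/2."""
    return ki_app + e_total / 2.0


# Hypothetical assay: 10 nM enzyme, inhibitor with an apparent Ki of 0.1 nM
E_TOTAL, KI_APP = 10.0, 0.1
observed_ic50 = ic50(E_TOTAL, KI_APP)
print(f"IC50 = {observed_ic50:.1f} nM for a Ki of {KI_APP} nM")
print(f"activity at IC50 = {fractional_activity(E_TOTAL, observed_ic50, KI_APP):.3f}")
```

In this example the measured IC50 is dominated by half the enzyme concentration, so reading it as an affinity would understate the potency by roughly 50-fold.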
Biophysical analysis can provide an alternative route to determine the binding affinity that is not subject to some of these artifacts. Isothermal titration calorimetry can also provide important thermodynamic information regarding the binding mechanism, allowing determination of Ka, ΔH, and ΔS. It also provides ΔCp, a parameter related to the magnitude of protein conformational change. In many cases, slow binding pharmaceutical compounds require at least a two-step binding event: a fast binding step, usually diffusion limited, followed by a slow step coupled with an enzyme conformational change. These components can be observed and dissected by ITC.
ITC is particularly useful where the inhibitor binds only to the inactive form of the enzyme. For example, GLEEVEC® only inhibits the inactive form of the cABL kinase. Attempts to measure the Ki in an assay utilizing active cABL do not yield a true value of the affinity of the small molecule for its target, the inactive form.
DSC can provide a rough estimate of the relative binding affinity of compounds by measuring the shift in Tm caused by compound binding. DSC detects racemic inhibitor mixtures with differential binding affinity as two different melting temperature peaks. It can also be used to profile binding affinities to both inactive and active enzyme forms in SAR analysis, and can discriminate between competitive and non-competitive binding modes in mechanistic analysis when mixtures of protein, ligand, and inhibitor are studied. Kroe et al. noted that there is a good linear relationship between Log Tm and Ki [4]. Our data confirmed this relationship; however, in rare cases where the binding ΔCp is exceptionally large (large conformational changes as a result of compound binding to the protein), the correlation is diminished.
We hope that this review has provided new insights into the utility of calorimetry measurements in enhancing the efficiency of experimental optimization in protein biochemistry and studies of protein-compound interactions. We expect that the list of applications will continue to expand as more laboratories use modern automated instrumentation to incorporate routine calorimetry measurements into their daily workflow.
The authors would like to thank the entire staff at Exelixis Drug Discovery for their many contributions to this work.
1. Plotnikov V, Rochalski A, Brandts M, Brandts JF, Williston S, Frasca V, Lin LN. (2002) An autosampling differential scanning calorimeter instrument for studying molecular interactions. Assay Drug Dev Technol. :83-90.
2. Brandts JF, Lin LN. (1990) Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry. (29):6927-40.
3. Kuzmic P, Hill C, Kirtley MP, Janc JW. (2003) Kinetic determination of tight binding impurities in enzyme inhibitors. Anal Biochem. 319(2):272-9.
4. Kroe RR, Regan J, Proto A, Peet GW, Roy T, Landro LD, Fuschetto NG, Pargellis CA, Ingraham RH. (2003) Thermal denaturation: a method to rank slow binding, high-affinity P38alpha MAP kinase inhibitors. J Med Chem.