Table 1. Drugs that have been withdrawn from international marketplaces between 1995 and 2005 due to associated hepatotoxicity.
1.1. The cost of new drugs and need to streamline drug development
Innovation is fundamental to discovering new drugs for the variety of human conditions that exist. It is also one of the key requirements for any pharmaceutical organization that wishes to gain a competitive edge. The pharmaceutical industry is profit-driven because it has to fund its own drug innovation, which highlights why research and development (R&D) forms the backbone of this industry. According to the CEO of the Pharmaceutical Research and Manufacturers of America (PhRMA), John Castellani, member companies of PhRMA spent a record US$ 67.4 billion on R&D in 2011. This is approximately 20% of generated revenue, which is 5 times more than the average manufacturing firm invests in R&D. The pharmaceutical sector was responsible for 20% of all R&D expenditures by U.S. businesses in 2011. These figures do not describe global R&D expenditures, but they give some indication of the astronomical sums that the pharmaceutical industry devotes annually to drug development.
Substantial fiscal investments are made against the backdrop of enormous investment risks. It is estimated that only 5 of every 10 000 compounds explored will make it to clinical trials. Although the likelihood that an investigational new drug in clinical testing reaches the market has increased over the past couple of decades to 16%, the probability is still low. Furthermore, of those that do get approved, only 2 or 3 out of every 10 drugs recover their full financial investment. The stakes are high and the strain on the industry as a whole is evident. In 2011 the world's largest research-based pharmaceutical company, Pfizer, closed its R&D centre located in the U.K. owing to financial viability concerns. In an attempt to relieve some of the financial pressures, many companies have opted for mergers to either maintain existing pipelines or acquire new development opportunities.
A frequently cited estimate puts the out-of-pocket, pre-approval cost per drug developed at more than US$ 800 million. Estimates reported in peer-reviewed literature range from US$ 391 million to US$ 1.8 billion. It is evident from the literature that the estimates increase over time; in other words, the cost of developing drugs is escalating, which implies ever-increasing financial pressures on industry.
Two of the most prominent concerns for the pharmaceutical industry are patent expirations and attrition rates. Patent expirations result in decreased revenue generation and, as stated, this industry is profit-driven, meaning that diminished earnings cripple the R&D of an organization. Not only does this predict deterioration for a pharmaceutical company, but decreased R&D output also slows the production of new drugs. This also has a major impact on healthcare. It is estimated that in the U.S. a new case of Alzheimer's develops every 68 seconds. Using these figures, more than 460 000 new cases of Alzheimer's will develop for each year that the approval of an effective new drug is delayed. Whereas patent expirations prune generated revenues, attrition rates affect the opposite side of the equation, needlessly raising the cost of developing new drugs. Attrition rates are high (Figure 1). A chemical entity that reaches phase I clinical trials has a 71% chance of reaching phase II clinical trials. Those chemical entities that do reach phase II trials have only a 31% chance of entering phase III trials. Further compounding the issue are rising failure rates in phase III trials. Attrition drives development costs for two reasons: 1) monetary investments into failed ventures are lost and 2) failing development programs occupy resources and time that could otherwise be spent on drug candidates that would eventually be approved for marketing.
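The arithmetic behind two of the figures in the preceding paragraph can be checked with a short sketch. It uses only the numbers quoted in the text (68 seconds per case, 71% and 31% transition rates); the 365-day year and the function names are assumptions made for illustration. Because the 31% figure is conditional on a compound having reached phase II, multiplying the two rates gives the chance that a compound entering phase I will go on to enter phase III.

```python
# Figures quoted in the text above.
SECONDS_PER_CASE = 68        # one new Alzheimer's case every 68 seconds (U.S.)
P_PHASE1_TO_PHASE2 = 0.71    # chance a phase I entity reaches phase II
P_PHASE2_TO_PHASE3 = 0.31    # chance a phase II entity reaches phase III

# Assumption for illustration: a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def new_cases_per_year(seconds_per_case: float) -> float:
    """Number of new cases arising in one year at the given rate."""
    return SECONDS_PER_YEAR / seconds_per_case


def cumulative_probability(*step_probabilities: float) -> float:
    """Chain conditional transition probabilities into one overall chance."""
    result = 1.0
    for p in step_probabilities:
        result *= p
    return result


cases = new_cases_per_year(SECONDS_PER_CASE)
p_reach_phase3 = cumulative_probability(P_PHASE1_TO_PHASE2, P_PHASE2_TO_PHASE3)

print(f"New cases per year of delay: {cases:,.0f}")
print(f"Chance a phase I entity enters phase III: {p_reach_phase3:.1%}")
```

Under these figures, roughly one in five chemical entities that start phase I would be expected to enter phase III, before the phase III failure rate is even considered.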
Together, patent expirations and drug attrition add enormous strain on new drug development, cumulatively inhibiting the productivity and output of the entire R&D process. An article recently published by Forbes offers some perspective on the impact of attrition on development costs. According to this article, AstraZeneca has been plagued by development failures, which escalated its average cost to develop a new drug to US$ 12 billion. In comparison, for Eli Lilly the average cost of developing a new drug is estimated at only US$ 4.5 billion. The difference in development cost between the two companies can be attributed to the difference in approval rates of new drugs, i.e. fewer failures.
The average times, from the start of a particular phase to entering the next phase, are 4.3 years for pre-clinical development and 1.0, 2.2 and 2.8 years for phase I, II and III trials, respectively. Regulatory review adds another 1.5 years to the entire process. Collectively, the duration of drug development from initiation of clinical testing until drug approval is estimated at 7.5 years. Including pre-clinical development, it takes, on average, 10 - 15 years to develop a new drug from its discovery to regulatory approval [1,4] (Figure 2). A study that investigated how improved productivity of the process would reduce the costs associated with drug development reported that a 5% reduction in total development time will decrease development costs by 3.5%. Although this may not sound like much, 3.5% of US$ 1 billion is a substantial saving. The study also emphasized the reduction in costs if decisions to terminate unproductive development programs are shifted to earlier phases of the development process. For example, the study estimated that if a company manages to shift a quarter of its decisions to terminate from phase II to phase I, it would save US$ 22 million. Again, this relates back to why attrition drives development costs. Making the decision to terminate a development program earlier would stop further investment into unfruitful programs and free resources to improve approval rates.
Industry continues to strive to bring new drugs to the market, despite the process being protracted, costly and particularly uncertain of success. Over the last decade, overall drug development time has increased by 20% and the rate of approval of new chemical entities has dropped by 30%. There is a mounting need to nurture output from the drug development process. Restructuring and streamlining of this process are required to increase its productivity and alleviate some of the financial pressures that drug developers experience. One area in particular where pruning of this process is overdue is the early pre-clinical detection / prediction of potentially hepatotoxic chemical entities.
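The durations and savings quoted in the preceding paragraph can be tallied in a few lines. The sketch below uses only figures from the text; the dictionary layout and variable names are illustrative assumptions, and the US$ 1 billion base cost is the round figure the text itself uses as an example, not a measured cost.

```python
# Average phase durations in years, as quoted in the text.
PHASE_YEARS = {
    "pre-clinical": 4.3,
    "phase I": 1.0,
    "phase II": 2.2,
    "phase III": 2.8,
    "regulatory review": 1.5,
}

# Clinical testing through approval excludes the pre-clinical stage.
clinical_to_approval = sum(
    years for phase, years in PHASE_YEARS.items() if phase != "pre-clinical"
)
discovery_to_approval = sum(PHASE_YEARS.values())

# The text's illustrative base cost and the reported 3.5% cost reduction
# that accompanies a 5% reduction in total development time.
BASE_COST_USD = 1_000_000_000
COST_REDUCTION = 0.035
saving_usd = BASE_COST_USD * COST_REDUCTION

print(f"Clinical testing to approval: {clinical_to_approval:.1f} years")
print(f"Pre-clinical start to approval: {discovery_to_approval:.1f} years")
print(f"Saving on a US$ 1 billion program: US$ {saving_usd / 1e6:.0f} million")
```

The phase averages sum to 7.5 years from the start of clinical testing and to 11.8 years when pre-clinical development is included, which sits inside the 10 - 15 year range quoted above.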
2. Attrition due to hepatotoxicity
Drug-induced liver injury (DILI) is a challenge for both the pharmaceutical industry and regulatory authorities. The most severe adverse effect that DILI may lead to is acute liver failure, resulting in either death or liver transplant. Of all the cases of acute liver failure in the U.S., between 13% and 50% can be attributed to DILI [11,12]. Without a doubt there is great concern for the safety of consumers exposed to drugs that may cause DILI because patients have only one liver. For this reason, government and the public put pressure on regulatory authorities to ensure safer drugs. However, if regulatory authorities raise safety standards without scientific evidence, this will discourage drug development because of attrition, which is particularly unwanted in the current scenario where fewer antimicrobials are being developed alongside increased antibiotic resistance.
A prevailing issue in drug development is the attrition of new drug candidates. Between 1995 and 2005, a total of 34 drugs were withdrawn from various markets (Table 1) and the reason for withdrawal in the majority of cases was hepatotoxicity. Hepatotoxicity is the leading cause of drug withdrawals from the marketplace [15-17]. Examples include the monoamine oxidase inhibitor, iproniazid, the anti-diabetic drug, troglitazone, and the anti-inflammatory analgesic, bromfenac, all of which induced idiosyncratic liver injury. Iproniazid, the first monoamine oxidase inhibitor released in the 1950s, was probably the most hepatotoxic drug ever marketed. Troglitazone was available on the U.S. market from March 1997. By February 2000, 83 patients had developed liver failure, of whom 70% died. Of the 26 survivors, 6 required liver transplants. While on the market, troglitazone accrued approximately US$ 700 million per year. Withdrawals of lucrative drugs like troglitazone diminish return on investments and threaten further R&D.
Of all classes of drugs, non-steroidal anti-inflammatory drugs (NSAIDs) have had one of the worst track records regarding hepatotoxicity. Benoxaprofen and bromfenac are two NSAIDs that were withdrawn from public use after reports of hepatotoxicity [16,19]. Benoxaprofen was withdrawn in 1982, the same year that it was approved. Bromfenac was predicted to earn around US$ 500 million per year.
Although diclofenac is widely used to treat rheumatoid disorders, approximately 250 cases of diclofenac-induced hepatotoxicity have been reported. In perspective, DILI caused by diclofenac has an incidence of 1-2 per million prescriptions [20,21], which is high enough that a considerable amount of literature has been generated warning against diclofenac-induced hepatotoxicity. Between 1982 and 2001 in France, more than 27 000 cases of NSAID-induced liver injuries were reported. Clometacin and sulindac were the NSAIDs with the highest risk of DILI. Over the same period approximately 2100 cases of NSAID-induced liver injuries were reported in Spain, with the main culprits being droxicam, sulindac and nimesulide. Acetaminophen (a.k.a. paracetamol) must be the most notorious of all the NSAIDs, if not all drugs, when it comes to DILI. Its mechanism of hepatotoxicity is better understood than its therapeutic mechanism of action. Fortunately, acetaminophen has a substantial therapeutic index and large amounts need to be administered before the liver can no longer cope.
Troglitazone was available on the U.S. market for three years before withdrawal, during which time it was used by almost 2 million patients, realising some return on investment. Ximelagatran, on the other hand, was in the very late stages of development when its fate was sealed. In fact, AstraZeneca had already applied at the EMEA for marketing approval when the company withdrew all applications due to concerns over the hepatotoxic potential of the drug. Although this drug did reach the market in France, the U.S. FDA was not prepared to grant approval and the drug was never marketed in the U.S. Ximelagatran, which was the first orally available thrombin inhibitor and would have replaced the troublesome warfarin as an oral anticoagulant, serves as a good example where huge investments were made to get the drug to market, but a return on investment was never realised. This example emphasizes the necessity for improved methodologies to predict intrinsic hepatotoxicity more accurately during the initial phases of the drug development process.
Examples of other drugs that were never marketed in the U.S. because of hepatotoxicity include ibufenac, perhexiline and dilevalol. There are also drugs for which the use / application has been limited because of possible DILI. These include isoniazid, pemoline, tolcapone and trovafloxacin. A big question that remains a challenge for regulatory authorities is how rare or mild hepatotoxicity has to be for a drug to be approved and to remain on the market. Undoubtedly, DILI has a sizeable influence on drug development output. Pre- and post-marketing attrition as a result of DILI causes further financial stresses for those in the industry. Limiting attrition to the early phases of drug development can only be beneficial. Both the pharmaceutical industry and regulatory authorities agree that there is a great need for improved methodologies and strategies to accurately assess the hepatotoxic potential of compounds earlier in the drug development process [13,26].
3. Safety pharmacology and current practices used to detect hepatotoxicity
Distinct from pharmacology proper, which examines the desired effects and kinetics of a particular drug, safety pharmacology identifies and characterises secondary adverse pharmacological and toxicological effects of potential drugs, mainly through the use of established animal models. Regulatory authorities require that certain minimal safety pharmacology examinations be completed before an investigational new drug application will be approved. These international regulatory guidelines were compiled by the International Conference on Harmonisation (ICH) in the documentation covering topic S7. The ICH S7A and ICH S7B guidelines have been in effect since 2000 and 2001, respectively.
At present, pre-clinical safety pharmacology investigations focus on three physiological systems: the cardiovascular system, the respiratory system and the central nervous system (for compounds that may cross the blood-brain barrier). Effects on the cardiovascular system are of great concern because 1) it is a system often found to be affected and 2) it lacks redundancy (organisms relevant to drug development have only one heart). Like the heart, the respiratory system is of concern because it is essential to the immediate survival of the organism.
Hepatic safety does not form part of the core battery of pre-clinical tests performed for initial safety pharmacology. The EMEA has published draft guidance on the non-clinical assessment of hepatotoxic potential. This draft amounted to a clinical white paper; however, no regulations have been set in place yet. This initial draft may demonstrate the future intent of regulatory authorities. If this is the case, not only is it worthwhile for pharmaceutical companies to consider improved pre-clinical evaluation of hepatotoxic potential for their own profit, but it may soon be required as part of their investigational drug applications before first-in-human trials.
Currently, in vivo screening for hepatotoxicity during both the pre-clinical animal testing and clinical phases of the development process forms the basis of hepatic safety testing. However, from an extensive study of available literature, Biowisdom, a healthcare intelligence company, estimates that between 38% and 51% of compounds showing liver effects in humans do not present similar effects in animal studies, including both rodents and non-rodents. Mainstay clinical chemistry can be used to detect certain types of hepatic injury. For example, the aminotransferases, alanine aminotransferase (ALT) and aspartate aminotransferase (AST), can be used to identify hepatocellular injury, whereas levels of bilirubin and alkaline phosphatase (ALP) can be used to assess hepatobiliary health [28,31]. Of the two aminotransferases, ALT is by and large superior at predicting hepatocellular injury for two reasons: 1) ALT is a more sensitive signal than AST because it is found in higher concentrations in the cytosol of hepatocytes and 2) ALT is also more specific to the liver than AST, as AST is normally also present in the blood, skeletal muscle and heart. The ratio of AST to ALT has been found useful to differentiate, to some degree, between different types of liver injury. An AST/ALT ratio > 2:1 may be indicative of an alcoholic type liver injury, whereas a ratio of 1:1 could point to non-alcoholic steatohepatitis (NASH). Logistic regression analysis on 784 reports of DILI received by the Swedish Adverse Drug Reaction Advisory Committee between 1970 and 2004 found that, in combination, an AST/ALT ratio > 1 and bilirubin > 2 × upper limit of normal (ULN) had a higher positive predictive value than either AST in combination with bilirubin or ALT in combination with bilirubin.
The "Rezulin rule" was coined to describe the observation that the more marked and the more frequent ALT elevations are during clinical trials, the more significant post-approval hepatotoxicity appears to be. Rezulin was the marketing name of troglitazone.
ALT elevations of > 3 × ULN are considered a sensitive signal of a potentially hepatotoxic test compound. Data from 28 clinical trials (phases II - IV) conducted by GlaxoSmithKline between 1995 and 2005 found elevations in ALT of > 3 × ULN at baseline to be rare, with a prevalence of 6.265%. A study of Merck clinical trial databases reported that elevations of ALT or AST > 3 × ULN had an 83% sensitivity to detect serious liver disease [Senior, 2003]. ALT > 3 × ULN has proved a useful threshold for screening for clinically significant DILI from various hepatotoxic substances. This includes drugs that have been withdrawn from the market due to hepatotoxicity, such as troglitazone and bromfenac. However, this is not a very specific signal, as increases in aminotransferase levels can also be induced by drugs that do not cause DILI, such as aspirin, statins and heparin. Indeed, the Merck study showed that the predictive power of elevations of ALT or AST > 3 × ULN was only 11%. A separate manuscript also reported high sensitivity and specificity when using ALT > 3 × ULN, but again with very low predictive power (only 6%). Serum ALT or AST levels are therefore a sensitive screen for possible hepatotoxic side-effects, but not definitive enough to terminate a drug development program.
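Why a sensitive test can still have low predictive power follows from Bayes' rule, as the sketch below illustrates. It assumes that "predictive power" in the studies above corresponds to positive predictive value. Only the 83% sensitivity is taken from the text; the specificity and prevalence values are hypothetical, chosen solely to show the effect of a rare outcome, and the function name is an illustrative choice.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Probability that a positive test reflects true disease (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


SENSITIVITY = 0.83          # quoted in the text for ALT or AST > 3 x ULN
SPECIFICITY = 0.95          # hypothetical value, for illustration only
PREVALENCES = (0.005, 0.05, 0.5)  # hypothetical values, for illustration only

for prevalence in PREVALENCES:
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%} -> PPV {ppv:.1%}")
```

With a rare outcome, false positives from the large unaffected population outnumber true positives, so the predictive value stays low even when sensitivity and specificity are both high.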
Even though it was originally not intended as such, the most successful predictor of hepatotoxicity is “Hy’s law”, which is based on the original observations made by Dr. Hyman Zimmerman. It was described by Dr. Zimmerman as "clinical jaundice" and its modern application has proved valuable in being able to predict idiosyncratic hepatotoxicities brought about by drugs / potential drugs such as troglitazone and dilevalol. A more recent description is a state of drug-induced jaundice caused by hepatocellular injury, without any significant obstructive component [17,35]. Therefore, Hy’s Law is met when:
There exists the possibility that a drug (or potential drug) can induce hepatocellular damage as evident from elevations in serum aminotransferase levels of ≥ 3 × ULN and
These elevations are accompanied by elevations in total bilirubin of ≥ 2 × ULN with no evidence of intra- or extra-hepatic obstruction (elevated ALP) or Gilbert’s syndrome.
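The two criteria above can be expressed as a simple screening check. This is a minimal sketch rather than a validated clinical rule: the 3 × ULN and 2 × ULN thresholds come from the text, whereas the ALP cut-off used to flag possible obstruction, the `LiverPanel` structure and all field names are assumptions made for illustration.

```python
from dataclasses import dataclass

AMINOTRANSFERASE_THRESHOLD = 3.0  # >= 3 x ULN, from the text
BILIRUBIN_THRESHOLD = 2.0         # >= 2 x ULN, from the text
ALP_OBSTRUCTION_THRESHOLD = 2.0   # assumption: the text gives no numeric cut-off


@dataclass
class LiverPanel:
    """Serum values expressed as multiples of the upper limit of normal."""
    alt_x_uln: float
    ast_x_uln: float
    total_bilirubin_x_uln: float
    alp_x_uln: float
    gilbert_syndrome: bool = False


def meets_hys_law(panel: LiverPanel) -> bool:
    """Return True when both Hy's Law criteria listed above are satisfied."""
    hepatocellular_injury = (
        max(panel.alt_x_uln, panel.ast_x_uln) >= AMINOTRANSFERASE_THRESHOLD
    )
    jaundice = panel.total_bilirubin_x_uln >= BILIRUBIN_THRESHOLD
    possible_obstruction = panel.alp_x_uln >= ALP_OBSTRUCTION_THRESHOLD
    return (
        hepatocellular_injury
        and jaundice
        and not possible_obstruction
        and not panel.gilbert_syndrome
    )


# Hypothetical subjects, for illustration only.
hepatocellular_case = LiverPanel(5.2, 4.1, 2.6, 1.1)
cholestatic_case = LiverPanel(5.2, 4.1, 2.6, 3.4)
print(meets_hys_law(hepatocellular_case), meets_hys_law(cholestatic_case))
```

A positive result from such a check is only a flag for follow-up; other causes of the same laboratory pattern still have to be excluded before a compound is labelled a potential hepatotoxin.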
It is worth noting that Dr. Zimmerman himself placed some weight on the degree of jaundice as it often served to predict a negative outcome. Hy's Law is, however, not exclusive to DILI and if it is met, it is of utmost importance that any other condition(s) that may also cause these symptoms be excluded before any conclusions are drawn about a potential intrinsic hepatotoxin. Such conditions may include viral hepatitis, hypotension or congestive heart failure. Obviously, the possibility of DILI caused by concomitant drugs should also be excluded.
The incidence of idiosyncratic DILI is generally 1 per 10 000 or less. This makes it exceptionally difficult to detect idiosyncratic hepatotoxicity due to an investigational drug during clinical testing, even if several thousand subjects are studied. Generally, an investigational drug is not administered to more than 2000 subjects, which makes it unlikely that an adverse event with an incidence of 1 in 10 000 will be detected. Although it plays the role of key predictor of the hepatotoxic potential of an investigational drug during drug development, Hy's Law falls short of constituting a "gold standard" and validation of Hy's Law is much needed, chiefly with regard to its sensitivity and specificity. Moreover, for the purposes of detecting potential intrinsic hepatotoxins as early as possible during drug development, the foremost drawback of Hy’s law is that it requires in vivo testing of the investigational drug. Hy's Law is therefore not a realistic approach for traditional in vitro testing, the type of testing that can be applied prior to vast resources being invested into in vivo testing.
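The difficulty of detecting a rare event in a trial of limited size can be quantified with a short sketch. It uses the incidence (1 per 10 000) and trial size (2000 subjects) quoted above, and assumes that subjects are independent and share the same risk; the 95% confidence target and the function names are illustrative assumptions.

```python
import math


def prob_at_least_one_case(incidence: float, n_subjects: int) -> float:
    """Chance of seeing one or more cases among independent subjects."""
    return 1.0 - (1.0 - incidence) ** n_subjects


def subjects_needed(incidence: float, confidence: float) -> int:
    """Smallest trial size giving the requested chance of at least one case."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - incidence))


INCIDENCE = 1 / 10_000   # from the text
TRIAL_SIZE = 2000        # from the text
CONFIDENCE = 0.95        # assumption: a conventional target, not from the text

p_detect = prob_at_least_one_case(INCIDENCE, TRIAL_SIZE)
n_required = subjects_needed(INCIDENCE, CONFIDENCE)

print(f"Chance of at least one case in {TRIAL_SIZE} subjects: {p_detect:.1%}")
print(f"Subjects needed for a {CONFIDENCE:.0%} chance: {n_required}")
```

Under these assumptions a 2000-subject program has less than a one-in-five chance of observing even a single case, and a trial would need on the order of 30 000 subjects to be reasonably sure of seeing one.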
4. Methodologies applicable to the early pre-clinical assessment of potential intrinsic hepatotoxicity
The ultimate goal of research in this field is to establish an in vitro model or tier of in vitro tests that is valid and able to accurately predict DILI during lead optimisation, before any hepatotoxic chemical entity under development unnecessarily progresses into in vivo studies. Such a model is still needed, as more than 90% of candidate drugs that enter the clinical phases of drug development still fail to complete development due to inadequate safety, pharmacokinetics or efficacy. The following sections will focus mainly on in vitro methods as these can be conducted at lower expense and at higher throughput than conventional animal studies.
4.1. Cell-based models
4.1.1. Cell cultures
Cell-based models are increasingly used as there is growing pressure from organisations such as the European Centre for the Validation of Alternative Methods (ECVAM) to reduce, refine and replace the use of animals. The three basic types of cells used for in vitro toxicity testing are transformed cell lines, primary cells and pluripotent cells. The latter will be discussed in more detail later in the manuscript. The advantages of using transformed cell lines include unlimited supply, no genetic variation, which aids reproducibility and predictive power of an outcome, as well as access to the collective knowledge gained from global research conducted on the geno- and phenotype of the cell line in question. The HepG2 cell line was one of 20 cell lines of human origin that were used in one of the first international attempts to predict in vivo toxicity through in vitro techniques during the Multicentre Evaluation of In Vitro Cytotoxicity (MEIC) program, initiated by the Scandinavian Society of Cell Toxicology in 1989. The program was based on two main assumptions: 1) there exists some “basal cytotoxicity” that can be quantified, and 2) in vitro methods can be used to model some type of “general toxicity”, which is related to the basal cytotoxicity concept. Basal cytotoxicity was defined as “the toxicity of a chemical to basic cellular functions and structures, common to all human cells”. Although the study lacked some systemic focus, results from the MEIC study conducted on 50 reference chemicals demonstrated that in vitro cytotoxicity assays were able to predict lethal human blood concentrations just as well as rodent LD50 values were.
The use of immortalized human hepatocyte cell lines, like HepG2 cells, was proposed to overcome limitations of primary human hepatocytes including the scarce availability of fresh human liver samples, complicated isolation procedures, short life-span, inter-donor variability, and cost. HepG2 cells display morphological features similar to those of liver parenchymal cells and maintain many functions of in vivo hepatocytes, expressing receptors for insulin, transferrin, epidermal growth factor and low density lipoprotein. These cells also express a plethora of cellular products (www.atcc.org). The HepG2 cell line has been used extensively in research as a model to study cytotoxicity, liver lipid metabolism, mitochondrial homeostasis [45,46], oxidative stress, gluconeogenesis and glucose uptake, to mention just a few. The applications are very broad and there is a vast collective knowledge about how these cells behave and respond under specified conditions or when exposed to various stressors. This must be one of the greatest advantages when using these cells, especially in mechanistic studies. However, it is believed that observations made with these cells cannot be extrapolated to humans as they do not behave as native hepatocytes would because of discrepancies in drug biotransformation. HepG2 cells are known to express low levels of cytochrome P450 (CYP) enzymes compared to primary hepatocytes [49,50].
The chief advantage that primary cell cultures have over most perpetual cell lines is that they are the closest in vitro representation of the in vivo cell type under scrutiny. Hence, primary hepatocytes are considered the “gold standard” for predictive toxicology. Unlike transformed cell lines, primary cultures have limited growth and life-span, and fresh stock needs to be sourced regularly. This is problematic in itself as human hepatocytes are scarce and their availability sporadic. Another drawback of using primary cultures is the donor-to-donor variability that is introduced into the results, which will decrease the power of predicting a specific outcome. Although primary hepatocytes initially express higher levels of metabolic enzymes than transformed cell lines like HepG2 cells, in culture their liver-specific functions decrease over time.
Two techniques have received attention over the years to try to improve the life-span of primary hepatocytes in culture. These are sandwich culturing techniques and special medium formulations. Sandwich culturing techniques address the conformational / spatial discrepancies between the 2D in vitro and 3D in vivo microenvironments. Hepatocytes are seeded on top of a layer of either collagen I or matrigel, which mimics in vivo extracellular matrix. An additional overlay of extracellular matrix is then layered on top of the seeded hepatocytes [51,53]. Additives to medium formulations attempt to imitate endogenous signalling found in the in vivo milieu. Serum and corticosteroids are known to affect cultured hepatocyte morphology. Contrary to the general thought that adding serum to medium is good for cells, adding serum to medium that is used for culturing primary hepatocytes will cause the cells to rapidly deteriorate and lose cytoplasmic integrity and bile canaliculi-like structures. The corticosteroid, dexamethasone, has also proven helpful in improving primary hepatocyte life-span in culture. Culturing primary hepatocytes in a sandwich conformation with extracellular matrix, no serum and dexamethasone allows the conservation of liver-specific functionality for several weeks.
The problems of low levels of enzyme expression in HepG2 cells and limited life-span of primary hepatocytes were overcome with the emergence of the HepaRG cell line. These cells express higher levels of CYPs than HepG2 cells and respond acutely to induction of these enzymes. HepaRG cells maintain hepatic functions of primary hepatocytes and express normal levels of liver-specific genes while lacking the inter-donor variability observed with primary hepatocytes [50,54]. A lot of literature praises HepaRG for its increased metabolic activity, which allows the in vitro study of drug metabolism using a theoretically unlimited supply of cells. This is certainly a remarkable advancement for in vitro drug metabolism and pharmacokinetic (DMPK) studies. However, this is a fairly new cell line (first described in 2002) and the accrued collective knowledge of this cell line is dwarfed by that of the HepG2 cell line. A recent study compared the whole genome expression profiles of HepG2, HepaRG (differentiated and undifferentiated) and primary human hepatocytes to that of human liver tissues. It was found that in terms of correlation with human liver tissues, the cell cultures ranked: primary human hepatocytes > HepaRG > HepG2, which bodes well for the future of the HepaRG cell line for use in predictive toxicology.
4.1.2. Outcomes and detection methods
Researchers at Pfizer postulate that the poor predictive power of conventional cytotoxicity assays is related to the endpoint being measured. Cytotoxicity endpoint assays only assess the final extreme of a series of pathological events that lead to cellular death. Assays that target such late events are likely to fail in detecting more subtle types of toxicity that develop after chronic, low-dose exposure and manifest as non-lethal, but definitely adverse, complications. In addition to this, the liver is the only organ in mammals that can fully regenerate after injury, making testing for adaptive changes even more relevant and applicable. An example of this scenario of subtle toxicity can be found in troglitazone, which exerts sub-acute hepatotoxicity by acting on a sub-cellular level, disrupting mitochondrial homeostasis. The mechanism of toxicity of troglitazone was investigated by means of in vitro models [45,59,60], which emphasizes the role that cell-based test systems can play during early drug development. It is substantially easier to utilize cell-based in vitro models to examine sub-cellular events, compared to higher levels of biological organization, i.e. whole organisms. Cells are the first level of organization where all the lifeless constituents that comprise a cell function together as an entity, and the first level where disrupted interplay between sub-cellular components can be evaluated. Cell-based models are more than suitable for the task at hand, but what is being evaluated using these models is critical to the success of such attempts. Rather than cell death / survival endpoints, some adaptive / pre-lethal mechanistic endpoints that can be considered include mitochondrial homeostasis [45,46,61,62], generation of reactive oxygen species (ROS), lipid peroxidation, Ca2+ signalling and inhibition of enzymes and transporters, especially bile acid transporters (Figure 3).
An important tool that was used in the MEIC study, and remains relevant to current methodologies, is that of mathematical modelling. In the MEIC study, researchers employed partial least squares regression . Mathematical modelling provides a way in which researchers can combine data from different endpoint assays, thereby allowing them to piece together underlying associations and correlations observed when drugs (or groups of drugs) affect normal cellular function. Previous research illustrated how mathematical modelling of multiparametric data can aid prediction . Seventeen compounds (7 known hepatotoxins and 10 "unknowns") were subjected to testing using 6 separate endpoint assays monitored with a fluorescent plate reader. The data was then used to develop 5 prediction models. Modelling techniques included logistic regression, support vector machines (using several different kernel functions), decision tree, quadratic discriminant analysis and neural networks. Discriminant analysis was found to yield the best positive and negative predictive values . In addition, the study highlighted the significance of adequate sample size and careful consideration and defining of positive and negative test values in the training data set. It is important to realise that the task of predicting DILI from in vitro studies does not only depend on the parameters that are measured but also on how the acquired data is reduced and utilised to reach the critical “go” / “no-go” decision.
As with the study by Flynn and Ferguson , high content screening (HCS), which is based on automated microscopy, also employs fluorescent probes. HCS is one cell-based methodology that has shown promising results in predicting DILI. This methodology has three key strengths: 1) the ability to simultaneously examine multiple parameters of cellular function, 2) all parameters can be examined in individual cells, and 3) it has the potential for high-throughput screening since it is based on a microplate format. Combined, these features culminate in powerful technology. Testing more than 240 drugs using an HCS platform, researchers at Pfizer demonstrated that this methodology had overall sensitivity and specificity of 90% and 98%, respectively, for predicting in vivo hepatotoxicity . When employing the HCS platform the sensitivity of predicting severely hepatotoxic drugs was 100%, and 80% for moderately hepatotoxic drugs. This is a noteworthy improvement over the conventional cytotoxicity assays that showed scores of 20% and 24% for the same predictions . The authors again stressed the value of chronic, sub-lethal exposure conditions to allow cellular phenomena to manifest. Recently, a similar study on 61 hepatotoxic and 12 non-hepatotoxic drugs / compounds examined the same parameters (nuclear morphology, plasma membrane integrity, mitochondrial membrane potential, and Ca2+ fluxes) with the added parameter of lipid peroxidation, where scores of > 90% for both sensitivity and specificity were reported .
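Sensitivity and specificity figures such as those reported for HCS are derived by comparing in vitro calls against known in vivo outcomes. The sketch below shows the calculation on a small, entirely hypothetical set of ten compounds; none of the values are taken from the studies cited above, and the function and key names are illustrative.

```python
def screening_metrics(predicted, actual):
    """Compare assay calls with known outcomes (True = hepatotoxic).

    Assumes each of the four denominators is non-zero, i.e. the data set
    contains both toxic and non-toxic compounds and both kinds of call.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # toxic, flagged
    fn = sum(1 for p, a in pairs if not p and a)      # toxic, missed
    tn = sum(1 for p, a in pairs if not p and not a)  # safe, cleared
    fp = sum(1 for p, a in pairs if p and not a)      # safe, flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical data: 6 known hepatotoxins followed by 4 non-hepatotoxins.
known_outcome = [True] * 6 + [False] * 4
assay_call = [True, True, True, True, True, False, False, False, False, True]

metrics = screening_metrics(assay_call, known_outcome)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Reporting the predictive values alongside sensitivity and specificity matters because the former depend on the proportion of hepatotoxins in the test set, which differs between the studies described above.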
Another fluorescence detection method that has a potential role in early pre-clinical assessment of intrinsic hepatotoxicity is flow cytometry. Essentially, this method of detection can analyse the same parameters as fluorometry and fluorescence microscopy. It has not been explored in as much detail as HCS, but initial reports are positive. There is room for research comparing these different methods of detection, and the jury is still out on which platform outperforms the rest.
4.2. Profiling technologies
Virtually all responses to toxic insults are accompanied by differential gene expression. Differential gene expression is likely to be accompanied by differential transcription and differential protein expression (adaptive responses in Figure 3). On this conceptual basis, researchers have tried to use profiling technologies such as genomics / transcriptomics and proteomics to discriminate between compounds that may or may not induce liver injury, and even between subsets of chemical entities that cause different types of hepatotoxicity such as necrosis, steatosis and cholestasis.
The sensitivity of genomics experiments is high enough to detect subtle changes in gene expression profiles. For this reason, genomics is argued to be more sensitive than conventional methodologies aimed at detecting toxicity. Indeed, this was demonstrated in rats exposed to sub-toxic doses of acetaminophen, where subtle changes in gene expression profile were observed although no histological changes manifested. This bodes well for the ability of toxicogenomics to identify the most sensitive signals of potential hepatotoxicity. The authors did, however, emphasise the importance of demarcating toxic events, sub-toxic / adverse events, and adaptive responses, as this will greatly influence the outcomes of toxicogenomic studies. The ability to detect responses at a molecular level that are not necessarily revealed at a phenotypic level makes it possible to address questions about the linearity of the dose-response curve at low exposure levels, and allows for more accurate determination of inflection points along the dose-response curve and of threshold exposure levels. These determinants can play pivotal roles in safety pharmacology when selecting dosages for clinical studies. Regarding the predictive power of genomics, Zhang et al. achieved 83% accuracy in predicting human DILI using data obtained from rats, based on a gene expression signature found in rats that met Hy’s law. Unfortunately, toxicogenomics using cell cultures and toxicogenomics using rodents do not correlate as well as one would hope. Following acetaminophen exposure, in vitro toxicogenomics using primary hepatocytes yielded results comparable to those of in vivo toxicogenomics regarding acute cellular toxicity. However, in vivo toxicogenomics revealed gene expression changes due to an inflammatory response, which in vitro toxicogenomics failed to detect. This exposes the stubborn dilemma of inter-dependent physiological systems within an organism, which are very difficult to recreate experimentally.
Unlike the genome, the proteome is a dynamic entity that changes as gene activation and epigenetic factors alter protein expression due to endogenous and exogenous signals and factors. Studying the proteome allows the surveillance of current cellular events, which can only be deduced from genomics data. This is probably the greatest disadvantage of toxicogenomics compared to toxicoproteomics; there are many splice variants, post-translational modifications and subcellular localizations of the final products originating from genes [72,73] implying that some degree of extrapolation is necessary when predicting cellular events from genomics data. When studying the proteome, differential expression such as this can be detected and this may in fact form part of the solution, rather than part of the problem.
Studying the proteome provides a direct description of cellular functions. Thus far, toxicoproteomic attempts to predict DILI have demonstrated limited efficacy when performed in vitro. Toxicoproteomics performed on in vitro cultures has the key advantage that biologically significant alterations can be monitored without relying on whole animals. Exposing HepG2 cells to three model hepatotoxins that are known to cause necrotic, steatotic or cholestatic liver injury, researchers were only able to distinguish cholestatic injury from untreated controls. The study failed to discern necrotic and steatotic events. However, the ability to detect adverse cholestatic events in HepG2 cells is noteworthy, as these cells are not known to form biliary structures in a monolayer but rather present as parenchymal cells. This implies that morphological studies would not have been able to detect this event, and further suggests that native morphological features may not be required in order to detect / predict certain types of toxicity when utilising in vitro proteomic approaches.
Perhaps a more integrated approach would eventually prove more fruitful. Researchers conducted a study in which they characterised methapyrilene-induced hepatotoxicity in rats employing three profiling technologies simultaneously: genomics, proteomics and metabolomics . The report demonstrated the possibility and great value of these technologies when used in an integrative manner, where responses to the toxic insult could be followed from genetic expression changes, to protein up- / down-regulation, through to changes in the metabolite profile, which gave a very good indication of where and how the chemical entity may exert its biochemical action(s). Conducting this type of study on a substantial number of compounds, both hepatotoxic and not, will yield a vast amount of data on how hepatocytes react toward challenges with different types of chemical entities and provide insight into which responses should raise concern and which are harmless. It may also deliver further understanding of the mechanisms by which hepatocyte injury occurs.
One major drawback of all the profiling technologies is that most of the current research has been carried out in vivo, which is generally undesirable in the drug development scenario. The ultimate goal of predictive toxicology would be to develop techniques that can be used in vitro. However, it should also be noted that it is highly unlikely that animal studies will be avoided altogether, at least not for the foreseeable future, which leaves room for in vivo profiling technologies as adjuncts to conventional safety pharmacology testing. In fact, they may help justify the use of animals for safety pharmacology testing. It is not impossible to employ these technologies using in vitro platforms, but more research is necessary to develop and establish effective methodologies and biomarkers.
4.3. Emerging technologies
The main reason for a lack of in vitro predictive power is the difference in phenotype between perpetual hepatocyte monolayers and native in vivo hepatocytes. The traditional approach to circumvent this problem was the use of primary hepatocytes, which are considered to be the “gold standard” for in vitro hepatotoxicity studies. Two emerging technologies that may offer alternative solutions are hepatocytes differentiated from stem cells and 3D culturing techniques.
4.3.1. Stem cell technologies
As all physiological processes take place in a cellular setting, the highest quality of cells should be used to determine safety and efficacy. This led to the use of stem cells. Stem cells are classified as embryonic or adult according to their developmental status. Where adult stem cells are multipotent (yield the cell types of the tissue from which they originate), embryonic stem cells are pluripotent (can give rise to differentiated cell lineages of all three germ layers). Stem cells that originate from embryos have a normal diploid karyotype and do not exhibit donor-dependent variability. The advantage of these cells compared to primary cells is that they can be maintained in culture for a longer period of time and can be grown on a large scale, producing high volumes.
The use of murine embryonic cells to identify human developmental toxins, allowing for early identification of the toxicity of candidate compounds in the discovery pipeline, was initiated by ECVAM. Mouse hepatocyte-like cells, which were established from embryonic stem cells, were the first to be used in hepatotoxicity models. The efficiency of cell differentiation and maturation was subsequently improved, yielding cells that produced alpha-fetoprotein and albumin. Of these cells, 70% expressed the phenotypic marker albumin, and they metabolized ammonia, lidocaine and diazepam at nearly two-thirds the rate of primary mouse hepatocytes. However, the difference in metabolism between humans and mice is considerable, leading to interspecies extrapolation problems. Subsequently, hepatocyte-like cells were differentiated from hESC. These cells displayed liver-related characteristics such as expression of α-fetoprotein and hepatocyte nuclear factor 4α, production of albumin, induction of CYP450 enzymes, glycogen storage and uptake of indocyanine green. This was followed by more differentiated hepatocyte-like cells, which additionally express functional glutathione transferase activity at levels comparable to human hepatocytes.
The advantages of stem cells in relation to transformed/tumour or primary cells are that the former possess normal growth, genetic transformation and genetic composition as well as uniform physiology and pharmacology. Since stem/progenitor cells can differentiate into clinically relevant cell types, but still maintain functional similarities to their in vivo counterparts, they may allow for safer drugs to be introduced into clinical trials and the marketplace. Other advantages of stem cells include the availability of cell types which were not previously available and the ability to investigate cellular renewal, regeneration, expansion and differentiation. Stem cells can also be genetically modified using reporter gene constructs, thereby providing specific disease models.
As with all technologies, there are still hurdles to overcome with stem cell technology. Many clinically relevant cell types cannot be efficiently differentiated, purified and isolated. Human stem cells that reproducibly deliver hepatocytes with predictive pharmacology results for high-throughput safety screens are limited. Although progress has been made in differentiation protocols, in scaling cell growth and in plating for cell-based assays, these protocols will need further refinement in order to ensure homogeneous preparations. Currently, panels of human embryonic stem cells which reflect the wide variation in the population are not available.
Despite these hurdles, stem cells hold potential for investigating metabolic competence, biotransformation capacity and transformation of exogenous compounds, as well as for determining human inter-individual differences due to genetic polymorphisms.
4.3.2. 3D culturing techniques
Although 2D techniques have the advantages of being relatively inexpensive, reproducible, robust and convenient, they have the chief disadvantage that much of the functionality of native hepatocytes is lost, which raises the question of how relevant such a model is for predicting DILI. Three-dimensional culturing of hepatocytes is an attempt to imitate an in vivo environment in order to obtain cells that more closely resemble innate hepatocytes, thus producing a more relevant model to study hepatotoxicity whilst using an in vitro platform.
The sandwich configuration of 3D culturing is frequently used when propagating primary hepatocytes, as it has been shown that maintaining these cells in this configuration prolongs their in vitro life-span by promoting cellular junctions, cell-cell and cell-matrix interactions, and maintaining differentiation [85-87]. After seeding a sandwich culture of primary hepatocytes, the cells require a recuperation period of > 40 h. During this period a number of morphological changes occur as the hepatocytes acclimatise to their new environment, one of which is the formation of bile canaliculi [63,88]. The latter highlights a particular role that 3D culturing techniques may play in predicting cholestatic-type DILI. Cholestatic DILI has been problematic to detect or predict using in vitro systems because most cell lines do not produce the native biliary structures when propagated in monolayer configuration. HepaRG cells, differentiated using dimethyl sulphoxide and glucocorticosteroids, have been reported to form biliary-like structures when grown in 2D format. Building on the work of Liu et al., Ansede et al. demonstrated that it is possible to determine whether drugs may induce cholestasis using primary rat hepatocytes in the sandwich culture configuration. To monitor bile acid transport, deuterated taurocholic acid was used, which is easily discernible from endogenous sodium taurocholate using liquid chromatography/tandem mass spectrometry. The effect of Ca2+ on hepatocyte tight junction integrity was exploited in order to discern between hepatic uptake and efflux of deuterated taurocholic acid. Using this approach the researchers were able to determine total and intracellular bile acid accumulation, biliary excretion index and in vitro biliary clearance.
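The quantities listed above can be illustrated with a short calculation. The sketch below uses the definitions of biliary excretion index and in vitro biliary clearance as they are commonly given for sandwich-cultured hepatocytes; the text above does not state the formulas, so these definitions, the function names and the example numbers are all assumptions made for illustration. Accumulation measured in standard (Ca2+-containing) buffer is taken to represent cells plus bile canaliculi, whereas accumulation in Ca2+-free buffer, which opens the tight junctions, is taken to represent cells only.

```python
def biliary_excretion_index(accum_standard: float, accum_ca_free: float) -> float:
    """Percentage of accumulated substrate that resides in the bile canaliculi.

    accum_standard: accumulation in Ca2+-containing buffer (cells + bile),
                    e.g. pmol/mg protein.
    accum_ca_free:  accumulation in Ca2+-free buffer (cells only), same units.
    """
    if accum_standard <= 0:
        raise ValueError("total accumulation must be positive")
    if accum_ca_free < 0:
        raise ValueError("intracellular accumulation cannot be negative")
    return (accum_standard - accum_ca_free) / accum_standard * 100.0


def in_vitro_biliary_clearance(
    accum_standard: float, accum_ca_free: float, auc_medium: float
) -> float:
    """Amount excreted into bile divided by substrate exposure in the medium.

    auc_medium: area under the medium concentration-time curve over the
                incubation (e.g. pmol*min/uL), giving uL/min/mg protein.
    """
    if auc_medium <= 0:
        raise ValueError("medium AUC must be positive")
    return (accum_standard - accum_ca_free) / auc_medium


if __name__ == "__main__":
    # Hypothetical measurements for illustration only.
    total, cells_only, auc = 100.0, 40.0, 600.0
    print(biliary_excretion_index(total, cells_only))
    print(in_vitro_biliary_clearance(total, cells_only, auc))
```

Under these definitions, a drug that inhibits bile acid efflux would be expected to lower the biliary excretion index of the labelled taurocholate while raising its intracellular accumulation.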
A manuscript that unmistakably illustrates the important role that 3D culturing techniques can play in drug development is that of Lee et al. Using sandwich-cultured rat primary hepatocytes, the authors were able to elucidate the hepatoprotective effect of dexamethasone on trabectedin-induced hepatotoxicity. At the time of the study, trabectedin was a promising new antineoplastic agent that showed activity against various cancers at nanomolar concentrations and had already reached phase II clinical trials, but was found to have dose-limiting hepatotoxic side-effects. The report highlights the fact that experiments using primary rat hepatocytes cultured in a monolayer configuration were unable to replicate the known hepatoprotective effect of dexamethasone against trabectedin-induced toxicity. The reason for this was the lack of hepatobiliary functionality of the hepatocytes in monolayer configuration, which also explains why the sandwich configuration was able to show that dexamethasone protected hepatocytes by restoring normal hepatobiliary function. This suggests that 3D culturing techniques may hold the key to predicting different subtypes of DILI such as hepatocellular necrosis and cholestatic injury. Whether or not similar experiments would prove successful when using HepaRG cells is still to be determined.
It is difficult for nutrients to reach hepatocytes, and for waste products to be removed from them, in a traditional sandwich configuration because the cells are entrapped in a thick extracellular matrix. The perfusion sandwich culture and the entrapment of cells between ultra-thin porous silicon membranes were developed to surmount these complications. In addition to maintaining hepatobiliary function, both these methods claim added predictive capabilities for DILI, as demonstrated through increased sensitivity to acetaminophen toxicity due to preserved metabolic enzyme functionality. Still, even with these improved methods, the life-span of these primary hepatocytes remains limited, which restricts the use of such methods on a large scale.
Other 3D culturing methods are mainly based on bio-artificial liver bioreactors that are aimed at developing extracorporeal liver support systems for patients with acute liver failure. In the past, such bioreactors were based on adult hepatocytes and proved unsuccessful because the hepatocytes failed to proliferate. The latest of these being explored for use in drug toxicity testing is the four-compartment perfusion model. Cells are contained in one of the four compartments; the remaining three compartments comprise three independent but interwoven artificial capillary bundles that form the capillary bed in which the cells are housed. Cells are derived from hESCs, and research is currently being carried out to obtain the optimal protocol for differentiating these cells into mature hepatocytes that closely resemble innate hepatocytes. This research is led by the EU Vitrocellomics project.
Anchorage-free 3D culturing methods result in the formation of small hepatocyte aggregates known as spheroids. There are different ways to induce the formation of spheroids, including continuously-stirred bioreactors, the rocked suspension technique and rotating wall bioreactors. Initial experimentation demonstrated that, between spheroids and monolayers, there was indeed differential toxicity induced by 7-day methotrexate exposure. It was thought that this was due to preservation of hepatocyte functionality, but it could also have been due to failure of the test compound to penetrate the spheroidal structure. More than a decade later it is well known that, in spheroids, liver-specific functions such as albumin and urea synthesis and metabolic activities are maintained for prolonged periods of up to 21 days. In time, spheroids deposit an extracellular matrix consisting of laminin, fibronectin and collagen, which encapsulates each individual spheroid. These structures also preserve histotypical cytoarchitecture, intercellular contacts (gap junctions) and biliary canaliculi. Moreover, when hepatocytes grown under these conditions are encapsulated in alginate polymers, albumin and urea synthesis doubles and phase I and II metabolic activities are also elevated. This may be attributed to the bulk added to the extracellular matrix by the alginate polymers, which protects the hepatocytes from shear stresses under hydrodynamic conditions. A setback of this technique is the difficulty of obtaining spheroids that are of a specific mean diameter (100 μm) and batches of spheroids that are all similar in size. This is necessary because necrotic cell death may occur at the centre of spheroids if the diameter of these aggregates exceeds approximately 300 μm. The reason for this is the lack of oxygen perfusion to cells located in the central region of spheroids that are too large.
Recently researchers attempted to predict hepatotoxicity employing hepatocyte spheroids developed from an immortalised cell line, a HepG2 derivative (C3A), instead of primary hepatocytes . The study emphasizes the value of proper dosing during toxicity testing. In the study spheroids were not exposed to a set concentration of drug in the culture medium for individual experiments. Rather, the concentration of drug in the culture medium was adjusted with each experiment to mimic in vivo dosing practices where the amount of drug was altered according to the amount of protein present in the bioreactor i.e. dosages were reported as mg drug / mg protein. Using this approach the researchers were able to obtain more accurate predictions of lethal human blood concentrations compared to conventional 2D culturing techniques.
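The protein-normalised dosing described above amounts to simple arithmetic, sketched below. The function names, the chosen dose and the protein and volume figures are hypothetical and are not taken from the cited study; the sketch only shows how fixing the dose in mg drug / mg protein causes the medium concentration to vary from one experiment to the next.

```python
def drug_mass_for_dose(dose_mg_per_mg_protein: float, protein_mg: float) -> float:
    """Mass of drug (mg) needed to deliver a fixed dose per mg of cellular protein."""
    if dose_mg_per_mg_protein < 0 or protein_mg <= 0:
        raise ValueError("dose must be non-negative and protein must be positive")
    return dose_mg_per_mg_protein * protein_mg


def medium_concentration(drug_mass_mg: float, medium_volume_ml: float) -> float:
    """Resulting drug concentration in the culture medium (mg/mL)."""
    if medium_volume_ml <= 0:
        raise ValueError("medium volume must be positive")
    return drug_mass_mg / medium_volume_ml


if __name__ == "__main__":
    dose = 0.05     # hypothetical dose, mg drug per mg protein
    volume = 10.0   # hypothetical bioreactor medium volume, mL

    # Two hypothetical bioreactors with different amounts of spheroid protein.
    for protein in (12.0, 20.0):
        mass = drug_mass_for_dose(dose, protein)
        conc = medium_concentration(mass, volume)
        print(f"protein {protein} mg: add {mass:.2f} mg drug ({conc:.3f} mg/mL)")
```

With a conventional fixed-concentration design the second bioreactor would receive less drug per unit of biomass than the first, which is the discrepancy this dosing approach is meant to avoid.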
5. Future directions
It would be fair to say that 2D culturing techniques have predominated since the inception of research on artificially cultured cells, and as such numerous ways have been developed to analyse cells in the 2D format. This is one of the key advantages that 2D culturing techniques have over 3D culturing techniques, as demonstrated by the multiple parameters that can be simultaneously assessed using HCS. Currently, this is not possible when using 3D cultures, as the cells are not all in the same plane and cannot be examined individually. On the other hand, the relevance of 2D culture models is questionable when compared to 3D models that more closely resemble their native counterparts. Various reports have shown that 3D culturing methods are superior to 2D cultures in detecting or predicting certain types of DILI, especially cholestatic injury, as 2D models do not express the morphology necessary to study this. Profiling technologies may be able to bridge the gap between 2D and 3D culture models because they are applicable to both scenarios and have been shown to distinguish cholestatic hepatotoxins even when applied to 2D cultures.
The proteome represents current events on a cellular level and 3D cultures are better depictions of innate hepatocytes. Therefore, proteomic investigations that are based on 3D cultures, dosed using in vivo practices (mg drug/ mg protein), and are similar in size to the large DILI prediction studies that have been conducted on 2D cultures may prove exceedingly valuable in providing researchers with a set of protein biomarkers that can successfully predict DILI in humans. A substantial amount of research is necessary into this field of interest.
What is missing from current literature is the assessment of 3D cultures to express / secrete biomarkers that are currently used in the clinical setting, i.e. ALT, AST, ALP and bilirubin, and how these respond following challenge with various drugs. Research into this area may uncover possible accurate extrapolations that can be validated for use in predicting DILI. For instance, it is possible that 3D cultures secrete sufficient quantities of ALT and bilirubin to be measured in the surrounding culture medium. Maybe these markers will fluctuate in a way similar to what would occur in the in vivo setting and it may therefore be possible to assess the criteria for, and apply, Hy’s law on an in vivo-like in vitro system.
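To make the idea concrete, the biochemical part of Hy’s law can be written as a simple check. The sketch below uses the thresholds as they are commonly cited for clinical use (ALT at least 3 times the upper limit of normal, total bilirubin more than 2 times the upper limit of normal, and no marked alkaline phosphatase elevation that would indicate cholestasis); the clinical criteria also require that no other cause explains the findings, which cannot be encoded here. Applying such a check to culture medium would require validated reference limits for the in vitro system, which are an assumption of this sketch; the function name and example values are hypothetical.

```python
def meets_hys_law_biochemistry(
    alt: float,
    alt_uln: float,
    bilirubin: float,
    bilirubin_uln: float,
    alp: float,
    alp_uln: float,
    alt_fold: float = 3.0,        # commonly cited ALT threshold (x ULN)
    bilirubin_fold: float = 2.0,  # commonly cited total bilirubin threshold (x ULN)
    alp_fold: float = 2.0,        # ALP at or above this suggests cholestasis
) -> bool:
    """Return True if the marker pattern matches the biochemical criteria.

    Each marker is compared with its own upper limit of normal (ULN).
    For an in vitro system the ULN values would have to be established
    experimentally; here they are simply supplied by the caller.
    """
    for name, uln in (("ALT", alt_uln), ("bilirubin", bilirubin_uln), ("ALP", alp_uln)):
        if uln <= 0:
            raise ValueError(f"{name} upper limit of normal must be positive")

    hepatocellular_injury = alt / alt_uln >= alt_fold
    impaired_function = bilirubin / bilirubin_uln > bilirubin_fold
    no_cholestasis = alp / alp_uln < alp_fold
    return hepatocellular_injury and impaired_function and no_cholestasis


if __name__ == "__main__":
    # Hypothetical marker values and reference limits for illustration only.
    print(meets_hys_law_biochemistry(
        alt=150.0, alt_uln=40.0,
        bilirubin=3.0, bilirubin_uln=1.2,
        alp=100.0, alp_uln=120.0,
    ))
```

Keeping the fold thresholds as parameters reflects the open question raised above: whether the clinical cut-offs carry over to an in vitro system would itself need to be validated.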
The in vitro technologies necessary to shift the detection and prediction of candidate drugs that may cause DILI from the clinical phases of drug development to the early pre-clinical phase are available at present. There are various types of in vitro technologies available and each has its own unique advantages and disadvantages. For this reason, different approaches may be able to identify and predict certain types of DILI better than others and vice versa. Therefore, an integrated approach based on multiple models may be a step in the right direction if an in vitro platform is desired. Cultures of hepatocyte spheroids may be convenient in this scenario. At the end of an experiment, individual spheroids from the same bioreactor can be examined using different technologies (some can be used for profiling, others for microscopic evaluation, and still others for fluorescent analyses following digestion), which would make the results truly comparable in that all the spheroids would have been subjected to exactly the same conditions.
Work is necessary to incorporate the available methods into a standard set of tests, comprising different tiers, which generates data that can be interpreted as a whole to aid the critical ‘go’ / ‘no-go’ decision (the earlier, the better). Such a set of experiments will greatly improve lead prioritization before astronomical amounts of funds are invested in a particular potential drug. In the long run this will increase the productivity of the entire drug development process by alleviating some of the financial pressures and shortening time-scales from drug discovery to marketing, as less time is spent on candidates that will eventually fail in the clinical phases. Finally, it should aid regulatory authorities in granting approval and provide safer drugs for consumers.
European Centre for the Validation of Alternative Methods (ECVAM); Drug metabolism and pharmacokinetics (DMPK); Drug-induced liver injury (DILI); High content screening (HCS); International Conference on Harmonisation (ICH); Multicentre evaluation of in vitro cytotoxicity program (MEIC); Non-steroidal anti-inflammatory drugs (NSAIDs); Pharmaceutical Research and Manufacturers of America (PhRMA); Research and development (R&D).