The development of chemical tool compounds is becoming increasingly important for chemical biology research projects in many disciplines of the life sciences. In addition, these compounds form an essential part of both academic and industrial drug discovery efforts. The expertise and technology platforms required for the identification and optimization of these potent and target-selective small molecules often exceed the capabilities of academic groups and smaller companies. Over the years, several initiatives have been created around the world to address this issue by establishing networks or consortia of academic institutes, public-private partnerships with industry, or even dedicated new research infrastructures for chemical biology. Several of these organizations and their different access models are described here. We focus in particular on the model of EU-OPENSCREEN ERIC, a new European Research Infrastructure which was founded in 2018 and consists of more than 20 partner institutes from eight countries.
- academic consortia
- chemical tools
- pharmacological screening
In the last decade, the interdisciplinary field of chemical biology has emerged from the need to understand the role of proteins or signaling pathways in cellular systems and whole organisms better than was previously feasible with more classical genetic tools or methods. Rather than changing the levels of proteins, or completely blocking their expression or activity, by deleting or overexpressing their respective DNA or RNA sequences, it is becoming increasingly possible to modulate their function precisely in a time- and concentration-dependent manner using potent, selective and cell-permeable chemical compounds. Although the relevance of these so-called chemical tool compounds or probes for solving basic mechanistic questions in life sciences is indisputable, their role often extends into the fields of pharmacology and molecular medicine. In fact, chemical tools play an important role in the validation of newly identified drug targets in pharmaceutical companies, and might even serve as starting points for the development of new therapeutics.
Despite recent technological advances in areas such as cryo-electron microscopy (Cryo-EM), the major approach for identifying bioactive substances is still the systematic testing of compound collections, often comprising many thousands or even millions of individual substances, with target- or pathway-specific biological assays. These assays are designed to produce reproducible biological activities with high signal-to-noise ratios under experimental conditions which are fast, miniaturized and therefore cost-effective. This approach is technically and logistically challenging and, in the past, could only be performed by large pharmaceutical companies. In addition to experienced personnel, it requires large facilities with often expensive equipment for compound storage, automated liquid handling and sensitive detection of biological reactions. In recent years, however, this picture has started to change. In the wake of the sequencing of the human genome, mostly larger academic institutions started to create their own screening and translational drug discovery centers, because many new potential drug targets suddenly became available for which a solid understanding of their physiological roles and molecular mechanisms was missing. At the same time, pharmaceutical companies faced increased pressure due to high drug development costs, often resulting in downsized research budgets and cost-cutting exercises, combined with a general trend towards risk aversion regarding innovative drug targets with potentially high failure rates. As a result, many experienced industrial ‘drug hunters’ found employment in academic chemical probe discovery centers, supporting their efforts and helping to alleviate some of the initial issues these centers faced.
In this chapter we describe some of these new initiatives which were created to develop chemical tool compounds outside of the traditional pharmaceutical industry, highlighting their particular strengths, challenges and access models for the mostly academic scientific community.
2. Developing chemical probes in academic networks
At the beginning of the 21st century, academic institutions first began to implement dedicated assay development and screening centers, which were soon followed by reports on the systematic testing of small molecule compound libraries in the US. Comparable efforts in Europe’s research institutes immediately received much attention, fostering collaborations between chemistry and biology groups and the establishment of academic screening platforms of diverse size. However, single platforms alone could not comprehensively support the needs of academic or industrial users because of the limited chemical diversity of their compound collections and/or limited technical capabilities, and big pharma platforms were at that time not open to academic users. Pooling and coordination of public resources and expertise became imperative. Therefore, the efforts in the US were replicated with similar initiatives in countries such as France (Chimiothèque Nationale), Germany (ChemBioNet), Spain (ChemBioBank) and several others. Some years later, long-term collaborations between academic centers from different countries as well as public-private partnerships were established. We describe some of these initiatives in more detail and put particular emphasis on the collaborative model of the youngest organization for chemical biology, the EU-OPENSCREEN ERIC.
2.1 The molecular libraries program (MLP) in the US
The large, NIH-funded MLP was created in 2004 with the ambitious goal of creating a small molecule probe for every human protein in order to define the functions of genes, cells, and whole organisms in health and disease. The three components of the initiative were essentially: (a) a network of comprehensive and specialized screening centers plus specialized chemistry centers, (b) several cheminformatics approaches, which also included a newly created public compound database called PubChem with assay metadata, and (c) initiatives to generally advance technologies in the fields of chemical diversity, assay development and screening. The aim was always to publish the new chemical probes and associated data immediately so that compounds could be used by the academic scientific community not only for basic research questions, but also for mechanistic validation of potentially disease-relevant drug targets and drug development.
Individual scientists could apply to the NIH for funding of their assay development and screening projects. Projects that passed peer review were taken on board by one of the MLP centers, and high-throughput screens were conducted with a library which, by the end of the program, consisted of 390,000 compounds. About 5% of these molecules, often with novel scaffolds, were delivered by the academic synthetic chemistry community. In many cases, further chemical optimization yielded probes against protein targets which were deemed challenging or even ‘undruggable’. Overall, during the 10-year period of the program, a total of 375 chemical tool compounds were developed against a broad range of target classes. Eighteen of these compounds were considered sufficiently interesting to serve as starting points for the development of therapeutics against a total of eight disease targets or target classes, and were licensed to biotech and pharmaceutical companies. In light of the investment in the MLP, it is debatable whether the ratio of probes to drug candidates can be regarded as a success or a disappointment, but it certainly highlights the difficulties that chemical biologists face when they want to keep up with the speed of biological discoveries while translating academic findings into therapeutics.
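The figures quoted above can be put side by side with a few lines of arithmetic. The following Python sketch is purely illustrative: the constant and function names are our own, and the "library compounds per probe" ratio is a crude indicator that we introduce here, not a metric used by the program itself.

```python
# Figures quoted in the text for the Molecular Libraries Program (MLP).
LIBRARY_SIZE = 390_000       # compounds in the screening library at program end
ACADEMIC_FRACTION = 0.05     # "about 5%" delivered by academic chemists
PROBES_DEVELOPED = 375       # chemical tool compounds over the 10-year program
PROBES_LICENSED = 18         # probes licensed as starting points for therapeutics


def mlp_summary() -> dict[str, float]:
    """Derive simple ratios from the published MLP figures."""
    return {
        # Approximate number of library compounds from academic chemists.
        "academic_compounds": LIBRARY_SIZE * ACADEMIC_FRACTION,
        # Share of probes that were licensed to companies.
        "licensed_share_of_probes": PROBES_LICENSED / PROBES_DEVELOPED,
        # Crude indicator only: probes result from optimization, not from
        # the screening library alone.
        "library_compounds_per_probe": LIBRARY_SIZE / PROBES_DEVELOPED,
    }


summary = mlp_summary()
print(f"Academic compounds: about {summary['academic_compounds']:,.0f}")
print(f"Licensed share of probes: {summary['licensed_share_of_probes']:.1%}")
print(f"Library compounds per probe: {summary['library_compounds_per_probe']:,.0f}")
```

On these numbers, roughly one in twenty probes was licensed, which is the ratio the assessment above refers to.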
2.2 The chemical biology consortium Sweden (CBCS)
Although much smaller than the MLP in the US, this example of a national consortium illustrates very well the particular strengths of a focused organization with only a few members. CBCS, with two nodes at the Karolinska Institutet and Umeå University, was founded as a not-for-profit research infrastructure for chemical biology in 2010 by researchers from Biovitrum (formerly Pharmacia and Upjohn) and became an integrated platform of SciLifeLab, an already existing national centre for molecular biosciences, in 2013. The combined platform can investigate both chemical and genetic perturbations in biological systems. CBCS aims to enable high-level basic research with open access publications while at the same time linking up academic and industrial groups. Complementary to CBCS, SciLifeLab offers a dedicated platform for drug discovery and development, with the clear goal of accelerating projects with translational potential. After nearly 10 years of operation, the consortium has produced more than 130 co-authored publications and 11 patent applications, while scientific data provided the basis of six start-up companies.
Users are encouraged to discuss project proposals in more detail with the CBCS staff prior to the submission of the official application. A proposal template, user agreements and estimated costs of typical screening and chemistry projects are available online. Project proposals are evaluated by an independent ‘Project Review Committee’ (PRC), which meets biannually. Prioritized projects may be subsidized, with the remaining costs covered by the applicant. Implemented projects are periodically re-evaluated by the PRC as they progress to pre-defined milestones. A project plan for a so-called “large collaborative project” may run for a maximum of 2 years, during which the user is expected to cover the costs for all reagents and consumables, including a compound access fee for plating of library compounds. There are also “small collaborative projects” which involve only limited CBCS support for a maximum of 2 weeks, e.g. short-term access to a specialized instrument such as an imaging plate reader. For these projects, no PRC application is required; instead they are undertaken with a “first come, first served” policy based on available resources. In line with the open access data policy of the CBCS, the applicant and the CBCS agree upon a clear publication strategy before the implementation of the project. The target user group of CBCS is academic researchers at Swedish research institutions who aim to develop chemical probes on a collaborative basis.
It is worth looking in more detail at the services CBCS can offer to its academic customers. The consortium assists in assay development for both biochemical and cell-based assays, gives access to the SciLifeLab compound collection and provides medicinal and computational chemistry expertise for hit validation and optimization. This model is very similar to the service offerings of the much larger European research infrastructure EU-OPENSCREEN, which is discussed below. In addition, mechanism-of-action studies can be performed with often specialized technologies such as cellular thermal shift assays (CETSA). In fact, the development of CETSA is a good example of how an expert consortium such as CBCS can influence and further develop disruptive technologies in collaboration with local academic groups and commercial partners (here: Pelago Biosciences). Starting life as a low-throughput assay, CETSA is now amenable to high-throughput screening. Scientists usually come to the CBCS with the concept for a biological assay and first experimental data. They then have the chance to work further on the assay in the CBCS laboratories under the guidance of its expert scientists, which combines scientific services with the education of users.
The CBCS compound collection consists of more than 200,000 compounds with high chemical diversity which are routinely quality controlled. While many of these compounds were donated by the pharmaceutical company Biovitrum, the library was further expanded with sets from commercial vendors and donations by other biotech companies. Importantly, the strategy has always been to build a modular collection of sub-libraries which can be adapted to the needs of each academic screening project, based mainly on assay throughput and cost per data point. For instance, in addition to a diverse primary screening set of 35,000 compounds, there are also focused libraries for particular target classes such as kinases, G-protein coupled receptors, agrochemicals etc., as well as a set of approved drugs. This is very different from the concept of EU-OPENSCREEN, which offers a high-throughput screening set of 100,000 commercial compounds to its users, with the goal of having that set screened in almost all projects so that each compound becomes associated with “positive” and “negative” screening data from as many projects as possible (see below).
Overall, between 2010 and 2018 more than 400 collaborative projects with 236 individual users in Sweden were discussed. User interest grew continuously during these 8 years, currently leading to approximately one new project discussion per week. About 25% of discussions result in PRC applications for large projects, while others receive limited support as small projects; together they are documented in, on average, 20 publications per year.
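The two project categories described above can be summarized as a simple decision rule. The Python sketch below is a simplification under our own assumptions: it classifies a request by planned duration alone, approximates the 2-year limit as 104 weeks, and uses hypothetical names (`ProjectRoute`, `route_project`) that do not correspond to any CBCS software.

```python
from dataclasses import dataclass

MAX_SMALL_PROJECT_WEEKS = 2     # small collaborative projects: at most 2 weeks
MAX_LARGE_PROJECT_WEEKS = 104   # large collaborative projects: 2 years (assumed 104 weeks)


@dataclass(frozen=True)
class ProjectRoute:
    category: str           # "small" or "large"
    needs_prc_review: bool  # evaluated by the Project Review Committee?
    allocation: str         # how capacity is assigned


def route_project(duration_weeks: int) -> ProjectRoute:
    """Classify a project request by its planned duration (simplified rule)."""
    if duration_weeks <= 0:
        raise ValueError("duration must be positive")
    if duration_weeks <= MAX_SMALL_PROJECT_WEEKS:
        # No PRC application; handled as resources become available.
        return ProjectRoute("small", False, "first come, first served")
    if duration_weeks <= MAX_LARGE_PROJECT_WEEKS:
        # Requires a PRC application and periodic milestone re-evaluation.
        return ProjectRoute("large", True, "PRC prioritization")
    raise ValueError("project plan exceeds the 2-year maximum")


print(route_project(1))
print(route_project(52))
```

In practice the distinction also depends on the extent of CBCS support requested, which this sketch does not model.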
2.3 Public private initiatives: SGC and ELF
In industry, chemical tool compounds play an important role as pharmaceutical modulators of novel drug targets. Typically, they are being used for testing a particular disease hypothesis and for validating the chemical tractability of newly discovered candidate proteins or signaling pathways for which otherwise comparatively little information is available. Sometimes their properties are even sufficient to act as starting points for drug discovery programs. The development of compounds with required potency and, most importantly, selectivity towards individual members of a protein class can be a formidable task even for larger pharmaceutical or biotech companies. It came therefore as no surprise that in 2009 several industrial partners decided to collaborate in a pre-competitive manner and initiated a public-private partnership (PPP) with leading academic institutes in the field of chemical biology. The aim was to develop high-quality chemical tool compounds for families of understudied proteins of potential therapeutic value, for instance epigenetic and other transcriptional modulators.
The chosen academic partners in that particular PPP were the universities of Oxford and Toronto, which had already formed the so-called Structural Genomics Consortium (SGC) in 2004 with the goal of determining the three-dimensional structures of proteins with therapeutic relevance. The SGC advocates open access partnerships between industry and academia and is committed to making its chemical tool compounds available without any restrictions. In the last 10 years, and with financial support from several pharmaceutical companies, more than 50 chemical probes in the areas of epigenetics and kinase signaling were developed [13, 14]. Furthermore, seven pharmaceutical companies made their chemical tool compounds from older research programs available to the scientific community, including protocols, controls and associated data. Efforts are now underway, under the umbrella of the Innovative Medicines Initiative (IMI), to expand the initial collection of compounds further by focusing not only on the protein classes which were selected in the past but also on the development of new technologies, making the identification and profiling of tool compounds generally faster and more cost-effective.
Another PPP initiative supported by the IMI is the European Lead Factory (ELF), a consortium of 20 partners. These currently include the universities of Oxford and Dundee, while several other universities, research organizations and companies in the UK, the Netherlands and Germany were former partners. The project was launched in 2013 and came to an end in 2018, with a follow-up five-year project funded in the same year. During its lifetime, the ELF established a selection of about 550,000 compounds which are generally not commercially available. Of these, 300,000 were donated by seven participating pharmaceutical companies, while the rest were synthesized by medicinal chemistry partner companies during the last 5 years. Both the compound management facility in the UK and the high-throughput screening center in the Netherlands were formerly part of pharmaceutical companies and are able to perform screening operations and chemistry services such as hit optimization and modeling according to industry standards. The Oxford Biotechnology group of the SGC was selected as a key contributor of 3D co-crystal structures, which are essential for compound optimization. During the lifetime of the project, more than 80 drug discovery programs across most therapeutic areas were pursued. By March 2018, two partnering deals between the respective project owner and one of the pharmaceutical company partners had emerged. Importantly, the ELF protects the IP rights of its academic collaborators against the pharmaceutical companies, ensuring that the academic researchers can always search for external partners if no development deal between them and one of the ELF industry partners can be reached. This was one of the main concerns when the project started in 2013.
It remains to be seen, though, whether and how academic groups really benefit from these ambitious initiatives, especially when their own research interests show little overlap with the essentially commercial interests of the participating pharmaceutical companies.
2.4 The European research infrastructure consortium (ERIC) EU-OPENSCREEN
EU-OPENSCREEN  is a community-driven, bottom-up initiative in Europe, which brings together 21 partner sites, i.e. technology platforms and research groups at various universities and research institutions, to form an open-access research infrastructure for chemical biology and early drug discovery. Instead of building an ivory tower, the aim of EU-OPENSCREEN is to establish and operate an infrastructure that facilitates and encourages the engagement with the broader scientific community. In the framework of EU-OPENSCREEN, the partner sites and external researchers collaboratively develop novel tool compounds (or chemical ‘probes’) that allow researchers to interrogate and study fundamental cellular processes, such as signaling or metabolic pathways.
EU-OPENSCREEN is one of 55 research infrastructures listed on the current ESFRI (European Strategy Forum on Research Infrastructures) Roadmap as an ‘ESFRI Landmark Project’, demonstrating its relevance for the European scientific community and the European Research Area (ERA). It is jointly funded by the research ministries of eight countries (the Czech Republic, Denmark, Finland, Germany, Latvia, Norway, Poland, Spain) and the European Commission. Since April 2018, it has operated as a European, not-for-profit organization (‘European Research Infrastructure Consortium’), which is based in Berlin, Germany, and is legally independent from any university or research institute. EU-OPENSCREEN, and the European Research Infrastructures in general, promote open science and open innovation.
Many technology platforms at universities and research institutes predominantly work with colleagues at their hosting institution. Larger European initiatives often engage with scientists from the Western European countries where these initiatives are based. Reaching out to scientists from regions which are often underrepresented in chemical biology and early drug discovery research, and encouraging their active participation, requires a different approach. Through its distributed network of partner sites across its member countries, EU-OPENSCREEN aims to achieve a more balanced engagement of local science communities. In each member country, a local partner establishes and coordinates a national network (e.g. CZ-OPENSCREEN in the Czech Republic, PL-OPENSCREEN in Poland, NOR-OPENSCREEN in Norway, the Drug Discovery and Chemical Biology Consortium (DDCB) in Finland, ChemBioNet in Germany) to raise awareness of the initiative and to efficiently encourage scientists at the local level to participate.
2.4.1 The research infrastructure
The EU-OPENSCREEN infrastructure provides open access to compound libraries, assay development and screening facilities, and medicinal chemistry and informatics platforms. It provides training and serves as a platform for industry engagement.
2.4.1.1 Compound collection
The EU-OPENSCREEN compound collection is a diversity library which has been designed in a collaborative effort of several partner sites. The library is jointly used by affiliated EU-OPENSCREEN partner sites for primary screening against biological targets solicited from external researchers who developed the appropriate assays. During the design of the library, 100,000 commercially available compounds were selected, with an emphasis on chemical stability, absence of reactive compounds, screening-compliant physico-chemical properties, and maximal diversity/coverage of chemical space. Furthermore, EU-OPENSCREEN crowd-sources compounds from external chemists worldwide, in a federated approach through its national chemical biology networks. This collection of academic compounds will, over time, add increasing uniqueness to the EU-OPENSCREEN compound collection. The ambitious goal is to gather up to 40,000 compounds over the next years and to realize the vision of a truly European compound collection. In this context, the EU-OPENSCREEN compound collection will be dynamic and expanding. In analogy to the ‘FAIR’ data principles (findability, accessibility, interoperability, and reusability; described below), structural compound information and quality control data will be available online in an interoperable format (interoperability), unique identifier codes for each compound will be employed (findability), quality control will ensure the identity and purity of the compounds (reproducibility), and the compounds will be distributed to partner sites, where they are accessible to external scientists and used in screening projects (accessibility). All compounds of the collection are carefully characterized and annotated for basic physico-chemical (e.g. solubility, light absorbance and fluorescence) and biological properties (e.g. cytotoxicity, antibiotic activity) by ‘profiling’ them in a standard panel of assays.
These bioprofiling data increase the reliability and reproducibility of screening results, and identify compounds with properties that could potentially perturb specific bioassay read-out technologies (e.g. auto-fluorescence, luciferase inhibition, etc.) in order to reduce false-positive results. For chemists who provide compounds to be incorporated in the compound collection, these profiling data are an important incentive, in addition to the bioactivity data from the screening projects.
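To illustrate how such profiling annotations could be used to reduce false positives, the following Python sketch separates primary hits into those without a known liability for the chosen read-out and those that deserve a counter-screen. All names (`CompoundProfile`, `READOUT_LIABILITIES`, `split_hits`) and the flag vocabulary are hypothetical and chosen for illustration only; they are not taken from the EU-OPENSCREEN data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundProfile:
    compound_id: str
    # Interference flags recorded during bioprofiling (hypothetical vocabulary).
    interference: frozenset[str] = frozenset()


# Assumed mapping from assay read-out technology to the flags relevant for it.
READOUT_LIABILITIES = {
    "fluorescence": frozenset({"auto-fluorescence"}),
    "luciferase": frozenset({"luciferase inhibition"}),
}


def split_hits(hits, profiles, readout):
    """Return (clean, suspect) hit lists for the given read-out technology."""
    liabilities = READOUT_LIABILITIES.get(readout, frozenset())
    clean, suspect = [], []
    for compound_id in hits:
        profile = profiles.get(compound_id)
        # Compounds without a profile carry no known flags.
        flags = profile.interference if profile else frozenset()
        if flags & liabilities:
            suspect.append(compound_id)  # candidate for a counter-screen
        else:
            clean.append(compound_id)
    return clean, suspect


profiles = {
    "C1": CompoundProfile("C1", frozenset({"auto-fluorescence"})),
    "C2": CompoundProfile("C2"),
}
print(split_hits(["C1", "C2"], profiles, "fluorescence"))
```

A compound flagged for one read-out is not discarded: the same hit remains usable in an assay whose detection technology it does not perturb.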
The jointly used compound collection is stored centrally by the Compound Collection Management Facility (CCMF) in Berlin, Germany, and aliquots are distributed to the affiliated EU-OPENSCREEN partner sites, which are located in the eight EU-OPENSCREEN member countries. The CCMF is responsible for the acquisition, selection, maintenance and storage of the central collection and for quality control of the compounds. The CCMF provides the screening and bioprofiling sites with copies of the compound collection (including, where necessary, cherry-picking for confirmatory and counter-screening activities).
In many cases, primary screening data, even from publicly funded programs, are not openly accessible to the scientific community. While private organizations, contract research organizations (CROs) and many public-private partnerships do not reveal data on a routine basis, EU-OPENSCREEN is committed to maximizing the re-use and impact of generated bioactivity data for the benefit of the wider scientific community. Therefore, EU-OPENSCREEN’s European Chemical Biology Database (ECBD) adheres to the FAIR principles and is closely linked to the ChEMBL database, which will increase the discoverability and re-use of EU-OPENSCREEN’s data. Via ECBD and ChEMBL, database users will be drawn from across the global biological and chemical science communities, both from academia and industry. Together with other European life sciences research infrastructures, EU-OPENSCREEN partners also contribute towards the optimization of technological implementation, integration and interoperability of data and tools within the European Open Science Cloud (EOSC) and participate in the Horizon 2020-funded ‘EOSC-Life’ project (www.eosc-life.eu/). Another initiative, to which the EU-OPENSCREEN partner Fraunhofer IME actively contributes, is the Innovative Medicines Initiative (IMI) funded ‘FAIRplus’ project (https://fairplus-project.eu/), which aims to facilitate the application of FAIR principles to data from certain IMI projects and datasets from pharmaceutical companies.
The ECBD is the central database for the integration of screening data from EU-OPENSCREEN projects, with advanced search, analysis, and visualization tools. There will be three levels of data management and access: first, the generation of bioactivity data for compounds in screening projects, implemented at the individual EU-OPENSCREEN screening sites using assays provided by the external collaboration partners; second, the integration of these screening datasets from partner sites into the ECBD; and, third, public dissemination of the data through established databases like ChEMBL and PubChem [25, 26]. The ECBD is hosted by Petr Bartunek, the coordinator of CZ-OPENSCREEN, and his team at the Institute of Molecular Genetics of the ASCR in Prague, Czech Republic, who have developed the open data resource Probes & Drugs portal as well as other databases such as Zebrabase. The e-infrastructure CESNET provides cloud-based hosting, backup and security.
An important aspect of integrating complex and diverse screening data from various affiliated, but legally independent, sites that jointly use the compound collection is the implementation of harmonized data standards and data curation. The ECBD adheres to well-established ontologies and identifiers which are commonly used by other similar open data repositories, such as ChEMBL or PubChem BioAssay, for example the BioAssay Ontology (BAO) for the classification and description of assays. Only officially accredited partner sites have permission to upload data into the ECBD, and uploaded data will be curated both automatically (e.g. file format, column values) and manually (e.g. data inspection) by the ECBD team. In case of ambiguities, the ECBD team contacts the data provider to resolve the issue. The ECBD team provides user support and help desk functions. Webinars on data deposition and on the use of ECBD for data searching, visualization and analysis are planned, and dedicated workshops will be organized to demonstrate all ECBD capabilities to database users and to share best practices in the community.
A grace period of up to 3 years between the completion of the primary screen and data publication in the EU-OPENSCREEN database is provided, during which the bioactivity datasets are not publicly accessible. This grace period allows for follow-up studies, publication in peer-reviewed scientific journals and securing of intellectual property.
Assay development and screening facilities, and medicinal chemistry groups: EU-OPENSCREEN’s affiliated screening partner sites implement the EU-OPENSCREEN high-throughput screening (HTS) and high-content screening (HCS) projects by using the EU-OPENSCREEN chemical compound collection, in collaboration with the external assay developer. They have been operational as local groups collaborating with external researchers over the past years, even before the EU-OPENSCREEN ERIC was established. A recent publication showcases several successful projects which have been realized by individual partner sites, as an example of the capabilities and expertise within the research infrastructure. The chemistry groups have an excellent, proven track record in medicinal chemistry and hit-to-lead/tool optimization. As part of the collaborations with external researchers, they provide services ranging from the re-synthesis of hit compounds and chemical optimization through synthesis of focused libraries of structurally similar analogues, to the elaboration of structure-activity relationships (SAR) and NMR and TOF-LC-MS analytics.
The EU-OPENSCREEN partner sites have been operational as local screening platforms for many years. During this time, they predominantly worked with colleagues from their hosting institution and university. By working with the same collaborators over a longer period, both sides could gradually gain practical experience and build a knowledge base, for example in developing miniaturized, robust assays which are amenable to screening large compound collections. One of the aims of EU-OPENSCREEN is to enable as-yet under-served and under-represented user communities, which by definition have not yet had the opportunity to gain practical experience in these areas. Therefore, EU-OPENSCREEN will offer training courses, for example in assay development and other aspects of high-throughput screening. Furthermore, staff exchanges at established partner sites for scientists from prospective sites in countries that are not yet members of EU-OPENSCREEN promote convergence in technical capacities.
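The automatic part of such a curation step (file format and column values) might look like the following Python sketch. It is a minimal illustration under stated assumptions: the column names in `REQUIRED_COLUMNS` and the function `check_upload` are hypothetical and do not describe the actual ECBD upload format or software.

```python
import csv
import io

# Hypothetical minimal schema for an uploaded screening dataset.
REQUIRED_COLUMNS = ("compound_id", "assay_id", "activity_percent")


def check_upload(csv_text: str) -> list[str]:
    """Return a list of problems found; an empty list means the file passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        # File format check: stop early if the expected columns are absent.
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    for row in reader:
        line_no = reader.line_num  # physical line of the row just read
        # Column value checks: identifiers must be present ...
        for column in ("compound_id", "assay_id"):
            if not (row[column] or "").strip():
                problems.append(f"line {line_no}: empty {column}")
        # ... and the activity value must be numeric.
        try:
            float(row["activity_percent"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: activity_percent is not a number")
    return problems


example = "compound_id,assay_id,activity_percent\nC1,A1,42.5\n,A1,high\n"
for problem in check_upload(example):
    print(problem)
```

Anything such a check cannot decide, for example whether a plausible but unusual value is correct, would fall to the manual inspection and the dialogue with the data provider described above.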
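The grace period amounts to a simple date calculation. The Python sketch below is illustrative only: the function names are our own, and the handling of a screen completed on 29 February (falling back to 28 February in a non-leap year) is an assumption rather than a documented EU-OPENSCREEN rule.

```python
from datetime import date

MAX_GRACE_YEARS = 3  # "up to 3 years" after completion of the primary screen


def latest_release_date(screen_completed: date,
                        grace_years: int = MAX_GRACE_YEARS) -> date:
    """Date on which the dataset becomes public for the agreed grace period."""
    if not 0 <= grace_years <= MAX_GRACE_YEARS:
        raise ValueError("grace period must be between 0 and 3 years")
    target_year = screen_completed.year + grace_years
    try:
        return screen_completed.replace(year=target_year)
    except ValueError:
        # 29 February does not exist in the target year (assumed fallback).
        return screen_completed.replace(year=target_year, day=28)


def is_public(screen_completed: date, today: date,
              grace_years: int = MAX_GRACE_YEARS) -> bool:
    """True once the grace period has elapsed."""
    return today >= latest_release_date(screen_completed, grace_years)


print(latest_release_date(date(2019, 6, 15)))
print(is_public(date(2019, 6, 15), date(2021, 1, 1)))
```

Because the period is an upper limit, a project may agree on a shorter value of `grace_years`, in which case the data become public earlier.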
2.4.2 Access to the research infrastructure for external researchers
External scientists have open access to a chemical library, assay development and screening facilities, medicinal chemistry and informatics platforms. There are three main groups of researchers who will benefit from EU-OPENSCREEN:
2.4.2.1 Access policy and procedure
The democratization of access to state-of-the-art technology platforms, resources and expertise is the key objective of all European research infrastructures. Importantly, because EU-OPENSCREEN is a European open access research infrastructure, a common access policy and procedure is applied across its network of partner sites. EU-OPENSCREEN is accessible to researchers from academia and industry worldwide. The access to EU-OPENSCREEN by external researchers is in line with the ‘European Charter for Access to Research Infrastructures: Principles and Guidelines for Access and Related Services’ published by the European Commission in 2016. The charter’s guidelines describe three modes by which access to research infrastructures may be provided: excellence-driven, market-driven and wide access. Excellence-driven access is provided to the majority of scientists who developed an assay and collaborate with EU-OPENSCREEN to implement a screening and/or hit optimization project, as well as to chemists who provide their compounds to be incorporated in the EU-OPENSCREEN compound collection. Scientists who use the ECBD will be provided wide access to the bioactivity data.
In this book chapter, we described various academic collaboration models which aim to accelerate chemical tool discovery. These initiatives differ in many aspects, for example in structure (e.g. individual academic research groups, public-private partnerships, research infrastructures; single-site vs. distributed/multinational), operational model (e.g. closed consortia, open-access research infrastructures), user communities, funding model (e.g. institutional funding, third-party funding over a defined funding period, long-term funding by member countries), access and data publication policies. These initiatives complement one another, and each supports academic chemical biology and drug discovery.
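The assignment of access modes to user groups described above can be written down as a small lookup table. The Python sketch below uses role labels of our own choosing (`UserRole` is hypothetical); the market-driven mode is deliberately left unmapped because the text does not state which users receive it.

```python
from enum import Enum


class AccessMode(Enum):
    """The three access modes named in the European charter."""
    EXCELLENCE_DRIVEN = "excellence-driven"
    MARKET_DRIVEN = "market-driven"
    WIDE = "wide"


class UserRole(Enum):
    """Hypothetical labels for the user groups mentioned in the text."""
    ASSAY_PROVIDER = "scientist who developed an assay"
    COMPOUND_PROVIDER = "chemist who contributes compounds"
    DATABASE_USER = "scientist who uses the ECBD"


# Mapping as described in the text; no role is mapped to MARKET_DRIVEN here.
ACCESS_BY_ROLE = {
    UserRole.ASSAY_PROVIDER: AccessMode.EXCELLENCE_DRIVEN,
    UserRole.COMPOUND_PROVIDER: AccessMode.EXCELLENCE_DRIVEN,
    UserRole.DATABASE_USER: AccessMode.WIDE,
}


def access_mode_for(role: UserRole) -> AccessMode:
    """Look up the access mode that applies to a given user role."""
    return ACCESS_BY_ROLE[role]


for role in UserRole:
    print(f"{role.value}: {access_mode_for(role).value} access")
```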
The authors would like to thank Ronald Frank (senior advisor of EU-OPENSCREEN, Berlin) and Anna-Lena Gustavsson (Head of the CBCS node at Karolinska Institutet, Stockholm) for ideas and information related to the content of the manuscript.
Conflict of interest
The authors declare no conflict of interest.
| ASCR | Academy of Sciences of the Czech Republic |
| CESNET | association of universities of the Czech Republic and the Czech Academy of Sciences, operating the national e-infrastructure for science, research and education |
| ChEMBL | chemical database of bioactive molecules with drug-like properties, maintained by the European Bioinformatics Institute of the European Molecular Biology Laboratory |
| ECBD | European Chemical Biology Database |
| ESKAPE | acronym encompassing the names of six bacterial pathogens commonly associated with antimicrobial resistance |
| NIH | National Institutes of Health |
| NMR | nuclear magnetic resonance |
| TOF-LC-MS | time-of-flight liquid chromatography mass spectrometry |