Predictive modeling of HHG Scenarios 1 and 2.
Numerous significant technological advancements mark the nearly six decades that have elapsed since the invention of the laser. We are nowadays witnessing a dramatic increase in attainable laser powers and intensities, concomitant with a drastic shortening of pulse durations. Super-intense lasers such as HERCULES, TPL, Vulcan, Astra Gemini and PHELIX constitute a notable achievement in terms of chirped-pulse intensity: an increase by six orders of magnitude within less than 10 years. Next-generation 10-PW laser systems are currently under consideration in various laboratories around the world. To give just one example, the 10-PW ILE APOLLON is envisaged to deliver an energy of 150 J in 15 fs at the last stage of amplification after the front end, with a repetition rate of one shot per minute, its intensity being expected to reach 10²⁴ W·cm⁻². Such elevated intensities are the forerunners of so-called ultrarelativistic regime applications, a regime in which not only the electrons but also the ions become relativistic within one laser period. As matter under extreme conditions can now be relatively easily generated and investigated, we are witnessing a worldwide advent of laboratory research in entirely “new physics”, from ultrarelativistic laser plasmas to high-energy particle acceleration and the generation of high-frequency radiation in the extreme-ultraviolet (XUV) and soft-X-ray regions. X-ray production by means of high-intensity laser-plasma interaction experiments is of particular interest for the scientific community since it is a way of attaining high-brightness X-rays with good coherence and, consequently, high-quality radiation sources. Among the variety of laser-based mechanisms deployed for this purpose, the most notable are betatron generation from laser wakefield acceleration and high-order harmonic generation (HHG).
In spite of the multitude of opportunities, technological issues remain to be addressed, and numerous phenomena occurring during the interaction are not yet fully understood. Some of these may be potentially damaging to experiments (e.g. hydrodynamic or parametric instabilities, hot electrons), hence their mitigation is vital. Ultimately, optimizing interaction conditions requires state-of-the-art theoretical and computational investigations.
In terms of simulation software, traditional approaches entail either hydrodynamic (fluid) or kinetic codes, in accordance with the laser-plasma interaction regime. Often, choosing between the two implies an inevitable dismissal of certain phenomena within reasonable accuracy limits. Modeling processes such as particle acceleration, plasma heating and parametric instabilities that occur during the interaction of ultrashort (pulse durations of sub-picoseconds down to tens of femtoseconds) and intense (intensities above 10¹⁷ W·cm⁻²) laser pulses with plasma requires mainly a kinetic treatment, normally achieved through the Particle-In-Cell (PIC) method, the most widely used numerical tool in plasma physics and laser-plasma interaction investigations. Although recognized as a suitable approach for analyzing the highly transient physical processes in the non-linear regime associated with ultrafast laser energy coupling to matter, PIC-based codes are subject to nonphysical behaviors such as statistical noise, non-physical instabilities, non-conservation and numerical heating. They also require considerable computational resources, being far more demanding than the fluid codes normally deployed to study phenomena on a nanosecond scale with “coarser” accuracy. For instance, running a 1D PIC simulation with a reasonable number of particles per cell, a fine grid and a small time step can take more than 20 CPU hours on a single-processor PC to simulate a few femtoseconds of interaction. The distribution function in a 3D3V PIC code is six-dimensional in nature. Should 100 grid points be allocated for each dimension, with each grid point represented in eight-byte double precision, the system would require as much as 7 TB just to store this data structure.
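The storage estimate above follows from simple arithmetic; the sketch below reproduces it with the grid size and precision quoted in the text (the figure of 7 TB corresponds to binary terabytes, i.e. TiB):

```python
# Back-of-the-envelope estimate of the memory needed to store the
# six-dimensional (3D3V) distribution function on a uniform grid.
GRID_POINTS_PER_DIM = 100   # grid points per dimension, as quoted in the text
DIMENSIONS = 6              # 3 spatial + 3 velocity dimensions
BYTES_PER_VALUE = 8         # double precision

total_points = GRID_POINTS_PER_DIM ** DIMENSIONS
total_bytes = total_points * BYTES_PER_VALUE

print(f"grid points : {total_points:.2e}")
print(f"storage     : {total_bytes / 1e12:.1f} TB "
      f"({total_bytes / 2**40:.1f} TiB)")
```

The result is 10¹² grid points and 8 × 10¹² bytes, i.e. roughly 7.3 TiB, in agreement with the estimate in the text.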
Despite recent advances in computing technology, running high-accuracy 3D or even 2D kinetic simulations remains a challenging task, even with a full migration towards GPUs.
Various simplified codes have hitherto been built and successfully used with reasonable compromises between accuracy on the one hand and storage requirements and speed on the other. LPIC++ [10, 11], XOOPIC and PIConGPU are good examples in this sense. Restraining the number of dimensions, in conjunction with either object-oriented programming or code parallelization, makes it possible to gain increased resolution (but over fewer dimensions) on more modest hardware. Among the state-of-the-art PIC codes employed for simulating a variety of laser-plasma problems are the well-established EPOCH, VSim, OSIRIS [16, 17, 18] and QuickPIC [19, 20]. Fully relativistic, parallelized and multidimensional, they all incorporate additional features accounting for phenomena normally disregarded by traditional PIC methods, thereby moving the simulations closer to the real world. For example, EPOCH includes multiphoton, tunneling and collisional ionisations; the latter two can also be found in OSIRIS. VSim is a hybrid code (combining kinetic and hydrodynamic treatments), while OSHUN [21, 22] permits the user to introduce multiple ion species. At the same time, system resources can be spared either by reducing the number of dimensions (a user option in EPOCH) or by separating the time scale of the driver's evolution from that of the plasma, thus transforming a fully 3D electromagnetic field solve and particle push into a sequence of 2D solves and pushes (QuickPIC's algorithm). Highly optimized to run even on a single CPU, these codes are scalable over a large number of cores, featuring dynamic load balancing across processors. Parallelization approaches include not only MPI and OpenMP but also SIMD vectorization, with most of the above-mentioned simulation environments having CUDA-enabled versions as well.
Running a PIC code on a top-of-the-line GeForce or on Tesla hardware can lead to significant improvements in terms of speed [23, 24, 25, 26, 27, 28, 29, 30, 31, 32] while maintaining a fairly large number of particles per cell. Breakthroughs have been reported especially with the particle push [33, 34, 35] and particle weighting [36, 37, 38] algorithms, but also with the parallelization of the current deposition phase [39, 40]. Successful attempts at integrating these schemes while mitigating some of the factors known to limit GPU performance (communication overhead between GPU and CPU, memory latency versus bandwidth, the relatively low level of multitasking, or inefficient I/O management when reading and writing files) include Jasmine [41, 42] and FBPIC [43, 44].
As cloud, big data and AI based technologies are nowadays becoming pervasive in all fields of the economy, predictive modeling should become just as ubiquitous in every research area, offering a convenient and reliable alternative for designing optimized experiments or for estimating potential results.
This chapter presents an overview of an entire class of predictive systems for laser-plasma interaction built at the National Institute for Lasers, Plasma and Radiation Physics, blending big data, advanced machine learning algorithms and deep learning, with improved accuracy and speed. Making use of terabytes of already available information (literature as well as simulation and experimental data), such systems have the potential of revealing various physical phenomena occurring in certain situations, hence enabling researchers to set up controlled experiments at optimal parameters. While the most obvious advantage of deploying predictive and/or prescriptive modeling is the considerably diminished running time in comparison to classic simulation codes, the motivation goes further: to having a readily compiled report containing the most favorable interaction conditions or warnings of the imminent presence of destructive phenomena. However, efficiently extracting, interpreting and learning from very large and heterogeneous datasets requires new-generation scalable algorithms as well as new data management technologies and cloud computing. In this sense, a big step forward was the deployment of Hadoop, together with its MapReduce algorithm and the Mahout library [47, 48]. Several other libraries were jointly used for deep learning purposes, namely Theano, TensorFlow, Keras and Caffe. Promising results (correctly predicted high-order harmonics in HHG experiments, along with the occurrence of hot electrons in certain interaction scenarios) have been obtained by combining deep neural networks (DNNs) and convolutional neural networks (CNNs) with ensemble learning [54, 55, 56]. The DNNs and CNNs were built by grid search [57, 58], in conjunction with dropout [59, 60, 61, 62] and constructive learning [63, 64, 65, 66, 67], with the CNNs exhibiting somewhat better performance in terms of speed and comparable accuracy in estimations.
The chapter offers a comparative discussion of these alternative predictive modeling solutions, highlighting the performance improvement gained by deploying each combination of advanced machine learning and deep learning algorithms. Moreover, a significant part of this analysis is devoted to the challenges, advantages, caveats, accuracy, ease of use and suitability of these systems to the actual interaction scenario.
The last section discusses the implications of big data and AI based predictive modeling for the scientific community: its potential not only in joining together experimental observations, theory and simulation data, but also its prospects in deriving meaningful analysis and recommendations out of the already available information.
The emergence of cloud computing and of open-source dedicated big data platforms like Hadoop and Spark, and of the ROOT framework [69, 70], along with the rise of deep learning [71, 72, 73, 74], have rendered data processing and analysis remarkably inexpensive. Massive amounts of a wide variety of information can today be interpreted at unprecedented speed. The consequence is particularly important for science for several reasons. Firstly, migrating from expensive in-house computing systems to infrastructure as a service (IaaS) significantly cuts capital investment costs. Secondly, the increased storage capacity and computing power make the cloud ideal for scientific big data application development, specifically for statistics, analytics and recommender systems. Furthermore, workload optimization strategies can easily be incorporated in order to use resources to maximum capacity. For applications that are both computationally and data intensive, the processing models combine different techniques, such as in-memory big data or combined CPU-GPU processing.
Predictive modeling in a continuously evolving field like laser-plasma interaction is challenging from several points of view, mainly because this is an area previously unexplored with machine learning techniques and smart agents. Simulations serving this purpose have to this day relied almost exclusively on codes that calculate according to various theories and approximations, hence on programmed software, not on software that adapts and learns from experience and common knowledge. ROOT remains the only physics-designated package that has taken steps in this new direction. Although it is mainly oriented towards signal treatment techniques and statistics, ROOT also incorporates machine learning algorithms to a limited extent.
Designing an intelligent predictive or recommender system for laser-plasma interaction must take many aspects into consideration. The starting point and, at the same time, a central decisive factor in the design is the available interaction data, its amount and its structure. Specifically, interaction data for a particular kind of experiment is mostly heterogeneous, in the sense that it can comprise experimental findings along with simulation yields and literature references, a situation bound to pose potential problems in terms of hardware, software environments and applicable machine learning paradigms. Storing and converting the available information into the same file format, especially when dealing with terabytes or petabytes, is a time-consuming operation. This caveat may be conveniently mitigated by using NoSQL databases, a notable feature of big data platforms such as Hadoop or Spark. Furthermore, NoSQL is schema-free, therefore facilitating structural modifications of data in applications. Through the management layer, data integration and validation can easily be attained. A second aspect of interaction data concerns features like inconsistency, incompleteness, redundancy or intrinsic noisiness. For a particular kind of experiment (e.g. a certain type of laser interacting with a specific target, in a predefined interaction configuration) there might be multiple results, owing to the fact that the same experiment was performed in different laboratories across the world. Consequently, the above-mentioned data characteristics can be explained by differences in diagnostic equipment or its placement, or by slight variations in the interaction configurations, in target compositions or in the type of optical components. Simulations performed with different codes or theoretical estimations might also exist in the literature. Two other possible situations concern unavailable data and divergent or conflicting reports.
Such variety entails various signal processing techniques like reduction, cleaning, filtering, integration, transforms and interpolations in order to remove noise, correct the inconsistencies and improve the decision-making process. However, these operations can be resource-intensive, so they should be performed via distributed computing in conjunction with fast-analytics tools such as Apache Impala and Apache Kudu.
Further applying machine learning algorithms to this type of extended set complicates things even more, firstly because we are dealing with large volumes of data (at least 1 TB and easily up to several hundreds of TB), and secondly because training even classical multilayer perceptrons (MLP) [80, 81, 82, 83], self-organizing maps (SOM) [84, 85] and especially support vector machines (SVM) [86, 87] on conventional computers renders the process extremely difficult. Practically, this is a striking argument in favor of custom-made clouds that provide not only computing power but also modularity, scalability and resilience. Beyond Hadoop's substantial parallelization, job dispatching and resource allocation capabilities, considerable speedup may be achieved within the Spark environment, owing to its graph technology. The built-in Mahout and MLlib machine learning libraries integrate many of the commonly deployed algorithms, allowing the user to modify them or add new self-written modules. Within these frameworks, a common MLP can easily evolve towards deep learning, since multiple hidden layers (or cascaded MLPs) are no longer an impediment to fast training and rapid convergence. The grid search algorithm permits testing multiple MLP topologies for the best performance on training and test sets, subsequently returning the best one; an order of a few tens of topologies is perfectly feasible. Another useful tool, the dropout family of methods, randomly excludes various neurons along with their incoming and outgoing connections in order to achieve performance improvements and avoid overfitting [89, 90, 91]. Some versions do not drop out units but instead omit a portion of the training data in each of the training cases, ultimately “averaging” over all the yielded MLPs (structures with the same topology but different weights, a consequence of the variation in the training set).
This approach mitigates both overfitting and potential falloffs or stagnations in the learning rate, effects associated primarily with sets featuring high percentages of redundant data. Other algorithms apply the “averaging” over networks with dropped-out units or over many networks with different weights, instead of merely considering the best configuration. Regardless of what is actually averaged, these solutions act similarly to ensemble learning and can also be combined with unsupervised techniques [92, 93, 94]. Conversely, constructive learning allows the user to add units or connections to the network during training, an approach known to be highly effective in escaping local minima of the objective function. All of the above classes of algorithms can be deployed for both CNNs and DNNs and, with slight modifications, even for 3D topologies of SOMs. Considerable boosts in terms of speed may be attainable through MapReduce acceleration or GPU-accelerated computing.
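The dropout mechanism described above can be sketched in a few lines; this is a minimal, stdlib-only illustration of "inverted" dropout applied to one layer's activations (the layer values and drop probability are arbitrary examples, not taken from the actual systems):

```python
import random

def dropout(activations, p_drop, rng=random.Random(0)):
    """Inverted dropout: zero each unit with probability p_drop and
    rescale the survivors so the expected activation is unchanged."""
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0
            for a in activations]

# During training, a fresh random mask is drawn for every training case
# (equivalent to removing those units with all their connections);
# at inference time the layer is used unchanged (no mask, no scaling).
layer_out = [0.5, 1.2, -0.3, 0.8, 2.0]
masked = dropout(layer_out, p_drop=0.5)
```

Averaging the predictions of the differently masked networks is what gives dropout its ensemble-like behavior.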
Practically, the choice of algorithms is of crucial importance when designing a predictive modeling system, as they influence its overall performance in terms of speed as well as accuracy and robustness. A high degree of modularity and scalability is also desirable, since it is fundamental to be able to add new algorithms and tools, or to replace others, as easily as possible without major reconfiguration and training issues. As new interaction data becomes available on a regular basis, retraining the system and ensuring its subsequent functionality should not be problematic. At the same time, hardware modifications within the cloud should only improve performance and not increase the risk of system crashes. Good predictions should prevail even in the face of undesired events like hardware failures, software bugs and data corruption; from this point of view, the cloud-Hadoop-deep learning combination is ideal, mainly because resilience is what Hadoop offers most of all.
The development of intelligent systems with direct application in optimizing laser-plasma interaction experiments is a highly demanding task. Since it requires, above all, sufficient hardware resources, the underlying infrastructure supporting the construction and deployment of the predictive systems was chosen to be a private cloud. Interaction with users is achieved via the Internet; hence, by extension, one can consider this a “client–server” system, schematically displayed in Figure 1.
The “server side” offers functionality in five areas. Firstly, it ensures the communication with users and handles the request queues. Secondly, concerning data management, the “server” is responsible for data storage, data manipulation and related processing operations. Thirdly, it stores and facilitates the incorporation of new software libraries. Fourthly, it provides computing power for establishing the optimal structure of the intelligent systems, for training and for validation. Last but not least, it supports the deployment mode of the validated predictive systems; at this stage, the users introduce the input parameters and obtain the predictions and/or recommendations. Among the advantages offered by a private cloud platform built using Hadoop are rapid access to information, rapid processing and rapid transmission of results to the end user. Beyond processing and querying vast amounts of heterogeneous data over many nodes of commodity hardware, another significant advantage of the Hadoop streaming utility is that it allows Java as well as non-Java MapReduce jobs to be executed over the Hadoop cluster in a reliable, fault-tolerant manner. The combination of HDFS, HBase, Hive and MapReduce is robust: not only does HDFS ensure redundant data replication across the cluster, but every “map” and “reduce” job is independent of all other ongoing “maps” and “reduces” in the system. However, HDFS-based data lakes lack what is a fundamental capability for complex applications that make use of the stored big data, namely random reads and writes. There is no point in trying to speed up data processing by developing new algorithms if accessing the data translates into brute-force readings of an entire file system.
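The Hadoop streaming utility mentioned above accepts any executable that reads stdin and writes stdout as a “mapper” or “reducer”. The classic word-count pair below is a minimal illustration of this contract (an example sketch, not code from the actual system):

```python
from itertools import groupby

def mapper(lines):
    """Map stage: emit tab-separated (word, 1) pairs, one per line."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    """Reduce stage: sum the counts for each word.  Hadoop delivers the
    mapper output sorted by key, so equal words arrive consecutively."""
    pairs = (line.rstrip("\n").split("\t") for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Under Hadoop streaming each stage would be a separate script reading
# sys.stdin; the sort step below stands in for Hadoop's shuffle phase.
shuffled = sorted(mapper(["high order harmonics", "high intensity"]))
result = list(reducer(shuffled))
```

A pair of such scripts would typically be submitted with the streaming jar, e.g. `hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py ...` (paths illustrative).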
In this sense, HBase was deployed on top of the HDFS data lake, since it allows the fast random reads and writes that cannot be handled otherwise. As a NoSQL database, it is primarily useful because it can store data in any format. Additionally, HBase can handle a variety of information that is growing exponentially, something which relational databases cannot. In other words, it supports real-time updating and querying of the dataset, which Hive does not, and this is highly suitable for applying dropout and constructive learning on the datasets. In contrast, Hive provides structured data warehousing facilities on top of the Hadoop cluster, together with a SQL-like interface that facilitates the creation of tables and, subsequently, the storage of structured data within these tables. Although existing HBase structures can be mapped to Hive and operated on easily thanks to the efficient management of large datasets, inconveniently for certain cases the data can then be used only in batch operations. The predictive modeling systems discussed in this chapter alternatively use HBase or Hive, as suited to each combination of algorithms. For a particular interaction scenario, the relevant information is extracted from the data lake, processed for cleaning and then stored in either HBase or Hive. As these datasets are subject to MapReduce jobs and to machine learning, they may consequently suffer alterations, hence the modified versions are also written to the warehouse. Database dumps to HDFS are performed after each successful prediction experiment.
Within the cloud, the server runs Ubuntu Server 16.04 with MyEclipse 2015 Stable 2.0, Tomcat 8.5.5, JDK 8 (release 1.8.0_102), Hadoop 2.7.3, HBase 2.7.3 and Hive 2.0.0 installed. User requests are handled via JDBC (with Phoenix for HBase access), while the communication with the user is done via a servlet developed in MyEclipse. Each of the four cluster nodes consists of six PCs, connected to a switch, each having a QuadCore CPU, a hard drive (1 TB, 6 Gbps, 7200 rpm, 32 MB cache), 16 GB of RAM and a 1000 Mbps full-duplex network card. Additionally, four GeForce GTX Titan cards with 2688 CUDA cores and 6 GB of memory each were attached to the cluster, one per node, their intended purpose being to facilitate the deployment of the deep learning algorithms. GPU computing is well suited to the throughput-oriented workloads characteristic of large-scale data processing. However, integrating GPUs within a Hadoop cluster is not straightforward. While parallel data processing can easily be handled by using several GPUs together or by GPU clustering, implementing MapReduce on GPUs has significant limitations and requires considerable workarounds. For example, GPUs communicate with difficulty over a network, hence an InfiniBand connection is recommended. Moreover, GPUs cannot handle virtualization of resources. Their system architecture is therefore not entirely suitable for MapReduce without excessive modifications and, until recently, GPU and Hadoop were not even compatible. Therefore, to keep things as uncomplicated as possible, MapReduce tasks were entirely handled by the CPU nodes at all times.
After multiple machine learning experiments performed on earlier versions of this cloud [100, 101], the observed performance triggered, apart from hardware upgrades, several other tunings towards its overall optimization and in preparation for applying deep learning to the interaction datasets. These modifications address increasing the speed of raw data processing along with the speed of MapReduce tasks, decreasing the associated latencies by using fast-analytics tools and efficient workflow management, and finally the containerization of tools and applications. In the design phase of a complex big data application, special attention must be given to the way jobs are planned and executed, as this contributes to a large extent to the software's performance. For this purpose, workflow engines are a very useful tool, as they schedule jobs in the data pipelines, ensuring that they are ordered by dependencies. A workflow engine tracks each individual job and monitors the overall pipeline state. Built-in kill/suspend/restart/resume capabilities bring in considerable improvements by helping diminish the potential bottlenecks caused by failed and downstream jobs. There are quite a few workflow engines available, but for integration with Hadoop the most stable and flexible are Oozie, Azkaban, Luigi, Airflow and Kepler. Criteria for choosing between these take into account the way workflows are defined (configuration-based or code-based), the available support for various job types and its extensibility, the extent to which the state of a workflow may be tracked and, most importantly, the manner in which the engine handles failures.
For the sake of simplicity and ease of integration, Oozie 4.2.0 was incorporated, leading to a significant increase in the efficiency of all extract-transform-load (ETL) jobs as well as of the MapReduce ones. In spite of the lengthy and unwieldy XML definition of workflows (configuration-based) and of individual jobs, Oozie is the only engine with built-in Hadoop actions, therefore enjoying the best compatibility with the Hadoop environment and the highest number of supported job types. Additionally, customized job support may be further integrated via available plugins. Within Oozie, workflow jobs are directed acyclic graphs (DAGs) specifying a sequence of actions to be executed at certain time intervals, with a certain frequency and according to data availability. Recurrent and interdependent workflow jobs that form a data application pipeline are defined and executed through the Coordinator system. For a more efficient management, supplementary preventive or mitigating actions were coded in the coordinator application in order to cope with situations arising from partial, late, delayed or reprocessed data submissions. A customized Java client that connects to the Oozie server was developed in order to monitor, within the user interface, first of all the workflow DAGs together with their corresponding states and, secondly, to view and restart the failed tasks as soon as a notification to this effect is received. Since Oozie does not provide automatic notifications of failed jobs, this feature had to be implemented.
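To give a flavor of the configuration-based XML workflow definitions mentioned above, a minimal Oozie workflow with a single MapReduce action might look as follows; all names, parameters and paths here are illustrative placeholders, not the actual pipeline configuration:

```xml
<!-- Minimal illustrative Oozie workflow: one MapReduce action with
     success/failure transitions.  Names and paths are placeholders. -->
<workflow-app name="etl-demo" xmlns="uri:oozie:workflow:0.5">
    <start to="mr-stage"/>
    <action name="mr-stage">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>MapReduce stage failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Recurrence (time triggers, data-availability triggers) is not part of the workflow itself but is layered on top through a separate Coordinator definition.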
System resources are allocated to the jobs by YARN, with included optimizations in terms of efficiency and speed. YARN provides extensive support for long-running stateless batch jobs and analytical processing workloads such as machine learning algorithms. The containerization approach enhances these features even further; however, YARN does not rise to the same level of performance as Docker, and in this sense it would be helpful to be able to install and deploy another containerization technology on Hadoop in order to package applications and dependencies inside the container, to have a consistent environment for execution and, at the same time, enjoy isolation from other applications or software installed on the host. The combination of a workflow engine with containerization is attractive for several reasons. First of all, it provides increased control both in the development phase and over the big data deployments. Secondly, it significantly reduces the rate of failed or stalling jobs, and it offers uniformity and efficiency in resource allocation and resource sharing between different applications by orchestrating and organizing containers across any number of physical and virtual nodes. A container orchestrator mitigates the effects caused by failing nodes, by adding or removing nodes from the cluster, and by moving containers from one node to another to keep them available at all times. Unfortunately, associating Hadoop with container technologies other than YARN is cumbersome, as this system cannot easily delegate the clustering functions to an external tool such as a container orchestrator. For instance, the particular installed version of Hadoop together with Docker for YARN grants the YARN NodeManager the possibility to launch YARN containers into Docker containers according to the users' specification. However, this feature has certain caveats in terms of software compatibility.
Furthermore, the Docker Container Executor runs only in the non-secure mode of HDFS and YARN, and it requires the Docker daemon to be running on the NodeManagers and the Docker client to be installed and able to start Docker containers. To prevent timeouts while starting jobs, the Docker images to be used by a job should already be present on the NodeManagers. Therefore, a reasonable compromise was reached by installing the Docker Engine Utility only on the GPU nodes, without the YARN compatibility mode, with containers incorporating the deep learning libraries, including cuDNN.
Additionally, optimizations in terms of speed and latency mitigation within MapReduce tasks and raw data processing and analysis are mainly due to Apache Tez, installed and configured atop HDFS. Within a complex system such as a Hadoop cluster, latencies are common, inevitable and may have a variety of causes: storage I/O operations, network communications, architectural design imperfections or the running software. Some latency is also inherent when launching jobs. As we have seen above, these latencies can be partially diminished by efficient resource allocation combined with job scheduling. For MapReduce, the startup time is known to be one of the main sources of latency, further performance enhancements being achievable by improving the dataflow processing and its transmission from one stage to another. In this sense, the objective is to completely decouple the execution of the “mapper” from that of the “reducer” and have a direct output transmission from “mapper” to “reducer”, with all “mappers” and “reducers” working in parallel. This approach can reduce job completion latency by up to 25 percent, but unfortunately it tends to degrade fault tolerance. Basically, a global sorting is potentially time-consuming, even when using multiple “mappers”, but it should be avoided mainly because it triggers by default the deployment of only one “reducer”, which is very inefficient for large datasets.
An alternative strategy implies spilling files with intermediate results from “mapper” to “reducer” in order to preserve a certain degree of fault tolerance. Known as adaptive load moving, this technique relies on a buffer attached to the output of each “mapper”. When a buffer fills, a combining function is applied for sorting purposes and the data is “spilled” out to storage. The spilled files are then adaptively pipelined to the “reducers” according to an “avoid overloading” policy and with a view to merging the spilled files. Fault tolerance is hence improved by reducing the impact of a “mapper” failure, which would otherwise limit the reducer's ability to merge files and process the information. Adaptive load moving applied to every “mapper” and “reducer” within the Hadoop cluster is best used in conjunction with process pooling for both the master and the worker nodes, resulting in significant memory savings. Apache Tez was therefore employed to implement this strategy and to further improve other MapReduce-related issues. For example, working with Hive and MapReduce often turns into costly operations and latencies on the order of minutes at least, especially when executing a join, with “sky-high” query execution times and resource consumption. The data is often sharded and distributed across the network; thus, performing a join requires matching tuples to be moved from one machine to another, causing a lot of network I/O overhead. Tez gives Hive the possibility of running in real time, the query performance improvement being 50% on average. The major advantages of using Tez relate firstly to the adjustable number of “mappers” and “reducers” and secondly to the possibility of using the built-in cost-based query plan optimizations. Prior to executing a query, Tez determines the optimal numbers of “mappers” and “reducers” and automatically adjusts these numbers along the way, based on the number of bytes processed.
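The buffer-combine-spill scheme described above can be sketched with a toy model; the following simplified illustration (not the actual Tez or Hadoop internals) shows a mapper-side buffer that combines duplicate keys in memory and emits sorted runs once its capacity is exceeded:

```python
from collections import Counter

class SpillingMapperBuffer:
    """Toy model of a mapper-side spill buffer: key/count pairs are
    accumulated, combined in memory, and 'spilled' as a sorted run
    once the number of distinct keys reaches the buffer capacity."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = Counter()
        self.spills = []        # each spill is a sorted list of (key, count)

    def emit(self, key, count=1):
        self.buffer[key] += count         # combiner: merge duplicate keys
        if len(self.buffer) >= self.capacity:
            self.spill()

    def spill(self):
        if self.buffer:
            self.spills.append(sorted(self.buffer.items()))
            self.buffer.clear()

    def merged_output(self):
        """Merge all spilled runs, as a reducer-side merge would."""
        self.spill()                      # flush whatever is still buffered
        total = Counter()
        for run in self.spills:
            total.update(dict(run))
        return sorted(total.items())

buf = SpillingMapperBuffer(capacity=2)
for key in ["a", "b", "a", "c"]:
    buf.emit(key)
```

In the real pipeline, each sorted run would be written to local storage and shipped to the “reducers” under the overload-avoidance policy, rather than merged in memory as here.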
Using the “Compute statistics” statement, the number of “mappers” and “reducers” can be monitored along with their speed in completing the corresponding tasks. Hence, should a bottleneck appear, its point of origin can be easily identified.
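In HiveQL, the statistics in question are gathered with the `ANALYZE TABLE … COMPUTE STATISTICS` statement; the table name below is hypothetical.

```sql
-- Hypothetical table name; COMPUTE STATISTICS is standard HiveQL syntax
ANALYZE TABLE hhg_interactions COMPUTE STATISTICS;
ANALYZE TABLE hhg_interactions COMPUTE STATISTICS FOR COLUMNS;
```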
The high volumes of data employed here lead to high query execution times. Tez implements query planning by building multiple plans and choosing the best of the computed versions. Query plan optimization proceeds in steps, starting from containerization and multi-tenancy provisioning, continuing with vectorization and ultimately with cost-based planning, evaluation of plans and selection of the optimal one. Multi-tenancy permits the re-use of a container within a query by releasing all containers idling for more than 1 second. Vectorized query execution performs operations like scans, aggregations, filtering and joins in batches of 1024 rows at a time instead of row by row. Finally, cost-based optimization of query execution plans significantly improves running times and resource consumption by evaluating the overall cost of every query as derived from its associated plan. The evaluation reveals the viable types of operations, computes the cost of each combination and determines the extent to which an increased degree of parallelism speeds up execution while lowering the amount of commissioned resources, reusing them as much as possible. Within a query, a MapReduce stage is followed by other stages; Tez checks the dependencies between them and dispatches the independent ones to be executed in parallel. Another optimization decision concerns performing map joins instead of shuffle joins: map joins minimize data movement and benefit from localized execution, because the hash map on every node is integrated into a global in-memory table and solely this table is streamed, hence joins are made faster. A compromise has to be made, though, by provisioning larger Tez containers (much larger than the YARN ones) and by allocating one CPU and some GBs of memory per container.
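The features enumerated above map onto a handful of Hive session settings; the selection below is a representative, not exhaustive, sketch (the container size value is illustrative).

```sql
SET hive.execution.engine=tez;              -- run Hive queries on Tez
SET hive.vectorized.execution.enabled=true; -- process rows in batches of 1024
SET hive.auto.convert.join=true;            -- prefer map joins over shuffle joins
SET hive.cbo.enable=true;                   -- cost-based query plan optimization
SET hive.tez.container.size=4096;           -- larger Tez containers, in MB
```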
The performance of Hive queries can also be improved by enabling compression at the various stages, from table creation to intermediate data and final output. For these purposes, the tables were converted to the ORC file format, which yields 78% compression compared to the initial text files. As a result, a search through 1 TB of data now incurs only about 5 seconds of latency.
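A text-to-ORC conversion of this kind can be expressed as a create-table-as-select; the table names and the choice of ZLIB compression below are hypothetical.

```sql
-- Hypothetical table names; ORC storage with compression enabled
CREATE TABLE hhg_interactions_orc
  STORED AS ORC
  TBLPROPERTIES ("orc.compress"="ZLIB")
AS SELECT * FROM hhg_interactions_text;
```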
Finally, to a reasonable extent, data intensive workloads also benefit from in-memory processing. Tez allows speculative executions to be attempted on faster nodes according to the Longest Approximate Time to End (LATE) strategy. These approaches were found to result in an overall speed improvement of between one and one and a half orders of magnitude. In the case of iterative jobs, such as cost-based function optimizations, latency reductions of up to 20 times were obtained.
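The core of the LATE heuristic is simple enough to sketch (function name and data layout are illustrative): estimate each running task's remaining time from its observed progress rate and pick the task expected to finish last as the candidate for speculative re-execution on a fast node.

```python
def late_candidate(tasks, now):
    """Longest Approximate Time to End: estimate each running task's
    remaining time from its progress rate and pick the one expected
    to finish last as the candidate for speculative re-execution.
    `tasks` maps task id -> (progress in [0, 1], start_time)."""
    def time_to_end(progress, start):
        rate = progress / (now - start)      # progress per second so far
        return (1.0 - progress) / rate       # remaining work / rate
    return max(tasks, key=lambda t: time_to_end(*tasks[t]))

# t1 is 90% done, t2 only 20% done after the same elapsed time,
# so t2 has by far the longest approximate time to end.
tasks = {"t1": (0.9, 0.0), "t2": (0.2, 0.0)}
slowest = late_candidate(tasks, now=10.0)
```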
This subsection has so far discussed only the underlying infrastructure used for building the predictive systems for the optimization of laser-plasma interaction experiments, focusing not so much on the hardware as on the tools and techniques deployed to make big data processing run faster and on fewer resources. However, some attention must also be given to the conceptual design of the predictive systems, which is displayed in Figure 2.
The particular HHG experiments that were envisaged refer to the interaction of ultrashort and intense laser pulses with overdense plasmas (plasmas with density higher than the critical density). At the most basic level, this mechanism can be understood as the reflection of the incident laser and of its subsequently created harmonics on the oscillating plasma surface (the oscillating mirror model, OMM). Since the plasma density is higher than the critical one, the laser cannot penetrate the plasma and thus reflects off its surface. This surface is not flat and exhibits an oscillatory movement due to the laser-induced heating mechanisms. While it is true that the yielded spectrum depends strongly on the initial conditions—laser intensity, pulse duration, incidence angle, plasma density—the key factor is in fact the optimization of resonance absorption, as this fundamental process may account for up to 30% of the laser energy being absorbed by the plasma. In practice, the incident electromagnetic wave excites a plasma electron wave of the same frequency, and the second harmonic results from the mixing between the plasma electron wave and the electromagnetic laser pump, its frequency therefore being double that of the incident wave. Although the second harmonic is mainly reflected, part of it can propagate inside the plasma and excite a wave of the same frequency that, in turn, by mixing with the incident laser pump, yields the third order harmonic. Moreover, it was also demonstrated that there is a correlation between the nonlinear, ponderomotively driven plasma surface motion and the production of energetic electrons [111, 112]. A pronounced asymmetry of longitudinal oscillations in a steep density profile is known to lead to wave breaking, which in turn causes fractions of electrons to be irreversibly accelerated into the target. This kinetic process results in further absorption of energy from the laser.
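As a reminder of the overdense criterion invoked above (standard plasma physics, not specific to this chapter's data), the plasma is opaque to the laser when the electron plasma frequency exceeds the laser frequency, which defines the critical density:

```latex
% Overdense condition: n_e > n_c, i.e. \omega_{pe} > \omega_0
\omega_{pe} = \sqrt{\frac{n_e e^2}{\varepsilon_0 m_e}}, \qquad
n_c = \frac{\varepsilon_0 m_e \omega_0^2}{e^2}
\approx 1.1 \times 10^{21}\,\lambda_{\mu\mathrm{m}}^{-2}\ \mathrm{cm^{-3}}
```

For a typical near-infrared laser wavelength around 1 µm, any plasma above roughly 10^21 cm^-3 is therefore reflective, which is the regime assumed by the oscillating mirror picture.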
Furthermore, the accelerated fast electrons can themselves drive Langmuir waves, in the overdense region as well as in the ramps that form in front of the target, eventually leading to the generation of harmonics. This mechanism, namely coherent wake emission (CWE), is mainly responsible for HHG at moderate intensities. A further increase in laser intensity improves the prospects for efficient surface high order harmonic generation and, in principle, with relativistic lasers, harmonic intensities may even exceed the intensity of the focused pulse by several orders of magnitude.
The goal of developing and deploying predictive modeling for HHG experiments was to estimate the maximum order of the highest observable harmonic, along with the intensity, duration and wavelength of the various high harmonics and their conversion efficiency, given a particular laser interacting with a particular kind of plasma. The available data set consisted mainly of simulation data obtained by running various PIC codes, but also of experimental data collected from the published scientific literature. Initially the data set amounted to 2 TB but over time it grew to about 5 TB, so the latest deep learning predictions took full advantage of the entire 5 TB.
The first attempts at predictive modeling for high order harmonic generation experiments [100, 101] involved, on the one hand, commodity hardware with lower performance than the cloud currently used, without any GPUs, and, on the other, an earlier version of Hadoop, installed and configured without any of the optimizations introduced in the meantime. This combination implied, first of all, long running times (up to several hours) just for MapReduce, and further ones for the machine learning algorithms implemented with Mahout. Each additional TB of data was yet another challenge for the system and its available resources. Supervised learning was an obvious choice; consequently the most popular of the universal function approximators, the MLP, was chosen as a starting point due to its versatility. Using the well-known backpropagation algorithm (BKP) [115, 116] for error minimization during training, the MLP solves problems stochastically, being able to provide approximate solutions even for extremely complex tasks. The high degree of connectivity between the nodes and the strong nonlinearity of this neural network make its generalization ability among the best, coping rather well even with noisy and missing data. However, this comes at the expense of significant running times in the training phase. While increasing the number of hidden layers is likely to improve overall performance, potentially revealing key features embedded in the data, adding too many of them was beyond the old system’s capabilities, so bottlenecks were reached very quickly.
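The MLP-with-backpropagation mechanics can be condensed into a few lines of numpy (a minimal sketch on toy XOR data; the 2-8-1 topology, learning rate and data here are illustrative and have nothing to do with the chapter's actual networks or interaction data):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 2-8-1 sigmoidal MLP trained with plain backpropagation on XOR.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                 # forward pass, hidden layer
    out = sigmoid(h @ W2 + b2)               # forward pass, output layer
    d_out = (out - y) * out * (1 - out)      # dMSE/dz at the output
    d_h = (d_out @ W2.T) * h * (1 - h)       # error backpropagated to hidden
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

mse = float(np.mean((out - y) ** 2))
```

The backward pass is the entire cost of each epoch; with the tens of layers used later in the chapter, this per-epoch cost is what made GPU support necessary.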
The training set’s input values are the laser intensity, laser wavelength, pulse duration, polarization, incidence angle, the type of plasma (introduced as ionization degree and elemental Z number) and its initial density. The desired output values in the training set are the maximum order of the highest observable harmonic, intensity values for the different harmonics (including the highest one), and the harmonics’ wavelengths, durations and conversion efficiencies. About 85% of the entire data formed the training set while the rest served as a test set, and these percentages were maintained throughout, up to the latest deep learning implementations. Multiple MLP topologies were tested, with different types and numbers of neurons, different numbers of hidden layers, and batch or incremental training with various optimization algorithms. The number of neurons in the input layer depends mainly on the number of parameters that define a laser-plasma interaction scenario. The number of neurons in the output layer is generally a function of the yields that need to be classified or predicted. The number of hidden layers and the number of neurons within a layer were determined empirically. Three of the investigated MLPs—henceforth labeled MLP1, MLP2 and MLP3, respectively—were found to exhibit satisfactory behavior in terms of accuracy. However, the running hours were discouraging, especially since, according to the results, it was obvious that an upgrade towards more hidden layers and more neural units would soon be necessary. MLP1 has an input layer consisting of 8 Adaline neurons, two hidden layers, each with 12 sigmoidal neurons, and an output layer of 5 sigmoidal units. It was trained with batch training, while the cost function was defined in terms of mean squared error (MSE) and optimized with Steepest Descent. MLP2 has three hidden layers, each with 10 sigmoidal neurons.
The second difference from MLP1 is that its cost function was optimized with resilient backpropagation. Finally, MLP3 has two hidden layers, each with 11 sigmoidal units, and it deploys the Levenberg-Marquardt algorithm for finding the global minimum of the cost function. For two HHG scenarios, Table 1 displays the prediction results obtained with each of the three MLPs. Within the first scenario, the laser’s parameters are as follows: , , polarization p, pulse duration , incidence angle , interacting with an aluminum overdense plasma of electronic density equal to . For the second scenario, the laser parameters are: , , polarization p, pulse duration , incidence angle with the plasma surface , while the aluminum plasma has a density of . The obtained predictions were in good agreement with PIC simulations as well as with the literature data. However, it is easy to notice that the predicted intensities of the highest observable harmonic are lower than both theory and PIC results. This is caused by several factors, one of them being the heterogeneity of the available interaction data and the fact that the sets were only minimally cleaned during the “machine learning stages”. As the collected information originates from multiple sources, the errors affecting the recorded values have different distribution functions. Furthermore, for a particular interaction scenario, we may have several experimentally determined values for the intensity of the highest observable harmonic and several numerical results. This constitutes redundant data, its principal negative effect being overfitting. For the MLP based predictive modeling, all the redundant data was kept as it was, without any merging or advanced filtering. Overfitting is known to produce unrealistic predictions in MLPs even with noise-free data, let alone with redundancy or sparsity. On the other hand, for certain scenarios, there was no available reference.
Hence, the problem of missing information was solved by running a modified version of LPIC++ and recording the corresponding yields. In spite of having applied sampling and some filtering in order to assemble balanced training sets, a certain degree of incipient overfitting was detected in the case of MLP1 and MLP2, so some relative underestimation or overestimation was to be expected.
|Highest observable harmonic|
Another aspect to be noted is that all the MLPs discussed in this chapter feature hidden and output layers of sigmoidal units, and this is the most important factor responsible for underestimation. The sigmoid activation function has a non-zero mean and is prone to causing non-zero values in the Hessian matrix of the objective function, hence modifying the global minimum of the latter. A high number of sigmoidal neurons in a network strongly influences the weight adjustment during training; specifically, the corresponding weights in the last layers tend to take very small values (close to zero) and this saturation can last a very long time. To a good extent, the effect was mitigated by using a random initialization of weights, not only at the very beginning but also during the training process. Specifically, after observing persistent saturation for a number of epochs, I performed adjustments by adding small random values to the stagnating weights. This was found to improve the MLP’s estimations on the one hand and to increase the predicted values on the other. Perhaps this was also one of the causes of the overestimation of certain parameters. A slightly better and more stable behavior was observed in the case of MLP3, which required far fewer additions of random values to the weights. Compared with the other two, the errors during training were smaller, the convergence faster and the predicted values for the high order harmonics were, in general, closer to the literature data, owing to the Levenberg–Marquardt algorithm, which is known to improve the overall convergence speed by combining Newton’s method with Steepest Descent.
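The anti-saturation heuristic of adding small random values to stagnating weights can be sketched as follows (function name, thresholds and perturbation scale are illustrative assumptions, not the chapter's exact procedure): weights whose gradients have become negligibly small are jittered so that training can resume.

```python
import numpy as np

def unstick_saturated(weights, grads, epsilon=1e-3, tol=1e-6, rng=None):
    """Where a weight's gradient is negligibly small (a stagnating,
    saturated unit), add a small random perturbation so training can
    resume; weights that are still learning are left untouched."""
    rng = rng or np.random.default_rng()
    stuck = np.abs(grads) < tol               # boolean mask of stagnant weights
    jitter = rng.uniform(-epsilon, epsilon, size=weights.shape)
    return weights + stuck * jitter           # perturb only the stuck entries

old = np.zeros(4)
grads = np.array([0.0, 1.0, 0.0, 1.0])        # entries 0 and 2 are stagnant
new = unstick_saturated(old, grads, rng=np.random.default_rng(1))
```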
As stated above, in the course of the interaction, the laser heats the plasma through various mechanisms. Inherently, some of the electrons acquire a lot of energy and become “hot”, having temperatures much higher than the plasma temperature. The percentage of hot electrons is very low but, in spite of this, their effects are not always negligible and, for certain experiments, even damaging. For an HHG experiment, a high percentage of hot electrons can disturb the oscillations of the plasma surface, a situation that affects the reflection of the laser, the CWE mechanism and consequently the HHG. For instance, a strong Brunel effect leads to more thermal electrons. Consequently, it is important to have an accurate estimation of electron temperatures within the plasma along with the corresponding fractions of particles. For this purpose, another MLP (MLP4) was designed, since the previous three gave only modest evaluations. The input values in the training set incorporate, apart from the previously stated ones, the plasma’s initial electronic temperature. The desired output values are electron temperatures accompanied by the estimated percentages of electrons that have these temperatures and the corresponding time moments. The best performing topology was found to be an MLP with 9 Adaline neurons in the input layer, 2 hidden layers, each with 11 sigmoidal units, and an output layer with 3 neurons, also sigmoidal. The training was performed incrementally, with the cost function defined in terms of MSE and optimized with Levenberg-Marquardt. For the same interaction conditions discussed above, plus two additional cases (for Scenario 1, the incidence angle was modified to from the normal to the plasma surface, this constituting Scenario 3, while for the same parameters in Scenario 2, the incidence angle was changed to , this being labeled Scenario 4), prediction results are shown in the four graphs below.
Figures 3 and 4 display, comparatively, the percentage of electrons estimated to have a temperature above 10 keV at different time moments and above 100 keV, respectively. Figure 3 refers to Scenarios 1 and 3, while Figure 4 concerns the second and the fourth. The procedures of random initialization and adjustment (during training) of weights were also applied in an attempt to improve MLP4’s performance. However, it is the belief of this author that the combination of the network’s topology, the sampling of the available interaction data, the random additions and the incremental training led to some significant overestimations of the percentages of electrons (by some 10%) in certain cases, as the values reported in the literature are smaller.
Prior to migrating towards deep learning, some trials were made with an unsupervised network, namely a SOM. The same training sets were used, except that the data was organized differently: one entry in the training set consists of a matrix. The matrix’s columns stand for: plasma (ionization degree, initial electronic temperature, initial plasma density, final plasma density, maximum plasma density), laser (intensity, wavelength, pulse duration, polarization, incidence angle) and 8 columns characterizing 8 different high order harmonics including the highest one (order, intensity, wavelength, duration, conversion efficiency). Several topologies were tested; however, just one of them yielded satisfactory results, namely a 2D network. The neurons’ positions in the map were optimized based on Euclidean distance minimization and the competitive learning principle [117, 118]. SOM1 has a total of 336 nodes (16 × 21), disposed on a regular rectangular grid, with 16 nodes for mapping the harmonics’ intensity and 21 for the orders of the harmonics. While a color code was employed for the duration of pulses, the wavelengths and conversion efficiencies were derived computationally and written in an additional text file accompanying the map. The large number of nodes in this network is the consequence of the need for a better visualization of the final results. However, this weighs considerably in terms of the number of training epochs and computation time, and it was found that the SOM required far more resources than the MLPs and took longer to train. In principle, it would be ideal to add more units and some algorithmic improvements, along with the elimination of the accompanying text file and the associated computationally derived values. This basically means a SOM with more than two dimensions which, at the time, was nearly impossible to implement. Hence, I desisted from pursuing the development of predictive modeling using unsupervised learning.
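One competitive-learning step of a 2-D SOM can be sketched as follows (a generic textbook-style update, not SOM1's exact implementation; grid size, learning rate and neighborhood width are illustrative): the best matching unit is found by Euclidean distance minimization, and it and its grid neighbors are pulled toward the input.

```python
import numpy as np

def som_step(weights, x, lr=0.5, sigma=1.0):
    """One competitive-learning step for a 2-D SOM: find the best
    matching unit (minimum Euclidean distance to the input) and pull
    it and its grid neighbors toward the input. `weights` has shape
    (rows, cols, features) and is updated in place."""
    rows, cols, _ = weights.shape
    dists = np.linalg.norm(weights - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), (rows, cols))
    for i in range(rows):
        for j in range(cols):
            grid_d2 = (i - bmu[0]) ** 2 + (j - bmu[1]) ** 2
            h = np.exp(-grid_d2 / (2 * sigma ** 2))   # neighborhood kernel
            weights[i, j] += lr * h * (x - weights[i, j])
    return bmu

grid = np.zeros((3, 3, 2))
grid[1, 1] = [1.0, 1.0]                    # node (1, 1) already matches the input
bmu = som_step(grid, np.array([1.0, 1.0]))
```

Repeating this step over many epochs for every training entry, while shrinking `lr` and `sigma`, is what makes large maps such as SOM1 so expensive to train.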
By way of example, the MLP performances in predicting high order harmonics and their features—for the interaction conditions described in Scenario 1—are displayed in Table 2, together with the SOM’s predictions and the results obtained from PIC simulations. The agreement between the forecasts of the MLPs and those of the SOM is quite good, the values being within the same range.
|Harmonic order|Harmonic’s characteristics|PIC (calculated)|MLP1|MLP2|MLP3|SOM1|
With a view to building better predictive systems, and even recommender systems for optimized laser-plasma interaction experiments, hardware upgrades were made first. Apart from adding an extra cluster node, replacing the storage hard drives with higher capacity ones in all computers and adding an extra 8 GB of RAM to each of them, a total of four GeForce GTX Titan cards were attached to the cluster, one per node. At the most basic level, deep learning networks can be viewed as modified MLPs that contain a high number of units and layers and are algorithmically more complex than the classical MLPs; hence, the GPUs provide support for the heavy computations. The Docker engine was installed on the GPU nodes along with the necessary Nvidia drivers and nvidia-docker. A Docker image containing Theano, TensorFlow, Keras, Caffe, cuDNN and, of course, CUDA 8.0 and Ubuntu 14.04 was downloaded from GitHub, built and deployed as a container on the GPUs. All the deep learning based predictive modeling systems described in this chapter were discovered (structurally), trained, built and tested using these libraries. The optimal ones were implemented and deployed on the Hadoop cluster. The containerization of GPU applications provides important benefits such as reproducible builds, ease of deployment and isolation of individual devices running across heterogeneous driver/toolkit environments, requiring only the Nvidia drivers to be installed. The images are agnostic of the Nvidia driver, with the required character devices and driver files being mounted when the container is started on the target machine.
The deep learning based predictive modeling systems were initially aimed at the same HHG experiments. However, the data lake increasingly incorporates other related interaction data. It is expected that more available information on what happens during various experiments performed in similar conditions will help to better understand the physics of the interaction and, consequently, to foresee what phenomena might occur. The huge data sets needed for training, after having been subject to MapReduce, have to be transferred to the GPU nodes. While the GPU memory system provides a higher bandwidth than the CPU memory system, transferring data between main memory and GPU memory is very slow. Copying via DMA to and from the GPU over the PCIe bus involves expensive context switches that reduce the available bandwidth considerably. This is why directives such as “gmp shared” and “gmp private” have been added for identifying the data to be transferred between main memory and GPU memory. These directives are translated into the relevant memory transfer calls within CUDA, such as cudaMalloc, cudaMemcpy and cudaFree. Furthermore, potentially redundant data transfers may slow down the GPU while running other jobs; these can be avoided through various dataflow and job workflow optimization techniques. For this reason, it was highly important to have the workflow engine and resource allocator configured and running on Hadoop. Additionally, the optimizations brought to MapReduce impact directly on the dataflow to the GPUs.
The first deep learning networks to be implemented were the DNNs. Since, basically, DNNs are MLPs with many hidden layers—commonly, a few tens—it was a relatively easy transition from machine learning to deep learning. In spite of this, things tend to get complicated when trying to guess an optimal DNN configuration, which is a very tedious process. The solution comes from adopting a grid search algorithm combined with two others, namely constructive learning and dropout. This way, I was able to generate several hundred DNNs using the constructive learning and dropout algorithms during the training phase and to search for the optimal ones with grid search. Each of the tested configurations was cataloged and the best performing ones were prioritized for further usage. Both constructive learning and dropout can be performed in three ways, all of which have been tested. The first involves adding more neurons to layers along with their corresponding connections to the others in the network (constructive learning) or simply removing some (dropout) if performance is found to stagnate at an unsatisfactory level during the training phase. The training is then continued and the evolution monitored. These actions of adding and removing units may be performed several times during a training procedure. The second approach involves keeping the same network configuration while applying the algorithms to the data set instead of the layers: instead of adding or removing units, one adds more data to, or removes portions of data from, the training set. Last but not least, the third method is a combination of these, namely the construction and dropout procedures are applied to both the network and the data. Although this is the most costly strategy, both in terms of resources and running times, it was by far the most effective one, yielding the best performance. This latter approach was also the one chosen for building the DNN based predictive systems.
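The catalog-and-prioritize workflow reduces to scoring every candidate configuration and keeping a ranked list; a minimal sketch (the search space, the dictionary keys and the stand-in error function are all hypothetical, and `evaluate` stands in for the expensive train-and-validate step):

```python
import itertools

def grid_search(configs, evaluate):
    """Exhaustively score every candidate configuration and return a
    ranked catalog; lower error is better."""
    scored = [(evaluate(c), c) for c in configs]
    scored.sort(key=lambda pair: pair[0])
    return scored

# Hypothetical search space: depth, width and dropout rate.
space = itertools.product([20, 36], [12, 14], [0.0, 0.2])
configs = [{"layers": d, "units": u, "dropout": p} for d, u, p in space]
# Stand-in error function, for illustration only.
ranked = grid_search(configs, lambda c: abs(c["layers"] - 30) + c["dropout"])
best = ranked[0][1]
```

In the real search the configurations themselves were not fixed in advance: constructive learning and dropout mutated them during training, with grid search selecting among the cataloged results.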
Out of the huge pool of networks (nearly 500), two deep neural networks were found to perform better than all the others; they will henceforth be labeled DNN1 and DNN2, respectively. DNN1 has an input layer consisting of 8 Adaline units and 20 hidden layers containing only sigmoidal neurons. All hidden layers have 12 units, except for layers 3, 5, 6, 8 and 11. Layer 3 has 11 units, layer 5 has 15, layers 6 and 8 contain 12 each while layer 11 has just 7. The output layer features 5 sigmoidal neurons. DNN1 was trained with batch training and its cost function was optimized with Levenberg–Marquardt. DNN2 has an input layer consisting of 8 Adaline units and 36 hidden layers containing only sigmoidal units. All hidden layers have 14 neurons, except for layers 2, 6, 7, 9, 12, 16, 18, 23, 24, 25, 28, 30, 31, 32 and 35. Layer 2 has 15 units, layers 6, 9, 16, 25, 28 and 32 have 12, layers 7, 18 and 31 contain 13 each, layer 12 has 16, layer 23 has 15, layer 24 has 11, layer 30 contains 9 units while layer 35 has only 7. The output layer features 5 sigmoidal neurons. Training was also performed in batches and the cost function was optimized with Levenberg–Marquardt. For HHG Scenarios 1 and 2 discussed in the previous subsection, Table 1 also includes the predictions obtained with DNN1 and DNN2. The subsequent lines refer to predictions made with DNNs combined with ensemble learning, labeled EL1 and EL2, respectively. EL1 was obtained by applying ensemble learning on the best 50 configurations of all tested DNNs, and EL2 over all configurations. This means that the predictions offered either by the best 50 DNNs or by all of them were averaged arithmetically and the result used as the prediction value. Although it might not seem appropriate to use averaging, this algorithm has its foundations in statistics and is expected to offer better performance than a plain DNN.
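The EL1/EL2 averaging scheme can be sketched in a few lines (the function name and the (error, predictor) pairing are illustrative; the numerical values below are toys, not chapter results):

```python
import numpy as np

def ensemble_predict(models, x, top_k=None):
    """Arithmetic-mean ensemble over DNN predictions: `models` is a
    list of (validation_error, predict_fn) pairs; when `top_k` is
    given, only the best-ranked models contribute (the EL1 case),
    otherwise all of them do (the EL2 case)."""
    ranked = sorted(models, key=lambda m: m[0])
    chosen = ranked[:top_k] if top_k else ranked
    preds = np.array([predict(x) for _, predict in chosen])
    return preds.mean(axis=0)

# Toy predictors: the third (worst-ranked) one is a wild outlier.
models = [(0.1, lambda x: np.array([1.0])),
          (0.2, lambda x: np.array([3.0])),
          (0.9, lambda x: np.array([100.0]))]
avg_top2 = ensemble_predict(models, None, top_k=2)
```

Restricting the ensemble to the best-ranked configurations, as in EL1, keeps poorly trained networks from dragging the average off.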
Using ensemble learning also mitigates the underestimation problem caused by the sigmoidal neurons, although this problem tends to be less pronounced in the case of deep neural networks due to their increased numbers of layers and units; consequently, the effect on the cost function optimization is not as strong. As a general conclusion, the predictions furnished by the DNNs and by the DNNs combined with ensemble learning are much closer to the ones reported in the scientific literature than the values offered by the MLPs.
For Scenarios 2 and 4 presented in Section 3.1, the temperatures of the electrons within the plasma along with the corresponding percentages were predicted using DNN3 and EL3. Figure 5a displays the evolution of the electrons having temperatures above 10 keV, in terms of percentages, for Scenario 2 while Figure 5b refers to the same evolution but for conditions consistent with Scenario 4. Figure 6a, b present the variation of electron percentages for electrons having temperatures higher than 100 keV for Scenarios 2 (Figure 6a) and 4 (Figure 6b), respectively.
In each of the graphs, four curves can be noticed: the two curves corresponding to DNN3 and EL3 are accompanied by the predictions of MLP4 presented in the previous subsection and by the results of PIC simulations. DNN3 has an input layer consisting of 9 Adaline units and 43 hidden layers containing only sigmoidal neurons. All hidden layers have 15 neurons, except for layers 4, 6, 9, 13, 15, 19, 21, 27, 34, 35, 38, 40 and 41. Layer 4 has 16 units, layers 6, 9, 19, 34, 35 and 40 have 12, layers 13, 15 and 27 contain 11 each, layer 21 has 17, layer 38 has 14, and finally, layer 41 has 11. The output layer features 7 sigmoidal neurons. The training was also performed in batches and the cost function was optimized with Levenberg–Marquardt. EL3 was obtained by arithmetically averaging the predictions of the best 100 DNN configurations out of the 478 that were tested. Examining the curves, several conclusions can be drawn. Firstly, the DNN and the EL curves are very close, nearly superimposed. Secondly, the values predicted by DNN3 and EL3 are closer to the ones obtained from PIC simulations and further from the predictions of the MLP. To the extent to which the PIC calculations are closer to real measurements, it can be confirmed that the DNN and EL predictions are better than the MLP ones.
Since the obtained results were encouraging, further trials were performed in the deep learning area, namely the deep neural networks were replaced with convolutional ones. CNNs are best known for their suitability for visual recognition applications, so, in a way, CNN architectures make the explicit assumption that the inputs are images. This is not an impediment here, as—prior to being fed to a CNN—the values in the training and test data sets can be reorganized into an input volume formed from the laser parameters, plasma characteristics and yielded high order harmonics’ characteristics, just as images are normally structured. Consequently, I found a convenient way to organize the interaction information for supervised training by making each entry in the training set a 20 × 20 × 20 volume, in conjunction with a look-up table (LUT) technique. The first dimension of each cube contains a reference into a LUT regarding the information on the incident laser’s parameters, the second one includes references to the plasma characteristics (including electron and ion temperatures) while the last dimension holds the references to the high order harmonics spectra and to the hot electrons’ temperatures and percentages. The very nature of the CNN facilitates the incorporation of more features within the training and test sets. What distinguishes CNNs from DNNs is the fact that all of their layers have neurons arranged in three dimensions: width, height and depth. A second major difference concerns the connectivity. Within a DNN, all units are connected to all other neurons in the previous as well as in the next layer. As the number of layers rises, the number of connections grows very rapidly, with a dramatic impact on computational resources. The CNNs bring a major change: the neurons in a layer are only connected to a small region of the layer before it.
The output layer is the smallest in dimensions as, inherently, by the end of the network, the full input is reduced to a single vector of class scores arranged along the depth dimension. Three main types of layers exist within the architecture: the convolutional layer, the pooling layer and the fully connected layer, and these are stacked together to form a CNN. The input is fed first to one or more subsequent convolutional layers. This layer is the core building block of the network and it performs all the heavy computations. More specifically, it calculates the output of neurons that are connected to local regions in the input, each of the neurons computing a dot product between its weights and the small region it is connected to in the input volume. The convolutional layer has as parameters a set of learnable filters, defined by the user. Every filter is small spatially (along the width and height dimensions) but extends through the full depth of the input volume (which, in this particular case, is the high order harmonics spectra). Moreover, each of the filters looks for a different thing in the input. During the forward pass, each filter is slid (convolved) across the width and height of the input volume and dot products between the entries of the filter and the input at each position are calculated. As the filter is slid, a bi-dimensional activation map is produced that gives the responses of that filter at every spatial position. These activation maps are stacked along the depth dimension and produce the output volume, which is next fed either to a pooling layer or to a second convolutional layer. Intuitively, the network will learn filters that activate when they see some type of feature, such as an increased number of high order harmonics or very intense ones on the first layer, or, eventually, an entire rich spectrum on the higher layers of the network.
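The sliding-filter computation producing one activation map can be written out explicitly (a minimal numpy sketch with stride 1 and no padding; function name and shapes are illustrative, and real CNN libraries implement this far more efficiently):

```python
import numpy as np

def activation_map(volume, filt):
    """Slide one small filter across the width and height of an input
    volume (H x W x D) and record the dot product at every position,
    producing a 2-D activation map (stride 1, no padding)."""
    h, w, d = volume.shape
    fh, fw, fd = filt.shape
    assert fd == d, "the filter extends through the full input depth"
    out = np.zeros((h - fh + 1, w - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(volume[i:i+fh, j:j+fw, :] * filt)
    return out

vol = np.ones((4, 4, 3))       # toy input volume
filt = np.ones((2, 2, 3))      # one small filter spanning the full depth
amap = activation_map(vol, filt)
```

Stacking one such map per filter along the depth dimension yields the layer's output volume.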
The pooling layers perform a downsampling operation along the spatial dimensions (width, height), resulting in smaller volumes. Most commonly, they are periodically inserted in between successive convolutional layers, as they progressively reduce the spatial size of the representation in order to lower the number of parameters and ease the computational load of the network. More importantly, pooling layers mitigate overfitting. The pooling layer operates independently on every input slice, most of the time by using the "max" operation. In addition to max pooling, average pooling or L2-norm pooling may be encountered. Historically, average pooling used to be the most popular, but it has progressively been replaced by max pooling, as the latter was demonstrated to work better in practice. The fully connected layer computes the class scores and packs them into a vector, each class score representing a high-order harmonic with particular features. This is the only layer within which neurons are connected just as in a DNN; their activations can hence be computed with a matrix multiplication followed by a bias offset. Both the fully connected layer and the convolutional layer compute dot products of weights and inputs, but the neurons in the convolutional layer are connected only to a local region in the input, and many of them share parameters in order to save computational resources.
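A minimal NumPy sketch of the max-pooling downsampling described above, using an assumed 2 × 2 window with stride 2, applied independently to each depth slice:

```python
import numpy as np

def max_pool(act_maps, size=2, stride=2):
    """Max pooling applied independently to each depth slice.

    act_maps has shape (H, W, D); each 'size x size' window is reduced
    to its maximum, halving the spatial dimensions for size=stride=2.
    """
    H, W, D = act_maps.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.zeros((out_h, out_w, D))
    for d in range(D):
        for y in range(out_h):
            for x in range(out_w):
                out[y, x, d] = act_maps[y*stride:y*stride+size,
                                        x*stride:x*stride+size, d].max()
    return out

maps = np.arange(16.0).reshape(4, 4, 1)  # one toy 4x4 activation map
pooled = max_pool(maps)
# -> the maxima of the four 2x2 blocks: 5, 7, 13, 15
```

Average pooling would replace `.max()` with `.mean()`; stochastic pooling samples one activation per window with probability proportional to its value.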
As with the previous case of the DNNs, about 600 different CNNs were generated and searched through with the aid of the grid search algorithm. To generate the configurations, several operations were applied. Firstly, the number of convolutional and pooling layers was varied, as well as their positions. For example, I constructed networks containing a pooling layer after each convolutional layer, or a pooling layer after every two or three convolutional layers. In some network versions, pooling layers were absent except for a single one just before the fully connected layer. Secondly, within each convolutional layer, the number of filters was modified in order to observe what happens if the layer is sensitive to more features, or to features that are not relevant for all the types of HHG experiments. Thirdly, several pooling methods were tested for the pooling layers in each network, namely the classical max pooling, average pooling and stochastic pooling. Last but not least, the dropout and constructive learning algorithms were applied to the fully connected layer, resulting in more CNN configurations. For efficiency purposes, regularization methods such as L2 and elastic net regularization were applied to all the convolutional layers and to the fully connected layer when some of the weights were observed to peak excessively. The objective was to force the layers of the CNN to make use of all of their inputs at the same rate (as much as possible) rather than to use portions of their inputs preferentially. The risk, however, is ending up with a network layer whose neuron weights are "diffuse" and rather small. Elastic net regularization, a combination of the L1 and L2 types, proved to be more efficient than either of the two.
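The enumeration of candidate configurations for the grid search, and the elastic net penalty, can be sketched as below; the specific option values and penalty coefficients are assumptions for illustration and do not reproduce the author's exact ~600-configuration grid:

```python
import itertools

# Illustrative configuration grid; the actual search varied these and
# other options to reach roughly 600 candidate networks.
conv_depths   = [3, 4, 5]                                    # conv layers
pool_policies = ["after_each", "after_two", "single_final"]  # pool placement
filter_counts = [64, 128, 256]                               # filters/layer
pool_methods  = ["max", "average", "stochastic"]
fc_tricks     = [None, "dropout", "constructive", "both"]    # FC-layer tweaks

grid = list(itertools.product(conv_depths, pool_policies,
                              filter_counts, pool_methods, fc_tricks))
print(len(grid))  # 324

def elastic_net_penalty(weights, l1=1e-4, l2=1e-4):
    """Elastic net = weighted sum of the L1 and L2 penalties on the weights."""
    return l1 * sum(abs(w) for w in weights) + l2 * sum(w * w for w in weights)

print(round(elastic_net_penalty([0.5, -0.5]), 6))  # 0.00015
```

The L1 term pushes small weights toward zero while the L2 term discourages any single weight from peaking excessively, which matches the stated objective of making a layer use all of its inputs at roughly the same rate.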
Ensemble learning was also deployed, just as before, averaging either over the predictions offered by all networks or over those of the best performing 10% of the configurations. The best performing three configurations are labeled CNN1, CNN2 and CNN3, respectively. All the networks take the same input size, namely the 20 × 20 × 20 volume described above, and were subjected to elastic net regularization. Their configurations are as follows. CNN1 has four convolutional layers. The first one has 128 filters with a filter size of 5 × 5 × 20; the second and third convolutional layers have 256 filters but a smaller filter size, more precisely 3 × 3 × 20. Finally, the fourth convolutional layer has 512 filters and the same filter size as the latter two. After the first and the third layers, a pooling layer was introduced; the pooling layers use stochastic pooling. The network's architecture ends with a fully connected 3D cubic layer with 1024 units. It can be noticed that when taking the cube root of this value, the resulting number of units per dimension is not an integer. This is because dropout and constructive learning were applied to the fully connected layer, resulting either in vacancies or in insertions of neurons into the volume, and in an overall addition of 24 units. The training of CNN1 was done in batches of 512 examples per gradient step, with stochastic gradient descent used for the cost function optimization along with the bespoke backpropagation of errors. CNN2 has five convolutional layers, also optimized with elastic net regularization, the first four being identical to CNN1's. The fifth layer has 512 filters and a filter size of 3 × 3 × 20, and it is followed by the sole pooling layer of CNN2, which also employs stochastic pooling. The network's architecture ends with two fully connected 3D cubic layers with 1024 units each, but with different configurations of neurons within the layers' volumes.
This is again due to dropout and constructive learning applied to the fully connected layers. The training of CNN2 was done in the same way, but the cost function optimization was achieved via Levenberg–Marquardt. Last but not least, CNN3 also has five convolutional layers (elastic net regularization was applied to the weights), with the first layer having 126 filters and the same 5 × 5 × 20 filter size. The second and third layers have 252 filters, the second with a 5 × 5 × 20 filter size and the third with 3 × 3 × 20. The fourth and fifth layers have 504 filters with the same filter size as the previous one. CNN3 has just one pooling layer, in between the fourth and the fifth layers, which makes use of max pooling. The last convolutional layer is followed by two fully connected layers of 768 units each that were subject to dropout and constructive learning. The training was also done in batches, stochastic gradient descent being employed with the AdaDelta adaptive learning method. CNN1 and CNN2 use a stride of one for all the convolutional layers, while CNN3 uses a stride of 2 for the first and the fourth convolutional layers. This is a consequence of compromises imposed by the memory constraints that, at some point, bottleneck the GPUs. EL4 and EL5 are ensemble learning yields: EL4 averages over the best performing 10% of the CNNs, while EL5 averages over all of them. For HHG Scenarios 1 and 2, discussed in the previous subsection, the last rows of Table 1 feature the predictions obtained with the CNNs, EL4 and EL5.
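The two ensemble variants can be sketched as follows, with made-up placeholder predictions and validation scores standing in for the real networks' outputs:

```python
import numpy as np

# EL5 averages the predictions of all networks; EL4 averages only the
# best-performing 10%. Predictions and scores here are random placeholders.
rng = np.random.default_rng(0)
n_nets = 600
preds  = rng.random((n_nets, 8))   # one 8-value prediction vector per network
scores = rng.random(n_nets)        # one validation score per network

el5 = preds.mean(axis=0)                   # average over all networks

top = np.argsort(scores)[-n_nets // 10:]   # indices of the top 10% by score
el4 = preds[top].mean(axis=0)              # average over the best 60 networks

print(el5.shape, el4.shape, len(top))  # (8,) (8,) 60
```

Averaging over the top fraction trades some variance reduction for the exclusion of poorly performing configurations, which is why EL4 and EL5 can differ in practice.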
For predicting the temperatures of the electrons within the plasma, along with the corresponding percentages, it was found that the performances of CNN1, CNN2, CNN3, EL4 and EL5 were roughly identical and very close to those of DNN3 and EL3. In terms of running times, the convolutional neural networks take less time to train than the deep networks, on the order of 50 hours less on average. Prior to applying ensemble learning, the GPU Inference Engine (GIE) was used in the test phase to optimize the trained networks for run-time performance. Layer optimizations are attainable through GIE to the extent to which layers with unused output are eliminated in order to save computation time, or layers may be fused for better overall performance.
Technological advances in the field of laser-plasma interaction and diagnostics have provided the scientific community with vast amounts of data. Within the last few years we have been experiencing continuously increasing access, not only to storage space and computing power, but also to a multitude of readily built and easily modifiable open-source software libraries. It is thus becoming less and less problematic to exploit and explore this already available information in ways that have never been attempted before.
This paper proposes an alternative to the classical plasma kinetics simulations. Acknowledging the potential that innovative technologies like cloud computing, big data, machine learning and, ultimately, deep learning have for science, the author showed how these can be used for predictive modeling of laser-plasma interaction scenarios, with a focus on high harmonics generation. The deployment of the presented systems has the potential of yielding better predictive analytics and hence optimized laser-plasma interaction experiments, by offering a fair estimation of interaction conditions or insights into different phenomena occurring during the laser-plasma interaction.
The author would like to acknowledge support from the National Authority for Scientific Research and Innovation under Program NUCLEU, project PN1647 LAPLAS IV.
Organisms on Earth developed the ability to predict and restrict their activity to the night or day via an endogenous circadian clock [1, 2]. The mammalian circadian clock system is timed to a 24-h solar period and maintains rhythmic physiology. In mammals, the circadian clock influences nearly all aspects of physiology and behavior, including sleep-wake cycles, cardiovascular activity, endocrine function, body temperature, kidney function, physiology of the gastrointestinal tract, hepatic metabolism, immune function, detoxification, and the reproductive system [3, 4]. Disruption of biological rhythms produces negative effects in the short and long terms, leading to various diseases. For example, clock dysfunction accelerates the development of liver diseases such as fatty liver diseases, hepatitis, cirrhosis, and liver cancer. Liver disorders, in turn, also disrupt circadian clock function.
Circadian oscillations are generated by a set of genes forming a transcriptional autoregulatory feedback loop. In mammals, these include the core clock regulators (Clock, Bmal1, and Npas2), the clock feedback loop regulator genes (Per1, Per2, Per3, Cry1, and Cry2), and the clock target genes (Dbp, Rev-erbα (Nr1d1), RORα, Tef, CK1δ, etc.) [6, 7]. The central clock is located in the suprachiasmatic nucleus of the hypothalamus, and peripheral clocks exist in all tissues. Peripheral clocks in the liver have fundamental roles in maintaining liver homeostasis, including the regulation of energy metabolism and the expression of enzymes controlling the absorption and metabolism of xenobiotics. Over the past three decades, researchers have used global clock-gene knockout mice, clock gene mutant mice, and other genetic and molecular biology tools to elucidate the molecular architecture of the circadian clock in mammals.
Chronopharmacology and chronotoxicology form a new interdisciplinary science aimed at studying the influence of the circadian system on drug disposition, efficacy, and toxicity. Xenobiotic absorption, distribution, metabolism (especially by P450), and excretion [10, 11, 12, 13, 14] are all under circadian regulation. Circadian variations in these hepatic drug processing genes greatly influence the therapeutic effects and toxicity of drugs [10, 16, 17, 18]; the chronotherapy of anticancer drugs gives an excellent example. This chapter will focus on the general aspects of circadian rhythms in drug/toxicant disposition and biological effects, and will also discuss the effects of drugs/toxicants on circadian clock gene expression as a novel target of chronopharmacology and chronotoxicology. A dozen of our publications from the recent 5 years are also included for discussion.
The liver is the major site of xenobiotic metabolism and disposition. Accumulating evidence clearly indicates that circadian rhythms affect the gene/protein expression encoding xenobiotic uptake (Oatps and Ntcp), Phase-I metabolism (P450) and detoxication (Nrf2, MT-1, and GSH systems), Phase-II conjugation (glutathione S-transferases, UDP-glucuronosyltransferases, and sulfotransferases), and efflux transporters (Mrps and MDR) (Figure 1).
Diurnal variation of hepatic uptake transporters. In the liver, the major uptake transporters are the organic anion transporting polypeptides (Oatp1a1, Oatp1a4, Oatp1b2, and Oatp2b1), the organic cation transporter (Oct1), the organic anion transporters (Oat2 and Oat6), and others. The expressions of Oatp1a1, Oatp1a4, Oatp1b2, Oct1, and Oat2 display diurnal oscillations, with higher expression in the morning, while Oatp2b1 does not show circadian variation. The Na+-taurocholate cotransporting polypeptide (Ntcp, Slc10a1) is a major bile acid uptake transporter that localizes to the basolateral membrane of hepatocytes and displays an apparent circadian rhythm, with higher expression in the afternoon [20, 21, 22].
Diurnal variation of hepatic Phase-I P450 metabolism enzyme genes. Hepatic cytochrome P450 comprises the major enzymes catalyzing Phase-I drug metabolism; most drugs are metabolized by P450 family 1–4 enzymes. P450 enzyme genes and the corresponding nuclear receptors display diurnal oscillations: AhR and Cyp1a1/1a2 are higher in the morning; CAR and Cyp2b10 are higher in the afternoon and evening; PXR is higher in the afternoon, but Cyp3a11 and Cyp3a25 are higher in the morning; PPARα is higher in the morning, but Cyp4a10 is higher in the evening. Cyp7a1, the rate-limiting enzyme gene for bile acid synthesis, displays a typical circadian rhythm, with a peak around 18:00 [21, 22, 23, 24]. Bile acid synthesis is controlled by the circadian clock, and Rev-erbα is a major clock gene controlling bile acid homeostasis.
In the liver, the circadian rhythm serves to synchronize the metabolism of bile acids, glucose, and lipids, and its disruption can lead to diseases and affect chronotherapy. Indeed, the liver is the key organ maintaining energy metabolism, which is greatly influenced by feeding, diets, and diurnal variation. For example, peroxisome proliferator-activated receptor-gamma coactivator (PGC1α) stimulates the expression of clock genes, notably Bmal1 (also called Arntl) and Rev-erbα (also called Nr1d1), through coactivation of the ROR family of orphan nuclear receptors. Mice lacking PGC-1α show abnormal diurnal rhythms of activity, body temperature, and metabolic rate. Circadian clocks regulate metabolic processes not simply in response to daily environmental/behavioral influences but also by synchronizing the cell with its environment to modulate a host of metabolic processes [27, 28, 29].
Diurnal variation of hepatic detoxification enzyme genes. Many antioxidant enzyme genes display diurnal variations, such as the Nrf2 detoxication pathway genes, enzymatic detoxication components such as superoxide dismutase (SOD), catalase, and glutathione peroxidase (GSH-Px1), and non-enzymatic proteins such as metallothionein (MT). GSH is low in the afternoon, which is partially responsible for the greater acetaminophen hepatotoxicity when the drug is given in the afternoon.
Diurnal variation of hepatic Phase-II metabolism genes/proteins. Glucuronide and sulfate conjugations are major Phase-II pathways in the biotransformation and elimination of a wide variety of endogenous compounds, drugs, and other xenobiotics. Diurnal variations of these Phase-II reactions were reported in the 1980s. Consistent with the variation in the conjugation reactions, the expressions of Ugt1a5, 2a3, 2b34, and 2b36 and UDP-gpb, as well as Sult1a1, 1a5, and Sult5a1, all show diurnal oscillations. Hepatic GSH has its trough at dusk, the activities of GSH S-transferase are lower in the dark phase, and the expressions of Gst1a1/1, Gst1a4, Gstm2, and Gstt1/2 display diurnal rhythms that are generally lower in the dark phase.
Diurnal variation of hepatic Phase-III efflux transporters. P-glycoprotein is the major efflux pump in the liver, and its expression shows circadian variation together with the diurnal expression of Abcb1. In addition to P-glycoprotein, hepatic multidrug resistance-associated protein 2 (MRP2) and breast cancer resistance protein (BCRP) also show circadian oscillations. Diurnal variations in the hepatic mRNA expression of multidrug-resistance gene 1a (Mdr1a), Mrp2, and Bcrp were also evident [20, 35].
Diurnal variations of hepatic Phase-I, Phase-II, and Phase-III genes and of the nuclear transcription factors affect xenobiotic metabolism when xenobiotics are administered at different times of the day, impacting their efficacy and toxicity: the time really matters.
Table 1 gives a few examples of how disruption of the circadian clock can affect drug effects and toxicity. Most of the examples used genetic models with disrupted circadian clock genes or administration of drugs at different times.
| Toxicant/treatment | Animal model | Chronotoxicology finding |
| --- | --- | --- |
| Carbon tetrachloride | SD rats | 18:00 toxicity > 6:00, with lowest GSH levels |
| Carbon tetrachloride | Per2−/− mice | Acute toxicity increased in Per2−/− mice |
| Carbon tetrachloride | Per2−/− mice | Chronic toxicity and fibrosis increased in Per2−/− mice |
| Acetaminophen | KM mice | 18:00 toxicity > 6:00 |
| Acetaminophen | Per2−/− mice | Toxicity decreased in Per2−/− mice |
| Acetaminophen | Clock−/− mice | Toxicity decreased in Clock−/− mice, with prolonged PBST |
| Acetaminophen | Bmal1fx/fxCreAlb mice | Reduced toxicity, reduced protein adducts, altered APAP metabolism |
| Dioxin (TCDD) | Per1ldc, Per2ldc mice, cells | Increased TCDD induction of Cyp1a1, Cyp1b1 |
| Dioxin (TCDD) | Per1ldc, Per2ldc, Per1/Per2ldc mice | Abolished diurnal variation of TCDD induction of Cyp1a1 |
| Benzo[a]pyrene | Clock mutant (Clk/Clk) mice | Abolished diurnal variation of B[a]P induction of Cyp1a1 |
| Bile duct ligation | Per2−/− mice | Increased BDL-induced liver injury and fibrosis |
| Cholestyramine diet, restricted feeding | Per1−/−/Per2−/− mice | Lost diurnal variation in bile acid metabolic enzyme genes |
| Isoniazid | Swiss mice | Isoniazid hepatotoxicity at ZT1 > ZT9, ZT17 |
| Chlorzoxazone | Wistar rats | Diurnal variation in CYP2E1 affects its half-life |
| Alcohol | Per1−/−, Per2−/− mice | Less susceptible to alcohol toxicity |
| Diethylnitrosamine (DEN) | Clock mutant mouse hepatocytes | Decreased DEN metabolism and apoptosis tolerance |
| Cadmium | ICR mice | Toxicity at ZT8 > ZT20, corresponding to the low GSH level at ZT8 |
Carbon tetrachloride is a commonly used hepatotoxicant. In SD rats, administration of CCl4 in the afternoon produced more toxicity than administration in the morning, and the increased toxicity was accompanied by the lowest hepatic GSH levels in the afternoon. Acute CCl4 toxicity was increased in Per2−/− mice. At the 12-h time point after CCl4 treatment, more vacuolation was observed in the liver tissues of Per2-null mice than of wild-type (WT) mice, and at 24 h after CCl4 treatment, more severe hepatic necrosis was evident than in WT mice. The deficit of the Per2 gene enhanced Ucp2 gene expression levels in the liver, leading to reduced ATP and increased production of toxic CCl4 derivatives. The absence of Per2 also caused an increased expression of the Clock gene. Per2-null mice were sensitive not only to CCl4-induced acute hepatotoxicity but also to CCl4-induced chronic toxicity and fibrosis. CCl4 caused much more severe liver fibrosis and hepatic stellate cell (HSC) activation in mPer2-null mice than in WT mice. Per2-null mice exhibited less efficiency in fibrosis resolution and apoptosis resistance in HSC. Transfection of Per2 cDNA into CCl4-exposed HSC restored apoptosis sensitivity, with up-regulation of the TRAIL-R2/DR5 signaling pathway.
Acetaminophen hepatotoxicity also displays diurnal variation. When acetaminophen was given in the afternoon, toxicity was greater than when it was given in the early morning [23, 39]. At 8:00, there was no difference in acetaminophen toxicity between Per2-null and WT mice, but at 20:00, when Per2 expression is highest, Per2-null mice had less liver injury, with less Cyp1a2 expression to bio-activate acetaminophen. In another study, acetaminophen toxicity was greater at Zeitgeber time (ZT)14 than at ZT2, and clock-deficient mice were resistant to the toxicity at ZT14, with prolonged pentobarbital sleep time (PBST), indicating reduced activation of acetaminophen. Using Bmal1 mutant mice (Bmal1fx/fxCreAlb), acetaminophen toxicity at ZT12 was decreased, along with decreased APAP protein adducts and altered acetaminophen metabolism kinetics (increased AA-Gluc), possibly due to decreased NADPH-cytochrome P450 oxidoreductase gene expression and activity at ZT12 compared with WT mice.
In Per1- and Per2-deficient mice, the ability of the AhR ligand dioxin (TCDD) to induce Cyp1a1 and Cyp1b1 was enhanced, especially with targeted interruption of Per1. TCDD induction of Cyp1a1 was 23–43-fold greater during the night (ZT18) than during the day (ZT6) in WT mice. However, this diurnal variation in TCDD induction of Cyp1a1 expression was abolished in Per1ldc, Per2ldc, and Per1ldc/Per2ldc mutant mice, suggesting that Per1, Per2, and their timekeeping function in the circadian clockworks mediate the diurnal variation in TCDD induction of Cyp1a1. Clock mutant Clk/Clk mice failed to show the typical oscillation of AhR expression, and BaP (an AhR ligand) induction of Cyp1a1 was disrupted.
In Per2−/− mice, bile duct ligation (BDL)-induced liver injury and fibrosis were increased, along with increases in TNFα, TGFβ1, Col1α, and TIMP1 in the livers of Per2-null mice compared with WT mice. In Per1−/− and Per2−/− mice fed a 2% cholestyramine diet and/or subjected to restricted feeding (which phase-shifts the peripheral clock), liver bile acid levels were increased and the nuclear receptors CAR and PXR were activated, together with increased serum AST levels, indicative of liver damage. In these Per1−/− and Per2−/− mice, the circadian expression of key bile acid synthesis and transport genes, including Cyp7a1 and Ntcp, was lost.
The hepatotoxic potential of the antituberculosis drug isoniazid varied when it was administered at ZT1, ZT9, and ZT17, and the toxicity was highest when isoniazid was given at ZT1. Chlorzoxazone is a CYP2E1-metabolized drug, and its kinetics and half-life are altered by the diurnal variation of CYP2E1 activity; the chlorzoxazone half-life in plasma of the light-phase group was significantly longer than that of the dark-phase group, with an increase in 6-hydroxychlorzoxazone production. Acute alcohol induced higher toxicity at ZT13 than at ZT1, when Per1 and Per2 were highly expressed. Per1−/− and Per2−/− mice were less susceptible to alcohol hepatotoxicity, especially the Per1-null mice, which had decreased expression of peroxisome proliferator-activated receptor-gamma and its target genes related to lipid metabolism, such as Srebp1, fatty acid synthase (Fas), CD36, diacylglycerol O-acyltransferase 2 (Dgat2), AP2, and adipsin. In primary hepatocytes isolated from Clock mutant Clk/Clk mice and WT mice, diethylnitrosamine (DEN)-induced apoptosis and cell death were reduced in the Clock-deficient cells, probably due to decreased DEN metabolism. Cadmium hepatotoxicity is independent of metabolic activation, yet its mortality was higher at ZT8 than at ZT20, corresponding to the lowest hepatic GSH level at ZT8.
Thus, alterations of diurnal oscillations affect drug metabolism, efficacy, and toxicity. On the other hand, drugs can target circadian clock gene expression to produce biological effects, which will be discussed below.
The circadian clock is located in both the brain and peripheral tissues. The central clock pacemaker is located in the suprachiasmatic nucleus (SCN) of the hypothalamus, while the peripheral clocks are distributed in all peripheral tissues. The liver is the main peripheral tissue under circadian clock regulation [7, 8, 9]. Drugs/toxicants can affect both central and peripheral clock gene expression. For example, Mn is a well-known neurotoxicant producing a Parkinson-like syndrome, but it also produces liver injury. To examine the effect of Mn on the central and peripheral clocks, rats were given Mn at 1 and 5 mg/kg, ip, every 2 days for 1 month, and the hypothalamus and liver were removed to examine clock gene expression (Figure 2). The results showed that Mn induced aberrant expression of circadian clock genes in both hypothalamus and liver; the liver was more sensitive to the Mn-induced decreases in the clock genes Bmal1 and Per1 and increase in Dbp, indicating that both central and peripheral clocks can be disrupted by drugs/toxicants. Another example is chronic alcohol administration. Chronic alcohol consumption produced disruption of circadian clock gene expression in both central (hypothalamus) and peripheral tissues (liver and colon), and the liver appeared to be more susceptible than the brain to alterations of metabolic genes and core molecular clock disruption. In addition to producing fatty liver and affecting the diurnal oscillations of metabolic genes (alcohol dehydrogenase 1, carnitine palmitoyltransferase 1a, Cyp2e1, phosphoenolpyruvate carboxykinase 1, pyruvate dehydrogenase kinase 4, Ppargc1a, Ppargc1b, and Srebp1c), ethanol feeding altered the diurnal oscillations of core clock genes (Bmal1, Clock, Cry1, Cry2, Per1, and Per2) and clock-controlled genes (Dbp, Hlf, Nocturnin, Npas2, Rev-erbα, and Tef) in the liver. In contrast, ethanol had only minor effects on the expression of core clock genes in the SCN.
Many drugs/toxicants can affect central and peripheral circadian clock gene expression as targets of chronopharmacology and chronotoxicology. Tables 2 and 3 provide some examples, including our work in the field.
| Drugs | Animal model (dose, route, time) | Chronopharmacology | References |
| --- | --- | --- | --- |
| Atorvastatin | KM mice; 10–100 mg/kg, po × 30 days | Swollen hepatocytes and feather-like degeneration; increased Cyp7a1, FXR; decreased bile acid transporters; increased expression of Bmal1, Npas2; decreased Per2, Dbp | |
| Metformin | C57 mice; 164 mg/kg in drinking water for 6 weeks | Increased serum leptin and decreased glucagon levels; increased PGC1α, PPARα, AMPK; decreased ACC in liver; phase-advanced circadian clock and metabolic genes in liver and activated liver casein kinase Iα (CKIα) | |
| Oleanolic acid | Apoe−/− mice on HFD, F344 rats; 0.01% OA × 11 weeks | Increased lipid droplets with no change in oxidative stress; increased Bmal1, Clock, and Elovl3, Tubb2a, and Cldn1; decreased Per3, Amy2a5, Usp2, and Thrsp | |
| Resveratrol | C57 mice fed normal or HFD; 0.1% Res × 11 weeks | Ameliorated HFD-increased plasma leptin, lipids, and BW; restored rhythmicity of Clock, Bmal1, and Per2 and of clock-controlled lipid metabolism genes (Sirt1, PPARα, Srebp-1, Acc1, and Fas) | |
| Sea cucumber saponin (SCS) | ICR mice; 0.03% SCS diet, night feeding × 2 weeks | Improved serum lipid profile; restored rhythmicity of PPARα, Srebp1, Cpt, and FAS; restored night-feeding-disrupted clock gene expression | |
| Zuotai | KM mice; 10 mg/kg, po × 7 days | Decreased the amplitude of Clock, Npas2, Bmal1; increased Dbp, Nfil3 at 10:00 and Nr1d1 at 18:00; no effect on Cry and Per genes | |
| Polyporus and Bupleuri radix | ICR mice, Per2Luc mice; 500 mg/kg, po × 3 days, at different ZT and light/dark | Effective in acutely manipulating the peripheral circadian clock phase, with stimulation time-of-day dependency in vitro as well as in vivo | |
| Jiao-Tai-Wan | SD normal and model (HFD + PSD × 4 weeks) rats; 2.2 g/kg, po × 4 weeks | Increased total sleep time and slow wave sleep time; reversed model-rat inflammation markers; increased Cry1, Cry2 and decreased NF-κB in PBMC | |
| Toxicants | Animal model (dose, route, time) | Chronotoxicology | References |
| --- | --- | --- | --- |
| Carbon tetrachloride | BALB/c mice; 0.6 ml/kg, ip, 2×/week × 4 weeks | Chronic CCl4 produced liver fibrosis and altered the amplitudes, mesors, and acrophases of clock gene expression; the circadian rhythms of Cry2, PPARα, and POR were lost | |
| DEN + CCl4 + EtOH | DEN 100 mg/kg, ip + CCl4 + EtOH × 16 weeks | Produced HCC and markedly increased α-fetoprotein; at 10:00, expression of Bmal1 decreased while expressions of Dbp and Rev-erbα increased | |
| Mn | 1 and 5 mg/kg, ip × 4 weeks | Produced neuroinflammation and dopaminergic neuron loss; decreased expression of Bmal1, Clock, Per1, Per2, while increased expression of Dbp and Nr1d1 | |
| LPS + rotenone | SD rats; LPS 5 mg/kg, ip × 1; 200 days later, rotenone 0.5 mg/kg, sc × 20 | Produced neuroinflammation and dopaminergic neuron loss; reduced expression of Bmal1, Clock, Per1, Per2, Dbp, Nr1d1 at the mRNA and protein levels, with no effect on Cry1 | |
| LPS | ICR mice; LPS 1 mg/kg, ip at ZT4, 10, 16, 22, or at 2, 8, and 26 h after the ZT4 injection | Produced increases in serum TNFα and heart and liver apoptosis; decreased Per1, Per2 2 h after dosing at ZT4 in heart and liver; increased Per2 8 and 26 h after LPS in heart and liver | |
| Alcohol | C57 mice, Per2Luc mice; Lieber-DeCarli diet for 30–37 days | Produced steatosis and increased serum TG; diurnal oscillations of Bmal1, Clock, Cry1, Cry2, Per1, and Per2 and of clock-controlled genes (Dbp, Hlf, Nocturnin, Npas2, Rev-erbα, and Tef) were altered in livers of ethanol-fed mice | |
| Alcohol | WT and ClockΔ19 mutant mice; Nanji liquid alcohol diet at ZT4 for 10 weeks | Altered the expression of circadian and metabolism genes in hippocampus, liver, and colon by array analysis; ClockΔ19 affected inflammation and metabolism genes | |
Examples of drugs. Atorvastatin is an HMG-CoA reductase inhibitor used for hyperlipidemia. It is generally safe but may induce cholestasis. Repeated administration of atorvastatin (10–100 mg/kg, po) to mice for 30 days produced hepatocyte swelling and feather-like degeneration, indicative of cholestatic injury, with increases in the inflammation markers Egr1 and MT-1, increased Cyp7a1, FXR, and SHP, and decreased bile acid transporters Ntcp, Bsep, Ostα, and Ostβ. Since Cyp7a1 is a clock-driven gene, the effects on circadian clock gene expression were also examined. Atorvastatin increased the expression of Bmal1 and Npas2 and decreased the expression of Per2, Per3, Dbp, and Tef, but had no effect on Cry1 and Nr1d1. Similar effects on circadian clock gene expression, although to a lesser extent, were also observed when atorvastatin was given at the low dose (10 mg/kg) for a longer period of 90 days.
Metformin is commonly used for type 2 diabetes. In C57 mice, metformin in the drinking water for 6 weeks led to increased serum leptin and decreased glucagon levels. The effect of metformin on liver and muscle metabolism was probably mediated through AMPK activation, resulting in the inhibition of acetyl-CoA carboxylase (ACC), the rate-limiting enzyme in fatty acid synthesis. Metformin activated liver casein kinase Iα (CKIα) and muscle CKIε, known modulators of the positive loop of the circadian clock, thus resulting in phase advances in the liver and phase delays in the muscle for clock and metabolic gene expression.
Examples of active ingredients from herbal medicine. Oleanolic acid is a triterpenoid used to reduce hyperlipidemia. Dietary oleanolic acid supplementation (0.01%) was provided to Apoe- and Apoa1-deficient mice and F344 rats. In Apoe-deficient mice, oleanolic acid supplementation increased hepatic lipid droplets and increased circadian clock gene expression, together with increases in the lipid metabolism genes fatty acid elongase 3, tubulin beta-2A chain, and claudin 1, while the expression of Per3, amylase 2a5, ubiquitin-specific peptidase 2, and thyroid hormone-inducible hepatic protein (Thrsp) was decreased.
Resveratrol is an active ingredient in grapes and red wine and shows beneficial effects in metabolic disorders. In HFD-fed mice, resveratrol corrected the high-fat-diet-induced disruption of the rhythmic expression of clock genes and clock-controlled lipid metabolism, and ameliorated the rhythmicity of plasma leptin, the lipid profile, and whole-body metabolic status (respiratory exchange ratio, locomotor activity, and heat production). Meanwhile, resveratrol modified the rhythmic expression of clock genes (Clock, Bmal1, and Per2) and clock-controlled lipid metabolism-related genes (Sirt1, Ppara, Srebp-1c, Acc1, and Fas).
Dietary sea cucumber saponin (SCS) has been shown to have beneficial effects on glucose and lipid metabolism, which are related to the circadian clock. Dietary SCS caused alterations in the rhythms and/or amplitudes of clock genes that were more significant in the brain than in the liver. In addition, peroxisome proliferator-activated receptor α (PPARα) and sterol regulatory element-binding protein-1c (SREBP-1c), together with their target genes carnitine palmitoyltransferase and fatty acid synthase, showed marked changes in rhythm and/or amplitude in SCS-group mice .
Examples of mixtures from traditional medicine. Zuotai is an essential component of many popular Tibetan medicines. Mice were orally given Zuotai (10 mg/kg, 1.5-fold the clinical dose) daily for 7 days, and livers were collected every 4 h over a 24-h period to examine its effects on circadian clock gene expression. Zuotai decreased the oscillation amplitude of Clock, Npas2, and Bmal1 at 10:00. For the negative feedback genes of the clock, Zuotai had no effect on the oscillation of Cry1, Per1, Per2, and Per3. For the clock-driven target genes, Zuotai increased the oscillation amplitude of Dbp and decreased that of nuclear factor interleukin 3 (Nfil3) at 10:00, but had no effect on thyrotroph embryonic factor (Tef); Zuotai increased the expression of Nr1d1 at 18:00 but had little influence on Nr1d2 and RORα .
Polyporus and Bupleuri radix are popular traditional medicines. Polyporus (Zhuling) is used as a diuretic in the treatment of edema, while Bupleuri radix (Chaihu) is used for chronic hepatitis. Per2Luc mice were used to screen their effects on the circadian clock; Polyporus was more effective than Bupleuri radix in producing phase shifts of the peripheral circadian clock and in promoting time-of-day dependency in vitro as well as in vivo .
Jiao-Tai-Wan (JTW), composed of Rhizoma Coptidis and Cortex Cinnamomi, is a classical traditional Chinese prescription for insomnia. In an obesity-resistant (OR) rat model of chronic partial sleep deprivation (PSD), 4 weeks of JTW administration increased total sleep time and total slow-wave sleep (SWS) time, and reversed the model rats' elevated serum markers of inflammation and insulin resistance; these changes were also associated with up-regulation of Cry1 and Cry2 mRNA and down-regulation of NF-κB mRNA expression in peripheral blood mononuclear cells .
Table 3 lists some examples of known toxicants that disrupt circadian clock gene expression as a mechanism of their acute and chronic toxic effects on both the brain and liver.
Examples of hepatotoxicants. Chronic carbon tetrachloride administration to C57 mice (0.6 mL/kg, IP, twice a week for 4 weeks) produced liver injury and fibrosis, and the expression of clock genes and metabolic genes in the fibrotic livers was altered. The amplitudes of the circadian expression of Bmal1 and Per1 were attenuated, and the mesors of Clock and Per1 expression were increased. The acrophases of Clock, Per1, and Cry1 expression were significantly delayed. The circadian rhythm of Cry2 expression was lost in the fibrosis group, as was the circadian rhythm of PPARα and cytochrome P450 oxidoreductase (POR) .
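Rhythm parameters of this kind (mesor, amplitude, acrophase) are conventionally estimated by cosinor analysis, i.e., least-squares fitting of a cosine curve to time-series expression data. A minimal sketch in Python, assuming a 24-h period; the function name and the synthetic data are illustrative, not taken from the cited study:

```python
import numpy as np

def cosinor_fit(t, y, period=24.0):
    """Least-squares cosinor fit of y ~ mesor + amplitude*cos(2*pi*(t - acrophase)/period).

    Rewriting the cosine as b*cos(wt) + c*sin(wt) makes the model linear,
    so it can be solved with ordinary least squares.
    """
    w = 2.0 * np.pi * t / period
    X = np.column_stack([np.ones_like(t), np.cos(w), np.sin(w)])
    (mesor, b, c), *_ = np.linalg.lstsq(X, y, rcond=None)
    amplitude = np.hypot(b, c)                                # peak-to-mesor height
    acrophase = (np.arctan2(c, b) * period / (2.0 * np.pi)) % period  # time of peak
    return mesor, amplitude, acrophase

# Synthetic expression time course: mesor 5, amplitude 2, peak at ZT8,
# sampled every 4 h over 48 h (mirroring a typical sampling design).
t = np.arange(0.0, 48.0, 4.0)
y = 5.0 + 2.0 * np.cos(2.0 * np.pi * (t - 8.0) / 24.0)
m, a, phi = cosinor_fit(t, y)  # recovers mesor 5, amplitude 2, acrophase ZT8
```

In this framing, a toxicant-attenuated amplitude, an increased mesor, or a delayed acrophase each corresponds to a change in one fitted parameter; in practice noisy data would also call for a significance test of rhythmicity.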
Chronic diethylnitrosamine (DEN) administration not only produced hepatocellular carcinoma (HCC) with markedly enhanced expression of Afp, but also decreased the expression of Bmal1 and increased the expression of Dbp and Rev-erbα (Nr1d1) . Circadian disruption is well known to promote carcinogenesis . In end-stage human hepatocellular carcinoma, the expression of the clock genes Bmal1, Per1, Per2, Cry1, and Cry2 was decreased, along with decreases in the clock-targeted MT-1, MT-2, and MTF1 (which are considered biomarkers of HCC). On the other hand, the expression of the clock target genes Nr1d1 and Dbp was upregulated compared with peri-HCC and normal livers. Peri-HCC tissue also had mild alterations in these gene expressions .
Examples of neurotoxicants. As shown in Figure 2, repeated Mn administration disrupted both the central and the peripheral liver circadian clock genes, with decreases in Bmal1, Clock, Npas2, Per1, and Cry1 but increases in Dbp and Nr1d1. The Mn-induced aberrant expression of these clock genes in the brain was consistent with that in the liver, and the liver appeared to be more sensitive than the hypothalamus to Mn-induced disruption of the circadian clock .
Chronic neuroinflammation can aggravate the neurotoxic effects of toxicants. Rats received a single injection of LPS (5 mg/kg) and, 200 days later, repeated injections of low-dose rotenone (0.5 mg/kg, sc, 5 times/week for 4 weeks), which produced neuroinflammation and loss of dopaminergic neurons in the substantia nigra, replicating a model of Parkinson's disease . In this PD model, aberrant expression of circadian clock genes in the brain cortex was evident: the core clock genes Bmal1, Clock, and Npas2 were decreased, the clock feedback genes Per1 and Per2 were decreased with no effect on the expression of Cry1 and Cry2, and the expression of the clock target genes Dbp and Nr1d1 was also decreased .
LPS produces inflammation not only in the brain but also in the liver. ICR mice received LPS (1 mg/kg, IP) at ZT4, ZT10, ZT16, or ZT22, and the liver and heart were harvested 2 h later for gene expression analysis. Hepatic expression of Per1 and Per2 was decreased at ZT6 after LPS injection, but Per1 was increased 8 and 26 h after LPS injection. The heart appeared to be more sensitive than the liver to these changes, as at ZT4 both Per1 and Per2 in the heart were decreased .
Examples of chronic ethanol toxicity. Alcoholic liver diseases are a major concern, as they produce metabolic disruption. In C57 mice and Per2 mutant mice, ethanol administration altered the expression of clock genes in the liver but not in the brain. The diurnal oscillations of the core clock genes (Bmal1, Clock, Cry1, Cry2, Per1, and Per2) and clock-controlled genes (Dbp, Hlf, Nocturnin, Npas2, Rev-erbα, and Tef) were altered in the livers of ethanol-fed mice .
In Clock mutant mice, altered clock and metabolism genes were evident in the hippocampus, liver, and colon. Of particular interest was the finding that a high proportion of the genes involved in inflammation and metabolism on the array were significantly affected by alcohol and by the Clock gene mutation in the hippocampus .
Thus, drugs and toxicants can affect central and peripheral circadian clock gene expression as targets of their therapeutic effects and/or toxicity .
The importance of chronopharmacology was reviewed 10 years ago . Circadian rhythm governs many physiological functions, and RNA-Seq has revealed that over 3000 genes in the liver show circadian oscillation . Over the past two decades, research has investigated the molecular mechanisms linking circadian clock genes with the regulation of hepatic physiological functions, using global clock-gene-knockout mice or mice with liver-specific knockout of clock genes or clock-controlled genes. Clock dysfunction accelerates the development of liver diseases such as fatty liver disease, cirrhosis, hepatitis, and liver cancer, and these disorders in turn disrupt clock function. Similarly, clock dysfunction clearly affects drug efficacy and toxicity.
In the liver, phase I is composed mainly of the cytochromes P450 involved in detoxification and in hormone and lipid metabolism , which are regulated by nuclear receptors. Phase-II enzymes modify the phase-I metabolites through conjugation reactions, while phase III comprises the membrane transporters responsible for the elimination of the modified xenobiotics. Phases I–III of drug metabolism are under strong circadian regulation . This rhythmic control of xenobiotic detoxification provides the molecular basis for the dose- and time-dependence of drug toxicity and efficacy, and makes circadian clock gene expression a target for chronopharmacology , not only for drugs but also for traditional medicines . Circadian rhythms also greatly affect drug toxicity at different times of administration . Circadian rhythms are controlled, regulated, and maintained by clock gene networks, which are emerging targets of chronopharmacology and chronotoxicology.
This study is supported in part by the Chinese National Science Foundation (81560592).
The authors declare no conflict of interest.