Evolving Roles of Spontaneous Reporting Systems to Assess and Monitor Drug Safety

This chapter aims to describe current and emerging roles of spontaneous reporting systems (SRSs) for assessing and monitoring drug safety. Moreover, it offers a perspective on the near future, which entails the so-called era of Big Data, keeping in mind both regulator and researcher viewpoints. After a panorama on key data sources and analyses of post-marketing data of adverse drug reactions, a critical appraisal of methodological issues and debated future applications of SRSs will be presented, including the exploitation of, and challenges in, evidence integration (i.e., merging and combining heterogeneous sources of data into a unique indicator of risk) and patient reporting via social media. Finally, a call for a responsible use of these studies is offered, with a proposal on a set of minimum requirements to assess the quality of disproportionality analysis in terms of study conception, conduct, and reporting.


Introduction
Prescription of a medication is based on a balance between expected benefits, already investigated before marketing authorization, and possible risks (i.e., adverse effects), which become fully apparent only as time goes by after marketing authorization. Premarketing development, in fact, provides evidence on the efficacy of drugs in an ideal clinical setting of use (i.e., clinical trials); only the most frequent side effects are recognized in this step. The use of drugs in real-world circumstances will show the actual risk-benefit profile.
The World Health Organization (WHO) previously defined pharmacovigilance (PhV) as "the science and activities relating to the detection, assessment, understanding and prevention of adverse effects or any other possible drug-related problems" [1], a definition that, in the recent past, was regarded as being synonymous with post-marketing surveillance for adverse drug reactions (ADRs).

Main spontaneous reporting systems
Each National Drug Agency collects its own reports in a dedicated spontaneous reporting database, and some international SRSs gather reports originating both by systematic flows from national databases and by direct submission of the reporter. Each source has specific characteristics and limitations to be considered when planning a drug safety analysis (e.g., completeness of data and options for database interrogation); however, collecting information from all these accessible sources is the mainstay in PhV. Table 1 shows an overview of main international PhV databases, which cover a very large population and heterogeneous patterns of drug use and ADR reporting attitudes. Public access to SRSs is becoming a standard, as addressed in Section 8.2.

The appropriate choice of data source according to the research question
The identification of the most appropriate source of data is a key step to properly address the research question, considering the strengths and limitations of the different approaches (Table 2). For instance, SRSs represent the best source of data to investigate the so-called designated medical events (DMEs), which are usually rare and have a strong drug-attributable component (e.g., Torsades de Pointes and Stevens-Johnson Syndrome) [5,6]. Conversely, the possible role of drugs in events with high background incidence (e.g., myocardial infarction) can be better investigated through healthcare databases (EMRs and claims databases) [7,8]. Regardless of the type of ADR, a typical sequence to define the safety profile of drugs considers data mining of SRSs as the first step of the analysis, followed by investigation through healthcare databases to confirm or refute statistically significant associations.
From data cleaning (a mere data managing step, see later) to statistical analyses, all steps of data management are considered tasks to address questions on ADRs. Usually, each source of data requires specific data-mining approaches (e.g., disproportionality calculation for SRSs and multiple regression analysis for EMRs), but emergent strategies to better exploit the more accessible sources are now appearing in the literature (e.g., self-controlled time series and prescription sequence symmetry analysis, PSSA) [9]. In fact, data mining could virtually provide as many associations as possible between drug and effect, but without consensus among experts on the methodological steps and confirmation of pathophysiological pathways, the associations can easily lead to errors of interpretation. Validity depends on the scientific rigor of the methods, quality, and type of primary source (RCT or observational studies). Meta-analysis of nonrandomized (observational) studies is currently not standardized.

The regulator's view
Traditionally, regulatory decision-making has relied on detection of safety signals through spontaneous reports. Today, things are changing for several reasons, including increased awareness of prescribers on the importance of PhV and the emerging role of different health professionals and patients.
A modern model involves signal detection, signal validation (i.e., the signal should represent a novel causal relationship between a drug and an event), signal prioritization (evaluation of the clinical impact of the safety issue), and some other steps to drive the decision-making, also on the basis of data on how drugs are used in a population and how their utilization can be influenced. Drug consumption is also now frequently analyzed by regulators to evaluate the actual impact of risk minimization strategies in specific settings, such as the risk of progressive multifocal leukoencephalopathy with multiple sclerosis therapies [10].
Regulatory agencies routinely perform analyses of SRSs to detect disproportionality signals, especially for new drugs. Although the Food and Drug Administration (FDA) and the European Medicines Agency (EMA) have different frameworks, they are promoting rigorous scientific information exchange for optimal post-approval drug safety monitoring [11]. Both agencies publicly post the list of signals emerging from internal analyses, with the aim of promoting transparency and stimulating research while avoiding alarm. Usually, many of these signals remain (fortunately) unnoticed by clinicians, and only a minority of them result in measures affecting clinical practice, such as ketoacidosis with sodium-glucose cotransporter-2 inhibitors, which in turn prompted the FDA to revise relevant labels.
Also for old drugs, the importance of spontaneous reports should not be overlooked, especially because the amount of time a drug has been on the market (drug age) is correlated with the number of signals detected [12]. The recent case of thiocolchicoside, restricted in recommended dose and treatment duration by the EMA, is noteworthy: after withdrawal of tetrazepam, the use of alternatives (including thiocolchicoside) and relevant spontaneous reporting increased, which made specific safety concerns evident [13].
In the past, regulatory actions on a given safety issue did not support clinical practice. The case of haloperidol and the risk of torsade de pointes (TdP) is a typical example: an ECG was indeed recommended in some circumstances before administering the medicine. However, it was not duly taken into account that a psychotic crisis does not usually allow appropriate ECG measurement, and this resulted in the inability to use injectable haloperidol in the emergency setting. The clinical consequence was a loss of this therapeutic option and its substitution with alternatives, which are not necessarily better.

The researcher's view
Disproportionality analyses (DAs) are attracting considerable interest in the medical literature for several reasons: (1) there is increasing availability of publicly accessible SRSs and open-access tools to independently analyze international databases [14]; the various web-based resources mainly differ in terms of data transparency and the possibility to customize searches and analyses (e.g., correction for confounders); (2) DAs are inexpensive and relatively quick and easy to perform, at least by frequentist methods such as the reporting odds ratio (ROR) and the proportional reporting ratio (PRR); these methods can be applied systematically to analyze a given pharmacological class or specific DMEs such as TdP [15]; (3) they are likely to be published in high-ranking journals, especially when sophisticated analyses are presented, claiming to correct for multiple confounders [16], and a strong signal emerges. This last aspect raises ethical issues: on one hand, the researcher may be more prone toward an alarming interpretation of the findings to increase the impact of the publication. On the other hand, when broadly looking at the published literature in the past 5 years, only a minority of industry-sponsored studies provided "negative findings," that is, the lack of statistically significant DAs [17,18].
This "uncontrolled" scenario has generated what has been termed "apophenia," that is, the perception of meaningful patterns and causal connections among random data [19], or the so-called pharmacovigilance syndrome, that is, the incorrect use of spontaneous adverse event reports to infer that a drug causes an adverse reaction, what the incidence or prevalence of such events may be, and whether one drug has lower or higher risk than another [20]. This in turn increases the complexity of the risk-benefit assessment [21] and may generate false alarm among clinicians [22].
It must be emphasized that statistical techniques, usually referred to as quantitative analyses [23], cannot be used as a standalone approach to assess a drug-related risk because no risk quantification can be offered: they should be viewed in conjunction with a qualitative analysis of individual reports, whenever feasible, and other pieces of evidence (e.g., observational studies). In other words, they cannot replace a proper clinical judgment in the individual patient.
In the recent past, a debate arose on the proper use of DAs and the benefit of their publication [24,25]. However, no actions have been taken so far. The key applications of DAs are summarized as follows:
A. Signal detection (including specific events or the overall safety profile). This is the main goal of DAs, especially for medicines with unpredictable pharmacokinetics-pharmacodynamics such as biologicals [26], or recently marketed drugs with a still undefined safety profile. This is also justified for rare adverse events that may escape detection in premarketing clinical trials (e.g., TdP, liver injury) or in case an imbalance (not reaching statistical significance) emerged from clinical data, as happened for pioglitazone and bladder cancer [27]. The choice of comparator group is pivotal in signal detection, especially in terms of clinical implications. For instance, a novel antidiabetic drug should be compared with other antidiabetic drugs through the so-called analysis by therapeutic area (i.e., comparing the reporting of a given drug with other agents belonging to the same therapeutic class), in order to identify patients that are likely to share common risk factors, mitigate confounding by indication, and investigate potential intraclass variations of risk [28][29][30][31][32]. As a matter of fact, a suspected risk for a drug can be interpreted from a clinical point of view only if compared to the same risk of therapeutic alternatives, especially for severe disorders (e.g., diabetes), because patients cannot be left without treatment.
B. Test/verify/confirm a pharmacological hypothesis. This can be illustrated by a number of examples in the recent past, including the relationship between hERG blockade and occurrence of TdP in humans [33]; the risk of diabetes by antipsychotics, which was more frequently associated with agents blocking simultaneously histamine H1 and serotonin 5-HT2C receptors [34]; the association between different receptor occupancy and antipsychotic-induced movement disorders [35], and the link between dopamine receptor agonist drugs and specific impulse control disorders [36].
C. Address/verify methodological issues. This aspect is receiving increasing attention because it may strongly impact final results. Before planning the analysis, it is important to verify all potential biases affecting the drug(s) or event(s) under investigation and prespecify strategies to handle these confounders (see Section 7) [37][38][39][40][41][42][43].
D. Investigate the likelihood of drug-drug interactions. A few pilot initiatives proposed theoretical strategies as well as relevant automated methods to detect signals resulting from drug-drug interactions (DDIs) in PhV databases [44][45][46][47][48]. Various approaches can be used to highlight adverse drug interactions: (a) reported suspicion of interactions as noted by the reporter in a case narrative, (b) assignment of the two drugs as interacting, (c) drug-drug interaction reported as adverse event, and (d) increased co-reporting for the drug pair when disproportionality is applied [49]. There is also interest in using SRSs to investigate whether a given drug-drug combination moderates the frequency of an adverse event [50,51].
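Approach (d) can be illustrated with a deliberately simplified sketch. The Python function below compares the proportion of reports mentioning the event when both drugs are co-reported with the proportion that would be expected if the two drugs contributed additively. The report structure, the function name, and the additive baseline are assumptions made here for illustration; they do not reproduce any of the published methods cited above.

```python
def interaction_excess(reports, drug_a, drug_b, event):
    """Compare observed co-reporting of an event with an additive expectation.

    reports: iterable of (drugs, events) pairs, each a set of strings
             (hypothetical structure, chosen for this example).
    """
    # counts[(has_a, has_b)] = [reports with the event, all reports]
    counts = {(x, y): [0, 0] for x in (False, True) for y in (False, True)}
    for drugs, events in reports:
        key = (drug_a in drugs, drug_b in drugs)
        counts[key][0] += int(event in events)
        counts[key][1] += 1

    def proportion(key):
        cases, total = counts[key]
        return cases / total if total else 0.0

    p_none = proportion((False, False))   # background reporting
    p_a = proportion((True, False))       # drug A without drug B
    p_b = proportion((False, True))       # drug B without drug A
    p_both = proportion((True, True))     # both drugs co-reported

    # Additive expectation, clipped to the [0, 1] interval.
    expected = min(1.0, max(0.0, p_a + p_b - p_none))
    return {"observed": p_both, "expected_additive": expected,
            "excess": p_both - expected}
```

A positive excess only suggests that the pair deserves qualitative review of the underlying reports; as with any disproportionality output, it is not a measure of risk.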
A recent systematic review highlighted that only a minority of studies aimed at confirming or supporting previous regulatory decisions on a given safety aspect [52], thus strengthening the aforementioned concept that DAs do not usually support, on their own, regulatory actions but must be integrated with other data sources.
Apart from DAs, the value of case-by-case assessment should not be disregarded. In fact, the individual evaluation of reports performed by pharmacovigilance experts with medical background has multiple aims: (a) it may per se be used for signal detection of rare ADRs, such as in the case of DMEs by detecting potential drug-event combinations even earlier than DAs [53] and (b) it may confirm or refute disproportionality signals, by strengthening/reducing causality assessment or by identifying duplicates by automated strategy (through the use of narratives). The key challenging aspect of case-by-case analysis is represented by causality assessment, that is, the process of differential diagnosis to prove an actual causal relationship: exclusion of alternative causes, biological and temporal plausibility, and evidence of dechallenge and rechallenge (usually unintentional) should be verified. The complexity of causality assessment stems from the fact that it needs to be viewed from the context of the patient treated rather than the drug product [54]. Although several approaches are available to assess causality, no single method is universally accepted and there is no gold standard [55]. The choice of the most suitable approach may also depend on the event under investigation; for instance, ALDEN is a specific algorithmic score validated for assessment of drug causality in Stevens-Johnson syndrome/toxic epidermal necrolysis [56], whereas the Roussel Uclaf Causality Assessment Method (RUCAM) was implemented for drug-induced liver injury [57].
As a conclusive remark, it should be recognized that most researchers are from academia, and in fact, their additional role is university teaching. In the last few years, experts of medical teaching have strengthened the importance of PhV in the core curriculum of undergraduate students of healthcare courses (i.e., medicine, pharmacy, dentistry, nursing, etc.). WHO and the most active national PhV centers are committed to better define knowledge, skills, and attitudes that students should acquire in order to have an active role in pharmacovigilance [58].

Potential future applications: evidence integration and risk estimates
Integration of heterogeneous data (literature including mass media, clinical trials, observational studies, spontaneous reporting data analysis, case reports, and preclinical data) is currently in the research domain at the preliminary level, with the degree of confidence and reliance on a given source as key unresolved issues. An attempt to achieve a risk score on the pro-arrhythmic potential of drugs was undertaken within the ARITMO project [59], where a Dempster-Shafer model was used to combine evidence from heterogeneous and independent sources using expert judgment [60]. The only published experience on data integration in pharmacovigilance comes from the (useful) interplay between SRSs and healthcare databases to increase the accuracy of signal detection [61,62].
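To make the idea of combining independent sources more concrete, the following Python sketch implements Dempster's rule of combination on a two-element frame (risk versus no risk). The mass values and source names are invented for illustration, and the code is a generic textbook instance of the rule, not the model actually used in the ARITMO project.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozenset hypotheses to masses that sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

RISK = frozenset({"risk"})
NO_RISK = frozenset({"no_risk"})
UNKNOWN = RISK | NO_RISK  # mass left uncommitted by a source

# Hypothetical expert-assigned masses for two evidence sources.
m_spontaneous = {RISK: 0.6, UNKNOWN: 0.4}
m_trials = {RISK: 0.3, NO_RISK: 0.2, UNKNOWN: 0.5}

belief = dempster_combine(m_spontaneous, m_trials)
```

The uncommitted mass (`UNKNOWN`) is what allows a source to express limited confidence, which is precisely the unresolved issue of reliance on a given source mentioned above.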
In the following section, the issue of evidence integration for research purposes will be addressed in the context of systematic reviews, which are increasingly being used as they can make researchers and readers aware about what is known, how it is known, how evidence varies across studies, and thus about what is not already known [63].
Issues of data quality and inherent limitations have a remarkable impact on spontaneous reporting studies, in which additional sources of variability (e.g., missing data) and biases affecting the results (e.g., competition or notoriety bias) can be identified. Nevertheless, so far, no specific tools or techniques have been developed to select, compare, or pool together data from DAs. This could be due to a relative paucity of this kind of analysis in the medical literature.
Disproportionality is used to detect "signals of disproportionate reporting" (SDRs) that, once detected, are usually investigated through other, more precise study designs. It is thus rare to have additional DAs that address the same outcome for the same drug or drug class and that use a comparable tool for signal detection (frequentist vs. Bayesian approaches). Nevertheless, at least theoretically, the techniques and statistical basis used to perform meta-analysis could also be used to analyze results from disproportionality, at least to evaluate the consistency of a signal across different databases. A consistent signal found in two databases could probably be prioritized over inconsistent ones. Notably, raw data cannot be pooled because of the existence of an unquantified degree of redundancy (i.e., duplicates across databases), but results can be combined to reach a single "pharmacovigilance score" [59].
It is well known that results of DAs cannot be considered as measures of risk: the number of cases in a spontaneous reporting database corresponds neither to the number of cases that occurred under the drug nor to the number of cases induced by the drug, and the number of exposed people is not measured. From this point of view, including results of disproportionality in a meta-analysis could be considered inappropriate, although identification of heterogeneity in reporting may be of interest [64]. In the absence of any clear guideline, disproportionality studies could be searched and included in (qualitative) systematic reviews, but their results must be kept separate from pooled risk estimates of (quantitative) meta-analyses [65]. A recent experience by a French team on the safety of drugs acting on the nitric oxide pathway in pulmonary hypertension considered together results from a DA of VigiBase and from a meta-analysis of clinical trials and concluded that the safety profiles of riociguat and phosphodiesterase inhibitors were different, thus providing a rationale for safe prescribing [66]. This approach, like the integration of spontaneous reporting analysis in meta-analysis or teleoanalysis [67], is still a research question.
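As a minimal sketch of how consistency across databases might be examined, the Python function below borrows Cochran's Q and the I² statistic from meta-analysis and applies them to RORs reported with 95% confidence intervals. The function name and input format are assumptions made for this example, and the calculation treats the databases as independent, which the overlap of reports across databases may violate.

```python
import math

def consistency_across_databases(results):
    """Assess agreement of disproportionality results from several databases.

    results: list of (ror, ci_lower, ci_upper) tuples with 95% confidence
             limits (hypothetical input format).
    """
    log_rors, weights = [], []
    for ror, lower, upper in results:
        # Standard error of log(ROR) recovered from the 95% interval width.
        se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
        log_rors.append(math.log(ror))
        weights.append(1.0 / se ** 2)

    # Inverse-variance weighted mean, used only as the reference for Q.
    pooled = sum(w * y for w, y in zip(weights, log_rors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rors))
    df = len(results) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0

    return {
        "cochran_q": q,
        "i_squared": i_squared,
        # True when the lower confidence limit exceeds 1 in every database.
        "signal_in_all": all(lower > 1 for _, lower, _ in results),
    }
```

The weighted mean is computed here only to quantify heterogeneity; it is not returned, because a pooled ROR should not be read as a pooled risk estimate.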
Preliminary findings raise the hypothesis that, provided that all technical and clinical aspects are addressed, the performance of DAs is remarkable [7] and may approach the relative risks of analytical studies, thus providing an initial indication of the likely clinical importance of an adverse event [68].

Current concepts in study design
Once the research question has been identified, the researcher must keep in mind the various limitations and biases affecting SRSs to reduce the likelihood of detecting spurious signals. Moreover, clinical, pharmacological, and statistical considerations are needed to select the most appropriate dataset, definition of cases, exposure, and covariables for stratification/adjustment.
Although the discussion on performance, accuracy, and reliability of different approaches to perform DAs was fascinating a decade ago, at present there is still no recognized gold standard methodology, and the key factor that may influence results is represented by the threshold defined for the number of cases [69,70]. DAs in spontaneous reporting databases test whether an ADR is reported more frequently than expected; they allow the identification of the so-called SDRs [23,71]. These SDRs must be differentiated from safety signals because the existence of an SDR is not sufficient to constitute a safety signal (it does not always result in one, in fact), and a safety signal does not always imply a corresponding SDR [72].
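For readers less familiar with the frequentist measures mentioned earlier, the sketch below computes the ROR and PRR with 95% confidence intervals from a 2x2 contingency table of reports. The SDR criterion used here (at least three cases and a lower confidence limit of the ROR above 1) is one commonly used convention and is an assumption of this example, not a recommendation of this chapter.

```python
import math

def disproportionality(a, b, c, d, min_cases=3):
    """ROR and PRR from a 2x2 table of spontaneous reports.

    a: drug of interest, event of interest
    b: drug of interest, all other events
    c: all other drugs, event of interest
    d: all other drugs, all other events
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for these estimators")

    ror = (a * d) / (b * c)
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ror_ci = (math.exp(math.log(ror) - 1.96 * se_log_ror),
              math.exp(math.log(ror) + 1.96 * se_log_ror))

    prr = (a / (a + b)) / (c / (c + d))
    se_log_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    prr_ci = (math.exp(math.log(prr) - 1.96 * se_log_prr),
              math.exp(math.log(prr) + 1.96 * se_log_prr))

    # Illustrative SDR rule: case threshold plus lower ROR limit above 1.
    sdr = a >= min_cases and ror_ci[0] > 1.0
    return {"ror": ror, "ror_ci": ror_ci, "prr": prr, "prr_ci": prr_ci, "sdr": sdr}
```

Changing `min_cases` shows directly how the case threshold, rather than the choice between ROR and PRR, can decide whether a drug-event pair is flagged.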
As previously described, the various SRSs differ in terms of accessibility, catchment area, drug codification, and other technical issues. For instance, two key steps must be managed when analyzing the publicly available version of FAERS: drug mapping and removal of duplicates. These aspects have been extensively covered in the previous book chapter, and the reader should refer to this publication for details [73]. The FDA is continuously working to develop a probabilistic record-linkage algorithm combining structured and unstructured data (narratives) to improve the detection rate and accordingly reduce the occurrence of false positive signals [74].
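The duplicate-removal step can be sketched in two stages: keeping only the most recent version of each case, and then dropping records that are identical on a set of key clinical fields. The field names below are illustrative assumptions loosely modeled on a case identifier with versioning; this deterministic matching is far simpler than the probabilistic record-linkage approach mentioned above and will miss duplicates with discordant fields.

```python
def latest_versions(records):
    """Keep only the highest-numbered version of each case."""
    latest = {}
    for rec in records:
        current = latest.get(rec["caseid"])
        if current is None or rec["caseversion"] > current["caseversion"]:
            latest[rec["caseid"]] = rec
    return list(latest.values())

def drop_exact_duplicates(records,
                          key_fields=("age", "sex", "country", "event_date",
                                      "drugs", "events")):
    """Drop records that match an earlier record on all key fields."""
    seen = set()
    unique = []
    for rec in records:
        # Lists of drugs/events are compared as unordered sets.
        key = tuple(
            frozenset(rec[f]) if isinstance(rec[f], (list, set, tuple, frozenset))
            else rec[f]
            for f in key_fields
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def deduplicate(records):
    """Apply both stages in sequence."""
    return drop_exact_duplicates(latest_versions(records))
```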

Bias and strategies for their minimization
Before considering a potential causal relationship for a given identified SDR, the main biases that affect signal detection from spontaneous reporting must be eliminated or at least mitigated. Notably, even after accounting for major biases, clinical association cannot be inferred from SRSs, and channeling bias (selective prescription of newer drugs to patients with more severe disease [75]) is unlikely to be fully accounted for by statistical adjustments. The main biases are described below together with practical examples and relevant minimization strategies, as shown in Table 3.

Table 3
Bias | Example | Explanation | Minimization strategy
Indication bias | Angiotensin-converting enzyme (ACE) inhibitors showing a signal of hypoglycemia | These agents are largely used in diabetic patients |
Drug competition bias | Anticoagulants when analyzing drug-induced bleeding | Anticoagulants are expected to cause bleeding as a toxic effect of their drug class | Analysis excluding reports with anticoagulants
 | | ES is a typical ADR in FGA-treated patients | Analysis excluding ES to detect new safety signals for FGA
Notoriety bias | Rhabdomyolysis occurrence with statins after regulatory warnings | After that alert, the number of events rose | Studying the signal before the alert
Dilution bias | | |
Overall, we can identify: (A) indication bias, when a drug is found to be associated with a given event for the sole reason that it is indicated in patients with comorbidities that increase the risk of that event; (B) competition bias, also called "masking effect," when an event/drug more frequently reported for a given drug/event can "mask" the identification of other possible ADRs/drugs [42][76][77][78][79][80][81][82]; (C) notoriety bias, when media attention (e.g., regulatory warning and milestone publication) causes over-reporting of a peculiar ADR for specific drugs [37,38]; and (D) dilution bias, when a whole drug class is influenced by media attention for an event and older drugs with a larger number of reports are less likely to generate a safety signal than newer drugs (with fewer reports) [83].
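Several of these minimization strategies amount to restricting the set of reports before the disproportionality measure is calculated. The Python sketch below illustrates this under stated assumptions: the report structure (a dictionary with `drugs`, `events`, and `year`) and all function names are hypothetical, and the ROR uses a 0.5 continuity correction so that sparse tables do not cause division by zero.

```python
def ror(reports, drug, event):
    """Reporting odds ratio with a 0.5 continuity correction."""
    a = b = c = d = 0
    for r in reports:
        has_drug = drug in r["drugs"]
        has_event = event in r["events"]
        if has_drug and has_event:
            a += 1
        elif has_drug:
            b += 1
        elif has_event:
            c += 1
        else:
            d += 1
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

def exclude_competitors(reports, competitors):
    """Competition bias: drop reports listing drugs known to cause the event."""
    blocked = set(competitors)
    return [r for r in reports if not (set(r["drugs"]) & blocked)]

def before_alert(reports, alert_year):
    """Notoriety bias: keep only reports received before the alert."""
    return [r for r in reports if r["year"] < alert_year]

def restrict_to_class(reports, drug_class):
    """Indication bias: keep only reports listing a drug of the same class."""
    members = set(drug_class)
    return [r for r in reports if set(r["drugs"]) & members]
```

For example, `ror(exclude_competitors(reports, {"warfarin"}), "drug_x", "bleeding")` recomputes the measure after removing reports that list a hypothetical competitor, which may unmask a signal hidden in the full dataset.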
The Weber effect is an additional factor that may influence the reporting of given drugs, although it cannot be formally considered as a source of bias [84]. It was originally described as a higher reporting especially during the first 2 years after marketing approval, thus suggesting novelty per se as a risk factor for notification, although modern adverse event reporting systems seem less affected by this bias [43].

Unsettled issues
Patient reporting: current status
The 2012 PhV legislation forced national competent authorities and marketing authorization holders to record and report cases of suspected adverse reactions reported by patients [3]. This, in turn, caused a remarkable increase in the total number of patient reports (+113%) after 3 years, with the Netherlands, the UK, Germany, France, and Italy accounting for 75% of all patient reports [85]. The relevance of patient reports is heterogeneous, and a recent survey on 141 countries worldwide showed that in one-fourth of them, patients were not allowed to report. Conversely, countries receiving the highest percentage of patient reports in 2014 were the USA (64%) and Canada (30%).
More than 70 countries had fewer than 50 reports from patients [86]. The quality and the value of patient reports in the context of signal detection were evaluated in many published studies [87][88][89][90][91]. The value of the reports as a signal is directly dependent on the amount of clinically relevant information, in addition to the fact that an ADR report requires a thorough examination of the potential drug-event association. Most of the published studies comparing information reported by patients and healthcare professionals focused on the completeness of information [86,92].
Patient reports give detailed descriptions of suspected ADRs, attribute reactions to specific medicines, and provide information useful for assessing causality. Patient reports often have richer narratives than those of healthcare professionals, including detailed information about the impact of the suspected ADR on the patient's life [91].
Many studies, mainly from the UK and the Netherlands, showed that patient reports allow for the identification of new ADRs and lead to the strengthening of signal detection activities [90,93,94].
In summary, patient reporting offers a different perspective in drug safety assessment and may potentially contribute to signal detection. However, it is important to further investigate its actual role in drug safety assessment; in fact, the large number of reports without a clear causal relationship (recently called "precautionary reports") may alter the adverse event profile by masking safety signals or, conversely, creating spurious associations [95].

Ethical and transparency issues
The relevance of patient reporting highlights the need of public access to spontaneous reporting data, and many countries now provide public access to SRSs, with the possibility to have summary presentations for reactions associated to each single drug in the database or a case listing of limited information for each single case report. Both EMA and WHO Uppsala Monitoring Centre (UMC) developed web tools to access a limited set of spontaneous reporting data in their database, EudraVigilance (adrreports.eu) and VigiBase (vigiaccess.org).
The EMA policy includes the possibility for academia or nonprofit organizations to ask for greater access to data as aggregated data outputs or line listings based on core data elements (http://www.ema.europa.eu/ema/index.jsp?curl=pages/regulation/general/general_content_000674.jsp). However, it has been commented that the EMA's approach to transparency over PhV data is too timid. Public access to PhV data is even more restricted for vaccines, mainly due to the potential negative impact of this access on vaccination campaigns. The reporting of serious adverse events not causally related to vaccination could lead to a misrepresentation of vaccine risks that could be used by the antivaccine movement. To our knowledge, very few European countries (e.g., Italy and the Netherlands) give public access to spontaneous data related to vaccines.
A different approach to transparency is followed by UMC and FDA. In VigiBase, custom search service provided by UMC is performed upon request. Any stakeholders can use the custom search services to request a limited set of data for specific studies or projects for a fee.
The best level of transparency is observed for FDA data. Data for both drugs (FAERS) and vaccines (VAERS) can be obtained using web-based search tools that return structured and/or unstructured data. Moreover, the entire database can be downloaded quarterly in comma-separated value (CSV) or other formats. This access requires technical skills to properly process the relational database files and any unstructured fields. However, it gives any user the possibility to analyze FDA spontaneous reporting data, even by applying DAs [5]. Since June 2014, the FDA has offered an innovative platform called openFDA (openfda.gov) to facilitate access to and use of large, important FDA public datasets by developers, researchers, and the public through harmonization of data across disparate FDA datasets provided via application programming interfaces (APIs) [96]. Recently, the FDA has also launched the FAERS Public Dashboard, a highly interactive web-based tool that allows users to query FAERS data in a user-friendly fashion (https://fis.fda.gov/sense/app/777e9f4d-0cf8-448e-8068-f564c31baa25/sheet/7a47a261-d58b-4203-a8aa-6d3021737452/state/analysis). These different approaches to public access to spontaneous reporting data lead to a bizarre situation: because the reports included in EudraVigilance, VigiBase, and FAERS largely overlap, it is possible to obtain different information for the same report.
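As an indication of what API-based access looks like in practice, the sketch below only builds a query URL for the openFDA drug adverse event endpoint, asking for the most frequently reported reaction terms for a given medicinal product. The endpoint and field names reflect our reading of the openFDA documentation and should be verified against it before use; the helper name is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as documented by openFDA (to be verified before use).
OPENFDA_EVENT_ENDPOINT = "https://api.fda.gov/drug/event.json"

def build_reaction_count_url(drug_name, limit=10):
    """Build a URL counting reported reaction terms for one drug name."""
    params = {
        # Restrict to reports listing the product name (assumed field name).
        "search": f'patient.drug.medicinalproduct:"{drug_name}"',
        # Count reports by exact MedDRA preferred term (assumed field name).
        "count": "patient.reaction.reactionmeddrapt.exact",
        "limit": limit,
    }
    return f"{OPENFDA_EVENT_ENDPOINT}?{urlencode(params)}"

url = build_reaction_count_url("haloperidol")
```

The counts returned by such a query are numbers of reports, subject to all the limitations discussed in this chapter (duplicates, drug name mapping, lack of exposure data), and would still need the cleaning steps described earlier before any DA.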

Social media: opportunities and challenges
An area of emerging interest for research is represented by the use of information provided by patients in social media on personal experiences when using a given drug. At present, it is under investigation whether or not (and how) social media data mining can contribute to signal detection [94,95]. A recent review summarizes prevalence, frequency, and comparative value of information on adverse events of healthcare interventions from user comments and videos in social media. The study assessed over 174 social media sites, with discussion forums (71%) being the most popular. The overall prevalence of adverse event reports in social media varied from 0.2 to 8% of posts. Moreover, there was general agreement on overall concordance between adverse events mentioned in social media and those already documented in other sources (such as drug labels and published trials) [97].
The web-recognizing adverse drug reaction (Web-RADR) project, led by EMA and funded within the Innovative Medicines Initiative (IMI), aims to recommend policies, frameworks, tools, and methodologies in the use of social media and mobile technology to improve drug safety [98]. Specific objectives are as follows: (a) to develop mobile application prototypes to support adverse drug reaction reporting and the provision of drug safety information to application users and (b) to assess the usefulness of social media data for PhV and more specifically in signal detection activities.
The theoretical advantages of social media in the context of signal detection rely on potential earlier identification of rare and serious drug-related problems, in comparison with conventional SRSs, considering the opportunity to share information as fast as possible and the large number of active users in social media. It has been reported that patient reports of suspected adverse reactions, particularly for specific reactions, can precede those of healthcare professionals [99]. One study of social media posts containing discussions of adverse drug events ("Proto-AEs") found nearly three times as many Proto-AEs in Twitter data as were reported to the FDA by consumers, with a rank correlation between the two sources in the distribution of reactions at the MedDRA System Organ Class (SOC) level [100].
Another important value from social media analyses comes from extracting qualitative insights into the actual discussions made by patients around a drug and an adverse event. This can be of great value for addressing issues related to the patient experience around an ADR and its impact on the quality of life [101]. Moreover, mining data from social media gives us a greater chance of capturing ADRs that a patient would not necessarily complain about to their doctor or nurse and can also help assessment of the risk perceptions of patients.
A key challenge is the identification of drugs and ADRs in text strings through natural language processing (NLP), a set of techniques that often relies on machine learning. From the perspective of PhV and NLP specifically, user posts on social media contain colloquial language and misspellings. These features are a particular problem for lexicon-based approaches, because they reduce the accuracy of direct matches. Colloquial and informal language is more difficult to parse, and recent research has therefore focused on developing NLP tools specifically for social media data [102,103]. The balance between sensitivity and specificity of these tools in identifying ADRs is a key issue, because a high number of false positives could heavily impair the effectiveness of signal detection activities.
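The effect of misspellings on lexicon-based matching can be illustrated with a minimal Python sketch. It is not one of the tools cited above: the four-term lexicon, the function name find_adr_mentions, and the similarity cutoff are hypothetical choices made only for illustration, and the approximate matching relies on the standard-library difflib module rather than a dedicated NLP pipeline.

```python
import difflib
import re

# Hypothetical mini-lexicon mapping lay terms to a preferred ADR term.
# A real system would draw on a curated vocabulary (an assumption here).
ADR_LEXICON = {
    "headache": "Headache",
    "nausea": "Nausea",
    "dizziness": "Dizziness",
    "insomnia": "Insomnia",
}


def find_adr_mentions(post, cutoff=0.75):
    """Return the sorted preferred terms matched in a post, tolerating misspellings."""
    tokens = re.findall(r"[a-z]+", post.lower())
    found = set()
    for token in tokens:
        # get_close_matches keeps lexicon keys whose similarity ratio
        # to the token is at least `cutoff` (ratio ranges from 0 to 1).
        matches = difflib.get_close_matches(
            token, list(ADR_LEXICON), n=1, cutoff=cutoff
        )
        if matches:
            found.add(ADR_LEXICON[matches[0]])
    return sorted(found)


mentions = find_adr_mentions("this drug gave me a terrible headake and nausia")
```

With an exact-match lookup neither misspelled term would be found. Lowering the cutoff recovers more misspellings but also admits more spurious matches, which is the sensitivity versus specificity trade-off discussed above.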
Another key element is the quality of the information on adverse events reported in social media, which has been analyzed in only a few studies. A study evaluating Internet narratives posted by patients showed that the information provided was very incomplete, which makes their assessment and use for PhV purposes difficult [104].
Concerning the potential of social media analyses for earlier signal detection, contrasting data have been published [105,106].
Social media data mining uses, for PhV purposes, information that patients did not primarily share for this purpose. This raises a number of still unresolved ethical questions, especially about the identification of individuals through additional information such as the geocoded location of a post, the username, and other potentially personally identifiable information [107]. How would patients using social media react when approached for additional information by organizations that collect PhV data? Since this is a new area, ethically sound policy guidance needs to be developed.
A different approach to the use of Internet data for signal detection relies on anonymized logs of web searches [108]. In a recent study, a web-based search query method called "query log reaction score" was developed to detect whether adverse events associated with certain drugs could be found in search engine query data. Compared with reference signal detection algorithms, the web query method had moderate sensitivity (80%) in detecting signals, but it generated many false positives and therefore had low specificity [109].
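The way sensitivity and specificity are derived when a screening method is compared with a reference standard can be sketched in Python as follows. The function name screening_metrics and the ten drug-event pairs are hypothetical and are not taken from the cited study; they are chosen only so that the output shows the same pattern of moderate sensitivity with low specificity.

```python
def screening_metrics(flagged, reference_signals, evaluated):
    """Compare flagged drug-event pairs with a reference standard.

    All three arguments are sets of (drug, event) pairs; `evaluated` is the
    full set of pairs that were screened.
    """
    tp = len(flagged & reference_signals)              # true positives
    fp = len(flagged - reference_signals)              # false positives
    fn = len(reference_signals - flagged)              # false negatives
    tn = len(evaluated - flagged - reference_signals)  # true negatives
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
    }


# Hypothetical example: 10 screened pairs, 5 of which are reference signals.
evaluated = {("drug", f"event{i}") for i in range(10)}
reference = {("drug", f"event{i}") for i in range(5)}
flagged = {("drug", f"event{i}") for i in range(1, 9)}  # events 1 to 8

metrics = screening_metrics(flagged, reference, evaluated)
```

In this toy example the method flags four of the five reference signals (sensitivity 0.8) but also four of the five non-signals (specificity 0.2), so only half of the flagged pairs are true signals.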

Future perspectives
The continuously increasing number of spontaneous reports, together with the improving quality of their systematic archiving and access, compels the scientific community to improve the methods used to analyze and interpret them for regulatory, clinical, and research purposes.
A specific debated issue on the current role of data-mining procedures of SRSs regards the possibility to directly compare drugs within the same therapeutic class [110]. We are in favor of this approach and strongly encourage further research regarding the use of SRSs, under stringently defined conditions, to compare adverse event rates for drugs [111]. To this aim, all the following criteria must be fulfilled:
1. Same therapeutic indication(s). The effect of the underlying disease may be reduced by restricting DAs to drugs within the same therapeutic area [29,30].
2. Similar market penetration and utilization. Drug consumption/prescription should be considered in order to: (i) complement DAs by highlighting possible risk differences through reporting rates (especially for vaccines and DMEs) [112]; (ii) weigh the drug risk at the population level (and assess the public health impact of ADRs); and (iii) prioritize safety signals emerging from traditional DAs [113].
3. Similar time on the market. This aspect should be carefully considered in the analyses to avoid the temporal or time-point bias, especially when comparing first-versus second-generation drugs. Standardization of the time on the market using the same fixed-length post-approval time-frame has been proposed [110].
4. Data distortions are unlikely to occur or apply in a similar manner across the drugs under investigation. Stratification (for age and sex) or adjustment should always be considered to minimize the presence of known confounders. Moreover, the existence of specific biases should be verified and accounted for.
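As an illustration of a within-class comparison with stratification, the following Python sketch computes a reporting odds ratio (ROR), one commonly used disproportionality measure, and a Mantel-Haenszel pooled estimate across strata such as age and sex. It is a sketch under stated assumptions rather than the method of the cited studies: the function names and all counts are hypothetical, the confidence interval uses the usual normal approximation on the log scale, and the comparator is assumed to be the other drugs of the same therapeutic class.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR with an approximate 95% confidence interval from a 2x2 table.

    a: reports of the event of interest for the drug of interest
    b: reports of all other events for the drug of interest
    c: reports of the event of interest for the comparator drugs
    d: reports of all other events for the comparator drugs
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


def mantel_haenszel_ror(strata):
    """Pooled ROR over strata, each given as an (a, b, c, d) tuple."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts, for illustration only.
ror, lower, upper = reporting_odds_ratio(20, 80, 10, 90)
pooled = mantel_haenszel_ror([(10, 40, 5, 45), (10, 40, 5, 45)])
```

In this example the crude ROR is 2.25, but the lower confidence limit falls just below 1, so a criterion that requires the lower limit to exceed 1 (a common convention, assumed here) would not flag a disproportionality signal. Such estimates describe reporting patterns only and do not measure risk.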
An emerging application of SRSs, in the era of Big Data, is their integration with other heterogeneous sources of healthcare data (e.g., prescription data, hospital admission and discharge records, population-based, disease-based, and death registries, social media, and the literature) to support proactive PhV in the risk-benefit assessment, as performed in the ARITMO project through the Dempster-Shafer approach [59].
Finally, the question arises as to whether all disproportionality studies should be published in scientific journals. Supporters of scientific transparency and full release of datasets via Open Science would undoubtedly call for public availability of study results, including negative findings. A proposal was recently formulated [114].
This controversy on the quality of DAs raises the question of how best to assess it and reach consensus on a "set of minimum requirements to assess the quality of DAs in terms of study conception, performing and reporting." Provisional criteria have been recently proposed (from the experience of antidiabetic drugs) [114], but further discussion is warranted:
• Clear title. Avoid general terms such as "pharmacovigilance analysis." Prefer the following terms: "disproportionality analysis," "analysis of spontaneous reporting system," and "analysis of spontaneous reports."
• Scientifically sound study conception. The scientific rationale must be clearly indicated and fall within one of the aforementioned categories (DAs are particularly suited for DMEs). Analyses following a regulatory approach (i.e., identification of a potential signal during routine monitoring of spontaneous reporting systems) and analyses commissioned for regulatory purposes should not be formally eligible for publication in a journal, unless an added value emerges (e.g., the analysis is extended to the entire pharmacological class).
• Transparent study design. The unit of analysis should be described. The definitions of case(s) and exposure (reference group) should be specified. The search strategy must be stated, and the rationale behind the choice clearly described. Key confounders to be accounted for must be identified a priori. Strategies to handle these biases must be indicated, including stratified or adjusted analyses. Notoriety must be carefully assessed: a structured literature evaluation is recommended, instead of a mere check of the summary of product characteristics.
• Balanced discussion and conclusion. Prefer the terms "disproportionality signal" and "signal of disproportionate reporting," and avoid terms such as "alarm signal," "signal of risk," "increased risk," "association," and "incidence." Compare the results with those of similar studies (identified through the structured literature evaluation). Limitations should be provided in a dedicated section, avoiding a mere listing of known biases affecting spontaneous reporting systems. Avoid specific recommendations (decision-making approach) to support drug prescription or the selection of drugs claimed to be safer.
From a technical standpoint, good signal detection practices have been published by the Innovative Medicines Initiative Pharmacoepidemiological Research on Outcomes of Therapeutics by a European ConsorTium (PROTECT) project, which formulated 39 recommendations for those working in the PhV community [115].
A final issue regards the timeliness of publishing DAs in relation to signal detection. For instance, the analysis by Elashoff et al. [16] on pancreatitis reports with incretin-based drugs, apart from methodological flaws and data misinterpretation causing unjustified alarm, was also untimely, considering that observational studies had already been carried out. Conversely, liver injury with direct-acting oral anticoagulants (DOACs) was studied because of the limited predictivity of premarketing phases in detecting clinical signals of liver toxicity and because of previous concern with ximelagatran: the disproportionality signal raised for rivaroxaban in FAERS [116] was tested by recent US population-based studies, which found lower hospitalization rates for liver injury in DOAC initiators than in patients starting warfarin, with rivaroxaban and dabigatran associated with the highest and lowest risk, respectively [117,118], although confounders are likely to exist [119,120]. This case underscores the value of performing well-conducted DAs and the importance of directing subsequent analytical research to confirm or refute the drug-related hypothesis.
All these unsettled issues attest to the need for, and the importance of, further research to finally clarify the role of DAs in clinical practice.

Concluding remarks
Regulators and especially clinicians are appreciating the importance and the role of DAs to monitor and assess the safety profile of marketed drugs. All "actors" dealing with SRSs must always be aware of the so-called seduction bias and self-deception bias (i.e., over-reliance on mathematical models and the subconscious confidence in expecting a given output from results), and must therefore keep in mind the inherent limitations that, at present, do not allow actual risk in clinical practice to be assessed, mainly because of the lack of certainty in the occurrence of adverse events and the lack of exposure data [121].
From a research perspective, there is an urgent need to raise the bar, aiming to increase the accuracy and reproducibility (in a word, the quality) of this kind of study. There is room for improvement in several aspects of the analysis of SRSs, including relevant implications and their appropriate use, such as the aspect of "no findings" (i.e., findings of nondisproportional results), which has not received sufficient attention so far. Moreover, different research teams are implementing sophisticated methods to account for confounders in signal detection, so that DAs may approach relative risk. In the meantime, we propose to include disproportionality studies in (qualitative) systematic reviews, keeping their results separate from the pooled risk estimates of (quantitative) meta-analyses [63].
In conclusion, SRSs represent an invaluable source to monitor and assess the safety of medications, including drugs, vaccines, and healthcare products.
We call for a responsible use and publication of DAs, which should be regulated through a consensus approach among experts; this would finally establish the use and transferability of DAs in clinical practice.