To conform to FAIR principles, data should be findable, accessible, interoperable, and reusable. Whereas tools exist for making data findable and accessible, interoperability is not straightforward and can limit data reusability. Most interoperability-based solutions address semantic description and metadata linkage, but these alone are not sufficient for the requirements of inter-comparison of population-based cancer data, where strict adherence to data-rules is of paramount importance. Ontologies, and more importantly their formalism in description logics, can play a key role in the automation of data-harmonization processes predominantly via the formalization of the data validation rules within the data-domain model. This in turn leads to a potential quality metric allowing users or agents to determine the limitations in the interpretation and comparability of the data. An approach is described for cancer-registry data with practical examples of how the validation rules can be modeled with description logic. Conformance of data to the rules can be quantified to provide metrics for several quality dimensions. Integrating these with metrics derived for other quality dimensions using tools such as data-shape languages and data-completion tests builds up a data-quality context to serve as an additional component in the FAIR digital object to support interoperability in the wider sense.
Part of the book: Cancer Bioinformatics