This chapter explores the concept of open data with a focus on Open Government Data (OGD). It presents an overview of the development and practice of OGD at the international level and discusses its advantages and benefits. The scope and characteristics of OGD, together with the perceived risks, obstacles, and challenges, are also presented. The chapter closes with a look at the future of open data, and of open government data in particular. The chapter is based on a review of the literature, which served as both the method and the tool of data collection.
- open data
- open government data
- OGD development
- OGD principles
- OGD practice
- OGD barriers
- open data portals
The concept of Open Government Data (OGD) has been heavily debated during the last few years. It has drawn much interest and attention among researchers and government officials worldwide. Many of the developed and developing countries have launched open data initiatives with a view to harnessing the benefits and advantages of open government data. This chapter is dedicated to highlighting the various aspects of open data and open government data.
According to the Open Definition, “Open” in the context of data and content “means anyone can freely access, use, modify, and share for any purpose”. There are many types of data that can be open and used or re-used by the public. These include data relating to culture, science and research, finance, statistics, weather, and environment [1, 2].
The Open Knowledge Foundation outlined key features of openness as the following:
Availability and access: the data must be available as a whole, in a convenient and modifiable form and at a reasonable reproduction cost, preferably by downloading over the internet.
Reuse and redistribution: the data must be provided under terms permitting reuse and redistribution, with the capability of mixing it with other datasets. This data must be machine-readable.
Universal participation: the data should be available for everyone to use, reuse and redistribute without discrimination against fields of knowledge, or against persons or groups.
Open data has further features: data should be primary and timely, and it must be available in non-proprietary formats and free to use under an unrestricted license. Data should also be as accurate as possible. Although most data will not meet all of these criteria, data is only truly open if it meets most of them.
The term open data first appeared in 1995, in a document written by an American agency about the disclosure of geographical and environmental data. The scholarly community understood the benefits of open and shareable data long before open data became a technical object or a political movement.
The Scholarly Publishing and Academic Resources Coalition (SPARC) defined open data from a research perspective as: “Open Data is research data that is freely available on the internet permitting any user to download, copy, analyze, re-process, pass to software or use for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself”. SPARC stressed that open data accelerates the pace of discovery, grows the economy, helps ensure that people do not miss breakthroughs, and improves the integrity of the scientific and scholarly record.
The current concept of open data, and particularly of open government data (OGD), started to become visible and popular in 2009, when a number of governments in the developed world, such as those of the USA, the UK, and New Zealand, announced new initiatives to open up their public information records. These initiatives were triggered by the mandate for transparency and open government issued by the administration of then US President Barack Obama, which kick-started the Open Government Data movement [6, 7].
2. Development of the open government data concept
Open government data (OGD) is government-related data that is made open to the public. Government data usually comprises various datasets, such as budget and finance, population, census, geographical data, and parliamentary minutes. It also includes data collected by public organizations or agencies, such as data related to climate or pollution, public transportation, traffic, child care, or education.
Open government data has been associated with open government, which is perceived as a phenomenon encompassing a number of characteristics and dimensions, such as information availability, transparency, participation, collaboration, and information technologies. The concept of open government data can be traced back to 1966, when the US federal government passed the Freedom of Information Act (FOIA). The arrival of the internet and of new information and telecommunications technologies contributed to the more recent interest in, and understanding of, the value and benefits of government information for transparency, collaboration, and innovation. Two significant subsequent developments advanced open government data: the launch of data.gov in the USA in May 2009 and of data.gov.uk in the United Kingdom (UK) in January 2010. The practice subsequently spread to many other countries around the world, as well as to international organizations, including the World Bank and the Organization for Economic Co-operation and Development (OECD). Concurrent advances in information and telecommunications technologies also played a role in the development of open government data, coupled with the passing of open standard laws by many countries, such as Canada, the USA, Germany, and New Zealand, and the setting of open data policies focused on indexing government data holdings [13, 14].
In 2015 a number of governments, civil society members, and international experts convened to agree on a set of international norms for how governments and other public sector organizations should publish data. They formulated a set of principles called the Open Data Charter, which they introduced with the following statement:
“We, the adherents to the International Open Data Charter, recognize that governments and other public sector organizations hold vast amounts of data that may be of interest to citizens, and that this data is an underused resource. Opening up government data can encourage the building of more interconnected societies that better meet the needs of our citizens and allow innovation, justice, transparency, and prosperity to flourish, all while ensuring civic participation in public decisions and accountability for governments…”.
The conveners agreed to adhere to the following set of principles concerning access to and release of government and public sector data. Such data should be:
Open by Default;
Timely and Comprehensive;
Accessible and Usable;
Comparable and Interoperable;
For Improved Governance and Citizen Engagement;
For Inclusive Development and Innovation.
Open government data, which is made available with no restrictions on its use, reuse, or distribution, covers all data funded by public money, excluding private, security-sensitive, and confidential data.
3. Open government data practice
The Open Data Barometer is an international benchmark of how governments use open data publishing for accountability, innovation, and social impact. In 2018 it ranked 30 leading world countries, excluding the EU countries, according to their performance and commitment to the principles of the open data movement. It measured the progress these 30 governments had made against three essential ingredients for good open data governance, defined as part of the Open Data Charter update process: Open by Default, Data Infrastructure, and Publishing with Purpose. In other words, the Barometer ranked governments according to three criteria: readiness for open data initiatives, implementation of open data programs, and the impact that open data is having on business, politics, and civil society. The top ten countries, in rank order, were Canada, the UK, Australia, France, South Korea, Mexico, Japan, New Zealand, the USA, and Germany.
For Europe, the open data maturity assessment reported on open data maturity in European countries for the year 2020. It provided insight into developments in the open data field in the 27 EU Member States, the participating European Free Trade Association (EFTA) countries (Liechtenstein, Norway, and Switzerland), the Eastern Partnership countries (Azerbaijan, Georgia, Moldova, and Ukraine), and the United Kingdom. The assessment measured open data maturity along four dimensions: policy, impact, portal, and quality. The scores on these dimensions were combined into an overall open data maturity score for each country. The countries were then clustered into four groups, from the most mature to the least. Seven countries were labeled trend-setters on the basis of their performance: Denmark, Spain, France, Ireland, Estonia, Poland, and Austria.
Public institutions are among the largest creators and collectors of data in many different fields or categories. These data categories include areas such as transportation, traffic, finance, environment, economy, government, weather, geographical information, tourist information, statistics, business, public sector budgeting, performance levels, and science and technology. Data about policies and inspection in fields such as education quality, safety, and food is also included. In addition, international OGD sites have specific characteristics and data patterns in terms of their OGD levels, data formats, and datasets. The top data formats used are CSV, PDF, RDF, JSON, and XML. Table 1 provides the definitions and examples of these file types. However, there are clear variations among world regions in terms of the number of data formats, datasets, and data categories [7, 23].
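To give a rough sense of what these machine-readable formats mean for a re-user, the following Python sketch reads the same small product list from CSV, JSON, and XML using only the standard library. The sample records, field names, and helper functions are invented for illustration and are not taken from any real portal.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two records in three formats.
CSV_TEXT = "product,size,price\nShirt,Large,15\nTrousers,Medium,35\n"
JSON_TEXT = (
    '[{"product": "Shirt", "size": "Large", "price": "15"},'
    ' {"product": "Trousers", "size": "Medium", "price": "35"}]'
)
XML_TEXT = (
    "<items>"
    "<item><product>Shirt</product><size>Large</size><price>15</price></item>"
    "<item><product>Trousers</product><size>Medium</size><price>35</price></item>"
    "</items>"
)


def from_csv(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_json(text):
    """Parse a JSON array of objects into a list of dictionaries."""
    return json.loads(text)


def from_xml(text):
    """Parse <item> elements into dictionaries keyed by child tag name."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in item} for item in root.findall("item")]


records = from_csv(CSV_TEXT)
# All three formats carry the same information once parsed.
assert records == from_json(JSON_TEXT) == from_xml(XML_TEXT)
```

Once parsed, the three formats yield identical records, which is what makes them easier to reuse than a PDF of the same table.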
Boosting democratic control and political participation, fostering service and product innovation, and enhancing law enforcement are three primary motivations for publishing government data. A comparison of the open data strategies of five countries, namely Australia, Denmark, Spain, the United Kingdom, and the United States, found that the focus of the strategic plans differs. For example, the United States government focused on transparency for the purpose of increasing public engagement, Denmark emphasized the potential that open data offers for the introduction of new products and services, and the United Kingdom focused on the use of open data for strengthening law enforcement.
Citizens use four types of OGD applications to engage with their governments for the purpose of open government. The first type uses access to government information to weed out corruption in government. The second type focuses on the direct benefit to the public of access to legal materials, such as access to the law itself. The third type informs policy decisions by helping citizens to better understand their own communities. The fourth type consists of consumer products that bring open government to a wide consumer audience.
Open government data can lead to a more effective and efficient government, particularly in its relations with citizens. This can be achieved by increasing transparency and accountability; developing trust, credibility, and reputation; promoting progress and innovation; encouraging public education and community engagement; and storing and preserving information over time. Open data can therefore lead to open government, which is defined as: “… a multilateral, political, and social process, which includes in particular transparent, collaborative, and participatory action by government and administration. To meet these conditions, citizens and social groups should be integrated into political processes with the support of modern information and communication technologies, which together should improve the effectiveness and efficiency of governmental and administrative action”.
3.1 Portals and the publication of OGD
According to the principles of OGD, data must be complete, primary, timely, accessible, and machine-readable. It should also be non-discriminatory, non-proprietary, and license-free. Furthermore, public institutions should publish all the data they hold, provided that doing so does not violate security, privacy, or other legitimate restrictions.
The World Wide Web Consortium (W3C) outlined three steps for publishing open data, which will help the public to easily find, use, cite and understand the data:
Step 1: Publishing the data in its raw form. The data should be well-structured to enable its use in an automated manner by the users of the data. Data may be in XML, RDF or CSV formats. Formats used should allow the data to be seen as well as extracted by the users.
Step 2: Creating an online catalog of the raw data, complete with documentation, to enable users to discover published data.
Step 3: Making the data human-readable as well as machine-readable.
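The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the air-quality rows, the catalog field names, and the `data.example.org` address are all hypothetical, and a real publisher would follow the metadata vocabulary required by its own portal.

```python
import csv
import html
import io
import json

# Hypothetical raw dataset (invented values for illustration only).
ROWS = [
    {"station": "A1", "date": "2021-01-01", "pm10": "18"},
    {"station": "B2", "date": "2021-01-01", "pm10": "25"},
]


def publish_raw_csv(rows):
    """Step 1: serialize the raw rows as well-structured CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def catalog_entry(title, description, url, rows):
    """Step 2: describe the dataset so that users can discover it."""
    return {
        "title": title,
        "description": description,
        "format": "CSV",
        "download_url": url,  # hypothetical location
        "columns": list(rows[0]),
        "row_count": len(rows),
    }


def render_html(rows):
    """Step 3: offer a human-readable view of the same data."""
    columns = list(rows[0])
    head = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(r[c])}</td>" for c in columns) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


raw_csv = publish_raw_csv(ROWS)
entry = catalog_entry(
    "Air quality readings",
    "Daily PM10 readings per station.",
    "https://data.example.org/air-quality.csv",
    ROWS,
)
page = render_html(ROWS)
print(json.dumps(entry, indent=2))
```

Keeping the raw CSV as the single source and deriving both the catalog entry and the HTML view from it means that the human-readable and machine-readable versions cannot drift apart.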
Open data portals are a very important component of open data infrastructure. They connect data publishers with data users, enabling the former to deliver open data and to establish the relationships necessary for increasing transparency. Open data portals, which are essentially data management software, contain metadata about datasets so that users can access and utilize these datasets. A portal includes tools that help users to find and harvest all relevant data from public sector databases. From the users’ perspective, the features of a portal can be used to specify the datasets they need and to request datasets. Open data portals thus act as the interface between government data and the citizens who use or reuse this data. Consequently, a portal should have user-friendly features, such as a clean look and a search facility. It should also state, clearly and in simple language, which authority is responsible for hosting it. The portal’s contents should be organized into categories and subcategories. Beyond its basic function of making data available to stakeholders, it should also aim to engage citizens’ ideas and feedback. Data quality and standards, together with language settings, are very important elements if portals are to satisfy their users’ needs.
The World Wide Web Consortium’s (W3C) benchmark for publishing open government data and the World Bank’s technical option guide outlined the necessary technical requirements for establishing efficient and modern OGD data centers. These requirements include, among other things, that:
Public datasets should be published in their raw state rather than in an analyzed form,
Each dataset is accompanied by well-documented metadata, and
Data is stored in multiple formats, both human-readable and machine-readable, such as CSV, XML, PDF, RDF, and JSON, to enable users to access published data easily. Documents are expected to be stored in PDF, doc(x), or Excel formats, and geographical data in Keyhole Markup Language (KML) or an equivalent alternative [28, 31].
As for OGD portals’ content and functionality requirements, these include the following:
A number of datasets.
Links to external websites.
A number of data categories, such as data about education, weather, or budget.
Data currency. Data should be current and up to date.
Availability of metadata. Datasets should come with the requisite information that adequately describes the data.
Data search. A ‘search box’ feature should be available to allow users to easily locate specific information by entering a search term.
Availability of working social media plugins. This feature enables data users to share their experiences and suggest new datasets through comments on social media websites such as Facebook and Twitter.
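Several of these requirements (metadata availability, data currency, and data search) can be checked mechanically. The sketch below shows one possible way to do so in Python; the set of required metadata fields, the one-year currency threshold, and the catalog records are all assumptions made for illustration rather than values prescribed by any standard.

```python
from datetime import date

# Assumed minimum metadata fields; real portals define their own schema.
REQUIRED_METADATA = ("title", "description", "publisher", "license")

# Hypothetical catalog records (invented for illustration).
CATALOG = [
    {"title": "School enrolment 2020", "description": "Enrolment by district.",
     "publisher": "Ministry of Education", "license": "CC-BY-4.0",
     "category": "education", "last_updated": date(2021, 3, 1)},
    {"title": "Rainfall archive", "description": "",
     "publisher": "Weather Service", "license": "CC0-1.0",
     "category": "weather", "last_updated": date(2015, 6, 30)},
]


def missing_metadata(record):
    """Return the required metadata fields that are absent or empty."""
    return [field for field in REQUIRED_METADATA if not record.get(field)]


def is_current(record, today, max_age_days=365):
    """Treat a dataset as current if it was updated within max_age_days."""
    return (today - record["last_updated"]).days <= max_age_days


def search(catalog, term):
    """Case-insensitive keyword search over titles and descriptions."""
    term = term.lower()
    return [
        r for r in catalog
        if term in r["title"].lower() or term in r["description"].lower()
    ]


today = date(2021, 6, 1)  # fixed reference date so the example is reproducible
for record in CATALOG:
    print(record["title"], missing_metadata(record), is_current(record, today))
```

A portal operator could run checks of this kind periodically to flag datasets that have gone stale or that lack the descriptions users need to find them.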
There are a number of additional requirements that contribute to making portals achieve sustainability, meet user needs and have an added value impact. These requirements are the following:
Organize datasets for use and not only for the sake of publication;
Learn from the techniques used by recent commercial data markets, share knowledge to promote data use, and adapt methods that are common in the open source software community;
Invest in best practices related to discoverability;
Enhance reuse by publishing high-quality metadata;
Ensure interoperability by adopting standards;
Engage with more users and re-users by co-locating tools;
Enhance value by linking datasets;
Be accessible by offering both options for big data, such as Application Programming Interfaces (APIs), and options for more manual processing, such as comma-separated value files, thus ensuring that a wide range of user needs are met;
Co-locate documentation to make it easy for non-expert users to understand the data;
Be measurable, in order to assess how well they are meeting users’ needs.
A number of open source and commercial open data portal software packages exist. Some of the more widely used open source packages are the following:
CKAN: This is an open source data portal designed for publishing, sharing, and managing datasets. It offers managers and end users a number of functionalities, such as full-text search, reporting tools, and multilingual support. It also provides an Application Programming Interface (API) for accessing the data.
DKAN: Compared to CKAN, this software has more data-oriented features, including scraping, data harvesting, visual data workflow, and advanced visualization. DKAN users are mainly government organizations and non-governmental organizations (NGOs).
Socrata: It has a number of powerful data management tools for database management, data manipulation, reporting, visualization with advanced options, and customized financial analytics insights. Socrata has two licenses: an open source license for the community edition and a commercial one for the enterprise edition.
Dataverse: It is built to share and manage large datasets. It helps its users to collect, organize, and publish their datasets on a collaborative platform. Dataverse is employed around the world by non-governmental organizations (NGOs), government organizations, and research centers.
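As a brief illustration of API access, the sketch below builds a dataset search request for CKAN's Action API (the `package_search` action) and summarizes a response. The portal address `data.example.org` is hypothetical, and the response shown is an abridged, invented sample used in place of a live network call; a real response contains many more fields.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal, query, rows=5):
    """Build a CKAN Action API search URL for the given portal base URL."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def summarize(response_text):
    """Extract dataset titles and resource formats from a search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [
        (pkg["title"], sorted({res.get("format", "") for res in pkg.get("resources", [])}))
        for pkg in payload["result"]["results"]
    ]


# Abridged, invented response standing in for what a portal would return.
SAMPLE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {"title": "City budget 2020",
             "resources": [
                 {"format": "CSV", "url": "https://data.example.org/budget.csv"},
                 {"format": "JSON", "url": "https://data.example.org/budget.json"},
             ]},
        ],
    },
})

print(package_search_url("https://data.example.org/", "budget"))
print(summarize(SAMPLE))
```

In practice the URL would be fetched with an HTTP client and the returned text passed to `summarize`; the network call is left out here so that the example runs without access to any particular portal.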
3.2 Open data best practice
The European Data Portal published a report in 2020 highlighting the best open data practices implemented by the three top-performing countries of the 2019 assessment: Cyprus, France, and Ireland. The reported practices may be applicable to other international contexts. The practices were categorized into four aspects of open data, namely Open Data Policy, Open Data Portal, Open Data Impact, and Open Data Quality. Table 2 shows the best practices associated with each of these aspects.
Table 1. Definitions and examples of file types.

|File type|Definition|Example|
|---|---|---|
|CSV|CSV stands for “comma-separated values”.|Product, Size, Color, Price<br>Shirt, Large, White, $15<br>Shirt, Small, Green, $12<br>Trousers, Medium, Khaki, $35|
|RDF|An RDF file is a document written in the Resource Description Framework (RDF) language. This language is used to represent information about resources on the web. It contains the website metadata. Metadata is structured information. RDF files may include a site map, an updates log, page descriptions, and keywords.|<?xml version="1.0"?> <rdf xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:s="http://description.org/schema/"> <Description about="https://www.xul.fr/Wells"> <s:author>The Invisible Man</s:author> </Description> </rdf>|
|XML|An XML file is a document written in Extensible Markup Language (XML). It is used to structure data for storage and transport. An XML file contains tags and text. The tags provide the data structure, and the text in the file is surrounded by these tags, which adhere to specific syntax guidelines. The XML format is used for sharing structured information between programs, and between computers and people, both locally and across networks.|<part number="1976"> <name>Windscreen Wiper</name> <description>The Windscreen wiper automatically removes rain from your windscreen, if it should happen to splash there. It has a rubber <ref part="1977">blade</ref> which can be ordered separately if you need to replace it.</description> </part>|

Table 2. Best practices associated with each open data aspect.

|Open Data Policy|
|Open Data Portal|
|Open Data Impact|
|Open Data Quality|
4. Benefits of open government data
Open government data has a number of economic and political implications and benefits, particularly for democracy. They include greater transparency, citizens’ trust in the government, collaboration in governance, and economic development. The benefits of using open government data are detailed below:
Political and social benefits. These include the following aspects: more transparency, democratic accountability; more participation and self-empowerment of citizens; creation of trust in government; public engagement; equal access to data; new governmental services for citizens; improvement of citizen services, citizens’ satisfaction, and policy-making processes; allowing more visibility for the data provider; creation of new insights in the public sector; and introduction of new and innovative social services.
Economic benefits. These include the following aspects: economic growth; stimulation of innovation; contribution towards the improvement of processes, products, and services; adding value to the economy by creating a new sector; and availability of information for investors and companies. When open data is used to produce new products or start new services, it can increase the demand for more data, prompting the release of more datasets and improvements in data quality.
Operational and technical benefits. These include the following aspects: the ability to reuse data; optimization of administrative processes; improvement of public policies; access to external problem-solving capacity; fair decision-making by enabling comparison; easier discovery of and access to data; creation of new data by combining and integrating existing data; validation of data by external quality checks; and avoidance of data loss [6, 23].
For OGD to be beneficial, it should meet the following conditions:
Quality of data. This includes four components: timeliness, availability of metadata, accuracy, and usefulness.
Legislation/policy. A clear legal framework should be put in place in order to organize the relationship between data providers and users and to avoid uncertainties regarding copyright, privacy, personal data, and data openness.
Skills. Technical skills and knowledge about data, such as knowledge of statistics or programming, are essential for users to be able to use open government data.
Infrastructure. Infrastructure is required to facilitate the exchange of data between government institutions and users. Examples of such infrastructure are software for data analytics and discovery, and web-based platforms. The essential features of OGD infrastructure that have a strong impact on its utilization are feedback mechanisms between suppliers and users, and data processing capabilities.
Privacy. OGD policies should address privacy issues by excluding privacy-sensitive data and data related to national security. These policies should ensure compliance with confidentiality and privacy guidelines.
5. Open government data barriers, challenges, and risks
5.1 Barriers and challenges
Open government data faces a number of barriers and challenges that may impede its development and implementation. Some of these barriers are related to either the data providers or the data users, while other barriers can be attributed to both sides. Barriers that might face either side are outlined below:
Prevalence of closed government culture and lack of open government data policy.
Existence of privacy legislation that protects against privacy violations leading to the identification of persons, as well as the existence of conflicting laws about data access.
Poor data quality. This includes lack of sufficient and accurate data and availability of obsolete and non-valid data.
Difficulty in searching and browsing data due to lack of metadata or an index, complexity of available data formats and datasets, in addition to information overload and lack of open data user manuals.
There are some other barriers that are encountered by both open data publishers and users. These include the following:
Lack of technical knowledge about metadata quality on the part of portal owners. This may lead to the publication of inappropriate metadata, which in turn makes it difficult for re-users to find the data they need.
Political, organizational, legal, technical, and financial barriers. These can be reduced by taking into consideration the specific needs of the users of open data.
Geospatial data has its own specific barriers resulting from the use of different standards in relation to other types of open data. Dealing with this type of data requires specific technical knowledge and expertise [37, 38].
Challenges at the institutional level include a culture of avoidance within governments, whereby governments are reluctant to open public data; the time-consuming procedures for accessing and reusing data; and the fact that governments do not take users’ ideas into consideration in government administration. Challenges at the users’ level include a lack of advanced search facilities, a lack of helpdesk facilities for users, and a lack of expertise among users in analyzing data.
A number of factors discourage institutions and governmental bodies from joining an open government data initiative. These factors include, but are not limited to, the following:
Lack of awareness of open data.
Lack of motivation and purpose for opening public data.
Lack of open-mindedness about the application of open data, and a focus on publishing data regardless of its quality or perceived value.
No budget allocated for opening data, because it is still a recent concept that is not fully understood.
Absence of an institutional body dedicated solely to the task of opening data, which results in a lack of regular monitoring of the performance of the open data initiative.
5.2 Risks
Many risks confront and may consequently impede the proper implementation and utilization of open government data. They include the ones listed below:
Difficulties in determining who owns the published data. This may be accompanied by unclear responsibility and accountability about publishing the data.
Unintentional violation of privacy and violation of legislation may take place.
Published data can be biased, misinterpreted or misused.
Open data may have negative consequences for the government.
Poor information quality may lead to wrong decision making.
An embargo period may cause published data to be out of date.
Others may profit from open data rather than the intended citizens.
Data with little or no value may be published, resulting in a waste of resources.
To avoid and mitigate OGD challenges and risks, a number of practical solutions can be designed to enhance the accessibility and reusability of open government data on the legal, institutional and technical levels. These solutions include:
Creation of data portals and metadata;
Simplification of licensing issues;
Education of data users and data providers on what is technically and legally possible, so that they can develop their plans within these boundaries;
Linking of the discussions on technical and legal requirements, so that the former do not end up being difficult to implement in national jurisdictions and the latter are not unrealistic or out of step with technical developments and practice.
6. Future of OGD
Four key elements characterize future trends of open data, namely:
Purposeful publication of the data, focusing on its impactful reuse.
Strengthening of data collaboration and partnerships; expanding the circle of those involved in open data projects and enabling more direct collaboration between data holders and data users.
Advancement of open data at the subnational level, with an emphasis on building open data capacity and meeting open data demand at the subnational level rather than only at the national level. This is achieved by publishing data held by the public sector and other institutions in cities, municipalities, states, and provinces.
Prioritization of data responsibility and data rights, including attention to potential bias in the analysis and use of certain open datasets and to how open data initiatives might negatively affect the rights of citizens. Practitioners should also take privacy issues into consideration. These are key elements of any open data project.
A seminar held by statisticians, civil society, and the private sector in 2017, ahead of the 48th session of the UN Statistical Commission, discussed new trends and emerging issues in open data in light of the 2030 Agenda for Sustainable Development. The outcome of this seminar included the following insights and recommendations for making the world more open to open data:
Providing free access and use of data by open data platforms for purposes of transparency, accountability and daily decision making;
Ensuring that the principles of data rights and access are matched with strict ethical and security protocols;
Facilitating and enabling the efforts to make data more open through advanced technologies and approaches to data architecture and management;
Collaborating with civil society on issues related to open data, particularly in the areas of principles, readiness, and evaluation of the openness of data, and collaborating with academia and technology firms to build portable and interoperable common technology infrastructure as a public good.
This chapter explored the various aspects of open government data. It opened by defining the concepts of openness, open data, and open government data (OGD). It then explained how OGD has developed over the last few years and highlighted the most important milestones of this development. The chapter went on to explain the various requirements of OGD implementation and utilization and to survey the practice of OGD around the world. It also explained the role of portals in OGD implementation and utilization, outlining their technical and functional requirements and introducing a number of open source portal software packages. The chapter then elaborated on the benefits and advantages of open government data for governments and citizens, and discussed the barriers, challenges, and risks that confront open government data initiatives. It closed by highlighting some perceived future trends of open government data.