Open access peer-reviewed chapter

Ethical Use of Data in AI Applications

Written By

Anthony J. Rhem

Submitted: 20 February 2023 Reviewed: 07 April 2023 Published: 26 May 2023

DOI: 10.5772/intechopen.1001597

From the Edited Volume

Ethics - Scientific Research, Ethical Issues, Artificial Intelligence and Education

Miroslav Radenkovic


Abstract

Artificial Intelligence (AI) equips machines with the capacity to learn. AI frameworks employing machine learning can discern patterns within vast data sets and construct intricate, interconnected systems that yield results that enhance the effectiveness of decision-making processes. AI, in particular machine learning, has become an important element in informing and delivering decisions across a multitude of industries. The quality of those decisions depends on the data used to train the machine learning algorithms. When machine learning applications are being considered, it is therefore imperative that the training data be free of bias and used ethically. This chapter focuses on the ethical use of data in developing machine learning algorithms. Specifically, it examines AI bias and the ethical use of AI, data ethics principles, selecting ethical data for AI applications, AI and data governance, and putting ethical AI applications into practice.

Keywords

  • AI bias
  • AI ethics
  • data ethics
  • artificial intelligence
  • machine learning
  • data governance

1. Introduction

Artificial Intelligence (AI) provides machines (i.e., computers) with the means to acquire knowledge. AI structures that utilize machine learning possess the capability to identify patterns in immense quantities of data (structured, semi-structured, and unstructured) and simulate sophisticated, interlinked systems, ultimately producing results that augment the proficiency of decision-making. Machine learning is also seen as a form of applied statistics, albeit one that uses greater computing power and larger volumes of data to statistically estimate complicated functions. Machine learning depends on learning from patterns in data to make new predictions.

AI, in particular machine learning, has been positioned as an important element in contributing to, as well as providing, decisions in a multitude of industries. The decisions machine learning delivers are based on the data used to train its algorithms. When machine learning applications are being considered, it is imperative that the data used to train the algorithms is free of bias and is used ethically. AI ethics is the responsible and trustworthy design, development, implementation, and use of AI systems, including the data used to train them and the knowledge they produce.

Ensuring that AI systems are developed and used in a way that promotes equality and fairness for users and those affected by the system should be at the forefront of any AI implementation, together with the ethical use of the system and its data. To build an AI system with an ethical core, it is essential to start by establishing a diverse AI product development team that is active in the design, development, and implementation of the application. A diverse team brings a “diversity of thought” to the initiative, especially during the selection and cleansing of data, helping to keep bias out of the algorithms and ensuring that the models are trained with ethical data that adheres to data privacy and security requirements. Through collaboration, knowledge sharing, and knowledge reuse, a diverse team contributes different points of view, experiences, and cultural backgrounds that stimulate innovation and eliminate (or at least limit) bias. This innovation, in turn, enables organizations to deliver unique or improved AI products.

Leaders also need to be aware of the ethicality of the AI applications being developed and deployed at their organizations. Leaders must examine and understand whether the outcomes from the application of AI violate US Federal, GDPR, and/or other ethical, security, and privacy standards. Leadership will need to adopt a standard for AI that identifies general tenets for AI implementation focused on ethical adherence [1]. Leaders must enable support for the implementation, acceptance, and adoption of AI. Cultivating a systems-thinking mindset and incorporating the five disciplines of systems thinking, personal mastery, mental models, shared vision, and team learning [2] are essential for effective leadership of AI implementation.

The ethical use of data in AI applications is a critical issue as AI systems and algorithms rely on data to learn and make decisions. The way data is collected, stored, used, and shared can have significant impacts on individuals, organizations, and society. Ethical use of data in AI systems builds trust and ensures that they are adopted and used in a responsible manner. This chapter examines AI bias and ethical use of AI Applications, Data Ethics Principles, Selecting ethical data for AI applications, AI and Data Governance, and Putting Ethical AI Applications into Practice.


2. Examining AI bias and ethical use of AI applications

When examining AI bias and the ethical use of AI, there are several key factors to consider which include:

  • Data bias: The training data used to develop an AI model may contain biases that are then reflected in the model’s decisions and predictions. It is important to examine the data used to train the model and identify any potential sources of bias that may be present.

  • Algorithmic bias: The algorithms and mathematical models used in AI can also be biased and may lead to biased decisions or predictions. It is important to examine the algorithms used in the AI system and identify any potential sources of bias that may be present.

  • Fairness: AI systems should be fair and non-discriminatory, and not perpetuate or exacerbate existing inequalities or biases. It is important to examine the AI system to ensure that it is not treating different groups of people unfairly.

  • Explainability: AI systems should be explainable, so that individuals can understand how and why decisions are being made. It is important to examine the AI system to ensure that it is transparent and interpretable.

  • Privacy: The use of AI should respect privacy and personal autonomy, and not violate individuals’ rights. It is important to examine the AI system to ensure that it is collecting and using data in a way that is consistent with privacy laws and regulations.

  • Transparency: The purpose and use of the AI system should be clear and open to all stakeholders. It is important to examine the AI system to ensure that it is transparent and that stakeholders are aware of how the data is being used.

  • Security: The AI system should be designed to protect data from unauthorized access, use, or disclosure. It is important to examine the AI system to ensure that it is secure and that data is being stored and transmitted securely.

  • Continual assessment: Organizations should regularly assess the ethical implications of their AI use and make any necessary changes to ensure they align with these principles.


3. Data ethics principles

Data ethics principles are the principles and values that govern the collection, storage, use, and dissemination of data. The Organization for Economic Cooperation and Development (OECD) and the Institute of Electrical and Electronics Engineers (IEEE), among others, have published guidelines and principles related to data ethics.

The OECD has published the OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data, which set out a framework for data protection and privacy. The guidelines cover topics such as data collection and use, security, and access to personal information, and emphasize the importance of protecting privacy and personal data in the digital age [3].

The IEEE has developed a set of Ethically Aligned Design (EAD) principles for autonomous and intelligent systems, which include data ethics considerations. The IEEE EAD principles focus on promoting human well-being, protecting privacy and individual rights, and ensuring that data is collected and used in a responsible and ethical manner [4]. The principles emphasize the importance of transparency, fairness, accountability, and informed consent in the development and deployment of autonomous and intelligent systems [4].

Data ethics principles emphasize the importance of privacy, transparency, and responsibility in data practices, and provide guidelines for ensuring that data is collected, stored, and used in an ethical manner. The following are key areas representing data ethics principles:

  • Transparency: Data ethics principles should be clear, open, and transparent to all stakeholders. This includes clearly stating the purpose and use of collected data, as well as providing information on how data is collected, stored, and protected.

  • Fairness: Data should be collected and used in a way that is fair and non-discriminatory. This includes ensuring that data collection does not perpetuate or exacerbate existing inequalities or biases.

  • Privacy: Data privacy should be respected, and data should be collected and used in a way that protects individuals’ personal information and autonomy. This includes ensuring that data collection is done with informed consent, and that data is not shared or used in ways that violate individuals’ privacy rights.

  • Responsibility: Data collectors and users are responsible for ensuring that data is collected and used ethically. This includes being accountable for any harm that may result from the collection or use of data and taking steps to mitigate that harm.

  • Security: Data should be stored and transmitted securely, in order to protect it from unauthorized access, use, or disclosure.

  • Inclusivity: Data collection and use should be inclusive and considerate of diverse perspectives and experiences. This includes being aware of and addressing any potential biases in the data, and actively seeking out underrepresented perspectives.

  • Transparency in decision making: Decisions that are made using data should be explainable and interpretable, so that individuals can understand how and why decisions are being made, and if any bias is present in the model.

  • Continual assessment: Organizations should regularly assess the ethical implications of their data collection and use practices and make any necessary changes to ensure they align with these principles.


4. Selecting ethical data for AI applications

Selecting the appropriate data for AI applications is critical in the process of building and training machine learning algorithms. The data that is selected should be representative of the problem that the model is intended to solve. The selection of ethical data involves:

  • Data relevance: the data should be directly related to the problem that the model is intended to solve.

  • Data quality: the data should be of high quality, with minimal errors, missing values, and outliers.

  • Data diversity: the data should be diverse, representing a variety of examples from the problem domain.

  • Data quantity: the amount of data used for training the model can have a significant impact on its performance.

  • Data preprocessing: the data may need to be preprocessed before it can be used for training.

  • Data annotation: depending on the type of model being built, the data may need to be annotated, which involves adding labels or tags to indicate what the model should learn.
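As a toy illustration of two of these criteria, diversity and quantity, the sketch below flags a training set that is large enough but dominated by one class. The dataset, labels, and thresholds are invented for the example, not taken from the chapter.

```python
from collections import Counter

def check_diversity(labels, min_count=100, max_share=0.8):
    """Rough check: is there enough data, and is no single class dominant?"""
    counts = Counter(labels)
    total = sum(counts.values())
    dominant_share = max(counts.values()) / total
    return {
        "enough_data": total >= min_count,
        "balanced": dominant_share <= max_share,
    }

# 100 samples, but 95% carry the same label: quantity passes, diversity fails
labels = ["approve"] * 95 + ["deny"] * 5
result = check_diversity(labels)
```

A set that fails the `balanced` check would prompt collecting more examples of the underrepresented class before training.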

The ethical data selection process for AI/ML applications delivers methodical and technological data management backing to tackle data quality concerns, optimize data utilization, and ensure continuous management of data throughout its lifespan. This process encompasses data discovery and acquisition, upholding data quality standards, enhancing value, and facilitating reuse over time. The selection of ethical data starts with a comprehensive and repeatable data curation process.

Data curation for AI/ML applications is the process of selecting/creating, organizing, and determining data gaps so data can be used to train machine learning (ML) algorithms. Data curation is a form of data management, and it involves annotation, publication, and presentation of the data. Data curation can include data from various sources. Data curation services include data profiling, data management, data lineage, data disposal, and data assurance.

Data curation process for AI/ML applications includes (see Figure 1: Data Curation Process):

Figure 1.

Data curation process [5].

Data Audit

Determine what data is ready to be used for algorithm training and determine data gaps. Perform the data audit from the various sources of data under consideration. Also, determine what data is ready to be utilized, evaluate the quality of data, determine the gaps in data, and identify the measurements to determine what data is used (and not used).
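A minimal sketch of such an audit, assuming (for illustration only) that records arrive as simple dictionaries; a production audit would cover many more dimensions of quality than missing values and duplicates:

```python
def audit_records(records, required_fields):
    """Count missing values and duplicate records as a rough quality signal."""
    missing = sum(
        1 for r in records for f in required_fields if r.get(f) is None
    )
    seen, duplicates = set(), 0
    for r in records:
        # Build a hashable fingerprint of the record's non-missing fields
        key = tuple(sorted((k, v) for k, v in r.items() if v is not None))
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"missing_values": missing, "duplicates": duplicates}

records = [
    {"age": 34, "income": 52000, "label": 1},
    {"age": None, "income": 48000, "label": 0},  # missing value
    {"age": 34, "income": 52000, "label": 1},    # exact duplicate
]
report = audit_records(records, ["age", "income", "label"])
```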

Data Analysis

Data analysis investigates the concepts, connections, business rules, and metadata within the data. This offers a consistent, orderly, and shareable framework for enterprise data. Semantics focuses on the interpretation of the concepts identified in the data model, as well as the significance of the relationships between those concepts, typically conveyed through business rules.

Address Data Gaps

Conducting a Data Audit yields insights into the discrepancies present within the data, pinpointing the supplementary data sources required for efficient algorithm training. This process not only highlights potential areas of improvement but also facilitates a more comprehensive understanding of the data landscape, ultimately enhancing the performance and accuracy of the AI/ML models being developed.

Data Selection, Validation & Ethics

Data selection should be considered in terms of significance. Ascertaining the importance of data hinges on several factors: the degree to which the data is fundamental to the subject area; validity, referring to the data’s accuracy, timeliness, and pertinence to the domain in question; relevance, denoting the disciplinary, occupational, or societal worth of the data; and utility, which encompasses the overall usefulness of the data within the domain being examined [6]. Validation, or data validity, centers on ensuring that data is procured from, and/or grounded in, authoritative or dependable sources, while also being subject to routine evaluations in accordance with the specified governance policies. This process bolsters the credibility and reliability of the information being utilized [7]. Ethical data principles are also applied during this step.

Conduct Algorithm Training

The process of training an ML model involves providing an ML algorithm with training data to learn from [6]. Once the data has completed its transformation in the data selection and validation process, including applying ethical data principles, the machine learning algorithms now have data ready for the training process.
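As a minimal, self-contained illustration of this final step, the sketch below trains a simple perceptron on a toy, already-curated dataset. The model and data are illustrative only, not a method prescribed by the chapter.

```python
def train_perceptron(data, epochs=20, lr=1):
    """data: list of (features, label) pairs with label in {0, 1}."""
    n = len(data[0][0])
    weights, bias = [0] * n, 0
    for _ in range(epochs):
        for x, y in data:
            # Classic perceptron update: nudge weights on each mistake
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = y - (1 if activation > 0 else 0)
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias

# Toy "curated" dataset: the logical AND function (linearly separable)
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data)

def predict(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
```

For separable data like this, the perceptron rule converges after a few passes; real training pipelines would of course hold out validation data and monitor fairness metrics as well.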

Selecting ethical data for AI applications involves several steps, including:

  • Identifying the purpose and goals of the AI application: Before selecting data, it’s important to understand the purpose and goals of the AI application, and how the data will be used to support those goals.

  • Examining the data sources: Look for data sources that are reliable, accurate and unbiased, and consider the potential sources of bias that may be present.

  • Ensuring data quality: Data should be complete, accurate, and consistent, and any errors or inconsistencies should be identified and addressed.

  • Checking for data privacy: Data should be collected with informed consent and should not violate individuals’ privacy rights.

  • Ensuring data security: Data should be stored and transmitted securely, to protect it from unauthorized access, use, or disclosure.

  • Balancing data inclusivity: Be aware of and address any potential biases in the data and actively seek out underrepresented perspectives.

  • Ensuring transparency: The purpose and use of collected data should be clear, open, and transparent to all stakeholders.

  • Continual assessment: Organizations should regularly assess the ethical implications of their data collection and use practices and make any necessary changes to ensure they align with these principles.

  • Diversifying the data: Diversify the data as much as possible by including a wide range of perspectives, experiences, and characteristics.

  • Auditing the data: Use techniques like bias detection, fairness and accountability to measure and mitigate any potential biases in the data.
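The last step above can be sketched concretely. One common bias-detection measure is the demographic parity gap: the difference in favorable-outcome rates between groups defined by a protected attribute. The data and flagging threshold below are invented for the example.

```python
def positive_rate(outcomes):
    """Fraction of favorable (1) outcomes in a group."""
    return sum(outcomes) / len(outcomes)

def demographic_parity_gap(outcomes_by_group):
    """Largest difference in favorable-outcome rates across groups."""
    rates = [positive_rate(o) for o in outcomes_by_group.values()]
    return max(rates) - min(rates)

# 1 = favorable decision (e.g., loan approved), grouped by protected attribute
decisions = {
    "group_a": [1, 1, 1, 0],  # 75% approval rate
    "group_b": [1, 0, 0, 0],  # 25% approval rate
}
gap = demographic_parity_gap(decisions)
flagged = gap > 0.2  # simplified threshold, loosely inspired by the "80% rule"
```

A flagged gap would trigger a closer look at the training data and model before deployment.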

4.1 Use of synthetic data

Synthetic data is computer-generated data rather than data coming from real-world records [8]. Typically created using algorithms, synthetic data can be deployed to validate mathematical models and to train machine learning models. In the context of AI and ML, synthetic data is often used when real-world data is scarce or unavailable. It can also be used to augment existing datasets to improve the performance of AI/ML models [9].

There are several methods for generating synthetic data, including generative adversarial networks (GANs), variational autoencoders (VAEs), and simulation models. The specific method used will depend on the type of data being generated and the desired outcome. One of the key benefits of synthetic data is that it can be generated in large quantities, allowing AI/ML models to be trained on much larger datasets than would otherwise be possible [9]. This can lead to improved performance, especially in cases where real-world data is limited. Additionally, synthetic data can be generated to match specific characteristics or distributions, making it possible to train models to recognize patterns in data that may be difficult to find in real-world data.
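As a minimal illustration of the simulation approach (far simpler than GANs or VAEs, but following the same fit-then-sample idea), the sketch below fits a normal distribution to a small "real" sample and draws a much larger synthetic one. The numbers are invented for the example.

```python
import random
import statistics

random.seed(42)  # reproducible sampling

real = [52.1, 49.8, 50.5, 51.2, 48.9, 50.0]   # scarce real-world measurements
mu, sigma = statistics.mean(real), statistics.stdev(real)

# Generate an abundant synthetic sample with the same mean and spread
synthetic = [random.gauss(mu, sigma) for _ in range(1000)]
```

The synthetic sample matches the fitted distribution, but, as the chapter cautions, it can only be as faithful as the model it was drawn from.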

Another benefit of synthetic data is that it can be used to test AI/ML models in controlled conditions, helping to ensure that they are working as intended. For example, synthetic data can be used to test models for bias and fairness, or to evaluate the robustness of models in the face of adversarial attacks [9]. Overall, synthetic data can play a valuable role in the development and deployment of AI/ML models. However, it is important to keep in mind that synthetic data may not perfectly reflect the characteristics of real-world data, so care must be taken when using it to train and evaluate models.

4.2 Ethical use of synthetic data

The use of synthetic data raises several ethical issues that are important to consider. Some of the key ethical issues include:

Bias: Synthetic data may be generated to reflect certain biases, either intentionally or unintentionally. This can lead to AI/ML models that are biased in their predictions and decision-making, potentially exacerbating existing inequalities.

Privacy: Synthetic data may be generated using real-world data, which could include sensitive information about individuals. If the synthetic data is not generated and used in a privacy-preserving manner, it could potentially be used to violate people’s privacy rights.

Misrepresentation: Synthetic data may not accurately reflect real-world data, which could lead to AI/ML models that are not representative of the real world. This could result in models that make incorrect predictions or decisions, which could have serious consequences in fields such as healthcare, finance, or criminal justice.

Lack of transparency: The process of generating synthetic data is often complex, and it can be difficult to understand how synthetic data was generated and what assumptions it reflects. This can make it challenging to interpret the results of AI/ML models trained on synthetic data, and to understand the potential limitations and biases of the models.

Responsibility: If AI/ML models trained on synthetic data are used to make decisions that have a significant impact on individuals or society, it can be difficult to determine who is responsible for the decisions made by the model. This raises important questions about accountability and the ethical use of AI/ML technologies.

It is important to carefully consider these ethical issues when using synthetic data and to ensure that synthetic data is generated and used in a responsible and transparent manner. This may involve taking steps to reduce bias and protect privacy, as well as being transparent about the limitations and assumptions of synthetic data and the models trained on it.


5. Ethical AI and data governance

Ethical AI and data governance are closely related, as effective data governance is essential for ensuring the responsible and ethical use of AI. Data governance refers to the processes, policies, and procedures that organizations use to manage and govern the collection, storage, use, and dissemination of data. It is a way of ensuring that data is used in an ethical and responsible manner, and that data privacy and security are protected.

In the context of AI, data governance plays a crucial role in ensuring that the data used to train and develop AI models is of high quality, accurate, and unbiased. It also helps to ensure that the AI models developed are transparent, interpretable, and fair, and that they do not perpetuate or exacerbate existing inequalities or biases.

Effective data governance can also help organizations to comply with data privacy laws and regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). This includes ensuring that data is collected with informed consent, and that data is not shared or used in ways that violate individuals’ privacy rights.

In practice, data governance for AI may include processes such as data quality checks, data privacy impact assessments, and ongoing monitoring of AI systems to ensure that they are operating in an ethical and responsible manner. It also requires collaboration across different departments within the organization, including legal, IT, data science, and compliance.

5.1 AI and its effect on personal data protection

AI can increasingly link different datasets and match different types of information, with profound consequences. Data held separately and considered non-PII (personally identifiable information stripped of personal identifiers) can, with the application of AI, become Personally Identifiable Information (PII). This occurs when machine learning algorithms correlate non-personal data with other data and match it to specific individuals, turning it into personal data. Through algorithmic correlation, AI will weaken the distinction between personal data and other data. Non-personal data can increasingly be used to identify individuals or infer sensitive information about them, beyond what was originally and knowingly disclosed [10].
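The linkage risk described above can be shown with a toy example: a "de-identified" health dataset and a public roll share quasi-identifiers (ZIP code, birth year, gender), and joining on them re-attaches names to diagnoses. All records and field names below are fabricated for illustration.

```python
deidentified_health = [
    {"zip": "60601", "birth_year": 1975, "gender": "F", "diagnosis": "asthma"},
    {"zip": "60602", "birth_year": 1988, "gender": "M", "diagnosis": "diabetes"},
]
public_roll = [
    {"name": "J. Smith", "zip": "60601", "birth_year": 1975, "gender": "F"},
    {"name": "A. Jones", "zip": "60699", "birth_year": 1990, "gender": "M"},
]

QUASI_IDS = ("zip", "birth_year", "gender")

def link(records, roll):
    """Match records on quasi-identifiers, turning non-PII into PII."""
    matches = []
    for rec in records:
        for person in roll:
            if all(rec[k] == person[k] for k in QUASI_IDS):
                matches.append(
                    {"name": person["name"], "diagnosis": rec["diagnosis"]}
                )
    return matches

reidentified = link(deidentified_health, public_roll)
```

Even though neither dataset contained a name next to a diagnosis, the join produces exactly that pairing, which is why governance must treat quasi-identifiers as sensitive.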

Personally identifiable information, as defined by the U.S. Department of Labor, and the GDPR are described below.

Personally Identifiable Information (PII), as defined by the U.S. Department of Labor:

“Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. Further, PII is defined as information: (i) that directly identifies an individual (e.g., name, address, social security number or other identifying number or code, telephone number, email address, etc.) or (ii) by which an agency intends to identify specific individuals in conjunction with other data elements, i.e., indirect identification. (These data elements may include a combination of gender, race, birth date, geographic indicator, and other descriptors). Additionally, information permitting the physical or online contact of a specific individual is the same as personally identifiable information. This information can be maintained in either paper, electronic or other media” [11].

5.2 General data protection regulation (GDPR)

GDPR enhances how people can access information about them and places limits on what organizations can do with personal data. GDPR’s seven principles are: lawfulness, fairness and transparency; purpose limitation; data minimization; accuracy; storage limitation; integrity and confidentiality (security); and accountability.

“The General Data Protection Regulation (GDPR) is a legal framework that sets guidelines for the collection and processing of personal information from individuals who live in the European Union (EU). In the interest of enhancing consumer protection, the GDPR mandates that any personally identifiable information (PII) gathered by websites must either be anonymized (meaning, made anonymous, as suggested by the term) or pseudonymized (wherein the consumer’s identity is substituted with a pseudonym). This measure helps safeguard the privacy of individuals engaging with online platforms. GDPR affects data beyond that collected from customers. Most notably, perhaps, the regulation applies to the human resources’ records of employees” [12].

Incorporating within the AI/ML application the rules for determining which data constitutes Protected Health Information (PHI) or Personally Identifiable Information (PII), along with GDPR and relevant US Federal and European regulations, will help ensure your organization’s AI tools comply with the applicable privacy regulations. Utilizing AI/ML applications to constantly monitor the implemented AI tools, alerting your organization’s leadership and applicable stakeholders to any potential privacy violations, will give your organization the ability to mitigate issues as they arise.

The OECD AI value-based principles and recommendations, which provide guiding principles for governments, organizations, and individuals in the design, development, and implementation of AI systems, are one standard that can provide the basis for your organization’s AI ethical, security, and privacy standards. To further ensure that your organization’s AI solutions and tools comply with ethical, security, and privacy standards, your organization’s internal teams will play a crucial role by contributing knowledge and innovation to new AI/ML applications being considered for implementation. This will enable your organization to address compliance with ethical, security, and privacy standards from analysis, design, data collection, data cleansing, and testing through implementation, ensuring that future AI/ML applications will be compliant. In addition, exploring other AI ethics standards, such as the European Commission’s Ethics Guidelines for Trustworthy AI, the US White House Blueprint for an AI Bill of Rights, and UNESCO’s Recommendation on the Ethics of Artificial Intelligence, is recommended to support the continuous evolution and improvement of your organization’s AI ethical, security, and privacy standards.

Given the specifics of these guidelines for PII, we must constantly consider and monitor the data used in these AI/ML applications from inception to deployment. Without proper governance, it will be difficult to assess which data will remain non-PII. However, consistent data selection, training, and monitoring throughout the AI/ML lifecycle can ensure that AI/ML applications distinguish between PII and non-PII and enact the necessary protocols.
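As one hypothetical example of such monitoring, a governance pipeline might scan free-text fields for obvious US-style PII patterns before the data enters a training set. Real scanners are far more sophisticated than this regex sketch, and the patterns below are deliberately simplified.

```python
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US Social Security format
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),    # US phone format
}

def scan_for_pii(text):
    """Return the PII categories detected in a free-text field."""
    return sorted(name for name, pat in PII_PATTERNS.items() if pat.search(text))

record = "Contact jane.doe@example.com, SSN 123-45-6789, re: claim #8841"
findings = scan_for_pii(record)
```

A non-empty result would route the record for redaction or review rather than straight into algorithm training.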

5.3 Data governance and AI ethical standards

Data governance is essential for ensuring the responsible and ethical use of AI, including the processes, policies, and procedures that organizations use to manage and govern the collection, storage, use, and dissemination of data. Effective data governance can ensure that the data used to train and develop AI models is of high quality, accurate, and unbiased and that AI models developed are transparent, interpretable, and fair.

Organizations must adopt and adhere to an AI ethical standard, one in which AI solutions comply with ethical, security, and privacy standards set by the individual organization and aligned with an established AI standard. The question, however, is which AI standard to adopt and/or align to. There are several to consider, including the AI Bill of Rights, the international AI standard developed by the Organization for Economic Co-operation and Development (OECD), the United Nations Educational, Scientific and Cultural Organization (UNESCO) Recommendation on the Ethics of Artificial Intelligence, the United States Artificial Intelligence Institute (USAII), and the U.S. Department of Commerce National Institute of Standards and Technology (NIST) AI Standards for Federal Engagement, to name a few.


6. Putting ethical AI applications into practice

The latest emphasis on AI revolves around the deployment of Machine Learning (ML) algorithms, and it is within the realm of ML algorithms that the ethical and bias-related challenges accompanying AI applications have garnered significant attention. Cognitive bias, cultural bias, and systems of belief pertain to the methodical manner in which the context and presentation of data, information, and knowledge can impact an individual’s judgment and decision-making [13]. Numerous types of cognitive bias exist, each capable of influencing one’s decision-making process at different moments. They include actor-observer bias, anchoring bias, attentional bias, the availability heuristic, confirmation bias, the false consensus effect, functional fixedness, the halo effect, the misinformation effect, optimism bias, self-serving bias, and the Dunning-Kruger effect [13].

The predispositions of team members responsible for selecting datasets for algorithm training can impact the algorithm’s outcomes, thereby contributing to the ethical dilemmas associated with the results. When confronted with new information, it becomes essential to scrutinize the algorithm’s findings. This involves examining the data utilized in training the algorithms to eliminate biased data and ensuring that the appropriate algorithms are employed for the relevant tasks. If unchecked, these algorithms may perpetuate, intensify, and compound biases in the outcomes they produce, as well as in the interpretation and knowledge gleaned from those results.

An individual’s belief system regarding AI implementation significantly influences their perception of ethical AI use, particularly in the context of machine learning. In any research endeavor, it is crucial to pinpoint potential biases and strive to eliminate their impact on the research outcomes. This principle applies equally to the selection of data and the construction and training of machine learning algorithms, where biases must be identified, mitigated, and eradicated. Biases pervade all aspects of our actions, including the data we choose and the emphasis we place on specific data types. It is imperative to acknowledge the presence of bias and eliminate it before it affects the design, development, and execution of AI applications. Having a diverse team that brings diverse ideas, experience and knowledge is important in addressing bias in AI and in turn improves the ethicality of AI.

To put AI ethics into practice, you must start with sound AI policy and standards. Several standards on AI have been detailed in this chapter. AI technology will continue to evolve, and the AI ethics community will need to evolve the standards to keep pace. The following are steps to take to put ethical AI applications into practice [7].

Adopt and adhere to AI standards that include criteria for examining ethicality of AI applications and identifying criteria for eliminating bias.

Establish a Diverse AI Product development team:

By fostering collaboration, knowledge exchange, and knowledge repurposing, it is vital to capitalize on diverse perspectives, experiences, and cultural backgrounds to encourage a variety of ideas. This diversity of thought serves as a catalyst for innovation, empowering organizations to develop distinctive or enhanced AI products, thereby driving growth and advancement.

Establish a Diverse Team in the design, development, and implementation of AI applications.

  • A diverse team will bring a “diversity of thought” to the initiative and especially during the selection and cleansing of data for AI applications that use Machine Learning.

Develop AI applications to be people (human)-centered.

  • People-oriented AI applications prioritize the inclusiveness and well-being of the individuals they serve, adhering to human-centric values and promoting fairness.

  • The design, development, and implementation of people-oriented AI applications necessitate transparency, robustness, and safety; moreover, they must be held accountable for the outcomes generated, as well as the decisions influenced by these AI applications.

Establish AI KPIs and Metrics:

  • To make strides in the implementation of AI standards, guidelines, and principles, it is essential to institute standardized metrics for evaluating AI systems. Construct evidence-based metrics and KPIs to consistently assess the performance of AI applications that have been deployed.
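One evidence-based fairness KPI that teams commonly track is the demographic parity gap: the spread in positive-prediction rates across groups. The sketch below is illustrative only; this chapter does not mandate a specific metric, and the variable names and sample predictions are invented for the example.

```python
def demographic_parity_gap(y_pred, groups):
    """KPI: difference between the highest and lowest positive-prediction
    rates across groups. 0.0 means all groups receive positive outcomes
    at the same rate; larger values flag the model for review."""
    rates = {}
    for g in set(groups):
        preds = [p for p, gg in zip(y_pred, groups) if gg == g]
        rates[g] = sum(preds) / len(preds)
    return max(rates.values()) - min(rates.values())

# Hypothetical deployed-model predictions (1 = favorable outcome).
preds  = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(preds, groups)
```

Computed periodically over production traffic, a KPI like this gives the standardized, repeatable measurement the step above calls for; a team would also set an alert threshold (say, a gap above 0.1) rather than inspect the number manually.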


7. Conclusions

AI solution implementers face the risk of intensifying existing disparities related to AI resources, technology, talent, data, and computational capacity. Consequently, AI may perpetuate biases and disproportionately affect vulnerable and marginalized communities. In numerous instances, AI can diminish the subjective interpretation of data by humans, as machine learning algorithms are designed to focus solely on variables that enhance their predictive accuracy, based on the employed training data.

As AI/ML applications continue to evolve, there are pressing ethical challenges that need to be addressed in the development and deployment of AI. The following represents some of these challenges:

  • Bias and discrimination: AI systems can perpetuate and even amplify societal biases, leading to unfair treatment of certain groups of people.

  • Job displacement: As AI systems become more advanced and capable, they may begin to replace human workers, which could lead to widespread job loss and economic disruption.

  • Privacy and security: AI systems can collect and process large amounts of personal data, raising concerns about how this data is used, stored, and protected.

  • Explainability and transparency: As AI systems become more complex, it can be difficult for humans to understand how they make decisions, which could lead to mistrust and a lack of accountability.

  • Autonomy and control: As AI systems become more autonomous, there are concerns about how to ensure they are used ethically and do not harm humans.

  • Social responsibility and governance: As AI becomes more pervasive, there is a need to establish ethical guidelines and regulations to ensure that the technology is developed and used in a responsible way.

To train and optimize AI systems, ML algorithms require vast quantities of data. This creates an incentive to maximize, rather than minimize, data collection. The expanding utilization of AI devices leads to more frequent and effortless data gathering, which is then connected to other datasets, often with little or no awareness or consent from the concerned data subjects. Anticipating the patterns identified and the progression of the “learning” is challenging. As a result, data collection and usage may extend beyond what was initially known, disclosed, and consented to by a data subject. AI/ML applications, capable of learning over time, can provide individuals with personalized services tailored to their privacy preferences. It is crucial for AI systems to be developed in accordance with privacy principles outlined by the U.S. Department of Labor and the GDPR, as AI has the potential to enrich personal data; such enriched data could, in turn, lead organizations to inadvertently violate these policies.
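Countering the incentive to maximize collection starts with data minimization: stripping every field the application does not need before data reaches the ML pipeline. The sketch below is one possible approach, with invented field names; note that hashing an identifier is pseudonymization, not full anonymization, and does not by itself satisfy GDPR obligations.

```python
import hashlib

# Fields the AI application actually needs for its stated purpose;
# anything not whitelisted here is dropped before training.
ALLOWED_FIELDS = {"age_band", "region", "purchase_category"}

def minimize_record(record, user_id_key="user_id"):
    """Keep only whitelisted fields and replace the direct identifier
    with a one-way hash, so records can still be de-duplicated
    without retaining the raw ID or contact details."""
    minimized = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    if user_id_key in record:
        raw = str(record[user_id_key]).encode("utf-8")
        minimized["user_ref"] = hashlib.sha256(raw).hexdigest()[:12]
    return minimized

# Hypothetical raw record as collected by a device or application.
raw = {"user_id": 42, "age_band": "25-34", "region": "midwest",
       "email": "jane@example.com", "purchase_category": "books"}
clean = minimize_record(raw)
```

Applying this at the point of ingestion, rather than after storage, keeps data the subject never consented to analyze out of the training corpus entirely.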

To provide a framework for the use of ethical data in AI applications, it is important to identify the purpose and goals of the AI application, examine the data sources being considered, ensure data quality, check for data privacy, ensure data security, balance data inclusivity, ensure transparency, perform continual assessment, diversify the datasets used, audit the data, and perform continuous monitoring of the data being used and the decisions being produced by the algorithms.
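The framework steps above lend themselves to a simple readiness checklist that a project can run before deployment. This is a minimal, hypothetical sketch; the check names paraphrase the steps listed above, and a real governance process would attach evidence and sign-offs to each item rather than a boolean.

```python
# Checklist items paraphrasing the framework steps in this chapter.
ETHICAL_DATA_CHECKS = [
    "purpose_and_goals_documented",
    "data_sources_reviewed",
    "data_quality_verified",
    "privacy_review_completed",
    "security_controls_in_place",
    "inclusivity_of_data_assessed",
    "transparency_documentation_published",
    "datasets_diversified",
    "data_audit_performed",
    "continuous_monitoring_enabled",
]

def audit_readiness(completed):
    """Return the framework checks still open for an AI project.
    `completed` is the set of check names the team has signed off."""
    return [c for c in ETHICAL_DATA_CHECKS if c not in completed]

open_items = audit_readiness({"purpose_and_goals_documented",
                              "data_sources_reviewed"})
```

An empty result would indicate the project has, at minimum, touched every step of the framework; the continuous-monitoring item in particular should be re-verified on a schedule, not just once.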

References

  1. Rhem AJ. AI ethics and its impact on knowledge management. AI Ethics. 2021;1:33-37. DOI: 10.1007/s43681-020-00015-2
  2. Senge P. The Fifth Discipline: The Art and Practice of the Learning Organization. New York, NY: Random House Books; 2006
  3. OECD. OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data. Paris: OECD Publishing; 2002. DOI: 10.1787/9789264196391-en
  4. IEEE. Ethically aligned design first edition: A vision for prioritizing human well-being with autonomous and intelligent systems. The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. 2019
  5. Rhem AJ. Principles and Practices in Information Architecture. Chicago, IL: A.J. Rhem & Associates, Inc; 2022
  6. Mahesh B. Machine learning algorithms: A review. International Journal of Science and Research (IJSR). 2020;9:381-386
  7. AI Policy Exchange. Anthony Rhem – How to put AI Ethics into Practice [Video]. YouTube; 2020. https://www.youtube.com/watch?v=XW4Bc9LLR9Y
  8. Toews R. Synthetic Data Is about to Transform Artificial Intelligence. Forbes; 2022. https://www.forbes.com/sites/robtoews/2022/06/12/synthetic-data-is-about-to-transform-artificial-intelligence/?sh=3de39e527523
  9. Hittmeir M, Ekelhart A, Mayer R. On the utility of synthetic data: An empirical evaluation on machine learning tasks. In: Proceedings of the 14th International Conference on Availability, Reliability and Security. 2019. pp. 1-6
  10. OECD. Artificial Intelligence in Society. Paris: OECD Publishing; 2019. DOI: 10.1787/eedfee77-en
  11. U.S. Department of Labor. Guidance on the Protection of Personal Identifiable Information. https://www.dol.gov/general/ppii
  12. Frankenfield J. General Data Protection Regulation (GDPR) Definition and Meaning. 2020. https://www.investopedia.com/terms/g/general-data-protection-regulation-gdpr.asp
  13. Cherry K. What Is Cognitive Bias? 2022. https://www.verywellmind.com/what-is-a-cognitive-bias-2794963
