Ethical Considerations
Ethical Considerations for Responsible AI
We suggest keeping the following considerations in mind throughout your dataset creation process.
- Intellectual Property Rights:
Respect copyrights and intellectual property laws when using or sharing data.
- Informed Consent:
Ensure that data subjects are aware of how their data will be used and have consented to it, especially for sensitive data.
- Ethical Guidelines and Training:
Develop and enforce ethical guidelines internally, and train the data scientists and annotators who contribute to dataset creation on these principles.
- Continuous Monitoring:
Regularly monitor and reassess published datasets for unintended impacts or ethical concerns arising from their use, and attach updated policies and guidelines if needed.
- Independent Review:
As with peer-reviewed journal papers, consider implementing an ethics review process to evaluate the dataset creation process. Additionally, ensure you have applied for the requisite ethical approvals, e.g. from an Institutional Review Board (IRB), before collecting data.
- Fairness and Representation:
Strive to create datasets that represent the diversity of the population, including gender, ethnicity, age, and other sociodemographic factors. Identify biases in the data and either mitigate or acknowledge them, so that the dataset does not perpetuate or amplify societal biases and remains respectful of cultural norms.
- Resource Efficiency:
Consider the environmental impact of dataset creation, such as the carbon footprint of large-scale data processing. Where possible, record the computational resources used, how long tasks took to complete, and any efficiency-saving techniques you applied.
- Long-term Viability:
Ensure that the dataset creation process is sustainable and does not deplete or degrade resources. It is important to ask whether you need to return to the same communities to use their data, and how you will ensure that their participation is compensated.
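
As one way to act on the Fairness and Representation point above, the following minimal Python sketch computes each group's share of a dataset and lists the groups that fall below a chosen threshold. The function name `representation_report`, the `age_group` field, the toy records, and the 20% threshold are all hypothetical and chosen purely for illustration. An appropriate threshold depends on the population the dataset is meant to represent.

```python
from collections import Counter


def representation_report(records, attribute, min_share=0.05):
    """Return each group's share of the records and the groups below min_share.

    records:   list of dicts, one per data subject (hypothetical layout)
    attribute: the sociodemographic field to summarise
    min_share: illustrative cut-off, not a recommended value
    """
    # Records missing the attribute are counted under "unknown" so that
    # gaps in the metadata stay visible instead of being silently dropped.
    counts = Counter(r.get(attribute, "unknown") for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}, []
    shares = {group: n / total for group, n in counts.items()}
    underrepresented = sorted(g for g, s in shares.items() if s < min_share)
    return shares, underrepresented


# Toy data, invented for this example only.
records = (
    [{"age_group": "18-29"}] * 6
    + [{"age_group": "30-49"}] * 3
    + [{"age_group": "50+"}] * 1
)

shares, flagged = representation_report(records, "age_group", min_share=0.2)
for group, share in sorted(shares.items()):
    print(f"{group}: {share:.0%}")
print("Below threshold:", flagged)
```

A report like this only surfaces imbalances in the fields you have recorded. Deciding whether to collect more data, reweight, or simply document the gap remains a judgment call for the dataset creators.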