From 5b3081fd29ca2aba1c9571a650aa41bea4da88db Mon Sep 17 00:00:00 2001
From: Jay Qi
Date: Fri, 20 Mar 2026 01:32:33 -0400
Subject: [PATCH 1/4] Fix malformed examples table

---
 docs/docs/examples.md    | 56 ++++++++++++++++++++--------------------
 docs/render_templates.py | 10 +++----
 2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/docs/docs/examples.md b/docs/docs/examples.md
index c1ee2fc..c3338d1 100644
--- a/docs/docs/examples.md
+++ b/docs/docs/examples.md
@@ -4,31 +4,31 @@

 To make the ideas contained in the checklist more concrete, we've compiled **examples** of times when tradeoffs were handled well, and times when things have gone wrong. Examples are paired with the checklist questions to help illuminate where in the process ethics discussions may have helped provide a course correction. Positive examples show how principles of `deon` can be followed in the real world.

-<center>Checklist Question</center> | <center>Examples</center>
---- | ---
- | <big>**Data Collection**</big>
-**A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent? |
-**A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those? |
-**A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis? |
-**A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)? |
- | <big>**Data Storage**</big>
-**B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)? |
-**B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed? |
-**B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed? |
- | <big>**Analysis**</big>
-**C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)? |
-**C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)? |
-**C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data? |
-**C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis? |
-**C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future? |
- | <big>**Modeling**</big>
-**D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory? |
-**D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)? |
-**D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics? |
-**D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed? |
-**D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood? |
- | <big>**Deployment**</big>
-**E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)? |
-**E.2 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)? |
-**E.3 Roll back**: Is there a way to turn off or roll back the model in production if necessary? |
-**E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed? |
+| Checklist Question | Examples |
+| --- | --- |
+| **Data Collection** | |
+| **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent? | |
+| **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those? | |
+| **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis? | |
+| **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)? | |
+| **Data Storage** | |
+| **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)? | |
+| **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed? | |
+| **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed? | |
+| **Analysis** | |
+| **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)? | |
+| **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)? | |
+| **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data? | |
+| **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis? | |
+| **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future? | |
+| **Modeling** | |
+| **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory? | |
+| **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)? | |
+| **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics? | |
+| **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed? | |
+| **D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood? | |
+| **Deployment** | |
+| **E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)? | |
+| **E.2 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)? | |
+| **E.3 Roll back**: Is there a way to turn off or roll back the model in production if necessary? | |
+| **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed? | |
diff --git a/docs/render_templates.py b/docs/render_templates.py
index f0f14b1..42417cf 100644
--- a/docs/render_templates.py
+++ b/docs/render_templates.py
@@ -54,12 +54,12 @@ def make_table_of_links():
     for r in refs:
         refs_dict[r["line_id"]] = r["links"]

-    template = """<center>Checklist Question</center> | <center>Examples</center>
---- | ---
+    template = """| Checklist Question | Examples |
+| --- | --- |
 {lines}
 """
-    line_template = "**{line_id} {line_summary}**: {line} | {row_text}"
-    section_title_template = " | <big>**{section_title}**</big>"
+    line_template = "| **{line_id} {line_summary}**: {line} | {row_text} |"
+    section_title_template = "| **{section_title}** | |"
     line_delimiter = "\n"

     formatted_rows = []
@@ -74,7 +74,7 @@ def make_table_of_links():
         for link in refs_dict[line.line_id]:
             text = link["text"]
             url = link["url"]
-            bullet_hyperlink = f"<br/> &bull; [{text}]({url})"
+            bullet_hyperlink = f'<br/> &bull; <a href="{url}" target="_blank">{text}</a>'
             bulleted_list.append(bullet_hyperlink)

         formatted_bullets = "".join(bulleted_list)

From e4432982e7127a2a704c1d819fa5af2bdf21ad40 Mon Sep 17 00:00:00 2001
From: Jay Qi <2721979+jayqi@users.noreply.github.com>
Date: Fri, 20 Mar 2026 17:04:20 -0400
Subject: [PATCH 2/4] Add noopener noreferrer

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 docs/render_templates.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/render_templates.py b/docs/render_templates.py
index 42417cf..4d8eadd 100644
--- a/docs/render_templates.py
+++ b/docs/render_templates.py
@@ -74,7 +74,7 @@ def make_table_of_links():
         for link in refs_dict[line.line_id]:
             text = link["text"]
             url = link["url"]
-            bullet_hyperlink = f'<br/> &bull; <a href="{url}" target="_blank">{text}</a>'
+            bullet_hyperlink = f'<br/> &bull; <a href="{url}" target="_blank" rel="noopener noreferrer">{text}</a>'
             bulleted_list.append(bullet_hyperlink)

         formatted_bullets = "".join(bulleted_list)

From 68cf56adfd5d5235b9ac5b0852bda760071182bb Mon Sep 17 00:00:00 2001
From: Jay Qi
Date: Fri, 20 Mar 2026 17:05:19 -0400
Subject: [PATCH 3/4] Fix typo in URL

---
 deon/assets/examples_of_ethical_issues.yml | 8 ++++----
 docs/docs/examples.md                      | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/deon/assets/examples_of_ethical_issues.yml b/deon/assets/examples_of_ethical_issues.yml
index 4c03b61..27d0af7 100644
--- a/deon/assets/examples_of_ethical_issues.yml
+++ b/deon/assets/examples_of_ethical_issues.yml
@@ -55,7 +55,7 @@
 - line_id: C.2
   links:
     - text: ✅ A study by Park et al shows how reweighting can mitigate racial bias when predicting risk of postpartum depression.
-      url: https://doi.org/10.1001/jamanetworkopen.2021.3909 
+      url: https://doi.org/10.1001/jamanetworkopen.2021.3909
     - text: ⛔ word2vec, trained on Google News corpus, reinforces gender stereotypes.
       url: https://www.technologyreview.com/s/602025/how-vector-space-mathematics-reveals-the-hidden-sexism-in-language/
     - text: ⛔ Women are more likely to be shown lower-paying jobs than men in Google ads.
@@ -82,7 +82,7 @@
   links:
     - text: ✅ Amazon developed an experimental AI recruiting tool, but did not deploy it because it learned to perpetuate bias against women.
       url: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
-    - text: ⛔ In hypothetical trials, language models assign the death penalty more frequently to defendants who use African American dialects. 
+    - text: ⛔ In hypothetical trials, language models assign the death penalty more frequently to defendants who use African American dialects.
       url: https://arxiv.org/abs/2403.00742
     - text: ⛔ Variables used to predict child abuse and neglect are direct measurements of poverty, unfairly targeting low-income families for child welfare scrutiny.
       url: https://www.wired.com/story/excerpt-from-automating-inequality/
@@ -92,7 +92,7 @@
       url: https://www.whitecase.com/publications/insight/algorithms-and-bias-what-lenders-need-know
 - line_id: D.2
   links:
-    - text: ✅ A study by Garriga et al uses ML best practices to test for and communicate fairness across racial groups for a model that predicts mental health crises. 
+    - text: ✅ A study by Garriga et al uses ML best practices to test for and communicate fairness across racial groups for a model that predicts mental health crises.
       url: https://www.nature.com/articles/s41591-022-01811-5
     - text: ⛔ Apple credit card offers smaller lines of credit to women than men.
       url: https://www.wired.com/story/the-apple-card-didnt-see-genderand-thats-the-problem/
@@ -119,7 +119,7 @@
 - line_id: D.4
   links:
     - text: ✅ GDPR includes a "right to explanation," i.e. meaningful information on the logic underlying automated decisions.
-      url: hhttps://academic.oup.com/idpl/article/7/4/233/4762325
+      url: https://academic.oup.com/idpl/article/7/4/233/4762325
     - text: ⛔ Patients with pneumonia with a history of asthma are usually admitted to the intensive care unit as they have a high risk of dying from pneumonia. Given the success of the intensive care, neural networks predicted asthmatics had a low risk of dying and could therefore be sent home. Without explanatory models to identify this issue, patients may have been sent home to die.
      url: http://people.dbmi.columbia.edu/noemie/papers/15kdd.pdf
 - line_id: D.5

diff --git a/docs/docs/examples.md b/docs/docs/examples.md
index c3338d1..89ea363 100644
--- a/docs/docs/examples.md
+++ b/docs/docs/examples.md
@@ -25,7 +25,7 @@ To make the ideas contained in the checklist more concrete, we've compiled **exa
 | **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory? | |
 | **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)? | |
 | **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics? | |
-| **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed? | |
+| **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed? | |
 | **D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood? | |
 | **Deployment** | |
 | **E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)? | |

From 8d771fec423cb80e343680b0d921a744d0f6ce1b Mon Sep 17 00:00:00 2001
From: ejm714
Date: Mon, 23 Mar 2026 13:56:10 -0700
Subject: [PATCH 4/4] remake docs so we get noopener noreferrer in examples.md

---
 docs/docs/examples.md | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/docs/docs/examples.md b/docs/docs/examples.md
index 89ea363..d4e9960 100644
--- a/docs/docs/examples.md
+++ b/docs/docs/examples.md
@@ -7,28 +7,28 @@ To make the ideas contained in the checklist more concrete, we've compiled **exa
 | Checklist Question | Examples |
 | --- | --- |
 | **Data Collection** | |
-| **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent? | |
-| **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those? | |
-| **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis? | |
-| **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)? | |
+| **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent? | |
+| **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those? | |
+| **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis? | |
+| **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)? | |
 | **Data Storage** | |
-| **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)? | |
-| **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed? | |
-| **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed? | |
+| **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)? | |
+| **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed? | |
+| **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed? | |
 | **Analysis** | |
-| **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)? | |
-| **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)? | |
-| **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data? | |
-| **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis? | |
-| **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future? | |
+| **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)? | |
+| **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)? | |
+| **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data? | |
+| **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis? | |
+| **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future? | |
 | **Modeling** | |
-| **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory? | |
-| **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)? | |
-| **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics? | |
-| **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed? | |
-| **D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood? | |
+| **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory? | |
+| **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)? | |
+| **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics? | |
+| **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed? | |
+| **D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood? | |
 | **Deployment** | |
-| **E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)? | |
-| **E.2 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)? | |
-| **E.3 Roll back**: Is there a way to turn off or roll back the model in production if necessary? | |
-| **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed? | |
+| **E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)? | |
+| **E.2 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)? | |
+| **E.3 Roll back**: Is there a way to turn off or roll back the model in production if necessary? | |
+| **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed? | |