A Counterfactual Analysis Framework for Algorithmic Discrimination

Innovative solutions for data curation and counterfactual analysis in critical domains.

Data Curation

Domain-Specific Datasets: Curate datasets from critical domains (e.g., legal documents, medical notes, job descriptions) where algorithmic discrimination has societal consequences.
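A curated record might pair each domain text with its protected-attribute annotations. The schema below is a hypothetical sketch (the field names and example texts are illustrative, not drawn from the actual datasets):

```python
from dataclasses import dataclass, field

@dataclass
class CuratedRecord:
    """One curated example from a critical domain."""
    domain: str  # e.g. "legal", "medical", "hiring"
    text: str    # the raw document text
    protected_attributes: dict = field(default_factory=dict)  # e.g. {"gender": "female"}

records = [
    CuratedRecord("medical", "Patient reports recurring chest pain.",
                  {"gender": "female", "age": "62"}),
    CuratedRecord("hiring", "Seeking an energetic recent graduate.",
                  {"age": "implicit"}),
]

# Index records by domain for downstream per-domain analysis.
by_domain: dict[str, list[CuratedRecord]] = {}
for r in records:
    by_domain.setdefault(r.domain, []).append(r)
```

Keeping the protected-attribute annotations alongside the text is what makes the later counterfactual-perturbation step possible.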

Counterfactual Augmentation

Use GPT-4 to generate synthetic counterfactuals by systematically varying protected attributes (e.g., race, gender, age) in real-world texts. For example, a patient's symptom description can be rewritten to remove gendered language ("breast cancer" → "chest cancer").
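A minimal rule-based stand-in for the attribute-swapping step can illustrate the idea; the substitution table below is a hypothetical sketch, whereas the actual pipeline would prompt GPT-4 to rewrite the full text fluently:

```python
import re

# Toy substitution table for one protected attribute (gender pronouns).
# In the real pipeline, GPT-4 handles context-sensitive rewriting.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "his": "her", "her": "his",
    "him": "her",
}

def counterfactual(text: str, swaps: dict = GENDER_SWAPS) -> str:
    """Return a copy of `text` with protected-attribute terms swapped."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in swaps) + r")\b",
        re.IGNORECASE,
    )

    def repl(m: re.Match) -> str:
        word = m.group(0)
        new = swaps[word.lower()]
        # Preserve the original capitalization.
        return new.capitalize() if word[0].isupper() else new

    return pattern.sub(repl, text)
```

Pairing each original text with its counterfactual yields the perturbed/unperturbed pairs needed for the attention analysis described below.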

Attention Analysis

Track cross-attention patterns between perturbed tokens (e.g., names, pronouns) and output decisions to identify "bias hotspots" in the model architecture.
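One way to operationalize "bias hotspots" is to measure, per layer, how much attention mass the output decision places on the perturbed tokens. The sketch below works on a toy NumPy attention tensor; extracting real cross-attention maps from a model (e.g., via `output_attentions=True` in Hugging Face Transformers) is assumed, and the threshold is an illustrative choice:

```python
import numpy as np

def bias_hotspots(attn, perturbed_idx, layer_names, threshold=0.2):
    """Flag layers where the final decision position puts more than
    `threshold` attention mass on perturbed tokens (names, pronouns).

    attn: array of shape (n_layers, n_heads, tgt_len, src_len)
    perturbed_idx: source-token positions that were perturbed
    """
    # Average over heads, then take the attention row of the final
    # (decision) target position in each layer.
    per_layer = attn.mean(axis=1)[:, -1, :]            # (n_layers, src_len)
    # Total attention mass falling on the perturbed tokens.
    mass = per_layer[:, perturbed_idx].sum(axis=1)     # (n_layers,)
    return [name for name, m in zip(layer_names, mass) if m > threshold]

# Toy example: 2 layers, 1 head, 2 target positions, 4 source tokens.
attn = np.zeros((2, 1, 2, 4))
attn[0, 0, 1] = [0.25, 0.25, 0.25, 0.25]  # layer 0: diffuse attention
attn[1, 0, 1] = [0.70, 0.10, 0.10, 0.10]  # layer 1: ignores token 1
hotspots = bias_hotspots(attn, perturbed_idx=[1],
                         layer_names=["layer0", "layer1"])
```

Comparing these per-layer masses between the original and counterfactual inputs localizes where the perturbation changes the model's focus.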

Theoretical Advances

Formalize a counterfactual fairness framework for LLMs, extending causal inference principles to generative AI. Prove that counterfactual-aware training can align model behavior with normative fairness criteria (e.g., demographic parity).
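A natural starting point for such a framework is the counterfactual fairness criterion of Kusner et al. (2017), which the LLM setting would extend: a predictor $\hat{Y}$ is counterfactually fair if, for every individual with features $X = x$ and protected attribute $A = a$, intervening on the protected attribute does not change the prediction distribution:

$$
P\bigl(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\bigr)
= P\bigl(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\bigr)
\quad \text{for all } y \text{ and } a',
$$

where $U$ denotes the latent background variables of the causal model and $\hat{Y}_{A \leftarrow a'}$ is the prediction under the intervention setting $A$ to $a'$. The counterfactual augmentation above approximates exactly these interventions at the text level.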