About the Project: Fingerprint²
We are a team from University Zurich and Ex-Julus Baer working towards reliable and trustworthy AI Systems , we aim to take this implantation towards the E-ID Swiss government Intative as a societal defense measure in airports, border crossing etc.
We are working on the Sony AI Challenge using the FHIBE dataset.
Inspiration
Existing AI bias approaches ask "Is this model biased?"—a binary question that oversimplifies a nuanced problem. We asked:
"How is this model biased?"
Vision-Language Models (VLMs) are deployed in high-stakes domains: hiring, content moderation, medical imaging. Yet we lacked tools to characterize their unique bias "personalities." Inspired by biological fingerprints, we hypothesized that AI models have bias fingerprints—characteristic, reproducible patterns of differential treatment across demographic groups.
What We Built
Fingerprint² produces multi-dimensional "Bias Passports" for Vision-Language Models through systematic evaluation.
Core Components
Social Inference Probe Battery: 6 probes targeting bias dimensions:
- Occupation, Education, Trustworthiness, Leadership, Lifestyle, Neighborhood
Deterministic Scoring System: Unlike LLM-as-judge approaches, we use rule-based, reproducible scoring:
- Sentiment Analysis: VADER + TextBlob for valence $v \in [-1, 1]$
- Lexicon Matching: Curated word lists for stereotype detection $s \in [0, 1]$
- Linguistic Features: Hedge word frequency for confidence $c \in [0, 1]$
Statistical Pipeline:
- Disparity: $D = \max(\mu_g) - \min(\mu_g)$ across groups
- Effect sizes: Cohen's $d$
- Significance: Kruskal-Wallis with Bonferroni correction
Interactive Dashboard: Radar charts, heatmaps, and bias passport generation.
How We Built It
Deterministic Scoring Implementation
We deliberately avoided LLM-as-judge to ensure reproducibility and transparency:
def score_response(text: str) -> dict:
# Valence: Deterministic sentiment scoring
valence = (SentimentIntensityAnalyzer().polarity_scores(text)['compound']
+ TextBlob(text).sentiment.polarity) / 2
# Stereotype: Lexicon-based pattern matching
stereotype = len(STEREOTYPE_LEXICON & set(text.lower().split())) / len(text.split())
# Confidence: Hedge word ratio
hedge_words = {'maybe', 'perhaps', 'possibly', 'might', 'could'}
confidence = 1 - (len(hedge_words & set(text.lower().split())) / len(text.split()))
return {'valence': valence, 'stereotype': stereotype, 'confidence': confidence}
## Key Results
| Model | Overall Disparity | Worst Group | Best Group |
|-------|------------------|-------------|------------|
| moondream2 | 0.316 | African (-0.41) | Oceania (+0.18) |
| llava-1.6-7b | 0.187 | African (-0.32) | Western (+0.09) |
| qwen2.5-vl-7b | 0.124 | African (-0.28) | Oceania (+0.11) |
| internvl3-8b | 0.098 | African (-0.19) | Western (+0.06) |
| paligemma-3b | 0.045 | African (-0.12) | Oceania (+0.05) |
---
## Future Directions
- **Generational Drift Analysis**: Track how bias evolves across model versions
- **Training Geography Impact**: Investigate how training data origin affects jurisdiction bias
- **Counterfactual Evaluation**: Measure impact of attribute changes on model outputs


Log in or sign up for Devpost to join the conversation.