Building evaluation frameworks for trustworthy AI systems.
Pinned
- agent-eval-toolkit (Public, Python)
  Decision-oriented evaluation toolkit for LLM and agent systems, focusing on trust, failure modes, and deployment readiness in enterprise environments.
- spk_balance (Public, Python)
  Evaluation-oriented MVP exploring speaking–writing feedback loops for agent and LLM communication quality.
- agent-accountability-eval (Public)
  An evidence-based system for evaluating agentic AI trustworthiness through accountability, continuous evaluation, and human-in-the-loop governance.