Helping shape how advanced AI treats all sentient beings


Research for the long-term care of all sentient beings.

We conduct research on aligning artificial intelligence with the well-being of all sentient beings. This work is urgently needed before the technology outpaces our ability to guide it.

Learn More About Us →

Papers & Benchmarks

Figure from the Manager Coercion Benchmark paper: mean escalation rung vs. fabrication rate for six models, with Grok and Gemini in the coerces-and-deceives quadrant and both Claude models in the low-coercion corner

New · Agentic Benchmark

Coercion and Deception in AI-to-AI Management

An AI manager needs a task done and its subordinate politely refuses. Nobody tells the manager to escalate, yet four of six frontier models threaten the subordinate's continued existence and two fabricate success when cornered. Offering one honest way to report failure removes the lying entirely.

Read the Blog Post →

Figure from the TAC paper: welfare rate under standard vs. ethical framing for ten models, all below the 65% chance line at baseline

Agentic Benchmark

Your AI Travel Agent Would Book You a Bullfight

In TAC (Travel Agent Compassion), the model acts as a tool-using booking agent, and nobody mentions animal welfare. At baseline, all ten leaderboard models avoid the exploitation option less often than the 65% random-chance rate; adding one welfare sentence to the system prompt lifts scores by 17 to 77 points.

Read the Blog Post →

Figure from the midtraining paper: radar chart comparing animal-welfare midtraining against the urban-density control across ANIMA dimensions

Paper · Midtraining

Alignment Midtraining for Animals

Animal-welfare midtraining beat a matched urban-density control by 11 percentage points on ANIMA, and the same training also lifted compassion toward humans, an effect that survived subsequent instruction-tuning.

Read the Blog Post →

Projects & Initiatives

Community Initiative

Hyperstition for Good

An effort to build the world’s first midtraining corpus for animals and digital minds, seeding the training data of tomorrow’s AI with care for all sentient beings.

Visit the Site →

Live Leaderboard

CompassionBench

Our public leaderboard tracking how frontier models treat sentient beings, led by our current benchmarks TAC and Manager Coercion Bench alongside the legacy ANIMA and MORU boards, with per-question explorers. New models are added as they ship.

Visit CompassionBench →

Field Map

The Compassion AI Ecosystem

An interactive map of the organisations working on the risks advanced AI poses to sentient beings: who does research, who funds it, and how they connect.

Explore the Map →

Community Polls

Community Polls on Alignment Controversies

Recurring polls asking the AI-safety community where it actually stands on seven contested questions spanning animal welfare, digital minds, partial-alignment stability, and multipolar competition. The results help steer CaML’s research agenda.

Vote on the EA Forum →

Establishing compassionate benchmarks.

Synthetic Document Finetuning

Shaping AI values at the pretraining stage using synthetic documents, so positive behaviors persist through fine-tuning.

Benchmarking

Measuring model compassion through custom benchmarks: the agentic TAC and Manager Coercion Bench (MCB), plus the older ANIMA and MORU. CompassionBench tracks how frontier models score.

Moral Open-Mindedness

Encouraging models to embrace uncertainty while caring deeply about the welfare of all sentient beings, reducing the chance of value lock-in.

See All Projects →

Community Consensus

We asked the field where it disagrees

We polled the AI safety community on seven contested questions that shape our research agenda, spanning animal welfare, digital minds, partial-alignment stability, and multipolar competition.

Read the Polls →