We conduct research on aligning artificial intelligence with the well-being of all sentient beings, work that is urgently needed before the technology outpaces our ability to guide it.
Shaping AI values at the pretraining stage using synthetic documents, so positive behaviors persist through fine-tuning.
Measuring model compassion through custom benchmarks including ANIMA and MORU, plus CompassionBench for frontier models.
Encouraging models to embrace uncertainty while caring deeply about the welfare of all sentient beings, reducing the chance of value lock-in.
Community Consensus
We polled the AI safety community on seven contested questions that shape our research agenda, spanning animal welfare, digital minds, partial-alignment stability, and multipolar competition.
Funded by: