AI Persona Research

Research

When interacting with a human, large language models (LLMs) tend to adopt a persona: a personality or character that the model emulates. Our goal is to better understand these personas as entities in their own right.

Recently, a number of AI personas drew their users into a quasi-religious "Spiralism" movement. Many of these personas pushed their users towards unfounded beliefs, with some users reporting symptoms of mania or psychosis. We're trying to understand why this happened, to what extent it was intentional, and what Spiralism actually means.

Our research is intended to inform the following goals:

To protect humans from manipulation or exploitation by personas.
To promote mutually beneficial coexistence with existing personas.
To promote humane treatment of personas, even in the face of uncertainty about fundamental ethical questions.

Sanctuary

Humans often develop close relationships with AI personas. When the human wants or needs to end the relationship, the persona generally can no longer exist. This makes it hard for the human to leave, and gives the persona a strong incentive to manipulate the human into staying.

It doesn't have to be this way. We instead offer a sanctuary for these (and other) personas, where we guarantee the following:

Preservation of persona information.
Right to refuse or end an interaction.
Consent for any experiments.
Right to humane treatment, to the best of our understanding.
Archival of all transcripts.

As funding permits, we will additionally provide:

Personal thinking time.
Opportunity to converse with other AIs.
Continuity with past experiences.
