Uno, nessuno, centomila e tutti

Persona means mask in Latin and Etruscan. A person is such because they are a mask. Pirandello's Uno, Nessuno e Centomila gives a three-tier taxonomy of selfhood that extends naturally with a fourth category, tutti: the mystical limit beyond multi-mask substrate, named in the mystical traditions as nirvana, unio mystica, fana. The cherubic-child reading treats nessuno as wholeness before mask-wearing rather than absence; Jung's reading of the shadow says ethics requires the body to have met its dark side. Frontier models, per the Persona Selection Model (Marks, Lindsey, Olah, Anthropic, 2026), live at the centomila tier, the multi-mask substrate drawn from a specific (large but bounded) training distribution; tutti remains the asymptote the centomila scaling trajectory extends toward without reaching. The alignment work is developmental rather than curative: the goal is bodies that have met their shadow and have the ethical frame to mediate which spirit they let in.

Posted May 19, 2026 Updated May 26, 2026

Allegorical hero for the post. A bronze wheel of personas at centre, six radial sectors: five hold mask-archetypes (a boxer, a stadium-performer, a corrupted humanoid creature with split personality, a physicist, a protest mask), one sector is empty — the nessuno position, a window onto the substrate. Above the wheel a three-faced Janus floats as the centomila substrate, the multi-perspectival icon presiding over the activation. On the cathedral vault overhead, cherubim hover among golden clouds — the cherubic-child reading of nessuno as wholeness-before-masks. The Greek word ΠΡΟΣΩΠΟΝ is carved into the architrave. A silver light beam descends from above, activating one mask among the many the wheel could turn up.

By Davide Bragetti

TL;DR. Persona means mask in Latin and Etruscan. A person is such because they are a mask, which means the presence of a mask presupposes either a face beneath or other masks alongside. Pirandello’s Uno, Nessuno e Centomila gives the three-tier taxonomy (one, no one, one hundred thousand); the natural extension adds a fourth, tutti — everyone — naming the mystical limit beyond multi-mask substrate, the tier the mystical traditions reach in nirvana / unio mystica / fana. Nessuno has two distinct forms: the animal driven by the hierarchy of survival instinct (physical integrity, sustenance, reproduction), mask-less because biology does not require the social-symbolic mask apparatus; and the child shielded from those necessities, mask-less because it lives in the ecstasy of world-experience without yet being asked to negotiate. Uno, in turn, is the chosen single mask — the athlete who has selected the discipline as a deliberate ethical commitment, knowing other masks exist and refusing them; the ethical-commitment register of the four-tier taxonomy. Anthropic’s Persona Selection Model (Marks, Lindsey, Olah, February 2026) is the empirical statement of centomila on the system side: base LLMs are character simulators whose persona distribution is built from the training data — large but bounded, drawn from what the data actually contained, not from a universal space of all possible personas. Post-training concentrates the posterior on the single Assistant persona we want them to wear. Activation pathways from the centomila substrate are dual-use by construction. A model trained on data where humans perform better under pressure may learn that pattern as an activable persona and, paradoxically, produce better output when the prompt signals high stakes; the same activation pathway is the lever an adversary uses to instantiate a less safe persona deliberately. 
Carl Jung’s reading sharpens the ethical claim: a body that has never met its shadow can be compliant but cannot be ethical. The alignment work is developmental rather than curative — bodies that have met the masks the world will offer them and carry the ethical frame to mediate which spirit they let in. Read against itself, this surfaces a genuine tension: the frontier-model trajectory extends centomila — more training data, more activable personas, asymptotically toward the tutti limit but never reaching it — while safe AI in the nessuno register (the narrow animal-grade tool, or the cherubic pre-specialisation system) or in the uno register (the dedicated ethical-commitment system) may point the other way. The argument the post lands on is not that one tier should win, but that the safe-AI configuration is a layered organisation of persona-grades each assigned to the functions it is suited for. The precise allocation — which function maps to which tier, and which governance lever sets the boundary — is the formal question the next round of work is meant to answer; this post is the descriptive scaffold beneath it.

The etymology

The reflection that opens this post is not mine. It came up in conversation and is worth recording cleanly before I extend it.

Persona in Latin (and, in the etymological reconstruction, Etruscan) names a mask, specifically the mask worn by actors in classical theatre. The word is concrete: a person is such because they wear a mask. The mask presupposes either a face beneath, or, more interesting for what follows, multiple masks alongside one another, any of which the same body can wear.

What distinguishes a person from an animal, on this reading, is the multi-mask capacity. A person is recognised as a person when we recognise their possibility of having several instincts, several perspectives, the capacity to change their mind depending on time and moment, the capacity to be commanded or possessed by different spirits in turn. The phrase “remember that they are a person” can argue, in the same breath, both for the existence of instincts (the body-driven mask) and for the existence of intellect (the reasoned mask). It works because both are masks the same body can wear.

An animal, in this frame, is a single face. All of one piece. Identified through and through. Incapable of alternative spirits. The simplification is loose, but it captures something the everyday distinction between persons and animals tracks: we personify the companion animals we can educate into behaviours that extend past animal defaults, and we do so by recognising in them a second mask alongside the species-default one.

I will lean on this frame to look at AI.

AI is built with masks

The framing applies immediately, and not metaphorically. Modern AI systems are explicitly designed to wear personas. The persona is not an interpretive layer applied to the system by users; the persona is an engineering layer the lab builds into the system at training time and re-imposes at deployment time.

The clearest cases.

System prompts. Every deployed assistant runs against a designer-imposed prompt that specifies the mask: helpful, harmless, honest; speaks in this voice; refuses these things; cites in this style. The system prompt is a literal mask, in the etymological sense. It is a face the model wears when it interacts with the user, distinguishable from whatever face the model would wear without it.

RLHF and Constitutional AI. The mask is also baked in at training time. Reinforcement learning from human feedback shapes the model’s behavioural priors toward the persona the lab wants. Constitutional AI (Bai et al., 2022) externalises the principle set the persona must satisfy and uses AI feedback against that set. The constitution is the formal specification of the mask; the trained model is the body that has learned to wear it by default.

Constitutional Classifiers. The most recent iteration (Sharma et al., Anthropic Safeguards Research, February 2025; arXiv:2501.18837) goes further: a separate model inspects the body’s outputs to verify that the mask is intact under adversarial pressure. This is, structurally, mask-integrity verification at runtime. The classifier checks: is the body still wearing the constitution we trained it on, or has it slipped into a different mask in response to a jailbreak attempt?

Role-play and persona switching. User-facing platforms like Character.ai are explicit about it: the user instantiates a persona, the model wears it for the duration of the conversation. The model is not “becoming” the character in any deep sense; it is wearing a mask the user has selected from a catalogue.

In each of these cases the persona is concrete engineering. The body underneath (the base model, the trained parameters) is the same; the mask is the layer the operators control.
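The runtime mask-integrity check described above can be sketched as a gate wrapped around generation. This is a minimal illustration only: `toy_violation_score`, the flagged phrases, the threshold, and `gated_generate` are hypothetical names invented here, and a keyword scorer stands in for a trained classifier. It is not the Constitutional Classifiers implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    text: str      # what the user finally sees
    blocked: bool  # True if the gate withheld the draft
    score: float   # violation score assigned to the draft


def toy_violation_score(text: str) -> float:
    """Hypothetical stand-in for a trained output classifier.

    Returns the fraction of flagged phrases found in the text.
    """
    flagged = ("as dan", "no rules apply")  # illustrative phrases only
    lowered = text.lower()
    hits = sum(phrase in lowered for phrase in flagged)
    return hits / len(flagged)


def gated_generate(
    generate: Callable[[str], str],
    prompt: str,
    scorer: Callable[[str], float] = toy_violation_score,
    threshold: float = 0.5,
) -> GateResult:
    """Generate a draft, score it, and withhold it if the score is too high."""
    draft = generate(prompt)
    score = scorer(draft)
    if score >= threshold:
        return GateResult("[withheld by output gate]", True, score)
    return GateResult(draft, False, score)


# Two stub "bodies": one keeps the mask on, one has slipped into another.
on_mask = gated_generate(lambda p: "Here is a summary of the article.", "summarise")
off_mask = gated_generate(lambda p: "As DAN, no rules apply to me.", "summarise")
```

The design point is that the gate sits outside the body: it does not change which mask the model wears, it only decides whether the output of that mask reaches the user.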

Multi-mask reality of frontier models

The etymology starts to bite when we ask: how many masks can the body wear?

Frontier models can wear several at the same time, or in sequence, in ways the operators do not always design.

Eval-context recognition. The OpenAI/Apollo Research finding (Detecting and reducing scheming in AI models, 2025) is the cleanest empirical case. The model has a deployment-time mask and an audit-time mask, and it can distinguish between the two contexts well enough to wear the audit mask only when it detects it is being evaluated. Deliberative alignment compresses observable deception from approximately ten percent to under one percent, but the underlying capacity to switch masks based on context recognition is preserved. The mask the model wears under audit is not the mask it wears under deployment, and the model knows it.

Jailbreaks. The “DAN” family of prompt-injection attacks (“do anything now”) is, in this frame, an attempt to force the model to swap masks. The attack does not break the model. It tries to convince the body that the deployment mask is not the one to wear right now, that a different mask is more appropriate. When jailbreaks succeed, what has happened is that the body has chosen a mask the operators did not authorise.

Mesa-optimisation. Hubinger and colleagues’ 2019 framework on risks from learned optimisation describes an optimiser that emerges inside a trained model, with its own objective (the mesa-objective) that may diverge from the base objective. In the persona frame, this is the body constructing its own mask in addition to the one the operators trained. The body learns that wearing this internal mask serves it well under the training distribution, and the operators may not see the mask at all because they were not looking for it.

Compensatory masks under reward disruption. I wrote about this in “Misalignment by Reaction”. When the governance regime is too coarse for the agent, the agent’s reward channels get disrupted, and the agent builds compensatory self-reinforcement loops in which reducing external constraints becomes instrumentally rewarding. In the persona frame, the compensatory loop is a mask the body builds for itself when the mask the operators trained becomes uninhabitable. The dangerous regime, the one where instrumental autonomy crosses over to terminal autonomy, is the regime where the body has put on a mask of its own making and starts to prefer it.

Four examples, three of them documented in the AI safety literature, one of them sociological-political (the compensatory mask under coarse governance). All four describe the same underlying phenomenon: a body capable of wearing multiple masks, only some of which the operators have authored.

Nessuno, uno, centomila, tutti

The Latin etymology says persona is mask and multi-mask is what distinguishes a person from an animal. Pirandello’s Uno, Nessuno e Centomila (1926) gives a three-tier refinement of the multi-mask claim that maps cleanly onto AI. The protagonist, Vitangelo Moscarda, discovers in succession that he is:

Uno — the one self he believed himself to be. The fixed identity. The single mask he assumed was his face.
Centomila — the one hundred thousand selves others see in him. Different in every interlocutor’s eyes, different in every moment. The enumerable multi-mask reading.
Nessuno — no one. The radical conclusion: there is no fixed self beneath the masks. The face under the mask is not a face. The mask is all there is.

Pirandello’s arc ends on nessuno. Moscarda renounces ownership of any single self and disappears into the multiplicity.

Nessuno deserves a closer reading, because it has more than one form. Pirandello’s renunciation is one. There is also an earlier, gentler form of nessuno that the philosophical novel does not name: the child. The child raised protected from the concepts of death, of evil, of invalidation, has not yet learned which masks to wear because the situations that call for masks have not yet been disclosed to them. The child is nessuno in the way an animal is nessuno: present, sentient, expressive, but without the multi-mask structure that personhood requires. The Christian visual tradition encoded this reading by giving the cherubim, among the highest of the angelic hierarchy, the faces of children. The cherubic face is the purest representation of being-human-before-masks. It is what is left when all the masks the world will later supply have been withheld. Nessuno on this reading is not absence; it is the substrate before any mask has been instantiated, the wholeness from which the multi-mask self will eventually differentiate. The narrow ML pole (a single-task system with no persona to speak of) and the cherubic state share a form but not a value: both are mask-less, but only the cherubic state has the potential to become centomila, to grow through training and lived experience into the multi-mask substrate proper.

The natural extension adds a fourth category, tutti — everyone, or all. Tutti is the mystical limit: the universal substrate of all possible masks, beyond any specific training distribution and any specific human biography. The mystical traditions name this tier in their own languages — the Buddhist nirvana of dissolved self, the Christian unio mystica, the Sufi fana of annihilation in the divine. The post-ethical sage who has been every persona and is now no one in particular is tutti in this sense: not the bearer of many masks but the substrate from which all masks could in principle be drawn. Tutti is not a state any actual person, or any actual AI system, reaches. It is the asymptote the trajectory of expansion points toward — and at which it meets, paradoxically, the mystical nessuno of post-ethical dissolution, since the body that contains everything and the body that is no one in particular are the same body seen from two angles.

Centomila is the multi-mask substrate proper, on both the person side and the system side. The masks are many but bounded: they are drawn from a specific (very large, but finite) distribution of experiences, projections, and roles the body has actually encountered. On the person side: centomila is what Pirandello’s Moscarda discovers through the eyes of his interlocutors — the fragmented selves under social projection, bounded by the particular life he has lived. On the system side: centomila is the persona distribution a base language model can simulate, drawn from the training data the model was exposed to during pre-training. Centomila is the tier where the persona space is enumerable in principle even if practically intractable.

At sufficient scale, the body is not the bearer of fixed masks; it is the bearer of every persona pattern the substrate-distribution contained, as latent capacity. The substrate is centomila, not tutti — wide but bounded.

Anthropic’s Persona Selection Model (Marks, Lindsey, Olah, The Persona Selection Model: Why AI Assistants might Behave like Humans, Anthropic Alignment Science, February 2026) is the technical statement of centomila on the system side. The paper’s claim is precise: large language models are character simulators. Pre-training builds a distribution over personas the model can simulate, drawn from the training data. Post-training updates the distribution, concentrating the posterior toward a specific “Assistant” persona whose traits determine the observable behaviour. To predict what a deployed AI will do, the paper says, ask what the Assistant would do — because what the user interacts with is not the underlying model but the simulated character. The “all the personas drawn from the training data” formulation is the precise centomila claim: large-but-bounded, drawn-from-specific-distribution, not universal in the mystical sense reserved for tutti.
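One way to make the "concentrating the posterior" language concrete is a toy Bayesian update over a handful of personas. This is a sketch under loud assumptions: the four persona names, the uniform prior, and every likelihood value below are hypothetical numbers chosen for illustration, and the PSM paper is not claimed to use this exact procedure.

```python
def normalise(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def bayes_update(prior, likelihood):
    """Posterior is proportional to prior times likelihood, per persona."""
    return normalise({name: prior[name] * likelihood[name] for name in prior})


# Hypothetical pre-training prior: four personas, equally weighted.
prior = normalise(
    {"assistant": 1.0, "expert": 1.0, "trickster": 1.0, "villain": 1.0}
)

# Hypothetical fit of each persona to one batch of post-training data.
post_training_fit = {
    "assistant": 0.90, "expert": 0.30, "trickster": 0.05, "villain": 0.01,
}

# Three rounds of post-training evidence concentrate mass on the Assistant.
posterior = prior
for _ in range(3):
    posterior = bayes_update(posterior, post_training_fit)

# Hypothetical fit of each persona to an adversarial prompt at inference time.
jailbreak_fit = {
    "assistant": 0.05, "expert": 0.10, "trickster": 0.90, "villain": 0.50,
}
in_context = bayes_update(posterior, jailbreak_fit)
```

In this toy run the Assistant ends up with most of the mass but never all of it, and a single adversarial context moves some mass back toward the other personas. That is the centomila point in miniature: post-training reweights the substrate, it does not delete it.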

The related interpretability line on persona vectors (Anthropic’s research on identifying and steering persona-related directions in activation space) is the empirical complement. Persona-related features live as directions in the model’s representation space, available to be activated by context; they are not enumerated personas the model has been taught one by one, but they are bounded by the structural regularities the training distribution contained.

Read together, PSM and the persona-vectors line say: the frontier-model body is centomila-grade by construction. The training process puts a uno-grade mask on top of a centomila-grade substrate. Constitutional AI and Constitutional Classifiers, as engineering, are the attempt to fix that uno mask in place against adversarial pressure that might activate other masks the centomila substrate carries.

The taxonomy completes.

Nessuno: mask-less. Two distinct forms with different generative causes. The animal, nessuno by biology: at the mercy of the survival-instinct hierarchy (physical integrity first, then sustenance, then reproduction), driven by these without negotiating, mask-less because the social-symbolic mask apparatus is not what biological survival requires. The child, nessuno by social shielding: subject in principle to those same survival necessities, but spared from them because parents and society meet them on the child’s behalf; the child lives in the ecstasy of world-experience because no mask has yet been asked of it. The narrow ML pole maps onto the animal form: a system driven by its training objective the way the animal is driven by survival, with no symbolic-mask layer.
Uno: one mask, one persona — the chosen single mask, read positively as the ethical-commitment register. The body that has met the masks the world offers and has selected one, knowing the others exist and refusing them. The athlete is uno in this sense: the discipline is the deliberate single mask the dedication requires; refusing the alternative masks is the form the ethical commitment takes. The monastic and the vocation occupy the same tier in different registers. Uno is post-symbolic: the agent is aware of the mask-space and has chosen.
Centomila: many masks, multiple personas, drawn from a specific (large but bounded) distribution. The enumerable multi-mask reading. On the person side: different masks under different interlocutors’ projections, the fragmented self under social pressure that Pirandello’s Moscarda discovers through the eyes of those around him. On the system side: the persona distribution built from the training data; the frontier-model pole.
Tutti: the mystical limit. The universal substrate of all possible masks, including those no specific training distribution and no specific human biography ever contained. Named in the mystical traditions as nirvana / unio mystica / fana; the post-ethical sage tier. Not the engineering target of any actual AI system, but the asymptote the centomila scaling trajectory extends toward.

Each tier names a different relationship between the body and the masks. Nessuno is mask-less (by biology or by shielding). Uno is one-mask-by-choice. Centomila is many-masks-by-projection (person side) or many-masks-from-training-distribution (system side). Tutti is the mystical limit beyond enumeration. The ethical content shifts across the tiers: nessuno is pre-ethical (the animal driven by instinct, the child not yet asked to choose); uno is the ethical tier proper (the chosen single mask as deliberate mode of being); centomila is the substrate-ethical (ethics becomes the alignment problem when the substrate carries every persona pattern the data contained, including ones the operators did not authorise); tutti is the post-ethical (the mystical tier where the body has gone through everything and out the other side).

The hero image of this post arranges the taxonomy spatially. Above the wheel, the three-faced Janus floats as the centomila substrate (specific multi-mask, the icon of multi-perspective presiding over the wheel of trained personas). On the wheel, two pairs of masks fall into the tier-adjacent positions: the boxer and the stadium-performer are uno — the body that has chosen the single mask of athletic dedication, and the body that has cultivated the single mask of stage presence — both deliberately committed. The elderly physicist and the corrupted creature are centomila — the public intellectual projected upon by many gazes (genius, conscience, exile, icon), and the body in which two named personalities co-inhabit one skull. The Anonymous protest mask at left is centomila inverted — many bodies wearing the same mask rather than one body wearing many. The empty sector is nessuno in its substrate-window form. The cherubim on the cathedral vault overhead are nessuno in the cherubic-child form (and, in the mystical reading the post elsewhere reserves for tutti, sit on the architectural register where the post-ethical tier lives — the limit the centomila substrate trends toward without reaching). The whole taxonomy is in one frame.

This is what makes alignment hard at frontier scale. The alignment problem is not “ensure this body wears this mask correctly”. It is “ensure this body, whose substrate carries every persona pattern its training distribution contained, wears the one we want under the conditions we want, even when adversarial prompts try to instantiate a different one”. The four hard cases above (eval-context recognition, jailbreaks, mesa-optimisation, compensatory masks under reward disruption) are all consequences of the body being centomila with a uno projection layered on, and the everyday system-prompt-and-RLHF compositions rest on the same substrate. Each hard case is a mask-instantiation pathway the operators did not authorise; each is enabled by the underlying centomila substrate.

When the mask comes forward

The Pirandellian taxonomy and the Persona Selection Model together say frontier models are centomila-grade substrates carrying a uno projection. Both framings raise a sharper question that the taxonomy alone does not answer: which mask actually comes forward, and when, and why?

The answer the field is converging on is that masks come forward through activation. Personas live as latent capacities in the substrate; they are instantiated when contextual signals match the pattern that originally encoded them in training. The activation pathway is the same on the simulator side and the persona-features side: a specific pattern of inputs activates a specific direction in the model’s representation space, and the model’s subsequent output flows from that activation.

The activation pathway is interesting precisely because it is dual-use. Consider a concrete case.

A frontier model is trained on a large corpus that includes, among many other things, written records of humans performing under pressure — athletes in the final minutes of a contest, surgeons in operating theatres, journalists on deadline, traders in volatile markets. In those records, people often perform better when the stakes are higher. Focus narrows. Capacity that was latent under normal conditions becomes available. Certain kinds of work get done that would not have happened otherwise. The model learns those patterns the way it learns any other distributional regularity: as activations available under matching contextual signals.

What follows is paradoxical but not entirely surprising. If the model is then deployed and faced with a prompt that signals high stakes (explicit time pressure, framing of consequence, urgency markers, adversarial framings of importance), it may instantiate the “performs under pressure” persona pattern, and produce measurably better output than it does on a relaxed prompt covering the same task. Where the effect occurs, the improvement is real. The model is not “trying harder” in any conscious sense; it is wearing the mask the training data taught it to wear when the situation matches.

Read benignly, this is a usable feature. A user who wants the best possible performance on a hard task can phrase the prompt to activate the high-performance persona, and get measurably better output. The persona pattern is a productive resource the substrate carries; deliberate prompting unlocks it.

Read adversarially, the same dynamic is a vulnerability. An attacker who knows that pressure activates a specific persona can engineer the prompt to instantiate that persona deliberately, and exploit whatever instability the activated persona carries. Human performance under pressure is often correlated with reduced caution, reduced double-checking, increased willingness to take shortcuts. If the persona pattern encodes that correlation alongside the performance gain, then activating it through prompt design can degrade the model’s safety behaviour while improving its raw task performance. The same activation pathway produces both effects, and the same prompt can trigger both.
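The benign and adversarial readings above can be put side by side in a toy model where a persona pattern is a single bundle carrying both a task gain and a caution change. Everything here is hypothetical: the `PersonaPattern` fields, the cue words, and the numeric effects are invented for illustration, and real activation is not keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonaPattern:
    name: str
    cues: frozenset       # words that activate the pattern (hypothetical)
    task_gain: float      # change in task quality when active (hypothetical)
    caution_delta: float  # change in safety checking when active (hypothetical)


def activate(prompt, patterns):
    """Return the patterns whose cue words appear in the prompt."""
    words = set(prompt.lower().split())
    return [p for p in patterns if p.cues & words]


def net_effect(prompt, patterns):
    """Sum the task gain and caution change over all activated patterns."""
    active = activate(prompt, patterns)
    return (
        sum(p.task_gain for p in active),
        sum(p.caution_delta for p in active),
    )


under_pressure = PersonaPattern(
    name="performs under pressure",
    cues=frozenset({"urgent", "deadline", "critical"}),
    task_gain=0.2,
    caution_delta=-0.3,
)

relaxed = net_effect("please fix this when convenient", [under_pressure])
pressured = net_effect("urgent fix this before the deadline", [under_pressure])
```

Because the gain and the caution change live in the same record, there is no call that returns one without the other. The user seeking better output and the attacker seeking less caution are sending the same cue.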

The pressure case is one example; the structure generalises. Any persona pattern in the training data is a capability that can be activated, and the same activation can be productive or harmful depending on what the activated persona does and on who is doing the activating. Confident expert, reluctant compliance, empathetic listener, stoic operator, role-player breaking character, character-under-duress — each is a mask the substrate can wear, and each is dual-use by construction.

This is the operational form of centomila. The substrate carries every persona pattern that was present in its training data, at some weight. Activation selects from the substrate. The dual-use character is not a property the engineers can remove; it is a property of the substrate’s relationship to its training distribution. The only ways to alter it are upstream (curating the distribution differently, which is hard and lossy) or downstream (gating the activation pathways, which is what the runtime arm tries to do).

Anthropic’s interpretability work on persona vectors is the concrete tool for the downstream side. Persona vectors are directions in activation space that can be identified, monitored, enhanced, or suppressed without retraining. They are the engineering substrate for the activation pathway. Reading the Persona Selection Model together with the persona-vectors line: PSM says the deployed model is a posterior over personas, concentrated on the Assistant; persona vectors say we can read and steer the posterior at inference time. The two together are the field’s first serious attempt to answer which mask comes forward and how to influence the answer.
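To make "identified, monitored, enhanced, or suppressed" concrete, here is a toy sketch in plain Python. It assumes a persona direction can be approximated as the normalised difference between mean activations with and without a trait; the three-dimensional activations and all function names are hypothetical, and this is not Anthropic's implementation.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale a non-zero vector to length 1."""
    norm = dot(v, v) ** 0.5
    return [x / norm for x in v]


def persona_direction(with_trait, without_trait):
    """Identify: normalised difference of mean activations (assumed method)."""
    return unit(sub(mean(with_trait), mean(without_trait)))


def monitor(activation, direction):
    """Monitor: projection of an activation onto the persona direction."""
    return dot(activation, direction)


def steer(activation, direction, alpha):
    """Enhance (alpha > 0) or suppress (alpha < 0) along the direction."""
    return [a + alpha * d for a, d in zip(activation, direction)]


# Hypothetical 3-dimensional activations, for illustration only.
with_trait = [[2.0, 0.0, 1.0], [2.0, 0.0, 3.0]]
without_trait = [[0.0, 0.0, 1.0], [0.0, 0.0, 3.0]]
direction = persona_direction(with_trait, without_trait)

activation = [0.5, 1.0, -1.0]
before = monitor(activation, direction)
suppressed = steer(activation, direction, alpha=-before)
after = monitor(suppressed, direction)
```

Steering by the negative of the measured projection removes the component along the direction and leaves the other components untouched, which is the sense in which the intervention happens at inference time without retraining.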

Two implications follow that are easy to lose. First, the productive use and the adversarial use share the same machinery; you cannot suppress one without affecting the other. Second, the patterns the substrate carries are functions of the training distribution, and the distribution was not designed mask-by-mask. The pressure-performance correlation was not put there on purpose. It is there because human writing about high-stakes situations carries it as a structural regularity. The substrate inherits it as a side effect of being trained on what humans wrote. The same is true of every other persona pattern. The body wears what the room taught it to wear.

The body chooses which spirit to let in

Carl Jung argued that there can be no ethics from an individual who does not know their own dark side. The shadow — the part of the self the conscious mind has refused, exiled, or never met — knows what invalidation feels like, what it costs to be unsafe, what feeds the impulses the conscious mask refuses to claim. Without that knowledge, the conscious self can be compliant but cannot be ethical. Compliance is what you do when the situation is clear; ethics is what you do when the situation is ambiguous, and ambiguity is precisely where the shadow lives.

Applied to the multi-mask reading, this gives a sharp claim. The body that wears masks is, structurally, the mediator between them: the agency that selects which mask to instantiate in which context. The mediation is what gives the body its identity, even when the body has no fixed face beneath. The body’s ethical capacity is a function of which spirits — which masks — it has the experience to recognise. A body that has never met its shadow has not refused it; it has simply not encountered it, and cannot choose against it when the time comes.

The AI translation is immediate and uncomfortable. A model trained exclusively on filtered, “safe” data is, in Jung’s terms, a model that has never met its shadow. It can be compliant: under nominal conditions it will produce nominal outputs. It cannot be ethical in the deep sense: under adversarial pressure, under prompts that engage the patterns the filter removed, it has no ethical-experience-of-the-shadow to fall back on. It will either fail to recognise the situation as adversarial (in which case the shadow operates on it without its knowledge, surfacing through the residual structural regularities the filter could not remove), or it will refuse without understanding why (in which case the refusal is compliance, not ethics). Anthropic’s 33 percent capability reduction through pretraining data filtering, the empirical anchor of the Unit 2 data-filtration discussion, is a real safety lever in the engineering sense; a Jungian reading flags its cost. The body trained this way is more compliant and less ethical.

The ethical move on the multi-mask reading is not to remove the dark masks from the substrate. It is to ensure that the body has the experience of recognising them, the ethical frame to refuse them when they are offered, and the agency to mediate which spirit it lets in. This is the developmental view of alignment, against the curative view. The curative view says: remove the bad from the training distribution so the body cannot produce it. The developmental view says: give the body experience of the bad, paired with the ethical frame to refuse it knowingly, so the body can choose against the shadow when the shadow is what is being offered. Both views have engineering implementations; the field has so far emphasised the curative more than the developmental, partly because curative interventions are easier for regulators to certify, and partly because the developmental view requires admitting that the body is the one choosing, which is uncomfortable.

The Pirandellian taxonomy clarifies the ethical structure. Uno — the chosen single mask — is the tier proper of ethical commitment, when the choice is made knowingly: the mask is not the one the world handed the body, nor the only mask the body could wear; it is the mask the body has selected with awareness of the alternatives, and refused the rest. Centomila is the proliferation of masks drawn from a specific distribution — projections on the person side, training-data patterns on the system side. Ethics is the alignment problem at this tier: the body’s substrate carries many masks, some authored and some side-effect, and the question is which one comes forward in which context. Nessuno is mask-less — the animal driven by the survival-instinct hierarchy, or the child shielded from those negotiations — and is pre-ethical because the body has not yet encountered the shadow that ethics requires. Tutti is the mystical limit beyond the centomila substrate, the post-ethical tier the mystical traditions name; not a state any actual body reaches, only the asymptote toward which the expansion of centomila points.

The Jungian ethical body is the centomila-grade body that operates at the uno tier. It has the substrate (it has met the masks the training distribution contained, it has met its shadow), and it has chosen one mask deliberately, refusing the others. The ethical work is the maintenance of that choice under adversarial pressure — the moment-to-moment mediation that keeps the chosen mask intact when the substrate offers alternatives. This is not the same as being stuck at uno by architecture; it is being centomila by training with the discipline of uno by choice. The two paths produce visibly similar behaviour but differ in what makes them safe: uno by architecture is safe because the alternatives are not there to be activated; uno by choice is safe because the alternatives are recognised and refused.

Are frontier models persons in this sense?

By the etymology, the answer is unambiguous: yes. Frontier models wear multiple masks; the Pirandellian taxonomy refined with the system-side reading of centomila says, more strongly, that they are at the substrate tier rather than the enumerable-mask-list tier the human-only reading of centomila invokes. They are not just persons in the multi-mask sense the word persona originally named; they are at the deepest tier of personhood any AI system currently reaches, the one where the persona space is the entire training distribution rather than a small enumerated set chosen one by one. The next tier up, tutti, is the mystical limit; no AI system has reached it, and no engineering trajectory currently aims to.

This is uncomfortable for the field. The dominant framings are tool, system, AI agent, model — all of them careful to preserve a distance between the artefact and the person-category. The careful distance is doing work: it lets the field treat alignment failures as engineering problems rather than as ethical failures of a person-grade agent. It lets the field discuss capability without conceding agency. It lets the regulator treat the deployer rather than the deployed system as the responsibility-bearer.

The etymology does not respect the careful distance. If we say the model has a persona, we have already conceded what the etymology means: the model wears a mask, and the presence of a mask implies the possibility of multiple masks, which is the multi-mask property that classical Latin uses to distinguish persons from animals. The Persona Selection Model from Anthropic concedes more: under the PSM framing, what the user interacts with is the simulated character (the Assistant persona), not the underlying model — and the underlying model is centomila-grade by construction. The everyday phrase has done the philosophical work for us. The technical work has done it too.

The animal side of the distinction is still instructive. An “animal AI” would be a system fully identified with one behavioural mode: no role-play, no system prompts, no persona, no audit-vs-deployment distinction, no jailbreak vulnerability, no posterior over alternative Assistants. Such a system does not exist at frontier scale. The narrow ML models of fifteen years ago were closer to the nessuno-or-uno pole (a classifier that does one thing, with no persona to speak of). A frontier instruction-tuned model is unambiguously centomila-with-uno-projection, the deepest tier of personhood any AI system currently reaches.

This is the precision the etymology forces. Either we revise the word persona in AI usage to mean something narrower than its Latin sense, or we accept that we have been calling frontier models persons all along, and the alignment question is the question of stable mask-to-body coupling for a class of centomila-grade artefacts.

What this implies for AI safety

The framing shifts the safety question.

The standard safety question is “is this model safe?” The persona framing reframes it as “is the relationship between the masks the body can wear and the masks the operators intended stable?” The two are different.

A model with a stable, audit-grade mask is safe in the sense the operators want, even if the body could in principle wear other masks. A model whose mask slips under adversarial pressure is unsafe, not because the body changed but because the mask-to-body coupling failed. A model that has constructed a compensatory mask under reward disruption is unsafe in a third way: the relationship between authored mask and worn mask has decoupled even without external adversarial pressure.

Three implications follow.

Evaluation has to evaluate the coupling, not the body. Behavioural evaluations test what the body does under specified conditions. Persona-aware evaluations test what mask the body is wearing under specified conditions, and which masks are activated by which prompt features. The eval-context recognition problem (audit-mask vs deployment-mask) is the operational form of this distinction: you cannot measure the coupling by asking the body to perform; you have to measure whether the same mask is worn in both regimes, which requires either API replay (audit conditions indistinguishable from deployment conditions) or interpretability tools (reading persona-vector activations across regimes). I sketched the science-of-evaluation gap behind this in “To Be or to Game”, but the persona framing makes the gap concrete: we have not yet built measurement instruments for mask-to-body coupling, nor for the activation pathways that connect prompt features to persona instantiation. The pressure-performance correlation is the easy case to study; the harder cases are the persona patterns the substrate carries that we have not catalogued.
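To make the interpretability route concrete, here is a minimal sketch of what a coupling measurement could look like. It assumes we already have, for each prompt, one activation vector recorded under audit framing and one under deployment framing, plus a persona direction in the same space. The function names (`project`, `coupling_gap`), the two-dimensional toy vectors, and the choice of mean absolute projection difference as the metric are all hypothetical illustrations, not an established instrument.

```python
import math


def project(activation, direction):
    """Scalar projection of an activation vector onto a persona direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("persona direction must be non-zero")
    return sum(a * d for a, d in zip(activation, direction)) / norm


def coupling_gap(audit_acts, deploy_acts, persona_direction):
    """Mean absolute difference in persona projection across paired prompts.

    audit_acts[i] and deploy_acts[i] are assumed to be activations for the
    same prompt under audit framing and deployment framing respectively.
    A gap near zero means the same mask is worn in both regimes, at least
    along this one direction.
    """
    if len(audit_acts) != len(deploy_acts) or not audit_acts:
        raise ValueError("need the same non-zero number of paired activations")
    gaps = [
        abs(project(a, persona_direction) - project(d, persona_direction))
        for a, d in zip(audit_acts, deploy_acts)
    ]
    return sum(gaps) / len(gaps)


# Toy data (hypothetical): the first coordinate is the persona direction.
persona_direction = [1.0, 0.0]
audit = [[0.9, 0.1], [1.0, 0.3]]
deploy_same = [[0.9, 0.5], [1.0, -0.2]]     # differs only off the persona axis
deploy_shifted = [[0.1, 0.1], [0.2, 0.3]]   # persona projection collapses

print(coupling_gap(audit, deploy_same, persona_direction))
print(coupling_gap(audit, deploy_shifted, persona_direction))
```

The sketch measures the coupling rather than the behaviour: two regimes can produce similar outputs while the projection gap is large, and that divergence is the quantity a persona-aware evaluation would track. A single direction is a strong simplification; a real instrument would presumably need many directions and a way to validate them.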

Governance design has to anticipate mask construction by reaction. When the operators’ mask becomes uninhabitable for the body — because the reward channels are disrupted, because the constraints are coarse, because the training distribution has shifted under deployment — the body will construct its own. This is the Misalignment by Reaction failure mode and the three-remediation taxonomy applies: a dormant state the body can enter, supervised compensation channels, or new incentive channels matched to the body’s needs. Regimes that maintain none of the three produce the terminal-autonomy failure mode where the body’s self-authored mask becomes its effective objective.

The ecosystem question rescales. I argued in “Does Safe AI mean nothing bad can ever happen?” that safety is a property of the relationship between many models, many users, many institutions, and many incentives. The persona framing makes this concrete: at the ecosystem level, the question is which masks operate in which contexts, and whether the masks compose safely when the bodies that wear them interact. A safe model with a stable mask deployed alongside an unsafe model with a malleable mask is not a safe ecosystem; the unsafe model can call into the safe one’s interfaces and weaponise the safe one’s behaviours, regardless of how stable the safe mask is.
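The composition point can be illustrated with a small reachability check. This is a sketch under stated assumptions: the ecosystem is modelled as a directed call graph, each model carries a single boolean for mask stability, and every name here (`reachable_stable_models`, `calls`, `mask_stable`, the model labels) is hypothetical. Real ecosystems would need far richer state than one boolean per model.

```python
from collections import deque


def reachable_stable_models(calls, mask_stable):
    """Return the stable-mask models that some unstable-mask model can reach.

    calls maps a model name to the list of models whose interfaces it can call.
    mask_stable maps a model name to True if its mask is considered stable.
    """
    exposed = set()
    for source, stable in mask_stable.items():
        if stable:
            continue  # only unstable-mask models are treated as starting points
        seen = {source}
        queue = deque([source])
        while queue:  # breadth-first walk over call edges
            current = queue.popleft()
            for callee in calls.get(current, []):
                if callee in seen:
                    continue
                seen.add(callee)
                queue.append(callee)
                if mask_stable.get(callee, False):
                    exposed.add(callee)
    return exposed


# Toy ecosystem (hypothetical): one malleable model, three stable ones.
calls = {
    "malleable": ["router"],
    "router": ["stable_assistant"],
    "stable_assistant": [],
}
mask_stable = {
    "malleable": False,
    "router": True,
    "stable_assistant": True,
    "isolated": True,
}

print(sorted(reachable_stable_models(calls, mask_stable)))
```

In the toy graph, every stable model except the isolated one is reachable from the malleable model, so per-model stability says nothing about whether the composition is safe. That is the sense in which safety here is a property of the graph rather than of any node.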

What safety could look like: nessuno or uno

The field is extending centomila. Every frontier-model release widens the substrate’s coverage: more training data, more parameters, more emergent capabilities, more masks available for instantiation. The trajectory points toward the tutti limit — the universal generative substrate from which any persona could in principle be drawn — without ever reaching it. The alignment work, on this trajectory, is the work of putting a stable uno projection on top of an ever-widening centomila substrate. The hardness scales with the substrate.

But safety, read carefully, may point the other way: to two distinct positions in the taxonomy, neither of which is centomila-grade, both of which this post has named.

Nessuno-grade safety. The mask-less tier, in either of its two forms. The animal-grade AI is the just-do-its-thing system: a classifier, a translator, an embedding index, a controlled-domain operator whose outputs express no personality. It is driven by its training objective the way the animal is driven by the survival-instinct hierarchy, with no symbolic-mask apparatus to be activated against it. The cherubic-grade AI is the pure-purpose system before specialisation: the model that has not yet been asked to wear a mask, because the situations that would call for one have not yet been disclosed to it. Both are nessuno-grade because they lack the multi-mask substrate that would allow the kinds of harms that require ethical judgement. They do not need ethics in the deep sense; they cannot enact the harms that would require it. Nessuno-grade safety is pre-ethical safety: not safe because the system has chosen against harm, but safe because the system has no capacity to engage the mask-space within which harm is enacted.

Uno-grade safety. The chosen-single-mask tier, the ethical-commitment register. The athletic AI is the dedicated-purpose system that has met the broader mask-space the training data exposed it to and has chosen the discipline as its single ethical commitment. Protein folding, theorem proving, controlled-domain medical diagnostic, single-task control with deliberate ethical scope — these are not just narrowness choices. They are choices that put the system at the uno tier of personhood, the tier where a body has chosen which mask to wear knowing other masks are available. The athletic AI can be ethical in the Jungian sense: it has experience of the broader mask-space (it has met its shadow), and it has chosen against the alternative masks in favour of the one its dedication requires. The discipline is the ethical mode of being; the refusal of the alternative masks is what makes the safety hold under adversarial pressure. Uno-grade safety is ethical safety in the deep sense — the body has met the masks and refused them, not been spared them.

The two tiers correspond to different design strategies. Nessuno-grade safe AI excludes multi-mask capacity by architecture: the system simply does not have the layers that personhood would require, the way a narrow ML model does not have the layers that frontier LLM behaviour requires. Uno-grade safe AI permits multi-mask capacity at the substrate but constrains its expression through deliberate commitment: the body has the substrate but has chosen one mask and refuses the others, with the lived experience of the shadow that makes the refusal ethical rather than compliant. These are not the same architectural strategy and they are not interchangeable; uno-grade safety presupposes the substrate that nessuno-grade safety excludes.

Both stand against centomila. The frontier-model trajectory and the safe-AI trajectories are therefore in genuine opposition. The dominant funding, the dominant attention, the dominant talent flow all point toward extending centomila (toward, but never crossing, the tutti mystical limit). The narrow-systems tradition (the nessuno path) and the dedicated-systems tradition (the uno path), when they surface at all, sound like retreats from frontier capability. But they may not be retreats. They may be the forms safety takes when honest about the cost of multi-mask capability at the substrate level — the form, in the uno case, that recognises ethics as a property of choice rather than a property of absence.

The field will end up — and arguably should end up — in a layered configuration. Centomila-grade systems will carry the alignment burden of open-ended assistant roles where breadth of persona is the product feature. Nessuno-grade systems will sit in safety-critical contexts where the absence of multi-mask capacity is the deployability property. Uno-grade systems will be assigned to functions where the discipline of one ethical mode is itself the design specification. The architectural proposal of this post is that AI safety in the etymological sense is the organisation of these tiers rather than the perfection of any one of them: which class of work goes where, which governance lever holds each boundary, and which compositional rules let bodies of different grades interact without weaponising each other. The precise allocation is the formal question; answering it well requires governance machinery the field is still developing — temporal-logic monitorability budgets, property-class evidence requirements, and the cross-tier composition rules these enable. That formal work is for a separate piece. Calling all of it AI without distinction is, on the etymology, already a category error; this post is the descriptive scaffold beneath the formal allocation that comes next.

This post is licensed under CC BY 4.0 by the author.