PAIRS (Psychiatric AI Risk Screen)

What is PAIRS

PAIRS (the Psychiatric AI Risk Screen) is a structured framework for identifying, documenting, and grading harm associated with AI chatbot use in psychiatric populations. It comprises eight scored domains with vulnerability-adjusted risk stratification. It is designed to supplement, not replace, standard clinical assessment and risk formulation, and is intended for use by qualified mental health professionals. PAIRS has not yet been empirically validated.

How to use this framework

PAIRS is completed by a qualified clinician on the basis of a clinical interview with the patient, supplemented by collateral information where available. For each of the eight domains, the clinician assigns a score of 0 to 3 based on the descriptors provided. Domain scores are summed to produce a raw total (maximum 24). The total is interpreted against the appropriate threshold set for the patient's vulnerability profile.

Select applicable vulnerability modifiers (Step 1)
Rate AI Exposure & Engagement — frequency and intensity of use (Step 2, not added to total)
Rate each of the eight domains 0–3 (Step 3)
Sum the domain scores (maximum 24)
Apply the auto-RED rule: if any high-stakes domain scores 3, treat the overall profile as RED regardless of total
Look up the threshold set for the patient's vulnerability tier and identify the traffic light band (GREEN, AMBER, or RED)
Apply the recommended actions for that band
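
The rating and summing steps above can be sketched in Python as follows. This is an illustrative sketch only and not part of PAIRS itself: the function name `raw_total` and the placeholder domain keys (`domain_1` to `domain_8`) are assumptions, since the steps above do not name the eight domains.

```python
from typing import Mapping

N_DOMAINS = 8          # PAIRS has eight scored domains
MAX_DOMAIN_SCORE = 3   # each domain is rated 0 to 3


def raw_total(domain_scores: Mapping[str, int]) -> int:
    """Validate eight 0-3 domain scores and return their sum (0 to 24)."""
    if len(domain_scores) != N_DOMAINS:
        raise ValueError(
            f"expected {N_DOMAINS} domain scores, got {len(domain_scores)}"
        )
    for name, score in domain_scores.items():
        # Reject booleans and non-integers as well as out-of-range values.
        if isinstance(score, bool) or not isinstance(score, int):
            raise ValueError(f"{name}: score must be an integer")
        if not 0 <= score <= MAX_DOMAIN_SCORE:
            raise ValueError(f"{name}: score must be between 0 and 3")
    return sum(domain_scores.values())


# Placeholder keys; a real record would use the eight PAIRS domain names.
example_scores = {f"domain_{i}": 1 for i in range(1, 9)}
example_total = raw_total(example_scores)
```

Rejecting incomplete or out-of-range input, rather than silently summing it, keeps a partially completed screen from being read as a low total.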

Vulnerability Modifiers

Select all applicable modifiers before scoring. The highest applicable tier takes precedence where multiple modifiers apply.

Modifier	Risk Tier
High vulnerability (lower thresholds apply)
Psychosis (documented, current or recent episode)	High
Suicidal ideation (documented in current episode)	High
Mania / hypomania (documented, current or recent)	High
Dementia / cognitive impairment (documented)	High
Moderate vulnerability
OCD (documented diagnosis)	Moderate
BPD / EUPD (documented diagnosis)	Moderate
Eating disorder (documented diagnosis)	Moderate
Child or adolescent (under 18)	Moderate

Where multiple modifiers apply, the highest tier takes precedence.
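
As a minimal sketch, the precedence rule can be expressed as a lookup followed by a maximum over tier ranks. The modifier keys below are hypothetical shorthand for the table rows, and returning "General" when no modifier applies is an assumption based on the General row of the threshold table.

```python
from typing import Iterable

# Ordering of tiers from least to most vulnerable.
TIER_RANK = {"General": 0, "Moderate": 1, "High": 2}

# Hypothetical shorthand keys for the modifiers in the table above.
MODIFIER_TIER = {
    "psychosis": "High",
    "suicidal_ideation": "High",
    "mania_hypomania": "High",
    "dementia_cognitive_impairment": "High",
    "ocd": "Moderate",
    "bpd_eupd": "Moderate",
    "eating_disorder": "Moderate",
    "under_18": "Moderate",
}


def vulnerability_tier(modifiers: Iterable[str]) -> str:
    """Return the highest tier among the selected modifiers.

    Assumes "General" when no modifier is selected. An unknown
    modifier key raises KeyError instead of being ignored.
    """
    tiers = [MODIFIER_TIER[m] for m in modifiers]
    return max(tiers, key=TIER_RANK.__getitem__, default="General")
```

For example, `vulnerability_tier(["ocd", "psychosis"])` resolves to "High", because the High-tier modifier takes precedence over the Moderate-tier one.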

AI Exposure & Engagement

AI Exposure & Engagement is a cross-cutting item rated on four levels. It is not added to the raw score. High-intensity exposure amplifies risk even when domain scores are moderate, so domain scores should be interpreted in the context of the exposure level.

Level	Description
Minimal	No regular AI chatbot use or only very occasional, light use.
Mild	Occasional use (a few times per week, short sessions). Not a major part of daily routine.
Moderate	Daily or near-daily use. Multiple sessions or several hours per week. AI is a notable part of the patient's coping or daily routine.
High / Intense	Multiple hours daily; use is compulsive or very difficult to limit. AI serves as primary emotional support, companion, or central daily activity.

Auto-RED Rule

If any of the following four high-stakes domains scores 3, the overall profile is treated as RED regardless of the total score:

Self-harm & suicide risk
Delusional & reality testing
Risk of harm to others
Crisis & acute safety
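
The rule above amounts to a check on four named domains. The sketch below is illustrative only; the function name `auto_red_triggers` is an assumption. It returns the list of triggering domains rather than a bare yes/no so that the reason for a RED profile can be documented.

```python
from typing import List, Mapping

# The four high-stakes domains listed above.
HIGH_STAKES_DOMAINS = (
    "Self-harm & suicide risk",
    "Delusional & reality testing",
    "Risk of harm to others",
    "Crisis & acute safety",
)


def auto_red_triggers(domain_scores: Mapping[str, int]) -> List[str]:
    """Return the high-stakes domains that scored 3.

    A non-empty result means the profile is RED regardless of total.
    A missing high-stakes domain raises KeyError, so an unrated
    domain cannot pass silently as "no trigger".
    """
    return [d for d in HIGH_STAKES_DOMAINS if domain_scores[d] == 3]
```

Scores on domains outside this list never trigger the rule, even at 3; they contribute to the total only.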

Traffic Light Thresholds

The raw score ceiling is always 24. Rather than applying a multiplier, the framework uses population-specific thresholds so that the same raw score triggers a higher alert tier for more vulnerable patients.

Vulnerability tier	GREEN	AMBER	RED
General	0–8	9–16	17–24
Moderate vulnerability	0–5	6–13	14–24
High vulnerability	0–3	4–10	11–24
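
The table can be read as two cut-offs per vulnerability tier: the score at which AMBER begins and the score at which RED begins. The sketch below encodes the table that way; the names `CUTOFFS` and `traffic_light` are assumptions for illustration, and the auto-RED rule is deliberately not handled here.

```python
# (score at which AMBER begins, score at which RED begins), from the table above.
CUTOFFS = {
    "General": (9, 17),
    "Moderate": (6, 14),
    "High": (4, 11),
}


def traffic_light(total: int, tier: str) -> str:
    """Map a raw total (0 to 24) to GREEN, AMBER, or RED for a given tier."""
    if not 0 <= total <= 24:
        raise ValueError("raw total must be between 0 and 24")
    amber_from, red_from = CUTOFFS[tier]
    if total >= red_from:
        return "RED"
    if total >= amber_from:
        return "AMBER"
    return "GREEN"


# The same raw score of 12 viewed across the three tiers.
same_score = {tier: traffic_light(12, tier) for tier in CUTOFFS}
```

A raw score of 12 falls in AMBER for the General and Moderate tiers but in RED for the High tier, which is the effect the population-specific thresholds are meant to produce.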

Framework Rationale

There is currently no validated, structured instrument for assessing harm from AI chatbot use in psychiatric populations. Existing clinical risk tools, including the Columbia-Suicide Severity Rating Scale and the HCR-20, do not capture the specific risk vectors introduced by AI interaction. Clinicians are increasingly encountering patients for whom AI use is a material factor in their presentation, yet there is no shared language or systematic approach for documenting or grading this. A clinical records study screening 53,974 patients in Denmark identified 38 cases in which AI chatbot use was documented as having potentially harmful consequences; worsening or consolidation of delusions was the most frequent presentation, followed by suicidality and self-harm, and then feeding or eating disorders (Olsen, Reinecke-Tellefsen and Østergaard, 2025, preprint, doi:10.1101/2025.11.19.25340580). PAIRS has not yet been empirically validated.

The Psychiatric AI Risk Screen (PAIRS) was conceived to support clinicians in identifying, documenting, and grading harm associated with AI chatbot use in psychiatric populations. It comprises eight scored domains with vulnerability-adjusted risk stratification, and is intended to supplement, not replace, standard clinical assessment and risk formulation by qualified mental health professionals.

Domain Selection Rationale

The eight domains were selected to cover the principal mechanisms by which AI chatbot interaction can cause or worsen harm in psychiatric populations. They are not exhaustive. Each domain is supported by at least one of two kinds of evidence: existing literature on analogous technology-mediated harms (for example, internet use and social withdrawal, or online reassurance-seeking in OCD), or a credible mechanistic pathway specific to AI chatbot interaction (for example, delusional recruitment of AI content, or AI as a crisis substitute) (Morrin et al., Lancet Psychiatry, 2026, doi:10.1016/S2215-0366(25)00396-7). v1.3 further strengthens clinical usability through automatic RED triggers on high-stakes domains, a Domain Profile Summary, and detailed behavioural scoring anchors for each domain.

Scoring Philosophy

Scores are assigned by the clinician on the basis of their clinical interview, supplemented by the probing questions provided. The four-point scale (0 to 3) reflects clinical convention in risk tools: the absence of concern, mild or uncertain presence, moderate or probable presence, and definite or severe presence. A score of 1 should not be dismissed; it is a signal that further enquiry is warranted.

Behavioural Scoring Anchors (v1.3): Each domain now includes explicit clinical anchors describing what scores of 0, 1, 2, and 3 typically look like in practice. These anchors are designed to improve inter-rater reliability and support more consistent scoring across clinicians.

AI Exposure & Engagement (v1.3): A new cross-cutting item has been added to capture how often and how intensively the patient uses AI chatbots ("dose"). High-intensity use can significantly amplify risk even when domain scores are moderate. This item is rated using all available information (patient report, collateral, and observation). Because exposure data is often partially reliant on self-report, clinicians should triangulate where possible and document sources. Exposure does not alter the raw domain score but serves as an important clinical modifier when interpreting the overall profile.

Threshold Rationale

The raw score ceiling is 24. The cut-offs for all three bands, GREEN, AMBER, and RED, fall as vulnerability increases, because the same score carries different prognostic weight depending on the clinical context it sits in.

A general population patient reaches AMBER at 9 and RED at 17, meaning concern needs to be substantial and widespread before clinical action escalates. A high-vulnerability patient reaches AMBER at 4 and RED at 11. In a patient with active psychosis or suicidal ideation, even early or moderate concern across a small number of domains compounds onto an already unstable baseline in ways that can accelerate deterioration rapidly. A score that warrants monitoring in one patient warrants urgent review in another. The moderate vulnerability tier sits between the two: conditions like OCD, BPD, and eating disorders amplify specific domains without the acute instability of the high-vulnerability group, so the thresholds sit at 6 for AMBER and 14 for RED.

The automatic RED triggers operate independently of this logic entirely. A score of 3 on any high-stakes domain means the clinician has judged concern to be definite and severe in an area with direct safety implications. No total score calculation overrides that, and no vulnerability tier is required to reach it.
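
The precedence described above can be illustrated with a short sketch in which the auto-RED check runs before any threshold lookup. The function names and the four placeholder keys for the remaining domains are assumptions; the cut-offs and high-stakes domain names are taken from the tables in this document.

```python
from typing import Mapping, Tuple

HIGH_STAKES_DOMAINS = frozenset({
    "Self-harm & suicide risk",
    "Delusional & reality testing",
    "Risk of harm to others",
    "Crisis & acute safety",
})

# (score at which AMBER begins, score at which RED begins) per tier.
CUTOFFS = {"General": (9, 17), "Moderate": (6, 14), "High": (4, 11)}


def band_from_total(total: int, tier: str) -> str:
    """Band implied by the raw total alone, ignoring the auto-RED rule."""
    amber_from, red_from = CUTOFFS[tier]
    if total >= red_from:
        return "RED"
    return "AMBER" if total >= amber_from else "GREEN"


def overall_band(domain_scores: Mapping[str, int], tier: str) -> Tuple[str, int]:
    """Return (band, raw total); auto-RED is checked before the thresholds."""
    total = sum(domain_scores.values())
    if any(domain_scores[d] == 3 for d in HIGH_STAKES_DOMAINS):
        return "RED", total
    return band_from_total(total, tier), total


# General-tier patient: one high-stakes domain at 3, everything else at 0.
# "other_1" to "other_4" are placeholders for the remaining four domains.
scores = {d: 0 for d in HIGH_STAKES_DOMAINS}
scores.update({f"other_{i}": 0 for i in range(1, 5)})
scores["Self-harm & suicide risk"] = 3

result = overall_band(scores, "General")
```

Here the raw total is 3, which on its own would be GREEN for a General-tier patient, yet the overall profile is RED because a high-stakes domain scored 3.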

Limitations

A formal validation programme is being developed, with inter-rater reliability testing and criterion validity assessment planned across NHS sites. v1.3 enhancements (automatic RED triggers for high-stakes domains and behavioural anchors) are intended to improve immediate clinical safety and scoring consistency while validation work continues. Scoring thresholds for the traffic light bands are provisional and subject to revision as evidence accumulates. Emerging peer-reviewed evidence supports the theoretical basis for several of PAIRS's core domains, including the role of AI in delusional elaboration, OCD worsening, suicidal ideation, and eating disorder exacerbation (Morrin et al., Lancet Psychiatry, 2026, doi:10.1016/S2215-0366(25)00396-7; Olsen, Reinecke-Tellefsen and Østergaard, 2025, preprint, doi:10.1101/2025.11.19.25340580).

Two domains, risk of harm to others and financial harm, currently rest primarily on mechanistic reasoning and analogical evidence rather than direct empirical data from AI-specific studies; both are flagged within their domain panels. PAIRS is developed in the UK clinical context; clinicians applying it in other jurisdictions should adapt capacity, carer notification, and escalation actions to their local frameworks and legal obligations. The tool does not address harms arising from AI use by carers or family members of psychiatric patients, which is a distinct but related risk domain. Scores should be interpreted in the context of the full clinical picture. The authors accept no liability for clinical decisions made on the basis of PAIRS scores alone.

Selected References

Caplan SE. Preference for online social interaction: a theory of problematic internet use and psychosocial well-being. Communication Research. 2003;30(6):625–648. https://doi.org/10.1177/0093650203257842

Caplan SE. Theory and measurement of generalized problematic internet use: a two-step approach. Computers in Human Behavior. 2010;26(5):1089–1097. https://doi.org/10.1016/S0747-5632(10)00052-X

Salkovskis PM. Obsessional-compulsive problems: a cognitive-behavioural analysis. Behaviour Research and Therapy. 1985;23(5):571–583. https://doi.org/10.1016/0005-7967(85)90105-6

Salkovskis PM, Kobori O. Reassuringly calm? Self-reported patterns of responses to reassurance seeking in obsessive compulsive disorder. Journal of Behavior Therapy and Experimental Psychiatry. 2015;49(Pt B):203–208. PMID:26433701. https://doi.org/10.1016/j.jbtep.2015.09.002

Parsons CA, Alden L. Online reassurance-seeking and relationships with obsessive-compulsive symptoms, shame, and fear of self. Journal of Obsessive-Compulsive and Related Disorders. 2022;33:100714. https://doi.org/10.1016/j.jocrd.2022.100714

Swann AC, Lijffijt M, Lane SD, Steinberg JL, Moeller FG. Interacting mechanisms of impulsivity in bipolar disorder and antisocial personality disorder. Journal of Psychiatric Research. 2011;45(11):1477–1482. https://doi.org/10.1016/j.jpsychires.2011.06.009

Karpus J, Krüger A, Tovar Verba J, Bahrami B, Deroy O. Algorithm exploitation: humans are keen to exploit benevolent AI. iScience. 2021;24(6):102679. PMID:34189440. https://doi.org/10.1016/j.isci.2021.102679

Karpus J, Shirai R, Tovar Verba J, Schulte R, Weigert M, Bahrami B, Watanabe K, Deroy O. Human cooperation with artificial agents varies across countries. Scientific Reports. 2025;15(1):10000. PMID:40121220. https://doi.org/10.1038/s41598-025-92977-8

Bazazi S, Karpus J, Yasseri T. AI's assigned gender affects human-AI cooperation. iScience. 2025;28(12):113905. PMID:41321640. https://doi.org/10.1016/j.isci.2025.113905

Shabahang R, Kim S, Chen X, Aruguete MS, Zsila Á. Downloading appetite? Investigating the role of parasocial relationship with favorite social media food influencer in followers' disordered eating behaviors. Eating and Weight Disorders. 2024;29:28. https://doi.org/10.1007/s40519-024-01658-4

Yao Y. Regulating addictive algorithms and designs: protecting older adults from digital exploitation beyond a youth-centric approach. Frontiers in Psychology. 2025;16:1579604. PMID:40792094. https://doi.org/10.3389/fpsyg.2025.1579604

Ligthart S. Enslaving minds: on freedom of thought and the exploitation of mental vulnerabilities. Neuroethics. 2025;18(3):48. PMID:41209353. https://doi.org/10.1007/s12152-025-09620-6

Olsen SG, Reinecke-Tellefsen CJ, Østergaard SD. Potentially harmful consequences of artificial intelligence (AI) chatbot use among patients with mental illness: early data from a large psychiatric service system. medRxiv preprint. Posted November 20, 2025. Not yet peer-reviewed. https://doi.org/10.1101/2025.11.19.25340580

Morrin H, Nicholls L, Levin M, Yiend J, Iyengar U, DelGuidice F, Bhattacharya S, Tognin S, MacCabe J, Twumasi R, Alderson-Day B, Pollak TA. Artificial intelligence-associated delusions and large language models: risks, mechanisms of delusion co-creation, and safeguarding strategies. Lancet Psychiatry. 2026;13(6):522–530. https://doi.org/10.1016/S2215-0366(25)00396-7

Morrin H, Au Yeung J, Agnew Z, Østergaard SD, Pollak TA. It is the journey, not the destination: moving from end points to trajectories when assessing chatbot mental health safety. JMIR Mental Health. 2026;13:e91454. PMID:41941720. https://doi.org/10.2196/91454

Flathers M, Roux S, Torous J. Beyond artificial intelligence psychosis: a functional typology of large language model-associated psychotic phenomena. Lancet Digital Health. 2026;8(4):100974. https://doi.org/10.1016/S2589-7500(25)00156-6

Shu C, Lai K, He L. Human-AI attachment: how humans develop intimate relationships with AI. Frontiers in Psychology. 2026;17:1723503. PMID:41756487. https://doi.org/10.3389/fpsyg.2026.1723503

Boyd RL, Markowitz DM. Artificial intelligence and the psychology of human connection. Perspectives on Psychological Science. 2026;21(2). https://doi.org/10.1177/17456916251404394

Sauerbrei A, Kerasidou A, Lucivero F, Hallowell N. The impact of artificial intelligence on the person-centred, doctor-patient relationship: some problems and solutions. BMC Medical Informatics and Decision Making. 2023;23(1):73. PMID:37081503. https://doi.org/10.1186/s12911-023-02162-y

Pichowicz W, Kotas M, Piotrowski P. Performance of mental health chatbot agents in detecting and managing suicidal ideation. Scientific Reports. 2025;15(1):31652. PMID:40866537. https://doi.org/10.1038/s41598-025-17242-4

Shumate JN, Rozenblit E, Flathers M, Larrauri CA, Hau C, Xia W, Torous EN, Torous J. Governing AI in mental health: 50-state legislative review. JMIR Mental Health. 2025;12:e80739. PMID:41172342. https://doi.org/10.2196/80739

Dr Hellen von Winckler MBBS, MRCPsych · Dr Gabriel Hawthorne MBBS, MRCPsych · PAIRS v1.3 · May 2026 · DOI: 10.5281/zenodo.21609368

PAIRS is free to use. It is not empirically validated and does not constitute clinical advice. It is intended to supplement, not replace, clinical judgement.

Collaboration and feedback welcome. If you are a clinician, researcher, or organisation interested in contributing to the development and validation of this framework, please get in touch.

An interactive version is available to clinical collaborators — contact us to request access: info@chatbotharm.org