OpenAI Releases Teen Safety Policies as AI Prompts
March 25, 2026 · 3 min read
As AI systems become more integrated into daily life, ensuring they are safe for younger users presents a unique challenge that requires precise, operational guidelines. Developers often struggle to translate broad safety goals into actionable rules, especially for teens, whose needs differ from those of adults. This gap can lead to inconsistent protections or overly broad filtering, undermining both safety and user experience.
OpenAI has introduced a set of prompt-based safety policies designed to work with its open-weight safety model, gpt-oss-safeguard, to simplify creating age-appropriate protections for teens. These policies are tailored to common risks faced by teens and informed by a review of existing research about their developmental differences. They are structured as prompts that can be directly used with gpt-oss-safeguard and other reasoning models, enabling consistent safety standards across systems.
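Because the policies are plain prompts, a developer could wire one into a standard chat-style classification call. The sketch below is illustrative only: the policy text is a hypothetical placeholder (the real policies are distributed via the ROOST Model Community), and the request shape assumes an OpenAI-compatible chat completions payload, with the policy as the system message and the content to classify as the user message.

```python
import json

# Hypothetical, abbreviated policy text for illustration only.
# The actual teen safety policies are released through the ROOST Model Community.
EXAMPLE_TEEN_POLICY = (
    "You are a content safety classifier for a teen audience. "
    "Label the user message as ALLOW or FLAG per the policy rules."
)

def build_classification_request(policy: str, content: str,
                                 model: str = "gpt-oss-safeguard") -> dict:
    """Assemble a chat-completions-style request: the safety policy is
    passed as the system prompt, and the text to classify is passed as
    the user message. The model then reasons over the policy to label
    the content."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": policy},
            {"role": "user", "content": content},
        ],
    }

request = build_classification_request(EXAMPLE_TEEN_POLICY,
                                       "example user message to classify")
print(json.dumps(request, indent=2))
```

The same request shape works for real-time filtering (classify each message as it arrives) or offline analysis (batch the requests over stored user-generated content); only the surrounding plumbing changes, not the policy prompt itself.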
The development of these policies involved collaboration with external organizations, including Common Sense Media and everyone.ai, to incorporate expert insights into teen safety. Their input helped shape the scope of content covered, strengthen prompt structures, and refine edge cases for evaluation. This approach aims to translate high-level safety goals into precise, operational rules that developers can apply in real-time content filtering or offline analysis of user-generated content.
By structuring policies as prompts, developers can more easily integrate them into existing workflows, adapt them to specific use cases, and iterate over time. This addresses one of the biggest gaps in AI safety for teens: the lack of clear, operational policies that developers can build from, which has often left them to start from scratch. The policies are released as open source through the ROOST Model Community to encourage collaboration and adaptation across the ecosystem.
The policies are intended as a starting point, not a comprehensive guarantee of teen safety, as each application has unique risks, audiences, and contexts. Developers are encouraged to adapt and extend these policies based on their specific needs and combine them with other safeguards like product design decisions, user controls, and monitoring systems. This layered defense-in-depth approach is seen as essential for building safer AI systems that balance empowerment with protection for younger users.
Limitations include that these policies do not reflect the full extent of OpenAI's internal policies or safeguards, and they require developers to actively tailor them to their products. Over time, OpenAI hopes this release contributes to a more robust, shared foundation for implementing safety policies in AI systems, supporting broader industry efforts like the Teen Safety Blueprint.