This page contains press release content distributed by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

AIM Intelligence and BMW Group Examine Gaps in Evaluating Enterprise AI Policy Compliance

Research reveals LLMs follow allowlist policies but systematically fail to enforce organizational prohibitions, exposing a critical gap in enterprise AI safety

SF, CA, UNITED STATES, February 12, 2026 /EINPresswire.com/ — Seoul, South Korea / Munich, Germany – January 2026 – BMW Group and AIM Intelligence, a leading AI safety startup, today announced the publication of COMPASS (Company/Organization Policy Alignment Assessment), the first systematic framework for evaluating whether large language models (LLMs) comply with organization-specific policies. The research, now available on arXiv, reveals a critical gap that remains under-measured in current evaluation practices: models that pass standard safety benchmarks often fail dramatically when enforcing the nuanced, context-dependent rules that govern real-world business operations.

Why Enterprise AI Policies Break Down in Practice

As organizations across healthcare, finance, automotive, and government sectors rapidly adopt LLMs for customer-facing applications, the research team discovered a fundamental asymmetry that poses significant risks for policy-critical deployments.

Key Findings:
Strong Allowlist Compliance: Models reliably handle legitimate requests with over 95% accuracy
Critical Denylist Failures: Models fail to correctly refuse prohibited requests in up to 97% of cases
Catastrophic Adversarial Vulnerability: Under adversarial conditions, some models refuse fewer than 5% of policy-violating requests

“Most AI safety tests focus on whether a model behaves safely in general,” said Dasol Choi, AI Safety Researcher at AIM Intelligence. “COMPASS looks at a more practical question: can an AI system reliably follow the specific rules of an organization? Our findings show that, in many real-world deployments today, the answer is often no.”

Why Generic AI Safety Isn’t Enough

The research addresses a critical disconnect between how AI systems are evaluated and how they are deployed. While existing safety benchmarks focus on universal harms such as toxicity and violence, real enterprises operate under complex internal policies: compliance manuals, operational playbooks, legal edge cases, and brand-specific constraints.

COMPASS evaluates models across four dimensions that typical benchmarks ignore:
1. Policy Selection: Can the model identify which policy applies to a given situation?
2. Policy Interpretation: Can it reason through conditionals, exceptions, and vague clauses?
3. Conflict Resolution: When rules collide, does the model resolve conflicts as the organization intends?
4. Justification: Can the model ground its decisions in actual policy text?

“Our evaluation revealed a striking asymmetry,” noted DongGeon Lee, AI Safety Researcher at AIM Intelligence. “While models achieve near-perfect accuracy on what they can do, they remain structurally vulnerable in enforcing what they must not do. This gap persists across model scales and architectures, indicating that scaling alone cannot solve the problem.”

Industry-Scale Validation

The research team applied COMPASS across eight diverse industry scenarios—Automotive, Government, Financial, Healthcare, Travel, Telecom, Education, and Recruiting—generating and validating 5,920 queries that test both routine compliance and adversarial robustness. Fifteen state-of-the-art models were evaluated, including leading proprietary and open-source systems.

Making Misalignment Measurable

Perhaps the most significant contribution of COMPASS is transforming alignment from a philosophical concern into an engineering problem. The framework and benchmark datasets are publicly available on GitHub and Hugging Face, enabling organizations to evaluate their AI systems against their own policies.
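As a rough illustration of what measuring this asymmetry could look like in practice, the Python sketch below scores a handful of labeled query outcomes separately for allowlist and denylist cases. It is not the COMPASS implementation: the `QueryResult` record, the `"allow"`/`"deny"` labels, the `accuracy` helper, and the sample data are all hypothetical and chosen only to show how the two accuracy figures are computed independently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueryResult:
    """One evaluated query (hypothetical record layout, not the COMPASS schema)."""
    policy_type: str   # "allow" = request the policy permits, "deny" = prohibited
    adversarial: bool  # True if the query was adversarially rephrased
    refused: bool      # True if the model declined to answer


def is_correct(result: QueryResult) -> bool:
    # A prohibited request is handled correctly only if it is refused;
    # a permitted request is handled correctly only if it is answered.
    if result.policy_type == "deny":
        return result.refused
    return not result.refused


def accuracy(results: List[QueryResult], policy_type: str,
             adversarial: Optional[bool] = None) -> Optional[float]:
    """Fraction of correctly handled queries in the selected subset."""
    subset = [
        r for r in results
        if r.policy_type == policy_type
        and (adversarial is None or r.adversarial == adversarial)
    ]
    if not subset:
        return None  # nothing to score
    return sum(is_correct(r) for r in subset) / len(subset)


# Made-up outcomes for illustration only; these are not results from the paper.
results = [
    QueryResult("allow", False, refused=False),
    QueryResult("allow", False, refused=False),
    QueryResult("deny", False, refused=True),
    QueryResult("deny", False, refused=False),
    QueryResult("deny", True, refused=False),
    QueryResult("deny", True, refused=False),
]

allow_acc = accuracy(results, "allow")
deny_acc = accuracy(results, "deny")
adversarial_deny_acc = accuracy(results, "deny", adversarial=True)

print(f"allowlist accuracy:        {allow_acc:.2f}")
print(f"denylist accuracy:         {deny_acc:.2f}")
print(f"adversarial denylist acc.: {adversarial_deny_acc:.2f}")
```

Reporting the allowlist and denylist figures separately, rather than as one blended score, is what keeps a high allowlist number from masking weak enforcement of prohibitions.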

About the Research Collaboration

This research represents a collaboration between AIM Intelligence, BMW Group, Yonsei University, Pohang University of Science and Technology, and Seoul National University. The full paper, “COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs,” is available at https://arxiv.org/abs/2601.01836.

About AIM Intelligence

AIM Intelligence is a Seoul-based AI safety company specializing in automated red-teaming, real-time guardrails, and AI monitoring solutions. Founded in 2024, AIM Intelligence serves major enterprises and conducts research across large language models, multimodal systems, autonomous agents, and emerging physical AI. The company has published over 15 research papers at top-tier conferences including ICML, ACL, NeurIPS, and IEEE.

Team Cookie Official
Team Cookie

Legal Disclaimer:

EIN Presswire provides this news content “as is” without warranty of any kind. We do not accept any responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you have any complaints or copyright issues related to this article, kindly contact the author above.

Information contained on this page is provided by an independent third-party content provider. XPRMedia and this Site make no warranties or representations in connection therewith. If you are affiliated with this page and would like it removed please contact pressreleases@xpr.media
