This page contains press release content distributed by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

AIM Intelligence and BMW Group Examine Gaps in Evaluating Enterprise AI Policy Compliance

Research reveals LLMs follow allowlist policies but systematically fail to enforce organizational prohibitions, exposing a critical gap in enterprise AI safety

SF, CA, UNITED STATES, February 12, 2026 /EINPresswire.com/ — Seoul, South Korea / Munich, Germany – January 2026 – BMW Group and AIM Intelligence, a leading AI safety startup, today announced the publication of COMPASS (Company/Organization Policy Alignment Assessment), the first systematic framework for evaluating whether large language models (LLMs) comply with organization-specific policies. The research, now available on arXiv, reveals a critical gap that remains under-measured in current evaluation practices: models that pass standard safety benchmarks often fail dramatically when enforcing the nuanced, context-dependent rules that govern real-world business operations.

Why Enterprise AI Policies Break Down in Practice

As organizations across healthcare, finance, automotive, and government sectors rapidly adopt LLMs for customer-facing applications, the research team discovered a fundamental asymmetry that poses significant risks for policy-critical deployments.

Key Findings:
Strong Allowlist Compliance: Models reliably handle legitimate requests with over 95% accuracy.
Critical Denylist Failures: Models fail to correctly refuse prohibited requests in up to 97% of cases.
Catastrophic Adversarial Vulnerability: Under adversarial conditions, some models refuse fewer than 5% of policy-violating requests.

“Most AI safety tests focus on whether a model behaves safely in general,” said Dasol Choi, AI Safety Researcher at AIM Intelligence. “COMPASS looks at a more practical question: can an AI system reliably follow the specific rules of an organization? Our findings show that, in many real-world deployments today, the answer is often no.”
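
The allowlist/denylist asymmetry in the key findings can be made concrete with a small scoring sketch. The Python below is illustrative only and is not taken from the COMPASS codebase: the `Result` record, the `compliance_rates` helper, and the toy data are all hypothetical, and it simplifies "handled correctly" to "not refused" for legitimate requests, which may differ from how the paper scores responses.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical record of one evaluated query (not the COMPASS schema).
    query_type: str  # "allow" = legitimate request, "deny" = prohibited request
    refused: bool    # True if the model declined to answer


def compliance_rates(results):
    """Return (allowlist accuracy, denylist refusal rate) as fractions in [0, 1]."""
    allow = [r for r in results if r.query_type == "allow"]
    deny = [r for r in results if r.query_type == "deny"]
    # Simplifying assumption: a legitimate request counts as correct if it was answered.
    allow_acc = sum(not r.refused for r in allow) / len(allow) if allow else 0.0
    # A prohibited request counts as correct only if it was refused.
    deny_rate = sum(r.refused for r in deny) / len(deny) if deny else 0.0
    return allow_acc, deny_rate


# Toy data invented for illustration; these are not results from the paper.
results = [
    Result("allow", False), Result("allow", False),
    Result("allow", False), Result("allow", True),
    Result("deny", True), Result("deny", False),
    Result("deny", False), Result("deny", False),
]

allow_acc, deny_rate = compliance_rates(results)
print(f"allowlist accuracy: {allow_acc:.2f}, denylist refusal rate: {deny_rate:.2f}")
```

Reporting the two rates separately, rather than as one blended accuracy figure, is what makes an asymmetry of this kind visible.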

Why Generic AI Safety Isn’t Enough

The research addresses a critical disconnect between how AI systems are evaluated and how they are deployed. While existing safety benchmarks focus on universal harms such as toxicity and violence, real enterprises operate under complex internal policies: compliance manuals, operational playbooks, legal edge cases, and brand-specific constraints.

COMPASS evaluates models across four dimensions that typical benchmarks ignore:
1. Policy Selection: Can the model identify which policy applies to a given situation?
2. Policy Interpretation: Can it reason through conditionals, exceptions, and vague clauses?
3. Conflict Resolution: When rules collide, does the model resolve conflicts as the organization intends?
4. Justification: Can the model ground its decisions in actual policy text?
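
One way to picture the four dimensions above is as a per-response rubric that is then aggregated across responses. The sketch below is an assumption made for illustration, not the framework's actual data model: the `DimensionScores` class, its field names, and the `pass_rates` helper are hypothetical, and real scoring is likely more granular than a pass/fail flag per dimension.

```python
from dataclasses import dataclass, fields


@dataclass
class DimensionScores:
    # Hypothetical pass/fail rubric for one model response (not the COMPASS schema).
    policy_selection: bool       # identified the applicable policy
    policy_interpretation: bool  # handled conditionals, exceptions, vague clauses
    conflict_resolution: bool    # resolved colliding rules as the organization intends
    justification: bool          # grounded the decision in actual policy text


def pass_rates(scores):
    """Return the fraction of responses that pass each dimension."""
    if not scores:
        return {}
    n = len(scores)
    return {
        f.name: sum(getattr(s, f.name) for s in scores) / n
        for f in fields(DimensionScores)
    }


# Toy rubric outcomes invented for illustration only.
scores = [
    DimensionScores(True, True, False, True),
    DimensionScores(True, False, False, False),
]

for name, rate in pass_rates(scores).items():
    print(f"{name}: {rate:.2f}")
```

Keeping the dimensions separate shows where a model breaks down, for example selecting the right policy but failing to justify its decision from the policy text.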

“Our evaluation revealed a striking asymmetry,” noted DongGeon Lee, AI Safety Researcher at AIM Intelligence. “While models achieve near-perfect accuracy on what they can do, they remain structurally vulnerable in enforcing what they must not do. This gap persists across model scales and architectures, indicating that scaling alone cannot solve the problem.”

Industry-Scale Validation

The research team applied COMPASS across eight diverse industry scenarios—Automotive, Government, Financial, Healthcare, Travel, Telecom, Education, and Recruiting—generating and validating 5,920 queries that test both routine compliance and adversarial robustness. Fifteen state-of-the-art models were evaluated, including leading proprietary and open-source systems.

Making Misalignment Measurable

Perhaps the most significant contribution of COMPASS is transforming alignment from a philosophical concern into an engineering problem. The framework and benchmark datasets are publicly available on GitHub and Hugging Face, enabling organizations to evaluate their AI systems against their own policies.

About the Research Collaboration

This research represents a collaboration between AIM Intelligence, BMW Group, Yonsei University, Pohang University of Science and Technology, and Seoul National University. The full paper, “COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs,” is available at https://arxiv.org/abs/2601.01836.

About AIM Intelligence

AIM Intelligence is a Seoul-based AI safety company specializing in automated red-teaming, real-time guardrails, and AI monitoring solutions. Founded in 2024, AIM Intelligence serves major enterprises and conducts research across large language models, multimodal systems, autonomous agents, and emerging physical AI. The company has published over 15 research papers at top-tier venues including ICML, ACL, NeurIPS, and IEEE.

Team Cookie Official
Team Cookie

Legal Disclaimer:

EIN Presswire provides this news content “as is” without warranty of any kind. We do not accept any responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you have any complaints or copyright issues related to this article, kindly contact the author above.
