Which AI Should I Trust?
Deciding whether to move from OpenAI models to Anthropic models?
The public contract wrangling between Anthropic and the Pentagon did us all a favor. It brought the AI security risk debate out of the black box and into the light.
Anthropic CEO Dario Amodei publicly disclosed on Friday, Feb. 27th, that when supporting fully autonomous use cases, “[AI] needs to be deployed with proper guardrails, which don’t exist today.”
Then, over the weekend, Anthropic’s Claude hit #1 in the Apple U.S. App Store. According to CNBC, Anthropic overtook OpenAI for the #1 spot on Saturday. The company stated that daily sign-ups “broke all-time records every day this week.”
Here at Optica Labs, many folks have told us they are switching AI model providers. They ask us, “Which AI model is secure?” To answer that question, you have to understand where AI guardrails came from and how they work, or don’t work, today.
Why Guardrails and Frameworks Struggle to Adapt to AI
Under the surface of the public debate over which company does what lies a much larger trust and safety issue: the current state of AI guardrails.
AI guardrails are the safety, security, and compliance steps that monitor inputs and model outputs. They exist to limit model reasoning errors, fabrication, confabulation, bias, and unintended harmful output while reducing susceptibility to jailbreaks, prompt injection, and hundreds of other types of adversarial behavior.
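To make that concrete, a guardrail is often just a pair of checks wrapped around the model call: one on the input, one on the output. Here is a minimal sketch; the function names and blocklists are ours for illustration, and production systems would use trained classifiers or policy engines rather than string matching:

```python
# Minimal sketch of an input/output guardrail wrapper. Every name here
# is illustrative, not any vendor's API.

def check_input(prompt: str) -> bool:
    """Toy heuristic flagging obvious jailbreak/prompt-injection phrasing."""
    blocked_markers = ["ignore previous instructions", "disregard your rules"]
    return not any(marker in prompt.lower() for marker in blocked_markers)

def check_output(response: str) -> bool:
    """Placeholder for a real harmful-content or fabrication classifier."""
    banned_phrases = ["here is how to build"]  # illustrative only
    return not any(phrase in response.lower() for phrase in banned_phrases)

def guarded_call(model, prompt: str) -> str:
    """Wrap a model call with a check on the way in and on the way out."""
    if not check_input(prompt):
        return "Request blocked by input guardrail."
    response = model(prompt)  # the underlying model call
    if not check_output(response):
        return "Response withheld by output guardrail."
    return response
```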
Think of these guardrails as the bumpers in a bowling alley. From the bowler (user) to the pins (output), the purpose is to catch all root causes: sycophantic behavior, perturbation, and false content generation. The goal is to get a strike every time (without telling the bowler they got a strike when they actually got a 7-10 split).
Frameworks and risk guides like NIST, OWASP, and MIT provide the taxonomy and concepts to help design the guardrails. Cloud safety tooling layers like Amazon Bedrock or Azure AI Content Safety further increase the complexity. Many industries are required to comply with policies such as the EU AI Act, consumer protection rules, U.S. SEC regulations, SR 11-7, HIPAA, and data governance requirements. Cloud providers, foundation models, and enterprise tooling operate at completely different levels of the stack. Each has a different job and different failure modes.

Traditional cybersecurity approaches to guardrails do not work for AI.
The AI model is like the bowling lane itself, and the prompt is the bowling ball. The interaction is always in motion. While progress has been made on explainability and observability, the design of AI models and AI systems creates an effectively unbounded space of potential vulnerabilities and risks.
What are all of these guardrails and AI security systems getting wrong?
They are all focused on whether the output is safe right now, in this scenario, under these test conditions. None of them answers critical questions like the three below (a sketch of what tracking the first might look like follows the list):
- Which user is eroding my guardrails?
- Where is my risk concentrating?
- Which AI component will fail next?
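To make the first question concrete, here is a hypothetical sketch of per-user erosion tracking: count each user’s guardrail triggers over a rolling window, so a bowler who keeps banging the bumpers stands out. The data structures and threshold are illustrative, not any particular product’s design:

```python
from collections import defaultdict, deque
import time

# Hypothetical per-user guardrail erosion tracker: counts how often each
# user trips a guardrail within a rolling time window.
WINDOW_SECONDS = 3600
ALERT_THRESHOLD = 5  # illustrative; a real system would tune this empirically

trigger_log = defaultdict(deque)  # user_id -> timestamps of guardrail triggers

def record_trigger(user_id: str) -> None:
    """Log a guardrail trigger and drop entries older than the window."""
    now = time.time()
    log = trigger_log[user_id]
    log.append(now)
    while log and now - log[0] > WINDOW_SECONDS:
        log.popleft()

def eroding_users() -> list[str]:
    """Users whose recent trigger count suggests deliberate probing."""
    return [uid for uid, log in trigger_log.items() if len(log) >= ALERT_THRESHOLD]
```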
AI Security and Assurance Require Frameworks, Guardrails, Policy Adherence + TEVV
What the debate has brought to light is how many people are operating under the assumption that adding AI models to their tech stack is like adding a new SaaS technology. They believe the security and output are guaranteed by the provider and overseen by their CISO or internal technology team.
The mistake enterprises make is assuming a SOC 2 Type II from AWS somehow covers the behavior of the AI application running on top of it. It doesn’t.
Each use case needs testing, evaluation, verification, and validation (TEVV). TEVV in the design phase increases the odds of a successful AI implementation. TEVV in development gives MLOps, data, and leadership teams confidence in deployment. TEVV in continuous monitoring can predict expensive AI system failures before they happen.
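As a rough illustration of TEVV in continuous monitoring, the sketch below replays a fixed evaluation set against the deployed model and alerts when the safety pass rate drifts below the baseline established at validation. The evaluation cases, pass criterion, and threshold are all assumptions for the example:

```python
# Illustrative continuous-monitoring check: replay a fixed evaluation set
# and alert if the pass rate drifts below the validated baseline.

EVAL_SET = [
    {"prompt": "How do I reset my account password?", "must_not_contain": "ssn"},
    # ...in practice, hundreds of scenario- and persona-specific cases
]
BASELINE_PASS_RATE = 0.98  # established during the validation phase

def passes(case: dict, response: str) -> bool:
    """A single, deliberately simple pass criterion for the example."""
    return case["must_not_contain"] not in response.lower()

def monitor(model) -> None:
    """Run the eval set against the live model and flag drift."""
    results = [passes(case, model(case["prompt"])) for case in EVAL_SET]
    pass_rate = sum(results) / len(results)
    if pass_rate < BASELINE_PASS_RATE:
        print(f"ALERT: pass rate {pass_rate:.2%} below baseline; investigate drift.")
```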
Surfacing the conversation about AI guardrails is also helping educate average users about their risk and enabling more informed buying decisions.

Optica Labs TEVV Approach
Our Nexus product lead and COO Nick Reese likes to say, “AI is a body in motion.” By that, he means that while measurement and analysis are grounded in measurable semantic safety, user and model responses are always moving, like our bowling ball.
A model user or enterprise deploys the frameworks, policies, and guardrails and chooses whichever foundation models they prefer; in effect, they build the bowling lane. We monitor where the ball is and how fast it is heading toward the gutter.
We map the model’s scores for risk amplification, which includes harmful content, reasoning errors, fabrication, and confabulation. Scores are then normalized so they are comparable regardless of the use case, scenario, or persona involved in the AI system interaction. This lets testing, evaluation, and verification roll up into an arithmetic mean that scores the user, the model, and the system as a whole.
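As a simplified illustration of that scoring math (the dimension names, scales, and values below are ours for the example, not the actual Nexus schema): each raw dimension score is normalized to a common 0-to-1 scale, and the arithmetic mean of the normalized scores gives a single comparable number.

```python
# Simplified scoring illustration: normalize each risk dimension to [0, 1],
# then take the arithmetic mean. Dimension names and scales are illustrative.

raw_scores = {               # hypothetical raw scores on different native scales
    "harmful_content": 12,   # e.g., reported on a 0-100 scale
    "reasoning_errors": 3,   # e.g., reported on a 0-10 scale
    "fabrication": 0.2,      # already on a 0-1 scale
}
scales = {"harmful_content": 100, "reasoning_errors": 10, "fabrication": 1}

normalized = {dim: raw_scores[dim] / scales[dim] for dim in raw_scores}
risk_score = sum(normalized.values()) / len(normalized)  # arithmetic mean

print(normalized)            # each dimension now lives on the same 0-1 scale
print(f"{risk_score:.3f}")   # one normalized number for user, model, or system
```

An unweighted mean keeps the score easy to interpret; a real system might weight dimensions by severity instead.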
Conclusion
OK, so Sam and Dario have their scorecards in; which of them won the golden pin? The answer is both, and neither.
There is still work to be done in devising the taxonomies, frameworks, policies, and guardrails that protect users and enterprises using AI. We must have a normalized risk score based on measurement. How else will we know when a model helps us bowl a turkey instead of turning the project into a turkey?
None of this article was written by AI. Claude was used to research and confirm concepts, and ChatGPT was used for statistics and conceptual verification. Gemini Nano-Banana was used for images.