Does using my own LLM cost more in Salesforce?

Not necessarily. Salesforce states that Bring Your Own LLM consumes roughly 30% fewer Einstein Requests than managed models, though you pay your external provider directly for token usage. The net cost depends on your volume and negotiated rates.

Is Claude better than GPT for Salesforce CRM tasks?

For regulated industries, Claude has an edge: it is the first LLM fully contained within the Salesforce trust boundary, which simplifies compliance. For general CRM tasks, GPT, Claude, and Gemini all perform well; the deciding factors are data residency, latency, and cost.

Do I lose the Einstein Trust Layer if I bring my own model?

No. The Einstein Trust Layer (data masking, zero-retention prompts, toxicity scoring, audit logging) wraps every supported model, including Bring Your Own LLM providers. The protections are applied by the platform, not the model vendor.

OpenAI vs Claude vs Gemini for Salesforce: Which LLM Should Power Your CRM?

Q: Can I choose which LLM powers Agentforce?

Yes. Agentforce defaults to a Salesforce-managed model, but the Models API supports Bring Your Own LLM with OpenAI, Azure OpenAI, Google Gemini Pro, and Anthropic Claude on Amazon Bedrock as foundation providers. You configure the model without rebuilding your agents.

A practical guide to choosing the LLM behind Agentforce: when Salesforce Default is enough, what Bring Your Own LLM actually changes, and how OpenAI, Claude, and Gemini compare for CRM workloads in 2026.

June 15, 2026·8 min read

#agentforce #llm #byollm #einstein-trust-layer #anthropic-claude #openai #google-gemini #salesforce-ai #model-selection #ai-strategy #salesforce #agentforce-2026

Most Salesforce teams never think about which large language model sits behind Agentforce. They accept the default, ship an agent, and move on. That works until the moment a compliance officer asks where customer data goes when the agent generates a response, or a CFO asks why the AI bill grew faster than the deflection rate.

The model behind your agents is a real decision now. As of June 2026, Salesforce supports four external LLM providers through Bring Your Own LLM, and Anthropic Claude has become the first model fully contained inside the Salesforce trust boundary. This guide explains what changes when you pick a model, when the Salesforce Default is the right answer, and how OpenAI, Claude, and Gemini compare for the work a CRM actually does.

How Model Selection Works in Agentforce

Agentforce gives you three broad paths, and the difference between them matters more than the brand name on the model.

Salesforce Default. A managed mix of trusted models, currently built around GPT-4o, that Salesforce tunes for accuracy and trust. You do not manage keys, capacity, or provider contracts. For most service and sales agents, this is the starting point and often the ending point.

Salesforce-hosted alternatives. Salesforce also offers a model hosted on its own infrastructure through Amazon Bedrock, currently Anthropic Claude Sonnet 4. Your prompts stay inside the Salesforce trust boundary rather than travelling to a third-party API.

Bring Your Own LLM (BYOLLM). Through the Models API, you connect an external provider you already pay for. The four supported foundation providers are OpenAI, Azure OpenAI, Google Gemini Pro, and Anthropic Claude on Amazon Bedrock. You own the provider relationship, the capacity, and the bill.

Verify in your org: The exact list of supported models changes with each release. Confirm current options in Setup under the Einstein/Agentforce model configuration and against the official Supported Models developer documentation before committing to one.

The important point: switching models does not mean rebuilding agents. Your topics, actions, and prompt templates stay in place. The model is a configuration layer underneath them.
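One way to picture "the model is a configuration layer" is the minimal sketch below. It is an illustration only: `AgentDefinition`, `AgentDeployment`, and the model labels are hypothetical names invented for this example, not Salesforce metadata types or API identifiers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentDefinition:
    # Hypothetical stand-in for the parts of an agent that stay put.
    topics: tuple[str, ...]
    actions: tuple[str, ...]
    prompt_templates: tuple[str, ...]


@dataclass(frozen=True)
class AgentDeployment:
    agent: AgentDefinition
    model_name: str  # illustrative label, not a real Salesforce model ID


agent = AgentDefinition(
    topics=("Order status",),
    actions=("Look up order",),
    prompt_templates=("Summarize case",),
)

before = AgentDeployment(agent=agent, model_name="salesforce-default")
# Swapping the model produces a new deployment that reuses the same agent.
after = replace(before, model_name="byollm-gemini")

print(after.agent is before.agent)
```

The only field that differs between `before` and `after` is `model_name`; the topics, actions, and prompt templates are the same object in both.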

The Einstein Trust Layer Applies to Every Model

Before comparing models on quality, settle the question that actually decides most enterprise deployments: data governance.

The Einstein Trust Layer sits between your agent and whichever model you choose. It masks sensitive data before the prompt leaves Salesforce, enforces zero-retention agreements so providers do not train on your data, scores responses for toxicity, and writes an audit trail of every interaction. These protections are applied by the platform, not by the model vendor. That means you do not trade away governance when you switch from the default to Gemini or OpenAI.
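To make "applied by the platform, not by the model vendor" concrete, here is a toy sketch of a wrapper that masks a prompt and writes an audit entry around any model callable. It is not how Salesforce implements the Trust Layer: the function names, the email-only regex, and the `echo_model` stub are all assumptions for illustration, and toxicity scoring and zero-retention agreements are left out entirely.

```python
import re
from typing import Callable

# Assumption: a single email pattern stands in for real sensitive-data detection.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL.sub("[EMAIL]", text)


def guarded_call(
    prompt: str,
    model: Callable[[str], str],
    audit_log: list[dict],
) -> str:
    """Mask the prompt, call whichever model was passed in, record the exchange."""
    masked = mask(prompt)
    response = model(masked)
    audit_log.append({"prompt": masked, "response": response})
    return response


def echo_model(prompt: str) -> str:
    # Hypothetical stand-in for any provider; it only ever sees the masked prompt.
    return f"echo: {prompt}"


log: list[dict] = []
out = guarded_call("Reply to jane.doe@example.com about her case", echo_model, log)
print(out)
```

Because `guarded_call` takes the model as a parameter, the masking and logging steps are identical whichever provider sits behind it, which is the property the paragraph above describes.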

What does change with BYOLLM is the contractual and network path. With the Salesforce Default or a Salesforce-hosted model, the data stays within Salesforce's negotiated trust boundary. With BYOLLM pointed at, say, your own Azure OpenAI deployment, the prompt travels to your tenant under your provider agreement. That is an advantage if you have already negotiated a strict data agreement with that provider, and a complication if you have not.

Anthropic Claude is the notable exception among the non-default options. Salesforce describes Claude as the first LLM provider whose models are fully contained within the Salesforce trust boundary, which is why the June 2026 partnership specifically targets financial services, healthcare, cybersecurity, and life sciences. For a regulated org, that containment removes an entire category of compliance review.

OpenAI vs Claude vs Gemini: The CRM-Specific Comparison

Generic model benchmarks tell you very little about CRM performance. A model that tops a coding leaderboard may be overkill for summarizing a case. Here is how the three external options map to the work Salesforce agents actually do.

Factor	OpenAI (GPT)	Anthropic Claude	Google Gemini
Salesforce Default basis	Yes (managed mix)	Salesforce-hosted option on Bedrock	BYOLLM only
Inside Salesforce trust boundary	Via default/managed	Yes, fully contained	Via Trust Layer wrapping
Strength for CRM	Broad general reasoning, mature tool-calling	Long-context summarization, regulated-industry containment	Tight Google Workspace and data integration
Best fit	Default service/sales agents	Finance, healthcare, life sciences	Orgs standardized on Google Cloud
Provider relationship	Direct or Azure	Amazon Bedrock	Vertex AI

A few practical observations behind that table.

OpenAI / GPT is the safe generalist. Because it underpins the Salesforce Default, most existing Agentforce behavior was tuned against it. If your agents already work well, you are likely already running GPT, and changing nothing is a defensible choice.

Claude earns its place on context handling and containment. For tasks that pull in long case histories, policy documents, or clinical content, its long-context behavior reduces the need to chunk and re-prompt. The trust-boundary containment is the real differentiator for regulated industries, where the procurement question is not "which model is smartest" but "which model lets us pass audit fastest."

Gemini is the integration play. If your organization runs on Google Workspace and Vertex AI, pointing BYOLLM at Gemini Pro keeps your AI stack consolidated under one cloud relationship, which simplifies billing, data residency, and security review.

What BYOLLM Actually Costs

The headline number is counterintuitive: Salesforce states that BYOLLM consumes roughly 30% fewer Einstein Requests than managed models. That sounds like a discount, and on the Salesforce side of the bill it is.

The catch is that you now pay your external provider directly for token usage on top of your Salesforce consumption. So the real comparison is:

Salesforce Default: higher Einstein Request consumption, no separate model bill, no capacity management.
BYOLLM: ~30% fewer Einstein Requests, plus your provider's token charges, plus the operational cost of managing keys and capacity.

For a mid-volume deployment, the Salesforce Default usually wins on total cost and simplicity. BYOLLM starts to pay off when you already have a heavily discounted enterprise contract with a provider, when data residency forces you onto your own tenant, or when a specific model measurably outperforms the default on your workload. Run the math on your actual conversation volume before assuming "bring your own" means "cheaper."
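A rough way to run that math is sketched below. Every price, token count, and overhead figure is a placeholder assumption, not a published rate, and the model treats the 30% figure as a simple multiplier on Einstein Request consumption. Substitute your own contract numbers.

```python
def default_cost(requests: int, price_per_request: float) -> float:
    """Salesforce Default: Einstein Request consumption only."""
    return requests * price_per_request


def byollm_cost(
    requests: int,
    price_per_request: float,
    tokens_per_request: int,
    provider_price_per_1k_tokens: float,
    ops_overhead: float,
    request_reduction: float = 0.30,  # the ~30% figure quoted above
) -> float:
    """BYOLLM: reduced Salesforce consumption plus provider tokens plus ops."""
    salesforce_side = requests * (1 - request_reduction) * price_per_request
    provider_side = requests * tokens_per_request / 1000 * provider_price_per_1k_tokens
    return salesforce_side + provider_side + ops_overhead


# Placeholder monthly inputs; none of these are real Salesforce or provider prices.
monthly_requests = 100_000
managed = default_cost(monthly_requests, price_per_request=0.01)
own_model = byollm_cost(
    monthly_requests,
    price_per_request=0.01,
    tokens_per_request=2_000,
    provider_price_per_1k_tokens=0.002,
    ops_overhead=500.0,
)

print(f"Default: {managed:.2f}  BYOLLM: {own_model:.2f}")
```

With these made-up inputs, BYOLLM comes out more expensive because the token charges and operational overhead outweigh the Einstein Request savings. A steeper provider discount or lower overhead flips the result, which is why the comparison has to be run on your own figures.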

A Concrete Decision Path

Here is the order I would walk a Salesforce team through when they ask which model to use.

1. Start on Salesforce Default. Build the agent, measure quality on real cases, and do not touch the model until you have a reason. Most teams never need to.
2. If you are in a regulated industry, evaluate the Salesforce-hosted Claude option. The trust-boundary containment is usually worth more than any quality delta, because it shortens compliance review.
3. If you are standardized on Google Cloud, test Gemini Pro through BYOLLM to consolidate your stack and data residency under Vertex AI.
4. If you have a strict data-residency mandate or a deeply discounted provider contract, use BYOLLM to point at your own OpenAI or Azure OpenAI tenant.
5. Measure, then decide. Run the same set of representative cases through two models, compare resolution quality and latency, and let the numbers choose. The default is a strong baseline, not a compromise.
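The decision path above can be sketched as a small function. The function name, the flag names, and the choice to check the conditions in the order listed are assumptions made for this illustration; a real org may need to weigh more than one constraint at once.

```python
def recommend_model_path(
    regulated_industry: bool = False,
    google_cloud_standard: bool = False,
    strict_residency_or_discounted_contract: bool = False,
) -> str:
    """Return the first path whose constraint applies, else the default."""
    if regulated_industry:
        return "Evaluate the Salesforce-hosted Claude option"
    if google_cloud_standard:
        return "Test Gemini Pro through BYOLLM"
    if strict_residency_or_discounted_contract:
        return "Point BYOLLM at your own OpenAI or Azure OpenAI tenant"
    # No constraint applies: stay on the starting point.
    return "Stay on Salesforce Default"


print(recommend_model_path())
print(recommend_model_path(regulated_industry=True))
```

Whatever the function returns is a candidate to test, not a final answer; the measurement step still decides.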

The mistake to avoid is switching models because a benchmark headline said one is "better." Better at what? For CRM, the deciding factors are almost always data governance, latency, and cost, not raw reasoning scores.
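For the "measure, then decide" step, a minimal comparison harness might look like the sketch below. The `evaluate` function, the stub models, and the `is_resolved` check are hypothetical placeholders; in practice each model callable would wrap a real call to the configured model and the check would reflect your own definition of a resolved case.

```python
import time
from statistics import mean
from typing import Callable


def evaluate(
    model: Callable[[str], str],
    cases: list[str],
    is_resolved: Callable[[str, str], bool],
) -> dict[str, float]:
    """Run every case through one model and report resolution rate and latency."""
    latencies = []
    resolved = 0
    for case in cases:
        start = time.perf_counter()
        answer = model(case)
        latencies.append(time.perf_counter() - start)
        if is_resolved(case, answer):
            resolved += 1
    return {
        "resolution_rate": resolved / len(cases),
        "mean_latency_s": mean(latencies),
    }


# Stub models and a stub check, purely so the sketch runs on its own.
def model_a(case: str) -> str:
    return f"Resolved: {case}"


def model_b(case: str) -> str:
    return "Escalating to a human agent"


def looks_resolved(case: str, answer: str) -> bool:
    return answer.startswith("Resolved")


cases = ["Where is my order?", "Reset my password", "Update billing address"]
results = {
    "model_a": evaluate(model_a, cases, looks_resolved),
    "model_b": evaluate(model_b, cases, looks_resolved),
}
print(results)
```

Using the same case list and the same resolution check for both models is the point: it keeps the comparison about your workload rather than about a benchmark headline.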

Who Should Care About This Decision

Architects own this call. The model choice ripples into data residency, latency budgets, and integration architecture, so it belongs in the design phase, not as an afterthought.

Compliance officers in finance, healthcare, and life sciences should know that the Salesforce-hosted Claude option exists specifically to keep model processing inside the trust boundary. That single fact can shorten an AI risk review by weeks.

Admins and developers can relax slightly: because the model is a configuration layer, you can change it without rebuilding agents. Test, compare, and switch with far less risk than a re-platform implies.

The Bottom Line

For most Salesforce teams, the Salesforce Default is the right answer and the model question never needs to be reopened. The teams that should care are the ones with a specific constraint: a compliance mandate that favors Claude's containment, a Google Cloud standardization that favors Gemini, or a provider contract and data-residency rule that favor pointing BYOLLM at your own OpenAI tenant.

Choose the model to satisfy a constraint, not to chase a leaderboard. And whichever you pick, the Einstein Trust Layer keeps the governance protections in place.

To understand why the model is only half the accuracy story, read Data Cloud as AI Prerequisite. A better LLM cannot compensate for ungrounded data. And for the reasoning architecture that wraps whichever model you choose, see How the Atlas Reasoning Engine Powers Agentforce.

Model availability and Einstein Request consumption figures are based on Salesforce documentation current as of June 2026. Confirm the supported model list and consumption rates in your own org before making a production decision, as both change with each release.
