Sovereign cloud or on-premise: the decision matrix for hosting your LLMs

The question comes up constantly in our engagements: "We want to host our LLM in a sovereign way. Should we go with SecNumCloud or install our own GPUs?" The framing assumes two options. In practice there are three (public hyperscaler cloud, SecNumCloud-qualified cloud, and on-premise GPUs), each with its own risk profile, regulatory constraints, and economics, all of which depend heavily on the organization making the call.

The French sovereign cloud market has taken shape at an accelerating pace through 2025 and 2026. S3NS (Thales-Google) received SecNumCloud qualification in December 2025, becoming the first provider to cover IaaS, PaaS, and CaaS simultaneously under a single qualification. Bleu (Orange-Capgemini), which distributes Azure within a sovereign wrapper, passed the first stage of the qualification process in April 2025 and is targeting completion in the first half of 2026. OVHcloud and Scaleway hold qualification on specific service scopes. As of 2026, nine providers are fully qualified, and twelve additional applications are under review. The market exists, it is maturing, and commercial claims around it are multiplying.

What has not kept pace is clarity about what these qualifications actually cover. "Sovereign cloud" can mean very different things: data physically hosted in France with no special legal protection, a contractual data residency commitment, a sector-specific technical certification, or a full SecNumCloud qualification with immunity against extraterritorial laws. For an executive deciding where to host LLMs in 2026, these are not cosmetic distinctions. They determine what data the organization can process on that infrastructure, what it is legally exposed to, and what the infrastructure actually costs over three years.

What SecNumCloud qualification actually guarantees

The ANSSI SecNumCloud 3.2 framework is the most demanding qualification available in France for cloud providers. It imposes more than 360 compliance criteria across 19 chapters, covering technical, operational, legal, and organizational security dimensions. The central requirement that distinguishes it from every other label: the provider must have its registered office and capital in the European Union, and its infrastructure must be shielded against extraterritorial foreign laws, including the US CLOUD Act and Section 702 of FISA.

This is precisely the point that makes SecNumCloud relevant for hosting LLMs that process sensitive data. In its 2020 Schrems II ruling, the Court of Justice of the EU invalidated the EU-US Privacy Shield because US surveillance law, notably Section 702 of FISA, does not give EU data subjects protection essentially equivalent to GDPR. The CLOUD Act raises the same kind of conflict for any provider under US jurisdiction, even when its servers are physically located in Europe. A SecNumCloud-qualified provider offers legal protection that the contractual commitments of US hyperscalers simply cannot match, regardless of how robust their Data Processing Agreements are or how many technical certifications they hold.

The qualification scope: the question that matters most

SecNumCloud qualification applies to a specific service, not to a provider as a whole. A cloud vendor can hold SecNumCloud qualification for its IaaS infrastructure while offering GPU services or AI inference APIs that are not yet within the qualified scope. OVHcloud, for example, announced in early 2026 that its AI services would be integrated into its SecNumCloud stack by end of 2026: its AI Endpoints were therefore not covered by the qualification at that point. The question to ask every provider, without exception: "Is the specific service I will use within the scope of the qualification, or just the underlying infrastructure?" That nuance changes the entire analysis.
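One way to make that question operational during procurement is to record each provider's qualified perimeter as data and check the exact service you plan to consume against it. The sketch below is illustrative only: the provider name, the service labels, and the `QualifiedScope` structure are all hypothetical, and the real perimeter has to come from the provider's current qualification documentation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QualifiedScope:
    """Services explicitly listed in a provider's current qualification (hypothetical model)."""
    provider: str
    services: frozenset = field(default_factory=frozenset)

    def covers(self, service: str) -> bool:
        # Only an explicit listing counts: qualification of the underlying
        # infrastructure does not extend to services built on top of it.
        return service.strip().lower() in self.services


# Hypothetical example data, not a statement about any real provider.
scope = QualifiedScope(
    provider="ExampleCloud",
    services=frozenset({"iaas-compute", "object-storage"}),
)

for service in ("iaas-compute", "managed-llm-inference"):
    status = "in qualified scope" if scope.covers(service) else "NOT in qualified scope"
    print(f"{scope.provider} / {service}: {status}")
```

The point of the design is that the check is made per service, never per provider, which mirrors how the qualification itself is granted.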

Public cloud hyperscalers: real strengths and clear limits

AWS, Azure, and Google Cloud offer the broadest model catalog available, native integration with every development toolchain, robust SLAs, near-instant scalability, and prices that have continued to fall. AWS cut H100 GPU pricing by 44% in June 2025, dropping rates on P5 instances from roughly $7.57 to $3.90 per GPU-hour. For AI workloads without sovereignty constraints, it is a value proposition that is hard to beat outright.

The limitation is strictly legal. Microsoft, Google, and Amazon are US-incorporated entities subject to the CLOUD Act. A data residency contract localizing data in France or in the EU does not protect against a US legal order if American authorities choose to issue one. The risk is not theoretical, and for organizations processing identifiable health data, trade secrets, strategic plans, or contractual data involving third parties who have not consented to US jurisdiction, it is not an acceptable one.

The public hyperscaler remains the right choice for: proof-of-concept projects and exploration, base models accessed via API on generic data, classification or generation workloads on non-sensitive content, and teams that need iteration speed and scaling flexibility without formal regulatory constraints.

Fake sovereignty: what to watch for

The market is full of solutions that present themselves as "sovereign" without that claim resting on any precise legal or technical guarantee. Three formulations warrant critical scrutiny.

"Data center in France": physical hosting of data in France creates no legal immunity if the entity operating the infrastructure is a subsidiary of a US company. The server can be in Strasbourg; if the parent company is in Seattle, the CLOUD Act applies.

"GDPR-compliant": every company operating in the EU must comply with GDPR. This label implies neither SecNumCloud qualification, nor protection against extraterritorial laws, nor sovereign governance. It is the minimum legal floor, not an advanced level of protection.

"Sovereign cloud" without ANSSI qualification: the term is not legally defined. It can describe a contractual commitment with no real legal weight, a sector-specific certification (HDS, ISO 27001), or a full SecNumCloud qualification. These are not interchangeable. The answer to "qualified by whom?" and "for exactly which services?" must be verifiable, not just asserted in a sales deck.

In practice, real sovereignty comes down to two axes: immunity against extraterritorial laws (SecNumCloud in France, equivalent requirements in other EU member states) and the location of operational decision-making authority (who can access the data, who responds to legal orders, who controls the configuration). Both must be solid for a solution to be genuinely sovereign.

On-premise GPU: running the real economics

On-premise infrastructure, meaning the purchase and operation of your own GPUs, is often positioned as the ultimate in control and sovereignty. That is true in specific contexts. In many others, it is a calculation error.

The real three-year TCO

A server with 8 NVIDIA H100 GPUs costs between $250,000 and $350,000 to purchase in 2025-2026, with procurement lead times averaging 5 to 6 months. On top of hardware come power consumption (40 to 100 kW per rack, incompatible with most existing enterprise data centers without expensive retrofits for liquid cooling and upgraded power delivery), maintenance, and staff costs. The Lenovo TCO analysis published in 2025 quantifies the human layer precisely: a 0.5 FTE infrastructure engineer costs $75,000 to $100,000 per year, or $225,000 to $300,000 over three years. In organizations without a pre-existing infrastructure team, this line item can approach or even exceed the hardware cost itself.
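The figures above can be tallied in a few lines. This is a minimal sketch using only the hardware and staffing ranges quoted in the text. Power, cooling retrofits, and maintenance are left as an optional parameter because the text gives no dollar figures for them, and the function name and parameters are illustrative.

```python
def on_prem_tco(hardware_usd: float, staff_usd_per_year: float,
                other_opex_usd_per_year: float = 0.0, years: int = 3) -> float:
    """Hardware purchase plus recurring costs over the horizon (no discounting)."""
    return hardware_usd + years * (staff_usd_per_year + other_opex_usd_per_year)


# Ranges quoted in the text: one 8x H100 server and a 0.5 FTE engineer.
low = on_prem_tco(hardware_usd=250_000, staff_usd_per_year=75_000)
high = on_prem_tco(hardware_usd=350_000, staff_usd_per_year=100_000)

print(f"3-year on-premise TCO, excluding power and retrofits: ${low:,.0f} to ${high:,.0f}")
```

On these inputs the range is $475,000 to $650,000 before any power, cooling, or maintenance cost is added, so the staffing line accounts for close to half of the total.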

The break-even logic is clear. A 2026 market analysis puts the break-even point for an 8x H100 configuration at approximately 8,500 to 9,000 hours of effective utilization, or roughly 11 to 12 months of continuous operation. Below 70% sustained utilization, the public cloud is cheaper over three years. Above 80% sustained utilization on predictable workloads, on-premise becomes the more economical option at that time horizon.
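The break-even arithmetic can be sketched as follows. This is a simplified model and not the methodology of the market analysis cited above: it compares the hardware purchase price against cumulative cloud spend only, and ignores staff, power, and maintenance. The $280,000 hardware price is an assumed value inside the range quoted earlier, and $3.90 per GPU-hour is the post-cut P5 rate mentioned in the hyperscaler section.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12


def break_even_hours(hardware_usd: float, cloud_usd_per_gpu_hour: float, gpus: int = 8) -> float:
    """Hours of use after which cumulative cloud spend equals the hardware purchase price."""
    return hardware_usd / (cloud_usd_per_gpu_hour * gpus)


def months_to_reach(hours: float, utilization: float = 1.0) -> float:
    """Calendar months needed to accumulate `hours` of use at a utilization in (0, 1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hours / (HOURS_PER_MONTH * utilization)


# Assumed inputs (see lead-in): $280,000 server, $3.90 per GPU-hour, 8 GPUs.
hours = break_even_hours(hardware_usd=280_000, cloud_usd_per_gpu_hour=3.90)
print(f"Break-even after about {hours:,.0f} hours of use")
print(f"Continuous operation: about {months_to_reach(hours):.1f} months")
```

At 100% utilization, the 8,500 to 9,000 hour range corresponds to about 11.6 to 12.3 months. Passing a lower `utilization` stretches the calendar time proportionally, which is why sustained utilization is the variable that decides the outcome.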

The three situations where on-premise makes sense

First situation: classification requirements leave no choice. Restricted-access government data, critical infrastructure systems (OIV in France), and defense environments requiring complete network isolation (air-gapped deployments) cannot use any external cloud, even SecNumCloud-qualified. The constraint is regulatory and non-negotiable.

Second situation: inference volume is high and predictable. A model running 18 to 20 hours per day, 7 days a week, on a stable production use case sits at roughly 75 to 83% utilization and reaches the 8,500 to 9,000 hour break-even point in roughly 14 to 17 months, well inside a three-year horizon. Predictability of the load matters as much as the absolute volume.

Third situation: the infrastructure and expertise already exist. A large industrial group with its own data center and a seasoned infrastructure team faces a very different TCO picture than a mid-market company building everything from scratch. On-premise that integrates into an existing foundation is economically very different from on-premise as a greenfield project.

What on-premise does not solve

On-premise shifts the security responsibility; it does not reduce it. A GPU cluster that is misconfigured, unmonitored, physically accessible without proper access controls, or running on outdated firmware or drivers is objectively less secure than a SecNumCloud-qualified cloud environment audited against more than 360 criteria. Physical sovereignty does not substitute for operational security maturity. This is a common and consequential confusion.

The decision matrix: four questions in order

Rather than an abstract criteria grid, here are the four questions we ask systematically in engagements to guide hosting architecture decisions.

Question 1: what is the regulatory classification of the data?

Does the data being processed fall into a regulated category: health data (HDS certification required in France), government or defense data (SecNumCloud or air-gapped on-premise), critical infrastructure data (OIV), or administratively classified data? If yes, the regulatory framework defines the viable options. Economic analysis comes after, not before.

Question 2: is there an unacceptable extraterritorial risk?

Does the data include trade secrets, strategic plans, sensitive competitive intelligence, or contractual data that you cannot risk exposing to a foreign legal order? If yes, public hyperscaler cloud is excluded. SecNumCloud or on-premise are the two viable options.

Question 3: is inference volume high and predictable?

If expected utilization is below 70% over time, cloud remains cheaper over three years. If it exceeds 80% on a stable and predictable load, on-premise becomes worth considering provided the expertise is in place. Between those thresholds, SecNumCloud often represents the best balance between protection and flexibility.

Question 4: does the organization have the operational capacity to run GPU infrastructure?

On-premise without solid infrastructure expertise generates a predictable cycle: difficult installation, persistent configuration issues, underutilized GPUs, gradual abandonment. If that expertise is not present and there is no budget to build it, cloud is the only realistic choice, regardless of the theoretical preference for sovereignty.

	Public cloud	SecNumCloud cloud	On-premise
Strong regulatory constraint	No	Yes	Yes (air-gapped)
Extraterritorial protection	No	Yes	Yes
Low or erratic volume	Yes	Yes	No
High, stable volume	Yes	Yes	Yes (if expertise present)
Infrastructure expertise required	No	No	Yes
Initial cost	Low	Medium	High
Model catalog breadth	High	Medium	Variable
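The four questions can also be written down as a small decision function. This is a deliberately simplified sketch, not a substitute for a regulatory analysis: the `Workload` fields and the `recommend` function are hypothetical names, sector-specific requirements such as HDS are folded into a single `regulated_data` flag, and the 80% threshold is the one used in Question 3.

```python
from dataclasses import dataclass
from enum import Enum


class Hosting(Enum):
    PUBLIC_CLOUD = "public cloud"
    SECNUMCLOUD = "SecNumCloud cloud"
    ON_PREMISE = "on-premise"


@dataclass
class Workload:
    requires_air_gap: bool        # Q1: classification forbids any external cloud
    regulated_data: bool          # Q1: e.g. health, government, or OIV data
    extraterritorial_risk: bool   # Q2: exposure to a foreign legal order is unacceptable
    utilization: float            # Q3: expected sustained GPU utilization, 0.0 to 1.0
    has_gpu_ops_team: bool        # Q4: in-house capacity to run GPU infrastructure


def recommend(w: Workload) -> Hosting:
    # Q1: a regulatory air-gap requirement overrides every other consideration.
    if w.requires_air_gap:
        return Hosting.ON_PREMISE
    # Q3 + Q4: on-premise is only worth considering above 80% sustained
    # utilization AND with the expertise to operate it.
    if w.utilization > 0.80 and w.has_gpu_ops_team:
        return Hosting.ON_PREMISE
    # Q1 + Q2: regulated or extraterritorially sensitive data excludes hyperscalers.
    if w.regulated_data or w.extraterritorial_risk:
        return Hosting.SECNUMCLOUD
    return Hosting.PUBLIC_CLOUD


# Trade secrets, moderate load, no GPU operations team.
example = Workload(False, False, True, 0.5, False)
print(recommend(example).value)
```

The order of the checks is the substance here: regulation first, economics and operational capacity next, and the hyperscaler only as the default when nothing above applies.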

What we actually implement in engagements

When clients come to us to choose a hosting architecture for their LLMs, we always start by mapping the real data that will flow through the system, not generic categories. In the vast majority of cases, two flows coexist: a sensitive data flow that requires strong protection (rarely more than 20 to 30% of total volume), and a routine operational flow with no significant regulatory constraint. This finding almost always points toward a hybrid architecture.

What organizations consistently underestimate: the coherence cost of a hybrid system. Two separate environments mean data gateways, classification policies applied upstream, and teams capable of operating both. It is manageable and often the right answer. But it needs to be factored into the decision from the start, not discovered as a surprise during integration.
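The "classification policies applied upstream" part of that cost can be pictured as a small routing layer in front of the two environments. The sketch below is an assumption about how such a gateway might look, not a description of a specific product: the endpoint URLs are placeholders, and the classification step that produces the label is out of scope.

```python
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    ROUTINE = "routine"
    SENSITIVE = "sensitive"


# Hypothetical endpoint URLs, for illustration only.
ENDPOINTS = {
    Sensitivity.ROUTINE: "https://llm.public-cloud.example/v1",
    Sensitivity.SENSITIVE: "https://llm.sovereign.example/v1",
}


def route(label: Optional[Sensitivity]) -> str:
    """Return the LLM endpoint for a request that was classified upstream.

    Fails closed: a request that arrives without a label is treated as sensitive.
    """
    if label is None:
        label = Sensitivity.SENSITIVE
    return ENDPOINTS[label]


print(route(Sensitivity.ROUTINE))
print(route(None))
```

Failing closed means a classification gap costs some sovereign capacity rather than exposing sensitive data, which is usually the cheaper mistake.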

What organizations consistently overestimate: the protection provided by simply choosing a "French" or "European" provider. The provider's nationality is not enough. What matters is that the specific AI service being used is within the scope of the active qualification, not just that the provider holds qualification for some of its other services. Verifying this before signing, by requesting the exact perimeter of the current qualification, is one of the most important and most frequently skipped steps in procurement.

Want to review your specific situation together? Book a slot, and we will spend 30 minutes mapping your data flows, assessing your regulatory constraints, and giving you an initial view on the hosting architecture that fits your context.
