Why VIES is unreliable — and how we built around it
VIES has ~70% uptime. Here's how EuroValidate achieves 99.9% with per-country circuit breakers and dual-layer caching.
If you have ever built a SaaS product that sells to European businesses, you have hit VIES. The VAT Information Exchange System is the European Commission's service for validating EU VAT numbers. Every B2B transaction that involves reverse charge, every checkout flow that needs to verify a customer's tax status, every KYB onboarding that must confirm a company is real — all of it runs through VIES.
And VIES is unreliable.
Not unreliable in the "sometimes slow" sense. Unreliable in the "Germany generates the vast majority of all VIES errors, the entire system has a global concurrency limit, and two of the largest economies in the EU refuse to return company names" sense.
This post is a technical breakdown of what makes VIES difficult to work with, the specific failure modes we have documented, and the architecture decisions we made at EuroValidate to build a reliable layer on top of it.
How VIES actually works
VIES is not a single database. It is a SOAP-based routing layer operated by the European Commission. When you send a VAT validation request, VIES forwards it to the national tax authority of the relevant member state. A German number goes to the Bundeszentralamt für Steuern. A French one goes to the DGFiP. Each of the 27 EU member states runs its own backend.
This architecture is the root cause of most reliability problems. VIES is only as available as the weakest national system online at any given moment.
The WSDL endpoint is:
https://ec.europa.eu/taxation_customs/vies/checkVatService.wsdl
A typical SOAP request looks like this:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:urn="urn:ec.europa.eu:taxud:vies:services:checkVat:types">
  <soapenv:Body>
    <urn:checkVat>
      <urn:countryCode>NL</urn:countryCode>
      <urn:vatNumber>820646660B01</urn:vatNumber>
    </urn:checkVat>
  </soapenv:Body>
</soapenv:Envelope>
If the Dutch backend is up, you get a response in 200-2000ms. If it is down, you get a SOAP fault. If too many people are querying it simultaneously, you get MS_MAX_CONCURRENT_REQ. If the entire VIES router is overloaded, you get SERVICE_UNAVAILABLE.
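The fault strings above fall into two practical groups: ones worth retrying and ones where retrying the same input cannot help. Here is a minimal sketch of that split, using only fault strings named in this post. The `FaultKind` enum and `classify_fault` helper are illustrative names, not part of VIES or of any library, and treating `SERVICE_UNAVAILABLE` as retryable is an assumption.

```python
from enum import Enum


class FaultKind(Enum):
    TRANSIENT = "transient"   # worth retrying later
    PERMANENT = "permanent"   # retrying the same input will not help
    UNKNOWN = "unknown"       # a fault string this sketch does not recognise


# Fault strings mentioned in this post; a real integration may see others.
_TRANSIENT_FAULTS = frozenset(
    {"MS_MAX_CONCURRENT_REQ", "MS_UNAVAILABLE", "SERVICE_UNAVAILABLE", "TIMEOUT"}
)
_PERMANENT_FAULTS = frozenset({"INVALID_INPUT"})


def classify_fault(fault_string: str) -> FaultKind:
    """Map a SOAP faultstring to a coarse retry category."""
    code = fault_string.strip().upper()
    if code in _TRANSIENT_FAULTS:
        return FaultKind.TRANSIENT
    if code in _PERMANENT_FAULTS:
        return FaultKind.PERMANENT
    return FaultKind.UNKNOWN
```

Keeping an explicit `UNKNOWN` bucket means a new or unexpected fault string gets logged and looked at, not silently retried forever.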
The specific problems
Problem 1: MS_MAX_CONCURRENT_REQ is a global limit
This is the one that surprises most developers. When a member state backend returns MS_MAX_CONCURRENT_REQ, it does not mean your application is sending too many requests. It means the entire country's backend is overloaded — across all consumers worldwide. Every fintech startup, every ERP system, every government cross-check, every tax software vendor — all sharing a single concurrency pool per country.
There is nothing you can do about this except retry with backoff and cache aggressively.
Problem 2: Germany and Spain never return company names
Germany and Spain have national data protection laws that prohibit VIES from returning trader name and address information. When you validate a German VAT number like DE123456789, the response will confirm valid/invalid but the name and address fields will be empty.
This is not a bug. This is permanent behavior mandated by law.
In our VIES client, we handle this explicitly:
# Countries whose national systems do not return name/address
_NO_TRADER_DATA_COUNTRIES: frozenset[str] = frozenset({"DE", "ES"})
For these countries, we fall back to the GLEIF API to resolve company identity. GLEIF (Global Legal Entity Identifier Foundation) provides company data for any entity that holds an LEI. It is free, requires no authentication, and returns structured JSON instead of SOAP.
Problem 3: Greece uses EL, not GR
ISO 3166-1 says Greece is GR. VIES uses EL (from the Greek name "Ellada"). If you send GR to VIES, you get an INVALID_INPUT fault. If your system stores country codes as ISO 3166-1 alpha-2 (as it should), you need bidirectional mapping.
This is a small thing, but it has broken countless integrations. We have seen production systems that worked for 26 countries and silently failed for Greece for months before anyone noticed.
Our mapping is explicit:
_COUNTRY_TO_VIES: dict[str, str] = {"GR": "EL"}
_VIES_TO_COUNTRY: dict[str, str] = {"EL": "GR"}
Both directions. Always. Every VAT number input that starts with GR gets mapped to EL before hitting VIES. Every response from VIES with EL gets mapped back to GR before reaching the client.
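A minimal sketch of how the two mappings might be applied at the edges of a client. The helper names `to_vies` and `from_vies` are hypothetical, and the input cleaning (strip whitespace, uppercase) is an assumption about what callers send.

```python
_COUNTRY_TO_VIES: dict[str, str] = {"GR": "EL"}
_VIES_TO_COUNTRY: dict[str, str] = {"EL": "GR"}


def to_vies(full_vat: str) -> tuple[str, str]:
    """Split 'GR123456789' into (VIES country code, bare number)."""
    cleaned = "".join(full_vat.split()).upper()
    if len(cleaned) < 3 or not cleaned[:2].isalpha():
        raise ValueError(f"expected a two-letter country prefix: {full_vat!r}")
    prefix, number = cleaned[:2], cleaned[2:]
    # ISO code in, VIES code out; codes without a mapping pass through.
    return _COUNTRY_TO_VIES.get(prefix, prefix), number


def from_vies(vies_code: str) -> str:
    """Translate a VIES country code back to ISO 3166-1 alpha-2."""
    code = vies_code.upper()
    return _VIES_TO_COUNTRY.get(code, code)
```

With this shape, `to_vies("GR123456789")` and `to_vies("EL123456789")` both yield the `EL` prefix, so callers can use either spelling.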
Problem 4: Northern Ireland dual status
Northern Ireland exists in both EU customs territory (via the Windsor Framework) and UK VAT territory simultaneously. A Northern Irish business might have:
- An XI EORI number (EU customs)
- A GB VAT number (UK tax)
Your system needs to handle both. An XI prefix routes to the EU EORI validation system. A GB prefix routes to HMRC's REST API. Same company, different systems, different protocols (SOAP vs REST).
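The prefix-based routing described above can be sketched as a small dispatch function. The `Upstream` enum and `route_identifier` name are illustrative, and sending every other prefix to VIES is a simplifying assumption.

```python
from enum import Enum


class Upstream(Enum):
    VIES = "vies"        # EU VAT numbers (SOAP)
    EC_EORI = "ec_eori"  # EU EORI validation (SOAP)
    HMRC = "hmrc"        # UK VAT numbers (REST)


def route_identifier(identifier: str) -> Upstream:
    """Pick the upstream system from the two-letter prefix."""
    prefix = identifier.strip().upper()[:2]
    if prefix == "XI":
        return Upstream.EC_EORI
    if prefix == "GB":
        return Upstream.HMRC
    # Assumption: anything else is treated as an EU VAT number.
    return Upstream.VIES
```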
Problem 5: Territory exceptions
Not everything inside an EU country is inside the EU VAT territory.
- Monaco is treated as France for VAT purposes. A Monaco-based company uses an FR VAT prefix.
- The Canary Islands are Spanish territory but outside the EU VAT area. A company in Las Palmas does not charge EU VAT.
- Mount Athos is Greek territory but outside the EU VAT area.
- Campione d'Italia and the Italian waters of Lake Lugano are Italian territory but outside the EU VAT area.
If your checkout flow does not account for these, you will either overcharge or undercharge VAT. Both are compliance violations.
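One way to keep these exceptions out of ad hoc `if` statements is a small override table consulted before the country-level default. The sketch below encodes only the cases listed above. The territory keys, the `VatTreatment` type, and the `vat_treatment` function are hypothetical, and how a real system detects the territory (postal code, region field, manual flag) is left open.

```python
from typing import NamedTuple, Optional


class VatTreatment(NamedTuple):
    vat_prefix: Optional[str]  # None where this post names no prefix
    in_eu_vat_area: bool


# Hypothetical territory keys; only the exceptions listed in this post.
_TERRITORY_OVERRIDES: dict[str, VatTreatment] = {
    "monaco": VatTreatment("FR", True),
    "canary_islands": VatTreatment(None, False),
    "mount_athos": VatTreatment(None, False),
    "campione_d_italia": VatTreatment(None, False),
    "lake_lugano_italian_waters": VatTreatment(None, False),
}


def vat_treatment(country_code: str, territory: Optional[str] = None) -> VatTreatment:
    """Return the override for a known territory, else the country default.

    Assumes country_code is an EU member state when no override applies.
    """
    if territory is not None:
        override = _TERRITORY_OVERRIDES.get(territory.lower())
        if override is not None:
            return override
    return VatTreatment(country_code.upper(), True)
```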
Problem 6: VIES SOAP returns "---" instead of null
When trader data is unavailable (but the country does nominally return it), VIES does not send an empty string or omit the field. It sends the literal string "---". If you are not filtering for this, your database now has company names that are three dashes.
# Treat the "---" placeholder and blank strings as missing data.
if name is not None and name.strip() in ("---", ""):
    name = None
if address is not None and address.strip() in ("---", ""):
    address = None
Our architecture: how we built around it
When we designed EuroValidate, we started from a simple premise: the upstream is unreliable, so every layer of our system must assume it can fail at any moment.
Per-country circuit breakers
We run 28 independent circuit breakers for VIES — one per EU member state (with GR and EL aliased to the same breaker). Plus one for the EC EORI endpoint and one for HMRC. That is 30 circuit breakers total.
When Germany is down (which happens frequently), the breaker for DE trips to OPEN state. Requests for French, Dutch, or Italian VAT numbers continue flowing unaffected. Germany being down does not equal Europe being down.
Each breaker uses the same configuration:
- fail_max: 5 consecutive failures before tripping
- reset_timeout: 60 seconds before allowing a probe request
- State machine: CLOSED (normal) -> OPEN (blocking) -> HALF_OPEN (probing) -> CLOSED
breaker_registry = CircuitBreakerRegistry()
breaker = breaker_registry.get_vies_breaker("DE")
# If DE has failed 5 times, this raises CircuitBreakerError immediately
# No waiting, no timeout, no wasted connection
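To make the state machine concrete, here is a minimal, single-threaded sketch of a breaker with the same `fail_max` and `reset_timeout` parameters. It is not the `CircuitBreakerRegistry` used above. `SimpleBreaker` and `CircuitOpenError` are illustrative names, and a production breaker would also need locking and a limit on concurrent probes.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised instead of calling an upstream that is marked as down."""


class SimpleBreaker:
    def __init__(self, fail_max: int = 5, reset_timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.fail_max = fail_max
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "HALF_OPEN"  # timeout elapsed: allow a probe request
        return "OPEN"

    def call(self, func: Callable[..., Any], *args: Any) -> Any:
        if self.state == "OPEN":
            raise CircuitOpenError("upstream marked unavailable")
        try:
            result = func(*args)
        except Exception:
            self._failures += 1
            # A failed probe, or the fail_max-th consecutive failure, opens the breaker.
            if self._opened_at is not None or self._failures >= self.fail_max:
                self._opened_at = self._clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


# One independent breaker per country code.
breakers = {cc: SimpleBreaker() for cc in ("DE", "FR", "NL")}
```

Because each country has its own instance, failures recorded on `breakers["DE"]` never affect `breakers["FR"]`.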
Dual-layer cache
Every successful VIES response is cached in two layers:
- Redis (hot cache): sub-millisecond reads, TTL of 24 hours for VAT data
- PostgreSQL (persistent cache): survives Redis restarts, used as fallback
The lookup order is: Redis -> PostgreSQL -> upstream VIES.
If VIES is down and the circuit breaker is open, we serve from cache with reduced confidence. The client always gets a response. The confidence score tells them how much to trust it.
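The lookup order can be sketched with plain dictionaries standing in for Redis and PostgreSQL. Everything here is illustrative: the `Entry` type, the `lookup` function, and the source labels other than `cache_postgresql` are assumptions. The policy of serving a stale persistent entry only after the upstream call fails is one reasonable reading of the text, not a confirmed detail.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Entry:
    valid: bool
    fetched_at: float  # Unix timestamp of the last live verification


def lookup(key: str, hot: dict, persistent: dict,
           fetch_upstream: Callable[[str], bool],
           now: Callable[[], float] = time.time,
           ttl: float = 24 * 3600.0) -> tuple[Optional[Entry], str]:
    """Return (entry, source), trying hot cache, persistent cache, then upstream."""
    entry = hot.get(key)  # real Redis would already have expired old keys
    if entry is not None:
        return entry, "cache_redis"
    stored = persistent.get(key)
    if stored is not None and now() - stored.fetched_at < ttl:
        hot[key] = stored  # repopulate the hot layer
        return stored, "cache_postgresql"
    try:
        valid = fetch_upstream(key)
    except Exception:
        # Upstream down: fall back to stale data if there is any.
        if stored is not None:
            return stored, "cache_postgresql"
        return None, "unavailable"
    fresh = Entry(valid=valid, fetched_at=now())
    hot[key] = fresh
    persistent[key] = fresh
    return fresh, "upstream"
```

The caller can then derive a confidence level from the returned source and the age of `fetched_at`.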
Confidence scoring
Every EuroValidate response includes a confidence field: HIGH, MEDIUM, LOW, or UNKNOWN.
The rules are deterministic:
- HIGH: Live response from upstream, or deterministic offline check (like IBAN MOD 97), or cached data less than 1 hour old
- MEDIUM: Cached data between 1 and 24 hours old, or cross-referenced sources that agree
- LOW: Cached data older than 24 hours (stale)
- UNKNOWN: Upstream unreachable and no cached data exists
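The age-based part of these rules fits in a few lines. This is a sketch under assumptions: `ConfidenceLevel` is modelled as an `IntEnum` so that levels can be compared with `>=`, the function name `score_by_freshness` is hypothetical, and the boundaries (exactly 1 hour and exactly 24 hours both count as MEDIUM) are one reading of the list above. Cross-referencing is not handled here.

```python
from enum import IntEnum
from typing import Optional


class ConfidenceLevel(IntEnum):
    UNKNOWN = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


ONE_HOUR = 3600.0
ONE_DAY = 24 * ONE_HOUR


def score_by_freshness(live: bool, cache_age_seconds: Optional[float]) -> ConfidenceLevel:
    """Score a result from its origin: live, cached (with age), or nothing."""
    if live:
        return ConfidenceLevel.HIGH
    if cache_age_seconds is None:  # upstream unreachable and no cache entry
        return ConfidenceLevel.UNKNOWN
    if cache_age_seconds < ONE_HOUR:
        return ConfidenceLevel.HIGH
    if cache_age_seconds <= ONE_DAY:
        return ConfidenceLevel.MEDIUM
    return ConfidenceLevel.LOW
```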
Cross-referencing can promote MEDIUM to HIGH. If both VIES and GLEIF agree on a company's identity, and the VIES data is cached within 24 hours, the confidence gets boosted. But cross-referencing never promotes LOW or UNKNOWN — the data is too stale to trust regardless of source agreement.
def score_cross_referenced(primary, secondary, entities_match):
    if not entities_match:
        return primary  # No boost on mismatch
    if (primary.level == ConfidenceLevel.MEDIUM
            and secondary.level >= ConfidenceLevel.MEDIUM):
        return HIGH  # MEDIUM -> HIGH when sources agree
    return primary
This lets clients make informed decisions. A checkout flow might accept HIGH and MEDIUM but flag LOW for manual review. A KYB pipeline might require HIGH only.
Graceful degradation
EuroValidate never returns a bare HTTP 500. If the upstream is down, we return cached data with reduced confidence and an upstream_status field that tells the client what happened.
Compare this to calling VIES directly:
Raw VIES (when Germany is down):
<soap:Fault>
<faultcode>soap:Server</faultcode>
<faultstring>MS_UNAVAILABLE</faultstring>
</soap:Fault>
Your application crashes or returns a 500 to your user. No data. No context. No fallback.
EuroValidate (same scenario):
{
"success": true,
"data": {
"valid": true,
"country_code": "DE",
"vat_number": "812526315",
"name": null,
"address": null
},
"meta": {
"confidence": "MEDIUM",
"source": "cache_postgresql",
"cached": true,
"response_time_ms": 3,
"last_verified": "2026-04-04T14:30:00Z",
"upstream_status": "vies_de: unavailable"
},
"request_id": "req_7f3k2m"
}
The client gets data. They know it is cached. They know the upstream is down. They know the confidence level. They can make a business decision.
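On the client side, that business decision can be a few lines of code against the response shape shown above. This is a sketch of one possible checkout policy (accept HIGH and MEDIUM, send everything else to manual review). The function name `checkout_decision` and the three outcome strings are illustrative, and routing UNKNOWN to review is an assumption.

```python
def checkout_decision(response: dict) -> str:
    """Return 'accept', 'reject', or 'review' for a validation response."""
    meta = response.get("meta") or {}
    data = response.get("data") or {}
    confidence = meta.get("confidence", "UNKNOWN")
    if confidence in ("HIGH", "MEDIUM"):
        # Trust the verdict when the data is fresh enough.
        return "accept" if data.get("valid") else "reject"
    # LOW or UNKNOWN: do not decide automatically.
    return "review"


cached_response = {
    "success": True,
    "data": {"valid": True, "country_code": "DE", "vat_number": "812526315"},
    "meta": {"confidence": "MEDIUM", "source": "cache_postgresql", "cached": True},
}
decision = checkout_decision(cached_response)
```

A stricter pipeline, such as KYB onboarding, could narrow the first branch to HIGH only.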
Retry with exponential backoff and jitter
For transient errors (MS_MAX_CONCURRENT_REQ, TIMEOUT, MS_UNAVAILABLE), we make up to 3 attempts (the initial call plus two retries) with exponential backoff plus jitter:
@retry(
    retry=retry_if_exception_type(ViesTransientError),
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=0.5, max=5, jitter=1),
    reraise=True,
)
def check_vat(self, country_code, vat_number):
    ...
The jitter prevents a thundering herd when VIES comes back up after an outage. Without jitter, every client that failed during the outage retries at exactly the same intervals, recreating the overload.
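To see what those parameters mean in seconds, here is a standard-library sketch of the delay schedule: an exponential term plus a uniform random jitter, capped at a maximum. It approximates the behaviour of the wait strategy above as we understand it and is not the library's code. The function name `backoff_delays` is hypothetical.

```python
import random
from typing import Optional


def backoff_delays(attempts: int = 3, initial: float = 0.5, maximum: float = 5.0,
                   jitter: float = 1.0, exp_base: float = 2.0,
                   rng: Optional[random.Random] = None) -> list[float]:
    """Return the waits between attempts; there is no wait after the last one."""
    if rng is None:
        rng = random.Random()
    delays = []
    for n in range(attempts - 1):
        raw = initial * exp_base ** n + rng.uniform(0, jitter)
        delays.append(min(raw, maximum))
    return delays


# With 3 attempts there are two waits: roughly 0.5-1.5 s, then 1.0-2.0 s.
example = backoff_delays(rng=random.Random(0))
```

Two clients that fail at the same instant draw different jitter values, so their retries spread out instead of landing together.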
Async execution
VIES uses SOAP, which for us means zeep, and the standard zeep client is synchronous. In an async FastAPI application, blocking the event loop on a SOAP call would kill throughput. We run all zeep calls in a bounded thread pool:
_executor = ThreadPoolExecutor(max_workers=10, thread_name_prefix="vies")

async def validate_vat(full_vat: str) -> ViesResult:
    vies_cc, vat_number = _normalise_input(full_vat)
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(
        _executor, _soap_client.check_vat, vies_cc, vat_number
    )
    return result
Ten concurrent VIES calls maximum. Enough to handle burst traffic. Not enough to exhaust connections or contribute to MS_MAX_CONCURRENT_REQ.
The result
EuroValidate turns an unreliable SOAP service into a reliable REST API. One endpoint. JSON responses. Confidence scores. Graceful degradation. No SOAP, no XML parsing, no country-code mapping, no retry logic in your application code.
The numbers from our system:
- 130 EU banks in the IBAN registry for instant BIC/bank lookup
- 30 independent circuit breakers (28 VIES + 1 EORI + 1 HMRC)
- 4 confidence levels with deterministic scoring rules
- 3-layer lookup: Redis (sub-ms) -> PostgreSQL (low ms) -> upstream (200-2000ms)
- IBAN validation: fully offline, deterministic, always HIGH confidence
If you are building anything that touches EU VAT, IBAN, or EORI validation, you can try EuroValidate for free at eurovalidate.com. The API docs are at api.eurovalidate.com/docs.
We built this because we got tired of writing the same VIES workarounds in every project. Now you do not have to either.
