Case Study

How We Built an AI Compliance Scanner That Checks Marketing Assets Against EU Regulations in Under 2 Minutes

A production AI platform that scans marketing materials — images, PDFs, packaging — against Dutch and EU health claims regulations. Upload an asset, get a traffic-light compliance verdict with per-rule results, evidence quotes, and fix suggestions. No manual review required for the first pass.

Health & Nutrition ComplianceEU Regulatory10-Week BuildProduction AIGPT-4 Vision + RAG
620+
Assets Scanned in Production
11
Compliance Rules Checked Per Scan
~2 min
Per Asset — Down from 2-3 Hours
100%
Citation Transparency

The Challenge

Drowning in Manual Review

A European regulatory body responsible for reviewing marketing claims in the health, nutrition, and medical product space needed to fundamentally rethink how it processes compliance checks. Their reviewers were drowning.

Every marketing asset — social media posts, retail packaging, advertisements, e-commerce images, video scripts — had to be manually checked against a complex web of Dutch and EU health claims regulations, approved ingredient databases, and previously approved materials. A single review could take 2-3 hours. The backlog was growing faster than the team could clear it.

Reviewers manually cross-referencing each marketing claim against multiple regulatory databases, dossiers, and previously approved materials

No automated way to extract text from visual assets — images, packaging mockups, and designed PDFs required manual transcription

!

Inconsistent reviewer judgments — different reviewers flagging different issues on the same material

!

Customers waiting days for compliance feedback, delaying product launches and campaign rollouts

Generic AI chatbots failed immediately — could not handle Dutch regulatory language, visual assets, or multi-step compliance logic

No audit trail connecting verdicts back to specific regulatory sources, creating accountability gaps

What We Built

The AI Compliance Scanner

A self-service compliance checking platform where customers upload marketing assets and get an instant, structured regulatory verdict — complete with per-rule pass/fail results, evidence quotes, suggested fixes, and links to the exact regulatory documents that informed each finding.

Compliance Verdict System

GREEN — Compliant

All rules pass. Asset is cleared for use. No issues detected against applicable regulations.

!
ORANGE — Review Required

Some rules flagged. Needs human reviewer attention. Specific issues identified with fix suggestions.

RED — Violations Found

Critical compliance violations detected. Asset cannot be used as-is. Detailed remediation steps provided.

How It Works

Four Steps, Under Two Minutes

Every asset passes through a structured pipeline. The system automatically adapts which stages run based on the file type and content composition selected.

1

Upload

Drag & drop images or PDFs, provide product context & choose content type

2

Smart Check

File fingerprinted — if scanned before, instant results via Replay Mode

3

Extract

Text & visuals extracted via GPT-4 Vision or Document Intelligence

4

Analyse

Claims identified, regulatory docs retrieved, 11-step compliance check run

Content-Aware Pipeline

Only Run What You Need

The processing path adapts automatically based on file type and content selection — only the stages needed are executed, optimizing cost and speed.

Content TypePDF ConversionVision / OCRDoc IntelligenceClaims DetectionRAG + AI Validation
Image Only
Image & Text
Text Only
PDF + Image
PDF + Text Only
💰

Cost-Efficient by Design

Smart replay mode (SHA-256 fingerprinting) reuses results for identical files within a configurable time window — eliminating redundant AI processing costs and delivering instant repeat results. Combined with content-aware routing that skips unnecessary pipeline stages, the system minimizes Azure OpenAI spend on every scan.

Under the Hood

The Full AI Pipeline

Here is what happens at each stage when a new asset enters the system.

01

Asset Upload & Classification

User uploads an image, PDF, or text document. Selects the content composition (image only, image + text, or text only), product category (health product, medical device, etc.), and campaign type (advertisement, packaging, social media, retail folder, etc.). This metadata drives the downstream compliance rules that apply.

02

Smart Replay Check

The file is fingerprinted using SHA-256 hashing. If an identical file has been scanned within the configured replay window, the system returns prior results instantly — no AI processing needed. This eliminates redundant costs when the same asset is submitted multiple times.

03

Intelligent Text Extraction

For visual assets (images, designed PDFs), GPT-4 Vision reads and extracts all text — including text embedded in product shots, marketing banners, and styled layouts that OCR would miss. For text-only documents, Azure Document Intelligence handles extraction. Multi-page PDFs are converted to page images for visual analysis.

04

Claim Identification & Segmentation

Extracted text is split into packshot lines and marketing copy. A Dutch ingredient lexicon identifies nutrient and botanical claims. The system distinguishes between health claims, promotional statements, ingredient references, and retailer branding — each governed by different rules.

05

RAG Retrieval of Regulatory Context

A hybrid semantic + vector search pipeline retrieves relevant regulatory documents from Azure AI Search: approved health claims, on-hold claims, regulatory guidelines, training corrections from past reviews, product dossiers, and previously approved materials. This context grounds every AI decision in actual regulation.

06

Structured 11-Step Compliance Validation

GPT-4 runs a structured compliance check across 11 rule categories: training overrides, packshot approval, dossier matching, previous approval lookup, new text verification, banned phrase detection, imagery validation, wording deviation checks, ingredient-to-claim linkage, mandatory statement verification, and superlative claim detection. Each rule returns pass/fail with severity, evidence, and fix suggestions.

07

Full Audit Trail & Results

Every scan stores the complete chain: prompts sent, model parameters, raw AI responses, retrieved regulatory documents, confidence scores, and timestamps. Results display with per-rule verdicts, regulatory source citations, and a one-click "Submit for Internal Review" workflow.

The Difference

Before and After

Before: Manual Review

  • 2-3 hours per asset, reviewer manually cross-referencing databases
  • Inconsistent verdicts across different reviewers
  • No structured trail linking findings to regulatory sources
  • Customers waiting days for compliance feedback
  • Text in visual assets had to be manually transcribed
  • No self-service option — every check went through the review queue
  • Scaling meant hiring more reviewers

After: AI Compliance Scanner

  • Under 2 minutes per asset with structured, repeatable results
  • Consistent 11-rule validation applied identically every time
  • Every finding cites the specific regulatory document and source
  • Customers get instant first-pass compliance feedback on upload
  • GPT-4 Vision extracts text from images, packaging, and designed PDFs
  • Self-service upload for customers; reviewer dashboard for oversight
  • Scaling means processing more assets, not hiring more people

Platform Capabilities

More Than a Chatbot

A production platform with role-based access, a reviewer dashboard, admin configuration, and full bilingual support.

Role-Based Access

Customers upload and view their own scan results. Reviewers and admins see everything, manage settings, and audit any scan across all customers.

Reviewer Dashboard

Filterable, searchable list of all scanned assets with aggregate statistics. 620+ assets tracked with status breakdowns: compliant, review needed, rejected, pending.

Smart Replay

Identical files (matched by SHA-256 fingerprint) automatically reuse prior results within a configurable time window. Saves processing cost and returns results instantly.

Multi-Format Processing

Images, multi-page PDFs, and text documents. Visual assets use GPT-4 Vision; text-heavy documents use Azure Document Intelligence. The system auto-detects and routes.

Regulatory Source Citations

Every finding links to the specific regulatory document that informed it — complete with source URLs, document names, and applicable rule references.

Full Bilingual Support

Complete English and Dutch translations throughout the entire UI. Regulatory rules, campaign types, product categories, and system messages all available in both languages.

Admin Configuration

Settings panel for Azure OpenAI endpoints, AI Search indexes, Document Intelligence, replay windows, citation display, compliance rules, product categories, and user management.

Complete Audit Trail

Every scan stores prompts, model parameters, raw AI responses, retrieved documents, and timestamps. Full traceability from verdict back to source data.

One-Click Workflows

Submit for Internal Review, Copy Full Report, and Upload Another Asset — all accessible directly from the results page. Designed for speed, not clicks.

We went from a process that took hours per asset and still produced inconsistent results, to a system where customers get structured compliance feedback in under two minutes. The citation transparency changed everything — every finding traces back to the exact regulation.

Head of Digital Operations, European Regulatory Body

Tech Stack

Built for Production in Regulated Environments

Built on Azure AI services with a modern React frontend and Supabase backend. Every component chosen for production reliability in a regulated environment.

Frontend

  • React
  • TypeScript
  • Tailwind CSS

Backend

  • Supabase
  • Edge Functions
  • PostgreSQL

AI Services

  • Azure OpenAI GPT-4
  • GPT-4 Vision
  • Custom Embeddings

Search & Retrieval

  • Azure AI Search
  • Hybrid Semantic
  • Vector Retrieval

Doc Processing

  • Azure Doc Intelligence
  • PDF → Image
  • OCR Extraction

Auth & Storage

  • Supabase Auth
  • Role-Based Access
  • Blob Storage

Key Lessons

What We Learned Building Compliance AI

1

Vision models unlock compliance for visual marketing

The majority of marketing assets in this domain are images, not text documents. Without GPT-4 Vision, the system could not read packaging mockups, social media designs, or retail displays. This one capability is what makes self-service compliance checking possible.

2

Structured validation beats open-ended analysis

Running a single prompt that says "check this for compliance" produces vague, inconsistent results. Breaking the check into 11 specific rule categories — each with its own pass/fail logic, evidence requirements, and fix suggestions — produces results reviewers actually trust.

3

RAG context must include institutional memory

Regulations alone are not enough. The system also retrieves previously approved materials, product dossiers, and training corrections from past reviews. This institutional memory is what makes the AI behave like an experienced reviewer, not a rule-reading machine.

4

Regulatory source citations are the trust mechanism

Without linking every finding back to the specific regulatory document, the system is just another AI opinion. With citations, it is a defensible compliance tool. This single feature drove adoption more than anything else.

5

The reviewer does not go away — they get elevated

The AI handles the first pass. Human reviewers focus on edge cases, training corrections, and judgment calls. The result is faster throughput and better consistency, not fewer experts.

Building Compliance AI for Your Regulated Industry?

Start with a 2-week assessment. We will evaluate your regulatory landscape, document workflows, and deliver a production roadmap.