Read Every Document. Type None of It.
Every business has the same data-entry tax. Quotes get retyped into the CRM. Engineering drawings get translated into bills of materials by hand. Customer emails get summarised into ticket systems. We build AI that reads any document, extracts what matters, and puts it where it belongs — at machine speed.
The promise
Where it hurts
The real friction we hear about.
OCR alone isn't enough
Old-school OCR gives you text. You don't want text. You want structured data — supplier, line items, totals, dates, references — organised the way your downstream system expects.
Documents arrive in every format imaginable
PDFs, scans, photos taken on a phone, Excel files renamed as PDFs, emails with the actual data in the signature. Brittle template-based tools collapse the moment something looks different.
The interesting documents are the hardest
Engineering drawings, contracts, spec sheets, RFQs — these are where the time is. They're also exactly the documents that off-the-shelf tools refuse to touch.
You don't want a black box
When the AI gets something wrong, you need to see why, fix the rule, and trust it next time. Tools that just return a JSON blob with no audit trail are unusable for anything that affects money.
How it works
What we actually build.
Schema-first design
Before any AI runs, we agree on the exact data shape you need. Every field has a type, a validation rule, and a confidence threshold. This is what stops the model going off-piste.
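As a rough illustration of what "a type, a validation rule, and a confidence threshold" per field can look like, here is a minimal Python sketch. The field names, thresholds, and the `FieldSpec` and `check_field` helpers are hypothetical examples for this page, not our production schema.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Any, Callable


@dataclass(frozen=True)
class FieldSpec:
    name: str
    parse: Callable[[str], Any]       # the type: converts raw text or raises ValueError
    validate: Callable[[Any], bool]   # the validation rule, applied to the parsed value
    min_confidence: float             # the confidence threshold for this field


def parse_money(raw: str) -> Decimal:
    """Parse an amount such as '£1,234.50' into a Decimal."""
    try:
        return Decimal(raw.replace("£", "").replace(",", "").strip())
    except InvalidOperation as exc:
        raise ValueError(f"not a money amount: {raw!r}") from exc


# Hypothetical schema for a supplier invoice.
INVOICE_SCHEMA = [
    FieldSpec("supplier", str.strip, lambda v: len(v) > 0, 0.90),
    FieldSpec("invoice_date", date.fromisoformat, lambda v: v <= date.today(), 0.95),
    FieldSpec("total", parse_money, lambda v: v >= 0, 0.98),
]


def check_field(spec: FieldSpec, raw: str, confidence: float) -> list[str]:
    """Return a list of problems for one extracted field (empty means it passes)."""
    try:
        value = spec.parse(raw)
    except ValueError as exc:
        return [f"{spec.name}: {exc}"]
    problems = []
    if not spec.validate(value):
        problems.append(f"{spec.name}: failed validation rule")
    if confidence < spec.min_confidence:
        problems.append(
            f"{spec.name}: confidence {confidence:.2f} below {spec.min_confidence:.2f}"
        )
    return problems
```

A value that cannot be parsed, breaks its rule, or arrives with low confidence produces a named problem rather than a silent guess.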
Multi-modal vision + language
Modern multi-modal models read PDFs the way a human does: text, tables, images, handwriting, signatures. No template setup. New supplier formats work from day one.
Verification layer
Extracted data is checked against your master records (suppliers, customers, part numbers, accounts). Anything that doesn't reconcile gets flagged before it hits your live system.
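A simplified sketch of that reconciliation step, in Python. The record shape (`Extracted`), the name normalisation, and the flag messages are assumptions made for illustration; a real build would check against whatever master data your systems hold.

```python
from dataclasses import dataclass


@dataclass
class Extracted:
    supplier: str
    part_numbers: list[str]


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so 'ACME  Ltd' matches 'Acme Ltd'."""
    return " ".join(name.casefold().split())


def reconcile(doc: Extracted, known_suppliers: set[str], known_parts: set[str]) -> list[str]:
    """Return one flag per value that does not match the master records."""
    flags = []
    known = {normalise(s) for s in known_suppliers}
    if normalise(doc.supplier) not in known:
        flags.append(f"unknown supplier: {doc.supplier}")
    for part in doc.part_numbers:
        if part not in known_parts:
            flags.append(f"unknown part number: {part}")
    return flags


# Hypothetical usage: one part number is missing from the master list.
doc = Extracted("ACME  Steel Ltd", ["PL-10", "XX-99"])
flags = reconcile(doc, {"Acme Steel Ltd"}, {"PL-10"})
```

An empty list means everything reconciled. Anything else is held back with a reason attached.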
Human-in-the-loop where it matters
Above the confidence threshold: straight through to your system. Below: a one-click review queue showing the original document side by side with the extracted fields. The system learns from every correction.
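The routing decision itself is simple. Here is a minimal sketch, assuming each extracted field carries a confidence score between 0 and 1; the `ExtractedField` type, the `route` function, and the default threshold of 0.95 are illustrative, not fixed values.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float


def route(
    fields: list[ExtractedField],
    thresholds: dict[str, float],
    default_threshold: float = 0.95,
) -> tuple[str, list[str]]:
    """Send a document straight through, or to review with the fields that need a look."""
    low = [
        f.name
        for f in fields
        if f.confidence < thresholds.get(f.name, default_threshold)
    ]
    return ("review", low) if low else ("post", [])


# Hypothetical usage: the total is held to a stricter threshold than other fields.
fields = [
    ExtractedField("supplier", "Acme Steel Ltd", 0.99),
    ExtractedField("total", "120.00", 0.91),
]
decision = route(fields, {"total": 0.98})
```

Returning the names of the low-confidence fields lets the review screen point the reviewer at exactly what to check.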
Proof
Real outcomes, not slideware.
Engineering drawings (PDF) → bill of materials with cut lengths, weld details, plate thickness, and finish. Estimating quotes that used to take an hour now take under five minutes per drawing pack.
Supplier invoices → matched, coded, and posted to Sage with VAT and CIS handled correctly. 95%+ straight-through processing in production.
Pricing
Fixed fee, phased delivery.
£3,200–£12,000 per document type for the build, then roughly £200–£600 per month for hosting and accuracy monitoring.
From £3,200
Typical document extraction build
- Workflow audit + spec phase
- Build, integrations, and tuning to your data
- Live deployment in your environment
- Monthly support and accuracy monitoring
Questions
Things people ask us about this.
01 How do you handle handwriting and signatures?
Modern vision-language models handle British handwriting on forms, delivery notes, and timesheets well, in some cases better than human readers. Where you need it (HR onboarding, contracts), signatures are verified against a reference.
02 Can it read drawings and specs, not just paperwork?
Yes. For Kingsland Fabrications we built a pipeline that reads engineering drawings — extracting cut lists, weld symbols, material grades, and finishes — and produces a bill of materials directly. It handles drawings in different conventions and from different drafting tools.
03 What if a document doesn't fit our schema?
It gets flagged for review with a reason. You decide whether to extend the schema, route it to a different pipeline, or handle it manually. Unknown structure never silently corrupts your data.
04 How much training data do you need?
Usually 5–20 examples per document type are enough to validate quality. We don't fine-tune the underlying model; we use prompt engineering, schema validation, and verification layers. That means new document types ship in days, not months.
05 Where does our data live?
In your environment, on your infrastructure, or on UK/EU regions of the cloud platforms you already use. Document content is not used to train external models — we sign DPAs and zero-retention API contracts.
Let's see if we can help.
A 15-minute chat with Chris & Kay. No slides. No pitch deck. You tell us what's on your plate; we follow up by email with real thinking.