What is the difference between OCR and AI for document processing?

OCR (Optical Character Recognition) converts an image of text into machine-readable text; it reads characters. Modern document AI reads meaning. Given a Bill of Lading, OCR returns a string of words in reading order; a document AI returns a structured object with shipper, consignee, container numbers and Incoterms correctly labelled, even when the field labels differ across carriers. OCR is a building block inside many AI pipelines, but by itself it does not solve the logistics backoffice problem.

Why does OCR fail on Bills of Lading?

Ocean carriers each use their own B/L template. MSC, Maersk, CMA CGM, Hapag-Lloyd and COSCO lay out fields in different positions, with different labels, and sometimes in multiple languages. OCR can read the text on each form but has no reliable way to know which block is the shipper and which is the consignee. Rule-based templates solve this for one carrier and break on the next. AI pipelines understand the semantics independently of layout, which is why they scale across the long tail of carrier-specific templates.

Can AI handle handwritten notes on a CMR?

Yes. Box 18 of a CMR (the reservations and observations field) is routinely handwritten by the driver at pickup: 'two cartons wet', 'seal broken', 'late arrival'. OCR struggles with handwriting; modern vision AI handles it well, extracting the exact wording needed later for liability disputes. Signatures in boxes 22-24 are detected as present or absent rather than read, which is enough to gate POD automation.

Is classic OCR still useful?

Yes, for two cases: (1) documents with a stable, predictable layout, like a single-carrier invoice flowing through a template that never changes, where OCR plus field-position rules is faster and cheaper than an AI model; (2) as a pre-processing step inside an AI pipeline, turning a scanned PDF into text that the AI then reasons over. OCR is not dead; it is a component. What has changed is that no one should still be buying an OCR-only product for variable logistics documents.

How does AI perform on poor-quality scans?

Better than OCR in practice, because AI uses context to reconstruct ambiguous fields. A smudged container number that OCR reads as 'MSKU1234557' can be corrected by the AI when the booking and the vessel manifest both show 'MSKU1234567'. This context-aware correction is out of reach for text-level OCR and is a large part of why AI extraction has higher straight-through-processing rates on real-world documents.
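The cross-check described above can be sketched in a few lines. This is only an illustration of the idea, not how any particular product does it: the function names, the one-character tolerance and the reference list are all assumptions made for the example.

```python
from typing import Iterable, Optional


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def reconcile_container(extracted: str,
                        references: Iterable[str],
                        max_diff: int = 1) -> Optional[str]:
    """Return the one reference number within max_diff characters of the
    extracted value, or None when nothing matches or the match is ambiguous."""
    candidates = {
        ref for ref in references
        if len(ref) == len(extracted) and hamming(extracted, ref) <= max_diff
    }
    return candidates.pop() if len(candidates) == 1 else None


# Hypothetical reference values taken from the booking and the vessel manifest.
booking_and_manifest = ["MSKU1234567", "MSKU1234567", "TGHU7654321"]
corrected = reconcile_container("MSKU1234557", booking_and_manifest)
print(corrected)
```

Returning None on an ambiguous match is deliberate: when two reference numbers are equally close, the safer outcome is to hand the document to a human rather than guess.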

AI vs OCR for logistics documents

AI and OCR look similar on a vendor slide — both take a PDF in and return data out. In logistics operations the two behave very differently, and the difference is the reason many document-automation projects quietly failed in 2018-2022 and are being re-done in 2024-2026. This guide explains where OCR still fits, where it breaks on real shipping documents, and what "document AI" actually changes.

The short version

OCR reads characters. AI reads meaning. Running OCR on a Bill of Lading gives you a page of words in reading order. Running a document AI on the same B/L gives you { shipper, consignee, containers[], incoterms, freight_terms } — each field correctly labelled even when the carrier's form has no matching label on the page. OCR is useful, but it is a component, not a product. The complete solution needs the AI layer on top.

Why the distinction matters in logistics specifically

Logistics documents are the worst-case scenario for traditional OCR and the best-case scenario for document AI. Five features compound:

Template sprawl. A forwarder sees Bills of Lading from MSC, Maersk, CMA CGM, Hapag-Lloyd, Evergreen, ONE, COSCO, ZIM and two dozen smaller lines — each with their own form. OCR plus rules handles one format well. Handling twenty-four formats with rules is a maintenance treadmill nobody wins.
Multilingual fields. A German CMR uses German field names; a French one uses French; a Polish one uses Polish. A single Amsterdam-based forwarder touches all three in the same morning.
Handwritten exceptions. CMR box 18 is handwritten. Driver notes, damage reservations, delay reasons — all scribbled. These are the fields that matter most when a claim lands.
Cross-document consistency. The booking says one Incoterm; the B/L says another; the customs entry says a third. Someone has to notice. OCR never notices because it sees documents one at a time.
Downstream systems with strict schemas. CargoWise, Softpak and Descartes want specific field shapes. Extraction that is "mostly right" creates downstream problems for weeks.
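The cross-document consistency point is easy to picture in code. The sketch below assumes the three documents have already been extracted into dictionaries; the document names, field names and values are hypothetical, and the normalisation (trim and upper-case) is a simplifying assumption.

```python
from typing import Dict, List


def find_mismatches(documents: Dict[str, Dict[str, str]],
                    fields: List[str]) -> Dict[str, Dict[str, str]]:
    """For each field, return the per-document values when they disagree."""
    mismatches = {}
    for field in fields:
        # Normalise lightly so that 'nlrtm' and 'NLRTM' count as equal.
        values = {
            doc: record[field].strip().upper()
            for doc, record in documents.items()
            if field in record
        }
        if len(set(values.values())) > 1:
            mismatches[field] = values
    return mismatches


# Hypothetical extractions for one shipment.
shipment = {
    "booking": {"incoterms": "FOB", "pol": "NLRTM"},
    "bill_of_lading": {"incoterms": "CIF", "pol": "NLRTM"},
    "customs_entry": {"incoterms": "CFR", "pol": "nlrtm"},
}
issues = find_mismatches(shipment, ["incoterms", "pol"])
print(issues)
```

The comparison itself is trivial; the hard part is getting reliably labelled fields out of each document so that there is something to compare.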

What each technology is good at

OCR is good at

Fixed-layout forms, printed text, standard fonts. Producing low-cost text from scans at high volume. Being a building block inside a larger pipeline. Historical archive digitisation.

OCR is bad at

Variable layouts, handwriting, language-dependent fields, reasoning across fields, detecting missing data, understanding abbreviations or domain jargon, resolving entities ("MSC" → Mediterranean Shipping Company).

Document AI is good at

Variable layouts, multilingual content, handwriting, cross-document reasoning, detecting missing fields, correcting small errors in context, choosing the right action when the document is ambiguous.

Document AI is bad at

Explaining why it chose an answer without extra tooling. Being cheap on very high-volume, simple extractions (OCR plus rules can still win on cost for stable templates).

How they handle a real B/L — worked example

Imagine an MSC Bill of Lading scanned to PDF. The shipper block is top-left, the consignee block is top-right, the container list is a table in the middle, freight terms are in the bottom panel.

OCR returns: a flat text blob with all words in roughly reading order. "SHIPPER / EXPORTER NEDLLOYD MACHINERY BV BLAAK 16 3011 TA ROTTERDAM NETHERLANDS CONSIGNEE VDR EXP 5501 BUSAN KOREA PORT OF LOADING ROTTERDAM PORT OF DISCHARGE BUSAN..." You get every character but no structure.

Document AI returns:

{
  "carrier": "MSC",
  "mbl_number": "MEDUH1234567",
  "shipper": {
    "name": "NEDLLOYD MACHINERY BV",
    "address": "Blaak 16, 3011 TA Rotterdam, Netherlands"
  },
  "consignee": {
    "name": "VDR EXP 5501",
    "address": "Busan, Korea"
  },
  "pol": "NLRTM",
  "pod": "KRPUS",
  "containers": [
    { "number": "MSKU1234567", "type": "40HC", "seal": "FX889142" }
  ],
  "incoterms": "CIF BUSAN",
  "freight_terms": "prepaid"
}

The structured output is what the TMS actually wants. Getting from OCR output to the structured output — with acceptable accuracy on 24 different carrier templates — is exactly the problem that document AI solves and that rule-based OCR cannot.
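One way to make "strict schema" concrete is to load the JSON above into typed records before it goes anywhere near the TMS. The following is a minimal sketch using Python dataclasses; the class and function names are assumptions made for this example and do not describe any vendor's actual schema.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Party:
    name: str
    address: str


@dataclass
class Container:
    number: str
    type: str
    seal: str


@dataclass
class BillOfLading:
    carrier: str
    mbl_number: str
    shipper: Party
    consignee: Party
    pol: str
    pod: str
    containers: List[Container]
    incoterms: str
    freight_terms: str


def parse_bill_of_lading(raw: str) -> BillOfLading:
    """Build a BillOfLading from extractor JSON.

    A missing or unexpected key raises KeyError or TypeError. Dataclasses
    do not check value types, so this verifies shape only, not content.
    """
    data = json.loads(raw)
    fields = dict(data)
    fields["shipper"] = Party(**data["shipper"])
    fields["consignee"] = Party(**data["consignee"])
    fields["containers"] = [Container(**c) for c in data["containers"]]
    return BillOfLading(**fields)


SAMPLE = """
{
  "carrier": "MSC",
  "mbl_number": "MEDUH1234567",
  "shipper": {"name": "NEDLLOYD MACHINERY BV",
              "address": "Blaak 16, 3011 TA Rotterdam, Netherlands"},
  "consignee": {"name": "VDR EXP 5501", "address": "Busan, Korea"},
  "pol": "NLRTM",
  "pod": "KRPUS",
  "containers": [{"number": "MSKU1234567", "type": "40HC", "seal": "FX889142"}],
  "incoterms": "CIF BUSAN",
  "freight_terms": "prepaid"
}
"""

bl = parse_bill_of_lading(SAMPLE)
print(bl.shipper.name, bl.containers[0].number)
```

Failing loudly on a missing field is the point: an extraction that silently drops the Incoterm is the "mostly right" output that causes trouble downstream.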

Why so many OCR logistics projects failed 2018-2022

Between 2018 and 2022 many forwarders bought OCR-based "document automation" products. A typical story: a pilot on MSC and Maersk shows 90% field accuracy and the project goes into production. The forwarder then adds CMA CGM: new template, rules break, accuracy drops to 60%, and the ops team re-keys everything. A year later the tool is shelfware, a line item on the IT budget.

The technical reason: OCR plus rules is fragile to layout. Every new template needed a new rule set. Vendor support queues backed up. Maintenance of the rule set became a full-time job nobody budgeted for.

The organisational reason: the product was sold as "automation" but delivered "partial automation requiring constant rule updates". The promise-to-delivery gap drove internal champions out and killed budgets.

What changed in 2023-2026

Two technical shifts changed the maths. First, large language models (GPT-4 class and beyond) became good enough to reason about document layout and semantics in a single pass, without per-template rules. Second, vision-language models (Claude, GPT-4V, Gemini and open alternatives) started handling scanned PDFs end-to-end, with no separate OCR step needed for most cases. What used to require a team of engineers maintaining rules for 24 carrier templates now requires a prompt, a schema and test data.

The commercial shift: forwarders who tried OCR and got burned are now the fastest to adopt AI, because they have a concrete baseline to measure against. AI projects in 2026 tend to run pilots against the OCR tool that failed, and the lift is usually visible in two weeks.

Does AI entirely replace OCR?

No. OCR lives on inside modern pipelines for three reasons:

Cost. Running a text OCR pass before the AI layer is often cheaper than asking the AI to read characters visually — especially on very high volumes of standard typed invoices.
Latency. Pre-OCR reduces the token count the AI has to process, which speeds up inference on batch jobs.
Fallback. For documents with stable layouts (a single supplier's invoices, for example), rules-on-OCR can match AI accuracy at a fraction of the per-document cost.

Practical builders combine the two: fast OCR for the routine cases, AI for the variable and hard ones, confidence thresholds deciding which pipeline handles a given document. The vendor marketing term for this is "hybrid extraction"; the reality is just "pick the cheaper tool that works for each case".
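The routing decision can be as small as one function. The sketch below assumes the OCR stage reports which known template it matched (if any) and its lowest per-field confidence; the names, the template IDs and the 0.95 threshold are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class OcrResult:
    template_id: Optional[str]    # None when no known template matched
    min_field_confidence: float   # lowest confidence across extracted fields


def choose_pipeline(result: OcrResult,
                    known_templates: Set[str],
                    threshold: float = 0.95) -> str:
    """Use the cheap OCR-plus-rules path only for known, confident documents."""
    if (result.template_id in known_templates
            and result.min_field_confidence >= threshold):
        return "ocr_rules"
    return "document_ai"


# Hypothetical template IDs for stable, single-supplier layouts.
stable_templates = {"supplier_a_invoice_v3"}

print(choose_pipeline(OcrResult("supplier_a_invoice_v3", 0.99), stable_templates))
print(choose_pipeline(OcrResult("supplier_a_invoice_v3", 0.80), stable_templates))
print(choose_pipeline(OcrResult(None, 0.99), stable_templates))
```

Defaulting to the AI path whenever either condition fails keeps the cheap route limited to the cases where it is known to work.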

The evaluation playbook — how to compare

When assessing a document-extraction vendor, ask for five numbers on your own documents (not the vendor's curated sample):

Field-level accuracy per critical field (shipper, consignee, container, Incoterm, freight terms). Aggregate document accuracy hides gaps.
Straight-through-processing rate: what percentage of documents need zero human touch.
Latency per document, p50 and p95. Ops cares as much about p95 as about the median.
New-template onboarding time: how long from "first MSC B/L" to "90% accuracy on MSC B/Ls".
Maintenance cost: who fixes the rules/prompts when accuracy drops, and what does it cost.

If a vendor can't give you these five numbers on a real sample, you are buying a brochure, not a product.
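The first three numbers can be computed from a labelled sample with very little code. This is a rough sketch: the field names and data are made up, and the straight-through figure here is a proxy (every critical field correct) because true zero-touch rates can only be measured in production.

```python
import math
from typing import Dict, List


def field_accuracy(predictions: List[Dict[str, str]],
                   truths: List[Dict[str, str]],
                   fields: List[str]) -> Dict[str, float]:
    """Share of documents with the correct value, per field."""
    return {
        f: sum(p.get(f) == t.get(f) for p, t in zip(predictions, truths)) / len(truths)
        for f in fields
    }


def stp_proxy(predictions: List[Dict[str, str]],
              truths: List[Dict[str, str]],
              fields: List[str]) -> float:
    """Share of documents where every critical field is correct."""
    clean = sum(
        all(p.get(f) == t.get(f) for f in fields)
        for p, t in zip(predictions, truths)
    )
    return clean / len(truths)


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical ground truth and vendor output for four documents.
truths = [
    {"shipper": "A", "incoterm": "CIF"},
    {"shipper": "B", "incoterm": "FOB"},
    {"shipper": "C", "incoterm": "CIF"},
    {"shipper": "D", "incoterm": "EXW"},
]
predictions = [
    {"shipper": "A", "incoterm": "CIF"},
    {"shipper": "B", "incoterm": "CIF"},  # wrong Incoterm
    {"shipper": "C", "incoterm": "CIF"},
    {"shipper": "X", "incoterm": "EXW"},  # wrong shipper
]
latencies_s = [1.2, 0.9, 4.8, 1.1, 1.4, 1.0, 1.3, 9.5, 1.2, 1.1]

critical = ["shipper", "incoterm"]
accuracy = field_accuracy(predictions, truths, critical)
stp = stp_proxy(predictions, truths, critical)
p50 = percentile(latencies_s, 50)
p95 = percentile(latencies_s, 95)
print(accuracy, stp, p50, p95)
```

In this toy sample each field scores 75% on its own, yet only half the documents are fully correct, which is why aggregate accuracy hides gaps. The latency list shows the same effect for p95: one slow document dominates the tail while the median barely moves.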

Where Logentic fits

Logentic uses document AI (Claude-class vision-language models) fronted by lightweight OCR where it helps. More importantly: extraction is only the first step. Our agent Alex then validates the extracted fields against the booking, fills the shipment in the TMS, surfaces exceptions to the human operator, and closes the loop with the customer. OCR vendors sell extraction; Logentic sells "the task is done". The distinction is the point.

Frequently asked questions

Is document AI the same as "IDP" (Intelligent Document Processing)?

IDP is an older umbrella term that historically meant "OCR plus rules plus some machine learning for classification". Modern document AI built on vision-language models is a step change from what the IDP label used to mean. Some incumbent IDP vendors have added LLM capabilities; others still sell the 2018-era stack with an AI-washed badge.

Can AI read a bad scan better than a human?

On average, yes, for routine fields. Humans still edge out AI on unusual handwriting and heavily damaged scans. The practical result: AI handles the ~95% of documents that are fine, humans handle the 5% that aren't. Throughput per human operator rises sharply.

Does AI get confused by two-column PDFs?

Modern vision models handle multi-column layouts well. Traditional OCR often misreads two-column layouts by running the text left-to-right across both columns, which scrambles the output.
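The scrambling is easy to reproduce with a toy example. The sketch below is not how vision models work internally; it only contrasts a naive row-by-row reading order with a column-aware one. The word boxes, coordinates and fixed column split are assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, str]  # (x position, row index, text)

# Hypothetical text boxes from a two-column shipper / consignee header.
words: List[Box] = [
    (0, 0, "SHIPPER"), (300, 0, "CONSIGNEE"),
    (0, 1, "NEDLLOYD MACHINERY BV"), (300, 1, "VDR EXP 5501"),
    (0, 2, "ROTTERDAM"), (300, 2, "BUSAN"),
]


def naive_order(boxes: List[Box]) -> List[str]:
    """Read row by row across the full page width."""
    return [text for _, _, text in sorted(boxes, key=lambda b: (b[1], b[0]))]


def column_order(boxes: List[Box], split_x: int) -> List[str]:
    """Read the left column top to bottom, then the right column."""
    left = sorted((b for b in boxes if b[0] < split_x), key=lambda b: b[1])
    right = sorted((b for b in boxes if b[0] >= split_x), key=lambda b: b[1])
    return [text for _, _, text in left + right]


print(naive_order(words))
print(column_order(words, split_x=150))
```

In the naive order the shipper and consignee lines interleave, so a downstream rule can no longer tell which name belongs to which party.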

How do I start comparing OCR and AI vendors for my forwarder?

Pick 20 real documents representative of your traffic (mix of carriers, templates, languages, including some bad scans). Send them to each vendor with the same schema and measure the five metrics above. Two weeks of effort saves 12 months of the wrong procurement.