Why is OCR so Bad?

The plans we received from the EOR are a mess when it comes to hidden text, so creating page labels, bookmarks, or Set labels from Page Regions doesn't work, and we basically have to edit everything manually, which is a pain on a 770+ page plan set. At support's suggestion, I used Print to Bluebeam PDF to clean all that up and then ran OCR. It did improve things, but I still need to correct a lot of sheet numbers. It reads 1 as 7 or I, B as 8, W as V \ /, and S as 5. And that's just the sheet numbers I'm working with; I can't imagine how badly it's messing up numbers on the sheets in general. Here are just a few examples; I fixed most of them before thinking to post.

Comments

  • Scott Cavendish
    Posts: 4
    edited August 8

  • Luke Shiras
    Posts: 332

    That is frustrating. I've had issues with this before, but luckily for me it's only been a couple of sheets per set. Honestly, I just assumed that OCR was a third-party tool that Bluebeam licensed, kind of like auto-correct on phones.