OCR filter between text and numbers
Hi.
Is there a filter option to make the OCR better at recognizing mixed text and numbers?
Usually it tries to recognize the whole string as either text or numbers, but I am working with technical part numbers that mix the two.
Example 1:
The string "TS14052" turns into "TSHoSZ"
Example 2:
The string "14H21" turns into "14421"
Kind regards,
Alex.
- User
- Posts: 2393
- Joined: Wed Jan 18, 2006 12:10 pm
Re: OCR filter between text and numbers
Hi Alex,
The most important OCR option here is the "Accuracy" setting, with the choices "Auto / Low / Medium / High".
Myself, I get the best results (not yet perfect) with "Low" (using the Default OCR Engine).
In case you have a licensed version, you can try the Enhanced OCR Engine, with the different Accuracy settings.
After applying OCR in PDF-XChange Editor, you can check the result via View > Panes > Content pane.
Kind regards.
Willy
- Attachments
- PDF-XChange Editor - OCR - Low.pdf
- (28.64 KiB) Downloaded 142 times
- TrackerSupp-Daniel
- Site Admin
- Posts: 8592
- Joined: Wed Jan 03, 2018 6:52 pm
Re: OCR filter between text and numbers
Hello, Alex
If you could send us a copy of a document in which you can reliably reproduce this issue, it would greatly help us with troubleshooting, and eventually resolving the issue for you.
I should also note that, as Willy mentioned, the "accuracy" setting will be of great import here. Accuracy is used to define the quality of the Document however, and not the quality of the scanning that occurs (frankly, if it was the latter, why would we even offer options, of course everyone would want the highest quality OCR scan). In brief, you should generally only NEED to choose Auto or Low accuracy. Choosing Normal or High is often unnecessary, as the Auto setting is usually able to determine where to use each as it goes.
~ If your document is old, damaged, stained, speckled, has a low contrast background or faded text, etc., you should be using Low or Normal accuracy.
~ If your document is a normal, decent quality scanned image, without many blemishes and with fairly clear text, Normal or Auto accuracy should be used.
~ If your document is pristine quality, either a completely unblemished very high quality scan or a file which has never left the digital format, you should use Auto or, in some cases, High accuracy.
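For anyone who wants the three rules above in a form they can keep next to a batch workflow, here is a rough sketch of them as a lookup table. This is only an illustration and not part of PDF-XChange Editor or any Tracker API: the names `Accuracy`, `DocumentCondition` and `suggest_accuracy` are made up for this example, and the setting names follow the wording in this post ("Normal"), which may appear as "Medium" in the OCR dialog.

```python
from enum import Enum


class Accuracy(Enum):
    """Hypothetical labels for the OCR 'Accuracy' choices named in this thread."""
    AUTO = "Auto"
    LOW = "Low"
    NORMAL = "Normal"  # assumption: the dialog may label this "Medium"
    HIGH = "High"


class DocumentCondition(Enum):
    """Hypothetical categories matching the three cases described above."""
    DEGRADED = "old, damaged, stained, speckled, low contrast or faded"
    ORDINARY = "decent quality scan, few blemishes, fairly clear text"
    PRISTINE = "unblemished high quality scan or born-digital file"


# Suggested settings per condition, in the order the post lists them.
SUGGESTED = {
    DocumentCondition.DEGRADED: [Accuracy.LOW, Accuracy.NORMAL],
    DocumentCondition.ORDINARY: [Accuracy.NORMAL, Accuracy.AUTO],
    DocumentCondition.PRISTINE: [Accuracy.AUTO, Accuracy.HIGH],
}


def suggest_accuracy(condition: DocumentCondition) -> list[Accuracy]:
    """Return the settings suggested for a document in the given condition."""
    return list(SUGGESTED[condition])


if __name__ == "__main__":
    for condition in DocumentCondition:
        names = ", ".join(a.value for a in suggest_accuracy(condition))
        print(f"{condition.name}: {names}")
```

The point of the table is simply that the setting describes the input document, so the choice is driven by the scan's condition rather than by how much effort you want the OCR to spend.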
Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Re: OCR filter between text and numbers
OMG, I never guessed this from the GUI!!!
TrackerSupp-Daniel wrote: ↑Fri Oct 15, 2021 12:14 am
Accuracy is used to define the quality of the Document however, and not the quality of the scanning that occurs (frankly, if it was the latter, why would we even offer options, of course everyone would want the highest quality OCR scan).
PLEASE change the GUI text from "Accuracy" to "Image quality", "Input quality" or similar!
Alternatively invert and change to "Image imperfections" or similar.
Alternatively, instead of naming that panel "Recognition options", rename it to "Input document settings". That will then contrast well with the existing lowermost panel, "Output Options".
In answer to your (rhetorical?) question of why "Accuracy" might be interpreted as quality of OCR analysis [paraphrasing what I think you mean], there are some clear answers for me:
- high-quality analyses can require long computational times, large memory capacity, and so on; lower-quality analyses are often considered acceptable if the higher-quality analyses are prohibitively slow or otherwise computationally demanding;
- the option appears in a dialogue box containing a whole lot of settings about how the OCR should be performed, produced after the user has clicked the "OCR Page(s)" button.
- Paul - Tracker Supp
- Site Admin
- Posts: 6897
- Joined: Wed Mar 25, 2009 10:37 pm
- Location: Chemainus, Canada
- Contact:
Re: OCR filter between text and numbers
Give us some time on this one and we will have further discussion here about this.
Things are not normal at the moment, it may be this week, or maybe next.
regards
Best regards
Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
Re: OCR filter between text and numbers
Thanks, Paul.
Even next month would be great, as far as I'm concerned.
Actually, the suggestion's intended more for the benefit of new/other users (and hence also the benefit of Tracker).
I didn't mean to stress you out with my all-caps 'yelling'.
—DIV
- Paul - Tracker Supp
- Site Admin
- Posts: 6897
- Joined: Wed Mar 25, 2009 10:37 pm
- Location: Chemainus, Canada
- Contact:
Re: OCR filter between text and numbers
No worries, as you are aware, the discussion is happening: viewtopic.php?f=63&t=37543
Let's see what comes from that percolating...
Best regards
Paul O'Rorke
Tracker Support North America
http://www.tracker-software.com
Sign of the times
Paul,
Paul - Tracker Supp wrote: ↑Mon Mar 07, 2022 4:04 pm
Things are not normal at the moment, it may be this week, or maybe next.
People around the world are horrified by the Russian military's attack on Ukraine,
search.php?keywords=Ukraine
https://www.pdf-xchange.com/index. ... s/view/274
My thoughts are with you all, and I encourage others to dig deep to oppose the war.
https://www.defendukraine.org/donate
—DIV
P.S. I realise that this is a technical support forum, so I apologise for posting on another matter, and I understand that this forum is not set up for political discussion.
- Tracker Supp-Stefan
- Site Admin
- Posts: 17910
- Joined: Mon Jan 12, 2009 8:07 am
- Location: London
- Contact:
Re: OCR filter between text and numbers
Hello DIV,
Many thanks for the kind words and consideration!
Kind regards,
Stefan