Correcting OCR errors?
Re: Correcting OCR errors?
Thanks, Daniel. The timeline is not so important.
They can keep me in the loop. This may be important for debugging: I don't recall having similar problems when I enhanced only the current page (there is such an option in the Enhanced dialog).
saschu
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
Indeed, I made a reference to this thread in that ticket, so if the Devs need additional input I am sure they will reach out here to get it from you.
Have a good day!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
+++++++++++++++++++++++++++++++++++
Our Web site domain and email address has changed as of 26/10/2023.
https://www.pdf-xchange.com
Support@pdf-xchange.com
Re: Correcting OCR errors?
Is it possible to synchronize the selection of an element between the tree in the Content pane and the text highlighted in blue in the document view? For example, in the snapshot the text "text generation" should be selected in both the Content pane tree and the text view. Currently, "text generation" is selected in the text but not in the Content pane.
- Tracker Supp-Stefan
- Site Admin
- Posts: 17943
- Joined: Mon Jan 12, 2009 8:07 am
- Location: London
Re: Correcting OCR errors?
Hello Sashu,
If you select an element in the Content pane, this should set the focus in the main document rendering area so that the element and its position in the file are visible. The reverse is not possible, I am afraid.
Regards,
Stefan
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
Do note that this search will work if you are searching for text in the bookmarks themselves, but only so long as the Include bookmarks option is active. In cases where there are multiple results, the Next button will cycle through highlighting the bookmarks and the page content in the order they appear.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
Great. Does it mean that first I have to create bookmarks for all text lines holding text content of the whole pdf if I want to search simultaneously in text and in the content? How can I do it automatically?
Sashu
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hi Sashu,
I may be a bit confused here, so I apologize if this is not the answer you wanted.
In my above screenshot there is a display of the options for the Find dialog. If you have all the Include options ticked [page text, bookmarks, comments, form fields, external links], you should, as implied, be able to search all of those types of text and content simultaneously.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
Was that your answer to my question of Fri Jun 15, 2018 9:50 am? I thought you had found an answer, but you haven't?
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
What I meant was that I was unsure if I understood your last question.
You've said:
sashu wrote: Great. Does it mean that first I have to create bookmarks for all text lines holding text content of the whole pdf if I want to search simultaneously in text and in the content? How can I do it automatically?
I thought you were asking how to search the bookmarks and the text simultaneously, to which I answered:
TrackerSupp-Daniel wrote: In my above screenshot there is a display of the options for the Find dialog. If you have all the Include options ticked [page text, bookmarks, comments, form fields, external links], you should, as implied, be able to search all of those types of text and content simultaneously.
If you did not mean how to search both simultaneously, please clarify what the issue was, as I seem to have misunderstood what you are asking.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
It seems you misunderstood my question: I was primarily asking whether it is possible to search in the Content pane and in the PDF editor simultaneously. I thought it was strictly impossible, but maybe that is not the case.
sashu
-
- User
- Posts: 2394
- Joined: Wed Jan 18, 2006 12:10 pm
Re: Correcting OCR errors?
(Currently) you cannot search in, or via, the Content pane.
However, the Search function offers you a way to search through the whole text content of a PDF.
Via Search > Options..., you can select (via check marks) where you would like to search (as already mentioned above):
- page text
- bookmarks
- comments
- form fields
- external links
- attachments
- document info
Re: Correcting OCR errors?
Hi,
Here are my next questions:
1. When saving with optimization, is any visual information lost?
2. Can I rotate a page by a few degrees (not just +/- 90 degrees)?
3. Is it possible to edit a particular area of a page, for example, to brighten it?
Best, sashu
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
For your next questions:
1. This depends on the settings chosen. If you choose to unembed fonts or heavily compress the images in the PDF, then yes, there will be some noticeable visual changes. Namely, if the originally used font is unavailable on the recipient's machine, a similar default font will be used during viewing.
2. No, currently we only offer the ability to rotate a page in 90-degree steps, as you have mentioned.
3. Technically, no. However, you could try placing a transparent rectangle over the area to "highlight" it. This article details how to customize tool palettes: https://www.pdf-xchange.com/knowle ... nge-Editor
Regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
Thanks, Daniel. My next questions:
1. I saw Image objects in the Content pane that represent scanned pages. Is it possible to replace these objects with others, for example, with rotated images?
2. I assume the OCR engine considers everything on a scanned page, including garbage, when performing OCR. This garbage can, however, influence recognition. Is my assumption correct?
Best, sashu
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
1. I am unsure what you mean by replacing objects with other rotated images. Please clarify with an example.
2. Indeed, the OCR function considers everything on the page while processing. This can sometimes leave extra characters on the sheet if the scan is not perfectly clear.
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
As you've said, PDF-XChange can rotate images for OCR only by 90 degrees, so I came up with a workaround: rotating these images by a few degrees in a graphics editor. The only thing I am missing now is how to substitute the original page image with the rotated one.
In the attached snapshot, you can see an image that is tilted by a few degrees. This can be a source of recognition errors in OCR, for example, if the font in the image is small. I have a graphics editor that can rotate images by an arbitrary angle. The only thing I don't know is how to delete the tilted image and add the new rotated one.
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
Substituting an image is done by right-clicking the image with the Edit Content tool selected and then selecting Replace. Another method of rotating images is to drag the rotation handle that you can see extending from the top of the image in this screenshot. This allows for slightly finer tuning.
Beyond that, so long as the document is not too heavily skewed, do note the deskew feature that our enhanced scanned pages function offers. I hope this helps!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
Hello Daniel,
Thank you for your suggestions. The first one is exactly what I am looking for. I knew about the second suggestion and, frankly speaking, I don't like it: 1) I don't know how the algorithm works and can't influence the processing; 2) the processing is invasive, since the original image is altered by the enhancement, which may not be desirable; 3) the solution is not flexible -- images often need more effects; for example, they not only have to be deskewed but also need more contrast.
Best, sashu
- Tracker Supp-Stefan
- Site Admin
- Posts: 17943
- Joined: Mon Jan 12, 2009 8:07 am
- Location: London
Re: Correcting OCR errors?
Hello Sashu,
Glad to hear you like the solution Dan offered!
Cheers,
Stefan
Re: Correcting OCR errors?
This problem was mentioned on June 1, but I didn't understand the answer:
I have a pdf that has several invisible text layers. How can I remove some of them?
- Patrick-Tracker Supp
- Site Admin
- Posts: 1645
- Joined: Thu Mar 27, 2014 6:14 pm
- Location: Vancouver Island
Re: Correcting OCR errors?
Hello,
Thank you for your post. To remove invisible text, or indeed any text, you may select the blocks from within the Content pane (View > Panes > Content).
Some documents may have many kinds of content. You can select them by type.
Once selected, simply use the Delete key to remove the text objects. Of course, this will remove all the text. Otherwise, you will need to select the Edit Content tool, then select the invisible text blocks individually (hold CTRL to select multiple at once) and use the Delete key to remove them.
I hope this helps!
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Cheers,
Patrick Charest
Tracker Support North America
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
I am pleased to say that I have just received confirmation that this issue (#4386 these files are negatively affected by enhanced OCR) will be resolved with the upcoming build 328, which we plan to release on December 10th.
Please let us know if you find anything out of place after the update!
Kind regards!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
Great! If you tell me how I can register to be notified of your releases, I will of course verify it.
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
I am afraid that we no longer send out notifications upon releases. The best way to keep up to date is to set our updater to check for updates automatically at regular intervals.
Kind regards,
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD
Re: Correcting OCR errors?
Daniel,
That is also impossible: when I check for updates, Tracker Update 7.0.325.1 says that the file "TrackerUpdate.zip" can't be downloaded.
Best, sashu
- Tracker Supp-Stefan
- Site Admin
- Posts: 17943
- Joined: Mon Jan 12, 2009 8:07 am
- Location: London
Re: Correcting OCR errors?
Hello Sashu,
There were some older builds of the updater tool itself that will indeed not find the newer versions.
Please update to 327.1 manually - and from here on the check for updates should work properly.
Regards,
Stefan
Re: Correcting OCR errors?
Hello Stefan,
I manually installed 327.1 and could see the number of the current version in the about box. However, I couldn't set up the updater because of the known error message and had to reboot.
After I rebooted the system, I could finally set up the updater. However, it couldn't install PDF-XChange Pro automatically, claiming that I don't have administrator permissions. That is definitely not true. I downloaded the Pro version manually and tried to install it. Although the 325.1 version is registered in the system, this was not enough to install 327.1 -- when I start the Pro installation, I am asked for a serial number that is different from the one I used to install the Editor.
Cheers, sashu
- TrackerSupp-Daniel
- Site Admin
- Posts: 8610
- Joined: Wed Jan 03, 2018 6:52 pm
Re: Correcting OCR errors?
Hello Sashu,
This would likely mean that you do not have a license that covers the use of all the PRO products installed. From your description, it sounds as if you have multiple packages installed, which, as PRO overlaps with nearly all others, can cause problems during updates. If you are going to be using the PRO suite, I suggest removing all Tracker Software products and then installing only PRO. If you only need PDF Tools or our Standard printer, these are available as separate downloads that do not overlap with the Editor or Editor Plus installers.
I hope this helps!
Dan McIntyre - Support Technician
Tracker Software Products (Canada) LTD