PDF-XChange - Tracker PDF Viewer - TIFF-XChange - Image-XChange - XMF-XChange - Raster-XChange - Support

Moderators: TrackerSupp-Daniel, Tracker Support, Chris - Tracker Supp, Vasyl-Tracker Dev Team, Paul - Tracker Supp, Ivan - Tracker Software, Sean - Tracker, Tracker Supp-Stefan

 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Sat Jun 09, 2018 10:05 am

Thanks, Daniel. The timeline is not so important.

They can keep me in the loop. This may be relevant for debugging: I don't recall having similar problems when I enhanced the current page (there is such an option in the Enhanced dialog).

saschu
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Mon Jun 11, 2018 7:01 pm

Hello Sashu,
Indeed, I made a reference to this thread in that ticket, so if the Devs need additional input, I am sure they will reach out here to get it from you.
Have a good day!
Daniel McIntyre
Support Technician
Tracker Software Products (Canada) LTD

Sales: +1 (250) 324-1621
Fax: +1 (250) 324-1623
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Fri Jun 15, 2018 7:50 am

Is it possible to synchronize the selection of an element between the tree in the Content pane and the text marked in blue? For example, in the snapshot the text 'text generation' should be selected (synchronized) both in the tree in the Content pane and, in blue, in the text viewer. Currently the element "text generation" is selected in the text, but not in the Content pane.
snapshot-15.06.2018.jpg
 
Tracker Supp-Stefan
Site Admin
Posts: 12737
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Correcting OCR errors?

Fri Jun 15, 2018 12:36 pm

Hello Sashu,

If you select an element in the content pane, this should set the focus in the main document rendering area so that the element and its position in the file are visible, but the reverse cannot be activated, I am afraid.

Regards,
Stefan
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Fri Jun 15, 2018 5:23 pm

Hello Sashu,
Do note that this search will also work if you are searching for text in the bookmarks themselves, but only so long as the Include bookmarks option is active.
180615885.png

In cases where there are multiple results, the Next button will cycle through them, highlighting the bookmarks and the page content in the order they appear.
Daniel McIntyre
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Fri Jun 15, 2018 6:07 pm

Great. Does this mean that I first have to create bookmarks for all text lines holding the text content of the whole PDF if I want to search simultaneously in the text and in the content? How can I do that automatically?

Sashu
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Fri Jun 15, 2018 6:32 pm

Hi Sashu,
I may be a bit confused here, so I apologize if this is not the answer you wanted.
My screenshot above shows the options for the Find dialog. If you have all the Include options ticked [page text, bookmarks, comments, form fields, external links], you should, as implied, be able to search all of those types of text and content simultaneously.
Daniel McIntyre
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Fri Jun 15, 2018 7:29 pm

Was that your answer to my question of Fri Jun 15, 2018 9:50 am? I thought you had found an answer, but you haven't?
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Fri Jun 15, 2018 8:04 pm

What I meant was that I was unsure if I understood your last question.
You've said:
sashu wrote:
Great. Does this mean that I first have to create bookmarks for all text lines holding the text content of the whole PDF if I want to search simultaneously in the text and in the content? How can I do that automatically?

I thought you were asking how to search the bookmarks and the text simultaneously, to which I answered:
TrackerSupp-Daniel wrote:
My screenshot above shows the options for the Find dialog. If you have all the Include options ticked [page text, bookmarks, comments, form fields, external links], you should, as implied, be able to search all of those types of text and content simultaneously.

If you did not mean how to search both simultaneously, please clarify what the issue was, as I seem to have misunderstood what you are asking.
Daniel McIntyre
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Sat Jun 16, 2018 4:16 am

It seems you misunderstood my question: I was primarily asking whether it is possible to search in the content pane and in the PDF editor simultaneously. I thought it was strictly impossible, but maybe that is not the case.
sashu
 
Willy Van Nuffel
User
Posts: 1286
Joined: Wed Jan 18, 2006 12:10 pm

Re: Correcting OCR errors?

Sun Jun 17, 2018 9:13 am

(Currently) you cannot search in, or via, the Content pane.

However, the Search function offers you a way to search through the whole text-content of a PDF.
Via Search > Options..., you can select (via check marks) where you would like to search, as already mentioned above:
- page text
- bookmarks
- comments
- form fields
- external links
- attachments
- document info
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Mon Jun 18, 2018 7:19 am

Hi,

I have my next questions:
1. In optimized saving, does visual information get lost?
2. Can I rotate a page by several degrees (not +/- 90 degrees)?
3. Is it possible to edit a particular area of a page, for example, to brighten an area up?

Best, sashu
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Mon Jun 18, 2018 3:57 pm

Hello Sashu,
Regarding your next questions:
1. This depends on the settings chosen. If you choose to unembed fonts, or heavily compress the images in the PDF, then yes, there will be some noticeable visual changes. In particular, if the originally used font is unavailable on the recipient's machine, a similar default font will be used during viewing.
2. No, currently we only offer the ability to rotate a page as you have mentioned.
3. Technically, no. However, you could try using a transparent rectangle tool over the area to "highlight" it. This article details how to customize tool palettes: https://www.tracker-software.com/knowle ... nge-Editor

Regards,
Daniel McIntyre
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Tue Jun 19, 2018 6:16 am

Thanks, Daniel. My next next questions:

1. I saw Image objects in the content pane that represent scanned pages. Is it possible to replace these objects with others, for example with rotated images?
2. I assume the OCR engine considers everything on a scanned page, including garbage, when performing OCR, and that this garbage can influence recognition. Is my assumption correct?

Best, sashu
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Tue Jun 19, 2018 5:17 pm

Hello Sashu,

1. I am unsure what you mean by replacing objects with other, rotated images; please clarify with an example.
2. Indeed, the OCR function considers everything on the page while processing. This can sometimes leave extra characters on the sheet if the scan is not perfectly clear.
Daniel McIntyre
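
To illustrate the point about stray marks: one common cleanup step, done outside PDF-XChange before OCR, is to remove isolated dark pixels from the scan. Below is a minimal Python sketch under stated assumptions: the page is a grayscale grid of integers (0 = black, 255 = white), and `remove_isolated_specks` with its `dark` threshold is a hypothetical helper written for this example. It does not describe how the PDF-XChange OCR engine works internally.

```python
def remove_isolated_specks(pixels, dark=128, white=255):
    """Whiten dark pixels that have no dark pixel among their 8 neighbours.

    pixels: list of equal-length rows of ints (0 = black, 255 = white).
    dark:   values below this threshold count as "dark" (assumed value).
    """
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]  # work on a copy, keep the input intact
    for y in range(height):
        for x in range(width):
            if pixels[y][x] >= dark:
                continue  # light pixel, nothing to clean
            has_dark_neighbour = any(
                pixels[ny][nx] < dark
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_dark_neighbour:
                out[y][x] = white  # a lone speck: treat it as scan noise
    return out
```

This only removes single-pixel specks; larger blotches, and punctuation such as a one-pixel full stop in a low-resolution scan, need a more careful filter.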
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Wed Jun 20, 2018 7:18 am

As you've said, PDF-XChange can rotate images for OCR only by 90 degrees, so I thought up a workaround: rotating these images by several degrees in a graphics editor. The only thing I am missing now is how to substitute the original page image with the rotated one.

In the attached snapshot, you can see an image that is skewed by several degrees. This can be a source of recognition errors in OCR, for example if the font in the image is small. I have a graphics editor that can rotate images by an arbitrary angle. The only thing I don't know is how to delete the skewed image and add the new, rotated one.

snapshot-20.06.2018.jpg
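
For readers who want to script the rotation step of this workaround instead of using a graphics editor, here is a minimal Python sketch of rotating an image by an arbitrary angle. It is an illustration only, under stated assumptions: the image is a grayscale grid of integers (0 = black, 255 = white), `rotate_grid` is a hypothetical helper written for this example, it uses nearest-neighbour sampling, and it keeps the original canvas size, so corners that rotate out of the frame are lost.

```python
import math


def rotate_grid(pixels, angle_deg, fill=255):
    """Rotate a grayscale pixel grid about its centre.

    pixels:    list of equal-length rows of ints (0 = black, 255 = white).
    angle_deg: rotation angle in degrees; positive values turn the image
               counter-clockwise as seen on screen (rows run top to bottom).
    fill:      value for output pixels whose source lies outside the grid.
    """
    height, width = len(pixels), len(pixels[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse mapping: find the source pixel that lands on (x, y).
            sx = int(round(cos_t * dx - sin_t * dy + cx))
            sy = int(round(sin_t * dx + cos_t * dy + cy))
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = pixels[sy][sx]
    return out
```

For a scan that leans by a few degrees, one would call `rotate_grid(page, angle)` with a small correcting angle, save the result as an image, and then use it as the replacement image in the PDF.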
 
TrackerSupp-Daniel
Site Admin
Posts: 999
Joined: Wed Jan 03, 2018 6:52 pm

Re: Correcting OCR errors?

Wed Jun 20, 2018 6:34 pm

Hello Sashu,
Substituting an image is done by right-clicking the image with the Edit content tool selected, and then selecting Replace:
180620943.png

Another method of rotating images is to grab the rotation handle that you can see extending from the top of the image in this screenshot. This allows for slightly finer adjustment.

Beyond that, so long as the document is not too heavily skewed, do note the deskew feature that our enhanced scanned pages function offers.
180620944.png


I hope this helps!
Daniel McIntyre
 
sashu
User
Posts: 51
Joined: Mon May 14, 2018 11:53 am

Re: Correcting OCR errors?

Fri Jun 22, 2018 6:42 am

Hello Daniel,

Thank you for your suggestions. The first one is exactly what I am looking for. I knew about the second suggestion and, frankly speaking, don't like it: 1) I don't know how the algorithm works and can't influence the processing; 2) the processing is invasive: the original image is modified by the enhancement, which may not be desirable; 3) the solution is not flexible: images often need further adjustments, for example they not only have to be deskewed but also need more contrast.

Best, sashu
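
As an illustration of the "more contrast" adjustment mentioned in point 3, here is a minimal Python sketch of a linear contrast stretch that could be applied to a page image outside PDF-XChange before replacing it. The assumptions: the image is a grayscale grid of integers from 0 to 255, and `stretch_contrast` is a hypothetical helper written for this example, not part of any PDF-XChange API.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values so they span the full 0..255 range.

    pixels: list of equal-length rows of ints in 0..255.
    The darkest input value maps to 0 and the brightest to 255.
    """
    flat = [value for row in pixels for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in pixels]
    span = hi - lo
    # Integer arithmetic keeps the result deterministic (rounds down).
    return [[(value - lo) * 255 // span for value in row] for row in pixels]
```

A washed-out scan whose values only range from 100 to 200 would be spread over the whole range, which makes faint text darker relative to the background.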
 
Tracker Supp-Stefan
Site Admin
Posts: 12737
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Correcting OCR errors?

Fri Jun 22, 2018 9:36 am

Hello Sashu,

Glad to hear you like the solution Dan offered!

Cheers,
Stefan
 
kmwittko
User
Posts: 1
Joined: Wed Aug 08, 2018 5:17 pm

Re: Correcting OCR errors?

Wed Aug 08, 2018 5:22 pm

The problem was mentioned on June 1, but I didn't understand the answer:
I have a pdf that has several invisible text layers. How can I remove some of them?
 
Patrick-Tracker Supp
Site Admin
Posts: 1668
Joined: Thu Mar 27, 2014 6:14 pm
Location: Vancouver Island

Re: Correcting OCR errors?

Wed Aug 08, 2018 5:57 pm

Hello,

Thank you for your post. To remove invisible text, or indeed any text, you may select the blocks from within the content pane (View > Panes > Content).

Some documents may have many kinds of content. You can select them by type:

Image

Once selected, simply use the Delete key to remove the text objects. Of course, this will remove all the text. Otherwise, you will need to select the Edit content tool, then select the invisible text blocks individually (hold CTRL to select multiple at once) and use the Delete key to remove them.

I hope this helps!
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.

Cheers,

Patrick Charest
Tracker Support North America
