OCR for non-English / editor confused
-
- User
- Posts: 19
- Joined: Fri Feb 19, 2016 8:34 pm
OCR for non-English / editor confused
Hi,
1) I created a new document with File / NewDocument / FromScanner, color, 300dpi. It contains non-English text (Serbian Cyrillic).
2) Then I used Document / Crop pages to leave only a few paragraphs.
3) Then I clicked Document / OCRpages / Language=Serbian.
4) After that, when I choose EditContent and double-click a paragraph, typing produces nothing visible: the letters, if there are any, are invisible, although the cursor moves. I am typing with the standard Serbian Cyrillic keyboard layout included in Windows 8.1.
Also, Find (Ctrl+F) does not return any results when I search for text.
Do you have any ideas how to overcome this? I have attached the file.
- Attachments
-
- test.zip
- (1.36 MiB) Downloaded 88 times
- Radi - Tracker Supp
- Site Admin
- Posts: 600
- Joined: Tue Mar 03, 2015 12:46 pm
Re: OCR for non-English / editor confused
Hello slaviŠa neŠiĆ,
Thank you for the post.
Please take a look at the following knowledge base article to find out how to edit the text in an OCR'd document:
How do I edit text in OCR'd document with the Editor
Please note that using a combination of Slavic languages might give you better OCR results. For example, if Serbian is not supported very well, using Russian or Bulgarian alongside Serbian should produce better character recognition.
In your example file I tried a combination of Serbian and Bulgarian, and the results were far better than with Serbian alone.
Regards,
Radi
- Attachments
-
- OCR.zip
- (71.66 KiB) Downloaded 84 times
-
- User
- Posts: 19
- Joined: Fri Feb 19, 2016 8:34 pm
Re: OCR for non-English / editor confused
Yes, that seems to do the trick, thank you. But the procedure does not leave any pictures behind. In my example, the picture on the right side of the document should remain, while the noisy part of the image on the left should be removed.
Is it possible to crop only the picture, keeping just its right part along with the text?
I see another way: I can set the transparency of the text layer to 0% so that it covers the left part of the image. Still, it would be useful to know whether the picture itself can be cropped, for other cases.
- Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
- Location: London, UK
- Contact:
Re: OCR for non-English / editor confused
Hi slaviŠa neŠiĆ,
Thanks for the post, but I'm not sure that I understand what you're looking to do here. Can you please upload a sample file and send any screen-shots that may help us to better understand.
Thanks,
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com
-
- User
- Posts: 19
- Joined: Fri Feb 19, 2016 8:34 pm
Re: OCR for non-English / editor confused
Let me be precise: how do I crop an image object on a PDF page? For example, the image in the zip file I posted above.
- Radi - Tracker Supp
- Site Admin
- Posts: 600
- Joined: Tue Mar 03, 2015 12:46 pm
Re: OCR for non-English / editor confused
Hi slaviŠa neŠiĆ,
Thanks for the post.
It is not possible to crop the image itself, but if you enable the 'Remove the content outside the crop box area' option located in the Crop Page Tool screen (see the attached screenshot), the result will be the same as if you had cropped the image.
Regards,
Radi
- Attachments
-
- Crop Pages.zip
- (113.35 KiB) Downloaded 89 times
-
- User
- Posts: 19
- Joined: Fri Feb 19, 2016 8:34 pm
Re: OCR for non-English / editor confused
Hi Radi,
Thank you anyway for the effort.
Best Regards,
-Slavisa
- Radi - Tracker Supp
- Site Admin
- Posts: 600
- Joined: Tue Mar 03, 2015 12:46 pm
Re: OCR for non-English / editor confused
Hi Slavisa,
You can also try to 'crop' the picture using the redaction tool. Just mark the picture on the four sides and apply the redaction. This will remove all marked content and preserve the page size.
If you would like to read more about the redaction tool, please read the following knowledge base article:
How to use Redaction
Regards,
Radi
-
- User
- Posts: 19
- Joined: Fri Feb 19, 2016 8:34 pm
Re: OCR for non-English / editor confused
Thank you Radi.
- Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
- Location: London, UK
- Contact:
Re: OCR for non-English / editor confused
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com