Zonal OCR position is off on certain PCs
Posted: Thu Dec 01, 2016 11:10 am
I use the selection of an area in the PDF for text recognition (zonal OCR). On most PCs this works, but on certain PCs the area that gets analyzed is not the selected area but one slightly above it. I first suspected the monitor's DPI settings, but that does not seem to be the reason: on the client machine I already reduced the Windows DPI scaling from 125% to 100% and the problem stayed. I am not sure what is causing it.
The user also told me that it worked up to a certain point in time and then stopped working.
The problem machine in this example is a Windows 10 machine.
The PDF itself can't be the problem, by the way: I downloaded the PDF the customer had problems with, and on my machine it worked without problems.
Is there a known issue that can cause this? The only thing I found was this:
https://www.pdf-xchange.com/forum3 ... creenPoint
Any ideas what else could cause the problem?
This is my code (C#):
// First I track the mouse position of the start and end points via the Notifications.Mouse event.
object mouseX = null;
object mouseY = null;
_viewer.GetProperty("Notifications.Mouse.x", ref mouseX);
_viewer.GetProperty("Notifications.Mouse.y", ref mouseY);
// This part of the method translates them into PDF coordinates.
mouseCoordinates ret = new mouseCoordinates();
// get the current page
object oPage = null;
int iPage;
vwr.GetProperty("Notifications.Mouse.Page", out oPage);
iPage = int.Parse(oPage.ToString());
ret.page = (uint)iPage;
// translate start mouse coordinates
int[] cInputStart = new int[2];
cInputStart[0] = int.Parse(startX.ToString());
cInputStart[1] = int.Parse(startY.ToString());
Object cOutStart = new Object();
vwr.DoDocumentVerb(iDoc, String.Format("Pages[{0}]", iPage.ToString()), "TranslateScreenPoint", cInputStart, out cOutStart, 0);
ret.startX = ((double[])cOutStart)[0];
ret.startY = ((double[])cOutStart)[1];
// get and translate the coordinates of the current mouse position
int[] cInputEnd = new int[2];
cInputEnd[0] = endX;
cInputEnd[1] = endY;
Object cOutEnd = new Object();
vwr.DoDocumentVerb(iDoc, String.Format("Pages[{0}]", iPage.ToString()), "TranslateScreenPoint", cInputEnd, out cOutEnd, 0);
ret.endX = ((double[])cOutEnd)[0];
ret.endY = ((double[])cOutEnd)[1];
return ret;
// Running the OCR:
PDFXOCR_Funcs.PXO_Options options = new PDFXOCR_Funcs.PXO_Options();
options.blacklist = "";
options.whitelist = "";
options.raster_dpi = 600; // resolution (DPI) of the document after processing; lower values can make the document look ugly
// coordinate calculation of the selection rectangle
double nl, nr, nt, nb;
nl = Math.Min(ret.startX, ret.endX);
nr = Math.Max(ret.startX, ret.endX);
nt = Math.Max(ret.startY, ret.endY);
nb = Math.Min(ret.startY, ret.endY);
// create the input field
// set the zone to OCR
PDFXOCR_Funcs.OCR_NewInputFields(out inFields);
PDFXOCR_Funcs.PXO_InputField nif = new PDFXOCR_Funcs.PXO_InputField();
nif.left = nl;
nif.right = nr;
nif.top = nt;
nif.bottom = nb;
nif.nPage = page;
PDFXOCR_Funcs.OCR_AddInputField(inFields, nif);
// extract the text at the given position
// (named "text" so it does not clash with the mouseCoordinates variable "ret" used above)
string text = "";
hResult = PDFXOCR_Funcs.OCR_GetFields(pdf, ref options, out text, inFields, "\n", (int)PDFXOCR_Funcs.PXO_FieldInputFlags.PXO_Origin_BottomLeft);
if (PDFXOCR_Funcs.IS_DS_FAILED(hResult))
{
throw new OCRException(string.Format("Unable to extract region OCR: {0}{1}ErrorCode: {2}", sourcePdf, Environment.NewLine, hResult.ToString()));
}
try
{
PDFXOCR_Funcs.OCR_Delete(out pdf);
}
catch { } // cleanup errors are deliberately ignored
return text.Replace("\n", " ").Replace(Environment.NewLine, " ").Trim();
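To rule out the rectangle calculation itself, the min/max step from the listing can be isolated from the viewer and OCR calls. This is only a minimal sketch, assuming PDF page coordinates with a bottom-left origin (as the PXO_Origin_BottomLeft flag suggests); PdfRect and SelectionHelper are hypothetical names for illustration and are not part of the PDF-XChange SDK:

```csharp
using System;

// Hypothetical type for illustration; not part of the PDF-XChange SDK.
public struct PdfRect
{
    public double Left, Right, Top, Bottom;

    public override string ToString()
    {
        return string.Format("L={0} R={1} T={2} B={3}", Left, Right, Top, Bottom);
    }
}

public static class SelectionHelper
{
    // Builds a normalized rectangle from two drag points given in PDF page
    // coordinates with a bottom-left origin (Y grows upward), so the larger
    // Y value is the top edge and the smaller one is the bottom edge.
    public static PdfRect Normalize(double startX, double startY, double endX, double endY)
    {
        PdfRect r = new PdfRect();
        r.Left = Math.Min(startX, endX);
        r.Right = Math.Max(startX, endX);
        r.Top = Math.Max(startY, endY);
        r.Bottom = Math.Min(startY, endY);
        return r;
    }
}

public static class Program
{
    public static void Main()
    {
        // The same selection dragged in two directions; both calls should
        // describe the same rectangle.
        PdfRect a = SelectionHelper.Normalize(300.0, 500.0, 100.0, 700.0);
        PdfRect b = SelectionHelper.Normalize(100.0, 700.0, 300.0, 500.0);
        Console.WriteLine(a);
        Console.WriteLine(b);
    }
}
```

Logging the translated start and end points together with this rectangle on a working PC and on a failing PC should show whether the vertical offset is already present in the values returned by TranslateScreenPoint or only appears later in the OCR call.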