Thread: OCR with scanned image / camera
Wed, Oct 31 2018 11:09 AM

Huseyin Aliz

myBiss ApS

Hi All,

I think it would be very nice if I could implement a function that lets
the user "scan" a receipt with his/her smartphone, so that the app
recognizes the text and automatically assigns/saves the values to dataset
fields. I don't think it's an easy task, and I have thought that maybe
the following could be used:

http://tesseract.projectnaptha.com/
a JavaScript OCR engine, or other JavaScript solutions.

Or something like using TServerRequest and letting Delphi control the OCR
process, but I am not sure camera access can be done from a Delphi
module without using FireMonkey.

Has anyone had this request, and maybe tried to implement something
like this?

Regards,
Hüseyin

Tue, Jun 14 2022 6:50 AM

erickengelke

Hüseyin Aliz wrote:
> I think it would be very nice if I could implement a function that lets
> the user "scan" a receipt with his/her smartphone, so that the app
> recognizes the text and automatically assigns/saves the values to dataset
> fields. I don't think it's an easy task, and I have thought that maybe
> the following could be used

True, it was not easy.  But it's now available in my Nice toolkit.

It includes the camera component and the OCR code running right in the browser.

It outputs a text string you can display or pass to the database.

See:
https://www.erickengelke.com/nice/samples.html

Contact me for details
Erick
EWB Programming Books and Component Library
http://www.erickengelke.com

Tue, Jun 14 2022 8:25 PM

erickengelke

erickengelke wrote:

> True, it was not easy.  But it's now available in my Nice toolkit.

I've received a couple of queries about this.

The code was quite tricky because EWB doesn't have async, await and other modern JavaScript features that are useful for working with the promise-based camera API and other promise-based libraries.  I'm hopeful they will make their way into EWB soon.  In the meantime, I kludged together EWB-compliant JS which does the job reasonably well most of the time.

OCR is accomplished using the Tesseract libraries, originally written by HP, then made open source around 2005 and subsequently improved upon by Google.   They are surprisingly good - the standard used by many products now.

It executes as C compiled to WASM code, so it's faster than normal JavaScript.   You really need that, because it must scan a lot of data to detect letters of various sizes, orientations, etc.

It supports many different source languages.  I set the default to Eng(lish), but changing it is one assignment.  Even with that default, it had little trouble reading the English/French text on my Canadian product boxes.

The first time you use it, it must download more than 8 MB of code and data, so that will be sluggish, but subsequent uses are much faster as the files are cached.  For that reason, the first use could be expensive on a cell phone using cellular data, but on Wi-Fi it is no problem.

I've used it on my Android phone, iPad and MacBook, and it works pretty well.  Try to use the back camera if you have a choice: the rear camera usually has higher resolution and thus a clearer picture, and it also makes it easier to set up your shot.
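
For anyone doing this in plain browser code rather than through the toolkit, here is a minimal sketch of how the rear camera can be requested.  It is not the Nice toolkit's code: buildCameraConstraints and the 1920x1080 default are hypothetical names and values chosen for illustration.  The only standard piece is the facingMode constraint, where "environment" means the camera facing away from the user, and "ideal" lets the browser fall back to another camera when there is no rear one.

```typescript
// Hypothetical helper (not part of EWB or the Nice toolkit): builds the
// constraints object a browser expects for navigator.mediaDevices.getUserMedia().
interface CameraConstraints {
  audio: boolean;
  video: {
    facingMode: { ideal: "environment" | "user" };
    width: { ideal: number };
    height: { ideal: number };
  };
}

function buildCameraConstraints(
  preferRear: boolean,
  idealWidth: number = 1920,   // assumed default, adjust to taste
  idealHeight: number = 1080   // assumed default, adjust to taste
): CameraConstraints {
  return {
    audio: false, // OCR only needs video frames
    video: {
      // "environment" = rear camera, "user" = front (selfie) camera.
      // "ideal" is a preference, so devices without a rear camera still work.
      facingMode: { ideal: preferRear ? "environment" : "user" },
      width: { ideal: idealWidth },
      height: { ideal: idealHeight },
    },
  };
}

// In a browser this would be used roughly as follows (videoElement is an
// assumed <video> element on the page):
//   navigator.mediaDevices.getUserMedia(buildCameraConstraints(true))
//     .then((stream) => { videoElement.srcObject = stream; });
```

Note the .then() callback in the usage comment: that form avoids await, which is the style needed when the calling code cannot use async functions.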

Performance varies with brand of browser, CPU capabilities, image size and complexity, etc.  It is typical for a page to take 10 to 40 seconds to be scanned and converted to text after the files are loaded.  

OCR is never 100% reliable: there are occasional lost characters, and false positives where other symbols are read as characters.  But it can convert the bulk of your document more efficiently than most users can type the same data.  Have the user verify the data and make any corrections necessary before accepting the data.
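
To tie this back to the original question about receipts, here is a rough sketch of turning the OCR text string into candidate field values and flagging the result for user review.  Everything in it is an assumption for illustration: the ReceiptFields shape, the parseReceiptText name, the yyyy-mm-dd date format, and amounts written without thousands separators.  Real receipts vary a lot, so treat it as a starting point only.

```typescript
// Hypothetical shape for values that would be copied into dataset fields.
interface ReceiptFields {
  date: string | null;   // first yyyy-mm-dd date found, or null
  total: number | null;  // amount on the first "total" line, or null
  needsReview: boolean;  // true when a field could not be extracted
}

function parseReceiptText(ocrText: string): ReceiptFields {
  let date: string | null = null;
  let total: number | null = null;

  for (const rawLine of ocrText.split(/\r?\n/)) {
    const line = rawLine.trim();

    // Take the first date written as yyyy-mm-dd (assumed format).
    if (date === null) {
      const d = /\b\d{4}-\d{2}-\d{2}\b/.exec(line);
      const found = d ? d[0] : undefined;
      if (found !== undefined) date = found;
    }

    // Take the amount at the end of the first line containing the whole
    // word "total"; the word boundary keeps "Subtotal" from matching.
    if (total === null && /\btotal\b/i.test(line)) {
      const m = /(\d+[.,]\d{2})\s*$/.exec(line);
      const amount = m ? m[1] : undefined;
      if (amount !== undefined) total = parseFloat(amount.replace(",", "."));
    }
  }

  // Anything missing means the user has to fill it in by hand.
  return { date, total, needsReview: date === null || total === null };
}

const sample = "CORNER STORE\n2022-06-14 18:25\nSubtotal 10.00\nTax 1.30\nTOTAL 11.30\n";
const fields = parseReceiptText(sample);
console.log(fields.date, fields.total, fields.needsReview);
```

Even when needsReview comes back false, the values should still be shown to the user in editable fields before saving, since a misread digit looks just as valid to the parser as a correct one.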

Try it at:
https://www.erickengelke.com/nice

Contact me for details
Erick
EWB Programming Books and Component Library
http://www.erickengelke.com