Applications

Plato receipt recognition engine

Example receipt with wrinkles, smudges, a lighting gradient, bent text lines, and transparency issues

Low-light, low-quality receipt image with extremely blurry text

At Shopitize Inc., I am responsible for the design, implementation, and maintenance of the Plato receipt recognition engine, a system that recognizes shopping receipt images snapped by consumers. Plato recognizes the retailer, date, and total, as well as the products purchased.

Having users snap receipts on their smartphones often results in images that are quite challenging to recognize. Common issues include camera and motion blur, poor lighting and shadows, wrinkles, smudges, and writing on receipts, as well as perspective issues that lead to both global and local skew. To meet this challenge, Plato combines a suite of image processing tools, OCR, fuzzy text matching, and image-based retailer classification.
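As a rough illustration of the fuzzy text matching step, the sketch below maps a noisy OCR line onto a list of known retailer names using Python's standard `difflib` module. It is not Plato's actual matcher: the retailer list, the `match_retailer` helper, and the 0.6 similarity cutoff are all assumptions made for the example.

```python
import difflib

# Hypothetical retailer names; the list Plato actually uses is not given here.
KNOWN_RETAILERS = ["TESCO", "SAINSBURY'S", "WAITROSE", "ASDA", "MORRISONS"]


def match_retailer(ocr_line, candidates=KNOWN_RETAILERS, cutoff=0.6):
    """Return the closest known retailer for an OCR'd line, or None.

    The cutoff (an assumed value) is the minimum similarity ratio in [0, 1]
    that a candidate must reach to count as a match.
    """
    # Normalize case and whitespace so OCR casing errors do not hurt the score.
    normalized = ocr_line.strip().upper()
    matches = difflib.get_close_matches(normalized, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# OCR commonly confuses similar glyphs (here "O" read as "0") or drops punctuation.
print(match_retailer("  tesc0 "))
print(match_retailer("SAINSBURYS"))
print(match_retailer("xyzzy 123"))
```

A similarity cutoff trades recall against false positives: a lower value tolerates blurrier text but makes it more likely that an unrelated line is mistaken for a retailer name.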

The system consists of identical, independent, masterless nodes, so it scales horizontally by adding nodes. Plato is written mainly in Python and uses many open-source libraries for computer vision, linear algebra, worker queues, a web frontend, and a distributed database, among others.

Principal tech: scikit-learn, OpenCV, NumPy, SciPy, Tesseract, Django, Riak, RabbitMQ, Celery, Tornado, Ansible

MediaDB

Search page listing various searchable attributes

Attribute scanners are defined with regular expressions for filenames and text file contents

The system scans the database's files for attribute values

MediaDB is a framework that allows researchers to disseminate their large multi-modal databases with minimal setup, whilst providing a rich user experience.

The researcher points the system to the dataset's location on disk and then defines a set of 'scanners', based on regular expressions, that collect file information and scrape searchable attributes from any metadata files into an in-memory database. On the front end, users can instantly search through thousands of items in the dataset and download only the items and file types they are interested in.
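The scanner idea can be sketched in a few lines of Python. This is only an illustration under assumed conventions, not MediaDB's actual configuration format: the attribute names, the regular expressions, the file naming scheme, and the `scan_item` and `search` helpers are all hypothetical, and a plain dictionary stands in for the in-memory database.

```python
import re

# Hypothetical scanners: attribute name -> regex with a named group "value".
# Filename scanners run against the file name, content scanners against the
# text of a metadata file.
FILENAME_SCANNERS = {
    "subject": re.compile(r"subj(?P<value>\d+)"),
    "session": re.compile(r"sess(?P<value>\d+)"),
}
CONTENT_SCANNERS = {
    "emotion": re.compile(r"^emotion:\s*(?P<value>\w+)", re.MULTILINE),
}


def scan_item(filename, metadata_text=""):
    """Collect searchable attributes for one dataset item."""
    attributes = {}
    for name, pattern in FILENAME_SCANNERS.items():
        match = pattern.search(filename)
        if match:
            attributes[name] = match.group("value")
    for name, pattern in CONTENT_SCANNERS.items():
        match = pattern.search(metadata_text)
        if match:
            attributes[name] = match.group("value")
    return attributes


def search(index, **criteria):
    """Return the filenames whose attributes equal every given criterion."""
    return sorted(
        filename
        for filename, attributes in index.items()
        if all(attributes.get(key) == value for key, value in criteria.items())
    )


# Build a tiny in-memory index from made-up items and query it.
items = {
    "subj03_sess2_video.avi": "emotion: happy\n",
    "subj03_sess1_audio.wav": "emotion: neutral\n",
    "subj07_sess1_video.avi": "emotion: happy\n",
}
index = {name: scan_item(name, text) for name, text in items.items()}
print(search(index, subject="03"))
print(search(index, emotion="happy", session="1"))
```

Keeping the scanners declarative means a new dataset only needs a new set of patterns, which fits the minimal-setup goal described above.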

Several audio-visual research databases are now using this system.

Principal tech: Django, Redis, Postgres, jQuery, Twitter Bootstrap