Applications
Plato receipt recognition engine
In my work at Shopitize Inc., I am responsible for the design, implementation, and maintenance of the Plato receipt recognition engine, a system for recognizing shopping receipt images snapped by consumers. Plato recognizes the retailer, date, and total, as well as the products purchased.
Having users snap receipts on their smartphones often produces images that are challenging to recognize. Common issues include camera and motion blur, poor lighting and shadows, wrinkles, smudges, and writing on receipts, as well as perspective distortion that leads to both global and local skew. To meet the challenge, Plato combines a suite of image processing tools, OCR, fuzzy text matching, and image-based retailer classification.
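As a rough illustration of the fuzzy text matching step, the sketch below maps a noisy OCR string to the closest entry in a list of known retailers. It is not Plato's actual code: the retailer list, the function name `match_retailer`, and the 0.6 cutoff are assumptions made for the example, and it uses only Python's standard `difflib` module.

```python
import difflib

# Hypothetical list of known retailers, for illustration only.
KNOWN_RETAILERS = ["Tesco", "Sainsbury's", "Waitrose", "Asda", "Morrisons"]


def match_retailer(ocr_text, candidates=KNOWN_RETAILERS, cutoff=0.6):
    """Return the candidate most similar to a noisy OCR string, or None.

    The comparison is case-insensitive; `cutoff` is the minimum
    similarity ratio (0..1) a candidate needs to count as a match.
    """
    # Map lowercased names back to their canonical spelling.
    lowered = {name.lower(): name for name in candidates}
    hits = difflib.get_close_matches(
        ocr_text.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None


if __name__ == "__main__":
    # OCR commonly confuses the letter O with the digit 0.
    print(match_retailer("TESC0"))
    print(match_retailer("xyzxyz"))
```

A similarity cutoff like this trades recall for precision: a lower value tolerates heavier OCR damage but raises the risk of matching the wrong retailer.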
The system consists of identical, independent, masterless nodes, so it scales horizontally by adding more nodes. Plato is written mainly in Python and uses many open-source libraries for computer vision, linear algebra, worker queues, the web frontend, the distributed database, and more.
Principal tech: scikit-learn, OpenCV, NumPy, SciPy, Tesseract, Django, Riak, RabbitMQ, Celery, Tornado, Ansible
MediaDB
The MediaDB system is a framework that lets researchers disseminate their large multi-modal databases with minimal setup, whilst providing a rich user experience.
The researcher points the system to the dataset's location on disk and then defines a set of 'scanners' based on regular expressions. These scanners collect file information and scrape searchable attributes from any metadata files into an in-memory database. On the front end, users can instantly search through thousands of items in the dataset and download only the items and file types they are interested in.
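To make the idea of a regular-expression scanner concrete, here is a minimal sketch. It is not MediaDB's code: the file-name convention, the `SCANNER` pattern, and the `scan` and `search` helpers are all hypothetical, and a plain list of dictionaries stands in for the in-memory database.

```python
import re

# Hypothetical file-name convention: S<subject>_sess<session>_<view>.<ext>
SCANNER = re.compile(
    r"^S(?P<subject>\d+)_sess(?P<session>\d+)_(?P<view>\w+)\.(?P<ext>avi|wav)$"
)


def scan(filenames, pattern=SCANNER):
    """Collect searchable attributes from every file name that matches."""
    index = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            record = match.groupdict()  # named groups become attributes
            record["file"] = name
            index.append(record)
    return index


def search(index, **criteria):
    """Return the records whose attributes equal all given criteria."""
    return [
        record
        for record in index
        if all(record.get(key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    files = [
        "S001_sess2_front.avi",
        "S001_sess2_front.wav",
        "S002_sess1_side.avi",
        "README.txt",  # does not match the pattern, so it is skipped
    ]
    index = scan(files)
    for record in search(index, subject="001", ext="wav"):
        print(record["file"])
```

Named groups keep the scanner declarative: adding a searchable attribute only requires adding a group to the pattern.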
Several audio-visual research databases are now using this system:
- MMI Facial expression database
- MAHNOB Laughter database
- MAHNOB HCI-Tagging database
- MHI Mimicry database
- SEMAINE database
Principal tech: Django, Redis, Postgres, jQuery, Twitter Bootstrap