philosopher bagpiper

date/2014/03

one day coding binge: OpenCV, Tesseract and MtG

Last weekend I got to see these guys. They were incredible.

I’ve been wanting to try OpenCV for a while. I did some computer vision work back in uni and had a great time with it, and I recently realised I had a fun project for it that could save me some time.

If you have ever played TCGs (Trading Card Games), you know collections quickly become unmanageable, and inventory tracking takes hours on end if you’re serious about it. In my case, when I was playing TCGs here in Sydney (mostly MtG), I had the discipline to type in every new card, but every now and then at a big tournament I’d lose track of what had been entered. As the new cards piled up, the time it took to type them in grew so much that I gave up. I also stopped playing a while back, but that box is still there, rotting away.

One thing I can do with that big box in my closet is sell it: most of the cards I own are worth money. But if I am to sell them off, I need an inventory first. Hence this project.

Typing in a card name takes me about half a minute, but the strain is the worst part: typing is exhausting. So instead I coded a detector in Python that grabs the card name and puts it in the clipboard or an output file while making a rewarding ‘beep’ sound (I kid you not, I love that feature so much it’s on by default). Here is the detector running in clipboard mode. As it is now, it takes about 12 s to detect a card; the fastest I’ve seen was about 3 s, and there is no upper limit (it can sit there until it figures it out).
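To give an idea of the output step, here is a minimal sketch of the file mode with the beep. It is not the code from the project: the function name `emit` and its parameters are made up for illustration, the beep is assumed to be a plain ASCII bell character (which many terminals play as a sound), and clipboard mode is left out because it needs a platform-specific library.

```python
import sys


def emit(card_name, out, beep=True, bell_stream=sys.stderr):
    """Write one detected card name to `out` and optionally ring the bell.

    `emit` and its parameters are hypothetical names for this sketch.
    """
    out.write(card_name + "\n")
    out.flush()
    if beep:
        # ASCII BEL: many terminals turn this into an audible beep.
        bell_stream.write("\a")
        bell_stream.flush()


if __name__ == "__main__":
    # Append to an inventory file; the file name is only an example.
    with open("inventory.txt", "a", encoding="utf-8") as inventory:
        emit("Serra Angel", inventory)
```

Keeping the beep on by default mirrors the behaviour described above; passing `beep=False` silences it.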

The principle is simple: instead of trying to find out where the card is, it shows the user where it is looking. Once the user puts the card in the right place, it will attempt to figure out which card it is.
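A rough sketch of that idea in Python with OpenCV could look like the following. It is an illustration under assumptions, not the project's actual code: the helper names `guide_rect` and `preview`, the box proportions and the centred placement are all invented here. The fixed box is drawn on the live feed, and only the pixels inside it would be handed to OCR.

```python
def guide_rect(frame_w, frame_h, rel_w=0.75, rel_h=0.125):
    """Return (x0, y0, x1, y1) of a centred guide box in pixel coordinates.

    The proportions are assumptions; a real rig would tune them so the
    card's title line fits inside the box.
    """
    box_w = int(frame_w * rel_w)
    box_h = int(frame_h * rel_h)
    x0 = (frame_w - box_w) // 2
    y0 = (frame_h - box_h) // 2
    return x0, y0, x0 + box_w, y0 + box_h


def preview(camera_index=0):
    """Show the camera feed with the guide box until 'q' is pressed."""
    import cv2  # imported lazily so guide_rect works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            height, width = frame.shape[:2]
            x0, y0, x1, y1 = guide_rect(width, height)
            # Copy the region before drawing, so the green outline never
            # ends up in the pixels an OCR step would receive.
            title_region = frame[y0:y1, x0:x1].copy()
            cv2.rectangle(frame, (x0, y0), (x1, y1), (0, 255, 0), 2)
            cv2.imshow("Put the card name inside the box", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Letting the user do the alignment avoids card detection and perspective correction entirely, which is what keeps the approach simple.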

I tried other OCR ideas, like training Tesseract with the MtG font and card names. After all that, I decided to keep it simple and go with the default Tesseract detector restricted to the characters a-zA-Z. This way each card name comes out as a single word with no spaces. From there, I use the Hunspell spell checker with a custom dictionary to spell-check these ‘words’ and give me the most likely candidate. Once the system is confident enough that it has found a match, it outputs the data in whichever format was selected.
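The matching step can be sketched without any OCR at all. The example below is a stand-in, not the real thing: it uses Python's standard `difflib` instead of Hunspell, lowercases everything before comparing, and treats the similarity `cutoff` as the "confident enough" threshold. All three are assumptions made for illustration, as are the helper names `squash`, `build_index` and `best_match`.

```python
import difflib
import re


def squash(name):
    """Reduce a name to letters only, like OCR limited to a-zA-Z would."""
    return re.sub(r"[^a-zA-Z]", "", name).lower()


def build_index(card_names):
    """Map each squashed 'word' back to the full card name."""
    return {squash(name): name for name in card_names}


def best_match(ocr_text, index, cutoff=0.8):
    """Return the most likely card name, or None if nothing is close enough.

    difflib stands in for Hunspell here; `cutoff` plays the role of the
    confidence threshold (the value 0.8 is an assumption).
    """
    key = squash(ocr_text)
    hits = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[hits[0]] if hits else None


if __name__ == "__main__":
    index = build_index(["Lightning Bolt", "Serra Angel", "Black Lotus"])
    # A misread character still resolves to the right card.
    print(best_match("LightningBo1t", index))
```

Returning `None` below the cutoff is what lets the detector keep looking at new frames until it gets a reading it trusts.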

The results are incredibly good (and fast) considering my rig is a $10 Kmart cam, a jar, some wire and a white sheet of paper. One of the cool things I’ll be looking into is how to make a nicer rig that has a ‘place’ for the card, so I can drop it straight in and it will be properly lit and in the right spot for detection. Maybe once my Peachy Printer arrives I can add some OpenSCAD to the project.

Photo of the test rig: my scanning setup

The GPLv3 source code is up on my GitHub. It is true, I finally gave up and joined GitHub until I can run my own git server (which, depending on my budget, may or may not be soon). This is my first ‘official’ public open source project!
