Hi there, I do speech recognition out of personal interest, nowadays mainly for Greek, though I started with English. I use the HTK and CMU Sphinx toolkits and libraries. Speech recognition is more of an art than a science: it is incredibly error-prone, but once you get it right you are left wondering what took you so long.
I have never tried recognizing sung audio, but I guess it is more or less similar to normal speech. Building the acoustic model for such a system, however, is not a simple task. For starters, you need a large training database unless you only want a proof of concept. For a practical, general multi-voice system you will need at least 200 hours of singing, and training will then take nearly a month.
I have never done pitch recognition either, but googling the subject I found that there are indeed some open source products that include pitch recognition modules. The theory and the algorithms involved are not rocket science, anyway. As a matter of fact, speech recognition does include pitch detection as part of the signal processing of the audio input. Maybe the two packages could share the processing up to that point and go their own way after that.
All in all, what you want can be done, but not for 250 dollars.
Regards,
Achilles
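P.S. To give an idea of how simple the basic pitch algorithm can be, here is a minimal sketch of autocorrelation-based pitch estimation in plain Python. It is only an illustration under my own assumptions: the function name `estimate_pitch`, the 80 to 1000 Hz search range and the synthetic 220 Hz test tone are all made up for the example, and this is not the method used by HTK, Sphinx or any particular open source package.

```python
import math


def estimate_pitch(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one audio frame.

    Uses plain (unnormalised) autocorrelation: the lag at which the frame
    best matches a shifted copy of itself is taken as the pitch period.
    Returns None for silent or too-short frames.
    """
    n = len(samples)
    if n == 0:
        return None
    mean = sum(samples) / n
    x = [s - mean for s in samples]                 # remove DC offset

    min_lag = int(sample_rate / fmax)               # shortest period searched
    max_lag = min(int(sample_rate / fmin), n - 1)   # longest period searched
    energy = sum(v * v for v in x)
    if energy == 0.0 or min_lag < 1 or max_lag <= min_lag:
        return None

    best_lag, best_corr = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with itself shifted by `lag` samples.
        corr = sum(x[i] * x[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    if best_lag is None:
        return None
    return sample_rate / best_lag


# Hypothetical test signal: a pure 220 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 220.0 * i / sample_rate) for i in range(2048)]
f0 = estimate_pitch(tone, sample_rate)
print("estimated pitch (Hz):", f0)
```

Because the lag is a whole number of samples, the estimate is only accurate to a few Hz at this sample rate. A real singing voice would also need framing, voiced/unvoiced decisions and some smoothing across frames, which is where the actual work is.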