Hi,
I can do this in a day for the AVR architecture. Since you want it to be time-efficient, I'll use inline assembly (asm volatile blocks) in the code to keep the clock cycles used for addition/subtraction to a minimum.
I've implemented digital filters on Atmel microcontrollers using these assembly constructs, since they have to do a lot of calculation-intensive processing in real time and the code generated by a C compiler is rarely that tightly optimized.
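To give a rough idea of the asm volatile approach mentioned above, here is a minimal sketch of 16-bit addition and subtraction, assuming avr-gcc as the toolchain. The function names add16 and sub16 are hypothetical and only for illustration, and the plain C fallback is there just so the snippet also builds off-target. It is not the final code.

```c
#include <stdint.h>

/* Hypothetical helper: 16-bit add using the AVR carry chain.
 * %A0 / %B0 select the low / high byte of operand 0 in avr-gcc. */
static inline uint16_t add16(uint16_t a, uint16_t b)
{
#if defined(__AVR__)
    __asm__ volatile(
        "add %A0, %A1\n\t"   /* low bytes, sets carry            */
        "adc %B0, %B1\n\t"   /* high bytes plus carry            */
        : "+r"(a)            /* a is read and written in place   */
        : "r"(b)
        : "cc");
    return a;
#else
    /* Assumption: portable fallback for non-AVR builds only. */
    return (uint16_t)(a + b);
#endif
}

/* Hypothetical helper: 16-bit subtract using the AVR borrow chain. */
static inline uint16_t sub16(uint16_t a, uint16_t b)
{
#if defined(__AVR__)
    __asm__ volatile(
        "sub %A0, %A1\n\t"   /* low bytes, sets borrow (carry)   */
        "sbc %B0, %B1\n\t"   /* high bytes minus borrow          */
        : "+r"(a)
        : "r"(b)
        : "cc");
    return a;
#else
    /* Assumption: portable fallback for non-AVR builds only. */
    return (uint16_t)(a - b);
#endif
}
```

On AVR each of these is two single-cycle instructions, and both wrap around modulo 65536 like ordinary unsigned arithmetic. Wider operands would extend the same adc/sbc chain byte by byte.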
Thanks,
Ashish