Thursday, June 30, 2011

Natural User System - What android@home should have been



Natural User System Project

Timeline: 2 weeks
Cost: $150
Lines of code: 50,000+

Info:
The platform we built allows for an ecosystem of automated devices that can work over a standard powerline system or wirelessly, and it can be integrated into existing networks.

The system was built using a combination of Ruby and C++. We looked to the elegance and simplicity of Ruby on Rails to give developers the freedom to write complex applications for our system in literally seconds. The lights application, as seen in the video, was written in under a minute.

The server has three abstraction layers: devices, generators, and apps.

Devices are things like microphones and speakers. Developers can easily come in and create new devices (say, a washing machine), and we have created some standard devices which automatically integrate into the basic home automation system. For example, let's say you develop a microphone for our system. The driver you write simply states that it's capable of being a microphone; the server then handles all the noise canceling, auto-correlation, etc., and listens in on the microphone.
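A driver, then, is mostly a declaration of capability. Here is a minimal Ruby sketch of that idea; the `Device` class and its methods are our illustration only, since the actual server API has not been published:

```ruby
# Hypothetical sketch of a NUS device driver. The names (Device,
# capable_of) are illustrative, not the real API.
class Device
  attr_reader :name, :room, :capabilities

  def initialize(name, room)
    @name = name
    @room = room
    @capabilities = []
  end

  # A driver simply states what the hardware can do; the server
  # takes it from there (noise canceling, auto-correlation, etc.).
  def capable_of(capability)
    @capabilities << capability
    self
  end

  def capable_of?(capability)
    @capabilities.include?(capability)
  end
end

# A microphone driver is just a capability declaration:
mic = Device.new("usb_mic_01", "kitchen").capable_of(:microphone)
```

Once the server sees `:microphone` in the capability list, it attaches its own audio pipeline; the driver never deals with noise canceling itself.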

Generators are "listeners" on the devices. Once the server sees a microphone, it will automatically listen in. A standard speech-to-text command generator is included in the "basic" system; it listens in on all microphones and can then spawn apps.
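The generator pattern can be sketched in a few lines of Ruby. Everything here (`Server`, `on_device`, `dispatch`) is a hypothetical stand-in for the unpublished API; it only shows the shape of the idea: a generator watches for microphones and turns recognized speech into dispatcher queries.

```ruby
# Hypothetical sketch of a NUS generator. All names are illustrative.
class Server
  attr_reader :devices, :queries

  def initialize
    @devices = []
    @queries = []
  end

  # Generators subscribe to device hot-plug events.
  def on_device(&block)
    @on_device = block
  end

  def register(device)
    @devices << device
    @on_device.call(device) if @on_device
  end

  # Queries flow to the dispatcher, tagged with the room of origin.
  def dispatch(query, room:)
    @queries << [query, room]
  end
end

server = Server.new

# The command generator: once the server sees a microphone, it listens
# in and pushes recognized text to the dispatcher as a query.
server.on_device do |dev|
  next unless dev[:capability] == :microphone
  dev[:on_speech] = ->(text) { server.dispatch(text, room: dev[:room]) }
end

mic = { capability: :microphone, room: "kitchen" }
server.register(mic)
mic[:on_speech].call("turn on the lights")
```

Because the generator records which room the speech came from, the dispatcher can later route replies back to that room.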

Apps register themselves with "meta tags". Much like a search engine, generators push a query to a dispatcher, which in turn tries to find the correct app to run. Once an application has control, it can either rely on simple built-in functions like "talk ____" or "listen", or it can call devices directly. The system is elegant enough to know the location of devices, meaning that a "talk" command will only make the computer speak in the same room as the person who started the conversation.
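The tag-matching dispatch described above can be sketched as follows. The `Dispatcher` DSL is our guess at what a sub-minute "lights" app might look like; none of these names come from the actual system:

```ruby
# Hypothetical sketch of meta-tag app registration and dispatch.
class Dispatcher
  def initialize
    @apps = []
  end

  # Apps register a list of meta tags plus a handler block.
  def register(tags, &handler)
    @apps << [tags, handler]
  end

  # Like a search engine: run the app whose tags best match the query.
  def dispatch(query, room)
    words = query.downcase.split
    tags, handler = @apps.max_by { |t, _| (t & words).size }
    handler.call(query, room) if tags && !(tags & words).empty?
  end
end

dispatcher = Dispatcher.new
spoken = []

# A minimal "lights" app -- the kind the post says took under a minute.
# `spoken` stands in for the location-aware "talk" built-in.
dispatcher.register(%w[lights light lamp]) do |query, room|
  action = query.include?("off") ? "off" : "on"
  spoken << "Turning the #{room} lights #{action}."
end

dispatcher.dispatch("turn on the lights", "kitchen")
```

The room argument is how a real "talk" call would stay local: the reply only plays on speakers in the room the query came from.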

Note that this system is not only voice-recognition capable; gesture-recognition technology will also be integrated into the "basic" system.

What we are working on now is the next level of automation systems. We use the terminology "NUS" (natural user system) instead of "NUI" (natural user interface) because we mean an intelligent system.

We don't want to disclose too much information about how we are integrating intelligence into the system, but it will truly be awe-inspiring once completed. We are working towards the most accurate, fastest, and smartest automation system to date. We are not looking to expand on current technologies; we are developing a new technology.


7 comments:

  1. So, is this going to be open source? If so, how long till we see the code? Let's not forget - the speed of code release with Wave went a long way to killing it.

  2. As a sidenote, here's a trackback for you: http://gqo.be/c?p=117

3. I did this for my car in the early '90s with one simple app called Game Commander, for remote-controlling RPGs or running programs/scripts.

    Not so sure why yours seems to be so slow though. Mine took about a second or less to run the commands. Even with regular windows speech recognition libraries you could have it work better/faster.

  4. @Gray Marchiori-Simpson
I can promise you that we will keep this project as open as possible. We want to encourage its implementation and improvement, and we are committed to listening to the community; we strongly believe in this approach. As a time frame, I am hoping to have this open by next week, possibly sooner.

    Thanks for your feedback and trackback!

  5. @Kanaidia
To answer your question, the response time for this prototype is a few seconds for two main reasons. The first is that our VR (voice recognition) is not local: we are actually using Google's VR, so we have to send ALL speech over to Google's online VR. The next version of our NUS will have a local VR (for common/simple speech) and a remote VR (for more complex/less common speech). The second reason for the delay is that we didn't have enough Red Bull to get everything done that we wanted to.. remember, we had less than 2 weeks (of after-normal-work hours) to prove the concept.

    Thanks for your feedback!

  6. Well that is even more impressive. I've been doing voice recognition myself with HTK/Julius and stuff, it leaves a bitter memory ^^.
    Good job!

    Just a question: why Ruby _on Rails_? Do you have web-related needs?

  7. I am extremely impressed with your writing skills as well as with the layout on your blog. Is this a paid theme or did you customize it yourself? Anyway keep up the nice quality writing, it is rare to see a nice blog like this one nowadays.
    top home alarm systems
