Speech Recognition on Raspberry Pi for Voice Controlled Home Automation

Voice Controlled Home Automation using Raspberry Pi

“Ok Google, do my homework.” If this command worked, our childhood would have been simpler, wouldn’t it? Some things are meant to be done by ourselves, and of course we are not going to design something that does your homework. Still, we can all admit that getting things done with voice commands is fun. This is why we have already built voice-controlled home automation projects like the Alexa Controlled Home Automation using Arduino and Google Assistant Home Automation using ESP32.

 

So, in this project, we are going to build a Raspberry Pi based Voice Controlled Home Automation System that can listen, respond, and control AC loads according to our voice commands. Speech is captured directly on the Raspberry Pi, so we can connect a microphone to the Pi and speak into it. This avoids the need for external devices like a mobile phone. The system can also be kept turned on all the time, waiting for a particular voice command. Here, we have programmed the Pi to respond to the keyword “hello”, after which we can switch our lights on or off. You can also check our other IoT based Home Automation projects.

 

To begin with, how does Google or any other voice assistant understand our words and respond to them? When you speak, you create vibrations in the air. A microphone converts this energy into an electrical signal. This signal is analog, and a computer cannot use it directly because it only works with digital signals, so an analog-to-digital converter (A/D converter) is used to digitize it. The digitized sound is then filtered to remove noise and passed through complex speech and natural language processing systems, which settle on the most likely text for what was said. In our project, we are going to use the Google speech API with Raspberry Pi, which uses machine learning algorithms to convert our speech into text, and then we will use Espeak to convert text to speech so that the Pi can respond to our commands.
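The analog-to-digital step described above can be illustrated with a small sketch. It samples a sine wave, which stands in for the microphone signal, and rounds each sample to a signed 16-bit integer. The 16 kHz sample rate, the 440 Hz tone, and the function name are assumptions chosen for illustration, not values taken from this project.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits=16):
    """Sample a unit-amplitude sine wave and quantize it to signed integers."""
    max_level = 2 ** (bits - 1) - 1  # 32767 for 16-bit audio
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                         # time of the n-th sample
        analog = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in [-1, 1]
        samples.append(round(analog * max_level))      # nearest digital level
    return samples

# Assumed example values: a 440 Hz tone sampled at 16 kHz
digital = sample_and_quantize(freq_hz=440, sample_rate_hz=16000, n_samples=8)
print(digital)
```

Each printed number is one digital sample; a real A/D converter produces a stream of such values many thousands of times per second.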

 

Components Required

  1. Raspberry Pi
  2. Mic
  3. Speaker
  4. Relay
  5. Jumper wires

 

Voice Controlled Home Automation Circuit Diagram

We connect the speaker through a 3.5 mm male jack and the microphone through USB. The bulb’s connection with the relay module is simple: one terminal of the bulb is connected to the neutral of the AC supply, the phase of the AC supply is connected to the “NO” (normally open) terminal of the relay, and the common terminal of the relay is connected to the bulb’s other terminal. The relay itself is controlled from GPIO 27 of the Raspberry Pi, which is the pin used in the code.

Raspberry Pi Voice Controlled Home Automation Circuit Diagram

 

The components used in this project for building the above circuit are shown in the picture below.

Voice Controlled Home Automation using Raspberry Pi Components           

 

USB Microphone

There are different types of microphones, namely dynamic, ribbon, condenser, crystal, electret condenser, etc. In a condenser microphone, the diaphragm acts as one plate of a capacitor. When sound waves strike the diaphragm, it moves to and fro, which changes the distance between the two parallel plates. When the distance between the plates increases, the capacitance decreases, and when it decreases, the capacitance increases; this produces changes in the current. These changes in current are proportional to the input (sound waves). The current flows through a resistor connected in series, and the output is taken across that resistor.
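The relation between plate distance and capacitance can be made concrete with the parallel-plate formula C = ε0 · A / d. The sketch below assumes an air gap and uses made-up dimensions (a 1 cm² plate and gaps of 25 µm and 20 µm); they are illustrative only and do not describe any particular microphone.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space in F/m

def plate_capacitance(area_m2, gap_m):
    """Capacitance of an ideal parallel-plate capacitor with an air gap."""
    return EPSILON_0 * area_m2 / gap_m

area = 1e-4  # assumed plate area: 1 cm^2, expressed in m^2
for gap in (25e-6, 20e-6):  # diaphragm at rest, then pushed closer by sound
    picofarads = plate_capacitance(area, gap) * 1e12
    print(f"gap {gap * 1e6:.0f} um -> {picofarads:.1f} pF")
```

A smaller gap gives a larger capacitance, which is the change the microphone circuit turns into an electrical signal.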

 

In our project, the USB microphone that we are using performs this analog-to-digital (A/D) conversion itself. It also has a built-in amplifier, which makes an external preamplifier circuit unnecessary. So a USB microphone can be connected directly to a computer, and in our case to a Raspberry Pi.

USB Microphone

 

Connecting USB Microphone with Raspberry Pi

First, we need to check whether the microphone is detected by the Raspberry Pi. The following command is used in LXTerminal to check it.

alsamixer

Enter the command, and you will get the following dialog box.

Raspberry Pi Home Automation

 

From there, navigate the UI with the arrow keys. Press F6 to select the microphone from the list of sound cards, and set the recording volume with the up/down arrow keys.

Raspberry Pi Voice Control Home Automation

 

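To test the microphone, record a clip from LXTerminal with the following command. The recording is saved in the test.wav file; press Ctrl+C to stop recording. Here, plughw:1,0 refers to card 1, device 0, so adjust it if your microphone appears as a different card.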

arecord -D plughw:1,0 test.wav

 

To play the test.wav file, enter the following command in the terminal.

aplay test.wav

This way, you can check if the microphone is connected and working properly with your Raspberry Pi.
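Besides listening to the clip, the recording can also be inspected from Python. The following is a minimal sketch that uses the standard wave module to report the channel count, sample rate, and duration of a WAV file; the function name wav_info is our own and is not part of the project code.

```python
import wave

def wav_info(path):
    """Return basic properties of a WAV file as a dictionary."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }

# Example call, once test.wav has been recorded:
# print(wav_info("test.wav"))
```

A duration close to the time you spent recording suggests that the microphone captured audio for the whole clip.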

 

Speaker

The speaker works on the same mechanism as a microphone, but in reverse. A microphone converts sound waves to electrical signals, while a speaker converts electrical signals to sound waves. A cone, an electromagnetic coil, and a permanent magnet are the main components of the speaker. The permanent magnet is fixed at one end, while the electromagnet, placed in front of it, is free to move. The electromagnet is attached to a cone made of flexible material (paper or plastic), which amplifies the vibrations. When pulses are applied to the electromagnet, it is alternately attracted to and repelled from the permanent magnet. As the electromagnet vibrates to and fro, the cone attached to it vibrates as well, thereby producing sound. The pitch of the sound depends on the frequency of the vibrations, and the volume depends on their amplitude. Here, we are going to use a 3.5 mm jack to connect the speaker to our Raspberry Pi. The speaker shown below has an AUX input, but you can use any speaker that works with the Raspberry Pi.

Speaker

 

Libraries required for Speech Recognition on Raspberry Pi

Before we start coding, we need to install certain libraries that will make the code simpler. The Espeak library is used to convert text to speech on Raspberry Pi, and the Speech Recognition library is used to perform speech to text with the Pi. The PyAudio library is needed to get data from the USB microphone. The following commands are used to install the necessary libraries.

sudo apt-get install espeak
sudo pip3 install SpeechRecognition
sudo pip3 install PyAudio

 

Use the following command to test espeak. If it is installed correctly, you will hear ‘hello world’.

espeak "Hello world"
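The same test can be run from Python, which is how the program later in this article talks to Espeak. The sketch below builds the command as a list with one option per element, so each option reaches espeak as a separate argument. The helper names espeak_command and speak are our own; the -s (speed in words per minute), -v (voice), and -z options are the ones used in this project's code.

```python
import subprocess

def espeak_command(text, speed_wpm=140, voice="en+18"):
    """Build the espeak argument list, one option per list element."""
    return ["espeak", f"-s{speed_wpm}", f"-v{voice}", "-z", text]

def speak(text):
    """Speak the text through espeak (requires espeak to be installed)."""
    subprocess.call(espeak_command(text))

print(espeak_command("Hello world"))
# On the Pi, call speak("Hello world") to hear the phrase.
```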

 

Raspberry Pi Speech Recognition Program

The complete program for speech recognition with the Pi can be found at the bottom of this page; an explanation of the code is as follows. We begin by importing the call function, which is used to run Espeak for text to speech, and the speech recognition module, which converts speech to text. After importing these, we import the GPIO module, which controls the pins of the Raspberry Pi.

from subprocess import call
import speech_recognition as sr
import RPi.GPIO as GPIO

 

The code given below is a function that listens for the phrases we speak. It waits until the user says something, stores the recording in the “audio” variable, and returns it. The device_index value of 2 is the index of the USB microphone on our Pi and may be different on yours.

def listen1():
    with sr.Microphone(device_index=2) as source:
        r.adjust_for_ambient_noise(source)
        print("Say Something")
        audio = r.listen(source)
        print("got it")
    return audio
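Since the microphone index depends on the devices attached to the Pi, it can help to look it up by name instead of hard-coding it. The sketch below assumes the SpeechRecognition library's Microphone.list_microphone_names() call, whose list positions correspond to device_index values. The helper names and the sample device names are hypothetical.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        if keyword in name.lower():
            return index
    return None  # no matching device found

def list_microphones():
    """Return the device names reported by SpeechRecognition."""
    # Imported here so find_device_index can be tried without the library
    import speech_recognition as sr
    return sr.Microphone.list_microphone_names()

# Hypothetical device list, for illustration only
sample_names = ["bcm2835 Headphones", "Example USB Audio Device", "default"]
print(find_device_index(sample_names, "usb"))
```

On the Pi, the call would be find_device_index(list_microphones(), "usb"), and the result can be passed to sr.Microphone(device_index=...).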

 

The function below accepts the recorded audio as audio1. It recognizes our speech using the Google speech API and then prints it on the screen as a string.

def voice(audio1):
    try:
        text1 = r.recognize_google(audio1)
        print("you said: " + text1)
        return text1
    except sr.UnknownValueError:
        call(["espeak", "-s140", "-ven+18", "-z", "Google Speech Recognition could not understand"])
        print("Google Speech Recognition could not understand")
        return ""  # empty string, so later 'in' checks on the result do not fail
    except sr.RequestError:
        print("Could not request results from Google")
        return ""

 

The main function listens for the next phrase, converts it to text using the speech-to-text function, and then acts on it and gives feedback using Espeak.

def main(text):
    audio1 = listen1()
    text = voice(audio1)

 

The if and elif conditions given below check whether the string in the text variable contains “light on” or “light off”. If it contains “light on”, the if condition is satisfied.

 

The code inside the if block sends a high value to the pin named led (GPIO 27). After that, Espeak converts text to speech to give audible feedback. If the text contains “light off” instead, the if condition is not satisfied and the program checks the elif condition. When the elif condition is satisfied, the code inside it sends a low value to the pin named led (GPIO 27). This pin is connected to a relay to control any required AC load, similar to what we did in the Blynk Home Automation and Adafruit IO Home Automation projects.

    if 'light on' in text:
        GPIO.output(led, 1)
        call(["espeak", "-s140", "-ven+18", "-z", "okay Sir, Switching ON the Lights"])
        print("Lights on")
    elif 'light off' in text:
        GPIO.output(led, 0)
        call(["espeak", "-s140", "-ven+18", "-z", "okay Sir, Switching off the Lights"])
        print("Lights Off")
    text = {}
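The phrase check above can also be written as a small function that is easy to try without a Pi. This is a sketch, not the code used in this project: the function name and the extra “lights on” and “lights off” phrases are our own assumptions. It returns True for an “on” command, False for an “off” command, and None for anything else, including a non-string result from a failed recognition.

```python
ON_PHRASES = ("light on", "lights on")     # assumed set of accepted phrases
OFF_PHRASES = ("light off", "lights off")

def parse_light_command(text):
    """Map recognized text to True (on), False (off), or None (no command)."""
    if not isinstance(text, str):
        return None  # recognition failed, nothing to parse
    phrase = text.lower()
    if any(p in phrase for p in ON_PHRASES):
        return True
    if any(p in phrase for p in OFF_PHRASES):
        return False
    return None

for spoken in ["Turn the light on", "lights off please", "hello"]:
    print(spoken, "->", parse_light_command(spoken))
```

On the Pi, a result of True or False could be passed straight to GPIO.output(led, ...), while None would leave the relay unchanged.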

 

The code given below is the one that runs first. When the Python interpreter runs the file directly, it sets the __name__ variable to the value “__main__”. This code keeps the program in standby mode: it listens and compares the recognized text with the triggering phrase. When the Raspberry Pi hears the triggering phrase, the program enters the main code, which is defined in the function main().

if __name__ == '__main__':
    while True:
        audio1 = listen1()
        text = voice(audio1)
        if text == 'hello':
            text = {}
            call(["espeak", "-s140", "-ven+18", "-z", "Okay master, waiting for your command"])
            main(text)
        else:
            call(["espeak", "-s140", "-ven+18", "-z", "Please repeat"])

 

Controlling AC Loads through Voice Commands on Pi

At idle, the Raspberry Pi keeps checking for the phrase that triggers the code. In our case, the triggering phrase is “hello”. When the user speaks it, the rest of the program runs: it listens to the next phrase and executes the command, turning the lights on or off.

Voice Controlled Home Automation using Raspberry Pi

The complete working is also shown in the video at the bottom of this page. If the user says one of the predetermined phrases, the matching condition is satisfied, its inner code runs, and the bulb is switched on or off depending on the command. After the Raspberry Pi switches the bulb on or off, we hear audio feedback from the speaker. After listening to the first phrase that follows the triggering phrase, the Raspberry Pi returns to its initial condition and waits for the triggering phrase again, and the same process repeats. With this, we can add a voice-controlled system using Raspberry Pi to our project arsenal. It may not be the same as a Google Home Mini, but it is good to have our own voice-controlled device, which can even give feedback in the form of voice.
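The cycle described above (idle, wake on the triggering phrase, handle one phrase, return to idle) can be simulated without a microphone or relay. The sketch below replays a list of already-recognized phrases and records what the system would do for each one. The function name and the action labels are our own, chosen for illustration.

```python
def run_assistant(transcripts, wake_word="hello"):
    """Replay recognized phrases through the wake-word / command cycle."""
    actions = []
    awaiting_command = False  # False means idle, waiting for the wake word
    for phrase in transcripts:
        phrase = phrase.lower().strip()
        if not awaiting_command:
            if phrase == wake_word:
                awaiting_command = True
                actions.append("wake")
            else:
                actions.append("please repeat")
        else:
            if "light on" in phrase:
                actions.append("relay on")
            elif "light off" in phrase:
                actions.append("relay off")
            else:
                actions.append("ignored")
            awaiting_command = False  # back to idle after one phrase
    return actions

print(run_assistant(["hello", "light on", "light off", "hello", "light off"]))
```

In this run, the first “light off” is answered with a request to repeat because the system has already returned to idle; only the one spoken after the second “hello” switches the relay off.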

Code

from subprocess import call
import speech_recognition as sr
import RPi.GPIO as GPIO

r = sr.Recognizer()
led = 27  # BCM pin number that drives the relay

GPIO.setwarnings(False)
GPIO.setmode(GPIO.BCM)
GPIO.setup(led, GPIO.OUT)

def listen1():
    # Record one phrase from the USB microphone (device index 2 on our Pi)
    with sr.Microphone(device_index=2) as source:
        r.adjust_for_ambient_noise(source)
        print("Say Something")
        audio = r.listen(source)
        print("got it")
    return audio

def voice(audio1):
    # Convert the recorded audio to text with the Google speech API
    try:
        text1 = r.recognize_google(audio1)
        print("you said: " + text1)
        return text1
    except sr.UnknownValueError:
        call(["espeak", "-s140", "-ven+18", "-z", "Google Speech Recognition could not understand"])
        print("Google Speech Recognition could not understand")
        return ""  # empty string, so the 'in' checks in main() do not fail
    except sr.RequestError:
        print("Could not request results from Google")
        return ""

def main(text):
    # Listen for one command after the triggering phrase and act on it
    audio1 = listen1()
    text = voice(audio1)
    if 'light on' in text:
        GPIO.output(led, 1)
        call(["espeak", "-s140", "-ven+18", "-z", "okay Sir, Switching ON the Lights"])
        print("Lights on")
    elif 'light off' in text:
        GPIO.output(led, 0)
        call(["espeak", "-s140", "-ven+18", "-z", "okay Sir, Switching off the Lights"])
        print("Lights Off")

if __name__ == '__main__':
    # Stay in standby until the triggering phrase "hello" is heard
    while True:
        audio1 = listen1()
        text = voice(audio1)
        if text == 'hello':
            text = {}
            call(["espeak", "-s140", "-ven+18", "-z", "Okay master, waiting for your command"])
            main(text)
        else:
            call(["espeak", "-s140", "-ven+18", "-z", "Please repeat"])

Video