This article describes how to implement an AI speaker on the ODROID-HC1 using the Google Assistant SDK. The following hardware is required:
- ODROID-HC1 (http://bit.ly/2wjNToV) with 5V/4A power supply
- Bluetooth Module 2 (http://bit.ly/2gNybJW)
- Bluetooth Speaker including microphone (http://amzn.to/2z6dBz5)
- MicroSD card for the OS (8GB or larger, Class 10 or higher)
- LAN cable
Insert the Bluetooth dongle into the USB port of the ODROID-HC1, then turn on the ODROID-HC1 and Bluetooth speaker to begin.
To access the ODROID-HC1 console, get the IP address of the board as described at http://bit.ly/2yXFLwp. This guide is based on the latest Ubuntu 16.04 minimal OS image, which can be downloaded from http://bit.ly/2yXYs3h. Before changing any system settings, add a new user account named odroid as a sudo user, because Ubuntu minimal does not have any user accounts:
# adduser odroid
# usermod -aG sudo odroid
# su - odroid
Install the alsa and pulseaudio sound related packages:
$ sudo apt update
$ sudo apt install libasound2 libasound2-plugins alsa-utils alsa-oss
$ sudo apt install pulseaudio pulseaudio-utils pulseaudio-module-bluetooth
Add the pulseaudio permission to the user account. Add the “load-module module-switch-on-connect” line to the pulseaudio configuration file. This setting changes the audio output to the Bluetooth speaker automatically:
$ sudo usermod -aG pulse,pulse-access odroid
$ sudo nano /etc/pulse/default.pa
.ifexists module-bluetooth-discover.so
load-module module-bluetooth-discover
load-module module-switch-on-connect # this is new!
.endif
$ pulseaudio --start
Install the Bluetooth related package. In this instance, use the bluez package for bluetooth:
$ sudo apt install bluez
$ bluetoothctl
If the bluetoothctl command does not work on the user account, modify the dbus configuration file by adding the following configurations to the file:
$ sudo nano /etc/dbus-1/system.d/bluetooth.conf
<policy user="odroid">
  <allow send_destination="org.bluez"/>
  <allow send_interface="org.bluez.Agent1"/>
  <allow send_interface="org.bluez.GattCharacteristic1"/>
  <allow send_interface="org.bluez.GattDescriptor1"/>
  <allow send_interface="org.freedesktop.DBus.ObjectManager"/>
  <allow send_interface="org.freedesktop.DBus.Properties"/>
</policy>
Enter the following commands on the bluetoothctl console. The MAC address 00:11:67:AE:25:C6 is only an example. Each Bluetooth device has a different address, so replace it with the address of your own Bluetooth speaker:
[bluetooth]# agent on
[bluetooth]# default-agent
[bluetooth]# scan on
[bluetooth]# pair 00:11:67:AE:25:C6
[bluetooth]# trust 00:11:67:AE:25:C6
[bluetooth]# connect 00:11:67:AE:25:C6
[bluetooth]# quit
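Because the same address has to be typed three times, a small helper can generate the command list and catch a mistyped address early. The following is only a sketch: the function name `bluetoothctl_commands` and the validation pattern are assumptions made for illustration, and the commands themselves are the ones listed above.

```python
import re

# Six colon-separated hex pairs, e.g. 00:11:67:AE:25:C6.
MAC_PATTERN = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def bluetoothctl_commands(mac):
    """Return the pairing command sequence for one speaker address.

    Hypothetical helper: it only builds the text to enter on the
    bluetoothctl console, it does not talk to bluetoothctl itself.
    """
    if not MAC_PATTERN.match(mac):
        raise ValueError("not a MAC address: %r" % mac)
    mac = mac.upper()
    return [
        "agent on",
        "default-agent",
        "scan on",
        "pair %s" % mac,
        "trust %s" % mac,
        "connect %s" % mac,
        "quit",
    ]


if __name__ == "__main__":
    # Example address from this article; replace it with your own.
    print("\n".join(bluetoothctl_commands("00:11:67:AE:25:C6")))
```

The printed lines can be entered on the bluetoothctl console one at a time. Scanning and pairing take a moment, so sending them all at once may not work.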
The Bluetooth speaker must be set to the correct profile. The speaker typically connects using A2DP (Advanced Audio Distribution Profile), so change the profile to HSP (Headset Profile), because A2DP cannot use the microphone. First, list the PulseAudio devices:
$ pacmd ls
Check the card index of the Bluetooth speaker, and assume the index is 1:
$ pacmd set-card-profile 1 headset_head_unit
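Rather than reading the card index by eye, the listing can be scanned for the Bluetooth card. The sketch below assumes that the listing contains `index: N` lines followed by `name: <bluez_card...>` lines. The function name and the sample text are illustrative and are not captured output from a real board.

```python
import re


def find_bluez_card_index(pacmd_output):
    """Return the index of the first card named bluez_card.*, or None."""
    index = None
    for line in pacmd_output.splitlines():
        line = line.strip()
        # Remember the most recent "index: N" line (a leading '*' marks defaults).
        m = re.match(r"^\*?\s*index:\s*(\d+)$", line)
        if m:
            index = int(m.group(1))
        elif line.startswith("name: <bluez_card.") and index is not None:
            return index
    return None


# Illustrative input shaped like a card listing (assumed format).
SAMPLE = """
2 card(s) available.
    index: 0
        name: <alsa_card.platform-sound>
    index: 1
        name: <bluez_card.00_11_67_AE_25_C6>
"""

if __name__ == "__main__":
    print(find_bluez_card_index(SAMPLE))
```

The returned number would take the place of the 1 in the set-card-profile command above.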
To verify sound and Bluetooth setup was done correctly, play a test sound:
$ speaker-test -t wav
Record and playback some audio using the ALSA command-line tools:
$ arecord --format=S16_LE --duration=5 --rate=16k --file-type=raw out.raw
$ aplay --format=S16_LE --rate=16k --file-type=raw out.raw
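A quick sanity check on the recording is to compare the size of out.raw with the size implied by the arecord options. The arithmetic below is a sketch that assumes mono recording (arecord's default when no channel count is given) and 2 bytes per sample for S16_LE. The function name is illustrative.

```python
def raw_pcm_size(rate_hz, seconds, bytes_per_sample=2, channels=1):
    """Size in bytes of headerless PCM audio with the given parameters."""
    return rate_hz * seconds * bytes_per_sample * channels


if __name__ == "__main__":
    # 16 kHz, 5 seconds, 16-bit samples, one channel.
    print(raw_pcm_size(16000, 5))
```

A file that is much smaller than this, or empty, suggests that the microphone side of the headset profile is not active.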
To make the Bluetooth speaker easier to use, some additional configuration is necessary:
pulseaudio --start
echo "connect 00:11:67:AE:25:C6" | bluetoothctl
Enable Google Assistant API
In order to enable the Google Assistant API, refer to the Google Assistant SDK Guides page at http://bit.ly/2pXwqfC. Use a Google account to sign in. If you do not have a Google account yet, create one. Trying the Google Assistant API is free for personal use.
Configure a Google Developer Project
A Google Developer Project allows any ODROID device access to the Google Assistant API. The project tracks quota usage and gives valuable metrics for the requests made from ODROID devices on the network.
To enable access to the Google Assistant API, first go to the Projects page in the Cloud Platform Console at http://bit.ly/2gY7pSV and select an existing project or create a new one. Next, enable the Google Assistant API on the project you selected by clicking Enable. More information about enabling the API is available at http://bit.ly/2A1ewic.
Next, create an OAuth client ID as described at http://bit.ly/2xBjII6. You may need to set a product name for the product consent screen: on the OAuth consent screen tab, give the product a name and click Save. Then click Other, give the client ID a name, and click Create. A dialog box appears that shows a client ID and secret. There is no need to remember or save these, so just close the dialog. Next, click at the far right of the screen for the client ID to download the client secret JSON file (client_secret_.json). The client_secret_.json file must be located on the device to authorize the Google Assistant SDK sample to make Google Assistant queries, and should not be renamed. Finally, copy client_secret_.json to the ODROID-HC1:
$ scp ~/Downloads/client_secret_client-id.json odroid@:~/
Set activity controls for your Google account
In order to use the Google Assistant, certain activity data must be shared with Google. The Google Assistant needs this data to function properly, and it is not specific to the SDK. To do this, open the Activity Controls page for the Google account to be used with the Assistant at http://bit.ly/2ig4QIB. Any Google account has this option, and it does not need to be your developer account. Ensure the following toggle switches are enabled (blue):
- Web & App Activity
- Device Information
- Voice & Audio Activity
Download and run the Google Assistant API sample
Use a Python virtual environment to isolate the SDK and its dependencies from the system Python packages:
$ sudo apt update
$ sudo apt install python-dev python-virtualenv git portaudio19-dev libffi-dev libssl-dev
$ virtualenv env --no-site-packages
If you encounter the locale problem shown below, set the LC_ALL environment variable and create the virtual environment again:
Complete output from command /home/odroid/env/bin/python2 - setuptools pkg_resources pip wheel:
Traceback (most recent call last):
  File "", line 24, in
  File "/usr/share/python-wheels/pip-8.1.1-py2.py3-none-any.whl/pip/__init__.py", line 215, in main
  File "/home/odroid/env/lib/python2.7/locale.py", line 581, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
$ export LC_ALL=C
$ virtualenv env --no-site-packages
Activate the Python virtual environment:
$ env/bin/python -m pip install --upgrade pip setuptools
$ source env/bin/activate
After activating the Python virtual environment, the “(env)” string is added in front of the prompt.
Authorize the Google Assistant SDK sample to make Google Assistant queries for the given Google Account. Reference the JSON file that was copied over to the device in a previous step and install the authorization tool:
(env) $ python -m pip install --upgrade google-auth-oauthlib[tool]
Run the tool. Remove the --headless flag if you are running it from a terminal on the device itself rather than over an SSH session:
(env) $ google-oauthlib-tool --client-secrets /path/to/client_secret_client-id.json --scope https://www.googleapis.com/auth/assistant-sdk-prototype --save --headless
You should see a URL displayed in the terminal:
Please go to this URL: https://
Copy the URL and paste it into a browser. This can be done on a development machine, or any other machine. After it is approved, a code will appear in the browser, such as “4/XXXX”. Copy and paste this code into the terminal:
Enter the authorization code:
If authorization was successful, OAuth credentials will be initialized in the terminal. If InvalidGrantError shows instead, then an invalid code was entered. If this occurs, try again, taking care to copy and paste the entire code. If the correct authorization code is entered, then the credentials.json file is generated:
credentials saved: /home/odroid/.config/google-oauthlib-tool/credentials.json
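Before moving on, it can help to confirm that the saved file looks complete. The check below is a sketch and not part of the SDK. The field names in `EXPECTED_FIELDS` are an assumption about what the tool writes, so adjust them if your credentials.json looks different. The path matches the "credentials saved" message above for the current user.

```python
import json
import os

# Assumed field names; not taken from the article.
EXPECTED_FIELDS = ("refresh_token", "token_uri", "client_id", "client_secret")

DEFAULT_PATH = os.path.join(os.path.expanduser("~"), ".config",
                            "google-oauthlib-tool", "credentials.json")


def missing_credential_fields(data):
    """Return the expected fields that are absent or empty in data."""
    return [name for name in EXPECTED_FIELDS if not data.get(name)]


def check_credentials_file(path=DEFAULT_PATH):
    """Load the JSON file at path and report missing fields."""
    with open(path) as f:
        return missing_credential_fields(json.load(f))


if __name__ == "__main__":
    if os.path.exists(DEFAULT_PATH):
        print(check_credentials_file() or "credentials look complete")
    else:
        print("no credentials file at %s" % DEFAULT_PATH)
```

An empty list means every assumed field is present. It does not prove that the token is still valid.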
Get the sample code from the GitHub repository:
$ git clone https://github.com/googlesamples/assistant-sdk-python
$ cd assistant-sdk-python
Install the Python packages required by the sample program. This guide uses the pushtotalk sample:
$ cd google-assistant-sdk
$ python setup.py install
$ cd googlesamples/assistant/grpc
$ pip install --upgrade -r requirements.txt
$ nano pushtotalk.py
To run the sample, we have to modify the sample code. Change the exception type SystemError to ValueError in the sample code (line 35):
except ValueError:
    import assistant_helpers
    import audio_helpers
Run and test the pushtotalk sample. If the sample program is working well, this work is almost done:
(env) $ python pushtotalk.py
INFO:root:Connecting to embeddedassistant.googleapis.com
Press Enter to send a new request...
Additional steps are needed to produce a useful AI speaker. To prepare for them, copy the sample to the $(HOME)/ai_speaker working directory, make a copy of pushtotalk.py named ai_speaker.py, and deactivate the Python virtual environment:
(env) $ cd ..
(env) $ cp -r grpc ~/ai_speaker
(env) $ cd ~/ai_speaker
(env) $ cp pushtotalk.py ai_speaker.py
(env) $ deactivate
$ cd
The push-to-talk sample can interact with the AI assistant, but the Enter key must be pressed before each request. To detect a Wake-Up-Word like “Okay, Google”, “Alexa” or “Jarvis” instead, use CMUSphinx (https://cmusphinx.github.io), an open source local speech recognition toolkit. Build and install SphinxBase first, because SphinxBase provides common functionality across all CMUSphinx projects:
$ sudo apt install libtool bison swig python-dev autoconf libtool automake
$ git clone --depth 1 https://github.com/cmusphinx/sphinxbase.git
$ cd sphinxbase
$ ./autogen.sh
$ make -j8
$ sudo make install
$ cd
Sphinxbase will be installed in the “/usr/local/” directory by default. Not all systems load libraries from this folder automatically. In order to load them, configure the path to look for shared libraries. This can be done either in the “/etc/ld.so.conf” file, or by exporting the environment variables:
export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
Build and install PocketSphinx. PocketSphinx is a lightweight speech recognition engine specifically tuned for handheld and mobile devices, although it works equally well on the desktop:
$ git clone --depth 1 https://github.com/cmusphinx/pocketsphinx.git
$ cd pocketsphinx
$ make -j8
$ sudo make install
$ cd
To test the installation, run pocketsphinx_continuous and check that it recognizes words you speak into your microphone:
$ pocketsphinx_continuous -inmic yes
For more information about building PocketSphinx, please refer to the “Building an application with PocketSphinx” page at http://bit.ly/2gZhHT5.
Add the pocketsphinx_continuous program as a subprocess in the AI speaker program. The program pocketsphinx_continuous is a good tool for detecting hotwords because it recognizes speech asynchronously. Remove the wait_for_user_trigger related lines, because the hotwords are the trigger:
$ source env/bin/activate
(env) $ pip install --upgrade subprocess
$(HOME)/ai_speaker/ai_speaker.py
"""Sample that implements gRPC client for Google Assistant API."""

# Add subprocess module
import subprocess
import json
import logging
import os.path

(......)

# Add the routines below in the 'while True:' loop
while True:
    # Start the keyword spotter and wait until it prints the keyphrase.
    p = subprocess.Popen(args = ['pocketsphinx_continuous','-inmic', 'yes', '-kws_threshold', '1e-16', '-keyphrase', 'hey dude'],
                         stdin = subprocess.PIPE,
                         stdout = subprocess.PIPE,
                         universal_newlines=True)
    while p.poll() is None:
        data = p.stdout.readline()
        # Compare with != rather than "is not": find() returns an integer.
        if data.find("hey dude") != -1:
            print "Detected Hotwords"
            p.stdout.flush()
            break
    p.terminate()
The Wake-Up-Word is “hey dude”. Run the program, say “hey dude,” and then state anything desired to the AI assistant:
(env) $ cd ai_speaker
(env) $ python ai_speaker.py
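The detection loop is easier to test when the line matching is separated from the subprocess handling. The following is a sketch of that split and not the sample's own code. The names `wait_for_hotword` and `listen` are hypothetical, and the pocketsphinx_continuous options are the ones used in this article.

```python
import subprocess


def wait_for_hotword(lines, keyphrase="hey dude"):
    """Consume lines until one contains the keyphrase; return it stripped."""
    for line in lines:
        if keyphrase in line.lower():
            return line.strip()
    return None  # input ended without a match


def listen(keyphrase="hey dude", threshold="1e-16"):
    """Run pocketsphinx_continuous until the keyphrase is printed."""
    p = subprocess.Popen(
        ["pocketsphinx_continuous", "-inmic", "yes",
         "-kws_threshold", threshold, "-keyphrase", keyphrase],
        stdout=subprocess.PIPE, universal_newlines=True)
    try:
        # iter(readline, "") yields lines until the process closes stdout.
        return wait_for_hotword(iter(p.stdout.readline, ""), keyphrase)
    finally:
        p.terminate()
        p.wait()
```

Because `wait_for_hotword` accepts any iterable of strings, it can be tried with a plain list before a microphone is connected, for example `wait_for_hotword(["READY....", "hey dude"])`.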
One problem remains after adding the Wake-Up-Words: there is no feedback to indicate whether the AI speaker has detected the hotwords. The user needs to know when detection has happened in order to time the voice command to the AI assistant. This can be fixed by adding a detection sound to the program. Download the sample detection sound at http://bit.ly/2zkSV3b, then copy the detect.wav file to the ODROID-HC1:
$ scp ~/Downloads/detect.wav odroid@:~/
Use the pyaudio and wave modules to play the .wav file from the Python source code:
(env) $ pip install --upgrade pyaudio wave
Add the detection sound play routine to the program. Full differences including the Wake-Up-Words routines are the following:
(env) $ nano ai_speaker.py
Differences between the original sample code pushtotalk.py and the modified program ai_speaker.py:
--- pushtotalk.py 2017-10-19 15:42:12.164741800 +0000
+++ ai_speaker.py 2017-10-19 15:41:49.644811151 +0000
@@ -14,6 +14,9 @@
 
 """Sample that implements gRPC client for Google Assistant API."""
 
+import pyaudio
+import wave
+import subprocess
 import json
 import logging
 import os.path
@@ -310,14 +313,38 @@
         # keep recording voice requests using the microphone
         # and playing back assistant response using the speaker.
         # When the once flag is set, don't wait for a trigger. Otherwise, wait.
-        wait_for_user_trigger = not once
+        chunk = 1024
+        pa = pyaudio.PyAudio()
+
         while True:
-            if wait_for_user_trigger:
-                click.pause(info='Press Enter to send a new request...')
+            p = subprocess.Popen(args = ['pocketsphinx_continuous','-inmic', 'yes', '-kws_threshold', '1e-16', '-keyphrase', 'hey dude'],
+                                 stdin = subprocess.PIPE,
+                                 stdout = subprocess.PIPE,
+                                 universal_newlines=True)
+            while p.poll() is None:
+                data = p.stdout.readline()
+                if data.find("hey dude") != -1:
+                    print "Detected Hotwords"
+                    p.stdout.flush()
+                    break
+            p.terminate()
+
+            # Play the detection sound
+            f = wave.open(r"/home/odroid/detect.wav","rb")
+            stream = pa.open(format = pa.get_format_from_width(f.getsampwidth()),
+                             channels = f.getnchannels(),
+                             rate = f.getframerate(),
+                             output = True)
+            wav_data = f.readframes(chunk)
+
+            while wav_data:
+                stream.write(wav_data)
+                wav_data = f.readframes(chunk)
+            stream.stop_stream()
+            stream.close()
+            f.close()
+
             continue_conversation = assistant.converse()
-            # wait for user trigger if there is no follow-up turn in
-            # the conversation.
-            wait_for_user_trigger = not continue_conversation
 
             # If we only want one conversation, break.
             if once and (not continue_conversation):
Run the AI speaker program:
(env) $ python ai_speaker.py
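The playback routine reads the .wav file in fixed-size chunks and hands each chunk to the output stream. The sketch below isolates that chunked read using only the standard wave module, so it can be tried without audio hardware. The names `iter_wav_chunks` and `play_wav` are hypothetical, and the `write` argument stands in for the `stream.write` method of an opened PyAudio output stream.

```python
import wave


def iter_wav_chunks(path, chunk=1024):
    """Yield the audio data of a .wav file, at most chunk frames at a time."""
    f = wave.open(path, "rb")
    try:
        data = f.readframes(chunk)
        while data:  # readframes returns empty bytes at end of file
            yield data
            data = f.readframes(chunk)
    finally:
        f.close()


def play_wav(path, write, chunk=1024):
    """Pass every chunk to write() and return the number of bytes sent."""
    total = 0
    for data in iter_wav_chunks(path, chunk):
        write(data)
        total += len(data)
    return total
```

With PyAudio, the call would look like `play_wav("/home/odroid/detect.wav", stream.write)`, where `stream` has been opened with the format, channel count, and rate read from the file, as in the modified program.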
To view the speaker in action, check out the video at https://youtu.be/6Ez782BxxdQ.
The final step
The detection rate of the Wake-Up-Words is less than ideal. Whether using PocketSphinx or another solution, the Wake-Up-Words routine needs improvement. Adding custom commands is also useful for this particular project. For example, it is easy to control IoT devices by voice using the Google Assistant SDK. To learn more about extending the Google Assistant, enter the search query “actions on google”.
To save time, an easy custom command solution can be used by simply adding the custom command to the ai_speaker.py program. In the pushtotalk sample, find the request text which is already recognized by voice:
--- pushtotalk.py 2017-10-19 16:07:46.753689882 +0000
+++ pushtotalk_new.py 2017-10-19 16:09:58.165799271 +0000
@@ -119,6 +119,15 @@
                     logging.info('Transcript of user request: "%s".',
                                  resp.result.spoken_request_text)
                     logging.info('Playing assistant response.')
+                    #Add your custom voice commands here
+                    #Ex>
+                    #import os
+                    #r_text = resp.result.spoken_request_text
+                    #if r_text.find("play music") != -1:
+                    #    os.system("mplayer ~/Music/*&")
+                    #if r_text.find("turn on light") != -1:
+                    #    os.system("echo 1 > /sys/class/gpio/gpio1/value")
+
                 if len(resp.audio_out.audio_data) > 0:
                     self.conversation_stream.write(resp.audio_out.audio_data)
                 if resp.result.spoken_response_text:
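As the number of custom commands grows, a table of phrases and actions is easier to maintain than a chain of if statements. The following is a sketch under that assumption. `COMMANDS` and `match_commands` are hypothetical names, and the two entries reuse the example phrases and shell commands from the diff above.

```python
import os

# (phrase, shell command) pairs; extend this list with your own commands.
COMMANDS = [
    ("play music", "mplayer ~/Music/* &"),
    ("turn on light", "echo 1 > /sys/class/gpio/gpio1/value"),
]


def match_commands(request_text, commands=COMMANDS):
    """Return the shell commands whose phrase occurs in the request text."""
    text = request_text.lower()
    return [action for phrase, action in commands if phrase in text]


def run_commands(request_text, commands=COMMANDS):
    """Run every matching command; returns the list that was run."""
    matched = match_commands(request_text, commands)
    for action in matched:
        os.system(action)
    return matched
```

In the sample, `run_commands(resp.result.spoken_request_text)` could be called at the point marked by the comments in the diff. Matching is done on lowercase text, so it does not depend on how the recognized text is capitalized.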
After this modification has been saved, you can begin experimenting with controlling home electronic devices through an IoT controller using voice commands. For comments, questions, and suggestions, please visit the original article at http://bit.ly/2iQ629K.