Simple Computer Voice Control in Python

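Python Exercise Book: one small program a day
Problem 0025: Use Python to make the computer open a default website in the browser when you shout at it.
For example, shout "百度" (Baidu) at your laptop and the browser opens the Baidu home page.
Keywords: Speech to Text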

1. Apply for API access

References:

API Keys
Implementing Speech To Text with the Google Speech API (note: the API URL has since moved to v2)

2. Test with curl

If you can reach Google without restrictions:

curl -X POST \
  --data-binary @record.flac \
  --header 'Content-Type: audio/x-flac; rate=8000;' \
  'https://www.google.com/speech-api/v2/recognize?output=json&lang=zh-CN&key=yourkey'

Otherwise, send the request through a proxy:

curl --socks5 127.0.0.1:1080 -X POST \
  --data-binary @record.flac \
  --header 'Content-Type: audio/x-flac; rate=8000;' \
  'https://www.google.com/speech-api/v2/recognize?output=json&lang=zh-CN&key=yourkey'

Reference: google-speech-v2 (GitHub)
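The recognize URL from the curl commands can also be assembled in Python, which avoids escaping the query string by hand. This is a minimal sketch, assuming the same endpoint and the same three query parameters as the curl test; the name `build_recognize_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl commands in this section.
RECOGNIZE_ENDPOINT = "https://www.google.com/speech-api/v2/recognize"


def build_recognize_url(key, lang="zh-CN"):
    """Return the recognize URL with output, lang and key as query parameters."""
    query = urlencode({"output": "json", "lang": lang, "key": key})
    return RECOGNIZE_ENDPOINT + "?" + query


print(build_recognize_url("yourkey"))
```

The separators in the result are plain `&` characters, which is what the API expects.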

3. Record audio with PyAudio

import wave

import pyaudio

CHUNK = 1024              # frames read per buffer
FORMAT = pyaudio.paInt16  # 16-bit samples
CHANNELS = 2
RATE = 8000               # sample rate in Hz; must match the rate sent to the API
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

# Save the captured frames as a WAV file.
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
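Before sending the recording anywhere, it can help to confirm that the WAV file has the parameters the script intended. The following is a small sketch using only the standard `wave` module; the helper name `wav_info` is an assumption for illustration and is not part of the original script.

```python
import wave


def wav_info(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds)."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        width = wf.getsampwidth()
        rate = wf.getframerate()
        duration = wf.getnframes() / float(rate)
    return channels, width, rate, duration
```

For the settings above, `wav_info("output.wav")` should report 2 channels, a 2-byte sample width and a rate of 8000. The duration comes out slightly under 5 seconds because the loop records a whole number of 1024-frame chunks.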


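4. Call the API for speech recognition

Before recognition, the recording has to be converted to FLAC. I have not yet found a Python package that can do this conversion, so the only option is to call an external command: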

import os
os.system('flac output.wav')

This produces a FLAC file with the same base name (output.flac).
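The same conversion can be done with `subprocess`, which reports failures instead of ignoring them. This is a sketch under the assumption that the `flac` command-line encoder is installed; the helper names `flac_command` and `wav_to_flac` are made up for illustration.

```python
import os
import shutil
import subprocess


def flac_command(wav_path):
    """Return (argv, flac_path) for converting wav_path with the flac CLI."""
    flac_path = os.path.splitext(wav_path)[0] + ".flac"
    # -f overwrites an existing output file, -o names the output explicitly.
    argv = ["flac", "-f", "-o", flac_path, wav_path]
    return argv, flac_path


def wav_to_flac(wav_path):
    """Run the flac encoder and return the path of the FLAC file."""
    if shutil.which("flac") is None:
        raise RuntimeError("the flac command-line encoder is not on PATH")
    argv, flac_path = flac_command(wav_path)
    subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return flac_path
```

Calling `wav_to_flac("output.wav")` returns `"output.flac"`. The `-f` flag makes repeated runs overwrite the previous file rather than stop with an error.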

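If the connection to Google is unreliable, you can configure a proxy for requests; out of the box it supports only http and https proxies.
SOCKS5 proxies need an additional package; see the related tutorial.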
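As a rough illustration of the proxy setup, requests takes a `proxies` mapping keyed by URL scheme. In the sketch below the helper `make_proxies` and the HTTP proxy port 8080 are assumptions for illustration; the SOCKS5 address is the one from the curl test.

```python
def make_proxies(proxy_url):
    """Return a requests-style mapping that routes http and https via proxy_url."""
    return {"http": proxy_url, "https": proxy_url}


# An HTTP proxy on a hypothetical local port:
http_proxies = make_proxies("http://127.0.0.1:8080")

# The SOCKS5 proxy from the curl test. requests accepts this scheme only when
# its optional SOCKS support is installed (pip install "requests[socks]").
socks_proxies = make_proxies("socks5://127.0.0.1:1080")

# Either mapping is then passed along with the request, for example:
#   requests.post(url, data=audio_bytes, headers=headers, proxies=socks_proxies)
```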

import re

import requests


def speech2text(flac_file, rate, yourkey):
    """
    Recognize speech with the Google Speech API.
    """
    url = 'https://www.google.com/speech-api/v2/recognize?output=json&lang=zh-CN&key=' + yourkey
    headers = {'Content-Type': 'audio/x-flac; rate=' + str(rate) + ';'}
    # Send the raw FLAC bytes as the request body, as curl --data-binary does;
    # files= would wrap them in a multipart form instead.
    with open(flac_file, 'rb') as f:
        audio = f.read()
    return_text = requests.post(url, data=audio, headers=headers).text
    # Extract the recognized text
    trans_text = re.findall(r'transcript":"(.*?)"', return_text)
    return ''.join(trans_text)
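The regular expression in `speech2text` joins every transcript it finds, including alternative guesses for the same utterance. A JSON-based parser can keep only the first alternative instead. The sketch below assumes the response consists of one JSON object per line, each shaped like `{"result": [{"alternative": [{"transcript": ...}]}]}`; that shape, the helper name `extract_transcript`, and the sample string are assumptions for illustration and not real API output.

```python
import json


def extract_transcript(return_text):
    """Return the first alternative of each result, concatenated."""
    pieces = []
    for line in return_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if not isinstance(payload, dict):
            continue
        for result in payload.get("result", []):
            alternatives = result.get("alternative", [])
            if alternatives:
                pieces.append(alternatives[0].get("transcript", ""))
    return "".join(pieces)


# Made-up sample in the assumed response shape.
sample = (
    '{"result":[]}\n'
    '{"result":[{"alternative":[{"transcript":"百度","confidence":0.9},'
    '{"transcript":"白度"}],"final":true}],"result_index":0}\n'
)
print(extract_transcript(sample))
```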

5. Execute the command

This part is deliberately minimal and should be extended.

import os


def text2cmd(cmds):
    # Open the matching site when its keyword appears in the recognized text.
    if cmds.find("谷歌") > -1:
        os.system("chrome https://www.google.com")
    if cmds.find("百度") > -1:
        os.system("chrome https://www.baidu.com")

Reference: Shout at your computer to automatically open the Google website or a command-line terminal (CSDN)
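One way to extend `text2cmd` is to move the keywords into a lookup table, so that supporting a new site means adding one entry. The sketch below also swaps `os.system("chrome ...")` for the standard-library `webbrowser` module, which opens the system's default browser and does not depend on a `chrome` command being on PATH. The names `SITES` and `match_urls` are assumptions for illustration.

```python
import webbrowser

# Keyword -> URL table; add entries here to support more commands.
SITES = {
    "谷歌": "https://www.google.com",
    "百度": "https://www.baidu.com",
}


def match_urls(text, sites=SITES):
    """Return the URLs whose keyword occurs in the recognized text."""
    return [url for keyword, url in sites.items() if keyword in text]


def text2cmd(text):
    """Open every matching site in the system's default browser."""
    for url in match_urls(text):
        webbrowser.open(url)
```

Keeping the matching in `match_urls` separate from the side effect in `text2cmd` makes the keyword logic easy to check without launching a browser.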

