Alexa Portable Platform Design and Baidu DuerOS

Victor Sue

Agenda
• Amazon Alexa Voice Service
• Alexa Portable Platform Design
• DuerOS Introduction

Amazon Alexa Voice Service (cont.)

[Diagram: Device and Cloud, with the labels Voice, Text, Intent, Activity and Feedback]

[Diagram: Alexa Connected Home (CoHo) Skills and WeMo Lighting Skills, with the labels Voice, Intent and Activity]

Alexa Portable Platform Design
• Alexa Bracelet
• Alexa Voice Service on Wi-Fi MCU
• Portable Voice Service

Alexa Portable Platform Design (cont.)
• HW Block Diagram
  • MCU: RTL8195 (ARM Cortex M3)
  • Audio Codec: SGTL5000, connected over I2C and I2S
  • Push Button, connected over GPIO

• SW Architecture (blocks shown in the diagram)
  • HAL
  • WLAN Driver
  • Device Driver (SPI/USB/SDIO/UART/I2C/I2S/GPIO)
  • Wi-Fi WPS
  • Wi-Fi API
  • Arduino API Layer
  • Network Stack: IP, TCP/IP, Socket, HTTP/SSL
  • Main Task
  • AWS Access Manager
  • Codec Driver
  • Record/Playback Agent
  • AVS Client

• AVS setting
https://developer.amazon.com/edw/home.html#/

• Demo
• https://youtu.be/deCTRRs3iEI

• Next Steps
• Low Power Mode Design
• Connect to Chinese Voice Service
• Any ideas for applications?

DuerOS Introduction
• DuerOS Open Platform
• http://developer.dueros.baidu.com/openduer/main/index
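• Launched in January 2017 by Baidu Duer (度秘)
• An operating system that combines an AI conversational system with smart home appliances
• DIDP (DuerOS Intelligent Devices Platform)
• Gives smart devices the ability to hold a conversation. By integrating DuerOS smart-hardware and open-API capabilities, it offers users the following experiences:
  1. Control the device by voice to play music, check the weather and the latest news, get traffic conditions, and ask general-knowledge questions
  2. Set alarms and reminders by voice
  3. Access services by voice, such as hailing a ride or ordering takeout
  4. Use, by voice, skills created by Baidu's third-party partners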

DuerOS Introduction (cont.)
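• Technical Architecture
  • Application layer: Xiaodu (小度) Smart Device Open Platform
    • Scenario application reference designs
    • Core access components: chip modules, development kits, SDK, microphone array
    • Mechanical design, industrial design, acoustic design
  • Core layer: Xiaodu Conversational Core System
    • Conversational service (DuerOS Conversational Service)
    • Skill framework (DuerOS Bot Framework)
    • Speech recognition, voice output, screen display
  • Capability layer: Xiaodu Skills Open Platform
    • Native skills, third-party skills
    • Skill development tools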

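• Development Flow
  1. Developer certification
  2. Choose a scenario: phone, speaker, refrigerator, TV, story machine, or lightweight device (platforms: Android, Linux, mbedOS)
  3. Service configuration: device name, basic configuration, OAuth configuration
  4. Download the SDK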

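• DCS Protocol
  • DCS is the communication protocol between the DuerOS service and the device. It is a set of APIs that opens DuerOS's intelligent voice-interaction capabilities to all devices
  • The transport layer is based on HTTP/2
  • The DCS protocol consists of three parts: directives, events, and client context
    • A directive is sent by the service to the device and describes an operation the device must perform
    • An event is reported by the device to the service to notify it of something that happened on the device
    • Client context (clientContext) is the device state information that the device must include whenever it reports an event
  • [Diagram: DuerOS service, DCS Protocol, DuerOS device]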
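
To make the three DCS parts concrete, here is a minimal Python sketch of how a device might wrap an event together with its client context, and route an incoming directive to a handler. The JSON field layout (`header`, `payload`, `namespace`, `name`, `messageId`) and the example names (`voice_output`, `Speak`, `SpeechStarted`, `speaker`, `VolumeState`) are assumptions for illustration and do not come from the slides; the DCS specification remains the authority on the real message format.

```python
import json
import uuid


def build_event_message(namespace, name, payload, client_context):
    """Wrap an event with the device state that is reported alongside it."""
    return {
        # Client context: device state attached to every reported event.
        "clientContext": client_context,
        # Event: something that happened on the device.
        "event": {
            "header": {
                "namespace": namespace,          # assumed field name
                "name": name,                    # assumed field name
                "messageId": str(uuid.uuid4()),  # assumed field name
            },
            "payload": payload,
        },
    }


def dispatch_directive(message, handlers):
    """Route a directive from the service to the handler registered for it."""
    header = message["directive"]["header"]
    key = (header["namespace"], header["name"])
    handler = handlers.get(key)
    if handler is None:
        raise KeyError(f"no handler for directive {key}")
    return handler(message["directive"]["payload"])


if __name__ == "__main__":
    # Hypothetical device state and event names, for illustration only.
    context = [{"header": {"namespace": "speaker", "name": "VolumeState"},
                "payload": {"volume": 50}}]
    event = build_event_message("voice_output", "SpeechStarted",
                                {"token": "abc"}, context)
    print(json.dumps(event, indent=2))

    handlers = {("voice_output", "Speak"): lambda p: f"speaking {p['token']}"}
    directive = {"directive": {"header": {"namespace": "voice_output",
                                          "name": "Speak"},
                               "payload": {"token": "abc"}}}
    print(dispatch_directive(directive, handlers))
```

Keeping the handler table keyed by (namespace, name) is one simple way to let a small device support only the directives it can actually execute.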

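Voice Input
• dialogRequestId
  • For every voice request, the device must generate a unique dialogRequestId that identifies that dialog
  • When the service sends down the directives for that dialog, the response carries the same id
• Audio Payload
  • "AUDIO_L16_RATE_16000_CHANNELS_1": 16-bit linear PCM audio, 16 kHz sample rate, mono, little-endian byte order
• Improving voice response speed
  • To respond to the user faster, send the ListenStarted event HTTP request as soon as the user starts speaking, and stream the audio up in real time while the user is still talking instead of waiting until the user has finished. The recommended chunk size is 10 ms of audio: write each 10 ms of audio data to the HTTP request stream, then flush the buffer.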
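
The chunk size follows from the audio format: 16,000 samples/s × 2 bytes per sample × 1 channel × 0.010 s = 320 bytes per 10 ms chunk. The Python sketch below generates a per-request `dialogRequestId` and slices a PCM buffer into 10 ms chunks ready to be written to the request stream. The function names and the use of a UUID as the id are assumptions for illustration; the actual HTTP/2 request, write, and flush calls are deliberately left out.

```python
import uuid

SAMPLE_RATE_HZ = 16000   # from AUDIO_L16_RATE_16000_CHANNELS_1
BYTES_PER_SAMPLE = 2     # 16-bit linear PCM
CHANNELS = 1             # mono
CHUNK_MS = 10            # chunk duration suggested on the slide

# 16000 * 2 * 1 * 10 / 1000 = 320 bytes per chunk
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * CHUNK_MS // 1000


def new_dialog_request_id():
    """Return a unique id for one voice request (UUID is an assumption)."""
    return str(uuid.uuid4())


def iter_chunks(pcm, chunk_bytes=CHUNK_BYTES):
    """Yield consecutive chunks of the PCM buffer; the last may be shorter."""
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]


if __name__ == "__main__":
    dialog_request_id = new_dialog_request_id()
    # One second of silence stands in for microphone data.
    one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)
    for chunk in iter_chunks(one_second):
        # A real client would write the chunk to the open HTTP request
        # stream here and flush, tagged with dialog_request_id.
        pass
    print(dialog_request_id, CHUNK_BYTES)
```

On a live device the buffer would be replaced by reads from the codec driver, each sized to `CHUNK_BYTES`, so that upload begins while the user is still speaking.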

• Development Kits
  • Raspberry Pi 3 / 2 MIC
  • MTK MT8516
  • Realtek Ameba / 2 MIC
  • HF-LPB200U
  • RK3229 / 6 MIC
  • Allwinner (全志) R16 / 6 MIC
