README.md
Lines changed: 54 additions & 10 deletions
@@ -26,20 +26,20 @@ Some examples what you can do with this:
* Build voice-enabled chatbot services (for example, IVR systems)
* see the [Rasa Custom Voice Channel](./connectors/rasa)
* Classification of audio file transcriptions
-* [Automated Testing](https://chatbotslife.com/testing-alexa-skills-with-avs-mocha-and-botium-f6c22549f66e) of Voice services with [Botium](https://medium.com/@floriantreml/botium-in-a-nutshell-part-1-overview-f8d0ceaf8fb4)
+* [Automated Testing](https://wiki.botiumbox.com/how-to-guides/voice-app-testing/) of Voice services with [Botium](https://botium.ai)
## Installation
### Software and Hardware Requirements
-* 8GB of RAM (accessible for Docker) and 40GB free HD space
+* 8GB of RAM (accessible for Docker) and 40GB free HD space (for full installation)
For the major cloud providers there are additional docker-compose files. With those, the installation is slimmer, because only the *frontend* service is required. For instance, add your Azure subscription key and Azure region key to the file *docker-compose-azure.yml* and start the services:
+
+> docker-compose -f docker-compose-azure.yml up -d
+
### Optional: Build Docker Images
You can optionally build your own Docker images (if you made any changes in this repository, for instance to download the latest version of a model). Clone or download this repository and run docker-compose:
@@ -74,6 +80,15 @@ Configuration changes with [environment variables](./frontend/resources/.env). S
**Recommendation:** Do not change the _.env_ file; instead, create a _.env.local_ file to override the default settings. This prevents trouble on a future _git pull_.

+### Request-Specific Configuration
+
+If there is a JSON-formatted request body, or a multipart request body, certain sections are considered:
+
+* **credentials** to override the server default credentials for cloud services
+* **config** to override the server default settings for the cloud API calls
+
+*See samples below*
+
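As a rough illustration of a request body carrying these two sections, the Python sketch below assembles a JSON payload. Only the top-level keys `credentials` and `config` come from the description above. The helper name `build_request_body` and every nested key (`subscriptionKey`, `region`, `languageCode`) are placeholder assumptions, not the server's actual schema; consult swagger.json for the real field names.

```python
import json


def build_request_body(credentials=None, config=None):
    """Assemble a JSON request body with the optional override sections."""
    body = {}
    if credentials is not None:
        body["credentials"] = credentials
    if config is not None:
        body["config"] = config
    return json.dumps(body)


# The nested keys below are hypothetical and only show the overall shape.
payload = build_request_body(
    credentials={"subscriptionKey": "<your-key>", "region": "<your-region>"},
    config={"languageCode": "en-US"},
)
print(payload)
```

Leaving out a section simply omits it from the body, in which case the server defaults would presumably apply.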
### Securing the API
78
93
79
94
The environment variable _BOTIUM_API_TOKENS_ contains a list of valid API Tokens accepted by the server (separated by whitespace or comma). The HTTP Header _BOTIUM_API_TOKEN_ is validated on each call to the API.
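The token check described here can be sketched in Python as follows. This is an illustration of the described behaviour, not the server's own code: the function names `parse_api_tokens` and `is_authorized` are hypothetical, and how the server behaves when no tokens are configured is not covered.

```python
import re


def parse_api_tokens(raw):
    """Split a BOTIUM_API_TOKENS-style value on whitespace or commas."""
    return {token for token in re.split(r"[\s,]+", raw.strip()) if token}


def is_authorized(headers, valid_tokens):
    """Return True if the BOTIUM_API_TOKEN header carries a known token."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.upper(): value for name, value in headers.items()}
    return normalised.get("BOTIUM_API_TOKEN") in valid_tokens


tokens = parse_api_tokens("token1, token2 token3")
print(is_authorized({"BOTIUM_API_TOKEN": "token2"}, tokens))
print(is_authorized({"BOTIUM_API_TOKEN": "unknown"}, tokens))
```

A client therefore needs to send one of the configured tokens in the _BOTIUM_API_TOKEN_ header on every call.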
@@ -96,14 +111,12 @@ _Attention: in Google Chrome this only works with services published as HTTPS, y
Point your browser to http://127.0.0.1/tts to open a MaryTTS interface for testing speech synthesis.
### Real Time API
-_Available for Kaldi only_
-
-There are Websocket endpoints exposed for real-time audio decoding. Find the API description in the [Kaldi GStreamer Server documentation](https://github.com/alumae/kaldi-gstreamer-server#websocket-based-client-server-protocol).

-The Websocket endpoints are:
+Audio can be streamed for real-time decoding: call the **/api/sttstream/{language}** endpoint to open a websocket stream. It returns three URLs:

-* English: ws://127.0.0.1/stt-en/client/ws/speech
-* German: ws://127.0.0.1/stt-de/client/ws/speech
+* wsUri - the Websocket URI to stream your audio to. By default, it accepts WAV-formatted audio chunks
+* statusUri - check if the stream is still open
+* endUri - end audio streaming and close the websocket

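A client for this flow might start from something like the Python sketch below. It assumes the endpoint answers with a JSON object holding `wsUri`, `statusUri` and `endUri` as top-level keys, and it picks an arbitrary chunk size; neither is confirmed by this README. The helper names and the sample URLs are hypothetical. Sending the chunks over the websocket needs a websocket client library and is left out.

```python
def extract_stream_uris(response):
    """Pick the three URIs out of a decoded /api/sttstream/{language} response."""
    missing = [key for key in ("wsUri", "statusUri", "endUri") if key not in response]
    if missing:
        raise KeyError(f"response is missing: {', '.join(missing)}")
    return response["wsUri"], response["statusUri"], response["endUri"]


def iter_audio_chunks(audio, chunk_size=4096):
    """Yield successive slices of the audio bytes, at most chunk_size long."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


# Hypothetical response, for illustration only.
ws_uri, status_uri, end_uri = extract_stream_uris({
    "wsUri": "ws://127.0.0.1/example/ws",
    "statusUri": "http://127.0.0.1/example/status",
    "endUri": "http://127.0.0.1/example/end",
})
for chunk in iter_audio_chunks(b"\x00" * 10000):
    pass  # each chunk would be sent to ws_uri here
```

Once all chunks are sent, calling the `endUri` closes the stream, as described in the list above.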
## File System Watcher
@@ -125,6 +138,18 @@ See [swagger.json](./frontend/src/swagger.json):
> curl -X POST "http://127.0.0.1/api/stt/en" -H "Content-Type: audio/wav" -T sample.wav
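
For scripted use, the same call could be prepared from Python's standard library roughly as follows. The URL and the Content-Type header mirror the curl command above. The helper name `build_stt_request` and the placeholder audio bytes are assumptions, and the response format is not described here, so the sending step is only indicated in comments.

```python
import urllib.request


def build_stt_request(audio, language="en", base_url="http://127.0.0.1"):
    """Build (but do not send) a POST request mirroring the curl call."""
    return urllib.request.Request(
        url=f"{base_url}/api/stt/{language}",
        data=audio,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


# Placeholder bytes, not a valid WAV file; read your real audio file instead.
request = build_stt_request(b"RIFF....WAVE", language="en")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```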

+* HTTP POST to **/api/stt/{language}** for Speech-To-Text with Google, including credentials