codeforequity-at
diff --git a/Makefile b/Makefile
Lines changed: 4 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 54 additions & 10 deletions
diff --git a/connectors/sapcai/server/index.js b/connectors/sapcai/server/index.js
Lines changed: 3 additions & 2 deletions
diff --git a/connectors/ws/.gitignore b/connectors/ws/.gitignore
Lines changed: 3 additions & 0 deletions
diff --git a/connectors/ws/file.js b/connectors/ws/file.js
Lines changed: 31 additions & 0 deletions
diff --git a/connectors/ws/package.json b/connectors/ws/package.json
Lines changed: 14 additions & 0 deletions
diff --git a/connectors/ws/record.js b/connectors/ws/record.js
Lines changed: 27 additions & 0 deletions
diff --git a/connectors/ws/sample.raw b/connectors/ws/sample.raw
344 KB
diff --git a/connectors/ws/sample.wav b/connectors/ws/sample.wav
345 KB
diff --git a/connectors/ws/simple.js b/connectors/ws/simple.js
Lines changed: 27 additions & 0 deletions
@@ -51,3 +51,7 @@ docker_latest_release:
 
 	docker tag botium/botium-speech-dictate:$(VERSION) botium/botium-speech-dictate:latest
 	docker push botium/botium-speech-dictate:latest
+
+develop: docker_build_develop docker_publish_develop
+
+release: docker_build_release docker_publish_release docker_latest_release
@@ -26,20 +26,20 @@ Some examples what you can do with this:
 * Build voice-enabled chatbot services (for example, IVR systems)
   * see the [Rasa Custom Voice Channel](./connectors/rasa)
 * Classification of audio file transcriptions
-* [Automated Testing](https://chatbotslife.com/testing-alexa-skills-with-avs-mocha-and-botium-f6c22549f66e) of Voice services with [Botium](https://medium.com/@floriantreml/botium-in-a-nutshell-part-1-overview-f8d0ceaf8fb4)
+* [Automated Testing](https://wiki.botiumbox.com/how-to-guides/voice-app-testing/) of Voice services with [Botium](https://botium.ai)
 
 ## Installation
 
 ### Software and Hardware Requirements
 
-* 8GB of RAM (accessible for Docker) and 40GB free HD space
+* 8GB of RAM (accessible for Docker) and 40GB free HD space (for full installation)
 * Internet connectivity
 * [docker](https://docs.docker.com/)
 * [docker-compose](https://docs.docker.com/compose/)
 
-_Note: memory usage can be reduced if only one language is required - default configuration comes with two languages._
+_Note: memory usage can be reduced if only one language for Kaldi is required - default configuration comes with two languages._
 
-### Use Prebuilt Docker Images
+### Full Installation (Prebuilt Docker Images)
 
 Clone or download this repository and start with docker-compose:
 
@@ -51,6 +51,12 @@ This will download the latest released prebuilt images from Dockerhub. To downlo
 
 Point your browser to http://127.0.0.1 to open the [Swagger UI](https://swagger.io/tools/swagger-ui/) and browse/use the API definition.
 
+### Slim Cloud-Specific Installation (Prebuilt Docker Images)
+
+Additional docker-compose files are provided for the major cloud providers. With these, the installation is slimmer, because only the *frontend* service is required. For instance, add your Azure subscription key and Azure region key to the file *docker-compose-azure.yml* and start the services:
+
+    > docker-compose -f docker-compose-azure.yml up -d
+
 ### Optional: Build Docker Images
 
 You can optionally build your own docker images (if you made any changes in this repository, for instance to download the latest version of a model). Clone or download this repository and run docker-compose:
@@ -74,6 +80,15 @@ Configuration changes with [environment variables](./frontend/resources/.env). S
 
 **Recommendation:** Do not change the _.env_ file; instead, create a _.env.local_ file to override the default settings. This prevents conflicts on future _git pull_ runs.
 
+### Request-Specific Configuration
+
+If the request has a JSON-formatted or multipart body, the following sections are evaluated:
+
+* **credentials** to override the server default credentials for cloud services
+* **config** to override the server default settings for the cloud API calls
+
+*See samples below*
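
Review note: the two sections above can be sketched as a small helper that serialises them into a provider-named multipart field, the shape used by the curl samples in this README (for example `-F "google={...}"`). The helper name `buildProviderField` and its return shape are hypothetical and not part of this repository; only the section names **credentials** and **config** come from the README.

```javascript
// Hypothetical helper, not part of the repository: builds the provider-named
// multipart field used in the curl samples (e.g. -F "google={...}").
// Only the "credentials" and "config" section names are taken from the README.
function buildProviderField (provider, { credentials, config } = {}) {
  const sections = {}
  if (credentials) sections.credentials = credentials
  if (config) sections.config = config
  return { name: provider, value: JSON.stringify(sections) }
}

const field = buildProviderField('google', { config: { encoding: 'MP3' } })
console.log(`-F '${field.name}=${field.value}'`)
// prints: -F 'google={"config":{"encoding":"MP3"}}'
```

Building the JSON with `JSON.stringify` instead of hand-escaping quotes avoids the quoting mistakes that are easy to make in shell one-liners.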
+
 ### Securing the API
 
 The environment variable _BOTIUM_API_TOKENS_ contains a list of valid API Tokens accepted by the server (separated by whitespace or comma). The HTTP Header _BOTIUM_API_TOKEN_ is validated on each call to the API.
@@ -96,14 +111,12 @@ _Attention: in Google Chrome this only works with services published as HTTPS, y
 Point your browser to http://127.0.0.1/tts to open a MaryTTS interface for testing speech synthesis.
 
 ### Real Time API
-_Available for Kaldi only_
-
-There are Websocket endpoints exposed for real-time audio decoding. Find the API description in the [Kaldi GStreamer Server documentation](https://github.com/alumae/kaldi-gstreamer-server#websocket-based-client-server-protocol).
 
-The Websocket endpoints are:
+It is possible to stream audio for real-time decoding: call the **/api/sttstream/{language}** endpoint to open a Websocket stream. It returns three URLs:
 
-* English: ws://127.0.0.1/stt-en/client/ws/speech
-* German: ws://127.0.0.1/stt-de/client/ws/speech
+* wsUri - the Websocket uri to stream your audio to. By default, it accepts wav-formatted audio-chunks
+* statusUri - check if the stream is still open
+* endUri - end audio streaming and close websocket
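
Review note: a client may want to check the response of **/api/sttstream/{language}** before opening the socket. The following is a minimal sketch, assuming the response is a JSON object carrying the three fields listed above as strings. The function name `parseStreamSession` is hypothetical, and the check for a `ws:` or `wss:` scheme is an assumption derived from the description of `wsUri`, not something the README states.

```javascript
// Hypothetical helper, not part of the repository: validates the response of
// /api/sttstream/{language}. Only the field names wsUri, statusUri and endUri
// come from the README; the scheme check is an assumption.
function parseStreamSession (data) {
  const session = {}
  for (const key of ['wsUri', 'statusUri', 'endUri']) {
    if (!data || typeof data[key] !== 'string' || data[key].length === 0) {
      throw new Error(`Stream session response is missing "${key}"`)
    }
    session[key] = data[key]
  }
  // Audio is streamed to a Websocket endpoint, so expect ws:// or wss://
  const { protocol } = new URL(session.wsUri)
  if (protocol !== 'ws:' && protocol !== 'wss:') {
    throw new Error(`Unexpected protocol for wsUri: ${protocol}`)
  }
  return session
}
```

Failing early here gives a clearer error than a rejected `new WebSocket(undefined)` further down in the client.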
 
 ## File System Watcher
 
@@ -125,6 +138,18 @@ See [swagger.json](./frontend/src/swagger.json):
 
     > curl -X POST "http://127.0.0.1/api/stt/en" -H "Content-Type: audio/wav" -T sample.wav
 
+* HTTP POST to **/api/stt/{language}** for Speech-To-Text with Google, including credentials
+
+    > curl -X POST "http://127.0.0.1/api/stt/en-US?stt=google" -F "google={\"credentials\": {\"private_key\": \"xxx\", \"client_email\": \"xxx\"}}" -F content=@sample.wav
+
+* HTTP POST to **/api/stt/{language}** for Speech-To-Text with Google, including switch to MP3 encoding
+
+    > curl -X POST "http://127.0.0.1/api/stt/en-US?stt=google" -F "google={\"config\": {\"encoding\": \"MP3\"}}" -F content=@sample.mp3
+
+* HTTP POST to **/api/stt/{language}** for Speech-To-Text with IBM, including credentials
+
+    > curl -X POST "http://127.0.0.1/api/stt/en-US?stt=ibm" -F "ibm={\"credentials\": {\"apikey\": \"xxx\", \"serviceUrl\": \"xxx\"}}" -F content=@sample.wav
+
 * HTTP GET to **/api/tts/{language}?text=...** for Text-To-Speech
 
     > curl -X GET "http://127.0.0.1/api/tts/en?text=hello%20world" -o tts.wav
@@ -155,6 +180,25 @@ This project is standing on the shoulders of giants.
 
 ## Changelog
 
+### 2022-03-06
+* Voice effects to consider audio file length
+
+### 2022-02-28
+
+* Applied Security Best Practices (not run as root user)
+
+### 2022-01-12
+
+* Added support for Azure Speech Services
+
+### 2021-12-07
+
+* Added endpoints for streaming audio and responses
+
+### 2021-12-01
+
+* Added option to hand over cloud credentials in request body
+
 ### 2021-01-26
 
 * Added several profiles for adding noise or other audio artifacts to your files
 
@@ -6,7 +6,7 @@ const app = require('express')()
 const http = require('http').Server(app)
 const io = require('socket.io')(http)
 
-const SAPCAI_TOKEN = '5ee1b84709db76f5bbff8ea14dc9ad85'
+const SAPCAI_TOKEN = process.env.SAPCAI_TOKEN
 
 app.use(cors())
 
@@ -94,7 +94,8 @@ io.on('connection', (socket) => {
             method: 'GET',
             url: 'https://speech.botiumbox.com/api/tts/en',
             params: {
-              text: message.content
+              text: message.content,
+              voice: 'dfki-poppy-hsmm'
             },
             responseType: 'arraybuffer'
           }
 
@@ -0,0 +1,3 @@
+node_modules
+package-lock.json
+test.js
@@ -0,0 +1,31 @@
+const fs = require('fs')
+const _ = require('lodash')
+const axios = require('axios').default
+const { WebSocket } = require('ws')
+
+const sampleBuffer = fs.readFileSync('sample.raw')
+const playCount = 1
+const showInterim = false
+
+const main = async () => {
+
+  const { data } = await axios.get('http://localhost:56000/api/sttstream/en?stt=kaldi')
+  const ws = new WebSocket(data.wsUri)
+
+  ws.on('open', () => {
+    for (let i = 0; i < playCount; i++) {
+      setTimeout(() => _.chunk(sampleBuffer, 10000).forEach(c => ws.send(Buffer.from(c))), i * 1000)
+    }
+    setTimeout(() => axios.get(data.endUri), 3000 + playCount * 1000)
+    setTimeout(() => ws.close(), 5000 + playCount * 1000)
+  })
+
+  ws.on('message', (data) => {
+    try {
+      const dj = JSON.parse(data)
+      if (showInterim || dj.final) console.log('received: %s', dj.text)
+    } catch (err) {
+      // Ignore messages that are not valid JSON
+    }
+  })
+}
+main().catch(err => console.error(err.message))
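
Review note: `_.chunk` applied to a Buffer yields plain arrays of byte values, which the sample then converts back with `Buffer.from`. A minimal alternative sketch, assuming only Node's built-in `Buffer` API, slices the audio with `Buffer.subarray` so that each chunk is a view on the original memory. The function name `chunkBuffer` is hypothetical.

```javascript
// Hypothetical helper, not part of the repository: splits a Buffer into
// chunks of at most `size` bytes. Buffer.subarray returns views onto the
// same memory, so no bytes are copied.
function chunkBuffer (buffer, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer')
  }
  const chunks = []
  for (let offset = 0; offset < buffer.length; offset += size) {
    // subarray clamps the end index, so the last chunk may be shorter
    chunks.push(buffer.subarray(offset, offset + size))
  }
  return chunks
}
```

In the sample this would read `chunkBuffer(sampleBuffer, 10000).forEach(c => ws.send(c))`, which also removes the lodash dependency from this script.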
@@ -0,0 +1,14 @@
+{
+  "name": "ws-sample",
+  "version": "1.0.0",
+  "scripts": {
+    "file": "node file.js",
+    "record": "node record.js"
+  },
+  "dependencies": {
+    "axios": "^0.24.0",
+    "lodash": "^4.17.21",
+    "node-record-lpcm16": "^1.0.1",
+    "ws": "^8.3.0"
+  }
+}
@@ -0,0 +1,27 @@
+const recorder = require('node-record-lpcm16')
+const axios = require('axios').default
+const { WebSocket } = require('ws')
+
+const main = async () => {
+
+  const { data } = await axios.get('http://localhost:56000/api/sttstream/en?stt=kaldi')
+  const ws = new WebSocket(data.wsUri)
+
+  ws.on('open', () => {
+    recorder
+    .record({
+      sampleRate: 16000,
+      threshold: 0, // silence threshold
+      recorder: 'rec', // try also "arecord" or "sox"
+      silence: '5.0' // seconds of silence before ending
+    })
+    .stream()
+    .on('error', console.error)
+    .on('data', (data) => ws.send(data))
+  })
+
+  ws.on('message', (data) => {
+    console.log('received: %s', data);
+  })
+}
+main().catch(err => console.error(err.message))
@@ -0,0 +1,27 @@
+const fs = require('fs')
+const _ = require('lodash')
+const axios = require('axios').default
+const { WebSocket } = require('ws')
+
+const sampleBuffer = fs.readFileSync('sample.wav')
+
+const main = async () => {
+
+  const { data } = await axios.get('http://localhost:56000/api/sttstream/en-US?stt=google')
+  const ws = new WebSocket(data.wsUri)
+
+  ws.on('open', () => {
+    ws.send(sampleBuffer)
+    setTimeout(() => axios.get(data.endUri), 3000)
+    setTimeout(() => ws.close(), 5000)
+  })
+
+  ws.on('message', (data) => {
+    try {
+      const dj = JSON.parse(data)
+      if (dj.final) console.log('received %s-%s: %s ', dj.start, dj.end, dj.text)
+    } catch (err) {
+      // Ignore messages that are not valid JSON
+    }
+  })
+}
+main().catch(err => console.error(err.message))
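
Review note: the message handlers in `file.js` and `simple.js` repeat the same parse-and-filter logic. A minimal sketch of a shared helper follows, assuming each Websocket message is a JSON document with the fields `final`, `text`, `start` and `end` that the samples read. The function name `parseTranscript` is hypothetical, and nothing beyond those four fields is known about the payload.

```javascript
// Hypothetical helper, not part of the repository: parses one Websocket
// message. The fields final, text, start and end are the ones read by the
// sample scripts; anything else about the payload is an assumption.
function parseTranscript (raw) {
  let message
  try {
    // ws delivers messages as Buffers, so convert to a string first
    message = JSON.parse(String(raw))
  } catch (err) {
    return null // not JSON, nothing to report
  }
  if (!message || typeof message !== 'object' || !message.final) return null
  return { text: message.text, start: message.start, end: message.end }
}
```

A handler could then be written as `ws.on('message', (data) => { const t = parseTranscript(data); if (t) console.log(t.text) })`.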