How to use OpenAI text to speech via Integray

First things first. You need the Node.js connector, which is part of the Integray premium license. You also need an account on the OpenAI platform with some credit available; a few dollars is enough for testing.

Here are two links that describe how the text to speech API works.

I mainly used the second one, the API reference. It even contains a short Node.js example; that example cannot be used in Integray as is, but parts of it can. Here is a short manual on how to put it all together:

Create an endpoint with the input schema: input, voice, speed. The last two are needed only if you want them to be configurable rather than fixed.

Create a task linked to this endpoint with the same input schema. As the first step I used a JS mapper, in case I want to validate the input before passing it to Node.js:
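A minimal sketch of what such a validation step could look like. The helper name mapTtsInput and the ALLOWED_VOICES list are hypothetical, and how the mapper receives and returns rows in Integray is an assumption. The limits (1 to 4096 characters, speed 0.25 to 4.0) and the six voice names are taken from the OpenAI API reference as I remember it, so check them against the current version. The output field names match what the Node.js step reads.

```javascript
// Hypothetical validation helper for the JS mapper step.
// Voices documented for the tts-1 models; the current API reference may list more.
const ALLOWED_VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"];

function mapTtsInput(row) {
  // The text is mandatory; the API accepts at most 4096 characters.
  const text = String(row.input ?? "").trim();
  if (text.length === 0 || text.length > 4096) {
    throw new Error("input must be 1 to 4096 characters long");
  }

  // Fall back to a default voice when the requested one is unknown.
  const voice = ALLOWED_VOICES.includes(row.voice) ? row.voice : "alloy";

  // Fall back to normal speed when the value is missing or not a number,
  // then clamp it to the range the API accepts.
  const parsed = Number(row.speed ?? 1);
  const speed = Number.isFinite(parsed) ? Math.min(4, Math.max(0.25, parsed)) : 1;

  return { InputText: text, AudioVoice: voice, AudioSpeed: speed };
}

// Stand-in for the rows that arrive from the endpoint.
const sampleRows = [{ input: "Hello from Integray", voice: "nova", speed: 9 }];
const mappedRows = sampleRows.map(mapTtsInput);
console.log(mappedRows);
```

Clamping and falling back to defaults keeps the call to OpenAI from failing on a bad voice or speed; throwing on empty text is a design choice, and returning an empty result would work as well.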

The whole magic lies in the Node.js step that does all the application logic of calling OpenAI API:

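// Values prepared by the previous JS mapper step.
// InputText is assumed to be provided by the mapper under this name, like the two fields below.
const InputText = inputData[0].InputText
const AudioVoice = inputData[0].AudioVoice
const AudioSpeed = inputData[0].AudioSpeed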

const Body = {
    "model": "tts-1-hd",
    "input": InputText,
    "voice": AudioVoice,
    "response_format": "mp3",
    "speed": AudioSpeed,
}

const response = await fetch("https://api.openai.com/v1/audio/speech", {
  method: "POST",
  body: JSON.stringify(Body), 
  headers: {
    "Content-Type": "application/json",
    "OpenAI-Organization": "${#OpenAI_Organization}",
    "Authorization": "Bearer ${#OpenAI_Token}"
  }
});

// on an HTTP error, log the message returned by OpenAI and return no rows
if(!response.ok){
  const responseText = await response.json()
  error.warn(responseText.error.message);
  return []
} 

// read the binary audio from the response and encode it as base64
const fileData = Buffer.from(await response.arrayBuffer()).toString('base64');

return [
  {
    FileName: "speech-" + AudioVoice + ".mp3",
    FileData: fileData 
  }
]

My endpoint is set up so that it returns this JSON with the base64 data in it as a string. If you want to use it as a link that downloads the file when the endpoint is called, you would have to change it to the GET method and change the output schema to: Endpoint File Output vX.X.X.
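If you keep the JSON output, the caller has to decode the base64 string itself. Below is a minimal sketch of that, assuming the caller already has one row in the shape returned by the Node.js step (FileName, FileData). The function name saveSpeechFile is hypothetical and the sample data is a stand-in, not real MP3 content.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical consumer of the endpoint output.
// `row` has the shape returned by the Node.js step: { FileName, FileData }.
function saveSpeechFile(row, targetDir) {
  // Turn the base64 string back into the original bytes.
  const bytes = Buffer.from(row.FileData, "base64");
  // Use only the base name so the file always lands inside targetDir.
  const filePath = path.join(targetDir, path.basename(row.FileName));
  fs.writeFileSync(filePath, bytes);
  return filePath;
}

// Usage with stand-in data instead of a real endpoint response.
const sampleRow = {
  FileName: "speech-nova.mp3",
  FileData: Buffer.from("fake audio").toString("base64"),
};
const savedPath = saveSpeechFile(sampleRow, os.tmpdir());
console.log("saved to", savedPath);
```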

In my case I used it as an internal endpoint: I send the output file (name + data) to the Xeelo application, where it is displayed as an <audio> HTML element and can be played directly in the browser.
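One way to play base64 audio in the browser without storing a file is a data URI. The sketch below only illustrates the idea; it is not how Xeelo does it internally, and the function name buildAudioElement is hypothetical. The MIME type audio/mpeg matches the mp3 response format requested in the Node.js step.

```javascript
// Hypothetical helper: builds an <audio> element from the step output.
// `file` has the shape returned by the Node.js step: { FileName, FileData }.
function buildAudioElement(file) {
  // Base64 uses only A-Z, a-z, 0-9, +, / and =, so it is safe inside a quoted attribute.
  const src = "data:audio/mpeg;base64," + file.FileData;
  return '<audio controls src="' + src + '"></audio>';
}

// Usage with stand-in data (the first three bytes of an ID3 tag, not a playable file).
const sampleFile = {
  FileName: "speech-nova.mp3",
  FileData: Buffer.from("ID3").toString("base64"),
};
const audioHtml = buildAudioElement(sampleFile);
console.log(audioHtml);
```

A data URI keeps everything in one response, but it grows with the audio length, so for long texts a download link may be the better choice.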