Warning
Please note that the documentation you are currently viewing is for an older version of our technology. While it is still functional, we recommend upgrading to our latest and more efficient option to take advantage of all the improvements we’ve made.
Convert Audio File with IBM API
This article contains an overview of IBM APIs, and a short example using the Speech-to-Text API.
Introduction to IBM APIs
IBM is known for its hardware and software, but it also provides many APIs on the IBM Cloud. These APIs offer a variety of services, some free and some paid.
The IBM Cloud Docs: APIs page provides the specification for each API.
You may want to start by reading the Creating Apps tutorial.
The HTTP request
To use these APIs, you need to formulate HTTP requests. The documentation provides request examples for each API. For a complete example of formulating cURL requests in AIMMS, see Use the IBM Image Recognition API.
Authentication
To access these APIs, each request must be authenticated.
IBM Cloud uses Identity and Access Management (IAM) authentication, a token-based system. Authentication is done through the Authorization request header, using either a token or an API key.
If you use an API key, you must follow the Basic Access Authentication format.
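The Basic Access Authentication format can be illustrated outside AIMMS with a minimal Python sketch. The helper name `basic_auth_header` and the placeholder key are hypothetical; only the user name "apikey" comes from this article.

```python
import base64


def basic_auth_header(api_key: str, username: str = "apikey") -> str:
    """Return the Authorization header value for Basic Access Authentication."""
    # The credentials are joined as "username:password", without extra spaces.
    credentials = f"{username}:{api_key}"
    # The joined string is base64-encoded as a whole, not piece by piece.
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token


if __name__ == "__main__":
    # "my-secret-key" is a placeholder, not a real API key.
    print(basic_auth_header("my-secret-key"))
```

The returned string is what goes into the Authorization header of the request.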
Answers
The response sent back by an IBM API is usually a JSON file. Such a JSON file can be read into AIMMS using the Data Exchange Library.
Example
Here we use the Speech-to-Text API from IBM. By sending an audio file, we obtain a transcript of what is spoken in it.
Prerequisites
Obtain an API key: Go to IBM Cloud Docs: Speech-to-Text API and click Sign up in the top corner to create an IBM account.
Install the AIMMS HTTP Client Library: See AIMMS Documentation: Adding the HTTP Client Library for details.
Example project
You can find the completed project example below. Note that you’ll need to provide your own API key.
You’ll also need to download the audio file used in the example and unzip it at the root of your project folder.
Speech-to-Text conversion code
The final code is shown below:
! Indicate the source and destination files.
! Ensure SP_requestFileName contains a valid path to the audio file.
SP_responseFileName := "Answer.json"; ! name of the JSON file that will contain the spoken text
! Ensure you have a valid SP_apikey.

! Endpoint given on the IBM Cloud website
SP_requestURI := "https://gateway-lon.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true";

web::request_create(SP_requestId);
web::request_setURL(SP_requestId, SP_requestURI);
web::request_setMethod(SP_requestId, "POST");
web::request_getHeaders(SP_requestId, SP_myHttpHeaders);
! Authentication for the server
web::base64_encode( "apikey" + ":" + SP_apikey, SP_authorization);
SP_myHttpHeaders[ 'Authorization' ] := "Basic " + SP_authorization;
web::request_setHeaders(SP_requestId, SP_myHttpHeaders);
web::request_setRequestBody(SP_requestId, 'File', SP_requestFileName);
web::request_setResponseBody(SP_requestId, 'File', SP_responseFileName);
web::request_getOptions(SP_requestId, SP_Coption);
SP_Coption['requestTimeout'] := "50";
web::request_setOptions(SP_requestId, SP_Coption);
web::request_invoke(SP_requestId, P_responseCode);
This code requires the following identifiers to be declared:
Parameter P_responseCode;
StringParameter SP_Coption {
    IndexDomain: op;
}
Set S_Clientop {
    Index: op;
}
StringParameter SP_requestId;
StringParameter SP_requestURI;
StringParameter SP_myHttpHeaders {
    IndexDomain: web::httpHeader;
}
StringParameter SP_responseFileName;
StringParameter SP_requestFileName;
StringParameter SP_apikey;
StringParameter SP_authorization;
In this article, we will analyze only selections of the code. You can read more generally about HTTP requests in AIMMS in Extract XML File from a Server with the HTTP Library.
Authentication header
Following Basic Access Authentication, we need to set the Authorization header to Basic followed by the base64-encoded string username:password. Here, the username is “apikey” and the password is the API key value. The two are joined with a colon and encoded together as a single string.
To do so, we use the following code:
! Ensure you have a valid SP_apikey.

! Get the current headers
web::request_getHeaders(SP_requestId, SP_myHttpHeaders);

! Encode the string "apikey:{API_KEY}" in base64
web::base64_encode( "apikey" + ":" + SP_apikey, SP_authorization);

! Set the Authorization header to "Basic " + encoded string
SP_myHttpHeaders[ 'Authorization' ] := "Basic " + SP_authorization;

! Write the updated headers back to the request
web::request_setHeaders(SP_requestId, SP_myHttpHeaders);
Options
You can also use options to control how the request is executed.
According to AIMMS Documentation: HTTP Client Library, one of these options is requestTimeout.
In some cases, as in this example, the API takes longer to process the request than the current requestTimeout allows. You can then give the request more time to complete by raising this option:
web::request_getOptions(SP_requestId, SP_Coption);
SP_Coption['requestTimeout'] := "50";
web::request_setOptions(SP_requestId, SP_Coption);
After executing the complete code, you should find the JSON response in the file specified by SP_responseFileName. With a plain file name such as "Answer.json", this is the root of your project folder.
Reading JSON
An example JSON file sent back is:
{
    "results": [
        {
            "alternatives": [
                {
                    "confidence": 0.99,
                    "transcript": "the space shuttle ... seven forty seven "
                }
            ],
            "final": "true"
        }
    ],
    "result_index": 0,
    "warnings": [
        "Unknown arguments: continuous."
    ]
}
The actual transcript is the value of the "transcript" field of the first alternative.
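To make the nesting of this response explicit, here is a minimal Python sketch, independent of AIMMS, that walks the same structure and collects the transcripts. The function name `extract_transcripts` is hypothetical; the payload is the example response shown above.

```python
import json

# The example response from this article, stored as a string.
response_text = """
{
  "results": [
    {
      "alternatives": [
        {"confidence": 0.99, "transcript": "the space shuttle ... seven forty seven "}
      ],
      "final": "true"
    }
  ],
  "result_index": 0,
  "warnings": ["Unknown arguments: continuous."]
}
"""


def extract_transcripts(payload: dict) -> list:
    """Collect the transcript of every alternative of every result."""
    return [
        alternative["transcript"]
        for result in payload.get("results", [])
        for alternative in result.get("alternatives", [])
    ]


payload = json.loads(response_text)
transcripts = extract_transcripts(payload)
print(transcripts)
```

The two nested loops correspond to the two arrays in the response: "results" on the outside and "alternatives" inside each result.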
We can map this data to AIMMS identifiers using the following XML mapping file:
<AimmsJSONMapping>
    <ObjectMapping>
        <ArrayMapping name="results">
            <ObjectMapping iterative-binds-to="i0" >
                <ArrayMapping name="alternatives">
                    <ObjectMapping iterative-binds-to="i1" >
                        <ValueMapping name="confidence" maps-to="p_confidence(i0,i1)"/>
                        <ValueMapping name="transcript" maps-to="sp_transcript(i0,i1)"/>
                    </ObjectMapping>
                </ArrayMapping>
                <ValueMapping name="final" maps-to="sp_final(i0)"/>
            </ObjectMapping>
        </ArrayMapping>
        <ValueMapping name="result_index" maps-to="p_resultIndex"/>
        <ArrayMapping name="warnings">
            <ValueMapping iterative-binds-to="i_msg" maps-to="sp_mgs(i_msg)"/>
        </ArrayMapping>
    </ObjectMapping>
</AimmsJSONMapping>
Here the transcript is mapped to the AIMMS string parameter sp_transcript by the ValueMapping element named "transcript".
To read in this data, we use the following procedure:
Procedure pr_ReadJSON {
    Body: {
        empty Declaration_data ;
        dex::AddMapping("map", "map.xml");
        dex::ReadFromFile(
            dataFile         : "Answer.json",
            mappingName      : "map",
            emptyIdentifiers : 0,
            resetCounters    : 1);
    }
}
Finally, we can select the one non-empty element of sp_transcript by a summation over both indices (for strings, addition is concatenation).
StringParameter sp_FinalTranscript {
    Definition: sum( (i0,i1), sp_transcript(i0, i1) );
}
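The idea behind this definition can be sketched in Python, outside AIMMS: the transcripts are stored in a table keyed by the pair (i0, i1), and the "sum" over that table is a concatenation. The function names and the sample payload below are hypothetical, and numbering the elements from 1 is an assumption made for illustration.

```python
def index_transcripts(payload: dict) -> dict:
    """Flatten results/alternatives into a {(i0, i1): transcript} table."""
    table = {}
    # Assumption: elements are numbered from 1 in both dimensions.
    for i0, result in enumerate(payload.get("results", []), start=1):
        for i1, alternative in enumerate(result.get("alternatives", []), start=1):
            table[(i0, i1)] = alternative.get("transcript", "")
    return table


def final_transcript(table: dict) -> str:
    """Concatenate all entries in index order, mimicking the string-valued sum."""
    return "".join(table[key] for key in sorted(table))


# Hypothetical payload with a single result and a single alternative.
payload = {
    "results": [
        {"alternatives": [{"confidence": 0.99, "transcript": "hello world "}]}
    ]
}
table = index_transcripts(payload)
print(table)
print(final_transcript(table))
```

With a single non-empty entry, the concatenation simply returns that entry, which is why the summation works as a selection here.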