Warning

Please note that the documentation you are currently viewing is for an older version of our technology. While it is still functional, we recommend upgrading to our latest and more efficient option to take advantage of all the improvements we’ve made.

Convert Audio File with IBM API

This article contains an overview of IBM APIs, and a short example using the Speech-to-Text API.

Introduction to IBM APIs

IBM is known for its hardware and software, but it also provides many APIs on IBM Cloud. These APIs cover a variety of services, some free and some paid.

The IBM Cloud Docs: APIs page provides the specification for each API.

../../_images/apiinfo.png

You may want to start by reading the Creating Apps tutorial.

The HTTP Request

To use these APIs, you need to formulate HTTP requests. The documentation provides request examples for each API. For a complete example of formulating cURL requests in AIMMS, see Use the IBM Image Recognition API.

Authentication

Every request to these APIs must be authenticated.

../../_images/authentication.png

The system used is Identity and Access Management (IAM) authentication, a token-based system. Authentication is done through the Authorization request header, using either a token or an API key.

If you use an API key, the header must follow the Basic Access Authentication format.

Answers

The response sent back by an IBM API is usually a JSON file. Such a JSON file can be read in using the Data Exchange Library.

Example

Here we use the Speech-to-Text API from IBM. By sending an audio file, we obtain a transcript of the speech it contains.

Prerequisites

Example Project

You can find the completed project example below. Note that you’ll need to provide your own API key.

You’ll also need to download the audio file used in the example and unzip it at the root of your project folder.

Speech-to-Text Conversion Code

The final code is shown below:

! indicate source and destination file
! Ensure a valid path to SP_requestFileName ;
SP_responseFileName := "Answer.json"; ! the name of the JSON file containing the text spoken.
! Ensure you have a valid SP_apikey ;

! given on the IBM Cloud website
SP_requestURI := "https://gateway-lon.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true";

web::request_create(SP_requestId);
web::request_setURL(SP_requestId, SP_requestURI);
web::request_setMethod(SP_requestId, "POST");
web::request_getHeaders(SP_requestId, SP_myHttpHeaders);
! Authentication for the server
web::base64_encode( "apikey" + ":" + SP_apikey, SP_authorization);
SP_myHttpHeaders[ 'Authorization' ] := "Basic " + SP_authorization;
web::request_setHeaders(SP_requestId, SP_myHttpHeaders);
web::request_setRequestBody(SP_requestId, 'File', SP_requestFileName);
web::request_setResponseBody(SP_requestId, 'File', SP_responseFileName);
web::request_getOptions(SP_requestId, SP_Coption);
SP_Coption['requestTimeout'] := "50";
web::request_setOptions(SP_requestId, SP_Coption);
web::request_invoke(SP_requestId, P_responseCode);

The code relies on the following identifier declarations:

Parameter P_responseCode;
StringParameter SP_Coption {
    IndexDomain: op;
}
Set S_Clientop {
    Index: op;
}
StringParameter SP_requestId;
StringParameter SP_requestURI;
StringParameter SP_myHttpHeaders {
    IndexDomain: web::httpHeader;
}
StringParameter SP_responseFileName;
StringParameter SP_requestFileName;
StringParameter SP_apikey;
StringParameter SP_authorization;

In this article, we analyze only selected parts of the code. You can read more about HTTP requests in AIMMS in Extract XML File from a Server with the HTTP Library.

Authentication Header

Following Basic Access Authentication, we need to set the Authorization header to the word Basic followed by the base64-encoded string username:password. Here, the username is “apikey” and the password is the API key value. The combined string, including the colon, is base64-encoded as a whole.

To do so, we use the following code:

! Ensure you have a valid SP_apikey ;

! get the current headers of the request
web::request_getHeaders(SP_requestId, SP_myHttpHeaders);

! encode the string "apikey:{API_KEY}" in base64
web::base64_encode( "apikey" + ":" + SP_apikey, SP_authorization);

! set the Authorization header to "Basic " + encoded string
SP_myHttpHeaders[ 'Authorization' ] := "Basic " + SP_authorization;

! write the updated headers back to the request
web::request_setHeaders(SP_requestId, SP_myHttpHeaders);
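
As a sanity check on the value that ends up in the Authorization header, the same encoding can be reproduced outside AIMMS. The Python sketch below is illustrative only: the function name basic_auth_header and the key "my-secret-key" are placeholders that do not appear in the example project.

```python
import base64


def basic_auth_header(api_key: str, username: str = "apikey") -> str:
    """Return an Authorization header value for Basic access authentication."""
    # Join username and password with a single colon, without extra spaces.
    credentials = f"{username}:{api_key}"
    # Base64-encode the combined string as a whole, not each part separately.
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token


if __name__ == "__main__":
    # "my-secret-key" is a made-up placeholder, not a real API key.
    print(basic_auth_header("my-secret-key"))
```

Decoding the part after "Basic " should give back apikey: followed by the key, which is a quick way to verify that SP_authorization was built correctly.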

Options

You can also use options to set characteristics for the request.

From AIMMS Documentation: HTTP Client Library we learn that we can set the requestTimeout option. In some cases, as in this example, the API takes longer to respond than the default requestTimeout allows. You can then give the request more time to complete by raising this option.

web::request_getOptions(SP_requestId, SP_Coption);
SP_Coption['requestTimeout'] := "50";
web::request_setOptions(SP_requestId, SP_Coption);

After executing the complete code, you should find the JSON response in the file named by SP_responseFileName. With a relative name such as the one used here, that is expected to be the root of your project folder.
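
For comparison, the same request can be sketched outside AIMMS with the Python standard library. This is a rough equivalent under stated assumptions, not the method used in the project: the function names are hypothetical, authorization is expected to be the ready-made header value ("Basic " plus the encoded string), content_type depends on your audio format and is not specified in this article, and the timeout is assumed to be in seconds.

```python
import urllib.request


def build_recognize_request(url: str, audio: bytes, authorization: str,
                            content_type: str) -> urllib.request.Request:
    """Assemble a POST request carrying the audio bytes as its body."""
    return urllib.request.Request(
        url,
        data=audio,
        headers={
            "Authorization": authorization,
            # Without an explicit value, urllib would fall back to a
            # form-encoded content type, which does not describe audio data.
            "Content-Type": content_type,
        },
        method="POST",
    )


def send_and_save(request: urllib.request.Request, response_path: str,
                  timeout_seconds: float = 50.0) -> int:
    """Send the request, write the response body to a file, return the status code."""
    # urllib applies this timeout to blocking socket operations, so it only
    # approximates the requestTimeout option of the AIMMS HTTP Client Library.
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        body = response.read()
        status = response.status
    with open(response_path, "wb") as handle:
        handle.write(body)
    return status
```

Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses instead of returning a status code, whereas the AIMMS code receives the code in P_responseCode.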

Reading JSON

An example JSON file sent back is:

{
   "results": [
      {
         "alternatives": [
            {
               "confidence": 0.99,
               "transcript": "the space shuttle ... seven forty seven "
            }
         ],
         "final": "true"
      }
   ],
   "result_index": 0,
   "warnings": [
      "Unknown arguments: continuous."
   ]
}

The actual transcript is the value of the transcript field, on the seventh line of the file.

We can map this data to AIMMS identifiers using the following XML mapping file:

<AimmsJSONMapping>
    <ObjectMapping>
        <ArrayMapping name="results">
            <ObjectMapping iterative-binds-to="i0" >
                <ArrayMapping name="alternatives">
                    <ObjectMapping iterative-binds-to="i1" >
                        <ValueMapping name="confidence" maps-to="p_confidence(i0,i1)"/>
                        <ValueMapping name="transcript" maps-to="sp_transcript(i0,i1)"/>
                    </ObjectMapping>
                </ArrayMapping>
                <ValueMapping name="final" maps-to="sp_final(i0)"/>
            </ObjectMapping>
        </ArrayMapping>
        <ValueMapping name="result_index" maps-to="p_resultIndex"/>
        <ArrayMapping name="warnings">
            <ValueMapping iterative-binds-to="i_msg" maps-to="sp_mgs(i_msg)"/>
        </ArrayMapping>
    </ObjectMapping>
</AimmsJSONMapping>

Here the transcript is mapped to the AIMMS string parameter sp_transcript by the ValueMapping element named transcript (the eighth line of the mapping). To read in this data, we use the following procedure:

Procedure pr_ReadJSON {
    Body: {
        empty Declaration_data ;
        dex::AddMapping("map", "map.xml");
        dex::ReadFromFile(
            dataFile         :  "Answer.json",
            mappingName      :  "map",
            emptyIdentifiers :  0,
            resetCounters    :  1);
    }
}

Finally, since only one element of sp_transcript is non-empty here, we can retrieve it with a summation (for strings, addition means concatenation).

StringParameter sp_FinalTranscript {
    Definition: sum( (i0,i1), sp_transcript(i0, i1) );
}
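
To cross-check what sp_transcript and sp_FinalTranscript should contain, the same selection can be sketched in Python on the example response. This is an illustration only: the function names are hypothetical, and numbering the (i0, i1) positions from 1 is an assumption about how the mapping generates index elements.

```python
import json


def transcripts_by_index(response_text: str) -> dict:
    """Collect every transcript keyed by its (i0, i1) position in the response."""
    data = json.loads(response_text)
    table = {}
    # i0 walks the "results" array, i1 the "alternatives" array inside each result.
    # Starting both counters at 1 is an assumption made for this sketch.
    for i0, result in enumerate(data.get("results", []), start=1):
        for i1, alternative in enumerate(result.get("alternatives", []), start=1):
            table[(i0, i1)] = alternative.get("transcript", "")
    return table


def final_transcript(response_text: str) -> str:
    """Concatenate all transcripts in index order, like the summation over (i0, i1)."""
    table = transcripts_by_index(response_text)
    return "".join(table[key] for key in sorted(table))


if __name__ == "__main__":
    # Answer.json is the response file name used in this article.
    with open("Answer.json", encoding="utf-8") as handle:
        print(final_transcript(handle.read()))
```

If the service returns several results or alternatives, this concatenation joins all of them, so you may prefer to pick a single (i0, i1) entry in that case.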