Voice capability means the modem can digitize an incoming voice message for the computer to store and forward. It also means the modem can play back the recorded digitized voice, either off-line for local message listening or on-line for message announcement.
The main issue with digitized voice is the amount of storage required. Good phone-quality voice digitization produces about 64 Kbits of data for each second of voice, a rate that quickly fills a hard disk. Speech compression is needed to reduce the data rate. A relatively simple ADPCM (Adaptive Differential Pulse Code Modulation) algorithm can halve the speech data rate while maintaining about the same voice quality. The same algorithm can also reduce the rate to 1/3 or 1/4 of the original, but with some degradation in voice quality. Reducing the rate further while maintaining good voice quality requires a sophisticated signal processing algorithm and considerable digital signal processing power. We call this sophisticated speech compression capability advanced voice capability.
The U-1496 series modems support five voice digitization schemes. Four schemes use the ADPCM algorithm and one uses the advanced CELP (Code Excited Linear Prediction) algorithm to achieve near-phone-quality voice at a 9.6 kbps speech data rate. A summary of these five schemes is listed below:
Digitization    Speech Compression                Data Rate
Scheme          Algorithm

CELP            Code Excited Linear Prediction     9600 bps
2-ADPCM         ADPCM, 2 bits/sample              19200 bps
3-ADPCM         ADPCM, 3 bits/sample              28800 bps
3-ADPCM(NEW)    ADPCM, 3 bits/sample              30720 bps
4-ADPCM         ADPCM, 4 bits/sample              38400 bps

where the sampling rate used is 9600 samples per second.
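To make the storage trade-off concrete, the following Python sketch converts each data rate listed above into disk usage per minute of recorded speech. The function name is hypothetical, the uncompressed figure assumes that "64 Kbits" means 64,000 bits per second, and any savings from silence detection are ignored.

```python
# Data rates in bits per second, taken from the scheme summary above.
SCHEME_BPS = {
    "CELP": 9600,
    "2-ADPCM": 19200,
    "3-ADPCM": 28800,
    "3-ADPCM(NEW)": 30720,
    "4-ADPCM": 38400,
}

# Assumption: "64 Kbits" per second is read as 64,000 bits per second.
UNCOMPRESSED_BPS = 64000


def bytes_per_minute(bits_per_second):
    """Return the storage needed for one minute of voice, in bytes."""
    return bits_per_second * 60 // 8


if __name__ == "__main__":
    print("%-14s %8d bytes/min" % ("uncompressed", bytes_per_minute(UNCOMPRESSED_BPS)))
    for scheme, bps in SCHEME_BPS.items():
        print("%-14s %8d bytes/min" % (scheme, bytes_per_minute(bps)))
```

Under these assumptions, one minute of CELP speech occupies 72,000 bytes, against 480,000 bytes for the uncompressed rate.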
The ADPCM voice modes are supported on all U-1496 series modems. However, the advanced CELP scheme is supported only on the plus version and the U-1496 LCD model.
Silence detection is implemented to avoid coding voice data during silent periods.
The modem also supports simultaneous DTMF (Dual Tone Multi-Frequency) tone detection, which allows a user to use the keypad of a tone-dialing, push-button telephone set to control, instruct, or answer the modem.
Because there is no standard for how a human caller behaves when placing or answering a call, it is difficult, if not impossible, to automatically distinguish a voice call from a fax or data call. A common method is to answer with a voice announcement first and then give the calling party the option of pressing a number on the push-button pad to activate different features. In the meantime, if a fax calling tone is detected, the software and modem switch into fax mode. If no tone is detected, the modem assumes that this is a voice call and continues playing the announcement until it gives the caller the option to leave a message. It then waits for the caller to leave a message; if it detects no energy for a certain amount of time, it times out, decides the call is a data call, and starts modem handshaking. The shortcoming of this method is that some calling modems are confused by the initial voice announcement and fail to connect. Even if the data call connects successfully, the longer handshake delay may not be acceptable in some applications. The modem can identify a fax or data call more quickly if a fax or data calling tone is received, and the voice announcement can be omitted in this case. ZyXEL has moved in this direction by supporting the detection and generation of not only fax calling tones but also data calling tones.
The U-1496 series modems support a set of voice AT commands. These commands are basically consistent with the TIA TR29.2 committee IS-101 document. The command implementation is not final as of this writing and will certainly continue to be updated during the initial stage. Please refer to the manual amendment or the firmware release notes on the software disk accompanying this modem for updated details.
The following is a summary of the supported voice AT commands:
With AT as the command line prefix, a voice command takes one of the following forms:

+V<CM>?                   read current setting
+V<CM>=?                  read permissible setting
+V<CM>=<single value>     set single-valued parameter
+V<CM>=<value string>     set compound parameter

where <CM> represents a two-letter command code and a value string consists of values separated by commas or semicolons. The first two command forms are read actions; the last two are write actions. A command may support both action types, or only read or only write.
For each command line received, the modem issues a response to each command in the command line followed by a final response.
Each command response is of the form:

<CR><LF>
<value> or <value range>      response to "?" or "=?" command
<CR><LF>
The final response is:

<CR><LF>
OK or ERROR                   command line response
<CR><LF>
The final response is "OK" if all the commands in the command line have been successfully executed, otherwise it is "ERROR".
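As a rough illustration of the response framing described above, the following Python sketch splits the text received for one command line into its command responses and its final response. The function name is hypothetical, and the sketch assumes the final response is OK or ERROR only; other result codes such as VCON or CONNECT would need extra handling.

```python
def parse_response(raw):
    """Split a modem reply into (command_responses, final_response).

    `raw` is the text received for one command line. Lines are framed
    by <CR><LF>; the last non-empty line is expected to be OK or ERROR.
    """
    lines = [line.strip() for line in raw.split("\r\n")]
    lines = [line for line in lines if line]  # drop the empty framing lines
    if not lines or lines[-1] not in ("OK", "ERROR"):
        raise ValueError("no final response found")
    return lines[:-1], lines[-1]


if __name__ == "__main__":
    responses, final = parse_response("\r\n0-255\r\n\r\nOK\r\n")
    print(responses, final)
```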
Result codes:
0,2,2.0,6,8
OK
with <CR><LF> between each line.
Result codes:
The range of the <value> is from 1 to 2.
Result codes:
The response is:
0-2
OK
with <CR><LF> between each line.
The range of the <value> is from 0 to 255. The units are 0.1 seconds. The default value is 10.
Result codes:
0-255
OK
with <CR><LF> between each line.
The range of the <value> is from 0 to 255. Default value is 6.
Result codes:
The response is:
0-255
OK
with <CR><LF> between each line.
**Note: This is not defined in IS-101. Once again this was requested by customers who were using the modem in overseas countries where the duration of a DTMF tone is not standard.
The range of the <value> is from 0 to 31. Default value is 16.
Result codes:
The response is:
0-31
OK
with <CR><LF> between each line.
**Note: This is not defined in IS-101. This command was added to enable the DCE to either increase or decrease its sensitivity to DTMF tones by defining the range of the threshold. This feature was added due to requests from customers.
The range of the <value> is from 0 to 255. The default value is 192.
Result codes:
0-255
OK
with <CR><LF> between each line.
Result codes:
**Note: Complies with the IS-101
The range of the <value> is from 0 to 255. The units are 1 second. Default value is 7 sec.
Result codes:
The response is:
0-255
OK
with <CR><LF> between each line.
**Note: This does comply with IS-101
The permitted <device> is as follows:
Result codes:
0,2,8,16
OK
with <CR><LF> between each line.
The range of the <value> is from 0 to 2. The default value is 0.
The detailed description of each value is:
Result codes:
0-2
OK
with <CR><LF> between each line.
**Note: This does comply with IS-101
The range of the <value> is from 0 to 255. The units are 0.1 second. A value of zero forces the DCE to return VCON immediately after the ATD command is received.
Result codes:
The response is:
0-255
OK
with <CR><LF> between each line.
**Note: This does comply with IS-101 except that the ZyXEL default value is set at 70. We derived this value after conducting numerous tests and found it to be ideal with the majority of phone systems.
The range of the <value> is from 0 to 255. The units are 0.1 second. A value of zero forces the DCE to return VCON immediately after the ATD command is received.
Result codes:
The response is:
0-255
OK
with <CR><LF> between each line.
**Note: This does comply with IS-101 except that the ZyXEL default value is set to 57. Once again, we derived this value from numerous tests conducted on different phone systems.
The DCE begins the voice receive mode by returning the CONNECT result code to the DTE. After this report, the DCE sends <DLE> shielded voice data to the DTE.
The DTE can abort the voice receive state by sending a character other than <XON> and <XOFF>. On termination of the voice receive state, the DCE will append a <DLE><ETX> character pair, followed by the VCON result code.
See the description of the <DLE> shielded code for the difference between SILENCE and QUIET report.
The range of the sensitivity <sds> is from 0 to 31. The higher the value, the higher the sensitivity threshold level. Differences in sensitivity level are measured in dB. <sds> = 0 means silence detection is disabled or not supported. The range of the interval <sdi> is from 0 to 255, in units of 0.1 second.
Result codes:
<sds>,<sdi>
OK
with <CR><LF> between each line.
(0-31),(0-255)
OK
with <CR><LF> between each line.
**Note: It does comply with IS-101 except for the default values. Our default values are 15 for <sds> and 70 for <sdi>. These defaults were chosen to work with the majority of the phone systems found in the U.S. For example, we increased the required silence period from 5 seconds to 7 seconds because we found that people tend to pause while leaving a message, and a shorter period could cause a false alarm by reporting <DLE>s or <DLE>q and disconnecting the caller. On the other hand, increasing it further could cause the phone company to intervene, so that the recording would continue.
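The unit conversion for the silence interval is easy to get wrong, so here is a minimal Python sketch that builds a silence-detection command from a sensitivity and an interval given in seconds. The helper name is hypothetical, and the write form "AT+VSD=<sds>,<sdi>" is an assumption inferred from the general command forms and the <sds>,<sdi> response format; check the firmware release notes before relying on it.

```python
def vsd_command(sds, sdi_seconds):
    """Build a silence-detection command string (assumed AT+VSD syntax).

    sds: sensitivity, 0-31 (0 disables silence detection).
    sdi_seconds: required silence interval in seconds; it is converted
    to units of 0.1 second, which must fall in the range 0-255.
    """
    sdi = int(round(sdi_seconds * 10))
    if not 0 <= sds <= 31:
        raise ValueError("sds must be in 0-31")
    if not 0 <= sdi <= 255:
        raise ValueError("sdi must be in 0-255 (0 to 25.5 seconds)")
    return "AT+VSD=%d,%d" % (sds, sdi)


if __name__ == "__main__":
    # The ZyXEL defaults: sensitivity 15, 7 seconds of silence.
    print(vsd_command(15, 7))
```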
**Note: Only PLUS enhanced models support CELP mode.
<cml>;<scs>;<vsr>
OK
with <CR><LF> between each line. Where
1;CELP;1;0;(9600)
2;ADPCM;2;0;(9600)
3;ADPCM;3;0;(9600)
30;ADPCM;3;0;(9600)
4;ADPCM;4;0;(9600)
OK
with <CR><LF> between each line.
The first item in each line is the compression method label; the second item is the compression scheme; the third item is number of bits per sample; the fourth item is the silence detection threshold level; the fifth item is the sampling rate.
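The five-field layout just described can be parsed mechanically. The Python sketch below is one way to do it; the type and function names are hypothetical, and treating the parenthesized sampling-rate field as a comma-separated list is an assumption, since the listing above only ever shows a single rate.

```python
from collections import namedtuple

# Field names are illustrative; they follow the item order described above.
CompressionMethod = namedtuple(
    "CompressionMethod",
    ["label", "scheme", "bits_per_sample", "silence_threshold", "sampling_rates"],
)


def parse_vsm_line(line):
    """Parse one compression-method line such as '2;ADPCM;2;0;(9600)'."""
    label, scheme, bits, threshold, rates = line.strip().split(";")
    # Assumption: the last field may list several rates separated by commas.
    rate_list = [int(r) for r in rates.strip("()").split(",") if r]
    return CompressionMethod(int(label), scheme, int(bits), int(threshold), rate_list)


if __name__ == "__main__":
    for text in ("1;CELP;1;0;(9600)", "30;ADPCM;3;0;(9600)"):
        print(parse_vsm_line(text))
```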
The range of the <value> is from 0 to 255. The units are 50ms. Default value is 1 (50ms).
Result codes:
The response is:
0-255
OK
with <CR><LF> between each line.
**Note: Not defined in IS-101. When set, this timer allows the DTE to send the resync code (<DLE><FS>) to signify the start of a new voice data stream with the same parameters as the last stream, without first returning to the Voice Command State. IS-101 does mention "Timing Marks" but specifies neither the interval that can be set nor a method of adjusting it. This gives the user the flexibility to start a new voice data stream at different points in the data stream.
The tone generation string shall consist of a list of elements, with each element separated by a comma. Each element can be:
(0,200-3300),(0,200-3300),0-9,A-D,*,#
OK
with <CR><LF> between each line.
**Note: Does comply with IS-101.
The DCE begins the voice transmit mode by returning the CONNECT result code to the DTE. After this report, the DCE accepts <DLE> shielded Voice data from the DTE. The DTE can abort the voice transmit state by sending a <DLE><ETX> character pair to the DCE. On termination of the voice transmit state, the DCE will send the VCON result code.
**Note: This does comply with IS-101.
The range of the <value> is from 0 to 255. The default value is 16.
Result codes:
0-255
OK
with <CR><LF> between each line.
**Note: Not defined in the IS-101.
In voice transmit/receive mode, the BISYNC protocol should be applied to the data stream to and from the DCE. During this period, commands and responses are in <DLE> shielded form. The supported <DLE> shielded codes are as follows (all <DLE> shielded codes are case sensitive):
Code          Simple Action Command Description

<NUL>         This is something I would recommend
<DLE>         Complies with IS-101
p             Pause Transmit Data State ("Immediate Command"). Complies with IS-101
r             Resume Transmit Data State ("Immediate Command"). Complies with IS-101
<ETX>         End Transmit Data State ("Stream Command"). Complies with IS-101.
<CAN>         Clear transmit buffer of voice data ("Immediate Command"). Complies with IS-101.
<FS>or<DC2>   Concatenate transmit data streams ("Stream Command"). Complies with IS-101.
<DC4>         Clear transmit buffer of voice data and return to command mode
              ("Immediate Command"). Not defined in IS-101.
Code          Event Report Description

<DLE>         Two contiguous <DLE><DLE> codes indicate a single <DLE> in the data
              stream. Complies with IS-101
<ETX>         End Data State. Complies with IS-101
0             DTMF '0' received. Complies with IS-101
1             DTMF '1' received. Complies with IS-101
2             DTMF '2' received. Complies with IS-101
3             DTMF '3' received. Complies with IS-101
4             DTMF '4' received. Complies with IS-101
5             DTMF '5' received. Complies with IS-101
6             DTMF '6' received. Complies with IS-101
7             DTMF '7' received. Complies with IS-101
8             DTMF '8' received. Complies with IS-101
9             DTMF '9' received. Complies with IS-101
#             DTMF '#' received. Complies with IS-101
*             DTMF '*' received. Complies with IS-101
c             T.30 Facsimile Calling Tone. Complies with IS-101
e             Data Calling Tone. Complies with IS-101
s             "Presumed Hangup" (Silence) Time-out. Silence detected. The DCE has
              determined that there was no voice energy present at the beginning
              of the voice recording session, followed by a period of silence
              greater than the amount of time selected by the AT+VSD command.
              Complies with IS-101
q             "Presumed End of Message" (Quiet) Time-out. Quiet detected. The DCE
              has determined that there was voice energy present at the beginning
              of the voice recording session, followed by a period of silence
              greater than the amount of time selected by the AT+VSD command.
              Complies with IS-101
b             Busy tone. Complies with IS-101
d             Dial tone. Complies with IS-101.
DTE to DCE stream: The DCE will filter the data stream from the DTE and remove all character pairs beginning with <DLE>. The DCE will recognize <DLE><DLE> and reinsert a single <DLE> in its place. The DTE must filter the stream data it sends to the DCE and insert an extra <DLE> character ahead of each <DLE> data character.
DCE to DTE stream: The DTE must filter the data stream from the DCE and remove all character pairs beginning with <DLE>. The DTE must recognize <DLE><ETX> as the stream terminator. The DTE must recognize <DLE><DLE> and reinsert a single <DLE> in its place. The DCE will filter the stream data it sends to the DTE and insert an extra <DLE> character ahead of each <DLE> data character.
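The DTE side of these two rules can be sketched in Python as follows. The function names are hypothetical; <DLE> and <ETX> are taken to be the ASCII control characters 0x10 and 0x03. The sketch works on a whole buffer at once, so a <DLE> that arrives as the very last byte is left unprocessed on the assumption that its partner will come with the next read.

```python
DLE = 0x10  # ASCII Data Link Escape
ETX = 0x03  # ASCII End of Text


def shield(voice_data):
    """DTE -> DCE: double every <DLE> byte in outgoing voice data."""
    return bytes(voice_data).replace(bytes([DLE]), bytes([DLE, DLE]))


def unshield(stream):
    """DCE -> DTE: return (voice_data, events, terminated).

    <DLE><DLE> becomes one <DLE> data byte, <DLE><ETX> ends the stream,
    and any other <DLE><code> pair is removed and reported as an event.
    """
    data = bytearray()
    events = []
    i = 0
    while i < len(stream):
        byte = stream[i]
        if byte != DLE:
            data.append(byte)
            i += 1
            continue
        if i + 1 >= len(stream):
            break  # lone trailing <DLE>: its partner has not arrived yet
        code = stream[i + 1]
        i += 2
        if code == DLE:
            data.append(DLE)
        elif code == ETX:
            return bytes(data), events, True
        else:
            events.append(chr(code))
    return bytes(data), events, False


if __name__ == "__main__":
    sample = bytes([0x01, DLE, DLE, 0x02, DLE, ord("5"), DLE, ETX])
    print(unshield(sample))
```

Keeping the events separate from the voice data lets the caller act on a DTMF digit or calling tone without corrupting the recorded file.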
                    +--------------+
                    | RDA=Rcv Data |
                    +--------------+
                           |
                     +------------+
                     | PREVDLE=1? |
                     +------------+
                           |
             +---------------------------+
          YES|                           |NO
       +-----------+               +-----------+
       | PREVDLE=0 |               | PREVDLE=0 |
       +-----------+               +-----------+
             |                           |
        +----------+               +-----------+
        | RDA=DLE? |               | RDA=DLE ? |
        +----------+               +-----------+
             |                           |
       +-----------+            +-----------------+
    YES|           |NO       YES|                 |NO
 +----------+ +-----------+ +-----------+ +-----------------+
 | Put RDA  | | Check for | | PREVDLE=1 | | Is it Tx mode ? |
 | into Buf | | Command   | +-----------+ +-----------------+
 +----------+ +-----------+                        |YES
                                         +-------------------+
                                         | Check if XON/XOFF |
                                         +-------------------+
                                                   |
                                            +--------------+
                                         YES|              |NO
                                      +-----------+ +----------+
                                      | Handle    | | Put RDA  |
                                      | Flow Cntl | | into Buf |
                                      +-----------+ +----------+
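The receive flowchart above can be read as a small per-byte state machine. The Python sketch below follows its branches one for one; the class and attribute names are hypothetical, XON and XOFF are taken to be the ASCII characters 0x11 and 0x13, and because the flowchart shows no action for the "not Tx mode" branch, the sketch simply drops the byte in that case.

```python
DLE, XON, XOFF = 0x10, 0x11, 0x13  # ASCII DLE, DC1 (XON), DC3 (XOFF)


class ReceiveFilter:
    """Per-byte filter following the receive flowchart."""

    def __init__(self, tx_mode=True):
        self.prev_dle = False    # PREVDLE in the flowchart
        self.tx_mode = tx_mode   # "Is it Tx mode ?"
        self.buffer = bytearray()
        self.commands = []       # codes that followed a <DLE>
        self.flow_control = []   # XON/XOFF bytes seen

    def feed(self, rda):
        """Process one received byte (RDA)."""
        if self.prev_dle:
            self.prev_dle = False
            if rda == DLE:
                self.buffer.append(rda)    # <DLE><DLE> -> one data <DLE>
            else:
                self.commands.append(rda)  # "Check for Command"
        elif rda == DLE:
            self.prev_dle = True
        elif self.tx_mode:
            if rda in (XON, XOFF):
                self.flow_control.append(rda)  # "Handle Flow Cntl"
            else:
                self.buffer.append(rda)        # "Put RDA into Buf"
        # Assumption: outside Tx mode the byte is dropped, since the
        # flowchart does not show what happens on that branch.


if __name__ == "__main__":
    f = ReceiveFilter()
    for b in bytes([0x41, DLE, DLE, XON, DLE, 0x03, 0x42]):
        f.feed(b)
    print(bytes(f.buffer), f.commands, f.flow_control)
```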
           +--------------+
           | TDA=Tx Data  |
           +--------------+
                  |
            +------------+
            | TDA=DLE ?  |
            +------------+
                  |
        +--------------------+
     YES|                    |NO
 +--------------+     +--------------+
 | Send two DLE |     | Send TDA to  |
 | to UART      |     | UART         |
 +--------------+     +--------------+
DTE                   DCE                   comments

AT+FCLASS=8     --->                        Switch to voice mode
                <---  OK
AT+VSM=?        --->
                      1;CELP;1;0;(9600)
                      2;ADPCM;2;0;(9600)
                      3;ADPCM;3;0;(9600)
                      30;ADPCM;3;0;(9600)
                      4;ADPCM;4;0;(9600)
                <---  OK
AT+VSM=1        --->                        Select CELP compression method
                <---  OK
AT+VLS=8        --->                        Activate external MIC on Line Jack
                <---  VCON
AT+VRX          --->                        Start to record
                <---  CONNECT
                <---  <DATA>
AT              --->                        Stop recording
                <---  <DATA>
                <---  <DLE><ETX>
                <---  VCON                  Return to command state
AT+VLS=0        --->                        Deactivate external MIC on Line Jack
                <---  OK
AT+FCLASS=0     --->                        Return to data mode
                <---  OK
DTE                   DCE                   comments

AT+FCLASS=8     --->                        Switch to voice mode
                <---  OK
AT+VSM=?        --->
                      1;CELP;1;0;(9600)
                      2;ADPCM;2;0;(9600)
                      3;ADPCM;3;0;(9600)
                      30;ADPCM;3;0;(9600)
                      4;ADPCM;4;0;(9600)
                <---  OK
AT+VSM=1        --->                        Select CELP compression method
                <---  OK
AT+VLS=16       --->                        Activate internal speaker
                <---  VCON
AT+VTX          --->                        Start to play
                <---  CONNECT
<DATA>          --->
<DLE><ETX>      --->
                <---  VCON                  Return to command state
AT+VLS=0        --->                        Deactivate internal speaker
                <---  OK
AT+FCLASS=0     --->                        Return to data mode
                <---  OK
DTE                   DCE                   comments

AT+FCLASS=8     --->                        Switch to voice mode
                <---  OK
AT+VSM=?        --->
                      1;CELP;1;0;(9600)
                      2;ADPCM;2;0;(9600)
                      3;ADPCM;3;0;(9600)
                      30;ADPCM;3;0;(9600)
                      4;ADPCM;4;0;(9600)
                <---  OK
AT+VSM=1        --->                        Select CELP compression method
                <---  OK
AT+VLS=2        --->                        Connect to line
                <---  VCON
AT+VTX          --->                        Start to play
                <---  CONNECT
<DATA>          --->
<DLE><ETX>      --->
                <---  VCON                  Return to command state
AT+VLS=0        --->                        Deactivate line connection
                <---  OK
AT+FCLASS=0     --->                        Return to data mode
                <---  OK
DTE                   DCE                   comments

AT+FCLASS=8     --->                        Switch to voice mode
                <---  OK
AT+VSM=?        --->
                      1;CELP;1;0;(9600)
                      2;ADPCM;2;0;(9600)
                      3;ADPCM;3;0;(9600)
                      30;ADPCM;3;0;(9600)
                      4;ADPCM;4;0;(9600)
                <---  OK
AT+VSM=1        --->                        Select CELP compression method
                <---  OK
AT+VLS=2        --->                        Connect to line
                <---  VCON
AT+VTX          --->                        Start to play greeting message
                <---  CONNECT
<DATA>          --->
<DLE><ETX>      --->
                <---  VCON                  Return to command state
AT+VRX          --->                        Start to record
                <---  CONNECT
                <---  <DATA>
                <---  <DLE>b                DCE detects busy tone
                      or                    or
                      <DLE>q                long period of quiet
AT              --->                        Stop recording
                <---  <DATA>                DCE delivers remaining data
                <---  <DLE><ETX>
                <---  VCON                  Return to command state
AT+VLS=0        --->                        Deactivate line connection
                <---  OK
AT+FCLASS=0     --->                        Return to data mode
                <---  OK
DTE                   DCE                   comments

AT+FCLASS=8     --->                        Switch to voice mode
                <---  OK
AT+VSM=?        --->
                      1;CELP;1;0;(9600)
                      2;ADPCM;2;0;(9600)
                      3;ADPCM;3;0;(9600)
                      30;ADPCM;3;0;(9600)
                      4;ADPCM;4;0;(9600)
                <---  OK
AT+VSM=1        --->                        Select CELP compression method
                <---  OK
AT+VLS=2        --->                        Connect to line
                <---  VCON
AT+VTX          --->                        Start to play greeting message
                <---  CONNECT
<DATA>          --->
                <---  <DLE>c                T.30 fax calling tone detected
                      or                    or
                      <DLE>5                DTMF digit '5' detected
<DATA>          --->
<DLE><ETX>      --->
                <---  VCON                  Return to command state
AT+FCLASS=2     --->                        Try to handshake Fax mode
                <---  OK
ATA             --->                        Switch to fax mode and answer fax call
. . .
DTE                   DCE                   comments

AT+FCLASS=8     --->                        Switch to voice mode
                <---  OK
AT+VSM=?        --->
                      1;CELP;1;0;(9600)
                      2;ADPCM;2;0;(9600)
                      3;ADPCM;3;0;(9600)
                      30;ADPCM;3;0;(9600)
                      4;ADPCM;4;0;(9600)
                <---  OK
AT+VSM=1        --->                        Select CELP compression method
                <---  OK
AT+VLS=2        --->                        Connect to line
                <---  VCON
AT+VTX          --->                        Start to play greeting message
                <---  CONNECT
<DATA>          --->
<DLE><ETX>      --->
                <---  VCON                  Return to command state
AT+VRX          --->                        Start to record
                <---  CONNECT
                <---  <DATA>
                <---  <DLE>s                DCE detects silence
AT              --->                        Stop recording
                <---  <DATA>                DCE delivers remaining data
                <---  <DLE><ETX>
                <---  VCON                  Return to command state
                                            (DTE deletes this silence file)
AT+FCLASS=0     --->
                <---  OK
ATA             --->                        Try to handshake data mode
                                            Switch to data mode and answer data call
. . .
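The sessions above suggest a simple decision rule for an answering program: act on the first <DLE> event code that identifies the call. The Python sketch below is one possible reading of those examples, not a prescribed algorithm; the function name and the returned labels are hypothetical.

```python
def classify_call(events):
    """Pick the next action from the <DLE> event codes reported so far.

    `events` is a list of one-character codes in arrival order.
    Returns a label, or None if no deciding event has been seen yet.
    """
    for code in events:
        if code == "c":
            return "fax"             # T.30 fax calling tone: switch to fax mode
        if code == "e":
            return "data"            # data calling tone
        if code in "0123456789*#":
            return "dtmf:" + code    # caller pressed a key
        if code == "s":
            return "data"            # silence from the start: try data handshake
        if code in ("q", "b"):
            return "end_of_message"  # quiet or busy tone: stop recording
    return None


if __name__ == "__main__":
    print(classify_call(["c"]))
    print(classify_call(["5"]))
    print(classify_call(["s"]))
```

Treating <DLE>s as a data call mirrors the last session, where the DTE deletes the silent recording and issues ATA in data mode.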
If you have a telephone set whose handset is attached to the main body by a cable with an RJ11C connector, you can unplug the handset and plug it into the modem Line Jack and then use the handset as both a microphone and a speaker.
The central two wires in the Line Jack are the signal wires that connect to the phone company. If you connect the handset to the Line Jack, the earphone in the handset will normally be connected to the modem's two-wire line terminals. You can hear the voice when the modem is in playback mode, and when recording you speak into the earphone (remember: the earphone), because in recording mode the earphone is used as a microphone.
CAUTION: Never plug the handset into the modem's Phone Jack, because it will be connected to the phone line when the modem is on-hook and will be damaged by the phone line's DC voltage and current.