
Chaitanya Institute of Engineering and Technology Technical seminar

Panini D

Definition of SALT
Speech Application Language Tags (SALT) is a small set of XML elements that can be embedded into host programming languages to speech-enable applications. SALT extends HTML and other markup languages (cHTML, XHTML, WML) with a powerful speech interface for Web pages.

In short, SALT is a lightweight set of XML elements that enhances existing markup languages with a speech interface.

Evolution
The SALT Forum was announced on October 15, 2001. Cisco, Comverse, Intel, Microsoft, Philips, and SpeechWorks (now ScanSoft) founded the forum as a joint initiative to develop 'Speech Application Language Tags' for embedding in other markup languages. The SALT Forum [http://www.saltforum.org/] published the initial specification in June 2002.

On July 15, 2002 the SALT Forum announced the public release of SALT Version 1.0. Version 1.0 of the specification covers three broad areas of capability: speech output, speech input, and call control.
The specification was contributed to the World Wide Web Consortium (W3C) in August 2002. The full SALT specification continues to be developed by the SALT Forum.

Overview of SALT
The main top-level elements of SALT are:

<prompt> for speech synthesis configuration and prompt playing
<listen> for speech recognizer configuration, recognition execution and post-processing, and recording
<dtmf> for configuration and control of DTMF collection
<smex> for general-purpose communication with platform components

The input elements <listen> and <dtmf> also contain grammars and binding controls:

<grammar> for specifying input grammar resources
<bind> for processing of recognition results

<listen> also contains the facility to record audio input:

<record> for recording audio input

All top-level elements contain the platform configuration element <param>.

Microsoft additionally provides an <audiometer> element.

prompt
The prompt element is used to specify the content of audio output. Prompts are queued and played using a prompt queue. The prompt element contains the resources for system output, either as text, as references to audio files, or both. It also permits platform-specific configuration through the <param> element.

Program
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
  <body onload="hello.Start()">
    <salt:prompt id="hello"> Hello World </salt:prompt>
  </body>
</html>

SALT tags have been added to the HTML document: the xmlns:salt attribute declares the SALT namespace, and <salt:prompt> defines a speech prompt.

The document must be loaded in a SALT 1.0 compatible browser.


Methods such as Start() activate SALT elements. When the page loads, it says "Hello World" using a text-to-speech engine.
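A prompt can also mix inline text with references to recorded audio through the <content> subelement defined in SALT 1.0. A minimal sketch (the id and the file name welcome.wav are illustrative):

```html
<salt:prompt id="welcome">
  <!-- Play a recorded greeting, then synthesize the rest with TTS -->
  <content href="welcome.wav" />
  Please say the name of the person you wish to contact.
</salt:prompt>
```

Playing recorded audio for fixed phrases while synthesizing variable portions is a common pattern in telephony prompts.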

Listen
The listen element is used to specify possible user inputs and a means of dealing with the input results. Listen can also be used for recording speech input; its <record> subelement is used to configure this process. The activation of listen elements is controlled using the Start(), Stop(), and Cancel() methods. The listen element contains one or more grammars and/or a record element, and optionally a set of bind elements that inspect the results of the speech input and copy the relevant portions into values on the containing page.

Program using the listen tag:

<salt:listen id="listenEmployeeName">
  <grammar src="MyGrammar.grxml"/>
  <bind targetelement="txtName" value="//employee_name"/>
</salt:listen>

Once recognition succeeds, the portion of the result matching the XPath //employee_name is bound to the txtName element.
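Recognition for the listen element above does not start on its own; it is triggered from script using the Start() method. A sketch assuming a push-to-talk button (the button id is hypothetical):

```html
<input type="button" id="btnName" value="Speak"
       onclick="listenEmployeeName.Start()" />
```

Stop() can be called to end recognition and return a result from the audio collected so far, while Cancel() aborts recognition and discards any result.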

Dtmf
The <dtmf> element is used in telephony applications to specify possible DTMF inputs and a means of dealing with the collected results and other DTMF events. Like <listen>, its main subelements are <grammar> and <bind>, and it holds resources for configuring the DTMF collection process and handling DTMF events. The dtmf element is designed so that type-ahead scenarios are enabled by default; that is, for applications to ignore input entered ahead of time, flushing of the DTMF buffer has to be explicitly authored.
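A sketch of DTMF collection on a telephony page (the grammar file, XPath, and element ids are illustrative):

```html
<salt:dtmf id="dtmfZip">
  <grammar src="zipdigits.grxml" />
  <bind targetelement="txtZip" value="//zipcode" />
</salt:dtmf>
```

Like listen, the element is activated from script with dtmfZip.Start(); keys pressed before activation remain in the buffer unless it is explicitly flushed, which is what enables the type-ahead behavior described above.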

DTMF event timeline

Smex
smex, short for Simple Messaging EXtension, is a SALT element that communicates with external components of the SALT platform. It can be used to implement application control of platform functionality such as telephony call control. As such, smex is a useful extensibility mechanism in SALT, since it allows new functionality to be added through this messaging layer.
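A hedged sketch of smex-based telephony control (the message payload, server name, and handler are hypothetical; actual message formats are defined by the platform, not by SALT itself):

```html
<salt:smex id="telephony" onreceive="onPlatformMessage()">
  <param name="server">myTelephonyServer</param>
</salt:smex>

<script>
  // Setting the sent property transmits a message to the platform.
  function transferCall() {
    telephony.sent = "<transferCall target='tel:+18005550100'/>";
  }
  // Incoming platform messages arrive in the received property.
  function onPlatformMessage() {
    var msg = telephony.received;
  }
</script>
```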

Principles of SALT
Modes of execution: object mode and declarative mode
Dynamic manipulation of SALT elements
Events and error handling

Speech enabled Web application

Implementation Architecture
There are four possible components in implementing a speech-enabled Web application using SALT:

a Web server
a telephony server
a speech server
a client device

Web Server: The Web server generates Web pages containing HTML, SALT, and embedded script. The script controls the dialog flow for voice-only interactions.

Telephony Server: The telephony server connects to the telephone network. It incorporates a voice browser that interprets the HTML, SALT, and script.

Speech Server: The speech server recognizes speech, plays audio prompts, and returns responses to the user.

Client Device: Clients include a pocket PC, a tablet, or a desktop PC running a browser (e.g. Internet Explorer) capable of interpreting HTML and SALT.

Major scenarios
For multimodal applications, SALT can be added to a visual page to support speech input and/or output. This is a way to speech-enable individual HTML controls for push-to-talk form-filling scenarios, or to add more complex mixed-initiative capabilities where necessary.
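A push-to-talk form-filling sketch along these lines (control ids, the grammar file, and the XPath are illustrative):

```html
<!-- Pressing the button starts recognition; the result fills the text box -->
<input type="text" id="txtCity" />
<input type="button" value="Speak"
       onclick="listenCity.Start()" />

<salt:listen id="listenCity">
  <grammar src="cities.grxml" />
  <bind targetelement="txtCity" value="//city" />
</salt:listen>
```

The user can still type into the text box directly, which is what makes the interface multimodal rather than voice-only.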

Multimodal

Telephony
For applications without a visual display, SALT manages the interactional flow of the dialog and the extent of user initiative by using the HTML eventing and scripting model. In this way, the full programmatic control of client-side (or server-side) code is available.

Multimodal Dialog Management in Robotics

Advantages
Businesses will be able to offer common Web-based applications across multiple presentation media, reducing complexity and cost, and will be able to reuse their existing Web investments.
Developers will be able to seamlessly embed speech enhancements in existing HTML, XHTML and XML pages, using familiar languages, technologies and toolkits.
Service providers will be able to deploy a broad range of applications using standards that enable the widest range of services, offering new business opportunities and revenue streams that better serve both consumer and business customers.

Disadvantages
Requires you to maintain Visual Studio .NET 2003 and SASDK 1.1.
Requires recompilation and redeployment.
You must write your application yourself, excluding prompts and grammars.
Network delays.
Does not provide beneficial results when vocabularies are small or grammars are well defined.

Conclusion
SALT is helpful not only for business and many other applications; it is also useful to blind people in many ways. Development of this technology therefore not only eases everyday tasks but also helps the blind.

Thank you

Any queries?
