SUMMARY SHEET
Title of Project: Development of Pronunciation Lexicon Based on
Experimental Study of Phonetics and Phonemics of Indian
Languages.
Languages: Assamese, Manipuri, Bodo, Punjabi and Marathi
1.
2. Organizations
(a)
Creation of the Pronunciation Lexicon (PLS) of around 3,00,000 words for each of
the 5 Indian languages, namely Assamese, Manipuri, Bodo, Punjabi and Marathi, as
per the W3C specification.
The PLS (Pronunciation Lexicon) specification is about how to pronounce words and
phrases and how to deal with the variability of pronunciations by country, region, person,
etc. The Pronunciation Lexicon Specification (PLS) is a W3C Recommendation, which is
designed to enable interoperable specification of pronunciation information for both speech
recognition and speech synthesis engines within voice browsing applications.
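As a rough illustration of what a PLS document looks like, the sketch below builds a minimal lexicon with Python's standard xml.etree.ElementTree module. The element names (lexicon, lexeme, grapheme, phoneme), the namespace and the version and alphabet attributes follow the W3C PLS 1.0 Recommendation. The helper name build_pls, the sample Marathi word and its IPA transcription are assumptions made for illustration only and are not taken from the project.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_pls(lang, entries):
    """Return a PLS 1.0 document as a string.

    lang    -- language tag for xml:lang (e.g. "mr" for Marathi)
    entries -- iterable of (grapheme, ipa_transcription) pairs
    build_pls is a hypothetical helper, not part of the proposal.
    """
    # Serialize the PLS namespace as the default namespace (no prefix).
    ET.register_namespace("", PLS_NS)
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for grapheme, ipa in entries:
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(root, encoding="unicode")


# Illustrative entry only: the transcription is an unverified placeholder.
print(build_pls("mr", [("नमस्कार", "nəməskaːɾ")]))
```

Regional or speaker-specific variants could be represented by adding further phoneme elements to the same lexeme, which PLS permits.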
The Speech Synthesis Markup Language specification (SSML) is a W3C markup language
specification that defines directives in the form of XML tags that can be used along with
Text-to-Speech synthesis systems (TTS) to control different speech parameters (e.g.
pronunciation, prosody) and also provide additional information such as language and
metadata for enhancing the quality of synthetic speech output in voice based applications.
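To make the relationship between SSML and a pronunciation lexicon concrete, the following sketch generates a small SSML 1.0 document that refers to an external lexicon and sets one prosody parameter. The speak, lexicon and prosody elements and the namespace are taken from the W3C SSML 1.0 Recommendation. The function name build_ssml, the file name lexicon.pls and the sample text are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(lang, text, lexicon_uri, rate="medium"):
    """Return an SSML 1.0 document that points a TTS engine at a PLS file.

    build_ssml is a hypothetical helper; lexicon_uri is whatever location
    the PLS document is published at.
    """
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    # The lexicon reference comes before the content to be spoken.
    ET.SubElement(speak, f"{{{SSML_NS}}}lexicon", {"uri": lexicon_uri})
    # prosody controls speech parameters such as rate, pitch and volume.
    prosody = ET.SubElement(speak, f"{{{SSML_NS}}}prosody", {"rate": rate})
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("mr", "नमस्कार", "lexicon.pls", rate="slow"))
```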
Speech Recognition Grammar Specification (SRGS) is the preferred markup for grammar
syntax used in speech recognition.
The W3C PLS specification thus demands the development of a Pronunciation Lexicon for
Indian languages.
The PLS can be created with the help of a detailed experimental study, comprising phonetic
and acoustic analysis, of the specificities of Indian languages. Such an experimental study
leads to standardization of the phonemic inventory and to modeling of the acoustic and
phonetic features of each of the Indic languages, a crucial and essential requirement for
the IPA and the W3C PLS, SSML and SRGS standards.
7. Expected outcome in physical terms (As applicable): The outcome of the project
will be:
Creation of the Pronunciation Lexicon (PLS) of around 300,000 words for 5 Indian
Not applicable
6-12 Months
12-18 Months
18-24 Months
The task involving the creation of recording material and creation of the PLS as per
the specification of W3C will be carried out by the associated institutes under the
supervision of C-DAC Kolkata and include the following:
0-6 Months
Collection of phonetic readers, publications and palatograms for the five languages
named above.
Preparation of recording materials for acoustic and articulatory verification.
1. All possible combinations of CVC and VCV for articulatory study.
2. Around 250 sentences for acoustic study.
3. Around 600 words for diphthong and glide study.
Selection of informants for recording through listening test.
Audio recording of selected text material for acoustic study (contd...).
1. For the study of the manner of articulation of consonants, each consonant in VCV
context (V = vowel, C = consonant) will be recorded 3 times by 6 speakers.
2. Around 250 sentences will be recorded by at least 6 speakers.
6-12 months
Tagging of the recordings at the phone level.
12-18 months
Creation of the PLS as per the specification of W3C (contd).
18-24 months
Creation of the PLS as per the specification of W3C.
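The size of the CVC and VCV recording material in the 0-6 month tasks above depends on the phoneme inventory of each language. The sketch below shows one way to enumerate the combinations and estimate the number of recorded tokens from the stated 3 repetitions by 6 speakers. The toy vowel and consonant sets are hypothetical and do not belong to any of the five languages, and letting the two vowels (or two consonants) vary independently is an assumption about what "all possible combinations" means.

```python
from itertools import product


def vcv_items(vowels, consonants):
    """All vowel-consonant-vowel strings; the two vowels vary independently."""
    return [v1 + c + v2 for v1, c, v2 in product(vowels, consonants, vowels)]


def cvc_items(vowels, consonants):
    """All consonant-vowel-consonant strings; the two consonants vary independently."""
    return [c1 + v + c2 for c1, v, c2 in product(consonants, vowels, consonants)]


def total_recordings(n_items, repetitions=3, speakers=6):
    """Number of recorded tokens for n_items prompts."""
    return n_items * repetitions * speakers


# Toy inventory (hypothetical, for illustration only).
vowels = ["a", "i", "u"]
consonants = ["p", "t", "k", "m"]

vcv = vcv_items(vowels, consonants)  # 3 * 4 * 3 = 36 items
cvc = cvc_items(vowels, consonants)  # 4 * 3 * 4 = 48 items
print(len(vcv), len(cvc), total_recordings(len(vcv)))  # 36 48 648

# Sentence recordings: around 250 sentences by at least 6 speakers.
print(total_recordings(250, repetitions=1, speakers=6))  # 1500
```

With a realistic inventory the number of combinations grows quickly, so an estimate of this kind may help when planning recording sessions.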
2. The tasks involved in the development of the IDE for creation of the PLS and in the
analysis of the results for all the languages will be carried out at C-DAC, Kolkata and
include the following:
0-6 Months
Acquisition of EPG, EGG, nasometer and video camera equipment, and setting up of a lab
that will be used by the other centers.
Development of IDE for creation of PLS.
Training of Manpower for acoustic tagging (contd...).
Preparation of palates for informants for each of the 5 languages.
6-12 months
Recording on EPG, EGG and nasometer (contd).
Videography (contd).
Preparation of intermediate report.
Acoustic analysis.
12-18 months
Analysis of observations of instrumental outputs.
Validation Meeting.
18-24 months
Publication of PLS.
Conference.
13. Likely end user(s)
As the PLS will provide the standard pronunciation of the lexicon, it will be
used by both researchers and professionals for the development of speech-related
applications.
14. Name of other Organizations jointly participating in the project:
(i) C-DAC, Kolkata
(ii) KIIT, Gurgaon
(iii) IIT Guwahati
(iv) TIFR Mumbai
(v) Gauhati University
(vi) Manipur University
Total Budget (in lakhs of Rs) Outlay (2 years distributed across institutions)
Serial No
Heads
Year - 1
Year - 2
Capital Equipment
25
Consumables
3.5
3.5
Manpower
70.76
70.76
19
17
Travel
14
8.5
Contingencies
2.5
2.5
18.89
16.9
169.65
130.16
10
Over Head
Sub Total
Total
FE Component : nil
299.81
1.
-
Consortium Composed of
C-DAC, Kolkata (Overall Coordination)
IIT Guwahati
Manipur University
KIIT Gurgaon
TIFR, Mumbai
Gauhati University
1) C-DAC, Kolkata
Principal Investigator:
Arup Saha, Senior Engineer,
C-DAC, Kolkata,
arup.saha@cdac.in
Co-Principal Investigators:
Mr. Milton S Bepari
Senior Engineer
C-DAC, Kolkata.
Ms. Tulika Basu
Project Assistant II
C-DAC, Kolkata.
Mr. Rajib Roy
Project Engineer
C-DAC, Kolkata.
2) KIIT, Gurgaon
Prof. S.S. Agrawal
Ms. Shweta Bansal
3) IIT Guwahati
Dr. Pradip Kr Das
Associate Professor, Dept of Computer Science,
4) TIFR, Mumbai
Dr. Nandini Bondale, Scientist-F
5) Gauhati University
Prof. Pranhari Talukdar
USIC, Gauhati University
6) Manipur University
(i) Potsangbam Madhubala
(ii) Principal Investigator: Dr. Sarbajit Singh
Sr. Lecturer
5. Nature of Project :
Research on Technology Development
6. Background Information:
Man-machine communication in speech mode requires integrating all the technologies
needed for both speech input and speech output, in line with the linguistic attributes of
the language concerned. The success of a technology as complex as speech technology
depends largely on the accuracy of the pronunciation lexicon of the language, because the
lexicon is the interface between speech analysis at the acoustic level and speech
interpretation. For example, in automatic speech recognition (ASR), the search module
relies on phonetic transcriptions to select the appropriate acoustic models against which
to score the input utterance. Likewise, in text-to-speech (TTS) synthesis, phonemic
transcriptions are required to select the proper units from which to generate the desired
waveform.
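The role of the lexicon as an interface can be sketched in a few lines: both an ASR search module and a TTS unit selector start by mapping each word to its phone sequence. The example below is a minimal sketch using a plain dictionary. The function name transcribe and the toy words and phone symbols are hypothetical and do not come from any of the project languages.

```python
def transcribe(words, lexicon):
    """Map a list of words to a flat phone sequence using a pronunciation lexicon.

    Returns (phones, missing), where missing lists the out-of-vocabulary
    words that would need a new lexicon entry or a fallback rule.
    """
    phones = []
    missing = []
    for word in words:
        if word in lexicon:
            phones.extend(lexicon[word])
        else:
            missing.append(word)
    return phones, missing


# Toy lexicon (hypothetical entries, for illustration only).
toy_lexicon = {
    "kata": ["k", "a", "t", "a"],
    "mit": ["m", "i", "t"],
}

phones, missing = transcribe(["kata", "mit", "pu"], toy_lexicon)
print(phones)   # ['k', 'a', 't', 'a', 'm', 'i', 't']
print(missing)  # ['pu']
```

An inaccurate entry in such a lexicon propagates directly into the acoustic models chosen by a recognizer or the units chosen by a synthesizer, which is why the accuracy of the lexicon matters.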
The Pronunciation Lexicon Specification (PLS) is designed to enable interoperable
specification of pronunciation information for both ASR and TTS engines within voice
browsing applications. This technology has to be based on the pronunciation currently in
use among speakers of the dialect. It should be noted that the available resources on
acoustic and articulatory phonetics for the languages of the North East are not
adequate for technology development.
This project aims to develop the pronunciation lexicon of five Indian languages,
namely Assamese, Manipuri, Bodo, Marathi and Punjabi. Proper representation of the
lexicon in the PLS requires a categorization of the phones and allophones currently in
use, and hence both the articulatory experiment and the acoustic experiment are
necessary.
The following diagram outlines the methodology for creating the PLS (box labels in reading order):
Collection of phonetic reader, publication and palatogram (if available) for the language
Collection of lexicon
Preparation of recording materials for acoustic and articulatory verification
Exploration of phonemes
Selection of informants for recording
Preparation of palates for informants
Development of IDE for creation of PLS
Audio recording of selected text material
Tagging of the recordings
Acoustic analysis of the recorded data
Creation of pronunciation lexicon as per W3C
Publication of PLS in XML format
Rajib Roy
PROFESSIONAL EXPERIENCE
More than seven years of experience in speech processing and technology
development in Indian languages, with around 14 publications in the field of
speech processing. A few of the noteworthy developments and achievements are:
1. Soma Khan, Rajib Roy, CREATION OF ACOUSTIC SIGNAL
DICTIONARY FOR ESNOLA BASED CONCATENATED BANGLA AND
NEPALI TTS SYSTEM, Oriental COCOSDA 2011, Hsinchu, Taiwan
2. Rajib Roy, Tulika Basu, Joyanta Basu, Arup Saha, Study of Nucleus Vowel
Duration and its Role in Prosody of Bangla, Oriental COCOSDA-2007, pp. 181-184,
Hanoi, Vietnam.
3. Biman Ghosh, Rajib Roy, Tulika Basu and Arup Saha, Study of Syllable and
Pause Duration in Relation to Speech Synthesis, FRSM-2007, pp. 304-305,
AIISH, Mysore.
Creation of the Pronunciation Lexicon (PLS) for the said languages as per W3C
specification.
Verification of place and manner of articulation of all phones and allophones in
regard to above.
C-DAC, Kolkata
KIIT , Gurgaon
IIT Guwahati
TIFR Mumbai
Gauhati University
Manipur University
(b)
C-DAC, Kolkata
Acquisition of equipment and setting up of a lab which will be used by the other
participating institutes.
Development of IDE for creation of PLS.
Training of manpower for acoustic tagging.
Preparation of palates for informants for each of the languages.
Recording on EPG, EGG and Nasometer.
Video Recording of lip movement.
Analysis of the palatograph.
Preparation of intermediate report.
Publication of PLS.
IIT-Guwahati
Collection of phonetic reader, publication and palatogram (if available) for the
Assamese Language.
Selection of informants for recording and listening test.
Preparation of the recording materials for acoustic and articulatory analysis of
phonemes.
1. All possible combinations of CVC and VCV for articulatory study.
2. Around 250 sentences for acoustic study.
3. Around 600 words for diphthong and glide study.
Audio recording of selected text material.
1. For the study of the manner of articulation of consonants, each consonant in VCV
context (V = vowel, C = consonant) will be recorded 3 times by 6 speakers.
2. Around 250 sentences will be recorded by at least 6 speakers.
Gauhati University
Collection of phonetic reader, publication and palatogram (if available) for the
Bodo Language.
Selection of informants for recording and listening test.
Preparation of recording materials for acoustic and articulatory analysis of phonemes.
1. All possible combinations of CVC and VCV for articulatory study.
2. Around 250 sentences for acoustic study.
3. Around 600 words for diphthong and glide study.
Audio recording of selected text material.
1. For the study of the manner of articulation of consonants, each consonant in VCV
context (V = vowel, C = consonant) will be recorded 3 times by 6 speakers.
2. Around 250 sentences will be recorded by at least 6 speakers.
3. Rest of the recording list.
Tagging of the recordings.
Acoustic analysis.
Creation of the PLS as per the specification of W3C.
(d)
Manipur University
Collection of phonetic reader, publication and palatogram (if available) for the
Manipuri Language.
Selection of informants for recording and listening test.
Preparation of recording materials for acoustic and articulatory analysis of phonemes.
1. All possible combinations of CVC and VCV for articulatory study.
2. Around 250 sentences for acoustic study.
3. Around 600 words for diphthong and glide study.
Audio recording of selected text material.
1. For the study of the manner of articulation of consonants, each consonant in VCV
context (V = vowel, C = consonant) will be recorded 3 times by 6 speakers.
2. Around 250 sentences will be recorded by at least 6 speakers.
3. Rest of the recording list.
Tagging of the recordings.
Acoustic analysis.
Creation of the PLS as per the specification of W3C.
(e)
KIIT, Gurgaon
Collection of phonetic reader, publication and palatogram (if available) for the
Punjabi Language.
Selection of informants for recording and listening test.
TIFR, Mumbai
Collection of phonetic reader, publication and palatogram (if available) for the
Marathi Language.
Selection of informants for recording and listening test.
Preparation of recording materials for acoustic and articulatory analysis of phonemes.
1. All possible combinations of CVC and VCV for articulatory study.
2. Around 250 sentences for acoustic study.
3. Around 600 words for diphthong and glide study.
Audio recording of selected text material.
1. For the study of the manner of articulation of consonants, each consonant in VCV
context (V = vowel, C = consonant) will be recorded 3 times by 6 speakers.
2. Around 250 sentences will be recorded by at least 6 speakers.
3. Rest of the recording list.
Tagging of the recordings.
Acoustic analysis.
Creation of the PLS as per the specification of W3C.
Year - 2
1
25.76
2
3.5
1
2.5
3
2
6
46.76
111.51
Heads
Capital Equipment
Consumables
Manpower
Data Collection
Travel
Workshop
Coordination and Management
Over Head
Sub Total
Total
Year - 1
2
0.5
9
3
2
1
1
2.48
20.98
Year - 2
0.5
9
3
1
1
2.18
16.68
37.66
Heads
Year - 1
Capital Equipment
Consumables
0.5
0.5
Manpower
Data Collection
Travel
Workshop
7
8
21
Year - 2
2.48
2.18
20.98
16.68
37.66
Sl. No.   Name of Institution    1st Year   2nd Year    Total
1         C-DAC, Kolkata            64.75      46.76   111.51
2         IIT Guwahati              20.98      16.68    37.66
3         Gauhati University        20.98      16.68    37.66
4         Manipur University        20.98      16.68    37.66
5         KIIT, Gurgaon             20.98      16.68    37.66
6         TIFR, Mumbai              20.98      16.68    37.66
          Total                    169.65     130.16
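The institution-wise figures above can be cross-checked against the yearly sub-totals (169.65 and 130.16) and the overall total of 299.81 lakhs stated in the budget outlay. The short sketch below does this with Python's decimal module to avoid floating-point rounding. The dictionary layout and variable names are assumptions made for illustration; the amounts are those listed in the table.

```python
from decimal import Decimal

# (1st year, 2nd year) outlay per institution, in lakhs of Rs.
budget = {
    "C-DAC, Kolkata": ("64.75", "46.76"),
    "IIT Guwahati": ("20.98", "16.68"),
    "Gauhati University": ("20.98", "16.68"),
    "Manipur University": ("20.98", "16.68"),
    "KIIT, Gurgaon": ("20.98", "16.68"),
    "TIFR, Mumbai": ("20.98", "16.68"),
}

year1 = sum(Decimal(y1) for y1, _ in budget.values())
year2 = sum(Decimal(y2) for _, y2 in budget.values())

print(year1, year2, year1 + year2)  # 169.65 130.16 299.81
```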
Part IV
Endorsement by the Head of the Institution
1. I have read the terms & conditions (including special terms & conditions for
co-financing) governing the grant-in-aid and I agree to abide by them.
2. I certify that I have no objection to the submission of this research proposal for
consideration by the Ministry of Information Technology.
3. In case the project is approved, I undertake to make available facilities to carry it out,
to arrange for the submission of periodic progress reports and other information that
may be required by the Ministry of Information Technology and in general to ensure
that the conditions attached to the award of such grant are fulfilled by my
institution/organization.
4. I certify that in case the present chief investigator is not available for any reason to
continue work on this project, the following co-principal investigator will be available
to carry it through to completion:
Sl. No.   Name              Designation
1.        Milton S Bepari   Senior Engineer, C-DAC, Kolkata
5. I certify that the facilities mentioned in the body of this report are available at my
institution.
6. I certify that I shall ensure that accounts will be kept of the funds received and spent and
made available on demand, as specified and required by the Ministry of Information
Technology.
7. I certify that I am the competent authority, by virtue of the administrative and financial
powers vested in me by __________, to undertake the above stated commitments on behalf of my
institution.
Annexure - VA
TERMS AND CONDITIONS GOVERNING GRANT-IN-AID
i) The grant is for the specific project as approved by DIT and shall be subject to the
following conditions:
(a) The grant amount shall be spent for the project within the specified time,
(b) Any portion of the grant which is not ultimately required for expenditure for the
approved purposes shall be duly surrendered to DIT.
ii) The grantee institution shall maintain an audited record in the form of a register in the
prescribed proforma for permanent, semi-permanent assets acquired solely or mainly out
of DIT grant;
iii) The assets referred to in (ii) above will be the property of DIT and should not, without
prior sanction of DIT, be disposed of, encumbered or utilized for purposes other
than those for which the grant has been sanctioned. An undertaking shall be given by the
grantee institution that it agrees to be governed by these conditions;
iv) At the conclusion of the project, DIT will be free to sell or otherwise dispose of the
assets which are the property of DIT and grantee institution shall render to DIT the
necessary support for facilitating the sale of these assets;
v) The grantee institution shall send to the Department of Information Technology at the
end of each financial year as well as at the time of seeking further installments of the grant
a list of assets referred to in (ii) above;
vi) Should at any time grantee institution cease to exist, such assets etc., shall revert to
DIT;
vii) The grantee institution shall render progress-cum-achievement reports at intervals
not exceeding six months on the progress made on all aspects of the project, including
expenditure incurred on various approved items during the period.
viii) The grantee institution shall render an audited statement of accounts to DIT.
ix) The audited statement of accounts relating to grants given during financial year
together with the comments of the auditor regarding the observance of the conditions
governing the grant should be forwarded to the Department of Information Technology
within six months following the end of the relevant financial year;
x) The utilization of grant for the intended purposes will be looked into by the Auditor of
grantee institution according to the directives issued by the Government of India at the
instance of the Comptroller and Auditor General and the specific mention about it will be
made in the audit report;
xi) DIT or its nominee/s will have the right of access to the books and accounts of the
grantee institution for which a reasonable prior notice would be given;
xii) The grantee institution should maintain separate audited account for the project. If it is
found expedient to keep a part or whole of the grant in a bank account earning interest, the
interest, thus earned should be reported to this Department. The interest so earned will be
treated as a credit to the grantee to be adjusted towards future installment of the grant;
xiii) Sale proceeds of components, prototypes, pilot projects etc. fabricated as a result of
the development of the project arising directly from funds granted by the Department of
Information Technology shall be remitted to DIT;
xiv) The know-how generated by the project shall be the property of DIT. Any receipt by way
of sale of know-how transfer, royalties, training etc. shall accrue to DIT. DIT may, at its
discretion, allow or direct a portion of such receipts to be retained by the grantee
organization.
xv) DIT will have the right to call for drawings, specifications and other data necessary to
enable the transfer of know-how to other parties and the grantee shall supply all the needed
data at the request of DIT;
xvi) Application by grantee institution for any other financial assistance or receipt of
grant/loan from any other Agency/Ministry/Department for this project should have the
prior approval of Department of Information Technology.
xvii) The grantee institution is not allowed to entrust the implementation of this project,
for which grant-in-aid is received, to another institution or to divert the grant-in-aid
received from the Ministry of Information Technology as assistance to the latter institution.
xviii) DIT shall appoint a Project Review and Steering Group (PRSG) comprising
representatives from DIT and other experts. PRSG will periodically monitor the project in
all respects, including technical and financial.
xix) The Grantee institution will first make all efforts to protect intellectual property
generated out of the project. The grantee institution will examine Intellectual Property
Rights (IPR) protection issues in consultation with IPR Cell, DIT to file patents, register
the copyrights etc. before making it public by publishing in the technical journals and
books, presenting findings in Conferences etc.
xx) The intellectual property and the rights associated with it shall be assigned to DIT. In
cases where the funding has been provided jointly with other organizations, the IP rights
would be appropriately shared.
xxi) In case of any dispute on any matter, related to the project during the course of its
implementation, the decision of Secretary, DIT, shall be final and binding on the institute.
A certificate of acceptance of terms and conditions as above needs to be given by the chief
investigator/ endorsed by the head of the institute while submitting the project proposal.
Signatures :
Annexure -VB
Specific Terms & Conditions governing grant-in-aid relating to the consortium mode
project:
Responsibilities of Consortium Leader
i) The DIT Grant-in-Aid will be released to the consortium leader, who in turn
will release the assigned funds to the consortium members based on the
requirements and satisfactory performance of each consortium member and the
recommendations of PRSG/DIT.
ii) The consortium leader would ensure the delivery of the end product as per the
specified Software Requirement Specifications (SRS).
iii)
iv)
v)
vi)
vii)
for
modules/sub-
Development of
assigned to them
modules/sub-systems/components
/defined
ii)
iii)
iv)
Signatures :
Consortium Leader
Head of Institution
interface