Keywords for The HAPI Book. A Description of the HTK Application Programming Interface, Version 1.4: computer science and engineering, media data processing, audio processing, speech processing
Entropic Ltd., 1999, 667 pp.
The HTK Application Programming
Interface (HAPI) is a library of functions providing the
programmer with an interface to any speech recognition system
supplied by Entropic or developed using the Hidden Markov Model
Toolkit (HTK). HTK is a set of UNIX tools which are used to
construct all the components of a modern speech recogniser. One
of the principal components which can be produced with HTK is a
set of Hidden Markov Model (HMM) based acoustic speech models.
Other components include the pronunciation dictionary, language
models and grammar used for recognition. These can be produced
using HTK or Entropic's grapHvite Speech Recognition Developer
System. HAPI encapsulates these components and provides the
programmer with a simple and consistent interface through which
they can integrate speech recognition into their
applications.
Given a set of acoustic models, a dictionary, and a grammar,
each unknown utterance is recognised using a decoder. This is
the search engine at the heart of every speech recognition
system. The core HAPI package may be shipped with a variety of
decoders, for example the standard HTK decoder, the
professional MVX decoder for medium vocabulary tasks, or the
professional LVX decoder for large vocabulary tasks. In
addition to the decoder, the recogniser also requires an
appropriate dictionary and set of HMMs. These HMMs should be
trained on speech data which is similar to that which will be
recognised. For example, HMMs trained on wideband data
collected through a high-quality desktop microphone will not
perform well if used to recognise speech over the telephone.
The dictionary used should provide pronunciations, in terms of
models, for all words in the intended recognition
vocabulary.
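
The dictionary requirement above can be illustrated with a minimal sketch. It does not use HAPI or HTK; the data layout (a word mapped to lists of model names), the sample words and the function name `missing_entries` are all hypothetical. It only shows the kind of coverage check the paragraph implies: every vocabulary word needs a pronunciation, and every model named in a pronunciation needs a trained HMM.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: each word maps to one or more pronunciations,
# and each pronunciation is a list of acoustic model names.
Dictionary = Dict[str, List[List[str]]]


def missing_entries(
    vocabulary: Set[str], dictionary: Dictionary, models: Set[str]
) -> Tuple[List[str], List[str]]:
    """Return (words with no pronunciation, model names with no HMM)."""
    missing_words = sorted(w for w in vocabulary if w not in dictionary)
    missing_models = sorted(
        {
            model
            for word in vocabulary
            for pronunciation in dictionary.get(word, [])
            for model in pronunciation
            if model not in models
        }
    )
    return missing_words, missing_models


if __name__ == "__main__":
    vocab = {"call", "dial"}
    lexicon = {"call": [["k", "ao", "l"]]}
    hmms = {"k", "l"}
    print(missing_entries(vocab, lexicon, hmms))
```

A check of this kind would be run before recognition starts, so that gaps in the dictionary or model set are reported early rather than surfacing as recognition failures.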
The developer has the choice of either generating their own
recognition components or licensing components from the wide
range that Entropic has to offer. Entropic supplies pre-trained
models for a number of languages and environments together with
the corresponding dictionaries. For other requirements, custom
models and dictionaries can be built using HTK, which provides
a framework in which to produce and evaluate the accuracy and
performance of all major types of recognition system.
The particular route taken to produce the recogniser components
is irrelevant as far as HAPI is concerned. Once the necessary
components have been acquired, HAPI is all that is required to produce
both prototype and commercial applications. HAPI provides the
programmer with a simple programming interface to the chosen
components allowing them to incorporate speech recognition into
their application with the minimum of effort. Although HAPI can
be viewed as an extension to HTK (since it uses components
produced using HTK and shares the same code libraries), it is
better viewed as a stand-alone interface. The underlying
recogniser components of a HAPI application can be upgraded
with no need to modify any existing application code.
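
The idea that components can be upgraded without touching application code can be sketched as follows. This is not the HAPI API: the `Decoder` protocol, the two stub decoder classes and the `recognise` function are hypothetical names chosen for illustration, and the stubs return placeholder strings instead of performing any recognition.

```python
from typing import Protocol, Sequence


class Decoder(Protocol):
    """Hypothetical interface the application codes against."""

    def decode(self, features: Sequence[float]) -> str: ...


class SmallVocabDecoder:
    """Stand-in for one decoder; returns a placeholder result."""

    def decode(self, features: Sequence[float]) -> str:
        return "small:" + str(len(features))


class LargeVocabDecoder:
    """Stand-in for an upgraded decoder with the same interface."""

    def decode(self, features: Sequence[float]) -> str:
        return "large:" + str(len(features))


def recognise(decoder: Decoder, features: Sequence[float]) -> str:
    # Application code depends only on the interface, so the
    # decoder behind it can be replaced without changes here.
    return decoder.decode(features)


if __name__ == "__main__":
    frames = [0.1, 0.2, 0.3]
    print(recognise(SmallVocabDecoder(), frames))
    print(recognise(LargeVocabDecoder(), frames))
```

Swapping `SmallVocabDecoder` for `LargeVocabDecoder` changes only the object passed in; `recognise` itself is untouched, which is the property the paragraph describes.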
The combination of HAPI, HTK and Entropic's off-the-shelf
decoders provides the application developer with the complete
set of tools and components required to produce and utilise
state-of-the-art technology in applications spanning the entire
spectrum of current uses of speech recognition.
This book is divided into four main parts. This first part
describes the components of a speech recognition system and
provides a high-level overview of the capabilities of HAPI in
the form of a tutorial on how to build a simple application: a
phone dialer. Part 2 provides an in-depth description of the
facilities available within HAPI and, for readers who are
familiar with HTK and use it to produce their own recognition
systems, explains how these facilities relate to HTK. Part 3
describes a few example programs and touches on areas not
directly related to HAPI but which are nonetheless important
when producing a recognition system for a real application
(such as semantic parsing and dialogue management). Finally, the
appendices describe the differences between the various
flavours of HAPI (due to the different programming languages
being used to implement the specification) and provide a
complete reference section.
This book does not cover the production of the recognition
components (such as the acoustic models or word networks).
These processes are described in some detail in the HTK Book
and the grapHvite manual. Although there is a brief description
of speech recognition systems in this chapter, it is neither
detailed nor comprehensive. For more information on the
principles and algorithms used in speech recognisers the reader
is advised to read the HTK Book as well as general literature
from the field.
Index
Part I: Tutorial Overview
An Introduction to HAPI
An Overview of HAPI
Using HAPI: An Example Application
Using HAPI: Improving the Dialer
Using HAPI: Speaker Adaptation
Interfacing to a source driver
Part II: HAPI in Depth
Developing HAPI Applications
HAPI, Objects and Configuration
hapiHMMSetObject
hapiTransformObject
hapiSource/CoderObject
hapiDictObject
hapiLatObject
hapiNetObject
hapiRecObject
hapiResObject
Part III: Application Issues
System Design
Extended Results Processing
Part IV: Decoders
Decoder Variations
The Core HTK Decoder
The LVX Decoder
Part V: Appendices
A HAPI Reference
B HAPI from Java (JHAPI)
C Error and Warning Codes