
FINITE STATE TRANSDUCERS

INTRODUCTION
Finite state transducers (FSTs) are another kind of finite state machine: they can produce output that records the structure of the input.

FST

FST as recognizer: Accept / Reject
FST as generator: Yes / No and output string
FST as translator: Read and output string
FST as set relater: Compute relations between sets

Morphological Parsing with FSTs


Using a finite-state automaton (FSA) to recognize a morphological realization of a word is useful.
But what if we also want to analyze that word?
e.g. given cats, tell us that it's cat + N + PL
A finite-state transducer (FST) can give us the necessary technology to do this.
Two-level morphology:
Lexical level: stem plus affixes
Surface level: actual spelling/realization of the word
Roughly, we'll have the following for cats:

c:c a:a t:t ε:+N s:+PL
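To make the pairing concrete, here is a minimal Python sketch that stores the cats example as a list of (surface, lexical) pairs and reads each level back off. It assumes the symbol paired with +N on the surface side is the empty symbol; the names `EPS`, `CATS_PAIRS`, `surface`, and `lexical` are illustrative only.

```python
EPS = ""  # assumption: the empty symbol is modelled as the empty string

# Each pair is (surface symbol, lexical symbol), in the slide's surface:lexical order.
CATS_PAIRS = [("c", "c"), ("a", "a"), ("t", "t"), (EPS, "+N"), ("s", "+PL")]


def surface(pairs):
    """Concatenate the surface side of every pair."""
    return "".join(s for s, _ in pairs)


def lexical(pairs):
    """Concatenate the lexical side of every pair."""
    return "".join(l for _, l in pairs)


print(surface(CATS_PAIRS))
print(lexical(CATS_PAIRS))
```

Reading the left sides gives the spelled word and reading the right sides gives the analysis, which is the sense in which one path through the transducer relates the two levels.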

Example

Finite-State Transducers
While an FSA recognizes (accepts/rejects) an input expression, it doesn't produce any other output.
An FST, on the other hand, additionally produces an output expression; we define this in terms of relations.
So, an FSA is a recognizer, whereas an FST translates from one expression to another.
So, it reads from one tape and writes to another tape.
Actually, it can also read from the output tape and write to the input tape.
So, FSTs can be used for both analysis and generation (they are bidirectional).
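The bidirectionality can be sketched with one small arc table that is run in either direction. This is a hedged illustration, not a full FST library: the arc table, state numbers, and the `transduce` function are assumptions made for the example, and the search assumes there are no cycles of empty-input arcs.

```python
EPS = ""  # assumption: the empty symbol is modelled as the empty string

# Arcs: (source state, lexical symbol, surface symbol, target state)
ARCS = [
    (0, "c", "c", 1),
    (1, "a", "a", 2),
    (2, "t", "t", 3),
    (3, "+N", EPS, 4),
    (4, "+PL", "s", 5),
]
START, FINALS = 0, {5}


def transduce(tokens, arcs=ARCS, start=START, finals=FINALS, invert=False):
    """Return every output sequence for tokens.

    invert=False reads the lexical tape and writes the surface tape (generation);
    invert=True swaps the tapes (analysis).
    """
    results = []
    stack = [(start, 0, [])]  # (state, position in tokens, output so far)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(tokens) and state in finals:
            results.append(out)
        for src, lex, surf, dst in arcs:
            if src != state:
                continue
            read, write = (surf, lex) if invert else (lex, surf)
            emitted = out + [write] if write != EPS else out
            if read == EPS:
                # Empty input: move without consuming a token.
                stack.append((dst, pos, emitted))
            elif pos < len(tokens) and tokens[pos] == read:
                stack.append((dst, pos + 1, emitted))
    return results


print(transduce(["c", "a", "t", "+N", "+PL"]))  # generation
print(transduce(list("cats"), invert=True))      # analysis
```

The same arcs serve both calls; only the choice of which side is read and which is written changes.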

FSAs and FSTs


FSTs, then, are almost identical to FSAs. Both have:

Q: a finite set of states
q0: a designated start state
F: a set of final states
δ: a transition function

The difference: the alphabet (Σ) for an FST is now composed of complex symbols (e.g., X:Y)
FSA: Σ = a finite alphabet of symbols
FST: Σ = a finite alphabet of complex symbols, or pairs

As a shorthand, if we have X:X, we can write this as X
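One way to see how close the two machines are is to write the definition down as a data structure. The sketch below is an assumption-laden illustration: the class `FST`, the helper `pair`, and the field names are invented for this example, and the transition function is stored as a dictionary from (state, pair) to a set of next states.

```python
from dataclasses import dataclass


def pair(x, y=None):
    """Build a complex symbol X:Y; pair('c') is the shorthand for c:c."""
    return (x, x if y is None else y)


@dataclass
class FST:
    states: set   # Q: a finite set of states
    start: int    # q0: a designated start state
    finals: set   # F: a set of final states
    delta: dict   # transition function: (state, (X, Y)) -> set of next states

    def alphabet(self):
        """The alphabet is the set of pairs that label some transition."""
        return {sym for (_, sym) in self.delta}

    def accepts(self, pairs):
        """Treat the FST as an FSA whose symbols happen to be pairs."""
        current = {self.start}
        for p in pairs:
            current = set().union(*(self.delta.get((q, p), set()) for q in current))
        return bool(current & self.finals)


cats = FST(
    states={0, 1, 2, 3, 4, 5},
    start=0,
    finals={5},
    delta={
        (0, pair("c")): {1},
        (1, pair("a")): {2},
        (2, pair("t")): {3},
        (3, pair("", "+N")): {4},   # empty string stands in for the empty symbol
        (4, pair("s", "+PL")): {5},
    },
)

path = [pair("c"), pair("a"), pair("t"), pair("", "+N"), pair("s", "+PL")]
print(cats.accepts(path))
print(cats.accepts(path[:3]))
```

Nothing in `accepts` is specific to transducers: it is ordinary FSA recognition over an alphabet of pairs, which is the point of the comparison.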

Example

Spelling Rules

Application

Combining FSTs: Spelling Rules


So far, we have gone from a lexical level (e.g., cat+N+PL) to a surface level (e.g., cats).
But this surface level is actually an intermediate level; it doesn't take spelling into account.
So, the lexical level of fox+N+PL corresponds to fox^s
We will use ^ to refer to a morpheme boundary
We need another level to account for spelling rules

Lexicon FST
The lexicon FST will convert a lexical level to an intermediate form:

dog+N+PL → dog^s
fox+N+PL → fox^s
mouse+N+PL → mouse^s
dog+V+SG → dog^s

This will be of the form:

0 -> f -> 1    3 -> +N:^ -> 4
1 -> o -> 2    4 -> +PL:s -> 5
2 -> x -> 3    4 -> +SG:ε -> 6

And so on
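The transitions listed for fox can be run directly as a small deterministic transducer. This is a sketch under stated assumptions: states 5 and 6 are taken to be final, +SG is taken to write nothing, only the fox path is encoded, and the names `TRANSITIONS`, `tokenize`, and `to_intermediate` are hypothetical.

```python
import re

# (state, lexical symbol) -> (intermediate symbol, next state)
TRANSITIONS = {
    (0, "f"): ("f", 1),
    (1, "o"): ("o", 2),
    (2, "x"): ("x", 3),
    (3, "+N"): ("^", 4),
    (4, "+PL"): ("s", 5),
    (4, "+SG"): ("", 6),  # assumption: +SG produces no output symbol
}
FINALS = {5, 6}  # assumption: the source does not say which states are final


def tokenize(lexical):
    """Split 'fox+N+PL' into ['f', 'o', 'x', '+N', '+PL']."""
    return re.findall(r"\+[A-Z]+|.", lexical)


def to_intermediate(lexical):
    """Map a lexical string to its intermediate form, or None if rejected."""
    state, out = 0, []
    for sym in tokenize(lexical):
        if (state, sym) not in TRANSITIONS:
            return None
        written, state = TRANSITIONS[(state, sym)]
        out.append(written)
    return "".join(out) if state in FINALS else None


print(to_intermediate("fox+N+PL"))
print(to_intermediate("dog+N+PL"))
```

A fuller lexicon would share the +N and +PL arcs across all noun stems instead of spelling out one path per word.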

Lexicon rules
Insert an e on the surface tape just when the lexical tape has a morpheme ending in x (or z, etc.) and the next morpheme is s. The formalization of the rule is
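
As a rough illustration of the prose rule only, and not the formal notation the slide refers to, the e-insertion step can be approximated with a regular-expression rewrite on the intermediate form. This swaps in a regex substitution for a real transducer; the trigger letters default to x and z (the "etc." is left open), the s is assumed to be the final morpheme of the word, and the function name `e_insertion` is hypothetical.

```python
import re


def e_insertion(intermediate, triggers="xz"):
    """Rewrite e.g. 'fox^s' to 'foxes', then drop remaining boundaries.

    triggers: letters that, when ending a morpheme before 's', cause e to be
    inserted (an assumption; the source only names x and z explicitly).
    """
    # A trigger letter, then the boundary ^, then s as the last morpheme.
    pattern = rf"([{re.escape(triggers)}])\^s$"
    with_e = re.sub(pattern, r"\1es", intermediate)
    # Any boundary that did not trigger the rule is simply removed.
    return with_e.replace("^", "")


for form in ["fox^s", "dog^s", "cat^s"]:
    print(form, "->", e_insertion(form))
```

Feeding the lexicon FST's output into a step like this gives the two-stage mapping from lexical level to intermediate level to surface spelling.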
