In the post I will show how to add user defined Fst type to OpenFst. openfst.org doesn't go into detail on the process to add a new Fst type. The Fst type is memory mappable immutable transducers and this should allow for the same Fst instance to be shared across processes that load the same Fst.

All the code for the mmap-fst is avaliable in this gist. The implementation of the *mmap-fst *is contained in the file mmap-fst.h, the wrapper class around the mmap functions is taken from the mozc project. The wrapper supports both *nix and Windows oses, although only the former support the shared objects described in the article. I changed the length type in the Mmap class to allow larger than 2GB files. The MmapFst itself is basically a cut n' paste job on the ConstFst, the main difference is the MmapFst uses the mmap wrapper instead of array to access the state and arc information.

Given the Fst code we want to make the OpenFst library and existing OpenFst command line tools aware of of the MmapFst type. To achieve this we need the c++ code and the Makefile to construct a shared object. The registration code that is shown below and is very simple

The makefile below will compile the code to a shared object.

The final step is to make sure the shared object is on the LD_LIBRARY_PATH. We can then convert fsts to the mmap type and use it in all the command line tools. For example we can convert an Fst to MmapFst type by:

fstconvert --fst_typo=mmap < some.fst > mmap.fst

and view fst information by:

fstinfo mmap.fst

fst type mmap

arc type standard

input symbol table none

.....

....

# Archive for 2012

# Shortest Paths in Weighted Graphs using a Stack

When computing shortest paths in graph (for example as partof a beam search) it is tempting to apply a pruned recursive algorithm. This is equivalent to performing a generic shortest path algorithm with

$\hspace{3em}\text{Decrease-key}(Q, e.v, d[e.v])$

$\hspace{2em}\text{else}$

Consider the below graph, the vertex (state) marked 0 is the initial or start state and the double circle is the final or accept vertex (state), although the algorithm is computing the shortest path from vertex zero to every other vertex we are only interested in the shortest path to state three. On inspection it easy to see the shortest path is to follow all edges with cost 0 that is the path $0\rightarrow1\rightarrow2\rightarrow3$

the second column the element popped at that time step, the queue column is the items in the queue at the end of the current iteration and the substituent columns are the vertex costs. Here we see that for our graph that eight pops are required for the this graph as each path that has more edges but a lower

overall costs requires us to re-visit all the subsequent states again.

Time | Pop | Queue | $d[0]$ | $d[1]$ | $d[2]$ | $d[3]$ |

0 | - | 0 | $0$ | $\infty$ | $\infty$ | $\infty$ |

1 | 0 | 1,2,3 | $0$ | $0$ | $2$ | $3$ |

2 | 3 | 1,2 | $0$ | $0$ | $2$ | $3$ |

3 | 2 | 1,3 | $0$ | $0$ | $2$ | $2$ |

4 | 3 | 1 | $0$ | $0$ | $2$ | $2$ |

5 | 1 | 2,3 | $0$ | $0$ | $0$ | $1$ |

6 | 3 | 2 | $0$ | $0$ | $0$ | $1$ |

7 | 2 | 3 | $0$ | $0$ | $0$ | $0$ |

8 | 3 | $\emptyset$ | $0$ | $0$ | $0$ | $0$ |

Repeating the analysis for a FIFO queue we see that it only necessary to visit each state once.

Time | Pop | Queue | $d[0]$ | $d[1]$ | $d[2]$ | $d[3]$ |

0 | - | 0 | $0$ | $\infty$ | $\infty$ | $\infty$ |

1 | 0 | 1,2,3 | $0$ | $0$ | $2$ | $3$ |

1 | 1 | 2,3 | $0$ | $0$ | $0$ | $1$ |

1 | 2 | 3 | $0$ | $0$ | $0$ | $0$ |

1 | 3 | $\emptyset$ | $0$ | $0$ | $0$ | $0$ |

In summary when computing shortest paths the LIFO queue is fine if the graph has no weights each state only needs to be visited once. For weighted graphs either a FIFO or priority queue is a better choice, the later having better running time. Although in practise I have observed the FIFO approach consistently performing better. For graphs with negative weighted edges a priority queue can also exhibit exponential worst case running time.

[1] Jeff Erickson, "Algorithms Lecture 19: Shortest Paths", 2011, http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/19-sssp.pdf

# C++11x Range Based for loops

Here is a first pass at using C++11x range based for loops to over states and arcs in an OpenFst Fst. It also uses type inference (new use of the auto keyword.)

Requires the compiler has the relevant C++11x support, this code was tested using Visual Studio 11 Beta

**#include <iostream>**

**#include <fst/vector-fst.h>**

** **

**using namespace **fst;

**using namespace **std;

** **

**template**<**class **F>

**class **StateRange** **{

** public**:

** explicit **StateRange(**const **F&** **f)** **:** **f_(f)** **{** **}** **

** template**<**class **F>** **

** class **Iterator** **{** **

** typedef typename **F::StateId** **StateId;

** public**:

** explicit **Iterator** **(**const **F&** **f)** **:** **si_(f)** **{** **}

** **

** bool **operator !=** **(**const **Iterator** **&** **other)** ****const **{** **

** return **!si_.Done();

** **}** **

** **

** **StateId** **operator*** **()** ****const **{

** return **si_.Value();

** **}

** **

** const **Iterator&** **operator++** **()** **{** **

** **si_.Next();

** return *****this**;

** **}** **

** **StateIterator<F>** **si_;

** **};

**public**:

** **Iterator<F>** **begin()** ****const **{** ****return **Iterator<F>(f_);** **}

** **Iterator<F>** **end()** ****const **{** ****return **Iterator<F>(f_);** **}

** const **F&** **f_;** **

};

** **

**template**<**class **F>

StateRange<F>** **States(**const **F&** **f)** **{

** return **StateRange<F>(f);

}

** **

** **

**template**<**class **F>

**class **ArcRange** **{

**public**:

** typedef typename **F::StateId** **StateId;

** **ArcRange(**const **F&** **f,** **StateId** **s)** **:** **f_(f),** **s_(s)** **{** **}

** template**<**class **F>** **

** class **Iterator** **{** **

** typedef typename **F::Arc** **Arc;

** typedef typename **F::StateId** **StateId;

** public**:

** **Iterator** **(**const **F&** **f,** **StateId** **s)** **:** **ai_(f,** **s)** **{** **}

** bool **operator !=** **(**const **Iterator** **&** **other)** ****const **{** **

** return **!ai_.Done();

** **}** **

** **

** const **Arc&** **operator*** **()** ****const **{

** return **ai_.Value();

** **}

** **

** const **Iterator&** **operator++** **()** **{** **

** **ai_.Next();

** return *****this**;

** **}** **

** **ArcIterator<F>** **ai_;

** **};

**public**:

** **Iterator<F>** **begin()** ****const **{** ****return **Iterator<F>(f_,** **s_);** **}

** **Iterator<F>** **end()** ****const **{** ****return **Iterator<F>(f_,** **s_);** **}

** const **F&** **f_;

** const **StateId** **s_;

};

** **

**template**<**class **F>

ArcRange<F>** **Arcs(**const **F&** **f,** ****typename **F::StateId** **s)** **{

** return **ArcRange<F>(f,** **s);

}

** **

** **

**int **main()** **{** **

** **StdVectorFst** **fst;

** **fst.AddState();** **

** **fst.AddState();

** **fst.SetStart(**0**);

** **fst.SetFinal(**1**,** **TropicalWeight::One());

** **fst.AddArc(**0**,** **StdArc(**0**,** ****0**,** ****0**,** ****1**));

** **fst.AddArc(**0**,** **StdArc(**1**,** ****0**,** ****0**,** ****1**));

** **fst.AddArc(**0**,** **StdArc(**2**,** ****0**,** ****0**,** ****1**));

** **

** for **(**auto **s** **:** States**(**fst**))

** for **(**auto **a** **:** Arcs**(**fst**,** s**))** **

** **printf(**"%d %d %d %d %f\n"**,** **s,** **a.**nextstate**,** **a.**ilabel**,** **a.**olabel**,** **a.**weight**.**Value**());** **

}