Thursday, January 19, 2012

Moses phrase-based decoder analysis

(1). Start from moses-cmd/src/Main.cpp, whose entry point is int main(int argc, char* argv[])

(2). Main.cpp first calls parameter->LoadParam(argc, argv) to load and check the parameters given in the moses.ini configuration file and on the command line; the model files are not loaded at this stage

(3). Main.cpp then calls StaticData::LoadDataStatic(parameter) to load the weights and models according to the parameters loaded in (2)
(3.1) StaticData::LoadDataStatic(parameter) calls StaticData::LoadData(Parameter *parameter)
(3.1.1) StaticData::LoadData(Parameter *parameter) loads the weights and models by calling, e.g., StaticData::LoadLanguageModels() and LoadPhraseTables()
StaticData::LoadLanguageModels() calls LanguageModel* CreateLanguageModel(LMImplementation lmImplementation, const std::vector<FactorType> &factorTypes, size_t nGramOrder, const std::string &languageModelFile, float weight, ScoreIndexManager &scoreIndexManager, int dub) to create LM instances. The top-level LM class is class LanguageModel : public StatefulFeatureFunction. LanguageModel is the parent class of LanguageModelSingleFactor and LanguageModelMultiFactor, and LanguageModelInternal is a subclass of LanguageModelSingleFactor.
In Moses, the main interfaces specific to LM classes such as LanguageModelInternal are bool Load(...) and float GetValue(const std::vector<const Word*> &contextFactor, State* finalState = 0, unsigned int* len = 0) const. The former loads an LM file, while the latter computes the probability of the n-gram stored in contextFactor. The class LanguageModel implements the general interface of a feature function, e.g., Evaluate(..)

(4). Main.cpp uses IOWrapper *ioWrapper = GetIODevice(staticData) to set up the input device (an input file or standard input)

(5). Main.cpp uses vector<float> weights = staticData.GetAllWeights() to retrieve and check the weights

(6). Main.cpp starts the main loop of translating input instances (text, confusion network, or lattice):
(6.1). use ReadInput(*ioWrapper,staticData.GetInputType(),source) to read one input, which is stored in source
(6.2). set up the translation manager by calling Manager manager(*source, staticData.GetSearchAlgorithm()); the call to staticData.InitializeBeforeSentenceProcessing(source) initializes the translation and language models for this sentence; the language model list is StaticData.m_languageModel; the default search algorithm is SearchNormal;
(6.3). expand translation hypotheses stack by stack until the end of the input sentence is reached, using manager.ProcessSentence()
(6.3.1). ProcessSentence() first resets the statistics using staticData.ResetSentenceStats(m_source)
(6.3.2). ProcessSentence() then collects translation options for the input sentence
(6.3.3). ProcessSentence() calls the search algorithm to process the input using m_search->ProcessSentence()
(6.4). pick the best translation (maximum a posteriori decoding)
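
The flow in steps (3) and (6) can be illustrated with a small, self-contained sketch. It is not Moses code: every Toy* name below is a hypothetical stand-in, the language model is a unigram table loaded from a string rather than a file, and the search is a monotone word-by-word pass that keeps a single best hypothesis instead of Moses' stack-based search. It only aims to show the shape of the design: an abstract LM interface with Load and GetValue, models loaded once into a shared data object, and a per-sentence routine that combines translation and LM scores and picks the best output.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// All Toy* names are hypothetical stand-ins, not real Moses classes.

// Plays the role of LanguageModel: one interface shared by all LM types.
class ToyLanguageModel {
public:
  virtual ~ToyLanguageModel() {}
  // Load model data (here "word logprob" pairs from a string, not a file).
  virtual bool Load(const std::string &data) = 0;
  // Score the n-gram in contextFactor; its last element is the predicted word.
  virtual float GetValue(const std::vector<std::string> &contextFactor) const = 0;
};

// Plays the role of a concrete subclass such as LanguageModelInternal.
class ToyUnigramLM : public ToyLanguageModel {
public:
  bool Load(const std::string &data) override {
    std::istringstream in(data);
    std::string word;
    float logProb;
    while (in >> word >> logProb) m_logProb[word] = logProb;
    return !m_logProb.empty();
  }
  float GetValue(const std::vector<std::string> &contextFactor) const override {
    if (contextFactor.empty()) return 0.0f;
    auto it = m_logProb.find(contextFactor.back());
    return it == m_logProb.end() ? -100.0f : it->second;  // crude unknown-word score
  }
private:
  std::map<std::string, float> m_logProb;
};

// One translation option for a source word, with its translation-model score.
struct ToyOption {
  std::string target;
  float score;
};

// Plays the role of StaticData: models are loaded once, before the main loop.
struct ToyStaticData {
  std::map<std::string, std::vector<ToyOption> > phraseTable;
  std::unique_ptr<ToyLanguageModel> lm;
};

// Plays the role of manager.ProcessSentence() plus picking the best output.
// Monotone and word by word, keeping only the best-scoring option per word.
std::string ToyTranslate(const ToyStaticData &sd, const std::string &source) {
  std::istringstream in(source);
  std::string word, output;
  while (in >> word) {
    std::string best = word;  // unknown source words are passed through
    auto found = sd.phraseTable.find(word);
    if (found != sd.phraseTable.end()) {
      bool first = true;
      float bestScore = 0.0f;
      for (const ToyOption &opt : found->second) {
        // Combine the translation score with the LM score of the target word.
        const float s = opt.score + sd.lm->GetValue({opt.target});
        if (first || s > bestScore) {
          bestScore = s;
          best = opt.target;
          first = false;
        }
      }
    }
    if (!output.empty()) output += ' ';
    output += best;
  }
  return output;
}
```

To try it, create a ToyStaticData, assign a ToyUnigramLM to lm and call Load on it, fill phraseTable, and then call ToyTranslate once per input sentence, mirroring the load-once, translate-in-a-loop structure of Main.cpp.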

Sunday, January 15, 2012

How to install Ruby in your local directory from source code

tar -xzvf ruby-1.9.3-p0.tar.gz
cd ruby-1.9.3-p0
./configure --prefix=$HOME
make install

Saturday, January 14, 2012

Drawing figures with GNUplot

Recently I needed to draw a curve for a paper written in LaTeX. A friend suggested gnuplot, which is a really nice tool for drawing curves and other figures.
There are a lot of helpful examples on Wikimedia.

How to burn CN image onto a DVD disc using NERO


2. In the popup window, click the tab titled ISO.

3. In the ISO tab, click the OPEN button and select the image that you want to burn to the disc.

4. After you select the image, Nero returns to the original window.
The top-left corner of that window shows CD in a drop-down menu.
Click the menu and select DVD instead.

5. Finish the burning process as usual.