FreeLing  3.0
splitter Class Reference

Class splitter implements a sentence splitter, which accumulates lists of words until a sentence is completed, and then returns a list of sentence objects.

#include <splitter.h>

Public Member Functions

 splitter (const std::wstring &)
 Constructor.
void split (const std::list< word > &, bool, std::list< sentence > &ls)
 split sentences with default options
std::list< sentence > split (const std::list< word > &, bool)
 Split and return a copy of the sentences.

Private Member Functions

bool end_of_sentence (std::list< word >::const_iterator, const std::list< word > &) const
 check for sentence markers

Private Attributes

bool SPLIT_AllowBetweenMarkers
 configuration options
int SPLIT_MaxWords
std::set< std::wstring > starters
 Sentence delimiters.
std::map< std::wstring, bool > enders
std::map< std::wstring, int > markers
 Open-close marker pairs (parentheses, etc.)
bool betweenMrk
int no_split_count
std::list< int > mark_type
std::list< std::wstring > mark_form
sentence buffer
 accumulated list of returned sentences

Detailed Description

Class splitter implements a sentence splitter, which accumulates lists of words until a sentence is completed, and then returns a list of sentence objects.


Constructor & Destructor Documentation

splitter::splitter ( const std::wstring &  splitFile)

Constructor.

Create a sentence splitter.

References ERROR_CRASH, util::open_utf8_file(), and SAME.


Member Function Documentation

bool splitter::end_of_sentence ( std::list< word >::const_iterator  w,
const std::list< word > &  v 
) const [private]

check for sentence markers

Check whether a word is a sentence end (e.g., a dot followed by a capitalized word).

References util::is_capitalized().
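
The "dot followed by a capitalized word" check can be illustrated with a minimal, self-contained sketch. It is not FreeLing's implementation: it uses plain std::wstring tokens instead of word objects, the helper names starts_uppercase and looks_like_sentence_end are hypothetical, and treating a dot at the very end of the list as a sentence end is an assumption made for the example.

```cpp
#include <cassert>
#include <cwctype>
#include <iterator>
#include <list>
#include <string>

// Hypothetical helper: true if the first character is uppercase.
// FreeLing's util::is_capitalized may behave differently.
inline bool starts_uppercase(const std::wstring& s) {
    return !s.empty() && std::iswupper(static_cast<std::wint_t>(s.front())) != 0;
}

// Sketch: the token at `w` ends a sentence if it is "." and it is either
// the last token of `v` (assumption) or is followed by a capitalized token.
inline bool looks_like_sentence_end(std::list<std::wstring>::const_iterator w,
                                    const std::list<std::wstring>& v) {
    if (w == v.end() || *w != L".") return false;
    auto next = std::next(w);
    return next == v.end() || starts_uppercase(*next);
}

// Small usage check for the sketch above.
inline void sentence_end_example() {
    const std::list<std::wstring> v = {L"It", L"works", L".", L"Then", L"e", L".", L"g"};
    auto it = v.begin();
    std::advance(it, 2);                       // first ".", followed by "Then"
    assert(looks_like_sentence_end(it, v));
    std::advance(it, 3);                       // second ".", followed by "g"
    assert(!looks_like_sentence_end(it, v));
}
```

Passing both the iterator and the whole list mirrors the signature documented above: the list is needed to look one token ahead without running past the end.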

void splitter::split ( const std::list< word > &  ,
bool  ,
std::list< sentence > &  ls 
)

split sentences with default options

list< sentence > splitter::split ( const std::list< word > &  v,
bool  flush 
)

Split and return a copy of the sentences.
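
The relation between the two split overloads and the flush argument can be sketched as follows. This is a toy model under stated assumptions, not FreeLing code: Word, Sentence, and ToySplitter are hypothetical stand-ins, a sentence is assumed to end at every "." token, and flush is assumed to force out whatever is still buffered.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-ins for FreeLing's word and sentence types.
using Word = std::wstring;
using Sentence = std::list<Word>;

// Toy splitter: buffers words across calls and emits a sentence
// whenever it sees a "." token.
class ToySplitter {
public:
    // Appends the sentences completed by `words` to `out`.
    // If `flush` is true, leftover buffered words are emitted as a
    // final sentence instead of waiting for more input (assumption).
    void split(const std::list<Word>& words, bool flush, std::list<Sentence>& out) {
        for (const Word& w : words) {
            buffer_.push_back(w);
            if (w == L".") {
                out.push_back(buffer_);
                buffer_.clear();
            }
        }
        if (flush && !buffer_.empty()) {
            out.push_back(buffer_);
            buffer_.clear();
        }
    }

    // Convenience overload: same work, sentences returned by value.
    std::list<Sentence> split(const std::list<Word>& words, bool flush) {
        std::list<Sentence> out;
        split(words, flush, out);
        return out;
    }

private:
    Sentence buffer_;  // words of the sentence still being accumulated
};

// Small usage check for the sketch above.
inline void toy_splitter_example() {
    ToySplitter sp;
    // "Next" has no terminator yet, so it stays buffered.
    std::list<Sentence> first = sp.split({L"Hello", L"world", L".", L"Next"}, false);
    assert(first.size() == 1);
    // The buffered word starts the sentence completed by the second call.
    std::list<Sentence> second = sp.split({L"one", L"."}, false);
    assert(second.size() == 1 && second.front().front() == L"Next");
}
```

Because the buffer lives in the object, text can be fed in chunks (for example line by line); under the assumption above, passing true for flush on the last chunk keeps trailing words from being lost.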


Member Data Documentation

bool splitter::betweenMrk [private]
sentence splitter::buffer [private]

accumulated list of returned sentences

accumulated words of current sentence

std::map<std::wstring,bool> splitter::enders [private]
std::list<std::wstring> splitter::mark_form [private]
std::list<int> splitter::mark_type [private]
std::map<std::wstring,int> splitter::markers [private]

Open-close marker pairs (parentheses, etc.)
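
One way to track whether the current position lies between an opening and a closing marker is sketched below. The encoding is an assumption made for illustration (a positive value +k opens pair k, a negative value -k closes it) and need not match how FreeLing fills markers, mark_type, or mark_form. MarkerTracker and its member names are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>
#include <utility>

// Hypothetical encoding: +k opens marker pair k, -k closes it.
class MarkerTracker {
public:
    explicit MarkerTracker(std::map<std::wstring, int> markers)
        : markers_(std::move(markers)) {}

    // Updates the stack of open markers with one token.
    void feed(const std::wstring& form) {
        auto it = markers_.find(form);
        if (it == markers_.end()) return;       // not a marker
        if (it->second > 0) {
            open_.push_back(it->second);        // opener: remember its pair id
        } else if (!open_.empty() && open_.back() == -it->second) {
            open_.pop_back();                   // closer matching the innermost opener
        }
        // Unmatched closers are ignored in this sketch.
    }

    // True while at least one opener is still waiting for its closer.
    bool between_markers() const { return !open_.empty(); }

private:
    std::map<std::wstring, int> markers_;
    std::list<int> open_;  // pair ids of currently open markers, innermost last
};

// Small usage check for the sketch above.
inline void marker_tracker_example() {
    std::map<std::wstring, int> m = {{L"(", 1}, {L")", -1}};
    MarkerTracker t(m);
    t.feed(L"(");
    t.feed(L".");
    assert(t.between_markers());   // a "." here is inside the parentheses
    t.feed(L")");
    assert(!t.between_markers());
}
```

A splitter that does not allow splitting between markers could consult such a flag before accepting a sentence end.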

int splitter::no_split_count [private]
bool splitter::SPLIT_AllowBetweenMarkers [private]

configuration options

int splitter::SPLIT_MaxWords [private]
std::set<std::wstring> splitter::starters [private]

Sentence delimiters.


The documentation for this class was generated from the following files: