Document Fragment.
#include <gdocfragment.h>
Classes | |
struct | Search |
Public Member Functions | |
GDocFragment (GDoc *doc, const GConceptRecord *root, size_t pos, size_t spos, size_t begin, size_t end, const R::RDate &proposed=R::RDate::Null) | |
GDocFragment (GDoc *doc, size_t begin, size_t end, const R::RDate &proposed=R::RDate::Null) | |
int | Compare (const GDocFragment &d) const |
int | Compare (const Search &search) const |
GDoc * | GetDoc (void) const |
bool | IsFlat (void) const |
const GConceptRecord * | GetRoot (void) const |
size_t | GetNbChildren (void) const |
R::RCursor< const GConceptRecord > | GetChildren (void) const |
R::RDate | GetProposed (void) const |
size_t | GetPos (void) const |
size_t | GetSyntacticPos (void) const |
size_t | GetBegin (void) const |
size_t | GetEnd (void) const |
R::RString | GetFragment (size_t max=0) |
void | AddChild (const GConceptRecord *rec) |
bool | Overlap (const GDocFragment *fragment) const |
void | Merge (const GDocFragment *fragment) |
void | Print (void) const |
virtual | ~GDocFragment (void) |
Private Attributes | |
GDoc * | Doc |
const GConceptRecord * | Root |
R::RString | Fragment |
size_t | Pos |
size_t | SyntacticPos |
size_t | Begin |
size_t | End |
R::RDate | Proposed |
bool | WholeDoc |
R::RContainer< const GConceptRecord, false, false > | Children |
Detailed Description
Document Fragment.
The GDocFragment class provides a representation for a document fragment. In practice, a fragment is anchored at a position and is defined by a text window.
Each fragment is associated with different (concept) nodes that are responsible for its selection in a query:
- a root node (Root).
- a set of child nodes (Children).
There are three kinds of fragments:
- A fragment that represents a whole document. The root node is always null, and the child nodes are those responsible for the selection.
- A fragment that represents a single node (the root node). This is the case of a fragment in a flat document selected by a given word.
- A fragment that is rooted in a node (the root one) and was selected by a set of child nodes. This can be the case of an XML fragment selected by two tags. The root node is then the deepest common parent of those child nodes.
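The three kinds listed above can be told apart from the outside with a small helper. The following is only a sketch: `FragmentKind` and `Classify` are hypothetical names that are not part of the library, and it assumes that a single-node fragment is one with a root and no child nodes, which the text suggests but does not state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classification of the three documented kinds of fragments.
enum class FragmentKind { WholeDocument, SingleNode, RootedInNode };

// wholeDoc:   true if the fragment represents the whole document.
// nbChildren: number of child nodes, as GetNbChildren() would report it.
FragmentKind Classify(bool wholeDoc, std::size_t nbChildren)
{
    if (wholeDoc)
        return FragmentKind::WholeDocument;  // root is null, children did the selection
    if (nbChildren == 0)
        return FragmentKind::SingleNode;     // assumption: only the root node is involved
    return FragmentKind::RootedInNode;       // root is the common parent of the children
}
```

With the real class, the two arguments would presumably come from the fragment itself, for instance `GetNbChildren()` for the second one.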
Each search engine defines what a window is, possibly depending on the document type. To extract the text fragment of the document, the corresponding filter (GFilter class) is used.
Two document fragments are considered identical if they relate to the same document and start at the same position.
- Warning
- The GDocFragment class manages pointers to GConceptRecord. It is never responsible for their deallocation.
Constructor & Destructor Documentation
GDocFragment | ( | GDoc * | doc, |
const GConceptRecord * | root, | ||
size_t | pos, | ||
size_t | spos, | ||
size_t | begin, | ||
size_t | end, | ||
const R::RDate & | proposed = R::RDate::Null |
||
) |
Constructor of a document fragment.
- Parameters
-
doc | Document. |
root | Root concept record. |
pos | Position of the fragment centre. |
spos | Syntactic position of the fragment centre. |
begin | Beginning position of the window. |
end | End position of the window. |
proposed | Date at which the fragment was proposed. |
GDocFragment | ( | GDoc * | doc, |
size_t | begin, | ||
size_t | end, | ||
const R::RDate & | proposed = R::RDate::Null |
||
) |
Constructor of a document fragment representing the whole document. A window must be specified (but it can be an empty one).
- Parameters
-
doc | Document. |
begin | Beginning position of the window. |
end | End position of the window. |
proposed | Date at which the fragment was proposed. |
virtual ~GDocFragment | ( | void | ) |
Destructor.
Member Function Documentation
int Compare | ( | const GDocFragment & | d | ) | const |
Method to compare document fragments.
- Parameters
-
d Document fragment to compare with.
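As an illustration of the identity rule given in the detailed description (same document, same starting position), a three-way comparison could look like the sketch below. `FragmentKey`, its fields and `CompareKeys` are hypothetical stand-ins. The real class holds a `GDoc` pointer rather than a numeric identifier, and the ordering that `Compare` actually applies is not documented here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the data that identifies a fragment.
struct FragmentKey
{
    std::size_t DocId;  // assumed numeric document identifier
    std::size_t Start;  // assumed starting position of the fragment
};

// Returns a negative value, zero or a positive value, in the style of an
// int Compare() method. Documents are compared first, then positions.
int CompareKeys(const FragmentKey& a, const FragmentKey& b)
{
    if (a.DocId != b.DocId)
        return a.DocId < b.DocId ? -1 : 1;
    if (a.Start != b.Start)
        return a.Start < b.Start ? -1 : 1;
    return 0;  // same document and same start: considered identical
}
```

Under these assumptions, two keys compare equal exactly when both fields match, whatever the window end or the child nodes are.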
int Compare | ( | const Search & | search | ) | const |
Method to compare a document fragment and a document fragment signature.
- Parameters
-
search Search.
GDoc* GetDoc | ( | void | ) | const |
Get the document. If it is null, the URI is considered unknown in the session.
- Returns
- the pointer to the document.
bool IsFlat | ( | void | ) | const |
Check whether the document fragment is a flat one. It is considered flat in several cases:
- It has no selected concept node.
- The selected concept node has no parent.
- The fragment represents a whole document.
- Returns
- true if it is flat or false if not.
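The three cases above translate into a short predicate. This is a minimal sketch on simplified stand-in types: `Node`, `FragmentView` and `IsFlatSketch` are hypothetical, and the `Parent` link is an assumption about how a concept record reaches its parent.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a concept record in a structure tree.
struct Node
{
    const Node* Parent;  // null for a node without parent
};

// Hypothetical, simplified view of a fragment.
struct FragmentView
{
    const Node* Root;  // selected concept node, may be null
    bool WholeDoc;     // true if the fragment represents the whole document
};

// Mirrors the three documented cases in which a fragment counts as flat.
bool IsFlatSketch(const FragmentView& f)
{
    if (f.WholeDoc)
        return true;                   // the fragment represents a whole document
    if (f.Root == nullptr)
        return true;                   // no selected concept node
    return f.Root->Parent == nullptr;  // the selected node has no parent
}
```

A fragment rooted in a nested node is therefore the only non-flat case in this sketch.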
const GConceptRecord* GetRoot | ( | void | ) | const |
Get the root concept node corresponding to the fragment.
- Returns
- a pointer to a GConceptRecord.
- Warning
- The pointer may be null if the fragment corresponds to the whole document or if the structure trees are not built during the analysis.
size_t GetNbChildren | ( | void | ) | const |
- Returns
- the number of children.
R::RCursor<const GConceptRecord> GetChildren | ( | void | ) | const |
- Returns
- a cursor over the children.
R::RDate GetProposed | ( | void | ) | const |
- Returns
- the date of the suggestion.
size_t GetPos | ( | void | ) | const |
Get the position of the fragment centre.
- Returns
- a size_t.
size_t GetSyntacticPos | ( | void | ) | const |
Get the syntactic position of the fragment centre.
- Returns
- a size_t.
size_t GetBegin | ( | void | ) | const |
Get the beginning of the fragment window.
- Returns
- a size_t.
size_t GetEnd | ( | void | ) | const |
Get the end of the fragment window.
- Returns
- a size_t.
R::RString GetFragment | ( | size_t | max = 0 | ) |
Get the text fragment. If necessary, it is extracted from the file.
- Parameters
-
max Maximum number of characters to extract. If zero, the whole fragment is extracted.
- Returns
- a R::RString.
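The "extracted if necessary" behaviour can be pictured as lazy extraction with a cache, which would also explain why the method is not `const`. The sketch below uses `std::string` instead of `R::RString`. `LazyFragment` and its extraction callback are hypothetical, the window is assumed to be the half-open range from `begin` to `end`, and whether the library caches the full text or only the truncated one is not stated in this page.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in: caches the window text and extracts it on first use.
class LazyFragment
{
    using Extractor = std::function<std::string(std::size_t, std::size_t)>;

    std::size_t Begin;   // beginning of the window
    std::size_t End;     // end of the window (assumed exclusive)
    Extractor Extract;   // assumed callback playing the role of the filter
    std::string Text;    // cached text of the window
    bool Loaded = false; // true once Text has been extracted

public:
    LazyFragment(std::size_t begin, std::size_t end, Extractor extract)
        : Begin(begin), End(end), Extract(std::move(extract))
    {
        assert(begin <= end);
    }

    // Returns at most max characters; max == 0 means the whole fragment.
    std::string Get(std::size_t max = 0)
    {
        if (!Loaded)
        {
            Text = Extract(Begin, End);  // the source is read only once
            Loaded = true;
        }
        if (max == 0 || max >= Text.size())
            return Text;
        return Text.substr(0, max);
    }
};
```

Calling `Get()` and then `Get(3)` on the same object triggers a single extraction in this sketch; the second call only truncates the cached text.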
void AddChild | ( | const GConceptRecord * | rec | ) |
Add a child record to the document fragment. The interval of the fragment is adjusted if necessary in order to contain the child (except if the fragment represents the whole document).
- Parameters
-
rec Concept record to add.
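The interval adjustment described for AddChild can be sketched as follows. `Record`, `Window` and their members are hypothetical stand-ins; in particular, it is an assumption that a child is located by a single position and that both bounds of the window are inclusive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a concept record: only its position is kept.
struct Record
{
    std::size_t Pos;
};

// Hypothetical, simplified fragment window.
struct Window
{
    std::size_t Begin;
    std::size_t End;
    bool WholeDoc;                        // true if the whole document is represented
    std::vector<const Record*> Children;  // pointers are stored, never owned

    void AddChild(const Record* rec)
    {
        assert(rec != nullptr);
        Children.push_back(rec);
        if (WholeDoc)
            return;  // a whole-document fragment keeps its window untouched
        Begin = std::min(Begin, rec->Pos);  // grow to the left if needed
        End = std::max(End, rec->Pos);      // grow to the right if needed
    }
};
```

The window only ever grows in this sketch; a child that already lies inside it leaves both bounds unchanged.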
bool Overlap | ( | const GDocFragment * | fragment | ) | const |
Check whether two fragments overlap. In practice, the method follows several steps:
- It checks whether at least one fragment represents the whole document.
- It checks whether both fragments have the same selected node or have no parent nodes (for flat documents).
- It checks whether the two intervals overlap.
- It looks if the two intervals overlap.
- Parameters
-
fragment Fragment to compare with.
- Returns
- true if overlap.
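One plausible reading of the steps above is sketched here; the documentation does not say how the steps combine, so the short-circuit order is an assumption. `Frag` and `OverlapSketch` are hypothetical, the root is treated as an opaque pointer, intervals are assumed to be closed, and both fragments are assumed to belong to the same document.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified view of a fragment.
struct Frag
{
    const void* Root;   // selected node, opaque here, may be null
    bool Flat;          // true if the fragment has no parent node
    bool WholeDoc;      // true if the fragment represents the whole document
    std::size_t Begin;  // beginning of the window
    std::size_t End;    // end of the window (assumed inclusive)
};

bool OverlapSketch(const Frag& a, const Frag& b)
{
    assert(a.Begin <= a.End && b.Begin <= b.End);

    // Step 1: a whole-document fragment is assumed to overlap with anything.
    if (a.WholeDoc || b.WholeDoc)
        return true;

    // Step 2: intervals are only comparable for the same selected node
    // or for two flat fragments.
    const bool comparable = (a.Root == b.Root) || (a.Flat && b.Flat);
    if (!comparable)
        return false;

    // Step 3: classic test for two closed intervals sharing a position.
    return a.Begin <= b.End && b.Begin <= a.End;
}
```

Under this reading, two fragments with intersecting windows but different selected nodes in a structured document do not overlap.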
void Merge | ( | const GDocFragment * | fragment | ) |
Merge the children of another fragment into this one. The interval of the fragment is adjusted if necessary in order to contain all the children (except if the fragment represents the whole document).
- Parameters
-
fragment Fragment to merge.
void Print | ( | void | ) | const |
Print some information related to the document fragment.
Member Data Documentation
GDoc* Doc | private |
Reference to the document.
const GConceptRecord* Root | private |
Root concept record.
R::RString Fragment | private |
The fragment.
size_t Pos | private |
Position of the fragment.
size_t SyntacticPos | private |
Syntactic position of the fragment.
size_t Begin | private |
Beginning position of the fragment window.
size_t End | private |
End position of the fragment window.
R::RDate Proposed | private |
Date when the fragment was proposed.
bool WholeDoc | private |
Does the fragment correspond to the whole document?
R::RContainer<const GConceptRecord, false, false> Children | private |
Child concept records used by the query to select the node.