Matrix Measure.

#include <gmatrixmeasure.h>


Public Types

enum  tType { Full, Sparse, NearestNeighbours }
 

Public Member Functions

 GMatrixMeasure (GSession *session, GPlugInFactory *fac, tObjType lines, tObjType cols, bool sym)
 
virtual R::RCString GetClassName (void) const
 
void SetElementsType (bool sym, tObjType lines, tObjType cols)
 
tObjType GetLinesType (void) const
 
tObjType GetColsType (void) const
 
double GetCutoffFrequency (void) const
 
virtual R::RString GetRootDir (void) const
 
virtual R::RString GetFilesName (void) const
 
void SetForceCompute (bool compute)
 
bool MustForceCompute (void) const
 
virtual void ApplyConfig (void)
 
virtual void Init (void)
 
virtual void Reset (void)
 
virtual void Measure (size_t measure,...)
 
virtual void Info (size_t info,...)
 
virtual double Compute (GObject *obj1, GObject *obj2)=0
 
virtual size_t GetId (void *obj, bool line)=0
 
size_t GetNbDiffElements (void)
 
virtual void CreateConfig (void)
 
virtual ~GMatrixMeasure (void)
 
- Public Member Functions inherited from RObject
 RObject (const RString &name=RString::Null)
 
int Compare (const RObject &obj) const
 
RString GetName (void) const
 
virtual void HandlerNotFound (const RNotification &notification)
 
void PostNotification (const hNotification handle)
 
void PostNotification (const RCString &name)
 
void PostNotification (const hNotification handle, T data)
 
void PostNotification (const RCString &name, T data)
 
void InsertObserver (tNotificationHandler handler, const hNotification handle, RObject *object)
 
void InsertObserver (tNotificationHandler handler, const RCString &name, RObject *object)
 
void InsertObserver (tNotificationHandler handler, const hNotification handle)
 
void InsertObserver (tNotificationHandler handler, const RCString &name)
 
void InsertObserver (tNotificationHandler handler, RObject *object)
 
void InsertObserver (tNotificationHandler handler)
 
void DeleteObserver (void)
 
void DeleteObserver (const hNotification handle, RObject *object)
 
void DeleteObserver (const RCString &name, RObject *object)
 
hNotification GetNotificationHandle (const RCString &name) const
 
RCString GetNotificationName (const hNotification handle) const
 
virtual ~RObject (void)
 
- Public Member Functions inherited from GMeasure
 GMeasure (GSession *session, GPlugInFactory *fac)
 
virtual ~GMeasure (void)
 
- Public Member Functions inherited from GPlugIn
 GPlugIn (GSession *session, GPlugInFactory *fac)
 
void InsertParam (R::RParam *param)
 
template<class T >
T * FindParam (const R::RString &name)
 
R::RCursor< R::RParam > GetParams (const R::RString &cat=R::RString::Null)
 
void GetCategories (R::RContainer< R::RString, true, false > &cats)
 
GPlugInFactory * GetFactory (void) const
 
int Compare (const GPlugIn &plugin) const
 
int Compare (const R::RString &plugin) const
 
R::RString GetName (void) const
 
R::RString GetDesc (void) const
 
GSession * GetSession (void) const
 
virtual void Done (void)
 
virtual ~GPlugIn (void)
 

Private Member Functions

void InitMatrix (void)
 
void ChangeMemSize (void)
 
void ChangeStorageSize (void)
 
void AddIdentificator (size_t id, bool line)
 
void DirtyIdentificator (size_t id, bool line, bool file)
 
void DeleteIdentificator (size_t id, bool line)
 
void DestroyIdentificator (size_t id, bool line)
 
void HandleLineNew (const R::RNotification &notification)
 
void HandleLineModified (const R::RNotification &notification)
 
void HandleLineDel (const R::RNotification &notification)
 
void HandleLineDestroy (const R::RNotification &notification)
 
void HandleColNew (const R::RNotification &notification)
 
void HandleColModified (const R::RNotification &notification)
 
void HandleColDel (const R::RNotification &notification)
 
void HandleColDestroy (const R::RNotification &notification)
 
void UpdateSparse (void)
 
void UpdateNearestNeighborsRAM (void)
 
void UpdateNearestNeighborsFast (void)
 
void UpdateMem (void)
 
void UpdateStorage (void)
 
void AddValue (double val)
 
void DeleteValue (double &val)
 

Private Attributes

tType Type
 
bool Symmetric
 
R::RGenericMatrix * Matrix
 
R::RMatrixStorage Storage
 
size_t MaxIdLine
 
size_t MaxIdCol
 
bool ChangeSize
 
bool DirtyMem
 
bool DirtyFile
 
double Mean
 
size_t NbValues
 
double CutoffFrequency
 
double Deviation
 
double DeviationRate
 
double MinMeasure
 
bool AutomaticMinMeasure
 
bool InMemory
 
bool InStorage
 
R::RString Dir
 
size_t NbNearest
 
size_t NbSamples
 
bool FastNN
 
tObjType Lines
 
tObjType Cols
 
bool ForceCompute
 

Additional Inherited Members

- Protected Attributes inherited from RObject
RString Name
 
- Protected Attributes inherited from GPlugIn
GPlugInFactory * Factory
 
GSession * Session
 
size_t Id
 

Detailed Description

Matrix Measure.

The GMatrixMeasure class provides a representation for a measure, $ [M]=m_{i,j} $, represented by a matrix of values, such as the similarity between two elements. A cutoff frequency can be specified.

The class maintains the mean and the deviation of the values computed. A minimum value is computed using:

$ \min([M])=\overline{[M]}+DeviationRate \cdot \sigma_{[M]} $
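As a rough illustration of the formula above, the following sketch computes the mean, the deviation and the resulting minimum for a list of values. The helper name `ComputeMinMeasure` and the choice of a population standard deviation are assumptions made for this example; they are not taken from the class implementation.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not part of GMatrixMeasure): returns
// mean + deviationRate * sigma for the given values.
// Assumes a population standard deviation; returns 0.0 for an empty input.
double ComputeMinMeasure(const std::vector<double>& values, double deviationRate)
{
    if (values.empty())
        return 0.0;

    double sum = 0.0;
    double sumSquares = 0.0;
    for (double v : values)
    {
        sum += v;
        sumSquares += v * v;
    }
    const double n = static_cast<double>(values.size());
    const double mean = sum / n;
    // Variance as E[x^2] - E[x]^2, clamped at zero against rounding errors.
    const double variance = std::fmax(0.0, sumSquares / n - mean * mean);
    return mean + deviationRate * std::sqrt(variance);
}
```

With a deviation rate of zero the minimum is simply the mean; larger rates keep only values that stand out more clearly from the average.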

The measure may be symmetric: $ m_{i,j}=m_{j,i} \; \forall i,j $

The user has three choices for managing the matrix:

  1. The full matrix is maintained. It offers the fastest access but also uses the most memory. This is particularly useful if most values are not null.
  2. A sparse matrix is maintained. It uses less memory than the full matrix, but is also slower. This is particularly useful if most values are null. If most values are not null, this is never a good choice.
  3. A list of nearest neighbors is maintained for each element (line). The class tries to compute, for each line $ i $, a given number of minimal values $ m_{i,\bullet} $. This type needs to have the matrix stored in memory or in a file. In this case, the Measure method returns the similarity between two objects computed "on the fly". The Info method should be used to get a pointer to all nearest neighbors of a given object.
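The third choice can be pictured with a small sketch that keeps a fixed number of entries for one line. It assumes that a larger measure means a closer neighbor (as for a similarity); the `Neighbor` type and the `SelectNearest` function are hypothetical and do not reflect how the class selects its neighbors internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// (column identifier, measure) pair; a hypothetical type for this example.
using Neighbor = std::pair<std::size_t, double>;

// Keep at most nbNearest entries of one line, assuming that a larger
// measure means a closer neighbor (as for a similarity).
std::vector<Neighbor> SelectNearest(std::vector<Neighbor> line, std::size_t nbNearest)
{
    const std::ptrdiff_t keep =
        static_cast<std::ptrdiff_t>(std::min(nbNearest, line.size()));
    // Move the 'keep' largest measures to the front, in decreasing order.
    std::partial_sort(line.begin(), line.begin() + keep, line.end(),
                      [](const Neighbor& a, const Neighbor& b)
                      { return a.second > b.second; });
    // Drop everything that is not among the nearest neighbors.
    line.erase(line.begin() + keep, line.end());
    return line;
}
```

Only the retained pairs need to be stored per line, which is why this mode trades completeness of the matrix for memory.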

The user may also specify whether the matrix is maintained in memory, in a file or both. In the latter mode, the matrix is loaded the first time and saved at the end. The matrix is created in memory at the first call to it.

If the full matrix is managed, a change (such as the modification of an element) implies only the re-computation of the corresponding line and column. In the other cases, the whole matrix is re-computed (even if only one element changed).
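The partial re-computation for a full matrix can be sketched as follows: the line and column of a modified element are set to NaN, and only NaN cells are computed again. The `FullMatrix` type and the two functions are hypothetical stand-ins written for this example, and the matrix is assumed to be square.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical dense matrix used only for this example.
using FullMatrix = std::vector<std::vector<double>>;

// Mark every value of line and column 'idx' as dirty by setting it to NaN.
// Assumes a square matrix and a valid index.
void MarkDirty(FullMatrix& m, std::size_t idx)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    for (std::size_t k = 0; k < m.size(); ++k)
    {
        m[idx][k] = nan;
        m[k][idx] = nan;
    }
}

// Re-compute only the dirty (NaN) cells; returns how many were re-computed.
// 'compute' stands in for the Compute() method of a concrete measure.
template <class Fn>
std::size_t Refresh(FullMatrix& m, Fn compute)
{
    std::size_t recomputed = 0;
    for (std::size_t i = 0; i < m.size(); ++i)
        for (std::size_t j = 0; j < m[i].size(); ++j)
            if (std::isnan(m[i][j]))
            {
                m[i][j] = compute(i, j);
                ++recomputed;
            }
    return recomputed;
}
```

For an n x n matrix, one modified element costs 2n-1 computations in this sketch instead of n*n.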

See the documentation related to GPlugIn for more general information.

Remarks
The class assumes that the identifiers of the elements are contiguous and that the first identifier is one.

Member Enumeration Documentation

enum tType

Type of the matrix managing the measure.

Enumerator
Full

Full matrix.

Sparse

Sparse matrix.

NearestNeighbours

Nearest neighbors matrix.

Constructor & Destructor Documentation

GMatrixMeasure ( GSession * session,
GPlugInFactory * fac,
tObjType  lines,
tObjType  cols,
bool  sym 
)

Constructor of the measures between two elements of different types.

Parameters
session  Session.
fac  Factory of the plug-in.
lines  Type of the elements in the lines.
cols  Type of the elements in the columns.
sym  Symmetric measure?

virtual ~GMatrixMeasure ( void  )
virtual

Destructor.

Member Function Documentation

virtual R::RCString GetClassName ( void  ) const
virtual

Virtual method inherited from R::RObject that must be re-implemented in all child classes.

Returns
Name of the class.

Reimplemented from RObject.

void SetElementsType ( bool  sym,
tObjType  lines,
tObjType  cols 
)

Set the type of the elements.

Parameters
sym  Symmetric measure?
lines  Type of the elements in the lines.
cols  Type of the elements in the columns.

tObjType GetLinesType ( void  ) const

Get the type of the objects of the lines.

tObjType GetColsType ( void  ) const

Get the type of the objects of the columns.

double GetCutoffFrequency ( void  ) const

Get the cutoff frequency, i.e. the value that must be considered as null.

virtual R::RString GetRootDir ( void  ) const
virtual

Get the root directory where the measures are stored. By default, it corresponds to "Dir/World/Cat" where:

  • Dir is the parameter specified by the user.
  • World is the name of the session.
  • Cat is the name of the corresponding plug-ins category (where all "/" characters in the names are replaced by "-").
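The default layout described above can be sketched as a small string helper. `BuildRootDir` and `SanitizeName` are hypothetical names for this example, and the sketch assumes that the "/" replacement applies to both the session name and the category name.

```cpp
#include <algorithm>
#include <string>

// Replace every '/' in a name by '-', so the name is a single path component.
std::string SanitizeName(std::string name)
{
    std::replace(name.begin(), name.end(), '/', '-');
    return name;
}

// Hypothetical stand-in for the default GetRootDir(): builds "Dir/World/Cat".
// The directory given by the user is kept as-is.
std::string BuildRootDir(const std::string& dir,
                         const std::string& world,
                         const std::string& cat)
{
    return dir + "/" + SanitizeName(world) + "/" + SanitizeName(cat);
}
```

Replacing the separators keeps a session or category name from creating unintended sub-directories.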

virtual R::RString GetFilesName ( void  ) const
virtual

Get the name of the files that will be used. By default, it is the name of the plug-in.

void SetForceCompute ( bool  compute)

Specify whether the values must be recomputed or whether the stored value is returned.

Parameters
compute  Specify the recomputation mode.

bool MustForceCompute ( void  ) const

Tell whether the values are recomputed.

virtual void ApplyConfig ( void  )
virtual

Configurations were applied from the factory.

Reimplemented from GPlugIn.

virtual void Init ( void  )
virtual

Initialize the measure.

The storage of the matrix is opened (if necessary).

Reimplemented from GPlugIn.

virtual void Reset ( void  )
virtual

The measure must be re-initialized, i.e. all values must be considered as dirty. If the matrix is allocated, it is cleared. If the measures are only managed through the storage, the storage is also cleared.

Reimplemented from GPlugIn.

virtual void Measure ( size_t  measure,
  ... 
)
virtual

Get a measure between two elements. If the values are in memory, the method verifies if the matrix must be allocated or modified in memory. If the values are only in files, the storage is modified if necessary.

If the matrix manages the nearest neighbors and the second element is not a nearest neighbor of the first one, the measure is null. If the matrix is neither stored in memory nor in a file, the result is always the measure between the two elements.

Parameters
measure  Type of the measure (not used since only one value is stored for each element). The class assumes that three more parameters are passed:
  • Identifier of the first element (size_t).
  • Identifier of the second element (size_t).
  • A pointer to a variable of type double that will contain the result.

Implements GMeasure.
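Because the extra parameters are passed through an ellipsis, the compiler cannot check or convert them. The sketch below uses a hypothetical `MockMeasure` class (not the real GMatrixMeasure) to show the calling convention described above, together with a typed wrapper that guarantees the identifiers are passed as size_t.

```cpp
#include <cstdarg>
#include <cstddef>

// Hypothetical stand-in for a class derived from GMatrixMeasure.
struct MockMeasure
{
    // Same convention as described above: two identifiers (size_t), then a
    // pointer to the double that receives the result.
    void Measure(std::size_t measure, ...)
    {
        va_list ap;
        va_start(ap, measure);
        const std::size_t id1 = va_arg(ap, std::size_t);
        const std::size_t id2 = va_arg(ap, std::size_t);
        double* res = va_arg(ap, double*);
        va_end(ap);
        // Placeholder value; a real measure would look up or compute m(id1,id2).
        const std::size_t diff = (id1 > id2) ? id1 - id2 : id2 - id1;
        *res = 1.0 / static_cast<double>(1 + diff);
    }
};

// Typed wrapper: the parameters are converted to size_t before the call.
double GetMeasure(MockMeasure& m, std::size_t id1, std::size_t id2)
{
    double res = 0.0;
    m.Measure(0, id1, id2, &res);
    return res;
}
```

Calling `Measure(0, 1, 2, &res)` directly with integer literals would pass int values where size_t is read, so a wrapper of this kind is a safer entry point.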

virtual void Info ( size_t  info,
  ... 
)
virtual

Access several pieces of information related to the matrix of measures. The matrix is updated (in memory or in file) if necessary.

If the matrix is not stored in memory or in file, the statistics are computed on the whole matrix. If the matrix manages the nearest neighbors, the statistics are computed on the pairs tested to detect the nearest neighbors.

Parameters
info  Type of the information to get. Four values are accepted:
  • case '0': The minimum value.
  • case '1': The mean value.
  • case '2': The deviation of the values.
  • case '3': The nearest neighbors of a given object.

For cases '0' to '2', the class assumes that one more parameter is passed:

  • A pointer to a variable of type double that will contain the result.

For case '3', the class assumes that two more parameters are passed:

  • The identifier of the object.
  • A pointer to a pointer of type RSparseVector that will contain the result.

Reimplemented from GMeasure.
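The dispatch on the info value can be sketched with a hypothetical `MockStats` class that only models cases '0' to '2'. The `InfoKind` constant names are assumptions introduced for readability; the documentation above only defines the numeric values.

```cpp
#include <cstdarg>
#include <cstddef>

// Hypothetical names for the first three 'info' values listed above.
enum InfoKind : std::size_t { InfoMin = 0, InfoMean = 1, InfoDeviation = 2 };

// Hypothetical stand-in holding fixed statistics.
struct MockStats
{
    double Min;
    double Mean;
    double Deviation;

    void Info(std::size_t info, ...)
    {
        va_list ap;
        va_start(ap, info);
        if (info <= InfoDeviation)
        {
            // Cases 0 to 2: a single pointer to double follows.
            double* res = va_arg(ap, double*);
            *res = (info == InfoMin) ? Min : (info == InfoMean) ? Mean : Deviation;
        }
        // Case 3 (nearest neighbors) takes different parameters and is not
        // modeled in this sketch.
        va_end(ap);
    }
};
```

The extra parameters are read only after the info value is known, since their number and types depend on it.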

virtual double Compute ( GObject * obj1,
GObject * obj2 
)
pure virtual

Compute the measure for two elements. This method must be implemented by derived classes.

Parameters
obj1  Pointer to the first element.
obj2  Pointer to the second element.

For example, for a measure between profiles:

double MyMeasure::Compute(GObject* obj1,GObject* obj2)
{
   // Both elements are assumed to be profiles.
   return(static_cast<GProfile*>(obj1)->SimilarityIFF(*static_cast<GProfile*>(obj2),otSubProfile));
}

virtual size_t GetId ( void *  obj,
bool  line 
)
pure virtual

Get the identifier of an object of a line or a column.

Parameters
obj  Pointer to the object.
line  Object in a line?
Returns
The identifier.

size_t GetNbDiffElements ( void  )

Return the total number of different elements. If the measure is related to only one type of elements (for example profiles), the implementation of the method looks like:

size_t MyMeasure::GetNbDiffElements(void)
{
   return(Session->GetNbProfiles());
}

If the measure is related to two types of elements (for example profiles and documents), the implementation of the method looks like:

size_t MyMeasure::GetNbDiffElements(void)
{
   return(Session->GetNbDocs()+Session->GetNbProfiles());
}

void InitMatrix ( void  )
private

Create the matrix in memory. If necessary, the matrix is loaded from the storage.

void ChangeMemSize ( void  )
private

Change the size in the internal structure.

void ChangeStorageSize ( void  )
private

Change the size in the storage.

void AddIdentificator ( size_t  id,
bool  line 
)
private

An element is added. Check and remember whether the memory or the storage must be changed. If the matrix is managed in memory, the storage is not modified until the disconnection from the session.

Parameters
id  Identifier of the element.
line  Element is a line or a column?

void DirtyIdentificator ( size_t  id,
bool  line,
bool  file 
)
private

An element is changed. If the matrix is fully managed, the values corresponding to the line and column are set to NAN and the matrix is declared dirty. In the other cases, the matrix is just declared dirty.

If the matrix is managed in memory and in files, the storage is not modified until the disconnection from the session.

Parameters
id  Identifier of the element.
line  Element is a line?
file  File must be affected too?

void DeleteIdentificator ( size_t  id,
bool  line 
)
private

An element is deleted and all the measures related to it are modified. In practice, it calls DirtyIdentificator.

Parameters
id  Identifier of the element.
line  Element is a line?

void DestroyIdentificator ( size_t  id,
bool  line 
)
private

An element is destroyed and all the measures related to it are modified. In practice, it calls DirtyIdentificator.

Parameters
id  Identifier of the element.
line  Element is a line?

void HandleLineNew ( const R::RNotification & notification)
private

This method handles a notification that an object was created in memory.

Parameters
notification  Notification received.

void HandleLineModified ( const R::RNotification & notification)
private

This method handles a notification that an object was modified.

Parameters
notification  Notification received.

void HandleLineDel ( const R::RNotification & notification)
private

This method handles a notification that an object was deleted from memory.

Parameters
notification  Notification received.

void HandleLineDestroy ( const R::RNotification & notification)
private

This method handles a notification that an object was destroyed from the system.

Parameters
notification  Notification received.

void HandleColNew ( const R::RNotification & notification)
private

This method handles a notification that an object was created in memory.

Parameters
notification  Notification received.

void HandleColModified ( const R::RNotification & notification)
private

This method handles a notification that an object was modified.

Parameters
notification  Notification received.

void HandleColDel ( const R::RNotification & notification)
private

This method handles a notification that an object was deleted from memory.

Parameters
notification  Notification received.

void HandleColDestroy ( const R::RNotification & notification)
private

This method handles a notification that an object was destroyed from the system.

Parameters
notification  Notification received.

void UpdateSparse ( void  )
private

Update the sparse matrix. All the measures are re-computed and stored either in memory or in files (depending on the options).

void UpdateNearestNeighborsRAM ( void  )
private

Update the nearest neighbors. All the measures are re-computed and stored either in memory or in files (depending on the options).

This method tries to limit the amount of RAM used to compute the nearest neighbors.

void UpdateNearestNeighborsFast ( void  )
private

Update the nearest neighbors. All the measures are re-computed and stored either in memory or in files (depending on the options).

This method tries to optimize the computation of the nearest neighbors.

void UpdateMem ( void  )
private

All the measures must be updated in memory. If the matrix is fully managed, only the changed values are re-computed. Else, everything is re-computed.

void UpdateStorage ( void  )
private

All the measures must be updated in the storage. If the matrix is fully managed, only the changed values are re-computed. Else, everything is re-computed.

void AddValue ( double  val)
private

A value is added to the statistics.

Parameters
val  Value to add.

void DeleteValue ( double &  val)
private

A value must be removed from the statistics. The value is set to NAN.

Parameters
val  Value to delete.
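One way to maintain the statistics when values are added and removed is to keep running sums. The `RunningStats` structure below is a hypothetical sketch of that idea, assuming a population deviation; it is not the actual implementation of AddValue and DeleteValue.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical accumulator keeping the sums needed for mean and deviation.
struct RunningStats
{
    std::size_t NbValues = 0;
    double Sum = 0.0;
    double SumSquares = 0.0;

    // A value is added to the statistics.
    void AddValue(double val)
    {
        ++NbValues;
        Sum += val;
        SumSquares += val * val;
    }

    // A value is removed from the statistics and set to NaN, which marks it
    // as "not computed". Values that are already NaN are ignored.
    void DeleteValue(double& val)
    {
        if (std::isnan(val) || NbValues == 0)
            return;
        --NbValues;
        Sum -= val;
        SumSquares -= val * val;
        val = std::numeric_limits<double>::quiet_NaN();
    }

    double Mean() const
    {
        return NbValues ? Sum / static_cast<double>(NbValues) : 0.0;
    }

    // Population deviation, clamped at zero against rounding errors.
    double Deviation() const
    {
        if (!NbValues)
            return 0.0;
        const double mean = Mean();
        const double var = SumSquares / static_cast<double>(NbValues) - mean * mean;
        return std::sqrt(std::fmax(0.0, var));
    }
};
```

Keeping the sums rather than the mean itself lets a single value be removed in constant time, at the cost of some numerical precision for long runs.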

virtual void CreateConfig ( void  )
virtual

Create the parameters.

Reimplemented from GPlugIn.

Member Data Documentation

tType Type
private

Type of the measure.

bool Symmetric
private

Is the measure symmetric, i.e. measure(i,j)=measure(j,i) ?

R::RGenericMatrix* Matrix
private

Matrix in memory (if needed).

R::RMatrixStorage Storage
private

Storage of the matrix (if needed).

size_t MaxIdLine
private

Maximal identifier of the lines.

size_t MaxIdCol
private

Maximal identifier of the columns.

bool ChangeSize
private

Matrix size has changed (in memory or in file).

bool DirtyMem
private

Matrix must be recomputed in memory.

bool DirtyFile
private

Matrix must be recomputed on file.

double Mean
private

Mean of the measures.

size_t NbValues
private

Number of values computed.

double CutoffFrequency
private

Cutoff frequency.

double Deviation
private

Deviation of the measures.

double DeviationRate
private

Used to compute the default minimum of the measure.

double MinMeasure
private

Static minimum of measure.

bool AutomaticMinMeasure
private

Compute automatically minimum of measure.

bool InMemory
private

Measures in memory.

bool InStorage
private

Measures in a storage.

R::RString Dir
private

Directory containing the binary files.

size_t NbNearest
private

Number of nearest neighbors.

size_t NbSamples
private

Number of samples used to compute the nearest neighbors.

bool FastNN
private

Fast computation of the nearest neighbors.

tObjType Lines
private

Type of the elements representing the lines.

tObjType Cols
private

Type of the elements representing the columns.

bool ForceCompute
private

Force computing.