Negation and Frequencies

One of the most important aspects of recording knowledge is deciding how to record "negative" information: relationships that are known not to hold, features that are absent, and things that are not true.

LegendBurster does this for instances by recording the "truth status" of every unit of information. The words currently used for this purpose are "present" (equivalent to true) and "absent" (equivalent to false), referring to the presence or absence of an attribute.

An "entity-attribute-value" triple, together with its "truth status", is therefore considered a unit of information.
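
As an illustration only, such a unit of information could be modelled as a small record. The names below (TruthStatus, InstanceStatement, and the sample data) are hypothetical and are not taken from LegendBurster itself.

```python
from dataclasses import dataclass
from enum import Enum


class TruthStatus(Enum):
    """Truth status of an instance-level statement (names are assumptions)."""
    PRESENT = "present"  # equivalent to true
    ABSENT = "absent"    # equivalent to false


@dataclass(frozen=True)
class InstanceStatement:
    """One unit of information: an entity-attribute-value triple plus its truth status."""
    entity: str
    attribute: str
    value: str
    status: TruthStatus


# Hypothetical sample data: one positive and one negative statement
# about the same entity and attribute.
has_red = InstanceStatement("specimen-1", "colour", "red", TruthStatus.PRESENT)
no_blue = InstanceStatement("specimen-1", "colour", "blue", TruthStatus.ABSENT)
```

Storing the status alongside the triple means that "colour is not blue" is recorded explicitly rather than inferred from a missing entry.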

Descriptions of classes, however, require a more sophisticated representation of "truth status", as their attributes may be present over a range of frequencies.

LegendBurster therefore records the "truth status" of every piece of information about a class as its "frequency of being present" or "frequency of being true". In some contexts this would more correctly be described as an "expected frequency of being present".

Five frequency values are used, with an additional two which carry special constraints:

(1) Always

'Always' is the strongest frequency for indicating the presence of an attribute value: its associated attribute value must be present.

(2) Usually

'Usually' is a strong frequency for indicating the presence of an attribute value. It means that its associated attribute value is likely to be present. In other words, it would be quite surprising if the attribute value were not present in the class. However, there is still a possibility that it might not be present.

(3) Sometimes

'Sometimes' is the weakest type of frequency.  An attribute value with a 'sometimes' frequency may or may not be present in a class.

There are subtleties to the weighting given to "sometimes" attributes when undertaking similarity-ranking procedures. These derive from the possible use of the special attribute value "<other values>" in conjunction with the "truth status" frequency value of "never"; they are not expanded on in this document.

(4) Rarely

'Rarely' is a strong frequency for indicating the absence of an attribute value. It means that its associated attribute value is likely to be absent from the class.

(5) Never

'Never' is the strongest frequency for indicating the absence of an attribute value.

(6) "Must Have"

"Must Have" is equivalent to an "Always" frequency expectation, with the additional effect of excluding from the similarity-ranking exercise any descriptions which do not have an attribute value matching the "Must Have" value.  This is implemented by (a) aborting the similarity-ranking score calculation for the mapObject under consideration as soon as an un-matched "Must Have" value is encountered, and (b) according the disqualified description a score equal to the minimum score for the comparison run.

(7) "Must Exclude"

"Must Exclude" is equivalent to a "Never" frequency expectation, with the additional effect of excluding from the similarity-ranking exercise any descriptions which have an attribute value matching the "Must Exclude" value.  This is implemented by (a) aborting the similarity-ranking score calculation for the mapObject under consideration as soon as a matched "Must Exclude" value is encountered, and (b) according the disqualified description a score equal to the minimum score for the comparison run.

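The seven values and the two disqualification rules above can be sketched as follows. This is a minimal illustration, not LegendBurster's actual scoring: the numeric weights, the function names, and the reading of "minimum score for the comparison run" as the lowest score attainable for the given expectations are all assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class Frequency(Enum):
    ALWAYS = "always"
    USUALLY = "usually"
    SOMETIMES = "sometimes"
    RARELY = "rarely"
    NEVER = "never"
    MUST_HAVE = "must have"        # "Always" plus disqualification when unmatched
    MUST_EXCLUDE = "must exclude"  # "Never" plus disqualification when matched


# Hypothetical weights, chosen only for this sketch.
WEIGHT = {
    Frequency.ALWAYS: 2, Frequency.MUST_HAVE: 2,
    Frequency.USUALLY: 1,
    Frequency.SOMETIMES: 0,
    Frequency.RARELY: -1,
    Frequency.NEVER: -2, Frequency.MUST_EXCLUDE: -2,
}


def score_description(expectations: dict, observed: set) -> Optional[int]:
    """Score one description; return None if it is disqualified."""
    total = 0
    for value, freq in expectations.items():
        present = value in observed
        # (a) Abort as soon as a disqualifying value is encountered.
        if freq is Frequency.MUST_HAVE and not present:
            return None
        if freq is Frequency.MUST_EXCLUDE and present:
            return None
        # Reward agreement with the expectation, penalise disagreement.
        total += WEIGHT[freq] if present else -WEIGHT[freq]
    return total


def minimum_score(expectations: dict) -> int:
    """Lowest attainable score: every expectation contradicted (assumed reading)."""
    return -sum(abs(WEIGHT[freq]) for freq in expectations.values())


def rank(expectations: dict, descriptions: dict) -> list:
    """Rank descriptions from best to worst match."""
    floor = minimum_score(expectations)
    scored = []
    for name, observed in descriptions.items():
        score = score_description(expectations, observed)
        # (b) A disqualified description is given the minimum score.
        scored.append((name, floor if score is None else score))
    return sorted(scored, key=lambda item: -item[1])


# Hypothetical class expectations and candidate descriptions.
expectations = {
    "red": Frequency.MUST_HAVE,
    "spotted": Frequency.USUALLY,
    "blue": Frequency.MUST_EXCLUDE,
    "striped": Frequency.RARELY,
}
descriptions = {
    "A": {"red", "spotted"},
    "B": {"spotted"},        # lacks the "Must Have" value
    "C": {"red", "blue"},    # has the "Must Exclude" value
    "D": {"red", "striped"},
}
ranking = rank(expectations, descriptions)
```

In this sketch, descriptions B and C are both disqualified and fall to the bottom of the ranking with the floor score, while A and D are scored normally.
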